Replicate subjective kv #997

Open
James-Mart opened this issue Jan 20, 2025 · 2 comments
Labels
Node infra Related to the necessary infrastructure provided by all psibase infrastructure providers

Comments

@James-Mart
Member

We need a way to replicate subjective KV tables, at least within one's own node cluster. This is necessary for failover and for recovering any critical private data.

@James-Mart added the "Node infra" label on Jan 20, 2025
@bytemaster
Contributor

The definition of "subjective" has just expanded from "this local node" to "this shared cluster", which could mean a subset of the global set of nodes on a public network with independent producers.

  1. The purpose of subjective data was to hold data for which there was no need for consensus.
  2. Billing of subjective resource usage that applies only to a single node is the primary use case; however, when we expand the concept of billing to include state synchronized across many nodes, it ceases to be entirely subjective and must become objective to a subset of nodes.

What you end up needing to do is have each node in the cluster report its own subjective view of a user's resource demands to a "central billing server". Each node is "trusted" because they are all owned by one user, but nevertheless there are multiple nodes.
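As a rough illustration of that reporting model, here is a minimal sketch in Python. Everything in it (`UsageReport`, `ClusterBilling`, the `cpu_ns` field) is a hypothetical name invented for this example and not part of psibase or psinode. It only shows the idea that per-node subjective views are summed into one cluster-wide view, and that reports are accepted only from nodes the operator owns.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageReport:
    """One node's subjective view of a user's demand (hypothetical shape)."""
    node_id: str
    user: str
    cpu_ns: int


class ClusterBilling:
    """Hypothetical central billing view for a single operator's cluster."""

    def __init__(self, trusted_nodes):
        # Nodes are trusted only because the same operator owns all of them.
        self.trusted_nodes = set(trusted_nodes)
        self.totals = defaultdict(int)

    def submit(self, report: UsageReport) -> bool:
        """Accept a report from a trusted node; reject anything else."""
        if report.node_id not in self.trusted_nodes:
            return False
        self.totals[report.user] += report.cpu_ns
        return True

    def usage(self, user: str) -> int:
        """Cluster-wide usage for a user, summed over all reporting nodes."""
        return self.totals.get(user, 0)
```

Once the totals are shared like this, they are no longer the view of any single node, which is the sense in which the data becomes objective to the cluster.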

**Private Networks**
If this is a private network where one entity runs the entire blockchain and is effectively responsible for both objective and subjective billing, then the nodes can report their subjective billing demands via transactions that get logged in the global objective state. It makes no sense to invent a separate "consensus" process when 100% of producer nodes and 100% of infrastructure nodes need to agree on billing.

The private node operator would be 100% responsible for all real-world costs of tracking the billing anyway, so there is no need to invent a separate system to record the billing unless it is for scaling purposes (e.g. offloading state that is independent of the rest of the objective state) or for security purposes (hiding information from those to whom the operator otherwise discloses the blockchain).

**Shared Networks w/ Cluster Infrastructure Providers**
Once we accept that the algorithm for reporting and tracking usage should be the same as for objective transactions, we only need to decide which blockchain these subjective transactions are broadcast to. Only a subset of all nodes needs this information, and broadcasting the state to the other participants of the public chain, who otherwise have no need for your private billing data, imposes costs on third parties.

Therefore, the solution is that each infrastructure provider on a public chain needs two blockchains: one to track the state of its private billing cluster, and the shared public chain. Individual machines would report their subjective bandwidth usage for each user to their cluster state chain, but authenticate users according to the shared public chain.
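The split between the two chains can be sketched as follows. This is an illustrative Python model only: `PublicChainView`, `ClusterChain`, and `Node` are hypothetical stand-ins, not psibase types, and "authentication" is reduced to an account lookup to keep the example short.

```python
class PublicChainView:
    """Hypothetical read-only view of the shared public chain."""

    def __init__(self, accounts):
        self.accounts = set(accounts)

    def is_known_account(self, user: str) -> bool:
        return user in self.accounts


class ClusterChain:
    """Hypothetical private chain holding one provider's billing state."""

    def __init__(self):
        self.log = []  # (node_id, user, bytes_used) entries

    def push_usage(self, node_id: str, user: str, bytes_used: int) -> None:
        self.log.append((node_id, user, bytes_used))

    def total_for(self, user: str) -> int:
        return sum(b for _, u, b in self.log if u == user)


class Node:
    """One machine in the provider's cluster."""

    def __init__(self, node_id, public: PublicChainView, cluster: ClusterChain):
        self.node_id = node_id
        self.public = public
        self.cluster = cluster

    def serve(self, user: str, bytes_used: int) -> bool:
        # Users are authenticated against the public chain...
        if not self.public.is_known_account(user):
            return False
        # ...but usage is reported only to the private cluster chain.
        self.cluster.push_usage(self.node_id, user, bytes_used)
        return True
```

Because both nodes of a cluster write to the same `ClusterChain`, a user's total is visible to every machine in the cluster while never reaching the other participants of the public chain.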

**Two Approaches**
  1. A psinode can simultaneously track the state of multiple chains in one database.
  2. A psinode can offload subjective billing to a separate process, potentially on a separate machine.

If you consider that the billing cluster may only need to be the minimum size for fault-tolerant redundancy, while the replication of the public infrastructure could scale much larger, then it becomes fairly obvious that the following is true:

  1. In the simple one-node case, all subjective billing is tracked as part of the objective state of the main chain.
  2. In the medium-complexity case of a minimal cluster run by a single provider, you would still put it all on one chain.
  3. In the large-scale (high query demand) single-provider case, you would broadcast resource tracking to a second chain.
    • The second chain could be a single node or a minimal cluster, but the main chain wouldn't care.
  4. In the shared public network case, with multiple infrastructure providers that each have a cluster, each provider would use a second chain.

The code for resource management is therefore a smart contract that can run on any chain, and each psinode only needs to be configured to know which chain to point its reporting and querying to.
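The four cases above reduce to one configuration decision per node: which chain receives billing reports. A minimal sketch of that decision, assuming hypothetical names (`Deployment`, `billing_target`, and the chain identifiers are invented here and are not psinode configuration options):

```python
from enum import Enum


class Deployment(Enum):
    SINGLE_NODE = "single node"
    MINIMAL_CLUSTER = "minimal cluster, single provider"
    LARGE_SINGLE_PROVIDER = "large scale, single provider"
    SHARED_PUBLIC = "shared public network, multiple providers"


# Cases 3 and 4 in the list above report to a second chain.
SECOND_CHAIN_CASES = {Deployment.LARGE_SINGLE_PROVIDER, Deployment.SHARED_PUBLIC}


def billing_target(deployment, main_chain, billing_chain=None):
    """Return the chain a node should point its reporting and querying to."""
    if deployment in SECOND_CHAIN_CASES:
        if billing_chain is None:
            raise ValueError("this deployment needs a separate billing chain")
        return billing_chain
    # Cases 1 and 2: billing lives in the objective state of the main chain.
    return main_chain
```

The same resource-management contract would run on whichever chain is returned; only the pointer differs between deployments.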

@cool-ant
Member

A few background concepts I think might be useful here:

  • I don't think we need the integrity of the blockchain to prove subjective billing is happening with integrity. If you're not happy with your node provider, you can switch or leave the community or get the community to seek a new node provider.
  • I see plenty of options for offloading all data-center functions (and in fact would suggest doing so) to data center/ops people who have specialized knowledge and tools. We just need to be compatible with such tools, i.e., not make any architectural choices that would hamstring traditional solutions.
  • Billing could be handled in a decentralized way where each node reports its usage to the producing node, and the tool that does that reporting can just sit on each node. This is a very simple solution that works on a smallish scale and can be replaced or upgraded with more complex solutions if and when they're needed. The only costs are 1) the cost of the billing transactions themselves (which the node provider would pay), which would actually incentivize efficiency in reporting frequency, and 2) the block log and bandwidth bloat from syncing objective billing transactions. I'd imagine this to be a negligible proportion of total traffic.
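
To make the reporting-frequency point concrete, here is a small Python sketch of a per-node reporting tool that accumulates usage locally and emits one billing transaction per flush. `UsageBatcher` and the payload shape are hypothetical, and `submit_tx` stands in for whatever mechanism would actually push a transaction to the producing node.

```python
from collections import defaultdict


class UsageBatcher:
    """Hypothetical per-node tool that batches usage into billing transactions."""

    def __init__(self, node_id, submit_tx):
        self.node_id = node_id
        self.submit_tx = submit_tx  # callable that sends one transaction
        self.pending = defaultdict(int)

    def record(self, user: str, bytes_used: int) -> None:
        """Accumulate usage locally; nothing is sent yet."""
        self.pending[user] += bytes_used

    def flush(self) -> bool:
        """Send one transaction covering everything recorded since the last flush."""
        if not self.pending:
            return False  # nothing to bill, so no transaction cost
        self.submit_tx({"node": self.node_id, "usage": dict(self.pending)})
        self.pending.clear()
        return True
```

Flushing less often means fewer transactions to pay for and less block log growth, at the price of staler billing data, which is the efficiency incentive described above.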
