Building Distributed Systems with Blockchain Technology
In our new Tech Series, we will be giving some insight into the technology that is backing Cere and its product suite. Today we kick off with an article from our Lead Architect, Aurel, about building distributed systems. Get ready for a lot of content that will be released in the upcoming months!
At Cere, we are building the next generation of enterprise-focused, decentralized SaaS platforms: the Cere Decentralized Data Cloud, or Cere DDC.
A critical component of Cere DDC is its trustless persistent storage layer, which is a distributed system consisting of many independent nodes, providing scalability and Byzantine fault tolerance. This storage layer can be thought of as a persistent message queue, like Kafka. Messages come in from producers and are then consumed by subscribers. Messages, once produced, will remain indefinitely in the system, and can be consumed at any time in the future.
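To make the "persistent message queue" analogy concrete, here is a minimal in-memory sketch of an append-only log. It is an illustration only, not the Cere DDC implementation: the class name `PersistentLog` and the methods `produce` and `consume` are hypothetical, and a real storage layer would add durability, replication, and authentication.

```python
from dataclasses import dataclass, field


@dataclass
class PersistentLog:
    """Toy append-only log (hypothetical): messages are never deleted,
    and each reader chooses the offset it wants to read from."""

    _messages: list[bytes] = field(default_factory=list)

    def produce(self, message: bytes) -> int:
        """Append a message and return its offset (its position in the log)."""
        self._messages.append(message)
        return len(self._messages) - 1

    def consume(self, offset: int, max_messages: int = 10) -> list[bytes]:
        """Return up to max_messages starting at offset; the log is unchanged."""
        if offset < 0:
            raise ValueError("offset must be non-negative")
        return self._messages[offset:offset + max_messages]


log = PersistentLog()
log.produce(b"signup:alice")
log.produce(b"purchase:alice:42")

# A subscriber that joins later can still replay from the beginning.
replayed = log.consume(offset=0)

# Reading does not remove anything: a second read returns the same messages.
assert replayed == log.consume(offset=0)
```

The key property is that consuming is a read, not a removal, which is what lets a message be consumed "at any time in the future."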
However, there are several major differences between Cere DDC persistent storage and traditional message systems like Kafka. First, Cere DDC, as a service, is directly tied to a payment system running on the Cere blockchain. It also comes with a system of authentication, encryption, and data sharing, and it’s integrated with the operational and analytics databases that power applications built on Cere. This level of deep integration with blockchain-based payment and identity systems enables many scenarios that were not possible before. We will elaborate on those topics more in a future article.
Today, we will focus on the motivation for a key design choice: how the data nodes coordinate with each other.
For a more general overview of Cere Network's technology, check out our Litepaper!
Distributed systems must provide certain guarantees to be considered well ordered, including consistency of data writes, delivery of reads, ordering, and fault tolerance. There needs to be some form of reliable coordination between nodes and a way to reach consensus. There are many other desirable characteristics one might want from this ordering of messages, such as authentication of peers, data redundancy, and validation rules. Nodes must be able to agree on certain global states (that is, achieve consensus), pick a leader, and replicate data, even when some nodes have failed or the network is partitioned.
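As a rough illustration of what "agreeing despite failed nodes" means in numbers, the sketch below uses the classical Byzantine fault tolerance bound: a network of n nodes can tolerate at most f arbitrarily faulty nodes when n >= 3f + 1. This bound is an assumption about the family of consensus protocols; this article does not specify which protocol Cere uses, and the function names are hypothetical.

```python
def max_byzantine_faults(n: int) -> int:
    """Largest f such that n >= 3f + 1 (classical BFT bound)."""
    if n < 1:
        raise ValueError("a network needs at least one node")
    return (n - 1) // 3


def quorum_size(n: int) -> int:
    """Acknowledgements needed so that any two quorums share an honest node.

    With n - f acknowledgements, two quorums overlap in at least n - 2f
    nodes, which is at least f + 1, so at least one of them is honest.
    """
    return n - max_byzantine_faults(n)


def is_committed(acks: set[str], nodes: set[str]) -> bool:
    """A value counts as agreed once a quorum of known nodes acknowledged it."""
    return len(acks & nodes) >= quorum_size(len(nodes))


nodes = {"node-a", "node-b", "node-c", "node-d"}

# With 4 nodes, 1 may be faulty, so 3 acknowledgements are required.
assert max_byzantine_faults(len(nodes)) == 1
assert is_committed({"node-a", "node-b", "node-c"}, nodes)
assert not is_committed({"node-a", "node-b"}, nodes)
```

Acknowledgements from unknown identities are ignored by the set intersection, which is a small reminder of why peer authentication appears in the list of requirements above.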
In a related article, you can find out how Cere has a crucial head start on complying with data regulations.
Many distributed systems come with some form of built-in consensus protocol to provide some of the above-mentioned features. These protocols coordinate a number of nodes so that the system tolerates typical hardware and network failures, and they offer different trade-offs under the CAP theorem. But guess what kind of distributed system gives you all that, and then some? That would be blockchain technology.
Blockchains are designed to coordinate many independent nodes, including some that you don’t even trust, and to tolerate anything these machines might do, whether hardware failures, human errors, or malicious attacks. For instance, authentication and denial-of-service protection are built in, in the form of transaction signatures and transaction fees. Blockchain designs also clearly separate the responsibilities of different machines: those that create consensus, those that verify the rules, those that provide replication and access to the state and history, and the clients that actually read and write. So there is a lot of flexibility in deployment and network infrastructure.
A general-purpose blockchain can also support a variety of use cases at once. That is why such chains are programmable, in particular in the form of smart contracts. There are many tools around this, including programming languages such as Solidity for EVM chains and ink! for Rust-based chains. Then there are the RPC APIs, client libraries, dev tools, and, importantly, the accumulated experience from building decentralized protocols.
One can think of smart contracts as a method of implementing complex protocols between actors (in this case, nodes of the distributed system), except now in a very customizable and powerful way. That’s because the blockchain layer below already offers so many guarantees, and contracts are structured explicitly as state machines. There is the state (called storage), the messages that can change the state (usually called functions), the rules of the protocol (in the body of the functions), and the views of the state that help actors make decisions (the getter functions). Complex protocols can be expressed in a few lines of code. Even highly complex protocols can be written and verified quite clearly, whereas ad-hoc implementations without a smart contract abstraction would be almost unthinkable.
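The four parts of that state machine can be sketched in plain Python. This is a simulation for illustration, not an actual Solidity or ink! contract and not Cere's protocol: the `NodeRegistry` name, the `MIN_DEPOSIT` rule, and the leader-selection rule are all hypothetical, and on a real chain the `caller` would be authenticated by the transaction signature.

```python
from typing import Optional


class ContractError(Exception):
    """Raised when a message violates the protocol rules; state is unchanged."""


class NodeRegistry:
    """Hypothetical contract coordinating the membership of data nodes."""

    MIN_DEPOSIT = 100  # hypothetical rule parameter

    def __init__(self) -> None:
        # State ("storage"): the deposit held for each registered node.
        self._deposits: dict[str, int] = {}

    # Messages ("functions") that can change the state.
    # The checks at the top of each body are the rules of the protocol.
    def join(self, caller: str, deposit: int) -> None:
        if caller in self._deposits:
            raise ContractError("node already registered")
        if deposit < self.MIN_DEPOSIT:
            raise ContractError("deposit below minimum")
        self._deposits[caller] = deposit

    def leave(self, caller: str) -> int:
        if caller not in self._deposits:
            raise ContractError("node not registered")
        return self._deposits.pop(caller)  # deposit returned to the caller

    # Views ("getters") that help actors make decisions.
    def is_member(self, node: str) -> bool:
        return node in self._deposits

    def leader(self) -> Optional[str]:
        """Node with the highest deposit; ties go to the largest node id."""
        if not self._deposits:
            return None
        return max(self._deposits, key=lambda n: (self._deposits[n], n))


registry = NodeRegistry()
registry.join("node-a", 150)
registry.join("node-b", 300)

# Every node that reads the same state computes the same leader.
assert registry.leader() == "node-b"
```

Because the leader is a deterministic view of shared state, nodes do not need a separate election protocol in this sketch; they only need to agree on the contract state, which is the guarantee the blockchain layer provides.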
A potential concern is: “Yes, but blockchain operations are slow and computationally expensive.” Well, nobody is suggesting building live systems on top of the Bitcoin mainnet here. This is about the fundamental properties of the technology itself. We are comparing ad-hoc methods of implementing coordination protocols to a blockchain network specifically tuned for that same purpose. For the same (or stronger) guarantees and comparable parameters, it will not be much slower than necessary. In fact, the technology is constantly being improved and optimized, and there are many knobs that can be adjusted. On the “more guarantees” end of the spectrum, we can use a public blockchain such as Cere, running a fast proof-of-stake consensus, where we can expect a latency of, say, 5–10 seconds. Or we can go private, with a chain dedicated to the operations of the data system, and tune the consensus layer to be as fast as physically possible. And then there is the middle ground: a side chain that runs fast with medium-level security but piggybacks on the main chain. All of this is possible without having to fully rewrite the core application.
Further, a key advantage of building the data system on top of a blockchain is that it can be readily hooked into the payments and incentives system. We will explore that in future articles!