Catchain overview

Catchain is a communication protocol between actors. It does not execute the consensus algorithm itself but prepares the data required for the decision-making of a higher-level component: Block Consensus Protocol (BCP). The main purpose of the protocol is to allow validators to send messages that explicitly depend on other messages, possibly produced by different actors. Dependencies are encoded in such a way that a node receiving a message can download the referenced messages and verify their validity.

Problem

No network can guarantee that

Messages will be delivered.
Messages will arrive in the order they were sent, even if there are 2 messages between 2 nodes.

Actors are assumed to intentionally try to break consensus. The system must avoid situations where such actors prevent honest actors from making progress.

Assumptions

Each node knows the ADNL address and the public key of every other node. Thus, every pair of nodes can ensure each other’s authenticity and establish an encrypted channel. The ADNL protocol is used to set up a private overlay network, which allows a node to send encrypted messages to a specific address or broadcast messages.

Protocol capabilities

Logical capabilities

At the logical level, the protocol provides causality preservation when sending messages. In other words, it lets a validator declare which messages cause other messages and guarantees that processing message A, which depends on messages B and C, only happens after messages B and C are processed.

Formal capabilities

The protocol implements a CRDT over the set of messages known to each node and achieves eventual consistency. As a synchronization mechanism, it uses the idea similar to vector clock with modifications that preserve eventual consistency even when some nodes are Byzantine.

Choosing neighbor nodes

Catchain selects five neighbors at random and periodically refreshes the list each random interval between 60 and 120 seconds in the current configuration. Note that the neighbor relation is not symmetric: if B is a neighbor of A, it does not imply that A is also a neighbor of B.

Message identification

The seqno of each node is the count of messages that node has sent in the current round. A node must increment this number after each send message. Nodes not following that rule are considered Byzantine. The Catchain protocol detects such nodes, as described in Handling Byzantine nodes. Every Catchain message is uniquely identified by the pair (sender; sender's seqno, body hash). A Catchain message can declare any other messages by any node as a dependency. These dependencies are stored in the message body. Every Catchain message must declare the previous message from the same node as a dependency. Otherwise, the message is considered malformed and ignored by honest nodes. If a Catchain message has seqno equal to 0, the previous message for it is genesis: the name for the initial state of the system.

Example message flow

These messages do not have an order by default. In an asynchronous system they can be received in any order. Declaring message dependencies helps to establish a partial ordering on all messages in general and on each node in particular. This is possible because in Catchain the dependency relation is anti-symmetric, transitive, and irreflexive.

Handling Byzantine nodes

The key goal is not to allow Byzantine nodes to break the partial order of the messages in Catchain. It can be broken by violating one of three key properties: anti-symmetry, transitivity, irreflexivity.

Anti-symmetry

This property means that

\forall A, B: depends(A, B) -> !depends(B, A)

. Catchain preserves this property: when A depends on B, B is included in the body hash field of A. Therefore, B cannot depend on A, as its body hash must include A.

Transitivity

This property means that

\forall A, B, C: depends(A, B) \&\& depends(B, C) -> depends(A, C)

. The protocol preserves this implicitly: a message is only delivered after all messages in its dependency cone — including transitive dependencies — are delivered. If A depends on B and B depends on C, a node delivering A must have already delivered B, which in turn required delivering C first.

Detecting forks (duplicate message IDs)

If an actor issues two messages with the same identifier (A; i) but different payloads, the situation is called a fork. Any honest validator who observes it constructs a proof consisting of the block identifier and two distinct signatures. Since all public keys are known, this proof can be broadcast to the neighbors, and then further across the network. Every honest participant who receives the proof starts ignoring all messages from node A starting from (A, i), as well as all messages that depend on (A, j), where j >= i.

Detecting skipped sequence numbers

Every message (X; i) (except the genesis message (X; 0)) depends on (X; i - 1). If a node produces (X; i) and later emits (X; i + j) with j > 1, an honest node will not be able to obtain (X; i + j - 1) and therefore will not process (X; i + j). Skipping sequence numbers is thus pointless: all messages following the missing one are ignored by honest validators.

Node state

The node state stores information about all messages processed by the current node. Since messages are ordered for each node, denote this information by the vector (A_i, B_j, C_k, D_m, …). This vector tracks the last processed message for each validator. If A_i is stored and A_i+2 arrives, the node must download A_i+1.

Dependency retrieval process

Every 0.1-0.2 seconds, a node selects three random neighbors out of its five and starts a synchronization round with them. Synchronization can be performed in two ways:

Send its node-state vector. If the neighbor has more recent blocks, it provides them.
Perform targeted downloads of specific messages. This is usually used if, after the bulk synchronization above, there are still individual missing dependencies.

Encryption

All the steps above happen inside a private ADNL overlay. There is no shared session key — instead, each pair of nodes derives its own key to encrypt message bodies. In practice, this means that an external observer on the TON blockchain cannot read the messages that validators exchange.

Documentation Index

​Problem

​Assumptions

​Protocol capabilities

​Logical capabilities

​Formal capabilities

​Choosing neighbor nodes

​Message identification

​Example message flow

​Handling Byzantine nodes

​Anti-symmetry

​Transitivity

​Detecting forks (duplicate message IDs)

​Detecting skipped sequence numbers

​Node state

​Dependency retrieval process

​Encryption