Skip to main content

· 16 min read

Banner

Introduction

As the Cardano ecosystem continues to grow and evolve, contributors to the Cardano ecosystem are committed to continually refining and optimizing Cardano's networking infrastructure. The release of Dynamic peer-to-peer (P2P) networking, delivered with node v.1.35.6, was a collaborative effort of the networking team from [IOG], [Well-Typed], [PNSol] and the [Cardano Foundation] and represents a highly performant deliverable and a significant milestone in Cardano's journey toward establishing a fully decentralized and secure blockchain platform.

Given that Cardano functions as a real-time stochastic system, its performance and security are inherently interconnected. The networking team remains committed to finding the ideal balance among various factors, including topological and topographic considerations, to enhance timeliness and connectivity.

This blog post takes you through the engineering journey behind the development of Cardano's Dynamic P2P design. It delves into the core design principles, highlights the challenges encountered along the way, and unveils the solutions the team devised to establish a robust and scalable networking system.

What is Dynamic P2P

The Dynamic P2P implementation continuously and dynamically refines the active topology through a peer selection process, with the objective of reducing the overall diffusion time across the entire network. Research findings suggest that utilizing a policy based solely on local information can result in an almost-optimal global outcome. This is achieved by monitoring the timeliness and frequency of peers that provide a block header, which is ultimately incorporated into the chain.

The primary goal is to eliminate highly ‘non-optimal’ peers while maintaining strong connectivity. To achieve this, peers considered less useful based on this metric are periodically ‘churned out’ and replaced with randomly selected alternatives. Simulation results indicate that this optimization method converges towards a near-optimal global outcome within a relatively small number of iterations.

Practically, Dynamic P2P replaces the manual configuration of peer selection (e.g. using the topology updater tool).

With manual configuration, stake pool operators (SPOs) were required to establish connections with a significant number of peers (50 for example) to maintain a minimum of 20 active connections consistently. This approach was necessary due to the static nature of configured peers and the varying availability of SPO relays.

However, with Dynamic P2P, nodes can be configured to maintain a specific number of active peer connections (e.g. 20) and select from all registered SPO relays on the chain. In the event of a lost connection with a peer, the node will automatically select alternative peers and persistently attempt connections until the desired target is reached.

As a result, Dynamic P2P eliminates the requirement for over-provisioning of connections, offering a more efficient and adaptable networking solution.

· 19 min read

Introduction

I recently gave a short presentation on the topic of stacks in the GHC JavaScript backend to the GHC team at IOG. This blog post is a summary of the content.

In the context of a program produced by the GHC JavaScript backend, two different types of stack exist: The JavaScript call stack and Haskell lightweight stacks. In this post we will focus mostly on the lightweight stacks.

First we will see why using only the JavaScript call stack is not suitable for running compiled Haskell code. Then we will introduce the calling convention we use for Haskell and see how the lightweight stacks are used for making calls and passing around data. After this, we will explore in more detail how they are used for exception handling and multithreading.

· 16 min read

IOSim on Hackage

The IOG Networking Team is pleased to announce that we published [io-sim], [io-classes], [si-timers], [strict-stm], [strict-mvar] and [io-classes-mtl] on Hackage. These are tools without which we could not imagine writing a complex distributed system like [Cardano].

These packages support our goal of using the same code to run in production and simulation, what greatly increases the reliability and quality of the final system. [io-sim] and its ecosystem is designed to let write a simulation environment which provides provided things usually provided by an operating system like networking stack or disk IO and develop as well as implement & model complex applications/systems.

For developing a robust system one needs a proper testing framework which allows one to model the key characteristics of the system. To achieve this goal we needed to create an abstraction that captures the key aspects of the Haskell runtime and operating system environment for distributed systems. The Cardano [network stack][ouroboros-network] is a highly concurrent system, and as a network application, it needs to deal with time: there are all sorts of timeouts that guard resource usage: inactivity timeouts, message timeouts, or an application level TCP's WAIT_TIMEOUT among others. The tools which we provide permitted us to capture issues related to timing (which abound in network programming) which, in production, would be extremely rare (things like simultaneous TCP open or critical race conditions) and ensure that we can test (in the simulation) these scenarios. Recently we caught a [bug][sim-tcp-open-bug] in simultaneous TCP open when one side of the connection crashed - a corner case of a corner case, that's how effective is the combination of quickcheck style property-based testing & simulation!

· 11 min read

TL;DR: This blog post intends to sum up the why and how of cargo-cabal and hs-bindgen. If you’re looking for usage walkthroughs and code examples, check out project READMEs on GitHub!

N.B. quoted paragraphs in this article give straightforward motivation regarding some systems programming basic concepts. Feel free to skip them if you know you’re likely to be already comfortable with them ;)