First look at Bloomberg's amazing Comdb2

The following is the result of our engineering team's internal evaluation of Comdb2. It is not an official endorsement/disapproval of any database mentioned here.

Most new databases we come across are built by startups. They are typically not mature. Comdb2 is a different beast. Bloomberg has been using Comdb2 as their primary database for 14 years! A single minute of downtime at Bloomberg can cause havoc in the financial markets. Thus, Comdb2 is designed to be ultra robust to a level unseen in databases.

What is Comdb2?

Comdb2 is a consistent, synchronously replicated SQL database that is designed to be distributed from the ground up. It is not a Big Data database, however.

The magic of Perfect Availability

Every other database, even the 'Highly Available' ones, forces you to 'retry' your query in case of connection failure. This leaves all your database code entangled in a bunch of retry statements.
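To make the pain concrete, here is a minimal sketch of the kind of retry wrapper application code tends to grow. Everything in it is hypothetical: `ConnectionLost`, `run_with_retry` and `make_flaky_execute` are names we made up for illustration, and the "database" is a stand-in function that fails a fixed number of times, not a real driver.

```python
import time


class ConnectionLost(Exception):
    """Hypothetical error raised when the connection to the database drops."""


def run_with_retry(execute, sql, attempts=3, backoff_seconds=0.0):
    """Call execute(sql), retrying on ConnectionLost up to `attempts` times."""
    last_error = None
    for attempt in range(attempts):
        try:
            return execute(sql)
        except ConnectionLost as err:
            last_error = err
            # Linear backoff before the next attempt (zero by default).
            time.sleep(backoff_seconds * (attempt + 1))
    # Every attempt failed, so surface the last error to the caller.
    raise last_error


def make_flaky_execute(failures):
    """Build a stand-in for a driver call that fails `failures` times first."""
    state = {"remaining": failures}

    def execute(sql):
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise ConnectionLost("connection reset")
        return [("ok", sql)]

    return execute


rows = run_with_retry(make_flaky_execute(2), "SELECT 1")
```

Every call site has to remember to go through a wrapper like this, and blindly retrying a write is only safe if the statement is idempotent, which is one more thing for the application programmer to get right.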

Imagine a programmer inside Bloomberg forgetting to write the 'retry' code for his query. That lack of 'retry' can lead to system downtime of Bloomberg, in turn leading to financial market chaos!

Thus, Comdb2 was designed to be 'Perfectly Available'. Talking to your database is like doing multiplication on your CPU. You don't need to check if it succeeded or not. It just works!

This robustness brings Comdb2 to a level of ease unheard of in other database systems.

What about the CAP Theorem?

Comdb2 is Partition tolerant and Consistent. So does it break the CAP Theorem by being perfectly Available too?

Not really. The answer lies in the ton of effort put into both the driver and the database, which work together to provide perfect availability at the user level.

The 'retry' part is taken care of at the driver level, so the user doesn't need to worry about it. Then there is the database layer itself, which can take a transaction that was half completed at Node A and resume it at Node B in the middle of the transaction without any help from the driver!
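As a rough illustration of the driver half of that idea, here is a toy client that hides node failures from the caller. This is our own sketch, not Comdb2's actual driver or API: `FakeNode`, `FailoverClient` and `NodeDown` are hypothetical names, and the sketch only covers retrying a single statement on another node. It does not model the server-side ability to resume a half-completed transaction.

```python
class NodeDown(Exception):
    """Hypothetical error: the node could not be reached."""


class FakeNode:
    """Stand-in for one database node; not a real Comdb2 connection."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up

    def execute(self, sql):
        if not self.up:
            raise NodeDown(self.name)
        return f"{self.name}: {sql}"


class FailoverClient:
    """Toy driver that retries on another node instead of failing the call."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.current = 0  # index of the node we are currently talking to

    def execute(self, sql):
        # Try each node at most once, starting from the current one.
        for _ in range(len(self.nodes)):
            node = self.nodes[self.current]
            try:
                return node.execute(sql)
            except NodeDown:
                # Move on to the next node and retry transparently.
                self.current = (self.current + 1) % len(self.nodes)
        raise NodeDown("all nodes unreachable")


client = FailoverClient([FakeNode("a", up=False), FakeNode("b")])
result = client.execute("SELECT 1")  # node "a" is down, so "b" answers
```

The calling code contains no retry logic at all; the only failure it can ever see is the case where every node is unreachable.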

Does it pass Jepsen?

That is indeed the first question that comes to mind when you come across a distributed database. However, {your favorite document database} this is not! This is Bloomberg's ultra robust primary database. They eat Jepsen for breakfast! Well, they run Jepsen in their CI, so technically it probably does run during their breakfast...

Comparison to PostgreSQL

Like PostgreSQL, Comdb2 has a single master. However, both reads and writes can run on replicas. Master failover is automatic and requires no human interaction. Adding/removing nodes to a Comdb2 cluster is a breeze.

Connection overhead is extremely low in Comdb2, unlike Postgres which spawns a whole new process per connection. No need for external tools like PgBouncer.

Where Comdb2 falls short is SQL features. It simply doesn't have all the SQL functionality of Postgres.

That said, Comdb2 has an extremely robust foundational architecture on top of which they can build out all the SQL features (which is what the Comdb2 team seems to be working on).

Adding all the features of Comdb2 to Postgres, on the other hand, would require a huge change, if not a complete rewrite of Postgres. Looking at the current roadmap of Postgres, there is nothing even close to Comdb2's distributed features on the horizon.

Comparison to CockroachDB, Cassandra, DynamoDB and CosmosDB

All of these databases are focused on Big Data, which Comdb2 is not. CockroachDB comes close to feeling like Comdb2 at first glance, but it too focuses on scalability first.

It would be nice if all these databases borrowed some of the 'perfect availability' ideas from Comdb2, especially moving retries to the driver level.

Current State of Comdb2

Bloomberg has open sourced their in-progress version, 7.0, of Comdb2 on GitHub. The emphasis of their current work seems to be on adding more SQL features to Comdb2, which seems the right direction to go in. The current version is not 'released' yet, though a summer release seems to be planned.

We are holding out on doing any performance measurements till an official release.

Conclusion

Comdb2 is a unique database with novel ideas.

While the rest of the database world seems obsessed with 'Big Data' and 'Scalability', Comdb2 takes the humble 'SQL database' to superhuman levels of robustness and ease of use. Its distributed features lend themselves well to a containerized world. It does everything we wish PostgreSQL did from an administration point of view.

We feel Comdb2 has the highest chance of replacing PostgreSQL as it evolves its SQL functionality. It is a database that the rest of the database vendors should pay attention to and borrow ideas from.
