CassieQ at the Seattle Cassandra Users Meetup

Last night Jake and I presented CassieQ (the distributed message queue on cassandra) at the seattle cassandra users meetup at the Expedia building in Bellevue. Thanks for everyone who came out and chatted with us, we certainly learned a lot and had some great conversations regarding potential optimizations to include in CassieQ.

A couple good points that came up where how to minimize the use of compare and set with the monoton provider, whether we can move to time UUID’s for “auto” incrementing monotons. Another interesting tidbit was the discussion of using potential time based compaction strategies that are being discussed that could give a big boost given the workflow cassieq has.

But my favorite was the suggestion that we create “kafka” mode and move the logic of storing pointer offsets out of cassieq and onto the client, in which case we could get enormous gains since we no longer … Read more

Consistent hashing for fun

I think consistent hashing is pretty fascinating. It lets you define a ring of machines that shard out data by a hash value. Imagine that your hash space is 0 -Int.Max, and you have 2 machines. Well one machine gets all values hashed from 0 -Int.Max/2 and the other from Int.Max/2 -Int.Max. Clever. This is one of the major algorithms of distributed systems like cassandra and dynamoDB.

For a good visualization, check out this blog post.

The fun stuff happens when you want to add replication and fault tolerance to your hashing. Now you need to have replicants and manage when machines join and add. When someone joins, you need to re-partition the space evenly and re-distribute the values that were previously held.

Something similar when you have a node leave, you need to make sure that whatever it was responsible for in its primray space … Read more

A toy generational garbage collector

Had a little downtime today and figured I’d make a toy generational garbage collector, for funsies. A friend of mine was once asked this as an interview question so I thought it might make for some good weekend practice.

For those not familiar, a common way of doing garbage collection in managed languages is to have the concept of multiple generations. All newly created objects go in gen0. New objects are also the most probably to be destroyed, as there is a lot of transient stuff that goes in an application. If an element survives a gc round it gets promoted to gen1. Gen1 doesn’t get GC’d as often. Same with gen2.

A GC cycle usually consists of iterating through application root nodes (so starting at main and traversing down) and checking to see where a reference lays in which generation. If we’re doing a gen1 collection, we’ll also do … Read more

RMQ failures from suspended VMs

My team recently ran into a bizarre RMQ partition failure in a production cluster. RMQ doesn’t handle partition failures well, and while you can set up auto recovery (such as suspension of minority groups) you need to manually recover from it. The one time I’ve encountered this I got a very useful message in the admin managment page indicating that parts of the cluster were in partition failure, but this time things went weird.

Symptoms:

  • Could not gracefully restart rmq using rabbitmqctl stop_app/start_app. The commands would stall
  • Could not list queues for any vhost. rabbitmqctl list_queues -p [vhost] would stall
  • Logs showed partition failure
  • People could not consistently log into the admin api without stalls, or other strange issues even when clearing browsing data/local storage/incognito/different browsers
  • Rebooting the master did not help

In the end the solution was to do an NTP time sync, turn off all clustered slaves … Read more

Logging the easy way

This is a cross post from the original posting at godaddy’s engineering blog. This is a project I have spent considerable time working on and leverage a lot.

Logging is a funny thing. Everyone knows what logs are and everyone knows you should log, but there are no hard and fast rules on how to log or what to log. Your logs are your first line of defense against figuring out issues live. Sometimes logs are the only line of defense (especially in time sensitive systems).

That said, in any application good logging is critical. Debugging an issue can be made ten times easier with simple, consistent logging. Inconsistent or poor logging can actually make it impossible to figure out what went wrong in certain situations. Here at GoDaddy we want to make sure that we encourage logging that is consistent, informative, and easy to search.

Enter the GoDaddy … Read more

Serialization of lombok value types with jackson

For anyone who uses lombok with jackson, you should checkout jackson-lombok which is a fork from xebia that allows lombok value types (and lombok generated constructors) to be json creators.

The original authors compiled their version against jackson-core 2.4.* but the new version uses 2.6.*. Props needs to go to github user kazuki-ma for submitting a PR that actually addresses this. Paradoxical just took those fixes and published.

Anyways, now you get the niceties of being able to do:

@Value
public class ValueType{
    @JsonProperty
    private String name;
    
    @JsonProperty
    private String description;
}

And instantiate your mapper:

new ObjectMapper().setAnnotationIntrospector(new JacksonLombokAnnotationIntrospector());

Enjoy!… Read more

Cassandra DB migrations

When doing any application that involves a persistent data storage you usually need a way to upgrade and change your database using a set of scripts. Working with patterns like ActiveRecord you get easy up/down by version migrations. But with cassandra, which traditionally was schemaless, there aren’t that many tools out there to do this.

One thing we have been using at my work and at paradoxical is a simple java based cassandra loader tool that does “up” migrations based on db version scripts.

Assuming you have a folder in your application that stores db scripts like

db/scripts/01_init.cql
db/scripts/02_add_thing.cql
..
db/sripts/10_migrate_users.cql
..

Then each script corresponds to a particular db version state. It’s current state depends on all previous states. Our cassandra loader tracks db versions in a db_version table and lets you apply runners against a keyspace to move your schema (and data) to the target version. If your … Read more

Dalloc – coordinating resource distribution using hazelcast

A fun problem that has come up during the implementation of cassieq (a distributed queue based on cassandra) is how to evenly distribute resources across a group of machines. There is a scenario in cassieq where writes can be delayed, and as such there is a custom worker in the app (by queue) who watches a queue to see if a delayed write comes in and republishes the message to a bucket later on. It’s transparent to the user, but if we have multiple workers on the same queue we could potentially republish the message twice. While technically that falls within the SLA we’ve set for cassieq (at least once delivery) it’d be nice to avoid this particular race condition.

To solve this, I’ve clustered the cassieq instances together using hazelcast. Hazelcast is a pretty cool library since it abstracts away member discovery/connection and gives you events on membership … Read more

Leadership election with cassandra

Cassandra has a neat feature that lets you expire data in a column. Using this handy little feature, you can create simple leadership election using cassandra. The whole process is described here which talks about leveraging Cassandras consensus and the column expiration to create leadership electors.

The idea is that a user will try and claim a slot for a period of time in a leadership table. If a slot is full, someone else has leadership. While the leader is still active they needs to heartbeat the table faster than the columns TTL to act as a keepalive. If it fails to heartbeat (i.e. it died) then its leadership claim can be relinquished and someone else can claim it. Unlike most leadership algorithms that claim a single “host” as a leader, I needed a way to create leaders sharded by some “group”. I call this a “LeadershipGroup” and we can … Read more

Plugin class loaders are hard

Plugin based systems are really common. Jenkins, Jira, wordpress, whatever. Recently I built a plugin workflow for a system at work and have been mired in the joys of the class loader. For the uninitiated, a class in Java is identified uniquely by the class loader instance it is created from as well as its fully qualified class name. This means that foo.bar class loaded by class loader A is not the same as foo.bar class loaded by class loader B.

There are actually some cool things you can do with this, especially in terms of code isolation. Imagine your plugins are bundled as shaded jars that contain all the internal dependencies. By leveraging class loaders you can isolate potentially conflicting versions of libraries from the host application and the plugin. But, in order to communicate to the host layer, you need a strict set of shared interfaces that the … Read more