Dealing with a bad symbolic reference in scala

Every time this hits me I have to think about it. The compiler barfs at you with something ambiguous like

[error] error: bad symbolic reference. the classpath might be incompatible with the version used when compiling Foo.class.

What this is really saying is that Foo.class references some import or class whose namespace isn’t on the classpath or has fields missing. I usually get this when I have a project-level circular dependency via transitive includes, i.e.

Repo 1/
  /project A
  /project B -> depends on C and A
Repo 2
  /project C -> depends on A

So here project C pulls in a version of A, but that version may not be the same one that project B pulls in. If I do namespace refactoring in project A, then project B won’t compile if those old namespaces are still used by project C. It’s a mess.

Thankfully scala lets you maintain folder … Read more

Scripting deployment of clusters in asgard

We use asgard at work to do deployments in both qa and production. Our general flow is to check in, have jenkins build, have an AMI created, and then … we have to manually go to asgard and deploy it. That sucks.

However, it’s actually not super hard to write some scripts to find the latest AMI for a cluster and prepare an automated deployment pipeline from a template. Here you go:

function asgard(){
  # thin wrapper over the asgard REST API using httpie
  # usage: asgard <verb> <path>; the request body (if any) is read from stdin
  verb=$1
  url="https://my.asgard.com/us-east-1/$2"

  http ${verb} --verify=no "$url" -b
}

function next-ami(){
  # latest AMI registered for a cluster
  cluster=$1

  prepare-ami $cluster true | \
    jq ".environment.images | reverse | .[0]"
}

function prepare-ami(){
  # ask asgard to prepare a deployment payload from a template
  cluster=$1

  includeEnv=$2

  asgard GET "deployment/prepare/${cluster}?deploymentTemplateName=CreateAndCleanUpPreviousAsg&includeEnvironment=${includeEnv}"
}

function get-next-ami(){
  # prepared deployment payload with the launch config pointed at the latest AMI
  cluster=$1

  next=$(next-ami ${cluster} | jq ".id")

  prepare-ami ${cluster} "false" | jq ".lcOptions.imageId |= ${next}"
}

function start-deployment(){
  # kick off the deployment, piping the payload as the POST body
  cluster=$1
  payload=$2

  echo $payload | asgard POST "deployment/start/${cluster}"
}

The gist here is to

  • Find the next AMI image of a cluster
  • Get the prepared … Read more

Unit testing DNS failovers

Something that’s come up a few times in my career is the difficulty of validating if and when your code can handle actual DNS changes. A lot of times, testing that you have the right JVM settings and that your 3rd party clients can handle it involves mucking with hosts files, nameservers, or stuff like Route53 and waiting around. That’s hard to automate and deterministically reproduce. However, you can hook into DNS resolution in the JVM to control what gets resolved to what, and this way you can tweak the resolution in a test and see what breaks! I found some info at this blog post and cleaned it up a bit for usage in scala.

The magic sauce to pull this off is to make sure you override the default sun.net.spi.nameservice.NameServiceDescriptor. Internally in the InetAddress class it tries to load an instance of the interface NameServiceDescriptor … Read more
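
To give a flavor of the hook (this is a rough sketch, not the code from the referenced post), an in-memory resolver on JDK 8 might look something like the class below; the class name, provider name, and host map are made up, and you still have to register the descriptor via META-INF/services and point the JVM at it with the sun.net.spi.nameservice.provider.1 system property:

import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import sun.net.spi.nameservice.NameService;
import sun.net.spi.nameservice.NameServiceDescriptor;

// Sketch only: this rides on a JDK 8 internal API that was removed in later JDKs.
public class InMemoryNameServiceDescriptor implements NameServiceDescriptor {
    // Mutable map a test can tweak mid-run to simulate a DNS change/failover.
    public static final Map<String, byte[]> hosts = new ConcurrentHashMap<>();

    @Override
    public NameService createNameService() {
        return new NameService() {
            @Override
            public InetAddress[] lookupAllHostAddr(String host) throws UnknownHostException {
                byte[] address = hosts.get(host);
                if (address == null) {
                    throw new UnknownHostException(host);
                }
                return new InetAddress[] { InetAddress.getByAddress(host, address) };
            }

            @Override
            public String getHostByAddr(byte[] addr) throws UnknownHostException {
                throw new UnknownHostException();
            }
        };
    }

    // The JVM picks this up when started with
    // -Dsun.net.spi.nameservice.provider.1=dns,inmemory
    // (plus the META-INF/services registration of this descriptor).
    @Override
    public String getType() { return "dns"; }

    @Override
    public String getProviderName() { return "inmemory"; }
}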

CassieQ at the Seattle Cassandra Users Meetup

Last night Jake and I presented CassieQ (the distributed message queue on cassandra) at the seattle cassandra users meetup at the Expedia building in Bellevue. Thanks to everyone who came out and chatted with us; we certainly learned a lot and had some great conversations regarding potential optimizations to include in CassieQ.

A couple of good points that came up were how to minimize the use of compare and set with the monoton provider, and whether we can move to time UUIDs for “auto” incrementing monotons. Another interesting tidbit was the potential time-based compaction strategies being discussed, which could give a big boost given the workflow cassieq has.

But my favorite was the suggestion that we create a “kafka” mode and move the logic of storing pointer offsets out of cassieq and onto the client, in which case we could get enormous gains since we no longer … Read more

Consistent hashing for fun

I think consistent hashing is pretty fascinating. It lets you define a ring of machines that shard out data by a hash value. Imagine that your hash space is 0 to Int.Max, and you have 2 machines. Well, one machine gets all values hashed from 0 to Int.Max/2 and the other from Int.Max/2 to Int.Max. Clever. This is one of the major algorithms of distributed systems like cassandra and dynamoDB.
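
As a toy sketch of that idea (not any particular implementation), you can keep the ring in a sorted map of hash to node and route a key to the first node clockwise from its hash; the class and method names here are made up for illustration:

import java.util.SortedMap;
import java.util.TreeMap;

// Toy ring: nodes are placed by the hash of their name, and a key is owned
// by the first node at or after the key's hash, wrapping around the ring.
public class HashRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    public void addNode(String node) {
        ring.put(hash(node), node);
    }

    public void removeNode(String node) {
        ring.remove(hash(node));
    }

    public String nodeFor(String key) {
        SortedMap<Integer, String> clockwise = ring.tailMap(hash(key));
        return clockwise.isEmpty()
                ? ring.firstEntry().getValue() // wrap around to the start of the ring
                : clockwise.get(clockwise.firstKey());
    }

    private int hash(String value) {
        // a real ring would use something better distributed (e.g. murmur3)
        return value.hashCode() & Integer.MAX_VALUE;
    }
}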

For a good visualization, check out this blog post.

The fun stuff happens when you want to add replication and fault tolerance to your hashing. Now you need to have replicas and manage what happens when machines join and leave. When someone joins, you need to re-partition the space evenly and re-distribute the values that were previously held.

Something similar happens when you have a node leave: you need to make sure that whatever it was responsible for in its primary space … Read more

A toy generational garbage collector

Had a little downtime today and figured I’d make a toy generational garbage collector, for funsies. A friend of mine was once asked this as an interview question so I thought it might make for some good weekend practice.

For those not familiar, a common way of doing garbage collection in managed languages is to have the concept of multiple generations. All newly created objects go in gen0. New objects are also the most likely to be destroyed, as there is a lot of transient stuff that goes on in an application. If an element survives a gc round it gets promoted to gen1. Gen1 doesn’t get GC’d as often. Same with gen2.
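
A bare-bones sketch of that promotion logic might look like the following; the object and collector types are invented for illustration and skip real concerns like write barriers and compaction:

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy three-generation collector: mark everything reachable from the roots,
// promote survivors of the collected generation, drop the rest.
public class ToyGc {
    public static class Obj {
        final List<Obj> references = new ArrayList<>();
    }

    private final List<List<Obj>> generations =
            List.of(new ArrayList<Obj>(), new ArrayList<Obj>(), new ArrayList<Obj>());

    public void allocate(Obj obj) {
        generations.get(0).add(obj); // new objects always start life in gen0
    }

    public void collect(int gen, List<Obj> roots) {
        Set<Obj> live = new HashSet<>();
        roots.forEach(root -> mark(root, live));

        List<Obj> survivors = new ArrayList<>();
        for (Obj obj : generations.get(gen)) {
            if (live.contains(obj)) {
                survivors.add(obj); // survived a collection: gets promoted
            }
        }

        generations.get(gen).clear();
        generations.get(Math.min(gen + 1, 2)).addAll(survivors); // gen2 is the ceiling
    }

    private void mark(Obj obj, Set<Obj> live) {
        if (live.add(obj)) {
            obj.references.forEach(ref -> mark(ref, live));
        }
    }
}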

A GC cycle usually consists of iterating through application root nodes (so starting at main and traversing down) and checking which generation each reference lives in. If we’re doing a gen1 collection, we’ll also do … Read more

RMQ failures from suspended VMs

My team recently ran into a bizarre RMQ partition failure in a production cluster. RMQ doesn’t handle partition failures well, and while you can set up auto recovery (such as suspension of minority groups), you still need to manually recover from it. The one time I’ve encountered this, I got a very useful message in the admin management page indicating that parts of the cluster were in partition failure, but this time things went weird.

Symptoms:

  • Could not gracefully restart rmq using rabbitmqctl stop_app/start_app. The commands would stall
  • Could not list queues for any vhost. rabbitmqctl list_queues -p [vhost] would stall
  • Logs showed partition failure
  • People could not consistently log into the admin api without stalls or other strange issues, even when clearing browsing data/local storage, going incognito, or using different browsers
  • Rebooting the master did not help

In the end the solution was to do an NTP time sync, turn off all clustered slaves … Read more

Logging the easy way

This is a cross post from the original posting at godaddy’s engineering blog. It’s a project I have spent considerable time working on and leverage a lot.

Logging is a funny thing. Everyone knows what logs are and everyone knows you should log, but there are no hard and fast rules on how to log or what to log. Your logs are your first line of defense against figuring out issues live. Sometimes logs are the only line of defense (especially in time sensitive systems).

That said, in any application good logging is critical. Debugging an issue can be made ten times easier with simple, consistent logging. Inconsistent or poor logging can actually make it impossible to figure out what went wrong in certain situations. Here at GoDaddy we want to make sure that we encourage logging that is consistent, informative, and easy to search.

Enter the GoDaddy … Read more

Serialization of lombok value types with jackson

For anyone who uses lombok with jackson, you should check out jackson-lombok, which is a fork from xebia that allows lombok value types (and lombok-generated constructors) to be json creators.

The original authors compiled their version against jackson-core 2.4.* but the new version uses 2.6.*. Props go to github user kazuki-ma for submitting a PR that actually addresses this; Paradoxical just took those fixes and published them.

Anyways, now you get the niceties of being able to do:

@Value
public class ValueType{
    @JsonProperty
    private String name;
    
    @JsonProperty
    private String description;
}

And instantiate your mapper:

new ObjectMapper().setAnnotationIntrospector(new JacksonLombokAnnotationIntrospector());
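
With that in place, a round trip looks roughly like this; a minimal sketch assuming the ValueType and introspector above are on the classpath (the sample values are made up):

import com.fasterxml.jackson.databind.ObjectMapper;

public class Example {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper()
                .setAnnotationIntrospector(new JacksonLombokAnnotationIntrospector());

        // serialize the immutable value type...
        String json = mapper.writeValueAsString(new ValueType("some name", "some description"));

        // ...and read it back in through the lombok-generated constructor
        ValueType parsed = mapper.readValue(json, ValueType.class);

        System.out.println(parsed.getName()); // "some name"
    }
}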

Enjoy!… Read more

Cassandra DB migrations

When building any application that involves persistent data storage, you usually need a way to upgrade and change your database using a set of scripts. Working with patterns like ActiveRecord, you get easy up/down migrations by version. But with cassandra, which traditionally was schemaless, there aren’t that many tools out there to do this.

One thing we have been using at my work and at paradoxical is a simple java based cassandra loader tool that does “up” migrations based on db version scripts.

Assuming you have a folder in your application that stores db scripts like

db/scripts/01_init.cql
db/scripts/02_add_thing.cql
..
db/scripts/10_migrate_users.cql
..

Then each script corresponds to a particular db version state, and its current state depends on all previous states. Our cassandra loader tracks db versions in a db_version table and lets you apply runners against a keyspace to move your schema (and data) to the target version. If your … Read more
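
The core loop is simple enough to sketch. This isn’t the loader itself, just an illustration of the idea against the Datastax 3.x driver; the db_version schema and the naive statement splitting are assumptions:

import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Sketch of an "up"-only runner: find the current version in db_version,
// apply every later script in order, and record each one as it completes.
public class Migrator {

    // scripts are expected sorted by their numeric prefix (01_, 02_, ...)
    public void migrate(Session session, List<Path> scripts, int targetVersion) throws Exception {
        for (int version = currentVersion(session) + 1; version <= targetVersion; version++) {
            String cql = Files.readString(scripts.get(version - 1));

            // naive statement splitting, fine for simple cql scripts
            for (String statement : cql.split(";")) {
                if (!statement.isBlank()) {
                    session.execute(statement);
                }
            }

            session.execute("INSERT INTO db_version (version) VALUES (" + version + ")");
        }
    }

    private int currentVersion(Session session) {
        Row row = session.execute("SELECT MAX(version) AS version FROM db_version").one();
        return row == null ? 0 : row.getInt("version");
    }
}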