Chaos monkey for docker

I work at a mostly AWS shop, and while we still have services on raw EC2, nearly all of our new development is on Amazon ECS in docker. I like docker because it provides a unified unit of operation (a container) that makes it easy to build shared tooling regardless of language/application. It also lets you reproduce your applications local in the same environment they run remote, as well as starting fast and deploying fast.

However, many services run on a shared ECS node in a cluster, and so while things like Chaos Monkey may run around turning nodes off it’d be nice to have a little less of an impact during working hours while still being able to stress recovery and our alerting.

This is actually pretty easy though with a little docker container we call The Beast. All the beast does is run on a ECS Scheduled … Read more

Tracking batch queue fanouts

When working in any resilient distributed system invariably queues come into play. You fire events to be handled into a queue, and you can horizontally scale workers out to churn through events.

One thing though that is difficult to do is to answer a question of when is a batch of events finished? This is a common scenario when you have a single event causing a fan-out of other events. Imagine you have an event called ProcessCatalog and a catalog may have N catalog items. The first handler for ProcessCatalog may pull some data and fire N events for catalog items. You may want to know when all catalog items are processed by downstream listeners for the particular event.

It may seem easy though right? Just wait for the last item to come through. Ah, but distributed queues are loosely ordered. If an event fails and retries what used to … Read more

Sbt sonatypeRelease on Travis

I figured I’d drop a quick note here for anyone else running into an issue. If you are trying to do a sonatypeRelease via sbt 1.0.3 on travis and getting a

Credentials file /home/travis/.sbt/credentials does not exist

Even though you are supplying your own inline creds, just drop in a fake creds file into that location:

realm=Sonatype Nexus Repository Manager


cp scripts/creds.fake /home/travis/.sbt/credentials

And save yourself the 2 days of headache I have had :p… Read more

Functors in scala

A coworker of mine and I frequently talk about higher kinded types, category theory, and lament about the lack of unified types in scala: namely functors. A functor is a fancy name for a thing that can be mapped on. Wanting to abstract over something that is mappable comes up more often than you think. I don’t necessarily care that its an Option, or a List, or a whatever. I just care that it has a map.

We’re not the only ones who want this. Cats, Shapeless, Scalaz, all have implementations of functor. The downside there is that usually these definitions tend to leak throughout your ecosystem. I’ve written before about ecosystem and library management, and it’s an important thing to think about when working at a company of 50+ people. You need to think long and hard about putting dependencies on things. Sometimes you can, if those libraries have … Read more

Tracing High Volume Services

This post was originally posted at

We like to think that building a service ecosystem is like stacking building blocks. You start with a function in your code. That function is hosted in a class. That class in a service. That service is hosted in a cluster. That cluster in a region. That region in a data center, etc. At each level there’s a myriad of challenges.

From the start, developers tend to use things like logging and metrics to debug their systems, but a certain class of problems crops up when you need to debug across services. From a debugging perspective, you’d like to have a higher projection of the view of the system: a linearized view of what requests are doing. I.e. You want to be able to see that service A called service B and service C called service D at the granularity of single requests.… Read more

From Thrift to Finatra

Originally posted on the curalate engineering blog

There are a million and one ways to do (micro-)services, each with a million and one pitfalls. At Curalate, we’ve been on a long journey of splitting out our monolith into composable and simple services. It’s never easy, as there are a lot of advantages to having a monolith. Things like refactoring, code-reuse, deployment, versioning, rollbacks, are all atomic in a monolith. But there are a lot of disadvantages as well. Monoliths encourage poor factoring, bugs in one part of the codebase force rollbacks/changes of the entire application, reasoning about the application in general becomes difficult, build times are slow, transient build errors increase, etc.

To that end our first foray into services was built on top of Twitter Finagle stack. If you go to the page and can’t figure out what exactly finagle does, I don’t blame you. The documentation is … Read more

The HTTP driver pattern

Yet another SOA blog post, this time about calling services. I’ve seen a lot of posts, articles, even books, on how to write services but not a good way about calling services. It may seem trivial, isn’t calling a service a matter of making a web request to one? Yes, it is, but in a larger organization it’s not always so trivial.

Distributing fat clients

The problem I ran into was the service stack in use at my organization provided a feature rich client as an artifact of a services build. It had retries, metrics, tracing with zipkin, etc. But, it also pulled in things like finagle, netty, jackson, and each service may be distributing slightly different versions of all of these dependencies. When you start to consume 3, 4, 5 or more clients in your own service, suddenly you’ve gotten into an intractable mess of dependencies. Sometimes there’s no … Read more

Bit packing Pacman

Haven’t posted in a while, since I’ve been heads down in building a lot of cool tooling at work (blog posts coming), but had a chance to mess around a bit with something that came up in an interview question this week.

I frequently ask candidates a high level design question to build PacMan. Games like pacman are fun because on the surface they are very simple, but if you don’t structure your entities and their interactions correctly the design falls apart.

At some point during the interview we had scaled the question up such that there was now a problem of knowing at a particular point in the game what was nearby it. For example, if the board is 100000 x 100000 (10 billion elements) how efficiently can we determine if there is a nugget/wall next to us? One option is to store all of these entities in a … Read more

Strongly typed http headers in finatra

When building service architectures one thing you need to solve is how to pass context between services. This is usually stuff like request id’s and other tracing information (maybe you use zipkin) between service calls. This means that if you set request id FooBar123 on an entrypoint to service A, if service A calls service B it should know that the request id is still FooBar123. The bigger challenge is usually making sure that all thread locals keep this around (and across futures/execution contexts), but before you attempt that you need to get it into the system in the first place.

I’m working in finatra these days, and I love this framework. It’s got all the things I loved from dropwizard but in a scala first way. Todays challenge was that I wanted to be able to pass request http headers around between services in a typesafe way that … Read more