What Akka doesn’t do for you

Akka is an awesome actor system, and its abstraction away from threads is great. Still, you will sometimes have to deal with threads yourself, because Akka can’t solve everything.

On my latest project I got the chance to use Akka with ZooKeeper, Cassandra, Spray, and a lot of the asynchronous world. Whenever Akka needs to step out of its actor world, the ugly truth hits you before you know it. Here are a few things I learned:

  • Be careful with your dispatcher. Sometimes you need to move work off the default-dispatcher onto a separate one to handle blocking operations (like ZooKeeper’s ops).
  • Be careful with anything that blocks inside your actor, say a Cassandra update or query. If you are not using the `async` mode, the actor can perform really badly. (It was worse in my case, because that actor had some dispatching responsibilities, which effectively made it a bottleneck.)
  • Be careful with other `async` operations that run in a different thread pool. When I converted the Cassandra ops to async, they ran on a separate thread pool, and messages from that pool back to Akka must go through an explicit ActorRef/ActorSelection, because the actor’s `context` is no longer available there.
  • Be careful not to overwhelm the slower components (sorry, Cassandra again). I did all I could to reduce the number of operations against Cassandra. I only had access to a shared cluster, where operations typically took far longer than tens of milliseconds, usually 200 to 500 ms. To cope with that much latency I had to creatively merge operations into far fewer ones. Interesting as it might sound, it is more complexity than you would normally like to handle.
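To make the dispatcher and thread-pool points a bit more concrete, here is a minimal sketch in plain Java. It uses only the JDK, not Akka, and it is not the code from the project. The dedicated `ExecutorService` plays the role of a separate dispatcher for blocking work, and the `replyTo` callback stands in for an ActorRef that was captured before going async. The names `BlockingBridge`, `run` and `replyTo` are made up for this illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Plain-JDK sketch (hypothetical names): keep blocking calls on their own pool. */
final class BlockingBridge {
    // Dedicated pool for blocking calls, analogous to a separate dispatcher,
    // so that slow I/O cannot starve the threads doing message processing.
    private final ExecutorService blockingPool;

    BlockingBridge(int threads) {
        this.blockingPool = Executors.newFixedThreadPool(threads);
    }

    /**
     * Runs blockingCall on the dedicated pool and hands the result to replyTo.
     * replyTo must be captured by the caller up front: the callback normally
     * runs on a pool thread, where no caller-local context is available.
     */
    <T> CompletableFuture<Void> run(Supplier<T> blockingCall, Consumer<T> replyTo) {
        return CompletableFuture
                .supplyAsync(blockingCall, blockingPool)
                .thenAccept(replyTo);
    }

    void shutdown() {
        blockingPool.shutdown();
    }
}
```

A caller would do something like `bridge.run(() -> slowLookup(key), result -> target.accept(result))`, where `slowLookup` and `target` are again placeholders. The point is only that the reply target is passed in explicitly instead of being looked up from inside the callback.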

Some more details about the `merge`:

  • consecutive `select`s are merged into one (using an `in` query on the primary key)
  • consecutive `insert`s are merged into one batched statement
  • consecutive `delete`s are merged into one (using an `in` query on the primary key)
  • the merge is capped at 64 operations, and prepared statements for sizes 1 to 64 are created and cached
  • this could be done with a custom Akka mailbox that groups messages of the same type (save, delete, load, etc.) into combined messages.
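The merging rules above can be sketched roughly like this in plain Java. It is a simplified illustration under my own assumptions, not the actual mailbox code: `OpMerger`, `Op`, `Batch` and `Kind` are hypothetical names, and an operation is reduced to a kind plus a primary key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Sketch: merge runs of consecutive same-kind operations, capped per batch. */
final class OpMerger {
    enum Kind { SELECT, INSERT, DELETE }

    /** One pending operation against a single primary key (hypothetical shape). */
    static final class Op {
        final Kind kind;
        final String key;
        Op(Kind kind, String key) { this.kind = kind; this.key = key; }
    }

    /** A run of same-kind operations that can go out as one statement. */
    static final class Batch {
        final Kind kind;
        private final List<String> keys = new ArrayList<>();
        Batch(Kind kind) { this.kind = kind; }
        List<String> keys() { return Collections.unmodifiableList(keys); }
    }

    static final int MAX_MERGE = 64; // the cap mentioned in the text

    static List<Batch> merge(List<Op> ops) {
        return merge(ops, MAX_MERGE);
    }

    static List<Batch> merge(List<Op> ops, int cap) {
        if (cap < 1) throw new IllegalArgumentException("cap must be >= 1");
        List<Batch> out = new ArrayList<>();
        Batch current = null;
        for (Op op : ops) {
            // Start a new batch when the kind changes or the current one is full.
            if (current == null || current.kind != op.kind || current.keys.size() >= cap) {
                current = new Batch(op.kind);
                out.add(current);
            }
            current.keys.add(op.key);
        }
        return out;
    }
}
```

Only consecutive operations are merged, so a `select` that arrives after an `insert` starts a new batch instead of joining an earlier `select` batch. That keeps the original ordering between reads and writes.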

This `merge` mechanism let me dramatically reduce Cassandra’s network and coordination overhead, and gave me the much higher effective throughput I needed.
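As for the cached statements for sizes 1 to 64, the idea can be sketched as below. This is an assumption-laden illustration: it caches only the CQL text for an `in` query with a given number of placeholders, whereas real code would cache the driver’s prepared statement object. `InQueryCache`, `selectFor`, and the table and column names are all hypothetical.

```java
import java.util.Collections;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Sketch: one cached `IN (?, ..., ?)` query text per key count, up to a cap. */
final class InQueryCache {
    private final String table;      // hypothetical table name
    private final String keyColumn;  // hypothetical primary key column
    private final int maxKeys;       // the merge cap, 64 in the text
    private final ConcurrentMap<Integer, String> cache = new ConcurrentHashMap<>();

    InQueryCache(String table, String keyColumn, int maxKeys) {
        this.table = table;
        this.keyColumn = keyColumn;
        this.maxKeys = maxKeys;
    }

    /** Returns the select text with n bind markers, building it once per n. */
    String selectFor(int n) {
        if (n < 1 || n > maxKeys) {
            throw new IllegalArgumentException("key count must be in 1.." + maxKeys);
        }
        return cache.computeIfAbsent(n, k ->
                "SELECT * FROM " + table + " WHERE " + keyColumn
                        + " IN (" + String.join(", ", Collections.nCopies(k, "?")) + ")");
    }
}
```

With `new InQueryCache("events", "id", 64)`, calling `selectFor(3)` yields `SELECT * FROM events WHERE id IN (?, ?, ?)`. A merged batch of n keys then picks the entry for n and binds its keys to the placeholders.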

