Thursday, January 19, 2012

Thoughts on SimpleDB, DynamoDB and Cassandra

I've been getting a lot of questions about DynamoDB, and these are my personal thoughts on the product, and how it fits into cloud architectures.

I'm excited to see the release of DynamoDB, it's a very impressive product with great performance characteristics. It should be the first option for startups and new cloud projects on AWS. I also think it marks the turning point on solid state disks, they will be the default for new database products and benchmarks going forward.

There are a few use cases where SimpleDB may still be useful, but DynamoDB replaces it in almost all cases. I've talked about the history of Netflix use of SimpleDB before, but it's relevant to the discussion on DynamoDB, so here goes.

When Netflix was looking at moving to cloud about three years ago we had an internal debate about how to handle storage on AWS. There were strong proponents for using MySQL, SimpleDB was fairly new, and other alternatives were nascent NoSQL projects or expensive enterprise software. We started some pathfinder projects to explore the two alternatives and decided to port an existing MySQL app to AWS, while building a replication pipeline that copied data out of our Oracle datacenter systems into SimpleDB to be consumed in the cloud. The MySQL experience showed that we would have trouble scaling, and SimpleDB seemed reliable, so we went ahead and kept building more data sources on SimpleDB, with large blobs of data on S3.

Along the way we put memcached in front of SimpleDB and S3 to improve read latency. The durability of SimpleDB is its strongest point, we have had Oracle and SAN data corruption bugs in the Datacenter over the last few years, but never lost or corrupted any SimpleDB data. The limitations of SimpleDB are its biggest problem. We worked around limits on table size, row and attribute size, and per-request overhead caused by http requests needing to be authenticated for every call.

So the lesson here is that for a first step into NoSQL, we went with a hosted solution so that we didn't have to build a team of experts to run it, and we didn't have to decide in advance how much scale we needed. Starting again from scratch today, I would probably go with DynamoDB. It's a low "friction" and developer friendly solution.

Late in 2010 we were planning the next step, turning off Oracle and making the cloud the master copy of the data. One big problem is that our backups and archives were based on Oracle, and there was no way to take a snapshot or incremental backup of SimpleDB. The only way to get data out in bulk is to run "SELECT * FROM table" over HTTP and page the requests. This adds load, takes too long, and costs a lot because SimpleDB charges for the time taken in SELECT calls.

We looked at about twenty NoSQL options during 2010, trying to understand what the differences were, and eventually settled on Cassandra as our candidate for prototyping. In a week or so, we had ported a major data source from SimpleDB to Cassandra and were getting used to the new architecture, running benchmarks etc. We evaluated some other options, but decided to take Cassandra to the next stage and develop a production data source using it.

The things we liked about Cassandra were that it is written in Java (we have a building full of Java engineers), it is packed full of state of the art distributed systems algorithms, we liked the code quality, we could get commercial support from Datastax, it is scalable and as an extra bonus it had multi-region support. What we didn't like so much was that we had to staff a team to own running Cassandra for us, but we retrained some DBAs and hired some good engineers including Vijay Parthasarathy, who had worked on the multi-region Cassandra development at Cisco Webex and who recently became a core committer on the Apache Cassandra project. We also struggled with the Hector client library, and have written our own (which we plan to release soon). The blessing and a curse of Cassandra is that it is an amazingly fast moving project. New versions come fast and furiously, which makes it hard to pick a version to stabilize on, however the changes we make turn up in the mainstream releases after a few weeks. Saying "Cassandra doesn't do X" is more of a challenge than a statement. If we need "X" we work with the rest of the Apache project to add it.

Throughout 2010 the product teams at Netflix gradually moved their backend data sources to Cassandra, we worked out the automation we needed to do easy self service deployments and ended up with a large number of clusters. In preparation for the UK launch we also made changes to Cassandra to better support multi-region deployment on EC2, and we are currently running several Cassandra clusters that span the US and European AWS regions.

Now that DynamoDB has been released, the obvious question is whether Netflix has any plans to use it. The short answer is no, because it's a subset of the Cassandra functionality that we depend on. However that doesn't detract from the major step forward from SimpleDB in performance, scalability and latency. For new customers, or people who have outgrown the scalability of MySQL or MongoDB, DynamoDB is an excellent starting point for data sources on AWS. The advantages of zero administration combined with the performance and scalability of a solid state disk backend are compelling.

Personally my main disappointment with DynamoDB is that it doesn't have any snapshot or incremental backup capability. The AWS answer is that you can extract data into EMR then store it in S3. This is basically the same answer as SimpleDB, it's a full table scan data extraction (which takes too long and costs too much and isn't incremental). The mechanism we built for Cassandra leverages the way that Cassandra writes immutable files to get a callback and compress/copy them to S3 as they are written, it's extremely low overhead. If we corrupt our data with a code bug and need to roll back, or take a backup in production and restore in test, we have all the files archived in S3.

One argument against DynamoDB is that DynamoDB is on AWS only, so customers could get locked in, however it's easy to upgrade applications from SimpleDB, to DynamoDB and to Cassandra. They have similar schema models, consistency options and availability models. It's harder to go backwards, because Cassandra has more features and fewer restrictions. Porting between NoSQL data stores is trivial compared to porting between relational databases, due to the complexity of the SQL language dialects and features and the simplicity of the NoSQL offerings. Starting out on DynamoDB then switching to Cassandra when you need more direct control over the installation or Cassandra specific features like multi-region support is a very viable strategy.

As early adopters we have had to do a lot more pioneering engineering work than more recent cloud converts. Along the way we have leveraged AWS heavily to accelerate our own development, and built a lot of automation around Cassandra. While SimpleDB has been a minor player in the NoSQL world DynamoDB is going to have a much bigger impact. Cassandra has matured and got easier to use and deploy over the last year but it doesn't scale down as far. By that I mean a single developer in a startup can start coding against DynamoDB without needing any support and with low and incremental costs. The smallest Cassandra cluster we run is six m1.xlarge instances spread over three zones with triple replication.

I've been saying for a while that 2012 is the year that NoSQL goes mainstream, DynamoDB is another major step in validating that move. The canonical CEO to CIO conversation is moving from 2010: "what's our cloud strategy?", 2011: "what's our big data strategy?" to 2012: "what's our NoSQL strategy?".

9 comments:

  1. Nice article!

    We are considering using DynamoDB for new projects, and I saw some blogs comparing it to Cassandra. Cassandra seems more fully featured and mature. The killer service would be Cassandra running on AWS' SSD's as a managed service.

    AWS is planning to have SSD storage as a service, so we can still dream about that. How fast and scalable would that be? :)

    Cheers

    ReplyDelete
  2. You say that your smallest Casandra cluster has 6 hosts but surely this isn't the smallest useable Cassandra cluster possible (for dev), is it?

    ReplyDelete
  3. You can run an embedded version of Cassandra on your box for things like unit testing. In production, you would likely want at least two nodes to get the availability benefits of Cassandra, but you could run with a single node.

    Enjoyed the article. Good info. Very interested in hearing about the Hector competitor.

    ReplyDelete
  4. Thanks for the insight into your current NoSQL configuration and your thoughts on DynamoDB. I am curious if you can you provide any additional detail into the Cassandra implementation you are using? We have several multi-region Cassandra rings currently in production, and have advised numerous customers on configuring their own Cassandra implementations. The biggest hurdle we had to overcome was with regard to managing the security of the Gossip traffic across regions. While using secure Gossip with self-signed certificates was relatively easy to configure, the management of the security groups in different regions to restrict access to only the nodes in the ring proved a laborious manual process (especially when adding a new node to the ring – security groups in seven different regions needed to be updated). Fortunately the Cassandra concepts of “racks” and “datacenters” aligned nicely with AZs and regions, so a replication strategy was fairly easy to implement, but the management of the security continues to be problematic.
    Also, can you elaborate on your experiences with MongoDB and why you decided against it in favor of Cassandra? We (I am in the PS group at RightScale) see an ever-increasing number of customer requests for a multi-shard MongoDB solution – far more than we hear in regard to Cassandra.
    Thanks again for an informative (as always) post.

    ReplyDelete
  5. Hi Brian, our Cassandra tooling and automation takes care of the security groups, token assignment, backup/restore and growing the ring for both single and multiple region clusters. Over the coming months we plan to open source this work.

    We use MongoDB for some internal, non-customer-facing tools, where it's sophisticated query capability is useful. We also tried to use it as part of a data logging system, but that pushed it beyond it's scalability limit, so we know where it stops. MongoDB is great for developers, scales better than MySQL, but ultimately doesn't keep scaling up and doesn't have as good availability characteristics as Cassandra. At our scale, for a highly available global consumer web service we think Cassandra is the right choice, it's also getting easier to use over time.

    For our developers, we have a shared Cassandra cluster that anyone can create keyspaces in. For performance testing or production it takes a few minutes for the developer to create their own cluster of any size.

    ReplyDelete
    Replies
    1. Hi Adrian,
      Thanks for the additional insight into your Cassandra setup. I (and I am sure many others) look forward to seeing the innovative solutions in your Cassandra automation tools. Many thanks in advance to you and your team for your generous plan to open source this work. Your thoughts on and experiences with MongoDB are much appreciated as well. Our current customers have not yet hit the scalability limits as you did. If they do, it will most likely be because their applications will have grown in popularity and usage, and that’s one of those “good problems to have”; and at that time we will guide them to other solutions. Thanks again.

      Delete
  6. We (I am in the PS group at RightScale) see an ever-increasing number of customer requests for a multi-shard MongoDB solution – far more than we hear in regard to Cassandra.

    Honestly Brian, a big reason for this is MongoDB's excellent marketing push and reasonable product placement. They are halfway between MySQL and a DB based on Dynamo, so they're a good gateway drug to into the "NoSQL" space. And they have a lot of traction.

    The concept that it can be sharded is a big selling point. It's very simple to get a single server or replica set working, so it's really low friction to get started.

    Of course, the reason you're hearing a lot of requests for building a sharded setup is the fact that sharding with MongoDB is very complicated. Managing sharding becomes this extensive process of tracking a bunch of special nodes. Keeping the system going requires a similar management of special nodes.

    Of course, this is not apparent right off the bat. So users who suddenly "get big" also realize that they don't have the skilled resources to manage this new complex cluster. Hence going to people like Rightscale for expertise (there is very little true expertise on scaling MongoDB outside of the 10gen staff and existing their clients, neither of whom are out looking for jobs).

    ReplyDelete
  7. DynamoDB creates an interesting dilemma for those who want to use AWS but who have a problem best served by a standard relational database. At present EC2 is just not a good platform for anything but a small scale relational database - the IOPS just aren't there, and I haven't found any promises about a flush from the host really only returning once the data is safe.

    So, the dilemma is: do I use DynamoDB with its SSDs to get adequate performance, even though I'm going to have to jump through hoops and take data integrity risks because it doesn't have the ACID characteristics or flexibility of a conventional SQL database? Currently, the right answer for a system with a flexible number of servers talking to more than small-scale relational databases is not to use AWS, and to use a vendor which offers hybrid cloud/non-cloud hosting.

    100 IOPS with EBS is really quite pathetic when 10k is easily achievable with only one pair of fairly standard SSDs. IMO, general-use SSDs would be far more important for EC2 than DynamoDB because it would open it up to a whole new class of systems with intermediate scaling requirements.

    So lets hope this is only a first step for SSDs on AWS!

    ReplyDelete
  8. Adrian,
    Thanks for a peek under the Netflix hood and your thoughts on DynamoDB. I'm pleased to say that your main concern will be resolved in the very near future - we're working with AWS to build a DynamoDB backup & restore service. We currently offer automated daily backups for SimpleDB, Google Apps and Salesforce, and we should be adding a DynamoDB solution in the next few weeks.

    Of course it's a 3rd party solution, but AWS does not seem to have any plans to provide native backup/recovery support. We're also working with AWS to try to resolve the full table scan, which as you point out, adds overhead/costs that are not really acceptable for daily delta backups.

    I realize it's a moot point since you've migrated to Cassandra, but just wanted to let others know that there is a backup solution for SimpleDB and one on the way for DynamoDB. We also provide additional utility with the ability to restore a backup to another account/region/domain either in a destructive or non-destructive manner.

    Thanks again for the informative post and letting me share a little CloudAlly PR...

    Murray Moceri
    Marketing Director, CloudAlly

    ReplyDelete