Friday, December 30, 2011

How Netflix gets out of the way of innovation

#defrag 2011 presentation script.

I'm the cloud architect for Netflix, but rather than tell you about why we moved Netflix to a cloud architecture or how we built our cloud architecture, I'm going to tell you what we do differently at Netflix to create a culture that supports innovation.

What is it that lets us get things done very quickly? Sometimes a bit too qwikly… How did we keep making big strategic moves, from DVD to streaming, from datacenter to public cloud, from USA-only to international, all in very short timescales and with a fairly small team of engineers?

My presentation slides are just box-shots of movies and TV shows that are available on Netflix streaming. This script is based on the notes I made to figure out what I was going to say for each box shot. If some of you see a show you didn't know we had and want to watch it, that would make me happy; you can click on the box shot to visit that movie at Netflix. They were all available for streaming in the USA at the time of writing.



I've attempted to match the box shots loosely as cues to what I'm saying, but I've also used a musical theme in places since this is for Defrag and Defrag rocks!



Netflix is now one of the largest sites that runs almost entirely on public cloud infrastructure. We have become a poster child for how to build an architecture that takes full advantage of the Amazon Web Services cloud. But when I talk to other large companies about what we have done, they seem to have a lot of reasons why they couldn't or didn't do what we did, even if they wanted to.



Why is that? Why are we heading in one direction while everyone else is going the other way? Are we crazy or are they zombies? Well, I've worked at other large companies so I have some perspective on the issues.



Before I joined Netflix I worked at eBay for a few years, and helped found eBay Research Labs. This was set up because eBay felt it wasn't innovating fast enough, and they were looking for the one missing ingredient that would drive more innovation into the company.



This is a fairly common approach. "You guys go and be innovative, then hopefully we will find ways to spread it around a bit." Unfortunately the end result of setting up a separate group to add innovation to a big company is more comical than useful.



The most interesting projects got tied in knots; they trod on too many toes or were too scary. We visited Xerox PARC and IBM Santa Teresa Labs to discuss how they were set up and try to learn what might work, and we went to an Innovation Forum in New York. That was weird: some of the primary examples they were talking about emulating were eBay and PayPal!



The projects that did get out were minor tweaks to existing ideas; they could be fun, but ultimately they were not very interesting.



So I had to break out of there and find something new to do, and in 2007 I joined Netflix just as they first launched streaming.



One of the key attractions for me was the Netflix culture I heard about in the interviews. I wanted to get inside their heads and figure out whether what they were describing was real, and if so, whether it was sustainable as the company grew.



What I found out over the next few years is that the culture is what enables innovation, so that Netflix can get things done quickly that other companies are too scared or too slow to try. The rest of this talk is about the key things that we do differently at Netflix.



Before I get into them I want to warn you that even with a roadmap and a guide, you probably won't be able to follow this path if you are in a large established company. Your existing culture won't let you. However, if you are creating a new company from scratch, I hope you can join me in what I think is the future of cool places to work.



Here's the key insight. It's the things you don't do that make the difference. You don't add innovation to a company culture, you get out of its way.

I'm mostly going for SciFi at this point, because it's going to sound like I was beamed in from the future to some of you.



Let me repeat that. You have to set up a company that doesn't do many of the things you would consider business as usual. That's why it's so hard to retrofit.



How about some audience participation? Hands up everyone who hates answering questions by putting their hands up.



Who works at a company that has more than one product line? Do you get along? The problem is that the company loses focus and has trouble allocating resources where they are needed, so there are big fights. Pick one big thing and do it well. For Netflix, our addressable market is everyone in the universe who likes watching movies and TV shows; that should keep us busy for a while.



Who has teams spread over multiple sites and countries? We don't. It adds communication and synchronization overhead that slows your organization down. For the geeks, think of Amdahl's law applied to people. We have as many people as possible in the same building on the same site. We are planning a new bigger building to make sure we can keep everyone close together. High bandwidth, low latency communication.



Who's worked for a place that bought another company, then ran it into the ground, laid everyone off and wrote down the value? Over and over again. It's crazy. I don't think Netflix has ever bought another company. Acquisitions are a huge disruption to the culture; if you see something you like, just hire away their best people and out-execute them in the market.



Who has junior engineers, graduate hires and interns writing code? We don't. We find that engineers who cost twice as much are far more than twice as productive, and need much less management overhead. Reducing management overhead is a key enabler for an innovative culture. Engineers who don't need to be managed are worth paying extra for. We are tiny compared to companies like Google; they take on raw talent and develop it, while we sometimes take a chance on someone with only five years' experience.



Who has an architecture review board and centralized coding standards? We don't have that either. What we do have is tooling that creates a path of least resistance, which combined with peer pressure keeps quality high. The engineers are free and responsible for figuring it out for themselves.



Who has an ITops team that owns keeping code running in production? We don't. The developers run what they wrote. Everyone's cell phone is in the PagerDuty rota; the trick is making sure you don't need to get called. All the ops people here have horror stories about stupid developers, and vice versa, but it doesn't have to be that way. We have one dev organization that does everything, and no IT ops org involvement in our AWS cloud deployment.



Who has to ask permission before deploying hundreds or thousands of servers? We don't. The developers use our portal directly; if it's in production they have to file a Change Management ticket to record what they did, and that's all. We've trained our developers to operate their own code. We create and destroy up to 1000 servers a day just pushing new code. AWS takes about 5 minutes to allocate 100 servers; it takes longer than that just to boot Linux on them.



Who has a centralized push cycle and has to wait for the next "train" before they can ship their code? We don't. Every team manages their own release schedule. New code updates frequently, and the pace slows for mature services. Teams are responsible for managing interface evolution and dependencies themselves. Freedom and responsibility again.



Who has project managers tracking deliverables? We don't. The line managers do it themselves. They own the resources and set the context for their teams. They have time to do this because we took the BS out of their role.
Managers have to be great at hiring, technical and hands-on enough to architect what their team does, and able to project manage to deliver results. Don't split this into three people. Reduce management overhead, minimize BS and time wasted. Teams are typically 3-7 people. Have a weekly team meeting and a 1:1 with each engineer to maintain context.



Who has a single standard for development tools? We don't. We assume developers already know how to make themselves productive. We provide some common patterns to get new hires started, like Eclipse or IntelliJ on Mac or Windows; some people use Emacs on Linux. Hire experienced engineers who care, and they will take care of code quality and standards without being told how to.



Who has to work with people they don't respect? It's much too disruptive. The only way to get high talent density is to get rid of the people who are out of their depth or coasting.



That also applies to what you might call brilliant jerks. Even if they do great work, the culture can't tolerate prima donna anti-social behavior, so people who don't trust others or share what they know don't fit in.



So does that mean we value conformity? No, but it's really important to be comfortable as part of a high performance team, so we look for people who naturally over-communicate and have a secure, confident personality type.



If you haven't experienced a high performance culture, think about what it's like to drive flat out at a race track. Some people will be too scared to deal with it and drive around white knuckled at 40 mph, some will be overconfident and crash on the first corner, but for people who fit into the high performance culture it's exhilarating to push yourself to go faster each lap, and learn from your peers without a speed limit. When you take out the BS and friction, everyone gets so much more done that productivity, innovation and rapid agile development just happen. This is the key message, removing obstacles to a high performance culture is how innovation happens throughout an organization. Doing less to get more.



We don't pay bonuses. We don't have grades other than senior engineer, manager, director, VP. We don't count the hours or the vacation days; we say "take some". Once a year we revise everyone's salary against their peers and the current market rate, based on what we are paying now to hire the best people we can find.



We also have what sounds like a crazy stock option plan: it grants options every month, they vest the same day, and they last 10 years even if you leave Netflix. The net of this is less work for managers, so they can concentrate on hiring top people, and almost everyone who leaves takes a pay cut. The test we apply is "would you fight to keep your engineers if they tried to leave?" If not, let them go now and get someone better. We don't make it hard to let people go.



Some of you may be thinking this sounds expensive, but what is the value of being incredibly productive and able to move faster than your competition? You can get out ahead and establish a leading position before anyone else realizes you are even in the game. Remember how a few years ago the "Analysts" said that Netflix the DVD company was going to get killed by other companies streaming, then all of a sudden people realized that we were streaming more bandwidth than anyone else?



So what could possibly go wrong? We had a near miss recently, we went too fast, partly because we could, got unlucky and screwed up. The good thing is that Netflix could re-plan and execute on the fixes we need very quickly as well, with no internal angst and finger-pointing. Also there was an Asteroid nearby earlier this week. By the way, my stepdaughter @raedioactive was the art director for this movie.



So, you are at a crossroads: you could be on stage with Eric Clapton, or in the audience watching and wondering why you can't do what they are doing. It's a radically different way to construct a corporate culture, it doesn't work for everyone, and we can't all be up on stage with Eric, but the talent is out there if you start by building a culture focused on talent density to find it and keep it.



Is it going to be the goats or the glory? I just told you all to stop doing things, what could be easier than that? It takes less process, fewer rules and simpler principles. Give people freedom, hold them responsible, replace the ones that can't or won't perform in that environment. Focus on talent density and conserving management attention span by removing the BS from their jobs.



This is your challenge, can you get a band together and go on a mission to save your company? Stop doing all the things that are slowing you down, and get rid of the unproductive BS that clogs up your management and engineers.



I will take questions in the comments or on twitter to @adrianco. Thank you.



Each question got a new box shot, but all the answers were musical.

Thursday, September 22, 2011

The economics of Nissan Leaf ownership

[Updated December 2012] We got our @NissanLeaf at the end of June 2011, and are very happy with it. It's fun to drive and is our first choice for any journey within range. Since the car is nearly silent and has over 200 lb-ft of torque, I often use full power for acceleration away from the lights, which quietly leaves the other cars behind. The low noise level is also great for listening to speech or music, and the steering wheel controls and Bluetooth integration with my iPhone work well for phone calls and iPod or Pandora.

Laurel usually takes it for her 66-mile round-trip commute, and if not, I take it on my 20-mile round-trip commute and on any shopping trips at the weekend. So far we have averaged about 1000 miles per month, mostly on mountain roads and freeways, which is the worst case for electrical consumption. The Leaf collects its mileage and power use, and we can go back to look at the activity.

The record shows 1003 miles at 3.8 miles/KWh and a total of 264.6 KWh. We pay 10c/KWh, so if we had charged at home that's $26.46. Since we charge at work for free it's more like $15. We have grid-tied solar power and time of use metering for a cheap overnight rate. The meter runs backwards during the day at a higher price, and the Leaf has a charging delay timer so that we can plug it in when we get home, then it starts charging when the cheapest rate starts at 9pm.

Driving 1003 miles in the cars we would normally drive, which get about 20mpg, uses 50 gallons of premium gas at about $4.10 a gallon, or about $206. So we actually saved about $180 in August 2011.

If Laurel drives every day, 22 work days a month at 66 miles is 1452 miles. She tops up the charge at work each day and gets about 4 miles/KWh on that route, so that's under a dollar a day. Our gas cost would be $298, so she could be saving about $280/month. That takes a big chunk out of the cost of buying the car in the first place. On top of that, the servicing costs are minimal: no engine oil changes, no gearbox, and the brakes last longer because the regenerative braking system takes a lot of the load. We did have to buy a new tire after popping it on a pothole, which was about $150 installed, but that could happen to any car. We could have saved on gas costs by buying a hybrid, but they are less fun to drive and you are still paying to maintain a gas engine and a very complex transmission.
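
For anyone who wants to check the arithmetic, here's a minimal sketch of the monthly comparison. The figures are the ones quoted above; the split between free workplace charging and paid home charging is a rough assumption, so plug in your own numbers.

```python
# Rough monthly cost comparison using the numbers quoted above.
# The free-charging fraction is an assumption; substitute your own figures.

miles_per_month = 22 * 66        # 22 work days at 66 miles round trip
leaf_miles_per_kwh = 4.0         # observed efficiency on that route
electricity_per_kwh = 0.10       # our overnight time-of-use rate
free_charge_fraction = 0.5       # assume roughly half the energy is free charging at work

gas_mpg = 20.0                   # the cars we would otherwise drive
gas_price = 4.10                 # premium gas, dollars per gallon

leaf_cost = (miles_per_month / leaf_miles_per_kwh) * electricity_per_kwh * (1 - free_charge_fraction)
gas_cost = (miles_per_month / gas_mpg) * gas_price

print("Leaf electricity: $%.2f/month" % leaf_cost)   # about $18, under a dollar a day
print("Gas equivalent:   $%.2f/month" % gas_cost)    # about $298
print("Saving:           $%.2f/month" % (gas_cost - leaf_cost))
```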

The icing on the cake is our white (for pure electric) car pool lane stickers, so Laurel can take the freeway in rush hour and zip silently past all the Prius drivers whose yellow (for hybrid) stickers no longer get them into the car pool lane. It took a total of ten weeks to get the license plates, then apply for and receive the white stickers.

So the value proposition for the Leaf is that it is much more fun to drive than a high mpg economy car like a Prius, gets you in the car pool lane (if you live in California), and the purchase cost is offset by ultra-low running costs if you use it regularly.

We aren't alone in figuring out that this is a good deal. At last count (end of 2012) there are more than ten Nissan Leaf owners at Netflix, along with several Volts (the latest version gets a green carpool lane sticker) and several Teslas. At Informatica, Laurel is among several Leafs and Volts sharing the chargers.

We leased our car on a three year, 36,000 mile plan. We included the home charger installation in the payment (about $2K), put down a $2K initial payment, and got the $7500 federal rebate bundled into the deal. The actual payment including taxes, as one of the first Leaf owners, was over $500/month; the current deals are much lower than this, and Nissan is about to release the cost-reduced US-built Leaf in January 2013. We got a $2500 state rebate paid directly to us after signing up for it, which covers the initial payment. We leased because we think that in three years' time there may be big advances in electric car technology; we could decide to keep the Leaf, or give it back and get the pick of the 2014 models.

For much more discussion about the car, the My Nissan Leaf forum is quite active. One thing I found there is that upgraded springs and dampers are available; since we do a lot of mountain driving, we upgraded the suspension to be stiffer and better damped than stock.

The first question everyone asks is how far the Leaf will go, and the answer is between 60 and 100 miles per charge, but it depends on where you live and how hard you drive. The usable capacity of the battery pack is about 21KWh; the actual spec is 24KWh, so there is a little bit of extra capacity beyond its "I'm empty" point. If you drive a lot of freeways at speed and climb mountains like we do, 3-4 miles/KWh gets you 60-80 miles. In a flat urban environment 4-5 miles/KWh is quite possible, which gets you 80-100 miles.

Since we live at the top of a mountain (2400ft) and work near sea level, it's a good idea to charge the car to 80% full at home, and 100% full at work. This way there is regenerative braking for the initial downhill run, which is free power and also saves the brake pads.

Our "carwings" summary page for August 2011 is shown below.

Wednesday, September 21, 2011

Moving Planet - Day of Action Saturday 24th Sept

There is a world-wide movement coordinated by the global warming action group 350.org which has set this coming Saturday as a day of action to raise awareness.

You can find an event near you at the Moving Planet Site. Many of the events are based around bike rides.

For people local to the bay area, there is also a large Electric Vehicle car show in Palo Alto organized by the Electric Auto Association of Silicon Valley, where people can see and get rides in electric cars like the Nissan Leaf. The 216mph Lightning Motorbike, holder of the electric world speed record, should also be there.

We have been having a lot of fun driving our Nissan Leaf; we use it almost every day, and going back to a conventional car seems so broken. What's this gearbox thing, and why does it keep having to change gear? The quiet instant torque of direct drive electric power is addictive. At last count there are six Nissan Leafs parked at Netflix every day. We have several chargers for people who need a top up, but most of us just charge at home.

For more mainstream activities, the San Jose Mercury is also organizing a big car show in central San Jose on the same day, hybrids2hotrods.com. They have a Fisker hybrid on their poster, and there will be electric cars and lots of classic cars as well.

I'm aiming to write some more blog posts on our solar system and the Nissan Leaf over the next few days.

Tuesday, August 30, 2011

I come to use clouds, not to build them...

[Update: Thanks for all the comments and Ryan Lawler's GigaOM summary - also I would like to credit James Urquhart's posting on Competing With Amazon Part 1 for giving me the impetus to write this.]

My question is what are the alternatives to AWS from a developer perspective, and when might they be useful? However I will digress into a little history to frame the discussion.

There are really two separate threads of development in cloud architectures: the one I care about is how to build applications on top of public cloud infrastructure, and the other is how to build cloud infrastructure itself.

In 1984 I didn't care about how the Zilog Z80 or the Motorola 6809 microprocessors were made, but I built my own home-brew 6809 machine and wrote a code generator for a C compiler because I thought it was the best architecture, and I needed something to distract me from a particularly mind-numbing project at work.

In 1988 I joined Sun Microsystems and was one of the people who could argue in detail how SPARC was better than MIPS or whatever as an instruction set, or how Solaris and OpenLook window system were better. However I never designed a RISC microprocessor, wrote kernel code or developed window systems libraries. I helped customers use them to solve their own problems.

In 1993 I moved to the USA and worked to help customers scale their applications on the new big multiprocessor systems that Sun had just launched. I didn't re-write the operating system myself, but I figured out how to measure it and explain how to get good performance in some books I wrote at the time.

In 1995 when Java was released and the WWW was taking off, I didn't work on the Java implementation or IETF standards body, I helped people to figure out how to use Java and to get the first web servers to scale, so they could build new kinds of applications.

In 2004 I made a career change to move from the Enterprise Computing market place with Sun to the Consumer Web Services space with eBay. At the time eBay was among the first sites to have a public web API. It seemed to me that the interesting innovation was now taking place in the creation and mash-up of web services and APIs; no one cared what operating system they ran on, what hardware that ran on, or who sold those computers.

Over time, the interesting innovation that matters has moved up the food chain to higher and higher levels of abstraction, leveraging and taking for granted the layers underneath. A few years ago I had to explain to friends who still worked at Sun, how I was completely uninterested in whether my servers ran Linux or Solaris, but I did care what version of Java we were using.

Now I'm working on a cloud architecture for Netflix. We don't really care which Content Delivery Network is used to stream the TV shows over the Internet to the customers; we use three vendors interchangeably at present. I also don't care how the cloud works underneath; I hear that AWS uses Xen, but it's invisible to me. What I do care about is how the cloud behaves, i.e. does it scale and does it have the feature set that I need.

That brings me back to my original question: what are the alternatives to AWS, and when might they be useful?

Last week I attended an OpenStack meetup, thinking that I might learn about its feature set, scale and roadmap as an alternative to AWS. However the main objective of the presenters seemed to be to recruit the equivalent of chip designers and kernel developers to help them build out the project itself, and to appeal to IT operations people who want to build and run their own cloud. There was no explanation or outreach to developers who might want to build applications that run on OpenStack.

I managed to get the panel to spend a little while explaining what OpenStack consists of, and figured out a few things. The initial release is only really usable via the AWS clone APIs and doesn't have an integrated authentication system across the features. The "Diablo" release this fall should be better integrated and will have native APIs; it is probably suitable for proof of concept implementations by people building private clouds. The "Essex" version targeted at March next year is supposed to be the first production oriented release.

There are several topics that I would like to have seen discussed, perhaps people could discuss them in the comments to this blog post? One is a feature set comparison with AWS, and a discussion of whether OpenStack plans to continue to support the AWS clone APIs for equivalent features as it adds them. So far I think OpenStack has a basic EC2 clone and S3 clone, plus some networking and identity management that doesn't map to equivalent AWS APIs.

The point of my history lesson in the introduction is that a few very specialized engineers are needed to build microprocessors, operating systems, servers, datacenters, CDNs and clouds. It's difficult and interesting work, but in the end, if it's done right, it's a commodity that is invisible to developers and their customers. One of the slides proudly showed how many developers OpenStack had: a few hundred, mostly building it from scratch. There wasn't room on the slide to show how many developers AWS has on the same scale. Someone said recently that the far bigger AWS team has more open headcount than the total number of people working on OpenStack. When you consider the developer ecosystem around AWS, there must be hundreds of thousands of developers familiar with AWS concepts and APIs.

Some of the proponents of OpenStack argue that because it's an open source community project it will win in the end. I disagree, the most successful open source projects I can think of have a strong individual leader who spends a lot of time saying no to keep the project on track. Some of the least successful are large multi-vendor industry consortiums.

The analogy that seems to fit is Apple's iOS vs. Google's Android in the smartphone market. The parts of the analogy that resonate with me are that Apple came out first and dominates the market, taking most of the profit and forcing its competitors to try to band together to compete, changing the rules of the game and creating new products like the iPad that leave their competition floundering. By adding all the incompatible fragmented Android offerings together, it's possible to claim that Android is selling in a similar volume to the iPhone. However, it's far harder for developers to build Android apps that work on all devices, and they usually make much less money from them. Apple and its ecosystem are dominant, growing fast, and extremely profitable.

In the cloud space, OpenStack appears to be the consortium of people who can't figure out how to compete with AWS on their own. AWS is dominant, growing its feature set and installed capacity very fast. Every month that passes, AWS is refining and extending its products to meet real customer needs. Measured by the reserved IP address ranges used by its instances, AWS has more than doubled in the last year and now has over 800,000 IP addresses assigned to its regions worldwide.
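
If you want to reproduce that kind of sizing estimate yourself, here is a minimal sketch of adding up published EC2 CIDR ranges. The address blocks below are placeholders rather than the real published list; substitute the ranges AWS posts for each region.

```python
# Minimal sketch: sum published EC2 CIDR ranges to estimate how many
# addresses are assigned per region. The ranges below are placeholders;
# substitute the real list that AWS publishes.

example_ranges = {
    "us-east-1": ["203.0.113.0/24", "198.51.100.0/22"],   # placeholder CIDRs
    "eu-west-1": ["192.0.2.0/25"],                        # placeholder CIDR
}

def addresses_in(cidr):
    # a /22 contains 2**(32-22) = 1024 addresses
    prefix = int(cidr.split("/")[1])
    return 2 ** (32 - prefix)

for region, ranges in sorted(example_ranges.items()):
    total = sum(addresses_in(r) for r in ranges)
    print("%-12s %d addresses" % (region, total))
```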

The problem with a consortium is that it is hard to get it to agree on anything, and Brooks' law applies (The Mythical Man-Month: adding resources to a late software project makes it later). While it seems obvious that adding more members to OpenStack is a good thing, in practice it will slow the project down. I was once told that the way to kill a standards body or consortium is to keep inviting new people to join and adding to its scope. With the huge diversity of datacenter hardware, and many vendors with a lot to lose if they get sidelined, I expect OpenStack to fracture into multiple vendor specific "stacks" with narrow test matrices and extended features that lock customers in and don't interoperate well.

I come to use clouds, because I work for a developer oriented company that has decided that building and running infrastructure on a global scale is undifferentiated heavy lifting, and we can leverage outside investment from AWS and others to do a better job than we could do ourselves, while we focus on the real job of developing global streaming to TVs.

Operations oriented companies tend to focus on costs and ways to control their developers. They want to build clouds, and may use OpenStack, but their developers aren't going to wait. They may be allowed to use AWS "just for development and testing", but when the time comes to deploy on OpenStack, its lack of features is going to add a significant burden of complexity to the development team. OpenStack's lack of scale and immaturity compared to AWS is also going to make it harder to deploy products. I predict that the best developers will get frustrated and leave to work at places like Netflix (hint, we're hiring).

I haven't yet seen a viable alternative to AWS, but that doesn't mean I don't want to see one. My guess is that in about two to three years from now there may be a credible alternative. Netflix has already spent a lot of time helping AWS scale as we figured out our architecture, we don't want to do that again, so I'm also waiting for someone else (another large end-user) to kick the tires and prove that an alternative works.

Here's my recipe for a credible alternative that we could use:

AWS has too many features to list, we use almost all of them, because they were all built to solve real customer problems and make life easier for developers. The last slide of my recent cloud presentations at http://slideshare.net/adrianco contains a partial list as a terminology glossary. AWS is adding entirely new capabilities and additional detailed features every month, so this is a moving target that is accelerating fast away from the competition...

From a scale point of view Netflix has several thousand instances organized into hundreds of different instance types (services), and routinely allocates and deallocates over a thousand new instances each day as we autoscale to the traffic load and push new code. Often a few hundred instances are created in a few minutes. Some other cloud vendors we have talked to consider a hundred instances a large customer, and their biggest instances are too small for us to use. We mostly use m2.4xl and we need the 68GB RAM for memcached, Cassandra or our very large Java applications, so a 15GB max doesn't work.
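
To give a feel for the mechanics, here is a minimal sketch, using the boto Python library, of the kind of single API call that requests a batch of instances. The AMI id and key name are placeholders, and in practice this sort of allocation is driven by autoscaling and deployment tooling rather than hand-written scripts.

```python
# Minimal sketch of allocating a batch of instances in one API call using
# the boto library. The AMI id and key name are placeholders; real
# deployments go through build/deploy tooling, autoscaling and base AMIs.

import boto.ec2

conn = boto.ec2.connect_to_region("us-east-1")   # credentials come from the environment

reservation = conn.run_instances(
    "ami-12345678",              # placeholder machine image id
    min_count=100,               # ask for the whole batch at once
    max_count=100,
    instance_type="m2.4xlarge",  # 68GB RAM instances for memcached/Cassandra/Java
    key_name="example-key",      # placeholder keypair
)
print("launched %d instances" % len(reservation.instances))
```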

In summary, although the CDN space is already a commodity with multiple interchangeable vendors, we are several years from having multiple cloud vendors that have enough features and scale to be interchangeable. The developer ecosystem around AWS concepts and APIs is dominant, so I don't see any value in alternative concepts and APIs, please try to build AWS clones that scale. Good luck :-)

Friday, March 18, 2011

Understanding and using Amazon EBS - Elastic Block Store

There has been a lot of discussion in the last few days about EBS since it was implicated in a long outage at reddit.com.

Rule of Thumb

The benchmarking Netflix did when we started on AWS highlighted some inconsistent behavior in EBS. The conclusion we reached is a rule of thumb for EBS: if you sustain less than 100 iops (input+output per second) as a long term average, it works fine. Short term bursts can be 1000 iops. By short term I mean less than a minute, by long term more than 10 minutes. YMMV.

If you are doing benchmarks like this, collect response time and throughput and plot your data over time. You need to run long enough that the performance shows steady state behavior. The problem with EBS is that it doesn't have a particularly steady state. To explain why, we need to look at the underlying architecture. I don't know the details of how EBS is implemented, but there is enough information available to explain how it behaves.
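
Here is a minimal sketch of what I mean by a long-running benchmark that collects a time series instead of a single number: it issues small synchronous writes against a file on the volume under test and prints throughput and response time once a second. The mount point and run length are assumptions, and a real benchmark should use a proper tool such as fio or iozone; this just illustrates the data collection.

```python
# Minimal sketch: issue small synchronous writes against a file on the
# volume under test, and log iops and mean response time once a second so
# the behavior over time can be plotted. Mount point and run length are
# assumptions; use a real benchmark tool for serious measurements.

import os, time, random

PATH = "/mnt/ebs/testfile"     # assumed mount point for the volume under test
BLOCK = b"x" * 4096            # 4KB writes
FILE_SIZE = 256 * 1024 * 1024  # work within a 256MB file
DURATION = 1800                # run for 30 minutes to get past any burst behavior

fd = os.open(PATH, os.O_CREAT | os.O_WRONLY)
start = time.time()
ops, lat_sum, second = 0, 0.0, int(start)

while time.time() - start < DURATION:
    os.lseek(fd, random.randrange(0, FILE_SIZE, 4096), os.SEEK_SET)
    t0 = time.time()
    os.write(fd, BLOCK)
    os.fsync(fd)               # force the write through so we measure the device
    lat_sum += time.time() - t0
    ops += 1
    now = int(time.time())
    if now != second:          # emit one line per second: iops and mean latency
        print("%d %d iops %.1f ms" % (now, ops, 1000.0 * lat_sum / ops))
        ops, lat_sum, second = 0, 0.0, now

os.close(fd)
```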

EC2

The AWS EC2 architecture is built out of commodity low cost servers; they have a single 1 Gbit network, a few CPUs, a few disks and a few GBytes of RAM. Over time the models have changed, and EC2 does have a 10Gbit network option now, but for the purposes of this discussion we will concentrate on the 1Gbit network models. Individual servers are virtualized into the familiar EC2 models by slicing up the RAM, CPUs and disk space, and sharing the network bandwidth and disk iops. When EC2 instances break or are de-configured, any data on the internal disks is lost.

Elastic Block Store http://aws.amazon.com/ebs/

The AWS EBS service provides a reliable place to store data that doesn't go away when EC2 instances are dropped, but it provides the same mounted filesystem capability as the internal disks. If you need more disk space or iops you can mount more EBS volumes on a single EC2 instance and spread out the load. The EBS volume is connected to the EC2 instance over the same 1Gbit network as everything else. In a datacenter this would normally be built using commercially available high end storage from NetApp, EMC or whoever; it would be quite expensive (costing much more than the EC2 instance itself) and be fast and reliable up to the limits of the network. To build a low cost cloud, the alternative is to use RAIN (Redundant Array of Inexpensive Nodes), which could be based on standard EC2 instances, or variants that have more disks per CPU. Software is then used to coordinate the RAIN systems and provide an EBS service that will be slower than high end storage, but still be very reliable and be limited by the 1Gbit network.

S3 and Availability Zones

AWS also has an S3 storage service that behaves like a key/value store accessed via http requests and a REST API rather than a directly mounted filesystem. It is possible to rapidly snapshot an EBS volume to and from S3, including incremental backups and restores that fill as they go so you don't have to wait before using them. This implies to me that they share a common back-end infrastructure to some extent. The primary additional difference is that EBS volumes only exist in a single AWS Availability Zone, and S3 data is replicated across two or three Availability Zones. It takes longer to replicate the data for S3, so it is slower, but it is very robust and it is almost impossible to lose data. You can think of an Availability Zone as a complete datacenter. All the zones in a region are separate datacenters that are close enough together to support a high bandwidth and low latency network between them, but they have separate power sources and connections to the Internet.

Multi-Tenancy

The most efficient chunk of compute and storage resource to buy and deploy when building a cloud is either too big or too small for the actual use cases of real applications. Virtualization is used to sub-divide the chunks, but then each individual machine is supporting several independent tenants. For local disks, the space is divided between the tenants, and for network, everyone is sharing the same 1Gbit interface. This works well on average, because most use cases aren't network or disk bound, but you cannot control who you are sharing with and some of the time you will be impacted by the other tenants, increasing variance within each EC2 instance. You can minimize the variance by running on the biggest instance type, e.g. m1.xlarge, or m2.4xlarge. In this case there isn't room for another big tenant, so you get as much as possible of the disk space and network bandwidth to yourself. The virtualization layer reserves some of the capacity. It's possible to tell that another tenant is keeping the CPU busy by looking at the "stolen time", but there are no metrics for stolen iops or network bandwidth.
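
There is no stolen-iops counter, but stolen CPU time is exposed by the guest kernel. A minimal sketch of reading the steal field from /proc/stat on a Linux guest and reporting what share of CPU time went to other tenants over a short interval:

```python
# Minimal sketch: read the "steal" counter from /proc/stat on a Xen guest
# and report the fraction of CPU time the hypervisor gave to other tenants
# over a short sampling interval.

import time

def cpu_times():
    with open("/proc/stat") as f:
        fields = f.readline().split()      # aggregate "cpu" line
    # fields after "cpu": user nice system idle iowait irq softirq steal ...
    values = [int(v) for v in fields[1:]]
    steal = values[7] if len(values) > 7 else 0
    return steal, sum(values)

s1, t1 = cpu_times()
time.sleep(5)
s2, t2 = cpu_times()
print("steal time: %.1f%%" % (100.0 * (s2 - s1) / max(t2 - t1, 1)))
```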

The EBS service is also multi-tenant. Many clients mount disk space from a common backend pool of EBS disks. You don't get to see how the disk space is allocated, or how data is replicated over more than one disk or instance for durability, but it is limited to that availability zone. A busy client can slow down other clients that share the same EBS service resources. EBS volumes are between 1GB and 1TB in size. If you allocate a 1TB volume, you reduce the amount of multi-tenant sharing that is going on for the resources you use, and you get more consistent performance. Netflix uses this technique, our high traffic EBS volumes are mostly 1TB, although we don't need that much space.
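
For illustration, the allocation side of that trick is a single API call. Here is a minimal sketch using the boto Python library, with placeholder region, zone, instance id and device names; several such volumes can then be striped together inside the instance, as described further down.

```python
# Minimal sketch: allocate a full-size 1TB EBS volume (the maximum at the
# time) and attach it, to reduce multi-tenant sharing on the back-end disks.
# Region, zone, instance id and device name are placeholders.

import boto.ec2

conn = boto.ec2.connect_to_region("us-east-1")      # credentials come from the environment
volume = conn.create_volume(1000, "us-east-1a")     # 1TB volume in a single availability zone
conn.attach_volume(volume.id, "i-12345678", "/dev/sdf")
```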

This is actually no different in principle to large shared storage area network (SAN) backends (from companies like EMC or NetApp) that are in common datacenter use. Those also have unpredictable performance when pushed hard, and they mask this issue with lots of battery backed memory. The difference is cost. EBS is 10c per Gbyte per month. If you build a competing public cloud service using high end storage, you could get better performance but your cost base would be far higher.

Visualizing Multi-Tenant Disk Access

I have come up with some diagrams to help show what happens. I'm basing them on a simplified view of AWS where the only instance type family is m1 and everything they have is made out of one underlying building block. This consists of a fairly old specification system, 8 cores, 16GB RAM, four 500GB disks and a single 1Gbit network. In reality, AWS is much more complex than this, but the principles are the same.

Starting with internal disks, this is what an m1.xlarge looks like, it takes up the whole system apart from a small amount of memory, disk space and network traffic for the VM and AWS configuration/management information. You can expect to have minimal multi-tenant contention for network or disk access.



The m1.large instance type halves the system, each instance has two disks rather than four, so it shares the network and some of the disk controller bandwidth, but it should have minimal iops contention with the other tenant.



The low cost m1.small instance type has 160GB of disk per instance, so we can fit three per disk for a total of 12 instances per machine. (Note that the memory for a real m1.small is 1.7GB, so only 9 would fit in 16GB RAM; however, the c1.medium instance has 1.7GB, 350GB disk and more CPU, so six m1.small and three c1.medium fit.) You can see the multi-tenancy problem here: any of the instances could generate enough traffic to fill the network and make one of the disks busy, and that is going to affect other instances in an unpredictable and random manner.

Here's an analogy, you can rent a whole house, rent a room in a house, or rent a couch to sleep on, you get what you pay for.

If you ever see public benchmarks of AWS that only use m1.small, they are useless; it shows that the people running the benchmark either didn't know what they were doing or were deliberately trying to make some other system look better. You cannot expect to get consistent measurements of a system that has a very high probability of multi-tenant interference.



EBS Multi-Tenancy

The next few diagrams show the flow of traffic from an instance to the EBS service, which makes two copies of the data on disks connected to separate instances. I don't know if this is how EBS works, but if we wanted to build an EBS-like system using the same building block it could look like this. In practice it would make sense to have specialized back-end building blocks with much more disk space.

The first diagram shows how Netflix runs EBS. We start with an instance that has the maximum network bandwidth and no other tenants, and we allocate maximum size 1TB volumes (we stripe many of them together), so the service has to use most of the disk space in the back-end to support us and we have less chance of another tenant making the EBS disks busy. The performance of EBS in this simplified case would be higher latency than local disk, but otherwise similar. I suspect that in reality the EBS volume is spread over more disks in the backend, which gives higher throughput but with higher variance.



If we drop down to a more typical m1.large configuration with 100GB of EBS each, two instances are sharing network bandwidth, the EBS service is servicing two sets of requests, and the EBS back end has many more tenants per disk, so we would expect better peak performance than the two internal disks in the m1.large but more variance.



For the case where we have many m1.small instances each accessing a 10GB EBS volume, it is clear that the peak performance is going to be far better than a share of a local disk, but the contention for network, EBS service and backend disks will be extremely variable, so performance will be very inconsistent.



How To Measure Disk and Network Performance

Someone should write a book on that (I already did, but for Solaris). However, there is a useful AWS forum post that explains how to interpret Linux iostat. This blog post is too long already, so Linux iostat will have to wait for another time.

Best Practices for Cloud Storage with Cassandra

There are two basic patterns for Cassandra, one is a persistent memory cache, where we size the data to fit in memory so that all reads are fast, and writes go to disk. The m2.4xl instance type with 68GB RAM and two 850GB disks is best. The second pattern is where there is a much larger data set than memory, and m1.xlarge with 16GB RAM and four 420GB disks will have the best iops for reads, and a much lower overall cost per GB for storage. In both cases, we get all the network bandwidth for servicing clients and the inter-node replication traffic, and minimal multi-tenant variance.

Saturday, March 12, 2011

How not to build a Private Cloud

It's all $, FUD, and internal politics. An MBO Cloud is what you get when the CEO tells the CIO to "figure out that cloud thing" (Management By Objective - i.e. the CIO bonus depends on it).

There is no technical reason for private cloud to exist.

[update: to clarify, that doesn't mean that I'm against private clouds or don't think they exist, because $, FUD and internal politics are a fact of life that constrain what can be done. Change also takes time and you have to "go to war with the army you have". However, this post is about what happens if your organization reallocates the $, isn't afraid, and has effectively no internal politics getting in the way.

This post was written in the middle of a debate on twitter between @adrianco @reillyusa @beaker and others including key insights from @swardley.

You should also read Christian Reilly's follow-up post "The Hollywood Culture" http://bit.ly/ePsisJ and many thanks to @bernardgolden for pointing out the excellent Business Week cover story on Cloud Computing http://ow.ly/4dm07 - after reading it I was amazed how well it aligned with what I write here - then I saw that it was by Ashlee Vance, one of the most clueful journalists around.

Netflix ITops Security Architect Bill Burns also wrote a very interesting post on the security challenges of cloud, we've been working together and he's on the interview team for the "Global Cloud Security Architect" I mention below.]

Too big for public cloud? You should *be* a public cloud.

Organizations who run infrastructure at the scale of tens to hundreds of thousands of instances have created cloud based models and opened them up to other organizations as public clouds. Amazon, Google, Microsoft are the clearest examples, they have expertise in software architecture, which is why they dominate the API definition. Telcos and hosting companies are adopting this model to provide additional public cloud capacity, using clones and variants of the API. Other organizations at this scale are already figuring out how to expose their capacity to their customers, partners and supply chain. The task you take on is to simultaneously hire the best people to run your cloud (competing with Amazon, Google etc.), and run it at low cost, which is why you need to be at huge scale and you need to decide that running infrastructure is a core competency of your business. Netflix is too small, doesn't regard infrastructure as core, and doesn't want to hire a bunch of ITops people.

It costs too much to port our apps? Your $ are mis-allocated.

What does it cost to build a private cloud, and how long does it take, and how many consultants and top tier ITops staff do you have to hire? Sounds like a nice empire building opportunity for the CIO. The alternative is to allocate that money to the development organization, hire more developers and rewrite your legacy apps to run on the public cloud, and give the development VP the budget to run public cloud directly. The payback is more incremental and manageable, but this is effectively a re-org of your business to move a large chunk of budget and headcount around. This is what happened at Netflix. It probably takes an act-of-CEO at most companies, the barriers are mostly political. Yes it will take time, but so will bringing up a private cloud.

Replace your apps with SaaS offerings.

Many internal apps can be replaced by cloud services; we just outsourced our internal help desk and incident management software. No one I know does payroll in-house. This is uncontroversial and is happening.

We can't put confidential data in a public cloud? This is just FUD.

The enterprise vendors are desperate to sell private clouds, so they are sowing this fear, uncertainty and doubt in their customer base to slow down adoption of public clouds. The reality is that many companies are already putting confidential data in public clouds. I asked the question "when will someone using PCI level 1 be in production on AWS" at a Cloud Connect panel, and was told that it is already being done, and Terremark pointed out that they host H&R Block's tax business. There are several models of public cloud with different SLA, cost and operational models that can support confidential data securely. There is also an argument that datacenter security is not as strong as people would like to think, and that the large cloud vendors can do a better job than most enterprises at keeping the infrastructure secure. At Netflix, we are about to transition to a global cloud based business, we are currently hiring a "Cloud Security Architect" who understands compliance rules like PCI (the credit card standard) on a global basis (we didn't need global expertise before). Part of their job is going to be to implement this.

There is no way my execs will sign off on this! Do they care about being competitive?

The biggest champion at Netflix for doing public cloud and doing it "properly" with an optimized architecture was our CEO Reed Hastings. He personally argued that we should try to do NoSQL rather than MySQL to push the envelope. Why? Because the bigger risk for Netflix was that we wouldn't scale and have the agility to compete. He was right: we have grown faster than our ability to build datacenters, and we have the agility we need to outrun our competition. Netflix has never had a CIO in the first place; we do have an excellent VP of operations though, and there is plenty to do running the CDNs and SaaS vendors that support Enterprise IT.

Will private clouds be successful? I think there will be a few train wrecks.

The train wrecks will come as ITops discover that it's much harder and more expensive than they thought, and takes a lot longer than expected, to build a private cloud. Meanwhile their developer organization won't be waiting for them, and will increasingly turn to public clouds to get their jobs done. We could argue about definitions, but there are private clouds that are effectively the back ends for specialized large scale ecosystems, like engineering companies that have to interface to the things that build stuff, or that operate in places where there is no effective connection into the public clouds. For example, on board a ship that has limited external bandwidth, or to support a third world construction project. My take is that these will be indistinguishable from specialized SaaS offerings within a supply chain ecosystem.

How not to build a private cloud - The Netflix Way

Re-org your company to give budget and headcount to the developers, and let them run the public cloud operations.
Ignore the FUD; best practices and patterns for compliance and security already exist and are auditable.
Get the CEO to give the CIO a different MBO: to shrink their datacenter.

Good luck with that :-)

Wednesday, March 02, 2011

Maslow's Hierarchy of NoSQL Reads (and Writes)

I tried out Prezi to create this presentation; it was more fun to create than PowerPoint, but it has a few limitations. I would like better drawing tools. The talk itself puts some of the main NoSQL solutions into context.



Wednesday, February 16, 2011

Solar Power - More panels on the garage roof

The SolarCity installation team arrives.


Product specification on a panel, 210W peak output, Silicon by Kyocera made in Mexico.


Fronius inverter installed inside the garage, wired to the distribution panel shown next to it.

Mounting rails on the garage roof.

Closeup of how the rails mount on the shingles.

All the panels in place, four rows of nine panels, 36 x 210W = 7560W DC, after the inverter this works out to about 6.5KW delivered AC power.

View from the front of the garage early in the morning.

It's all wired in; we are waiting for inspection and approval before we can turn it on. The same monitoring system is used as was installed for my previous solar installation on the house roof. This time we leased the panels; there are several options, from no money down to buying the whole thing up front. I've opted for an initial payment of $10K, and $90/month fixed for the duration of the lease (20 years). I don't pay anything until it's turned on.

We currently generate about two thirds of our usage of electricity. The extra panels should triple the generated capacity, so we will generate about twice what we use in the short term. That evens out when we start charging a Nissan Leaf (delivery due in April), and change out our propane furnace for a heat pump and air conditioner (hopefully in time to cool us this summer).
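
As a rough check on those numbers, here is a back-of-the-envelope sketch; the derate factor and the generation multiple are implied by the figures above rather than measured specs.

```python
# Back-of-the-envelope check on the numbers quoted above. The derate factor
# and generation multiple are implied by those figures, not measured specs.

panels = 36
watts_per_panel = 210
dc_watts = panels * watts_per_panel            # 7560W DC on the garage roof
ac_watts = 6500                                # roughly the delivered AC power
print("implied DC-to-AC derate: %.0f%%" % (100.0 * ac_watts / dc_watts))

current_generation = 2.0 / 3.0                 # we generate about 2/3 of our usage today
after_new_panels = current_generation * 3      # the extra panels triple generated capacity
print("generation vs usage after the new panels: %.1fx" % after_new_panels)
```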

A useful side effect is that the garage itself will be a lot cooler inside during the summer, as the panels shade the roof.