Showing posts with label culture. Show all posts

Sunday, April 29, 2012

It's not obvious how to be insanely simple

Three books that I read recently resonated with me and fitted together, so I'm going to try to make sense of them in a blog post rather than in a series of cryptic tweets.

My son (who is a product manager at eBay) told me about the most recent publication:
Insanely Simple: The Obsession That Drives Apple's Success by Ken Segall

At the Defrag conference last year I saw Duncan Watts present and recently finished reading his book Everything Is Obvious: Once You Know The Answer

I also recently watched Sam Harris give his talk on Free Will, and then read the book.

The connection starts with Free Will, which explains what is really going on in our heads, along with Everything Is Obvious, which explains how our minds work collectively and interact with the real world. These two books are the "Missing Manuals" for our brains. It's hard enough to figure out what is going on in the world and how best to navigate it, but it's doubly hard when you don't realize how your subconscious is pulling the strings and how common sense is confusing everyone around you.

Inside your head, the conscious thread of thoughts that you hear is post-rationalizing decisions that your subconscious mind has already made. Feeding yourself a broad range of information with an open mind, connecting to your intuition and letting the power of your subconscious find the right patterns and responses lets you make faster and better decisions.

In society, we are surrounded by common sense explanations that we use to post-rationalize the events around us and which are fed to us by the media, historians, politicians and our friends. Duncan deconstructs common sense to show that these explanations are mirages driven by our inner need to find a narrative and cause for effects that are essentially random coincidences with far less significance than we assume. He then explains what un-common sense looks like, how to question the received wisdom, and how to adopt better strategies for getting things done successfully.

I'm not going to summarize the whole book, but there is a very useful section that should be read by anyone doing "big data analytics". It sets out the kind of things that are knowable, and what (and why) other things will always remain unknowable and impossible to predict. The advice I distilled from the discussion of strategy is that there is so much randomness in the outcome of business decisions that you cannot reliably evaluate the difference between a good strategy and a poor strategy. If you are able to get ever more detailed data about what happened, you become more convinced of the value of your analysis, but the predictions you make about what to do next don't get any better. This is a counter-intuitive outcome (i.e. it violates common sense), so please read the book, which explains why you shouldn't be trusting your common sense in the first place.

The positive things we can do to overcome random outcomes really resonated with me, as they put into words several of the things I've been doing for many years, which have in some sense given me a better way to understand what's going on and get stuff done. They also describe many of the ways that Netflix figures out how to build its products.

When I hear something like "A caused B", my reflex reaction is to flip it around in my head, take the devil's advocate position, and look at the situation from a few different angles. This can be quite annoying in "polite company" as I tend to question received wisdom and common sense assertions. However, I usually find a missing piece of information that could falsify the assertion, and ask the question. It could be as simple as asking exactly what time A and B happened, since if B turned out to happen before A then the assertion is clearly false. In statistics and physics this is codified as testing the null hypothesis. (I'm the son of a statistician and I have a Physics degree...).

At Netflix we always try to construct parallel "A/B" tests of our hypotheses, like the double blind tests used in clinical trials of new medicines. We take a large number of new customers and give them a range of different experiences for long enough to measure a difference in their responses. This is the only way to reliably tell whether a new feature works, and it often goes against the common sense of what we expect and what many customers and industry analysts helpfully suggest we should be doing. As Duncan explains, we can usually figure out what factors will affect an outcome, but we are extremely poor judges of how to weight those factors, even with post-rationalization of what we saw happen, and all we can do is bias the statistics in a preferred direction. A recurring example is the suggestion that Netflix should allow half-stars in its movie ratings. It turns out that, given a finer-grained choice, fewer people rate movies, and the reduction in the number of data points outweighs the increased accuracy. We can post-rationalize why this occurs as an example of giving people too much choice, but we don't have to rationalize it, we just measured it.
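To make the "we just measured it" step concrete, here is a minimal sketch of how two test cells could be compared, assuming a plain two-proportion z-test. The function name and the counts are hypothetical illustrations, not Netflix's actual tooling or data.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare the response rates of two test cells.

    conv_a / n_a: responders and total customers in cell A (control).
    conv_b / n_b: responders and total customers in cell B (new experience).
    Returns (z, p_value), where p_value is two-sided.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both cells behave the same
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up numbers: 1000 customers per cell, 100 vs 150 responders
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value says the difference between cells is unlikely to be chance alone. It says nothing about why the difference exists, which is the point: the measurement stands without the narrative.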

In the discussion of strategy Duncan talks about creating a set of strategies that cover many scenarios, and using scenario planning to build more flexible and fuzzy strategies which are more likely to work under a range of random external influences. By putting yourself in the path of possible good randomness and avoiding bad outcomes, you can "make your own luck". By detecting problems early and having the flexibility to adapt your strategy you can run around the problems that will randomly come your way. If instead you concentrate on coming up with the best possible strategy or assuming that previous success was due to strategy rather than random outcomes you are building a brittle future that is likely to disappoint you.

The final point I will lift from Duncan's discussion of uncommon sense is that speed of execution and iteration is another fine way to cheat the chance events that will derail your plans. Long term detailed plans are a waste of time. This is one of the foundations of agile development, where rapid iteration of product features lets you discover what your users actually do with your product, as opposed to what you thought they would do or what they say they will do.

This leads to the Insanely Simple book, which talks about Apple's approach to product design, with particular emphasis on branding and marketing, since Ken Segall was the guy who came up with the i in iMac and has many other fascinating stories. One reason I like working at Netflix is that for agile web services, product ideas can be built and tested in a week or a month, and fixed in minutes. Apple works on products for years and needs them to work perfectly when they are released. This gives them two big problems, since it's hard to iterate and hard to test ideas and products in advance. Their solution seems to be that they allow better ideas to form and develop, take bigger risks and make decisions faster than their competition, which helps them stay ahead of the market. The Insanely Simple design philosophy is based on the idea that it's easy to listen to all those great common sense ideas about features your product has to have, but if you learn to ignore the common sense and give the customers a simple and distilled experience, you will reach beyond the people who want a complicated product and find a much bigger market of people who were waiting for a simple way to get something done. Apple's competitors are so bogged down in committees, approval processes and helpful common sense advice from customers that they are unable to release simple products.

A key example from the book is that Apple has had many award winning advertising campaigns, "Think Different", "PC and Mac" and the iPod silhouette, and none of them were test marketed in advance. Their competitors make less risky adverts after getting broad internal consensus, take much longer to get them to market and fail to understand that the success of an advert is a randomized event (with lots of useless common sense post rationalizations) so the test market response is a very poor predictor of success. It's more important to be bold, different and go big. For example Apple only advertised Think Different on the back cover of magazines, which costs far more but has a much bigger impact than inside pages.

From these three books I've found some useful focus on how to approach things, but they also give me some backup to explain to others why I think some things are important. A key part of what I have been doing for Netflix is looking out into the future of cloud and related technologies and developing a portfolio of fuzzy strategies and options. They don't all work out, but by having a well instrumented but loosely coordinated architecture that doesn't have central control and strict processes, we can iterate rapidly and adopt (and discard) interesting new technologies as they come along. We can all have more fun and less frustration making Netflix Insanely Simple, and ignore all the bad common sense advice and analyst opinions that swirl around everything we do.

I'm planning a complete re-write of my cloud architecture tutorial for Gluecon in May. That will be a great opportunity to discuss these things in person over a few beers, and now is a good time to sign up to attend - you can get a 10% discount with code spkr12.

Friday, December 30, 2011

How Netflix gets out of the way of innovation

#defrag 2011 presentation script.

I'm the cloud architect for Netflix, but rather than tell you about why we moved Netflix to a cloud architecture or how we built our cloud architecture, I'm going to tell you what we do differently at Netflix to create a culture that supports innovation.

What is it that lets us get things done very quickly? Sometimes a bit too qwikly… But how did we keep making big strategic moves, from DVD to streaming, from Datacenter to Public Cloud, from USA only to International, all in very short timescales with a fairly small team of engineers?

My presentation slides are just box shots of movies and TV shows that are available on Netflix streaming. This script is based on the notes I made to figure out what I was going to say for each box shot. If some of you see a show you didn't know we had and want to watch, that would make me happy. You can click on the box shot to visit that movie at Netflix; they were all available for streaming in the USA at the time of writing.



I've attempted to match the box shots loosely as cues to what I'm saying, but I've also used a musical theme in places since this is for Defrag and Defrag rocks!



Netflix is now one of the largest sites that runs almost entirely on public cloud infrastructure. We have become a poster child for how to build an architecture that takes full advantage of the Amazon Web Services cloud. But when I talk to other large companies about what we have done, they seem to have a lot of reasons why they couldn't or didn't do what we did, even if they wanted to.



Why is that? Why are we heading in one direction while everyone else is going the other way? Are we crazy or are they zombies? Well, I've worked at other large companies so I have some perspective on the issues.



Before I joined Netflix I worked at eBay for a few years, and helped found eBay Research Labs. This was set up because eBay felt it wasn't innovating fast enough, and they were looking for the one missing ingredient that would drive more innovation into the company.



This is a fairly common approach. "You guys go and be innovative, then hopefully we will find ways to spread it around a bit." Unfortunately the end result of setting up a separate group to add innovation to a big company is more comical than useful.



The most interesting projects got tied in knots; they trod on too many toes or were scary. We visited Xerox PARC and IBM Santa Teresa Labs to discuss how they were set up, to try and learn what might work, and we went to an Innovation Forum in New York. That was weird: some of the primary examples they were talking about emulating were eBay and PayPal!



The projects that did get out were minor tweaks to existing ideas, they could be fun, but ultimately not very interesting.



So I had to break out of there and find something new to do, and in 2007 I joined Netflix just as they first launched streaming.



One of the key attractions for me was the Netflix culture I heard about in the interviews. I wanted to get inside their heads and figure out if what they were describing was real and, if so, whether it was sustainable as the company grew.



What I found out over the next few years is that the culture is what enables innovation, so that Netflix can get things done quickly that other companies are too scared or too slow to try. The rest of this talk is about the key things that we do differently at Netflix.



Before I get into them I want to warn you that even with a roadmap and a guide, you probably won't be able to follow this path if you are in a large established company. Your existing culture won't let you. However if you are creating a new company from scratch, I hope you can join me in what I hope is the future of cool places to work.



Here's the key insight. It's the things you don't do that make the difference. You don't add innovation to a company culture, you get out of its way.

I'm mostly going for SciFi at this point, because it's going to sound like I was beamed in from the future to some of you.



Let me repeat that. You have to set up a company that doesn't do many of the things you would consider business as usual. That's why it's so hard to retrofit.



How about some audience participation? Hands up everyone who hates answering questions by putting their hands up.



Who works at a company that has more than one product line? Do you get along? The problem is that the company loses focus and has trouble allocating resources where they are needed so there are big fights. Pick one big thing and do it well. For Netflix, our addressable market is everyone in the universe who likes watching movies and TV shows, that should keep us busy for a while.



Who has teams spread over multiple sites and countries? We don't. It adds communication and synchronization overhead that slows your organization down. For the geeks, think of Amdahl's law applied to people. We have as many people as possible in the same building on the same site. We are planning a new bigger building to make sure we can keep everyone close together. High bandwidth, low latency communication.
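For anyone who wants the arithmetic behind that aside: Amdahl's law says that if a fraction p of the work can be spread across n workers and the rest is serial, the best possible speedup is 1 / ((1 - p) + p / n). Here is a rough sketch that treats communication and synchronization as the serial part. The function name and the numbers are illustrative assumptions, not measurements.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when only part of the work can run in parallel.

    parallel_fraction: share of the work that can be split up (0.0 to 1.0).
    workers: number of people (or processors) working in parallel.
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# If 10% of the effort is unavoidable coordination, adding people
# gives diminishing returns and can never exceed a 10x speedup.
for team_size in (10, 100, 1000):
    print(team_size, round(amdahl_speedup(0.9, team_size), 2))
```

With 10% coordination overhead, ten people get you only a little over 5x. Shrinking the serial fraction, which is what putting everyone in one building is meant to do, buys more than adding headcount.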



Who's worked for a place that bought another company, then ran it into the ground, laid everyone off and wrote down the value? Over and over again. It's crazy. I don't think Netflix has ever bought another company. It's a huge disruption to the culture; if you see something you like, just hire away their best people and out-execute them in the market.



Who has junior engineers, graduate hires and interns writing code? We don't. We find that engineers who cost twice as much are far more than twice as productive, and need much less management overhead. Reducing management overhead is a key enabler for an innovative culture. Engineers who don't need to be managed are worth paying extra for. We are tiny compared to companies like Google, they take on raw talent and develop it, we sometimes take a chance on someone with only five years experience.



Who has an architecture review board and centralized coding standards? We don't have that either. What we do have is tooling that creates a path of least resistance, which combined with peer pressure keeps quality high. The engineers are free and responsible for figuring it out for themselves.



Who has an ITops team that owns keeping code running in production? We don't. The developers run what they wrote. Everyone's cell phone is in the pagerduty rota, the trick is making sure you don't need to get called. All the ops people here have horror stories of stupid developers, and vice versa, but it doesn't have to be that way. We have one dev organization that does everything and no IT ops org involvement in our AWS cloud deployment.



Who has to ask permission before deploying 100's or 1000's of servers? We don't. The developers use our portal directly, they have to file a Change Management ticket to record what they did if it's in production, that's all. We've trained our developers to operate their own code. We create and destroy up to 1000 servers a day, just pushing new code. AWS takes about 5 minutes to allocate 100 servers, it takes longer than that just to boot Linux on them.



Who has a centralized push cycle and has to wait for the next "train" before they can ship their code? We don't. Every team manages their own release schedule. New code updates frequently, and the pace slows for mature services. Teams are responsible for managing interface evolution and dependencies themselves. Freedom and responsibility again.



Who has project managers tracking deliverables? We don't. The line managers do it themselves. They own the resources and set the context for their teams. They have time to do this because we took the BS out of their role.
Managers have to be great at hiring, technical and hands on enough to architect what their team does, and project manage to deliver results. Don't split this into three people. Reduce management overhead, minimize BS and time wasted. Teams are typically 3-7 people. Have a weekly team meeting and 1on1 with each engineer to maintain context.



Who has a single standard for development tools? We don't. We assume developers already know how to make themselves productive. We provide some common patterns to get new hires started, like Eclipse, IntelliJ, on Mac, Windows. Some people use Emacs on Linux. Hire experienced engineers who care, and they will take care of code quality and standards without being told how to.



Who has to work with people they don't respect? It's much too disruptive. The only way to get high talent density is to get rid of the people who are out of their depth or coasting.



That also applies to what you might call brilliant jerks. Even if they do great work, the culture can't tolerate prima donna anti-social behavior, so people who don't trust others or share what they know don't fit in.



So does that mean we value conformity? No, but it's really important to be comfortable as part of a high performance team, so we look for people who naturally over-communicate and have a secure confident personality type.



If you haven't experienced a high performance culture, think about what it's like to drive flat out at a race track. Some people will be too scared to deal with it and drive around white knuckled at 40 mph, some will be overconfident and crash on the first corner, but for people who fit into the high performance culture it's exhilarating to push yourself to go faster each lap, and learn from your peers without a speed limit. When you take out the BS and friction, everyone gets so much more done that productivity, innovation and rapid agile development just happen. This is the key message, removing obstacles to a high performance culture is how innovation happens throughout an organization. Doing less to get more.



We don't pay bonuses. We don't have grades other than senior engineer, manager, director, VP. We don't count the hours or the vacation days, we say "take some". Once a year we revise everyone's salary to match their peers and the current market rate - based on what we are paying now to hire the best people we can find.



We also have what sounds like a crazy stock option plan that grants options every month, vests the same day, and they last 10 years even if you leave Netflix. The net of this is less work for managers, they can concentrate on hiring top people, and almost everyone that leaves takes a pay cut. The test we make is "would you fight to keep your engineers if they tried to leave". If not, let them go now and get someone better. We don't make it hard to let people go.



Some of you may be thinking this sounds expensive, but what is the value of being incredibly productive and able to move faster than your competition? You can get out ahead and establish a leading position before anyone else realizes you are even in the game. Remember how a few years ago the "Analysts" said that Netflix the DVD company was going to get killed by other companies streaming, then all of a sudden people realized that we were streaming more bandwidth than anyone else?



So what could possibly go wrong? We had a near miss recently, we went too fast, partly because we could, got unlucky and screwed up. The good thing is that Netflix could re-plan and execute on the fixes we need very quickly as well, with no internal angst and finger-pointing. Also there was an Asteroid nearby earlier this week. By the way, my stepdaughter @raedioactive was the art director for this movie.



So, you are at a crossroads, you could be on stage with Eric Clapton, or in the audience watching and wondering why you can't do what they are doing. It's a radically different way to construct a corporate culture, it doesn't work for everyone, and we can't all be up on stage with Eric, but the talent is out there if you start by building a culture focused on talent density to find it and keep it.



Is it going to be the goats or the glory? I just told you all to stop doing things, what could be easier than that? It takes less process, fewer rules and simpler principles. Give people freedom, hold them responsible, replace the ones that can't or won't perform in that environment. Focus on talent density and conserving management attention span by removing the BS from their jobs.



This is your challenge, can you get a band together and go on a mission to save your company? Stop doing all the things that are slowing you down, and get rid of the unproductive BS that clogs up your management and engineers.



I will take questions in the comments or on twitter to @adrianco. Thank you.



Each question got a new box shot, but all the answers were musical.