Saturday, April 21, 2007

Load Average Differences Between Solaris and Linux

A lot of people monitor their servers using load average as the primary metric. Tools such as Ganglia colorize all the nodes in a cluster view using load average. However, there are a few things that aren't well understood about how the calculation works and how it varies between Solaris and Linux.

For a detailed explanation of the algorithm behind the metric, Neil Gunther has posted a series of articles that show how load average is a time-decayed metric that reports the number of active processes on the system, with one, five and fifteen minute decay periods.

The source of the number of active processes can be seen in the first few columns of vmstat, and this is where Solaris and Linux differ. For example, some Linux vmstat output from a busy file server is shown below.
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 4 43      0  32384 2993312 3157696  0    0  6662  3579 11345 7445  7 65  0 27


The first two columns show the number of processes that are in the run queue waiting for CPU time and in the blocked queue waiting for disk I/O to complete. These metrics are calculated in a similar manner in both Linux and Solaris, but the difference is that the load average calculation is fed by just the "r" column for Solaris, and by the "r" plus the "b" column for Linux. This means that a Linux-based file server that has many disks could be running quite happily from a CPU perspective but still show a large load average.
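
To make the calculation concrete, here is a minimal R sketch of the decayed average (my own illustration with made-up queue samples, assuming a 5-second sample interval; the kernels use fixed-point arithmetic, but the shape of the calculation is the same):

loadavg <- function(n, period=60, interval=5) {
    decay <- exp(-interval/period)  # decay factor, here for the one-minute average
    load <- 0
    for (i in seq_along(n)) load <- load*decay + n[i]*(1-decay)
    load
}

r <- c(4, 6, 3, 5, 4)       # run queue samples, like the vmstat "r" column
b <- c(43, 40, 45, 38, 41)  # blocked queue samples, like the vmstat "b" column
loadavg(r)      # Solaris-style: run queue only
loadavg(r + b)  # Linux-style: run queue plus blocked queue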

The logic behind the load average metric is that it should be a kind of proxy for responsiveness on a system. To get a more scalable measure of responsiveness, it is common to divide the load average by the number of CPUs in the system, since more CPUs will take jobs off the run queue faster. For disk intensive workloads on Linux, it may also make sense to divide the load average by the number of active disks, but this is an awkward calculation to make.

It would be better to take r divided by the CPU count and b divided by the active disk count, average that combination with the same time decay, and give it a new name. Maybe "slowed average" would be good?
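
Reusing the loadavg sketch above, the proposal might look like this (the CPU and disk counts are made up, and this is just a sketch of the idea, not a real metric):

# hypothetical "slowed average": normalize each queue by the resources
# that drain it, then apply the same time decay
slowed <- function(r, b, ncpu, ndisk, period=60, interval=5)
    loadavg(r/ncpu + b/ndisk, period, interval)

slowed(r, b, ncpu=4, ndisk=12)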

Monday, April 16, 2007

SEtoolkit and XEtoolkit releases

The SEtoolkit was developed in 1993 by Rich Pettit, and I used it as a way to prototype many new tools and ideas over the years. It's a Solaris-specific performance tool scripting language that supports very rapid development of new tools. The SEtoolkit has been widely deployed as the Solaris collector for the popular system performance monitor Orca. Rich gave up development of the SEtoolkit a few years ago, put the code into open source under the GPL, and it's now available via SourceForge, where it is being maintained by Dagobert Michelsen. A bug in the SEtoolkit was causing it to crash when used with complex disk subsystems, and this has now been fixed in the SE3.4.1 release (April 10th, 2007).

Meanwhile, Rich has been trying to make a multi-platform (Solaris, Windows, Linux, FreeBSD, OSX, HP-UX, AIX) version of SE for a long time; he finally gave up trying to implement his own language and based his latest development, the XEtoolkit, on Java 5. The first full release, XEtoolkit 1.0, came out on April 15th, 2007. The code is released and supported under both open source and commercial licenses by Rich's new company, Captive Metrics. The GPL license allows full free use of the provided tools, and development of new and derived tools that are also contributed to the community. The commercial license allows custom XEtoolkit development for proprietary tools, with a higher level of support.

The XEtoolkit 1.0 release doesn't support HP-UX or AIX, but AIX support is coming soon. I encourage you to try it out, give Rich some feedback and make it worth his while to continue. He's one of the very best programmers and performance tool architects I've ever met....

Thursday, April 12, 2007

myPhone 2.0 Case comes off the 3D printer



More pictures of the latest myPhone 2.0 case design.

This version has a deeper case and was printed at a simple angle; the previous attempt ended up warping, so I hope these changes will keep it flat. It also has a slot in the bottom end to take an iPod connector (which carries power, USB, stereo line-level output etc.), an antenna mounting hole at the top, and a retaining clip design to hold the LCD in place.

This contains 2.6 cubic inches of ABS plastic and used 1.4 cubic inches of support material, which cost about $40 in materials and took 8 hours to print at Techshop.

Wednesday, March 28, 2007

Low Power Microprocessors for General Purpose Computing

While researching devices for my home brew mobile phone, I've realized that the current generation of CPUs for mobile devices are actually seriously powerful, very low cost, and use almost no power. The performance per watt and per dollar seems to be an order of magnitude better than the PC-class CPUs that are common in commodity servers nowadays. The absolute performance and memory capacity is lower, but is comparable to common PC hardware from a few years ago, and could be useful for more than running a high end phone or portable games machine. Devices such as the Marvell PXA270 and Freescale i.MX31 run at over 500MHz, some include floating point units, they support at least 128MB of RAM (a single chip) and a myriad of I/O interfaces, and they have Linux 2.6 support.

While the current mainstream CPUs were driven by the development of the home PC market, this generation is driven by the development of the mobile, battery-powered device market, which is very large. For example, the worldwide cellphone market is something like a billion devices a year.

I think that there could be some interesting general purpose computer systems built from low power devices (CPUs that use less than one watt). I looked around but wasn't sure what to search for... I do know about the systems that are sold for embedded use, but they are typically configured using lower speed and lower memory options.

Does anyone know of vendors selling general purpose systems, or a category name for this space?

[Update: I asked around, and thought a while, and decided that this is interesting enough to have its own name "millicomputer" and its own blog "millicomputing"].

Thursday, March 01, 2007

The OpenMoko Story

Here is a link to the OpenMoko slide set presented at ETel. Worth a read; there is a lot of activity building around this platform at the OpenMoko community site.

Picture of myPhone

Here are the CAD pictures of the phone; I made the rear case cover translucent so that some of the parts inside can be seen. The big dark block is a Treo650 battery, and the block in front of it is the Telit 862 GSM/GPS module that does all the phone stuff... It's 130mm long and 75mm wide, quite big, but that's a 3.7" 480x640 LCD, and it's easier to fit off-the-shelf parts inside :-)



MyPhone Making Progress and ETel Conference

I'm at the O'Reilly ETel conference this week, lots of new and interesting things happening. I've met the OpenMoko people and held the phone (nice hardware, quite compact).

Several people from the SVHMPC are also attending; we have been showing off the cases I made a few weeks ago, and I've been spending way too long on a new and much cooler case design. I've posted my design, including all the CAD files (this is a completely open source project, including hardware designs), to a page on the Homebrew Mobile Wiki.

That's the main reason there have been relatively few postings in the last week...

Joost sends out a batch of beta invites

I only got two, they are both already taken, and I don't have any spare, but several people are selling Joost invites on eBay for a few bucks...

Friday, February 16, 2007

Skype downloads approach 500 Million

In the next few days, Skype will pass the 500 million downloads mark. I seem to remember someone saying that Kazaa was the most downloaded program ever, and looking at the Kazaa home page, they currently show 389 Million. Skype is at 498 Million as I write this, and is downloading at almost a million a day. The download rate was over 1M a day when they made a new release available and existing users were getting updates. Skype now downloads in a day what Kazaa downloads in a week.

How are the competition doing? Vonage added a few hundred thousand users last quarter. That's a few orders of magnitude away from being a competitor!

Of course, Kazaa is based on an earlier version of the P2P technology that powers Skype, and which is about to come to the world again via Joost. Joost just released a new beta for Windows, and their first beta for Intel-based OSX. I still can't justify upgrading from my own Powerbook G4 (it just works), so I'll have to wait for them to backport to the older systems, and use my work laptop.

Monday, February 12, 2007

AMD Enhanced Power Now - Variable Cores and Clock Rates

There is an interesting article in The Register about the latest variant of AMD's enterprise power management system.

As I've mentioned before, in the interests of saving power, some enterprise server systems are varying their clock rates so that they end up showing higher utilization at low load levels than you would expect. This non-linear relationship of load to utilization is one of the things I highlighted in my CMG06 paper called "Utilization is Virtually Useless as a Metric!".
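
As a toy illustration of the effect (my own numbers, not from the paper), suppose a CPU hypothetically drops to half clock whenever demand is below 30% of its full-speed capacity:

# toy model: apparent %busy when the clock slows under light load
utilization <- function(demand) {
    clock <- ifelse(demand < 0.3, 0.5, 1.0)  # assumed fraction of full clock rate
    pmin(1, demand/clock)                    # busy fraction at that clock rate
}
utilization(c(0.1, 0.2, 0.4, 0.8))

At half clock, work worth 10% of full-speed capacity takes twice as long to get through, so the CPU reports 20% busy, and utilization no longer tracks load linearly.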

The latest twist: in AMD's upcoming four core systems, individual cores will be stopped completely if there isn't enough work for them to do.

The effect on utilization metrics will depend upon how each operating system interacts with the power management capabilities...

For Solaris, the so-called "idle loop" is actually quite busy. An idle CPU watches its neighbors to see if they have too many jobs on their run queues, and takes on work by migrating processes that won't get to run soon to itself. Interrupts are also bound to individual CPUs, so that data structures don't have to migrate between caches at high interrupt rates.

It will be interesting to see how these technologies interact.

Adrian

Wednesday, February 07, 2007

Embedded songs from iJigg

This is a very nice embeddable player from iJigg; below is Continuum from my friends in Fractal, and also Smuggled Mutation by Estradasphere, which is from Palace of Mirrors, one of my most-played CDs in recent history. It's so varied and well played that it's impossible to get bored by it.



Fractal on iJigg

Another music website, where users tag and vote for songs they like - kind of Digg for music. Worth a try, so I created an account and uploaded Fractal's title track from their first CD - Continuum.

What do you think?

Adrian

New Joost Beta

They just released a new build; it fixes some problems and has some minor user interface enhancements. I left it running for a while and now I'm starting to run out of content that I want to watch... There are about 30 channels for beta testing. Probably the most demanding content is live music video: Joost has fine sound quality and keeps up with very rapid on-screen action better than I would expect in full-screen mode on my Dell laptop. It becomes a bit more pixellated when it's working hard to keep up, and the sound glitches to repeat a sub-second fragment now and again if it gets behind due to network slowdowns. On low-action images, it's very nice and clear.

In comparison, I've noticed that the Slingbox feed of my TiVo does constant-pitch sound stretching to slow down the display at the start of a show; that's how it sneakily builds up a buffer without making you wait.

The Joost folks are promising new content, an OSX version and an expansion of the beta program soon. I've had a few comments requesting beta invites, and I haven't had any to give. If and when I have any spare invites, I'll post them, so hold your comments...

Saturday, February 03, 2007

You can help search for Jim Gray

I met Jim last year when he gave a talk at eBay. His sailboat went missing, and a lot of people have collaborated to allow up-to-date satellite images to be posted on the web for anyone to take a look at and see if they can spot anything that might be his boat.

My (ex-Sun DE) friend Geoff Arnold now works at Amazon; he blogged that they have configured a web service called Mechanical Turk to support parcelling out the work of looking at images. If you already have an Amazon.com account, signing up for the service takes a couple of clicks.

I hunted through images for a while, and found a cargo ship, that's all :-(

Geoff's blog
Werner's blog
Search for Jim

Please take a look yourself and pass the Mechanical Turk search for Jim URL on...

Friday, February 02, 2007

Who needs a custom built Myphone?

The mainstream phone manufacturers are looking for hits in the mass market, and looking for large niches to broaden their market. However, if you take a "long tail" viewpoint, everyone wants a slightly different phone, and many people have phones that are just the least bad ones they could find.

How about phones for people with poor eyesight? If you are over 50, can you read the small print on your phone's screen without reading glasses? I have a friend who has a medical condition that causes very poor eyesight, and sets her 17" laptop screen resolution very low so she can read it. She would like a phone that has just the applications she needs, big easy to find icons for them, and big fonts on a large screen for the address list and other applications. Hard to find in the shops, but easy to custom build.

Thursday, February 01, 2007

Making a case for Myphone


I mean that in both senses of the phrase "making a case". I have produced a lump of plastic, and I have good reasons why I want to build my phone.

Here are the first parts I made, two copies of an LCD bezel with a rounded outside surface and a cutout to locate the LCD. The standing part is as it was made in the 3D printer, sideways on to get a smoother finish, and with additional "support material" that holds everything in place while the "model material" sets. The part lying flat is the same, but is face down and has the support material removed. The size is about 5" by 3" and the bezel is 0.1" thick. It took about 7 hours to print the pair of bezels, each contains 0.6 cubic inches of model material and the total including support material was about 1 cubic inch. The bezels cost $10 each to make at Techshop in Menlo Park ($100/month for membership, $10/cu inch to use the 3D printer).

This experiment went quite well, and the next step was to extend the design to form a complete top and bottom case that fit together. This was kicked off using black rather than white plastic, lying flat, face down. This prints more quickly, but uses more support material to create the base. The base will give the phone a textured surface, which can be made shiny by dipping in acetone, and I'm hoping it will look cool, like a carbon fibre finish. Pictures of the design, CAD files, and a photo of the 3D printer in action can be found at the SVHMPC Wiki; I'm updating them as work progresses. The case walls are thicker than they need to be, so the final design will use less plastic and cost less than the $40 I was charged to make this complete case.

The purpose of this build is to figure out where to put the contents, so I can mold support brackets into the design and make an attempt at a working prototype.

Why am I doing this? Some of us are attending the Emerging Telephony conference in March, and I want to have something to show. Also, it's fun to make things, it's fun to hang out at Techshop, and I'm learning a lot from the 3D printer instructor. In the end I will have a phone that evolves rather than being thrown away every few years. If I want more memory, a bigger LCD, or different trade-offs in features/size/battery life, I don't have to start from scratch or accept someone else's set of compromises. That's why I call it my phone.

Friday, January 26, 2007

myPhone will be a TuxPhone

There is lots of interest and activity in the Silicon Valley Homebrew Mobile Phone Club right now, centered around building a phone with a large touch screen. I'm calling it "myPhone", and TuxPhone is the generic name for the open software and hardware components we are using. The hardware design is based on components that are openly available with freely available spec sheets; we will make our schematics, design details and CAD files freely available, and it's quite likely that no two phones will be built the same way. If you want it thinner and lighter, go for fewer features and a smaller battery, and tweak the CAD file before making the case.

The size is driven primarily by the screen, and we've settled on the screen that is used by the Sony PSP, with a touchscreen overlay added. It's a 4.3" diagonal, widely available, and low cost. We will include a standard GSM/EDGE module, so I can just pop the SIM card from my Treo into it, and we can add GPS, WiFi, Bluetooth, accelerometer, SD cards or hard disks, whatever custom mix we feel like. The CPU and other parts will be based on Gumstix modules.

Friday, January 19, 2007

More on Open Phones

Thanks to a comment on my previous open phone post, I am reminded that I forgot to mention OpenMoko. This is a neat-looking touch screen Linux phone that should be available in February 2007. I haven't seen one yet, although I have played with the Greenphone, and I know someone who has ordered and received a Greenphone, so that project is ahead by a couple of months.

In the homebrew mobile club, we are talking about making something with a physically bigger (perhaps 3.7" VGA) touch screen than the OpenMoko (2.8" VGA) or Greenphone (smaller QVGA), big enough to show a more usable keyboard. The existence of multiple open phone projects is great, since the nature of the open source community is that we can all share and reuse code for device drivers, user interface components and applications. We aren't building from scratch every time we want a new feature. In the meantime several of us will have Greenphones, OpenMokos and TuxPhones to play with.

The big screen is pretty much the most expensive part. I like the concept of building a phone, keeping the screen and CPU for a long time, and doing upgrades like replacing the GSM modem with a 3G one by performing electronic surgery inside the case.

Thursday, January 18, 2007

Joost and The Venice Project Beta Testing

I signed up for TVP's beta program a month or two ago, and got accepted recently, so I've been playing with it. A few days ago they launched under the name Joost, which I find amusing since I have a very good Dutch friend called Joost (pronounced yoast) and I'm not sure if I'm supposed to say it as yoast or juiced....

I've had a TiVo for years now, and Joost has a very nice user interface that works for me like a TiVo that I can run anywhere, and which has tens of terabytes of stored programs, not just the 30GB or whatever that I have inside my TiVo. The other thing that is better than TiVo is the finding experience. I was able to type "Lotus" into a search box, and find two episodes from "5th Gear" where they were track testing my favorite car. The content is largely European TV at present, and they have enough programs to keep beta testers happy for a while, but are going to need a lot more as the program launches.

The display quality is good, and there is a short delay when a program is first selected, especially if it's one of the more obscure programs. The networking is something like a securely encrypted in-order bittorrent. You don't have to wait for the whole program to load before you start watching it.

I posted last year on disruptive innovation as it applies to the moving pictures industry, discussing the concept of a maturity model for innovation, evolution from the cinema to thepiratebay, and a more abstract maturity model. In the final part I made this statement:
The final phase in the evolution of a market occurs as the cost of replication and distribution of the product approaches zero. There is no end user cost to use the Internet to download a movie. A central utility such as YouTube can use a mixture of advertising and premium services (for a minority of power users) to offset their own costs. Peer to peer systems distribute the load so that there is no central site and no incremental cost in the system. The only service that is needed is some kind of search, so that peers can find each other's content to exchange it.
I was talking about how bittorrent based video is inevitably going to dominate over centralized services like YouTube, and there have been reports that most of the traffic on the internet is bittorrent. The inconvenience of bittorrent is that it is basically an overnight batch job to get a program. Joost fixes the problems of bittorrent, while staying as close as possible to free distribution. Joost inserts a small number of short adverts that the Joost client figures out how to target to your interests. These are intended to be enough to pay for the central seeding servers, and to pay the content owners so good programs become available, without becoming intrusive enough to switch off the users.

Based on my own maturity model, Joost is nicely set up to be the end game for this market. That doesn't mean that we stop going to the cinema or watching YouTube, just that Joost could end up being "bittorrent for the masses", and be an order of magnitude bigger than everything else.

Tuesday, January 16, 2007

Template Update

I just realised that my blog wasn't displaying properly in Internet Explorer. I couldn't figure out why, so I upgraded to a fresh new Blogger.com template that has a lot more configurability, and dropped the ads etc. back into it. The formatting is still a bit off, and it looks better in Flock/Firefox, but at least the basic blocks are all in the right place now.

The archive of old posts is also easier to navigate with the new template.

Build your own phone, any way you want it!

As an antidote to all the grumbling about the lack of an open developer approach to the Apple iPhone, I'm going to talk about some projects that are building open phone platforms.

I'm a member of the Silicon Valley Homebrew Mobile Phone Club, and there are several projects in existence to create a free, open source, software stack and applications for mobile phones. The hardware parts are available off the shelf, so you can buy a complete open phone or use whatever parts you like. All you need is an enclosure, and you are on your way to whatever spec you wanted, with whatever applications you wanted.

Actually, if someone out there knows AutoCAD and wants to help design some enclosures, the assistance would be appreciated. The SVHMPC often meets at Techshop, where they have a 3D printer that can make anything that can be described by a CAD file in ABS plastic.

The phones use off-the-shelf GSM modems that take a SIM card and deal with the network. They operate like any unlocked phone, and the GSM modem module prevents anything bad from happening to the network. The command set looks a lot like the old Hayes modem AT codes, and Linux device drivers exist to manage some of the common ones. Some of the GSM modems include GPS location services as well; most of them are based on the relatively slow GPRS/EDGE standards for Internet connections.

My own ideal phone starts with a 3.7" VGA resolution touch screen and includes WiFi, Bluetooth, GPS and a 3G network, but I may have to put up with GPRS for now. The CPU is based on 400MHz ARM chips from Gumstix, and the parts list from the OpenCell project gives you some idea of what's available.

It's going to cost more than an iPhone, and it won't be as thin or as cute, but I will be able to make it do whatever I want, and it will be running as much of a standard Linux 2.6 distribution as I feel like.

I built my first three home computers up from bare PCBs, I still have my soldering iron, and I'm not afraid to use it :-)

Monday, January 15, 2007

What the iPhone doesn't do (yet) and thoughts on why not.

There has been a lot of commentary, complaints and opinions on the iPhone. I haven't seen much discussion of its features and strategy from the perspective of the realities of product development. In my opinion, what Apple have done is the right set of strategic and tactical moves for the first product in a new family. What was announced and shown is not the final feature set for the initial device and it does not include the full vision of the product.

Let's look at the timing of the announcement. As a new entrant in the mobile phone marketplace, the correct strategy is to pre-announce. There is no existing product from Apple to cannibalize, and there is only partial overlap with the iPod market. The announcement was made after the holiday shopping season, and the timing is set up to get a volume ramp in place for the 2007 holiday shopping season. The initial launch is based on the minimum marketable features (MMF) required to address the Apple-oriented consumer marketplace. Rather than wait until the full feature set is ready or create low quality solutions for a wide feature set, the Apple strategy is to develop a small number of features to an extremely high level of quality and integration, and focus on the needs of their core market of existing OSX and iPod users.

Let's look at disclosure-related issues. For a phone to be released as a product, it has to go through FCC testing that takes a month or two, and the FCC process is relatively open. All the new phones are scooped by Engadget Mobile before they turn up in stores. For the big splash product announcement, it needed to be scheduled before Apple turns over an iPhone to the FCC test process. In order to keep details on the product quiet for as long as possible, it is also much easier to do the initial launch before completing negotiations with key third party application developers like Adobe and Microsoft. I've heard that there is no Adobe Flash support in the device at present, and there is a clear need to support Microsoft Office at some point in the future. These omissions are easily fixed; it's just a matter of time.

New models in the iPod range are announced when they are basically in stock in the stores. If you take the iPhone package, and remove the phone parts, keeping the iTunes music and video functionality and WiFi/web connectivity, you are left with a very nice looking wide-screen networked iPod. Its main issue would be the relatively small flash capacity, so that could be increased, or a hard disk could possibly be crammed into the package. I would not expect anything like this to be announced until it is completely ready and in stock, but if it exists, it could end up being released this summer around the same time as the iPhone actually ships. Since it isn't a phone, it's outside the Cingular agreement, but adding a WiFi-only VOIP client like iChat or Skype would create a product that competes with the Sony Mylo.

During the demos, no-one tried to show the iPhone's camera, which indicates to me that it isn't finished, and I hear elsewhere that they are still working on video capture. For use as a video-phone, the camera is on the wrong side: you can't see the display while you are on-camera. This makes it seem less likely that a full iChat function will be included in the initial package.

Apple is getting a lot of criticism for its locked-down and controlled approach to third party software on the iPhone, and for the lack of a developer program. Developer support falls outside the minimum marketable features required for initial launch into the consumer marketplace. By taking full control of the product, Apple can make sure that very high quality standards are in place, and that applications integrate with the iPhone experience. The reality of product development also makes it hard to build a stable developer API until the product is finished, so I fully expect a phased developer program. The initial phase included applications like Google Maps and Cingular Visual Voicemail from development partners (and I expect some kind of GPS location service to appear in the product soon - perhaps even in the initial release). The second phase will be a closed private developer program including big partners like Microsoft and Adobe. The third phase will start to open up to the Apple developer community, with stable public APIs and developer tools. Extensibility is an MMF for the professional/consumer (prosumer) and enterprise marketplaces, along with Microsoft Office support. This may take a year or so to arrive; it's inevitable, but I can see why it's not a feature of the initial product launch.

There has also been a lot of grumbling about Cingular and the lack of 3G service. Apple have dropped hints that they will support 3G sooner rather than later. My guess is that 3G is considered an MMF for the European and Asian markets, so I wouldn't be surprised if the models launched in those markets in late 2007 and 2008 included 3G support, and as Cingular's own 3G network continues to roll out over the USA, the timing would make sense here as well. The real alternative to Cingular for Apple would be to set up their own Mobile Virtual Network Operator (MVNO) like Helio or Virgin Mobile. This is a big complicated thing to do without any experience, so my guess is that they decided that the highest priority was to get the product launched with a big network partner like Cingular, and to decide later on whether it is worth creating an MVNO for a less compromised product. So when the exclusivity arrangement with Cingular expires, they may well focus on their own MVNO services.

So that's my opinion, as someone who has developed products and strategies in the past and understands the compromises, but with no inside information on their actual plans. My own plan is to avoid the initial release, and see what the product looks like for the Xmas 2007 shopping season.

iPhone and Treo at MacWorld

I went to MacWorld to check out the Apple iPhone. I think it is even more impressive in person than I was expecting. It's smaller and much thinner than I imagined from looking at pictures, and watching the live on-stage demos made it clear just how much of a step forward in usability this is for a mobile device.
I took the picture above using my Treo650 of someone else taking a picture with a Treo650, so the size can easily be compared. I was embarrassed to get the Treo out anywhere near the iPhone; it just felt wrong. The overall width and length of the device is similar, but the Treo is about twice as thick. I tried to take a picture edge-ways to show how thin the iPhone is, but the Treo camera is a crappy low resolution one and the iPhone is so thin it effectively disappeared in the photo.

Tuesday, January 09, 2007

Apple iPhone Unanswered Questions...

Like everyone else, I think it looks great. The big difference between Apple and other vendors, for just about everything they make, is craftsmanship; everything else looks sloppy in design and execution by comparison.

So to the questions: they really revolve around whether this is a pocket OSX machine or a locked-down phone that happens to use some parts of OSX.
  • What CPU does it use? Can it run unmodified OSX applications, or do they have to be downsized and ported to a new CPU? (XScale/ARM is what Treos use.)
  • How much memory is there for running applications? The 8GB Flash is basically filesystem space, so there must be some RAM as well.
  • Is it open for more applications to be loaded? For example Firefox, Skype, Slingplayer, OSX Terminal, whatever..... or is it locked down?
  • What is the model for developers to extend the platform?
I'm sure this will become clearer fairly soon, but all the commentary I'm reading is just going "oh wow" and no-one is asking or answering these fundamental questions that have big long-term implications...

Anyway, next year's Xmas present request list will have one of these on it!

[Update...]
As an existing Treo650 user, I find the discussion of the iPhone at TreoCentral very interesting. They imply that the CPU is an ARM architecture with an additional graphics processor, and that the iPhone is closed and controlled so you can't add applications to it. We'll see how long that lasts; either Apple will open it up or the hackers will have a lot of fun working out how to crack the design, just like they did for the games consoles, TiVo etc.

Sunday, January 07, 2007

Going Solar? Get a weather station...

There's nothing like a 36-hour power outage (a tree fell and took the power lines with it) to make you think about where your energy comes from. We could get a generator, but it would only be used for a few days a year. If we get solar power, it pays for itself over time and we would get daytime power at least....

We have a very sunny high altitude microclimate, on top of the Santa Cruz mountains with a south facing roof. It seems ideal, but the economics are quite dependent on how much sun we actually get. We called up some local solar power companies to get an idea of what would be involved, but in the short term we decided that it would be good to measure our local weather.

I've been using the Weather Underground web site for a while, to track actual local conditions such as wind speed at a station a few miles away. This site contains all you need to know about setting up your own station, and there is plenty of low cost hardware and software to do it.

I think the amount of technology available for the price is amazing. I picked the Oregon Scientific WMR100 wireless weather station. It comes with a full set of sensors for temperature, humidity, barometric pressure, rainfall, and wind speed and direction, which connect to the base station wirelessly. It has a USB connection for uploading to a PC. List price is over $200, but I bought it on eBay for $129.99 plus $18 shipping, which was the best overall deal at the time.

I plan to add the optional UV sensor to get an indication of sunlight (the UV sensor for this model is not yet available; I was told it should be out in a couple of months), and to monitor two temperature sensors, one in the shade and one in full sunlight. I should be able to figure out which days are clear and sunny, and for how long.

[Update]
It arrived, I corrected a few details about the product features above, and I've assembled it and got the sensors to connect over the wireless, ready to install outside. It's nicely made, and everything seems to work fine.

Slingbox-ing my TiVo

As I mentioned before, we got a Slingbox AV this Xmas, and I'm quite impressed. It was easy to set up, has been no trouble since, and now we can watch and control our TiVo from anywhere. The main use we have is to watch programs we have saved on our TiVo from anywhere in the house, and it runs a full TV-resolution feed that uses a few Mbit/s to support this on our WiFi-based laptops. When running on a work PC laptop (Dell Inspiron with XP) the performance was very smooth; however, on my three-year-old Apple Powerbook G4 I'm down on CPU power and running a beta version of the Sling player that seems to need some more tuning. If I kill just about everything else on the Mac, it plays fairly well, but the picture keeps stalling if anything else is running.
I'm hoping that they will do some more tuning to fix this in the final release for the Mac...

Wednesday, December 27, 2006

Skype 3.0.0.190 is now in general release for Windows; amongst all the changes mentioned in the Skype Garage post is this little gem:

change: API: application can connect to oneself


This addresses an interesting issue. Unlike most network addressing schemes, Skype connects to a username, and there is nothing to stop a user running Skype on many machines as the same user. When app2app messaging connects to a user, you get an array of streams that connect you to all the endpoints for that user.

However, in the past you could not make a connection to yourself, and now you can. So if you connect to your own username you will get back an array of streams to your other instances. This could become quite useful for keeping all kinds of things synchronized across multiple machines.

Saturday, November 25, 2006

Processing vxstat to read into R

I got bored with my iostat data, and found some interesting-looking vxstat logs to browse with the Cockcroft Headroom Plot. To get them into a regular format I wrote a short awk script, shown below. It skips the first record, adds a custom header and drops the time field into the first column.


# process vxstat file into regular csv format
BEGIN { skipping=1; printf("time,vol,reads,writes,breads,bwrites,tread,twrite\n"); }
NR < 4 {next} # skip header
NF > 0 && skipping==1 {next} # skip first record of totals since boot
NF == 0 {skipping=0}
NF == 5 {time=$0}
NF == 8 {printf("%s,%s,%s,%s,%s,%s,%s,%s\n",time,$2,$3,$4,$5,$6,$7,$8);}


It turns a file that looks like this:

OPERATIONS BLOCKS AVG TIME(ms)
TYP NAME READ WRITE READ WRITE READ WRITE

Mon May 01 19:00:01 2000
vol home 88159 346799 17990732 3680604 13.7 15.6
vol local 64308 103869 3848746 410899 6.0 22.0
vol orahome 80240 208372 18931823 886870 11.9 21.1
vol rootvol 336544 537741 21325442 8566302 4.8 323.1
vol swapvol 32857 339 4199304 58160 13.8 22.5
vol usr 396221 174834 11766646 2872832 3.5 547.6
vol var 316340 1688518 25138480 19275428 11.1 53.7

Mon May 01 19:00:31 2000
vol home 1 28 4 129 10.0 34.3
vol local 0 2 0 8 0.0 330.0
vol orahome 4 20 24 88 10.0 84.0
vol rootvol 0 80 0 720 0.0 9.4
vol swapvol 0 0 0 0 0.0 0.0
vol usr 0 1 0 16 0.0 20.0
vol var 4 235 54 2498 15.0 13.7

... and so on


into

% awk -f vx.awk < vxstat.out
time,vol,reads,writes,breads,bwrites,tread,twrite
Mon May 01 19:00:31 2000,home,1,28,4,129,10.0,34.3
Mon May 01 19:00:31 2000,local,0,2,0,8,0.0,330.0
Mon May 01 19:00:31 2000,orahome,4,20,24,88,10.0,84.0
Mon May 01 19:00:31 2000,rootvol,0,80,0,720,0.0,9.4
Mon May 01 19:00:31 2000,swapvol,0,0,0,0,0.0,0.0
Mon May 01 19:00:31 2000,usr,0,1,0,16,0.0,20.0
Mon May 01 19:00:31 2000,var,4,235,54,2498,15.0,13.7
... and so on


This can easily be read into R and plotted using


> vx <- read.csv("~/vxstat.csv", header=T)
> vxhome <- vx[vx$vol=="home",]
> chp(vxhome$reads,vxhome$tread)


One of the files I tried was quite long, half a million lines. It loaded into R in fifteen seconds, and the subsequent analysis operations didn't take too long. Try that with a spreadsheet... :-)

Slingbox for Xmas

What new toys can we get this Xmas? I already have the stuff I need. I'd like a phone with Wifi and 3G network speeds and a touch screen, but my Treo 650 is OK until something better comes along. I'm curious to see what Apple may come up with next year, in the much rumoured iPhone.

I've had a Tivo since 1999, and I'd like to be able to view the programs elsewhere in the house or further afield. The Slingbox does this, lets me control the Tivo remotely and stream the programs to a Windows or OSX laptop. The Slingbox AV was $179 list price on their web site, but I had a look on shopping.com and found it for sale from an out of state vendor for $140 with free shipping and no tax. So that's going to be the new toy this Xmas....

Thursday, November 23, 2006

Cockcroft Headroom Plot - Part 3 - Histogram Fixes

I found that I had some scaling issues with the histograms that needed fixing. Ultimately this made the code look a lot more complex, but it now deals with scaling the plot and the histogram with a fixed zero origin on both axes. I think it's important to maintain the zero origin for a throughput vs. response time plot.

The tricky part is that the main plot is automatically oversized from its data range by a few percent, and the units used in the histogram are completely different. A histogram with 6 bars is scaled to have the bars at unit intervals, so it is 6 wide plus the width of the bars etc. After lots of trial and error, I made the main plot use the maximum bucket size of the histogram as its max value, and artificially offset the histograms by what looks like about the right amount. The plot below uses fixed data as a test. You can see that the first bar includes two points; that's due to the particular bucketing algorithm used by R. Some alternative histogram algorithms are available, but this one seems to be most appropriate to throughput/response time data.

> chp(5:10,5:10)



The updated code follows.

chp <- function(x, y, xl="Throughput", yl="Response", tl="Throughput Over Time",
                ml="Cockcroft Headroom Plot") {
    xhist <- hist(x, plot=FALSE)
    yhist <- hist(y, plot=FALSE)
    xbf <- xhist$breaks[1]                      # first
    ybf <- yhist$breaks[1]                      # first
    xbl <- xhist$breaks[length(xhist$breaks)]   # last
    ybl <- yhist$breaks[length(yhist$breaks)]   # last
    xcl <- length(xhist$counts)                 # count length
    ycl <- length(yhist$counts)                 # count length
    xrange <- c(0, xbl)
    yrange <- c(0, ybl)
    nf <- layout(matrix(c(2,4,1,3), 2, 2, byrow=TRUE), c(3,1), c(1,3), TRUE)
    layout.show(nf)
    par(mar=c(5,4,0,0))
    plot(x, y, xlim=xrange, ylim=yrange, xlab=xl, ylab=yl)
    par(mar=c(0,4,3,0))
    barplot(xhist$counts, axes=FALSE,
            xlim=c(xcl*0.03 - xbf/((xbl-xbf)/(xcl-0.5)), xcl*0.97),
            ylim=c(0, max(xhist$counts)), space=0, main=ml)
    par(mar=c(5,0,0,1))
    barplot(yhist$counts, axes=FALSE, xlim=c(0, max(yhist$counts)),
            ylim=c(ycl*0.03 - ybf/((ybl-ybf)/(ycl-0.5)), ycl*0.97),
            space=0, horiz=TRUE)
    par(mar=c(2.5,1.7,3,1))
    plot(x, main=tl, cex.axis=0.8, cex.main=0.8, type="S")
}

Monday, November 20, 2006

Cockcroft Headroom Plot - Part 2 - R Version

I kept tweaking the code, and came up with a prettier version that also has a small time series view of the throughput in the top right corner.



The code for this is

chp <- function(x, y, xl="Throughput", yl="Response", tl="Throughput Time Series",
                ml="Cockcroft Headroom Plot") {
    xhist <- hist(x, plot=FALSE)
    yhist <- hist(y, plot=FALSE)
    xrange <- c(0, max(x))
    yrange <- c(0, max(y))
    nf <- layout(matrix(c(2,4,1,3), 2, 2, byrow=TRUE), c(3,1), c(1,3), TRUE)
    layout.show(nf)
    par(mar=c(5,4,0,0))
    plot(x, y, xlim=xrange, ylim=yrange, xlab=xl, ylab=yl)
    par(mar=c(0,4,3,0))
    barplot(xhist$counts, axes=FALSE, ylim=c(0, max(xhist$counts)), space=0, main=ml)
    par(mar=c(5,0,0,1))
    barplot(yhist$counts, axes=FALSE, xlim=c(0, max(yhist$counts)), space=0, horiz=TRUE)
    par(mar=c(2.5,1.5,3,1))
    plot(x, main=tl, cex.axis=0.8, cex.main=0.8, type="S")
}


I also made a wrapper function that steps through the data over time in chunks.

chp.step <- function(x, y, steps=10, secs=1.0) {
    xl <- length(x)
    step <- xl/steps
    for (n in 0:(steps-1)) {
        Sys.sleep(secs)
        chp(x[(1+n*step):min((n+1)*step, xl)], y[(1+n*step):min((n+1)*step, xl)])
    }
}


To run this smoothly on Windows, I had to disable double buffering using

> options("windowsBuffered"=FALSE)

and close the graphics window so that a new one opens with the new option.

The data is displayed using the same calls as described in Part 1. The next step is to try some different data sets and work on detecting saturation automatically.

Sunday, November 19, 2006

The Cockcroft Headroom Plot - Part 1 - Introducing R

I've recently written a paper for CMG06 called "Utilization is Virtually Useless as a Metric!". Regular readers of this blog will recognize much of the content in that paper. The follow-on question is what to use instead? The answer I have is to plot response time vs. throughput, and I've been thinking about a very specific way to display this kind of plot. Since I'm feeling quite opinionated about this I'm going to call it a "Cockcroft Headroom Plot" and I'm going to try and construct it using various tools. I will blog my way through the development of this, and I welcome advice and comments along the way.

The starting point is a dataset to work with, and I found an old iostat log file that recorded a fairly busy disk at 15 minute intervals over a few days. This gives me 250 data points, which I fed into the R stats package to look at. I'll also have a go at making a spreadsheet version.

The iostat data file starts like this:
                    extended device statistics              
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
14.8 78.4 183.0 2446.3 1.7 0.6 18.6 6.6 1 21 c1t5d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 5.0 0 0 c0t6d0
...

I want the second line as a header, so save it (my command line is actually on OSX, but could be Solaris, Linux or Cygwin on Windows)
% head -2 iostat.txt | tail -1 > header

I want the c1t5d0 disk, but don't want the first line, since it's the average since boot, and I want to add back the header
% grep c1t5d0 iostat.txt | tail +2 > tailer
% cat header tailer > c1t5.txt

Now I can import into R as a space-delimited file with a header line. R doesn't allow "/" or "%" in names, so it rewrites the header to use dots instead. R is a script-based tool with a command line and a very powerful vector/object-based syntax. A "data frame" is a table-of-data object, like a sheet in a spreadsheet; it has names for the rows and columns, and can be indexed.
> c1t5 <- read.delim("c1t5.txt",header=T,sep="")
> names(c1t5)
[1] "r.s" "w.s" "kr.s" "kw.s" "wait" "actv" "wsvc_t" "asvc_t" "X.w" "X.b" "device"

I only want to work with the first 250 data points, so I subset the data frame by indexing the rows with an array (1:250) that selects the rows I want, leaving the column selector blank.
> io250 <- c1t5[1:250,]

The first thing to do is summarize the data; the output is too wide for the blog, so I'll do it in chunks by selecting columns.

> summary(io250[,1:4])
r.s w.s kr.s kw.s
Min. : 1.80 Min. : 1.8 Min. : 13.5 Min. : 38.5
1st Qu.: 10.30 1st Qu.: 87.1 1st Qu.: 107.4 1st Qu.: 2191.7
Median : 18.90 Median :172.4 Median : 182.8 Median : 4279.4
Mean : 22.85 Mean :187.5 Mean : 290.1 Mean : 4448.5
3rd Qu.: 28.88 3rd Qu.:274.6 3rd Qu.: 287.4 3rd Qu.: 6746.6
Max. :130.90 Max. :508.8 Max. :4232.3 Max. :13713.1
> summary(io250[,5:8])
wait actv wsvc_t asvc_t
Min. : 0.000 Min. :0.0000 Min. : 0.000 Min. : 1.000
1st Qu.: 0.000 1st Qu.:0.3250 1st Qu.: 0.400 1st Qu.: 3.125
Median : 0.600 Median :0.8000 Median : 2.550 Median : 4.700
Mean : 1.048 Mean :0.9604 Mean : 5.152 Mean : 4.634
3rd Qu.: 1.300 3rd Qu.:1.5000 3rd Qu.: 6.350 3rd Qu.: 5.700
Max. :10.600 Max. :3.5000 Max. :88.900 Max. :15.100
> summary(io250[,9:10])
X.w X.b
Min. :0.000 Min. : 2.00
1st Qu.:0.000 1st Qu.:20.00
Median :1.000 Median :39.50
Mean :1.428 Mean :37.89
3rd Qu.:2.000 3rd Qu.:55.00
Max. :9.000 Max. :92.00


Looks like a nice busy disk, so let's plot everything against everything (pch=20 sets a solid dot plotting character)
> plot(io250[,1:10],pch=20)
The throughput is either reads+writes or KB read+KB written, the response time is wsvc_t+asvc_t since iostat records time taken waiting to send to a disk as well as time spent actively waiting for a disk.

To save typing, I attach to the data frame so that the names are recognized directly.
> attach(io250)
> plot(r.s+w.s, wsvc_t+asvc_t)
This looks a bit scattered, because there is a mixture of average I/O sizes that varies during the time period. Let's look at throughput in KB/s instead.
> plot(kr.s+kw.s,wsvc_t+asvc_t)
That looks promising, but it's not clear what the distribution of throughput is over the range. We can look at this using a histogram.
> hist(kr.s+kw.s)

We can also look at the distribution of response times.
> hist(wsvc_t+asvc_t)
The starting point for the thing that I want to call a "Cockcroft Headroom Plot" is all three of these plots superimposed on each other. This means rotating the response time plot 90 degrees so that its axis lines up with the main plot. After looking around in the manual pages I eventually found an example that I could use as the basis for my plot. It needs some more cosmetic work but I defined a new function chp(throughput, response) shown below.

chp <- function(x, y, xl="Throughput", yl="Response", ml="Cockcroft Headroom Plot") {
    xhist <- hist(x, plot=FALSE)
    yhist <- hist(y, plot=FALSE)
    xrange <- c(0, max(x))
    yrange <- c(0, max(y))
    nf <- layout(matrix(c(2,0,1,3), 2, 2, byrow=TRUE), c(3,1), c(1,3), TRUE)
    layout.show(nf)
    par(mar=c(3,3,1.5,1.5))
    plot(x, y, xlim=xrange, ylim=yrange, main=xl)
    par(mar=c(0,3,3,1))
    barplot(xhist$counts, axes=FALSE, ylim=c(0, max(xhist$counts)), space=0, main=ml)
    par(mar=c(3,0,1,1))
    barplot(yhist$counts, axes=FALSE, xlim=c(0, max(yhist$counts)), space=0,
            main=yl, horiz=TRUE)
}

The result of running chp(kr.s+kw.s,wsvc_t+asvc_t) is close...


That's enough to get started.

PS3 Marketplace Research on eBay


Over at Data Mining there is some interesting info on PS3s.

However, there is no need to do manual scraping of eBay; here is a screenshot from the marketplace research function that is bundled with my eBay store subscription. For $2.99 for two days' access, anyone can get at this.

http://pages.ebay.com/marketplace_research/

Skype on Solaris

http://blogs.sun.com/darren/entry/skype_1.3.0.53_on_solaris_via

Solaris has a Linux-compatible subsystem called BrandZ for running Linux binaries that don't have Solaris builds (like Skype). Darren figured out how to get the Linux build of Skype to run on OpenSolaris.

Thanks to Alec for pointing this out.


Saturday, November 11, 2006

10 Things to Know About Skype Ap2Ap Programming

I also posted this on the Skype Developer Wiki

The ap2ap capability is an interesting new network computing paradigm, but it is not like a conventional network.
  1. end nodes are addressed by Skype name, which addresses a person, not a computer

  2. people can log in to Skype multiple times, so addressable endpoints are not unique

  3. Skype can go online/offline at will, so there is a concept of "presence" that needs to be managed

  4. you can only make ap2ap connections to your buddy list or people who you have chatted to "recently"

  5. both ends of an ap2ap connection have to choose a unique string used to identify their conversation or protocol

  6. if you quit and restart Skype, the first login can persist for a while, so you can get multiple ap2ap connections from a single user, although the ghosts of your previous connections cannot respond to a message. I think this is because you connect to a different supernode each time, and the first one isn't sure if you have really gone away yet

  7. messages have to be sent as text, so binary objects have to be converted first using something like base64 (see the sketch after this list)

  8. the network can behave differently each time you use it, and this non-determinism makes testing difficult

  9. relayed connections are limited to about 3KB/s, direct ones can run at several MB/s over a LAN

  10. Skype4Java is cross-platform, but the maximum message size is about 64KB on Windows and 16KB on OSX/Linux, and there are several bugs and limitations in the older version of the API library that is used by Skype 2.0 and earlier releases. Use Skype 2.5 or later for the best performance and stability
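
As a sketch of the text-encoding step in item 7, here is a base64 round trip in R (using the base64enc package purely for illustration; a Skype4Java client would do the equivalent in Java). Remember that base64 inflates the payload by about a third, which eats into the message size limits in item 10.

# sketch of item 7: binary payloads must cross ap2ap as text
library(base64enc)  # illustration only, not part of any Skype API

payload <- as.raw(c(0x00, 0xff, 0x10, 0x80))  # some arbitrary binary bytes
text <- base64encode(payload)                 # safe to send as an ap2ap string
identical(base64decode(text), payload)        # TRUE: the bytes round-trip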

Monday, October 30, 2006

Bloglines, OPML, Blogger and Flock

I aggregate 50 or so blog feeds using Bloglines; it's a very useful way to keep track of infrequent blogs in particular, and it strips off the adverts and other decoration from the power bloggers.

I just did some tidying up of my blog list, exported it in OPML format and uploaded it to http://share.opml.org/

This is an interesting way to contribute to the "Top 100" blogs list on that site, and it also has some useful features like seeing who else reads the same blogs, and who has the most similar list of blogs.

I seem to have added to the "long tail" since many of the blogs I read were new to the site, so I'm the only reader. My blog had one other reader "Ian" (hello!) who has a very long list of blogs, that I may poke around in if I get some free time...

Meanwhile, Flock is working well as my cross-platform browser. I'm writing blog entries with it, although I did lose an entry I was writing a week or so ago, when I upgraded Flock and didn't save an almost complete entry first. The new version of Flock appears to automatically save entries every few minutes, but I hate re-writing things, so that entry may not be re-created for a while.

Flock seems to have some issues writing entries to blogger.com at the moment. I'm not using the updated blogger.com, but Flock fails to write the blog entry most of the time, then randomly works. I gave up and used cut and paste to post this entry directly....

Pandora Prog Channel

I've been trying Pandora on and off for Internet music for a while, and attended a talk by CEO Tim Westergren last week, which got me to try it again. They are continuously improving their algorithms for choosing music, and I was trying to make a channel that would serve me interesting new music alongside some of my favourite experimental "Prog Rock" bands. It seems to be working much better, and I keep tuning the channel by skipping tracks that I don't like and giving thumbs up to the ones I do. The nice thing is that you can listen to my channel; even though you don't get exactly the same songs as I do, there should be an interesting mix of King Crimson, Zappa, Estradasphere, and many other bands playing music you won't hear often. It's easy to make your own channel (it takes less training to make a more mainstream channel), and it's the best way I've found to discover completely new music.

http://www.pandora.com/?sc=sh59488715848528731

Enjoy...

Blogged with Flock

Sunday, October 08, 2006

CMG06 Conference - Reno December 3-8

As usual I'll be attending the Computer Measurement Group conference in Reno, Nevada this December. I've attended every year since 1994, and it's the place where I get an update on the state of the art in Performance Management, and get to mingle with my friends and peers who work on Capacity Planning.

This year I'm presenting three times:

  1. Sunday morning 3 hour seminar on Capacity Planning with Free and Bundled Tools. This is a repeat of last year's talk, presented jointly with Mario Jauvin, who covers the Windows OS and networking-related areas. I cover Solaris, Linux and the system-oriented tools.
  2. Wednesday morning conference paper titled "Utilization is Virtually Useless as a Metric!". Regular readers of this blog will recognize much of the content of this paper, which gathers together all the ways in which your measurements can be corrupted by virtualization.
  3. Thursday morning I'm giving a 3 hour training course called the Unix/Linux CMG Quick Start Course, which is part of a new feature for CMG and is based on the training classes in performance tuning that I have given for many years.
Early-bird discounted registration is open until October 13th. The Sunday seminars are an extra-cost item, but the Thursday morning training classes are included in the regular conference fee. This is the only place I'm planning to give public training classes, and since I'm at the conference all week it's a great opportunity to discuss performance and capacity issues in person. I hope to see you there...


Blogged with Flock

Monday, September 04, 2006

Updated Ad Setup

I just changed to the flash based ad sidebar. It lets you pick different keywords without reloading the page by clicking on the top tab.

I also continued to focus on Technology Books, picking some specific categories to try to exclude the certification and training books that I find less interesting. I then excluded Microsoft and MCSE as keywords, but ended up with Cisco certification books, so I excluded those as well; it looks like a more reasonable selection now.

Blogged with Flock

Sunday, September 03, 2006

Comments on Web Traffic

This blog gets about 50 visitors a day, and most new visitors arrive as the result of a Google search for my name, Thumper/ZFS or the SE toolkit. There is a very small number of Yahoo and MSN searches. The other main source of traffic is the Sun Community blogging site, which links to Sun alumni blogs, including this one. A few weeks ago I got a lot of traffic from Sun, and I think I traced it back to a blog entry from Jonathan Schwartz, who talked about his General Counsel's blog; Mike Dillon, the GC, mentioned the Sun Community blog site. This doubled my traffic for a week or two.

The other recent change is that I stopped showing adverts from ctxbay, which was created by an eBay developer as a side project, won a prize, but never really worked well enough to be useful. I've replaced them with the official eBay in-house AdContext system, which is being beta tested. AdContext looks at your page content and matches keywords with popular items from eBay. It can be configured to exclude certain keywords (in my case I exclude "Adrian", which was causing problems for ctxbay), and you can choose certain categories or stores to pick items from. I've picked Technology Books and some Storage hardware categories. I'm going to experiment with different formats and constraints, to see how well it works.

There are three formats: text only, pictures (which I started with) and flash (which scrolls multiple ads into the same amount of screen space). I'll switch it to flash when I get around to it...



Friday, August 18, 2006

Web vs. Skype, a paradigm shift

The essential characteristics of the HTTP-based web are that by default everyone is anonymous, and everyone can get to everything. It's "free", and the trend is for the parts that are not free and anonymous to move in that direction. For example, you can now buy stuff on eBay Express without having to sign up for an eBay account, and there are fewer newspaper sites requiring paid subscriptions, since they are losing audience to the free sites.

However, the essential characteristics of the Skype peer-to-peer network are the opposite of the Internet's. Everyone has a clear identity, and no one can get to anything without asking for permission or being invited. I think this is truly a different paradigm for building systems.

Everyone on Skype is plugged into a public key infrastructure (PKI), which provides a secure identity as well as secure communications between peers. However, to communicate with other peers you need to know them and have permission. For me the most interesting capability on Skype is the application-to-application messaging API (ap2ap), which enables a new class of distributed applications that leverage the social network formed by the mesh of Skype contact lists.
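
To illustrate the inversion, here is a hypothetical sketch of ap2ap-style messaging over a permissioned contact graph. It's in Python, and the names and structure are mine, purely for illustration - this is not the real Skype API:

  # Hypothetical sketch of ap2ap-style messaging over a permissioned
  # contact graph. Names and structure are illustrative assumptions,
  # not the real Skype API.
  class Peer:
      def __init__(self, identity):
          self.identity = identity    # a PKI-backed identity in the real network
          self.contacts = set()       # peers that have granted permission
          self.inbox = []

      def authorize(self, other):
          # Permission is mutual: both sides add each other as contacts.
          self.contacts.add(other.identity)
          other.contacts.add(self.identity)

      def send(self, other, app, message):
          # Unlike the web, nothing flows without a prior relationship.
          if self.identity not in other.contacts:
              raise PermissionError("not a contact: ask for permission first")
          other.inbox.append((app, self.identity, message))

  alice, bob = Peer("alice"), Peer("bob")
  alice.authorize(bob)
  alice.send(bob, "myapp", "hello over ap2ap")   # works; strangers would be refused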

The upshot of this is that some things that are easy on the Internet are difficult on Skype, and vice versa. There is a temptation to take something that we know works on the web and try to make something similar on Skype ap2ap, but that is pointless: just use the web! Look for things that really don't work well on the web, or look for web-based systems that connect a few people but need an expensive back-end or don't scale. This is the start of something interesting...


Sunday, August 06, 2006

Solaris Internals and Performance 2nd Edition

Richard and Jim have finally finished their updated book and got it published. Rush out and buy a copy! I just listened to a podcast where they talked about it and mentioned that you can get it for 30% off from http://www.sun.com/books. Strangely, my own Sun Performance and Tuning book isn't listed there, although my BluePrint books on Capacity Planning and Resource Management are.

I was also amused to see that in the slide deck they use to launch the book they reference Adrian's Rule of book writing (book size grows faster than you can write).

Congratulations!

Now do I have time to start another book? I'm not sure... maybe.


Sun ZFS and Thumper (x4500)

I was one of the beta testers for Sun's new x4500 high density storage server, and it turned out pretty well. I was able to hire Dave Fisk as a consultant to help me do the detailed evaluation using his in-depth tools, and it turned into a fascinating investigation of the detailed behavior of the ZFS file system.

ZFS is simple to use, has lots of extremely useful features, and the price is right (bundled with Solaris 10 6/06 or OpenSolaris). However, it's doing lots of clever things under the hood, and it behaves like nothing else. Its performance is far harder to predict than that of any other file system we've looked at. It even baffled Dave at first; he had to change his tools to support ZFS, but he's got it pretty well figured out now.

For a start, it uses a write-anywhere file system layout, similar in some ways to the WAFL used by a NetApp filer. This means that random writes are batched up, sorted by file and file system, and every few seconds a big burst of sequential writes commits the data to disk as a transaction. Since sequential writes to disk are always much more efficient than random writes, this means that ZFS gets much more performance per disk than UFS, VxFS and the like for random-write workloads.
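
To make the batching idea concrete, here is a toy sketch in Python of a transaction group that buffers random writes in memory and commits them every few seconds as one sorted, mostly sequential burst. It's purely illustrative - not how ZFS is actually implemented:

  # Toy model of a copy-on-write transaction group: buffer random writes
  # in memory, then commit them as one sorted, mostly sequential burst.
  # Illustrative only - this is not how ZFS is actually implemented.
  import time
  from collections import defaultdict

  def emit_sequential_write(filename, offset, data):
      # Stand-in for the actual disk write in this sketch.
      print("write", filename, offset, len(data))

  class TxGroup:
      def __init__(self, commit_interval=5.0):
          self.pending = defaultdict(dict)        # file -> {offset: data}
          self.commit_interval = commit_interval
          self.last_commit = time.time()

      def write(self, filename, offset, data):
          self.pending[filename][offset] = data   # random write lands in memory
          if time.time() - self.last_commit >= self.commit_interval:
              self.commit()

      def commit(self):
          # Sorting by file and offset turns the scattered writes into
          # mostly sequential disk I/O, committed as one transaction.
          for filename in sorted(self.pending):
              for offset in sorted(self.pending[filename]):
                  emit_sequential_write(filename, offset,
                                        self.pending[filename][offset])
          self.pending.clear()
          self.last_commit = time.time()

The win is in the commit step: the disks see one large, ordered burst instead of a stream of scattered seeks.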

The combination of the x4500 and ZFS works well, since ZFS knows that the firmware on the 48 SATA drives in the x4500 has a write cache that can safely be enabled and flushed on demand. This greatly improves performance and fixes an issue that I have been complaining about for years. Finally, a safe way to use the write caches that exist in every modern drive.

It's actually easier to list the things that ZFS on the x4500 doesn't have.

  • No extra cost - it's bundled in a free OS
  • No volume manager - it's built in
  • No space management - file systems use a common pool
  • No long wait for newfs to finish - we created a 3TB file system in a second
  • No fsck - transactional commits mean it's always consistent on disk
  • No rsync - snapshots can be differenced and replicated remotely
  • No silent data corruption - all data is checksummed as it is read
  • No bad archives - all the data in the file system is scrubbed regularly
  • No penalty for software RAID - RAID-Z has a clever optimization
  • No downtime - mirroring, RAID-Z and hot spares
  • No immediate maintenance - double parity disks if you need them
  • No hardware failures in our testing - we didn't get to try out some of these features!

and finally, on the downside

  • No way to know how much performance headroom you have
  • No way to get at the disks without taking the top off the x4500
  • No clustering support - I guess they couldn't put everything on the wish list...

The performance is actually very good, and in normal use it's going to be fine, but when we tried to drive ZFS to its limit, we found that the results were less consistent and predictable than those of more conventional file systems. Some of the issues we ran into are present in the Solaris 10 6/06 release, but when the x4500 ships it will have an update to ZFS that includes performance fixes to speed things up in general and reduce the impact of the worst-case issues, so it should be more consistent.

We've put ZFS on some of our internal file servers, to see how it goes in light usage. However, it always takes a while to build up confidence in a large body of new code, especially if it's storage-related. If we can add this one to the list:

  • No nasty bugs or surprises?

Then ZFS looks like a good way to take a lot of cost out of the storage tier.

I'm interested to hear how other people are getting on with ZFS, especially mission critical production uses.


Monday, July 24, 2006

IEEE Conference Paper

I attended the IEEE E-Commerce conference in San Francisco. The conference is known as CEC06/EEE06 and some other acronyms. It was a very interesting, academically oriented event with a few hundred people from all over the world; I made some good contacts and learned some new stuff.

My own paper was about how I built a large-scale simulation of a peer-to-peer network using a very efficient architecture based on the Occam language. I used to write a lot of Occam about 20 years ago, and it seemed appropriate to the problem I wanted to solve. I think most people are baffled by the language, but I like it. Unlike most recent languages, where everything is an object with types and methods, in Occam everything is a process with protocols and messages. The other difference is that Occam was designed to run fast on a 10MHz CPU, so on today's CPUs it is extremely fast and small compared to recent languages like Java.
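
As a rough analogue, here is a Python sketch of that process-and-channel style, using threads and queues as stand-ins for Occam's processes and channels. Real Occam processes are vastly lighter weight than threads; this only illustrates the model - no shared objects, just messages:

  # Rough Python analogue of Occam's process-and-channel model, using
  # threads and queues as stand-ins. Real Occam processes are vastly
  # lighter weight; this only illustrates the style: no shared state,
  # just processes exchanging messages over channels.
  import threading, queue

  def node(node_id, inbox, outbox):
      # Each simulated peer is a process that owns no shared state and
      # communicates only over its channels.
      msg = inbox.get()
      outbox.put((node_id, "ack:" + msg))

  inbox, outbox = queue.Queue(), queue.Queue()
  worker = threading.Thread(target=node, args=(1, inbox, outbox))
  worker.start()
  inbox.put("ping")
  print(outbox.get())    # (1, 'ack:ping')
  worker.join()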

What I found at the conference was that most of the simulation frameworks people were using were run overnight to generate results. My own example simulation of 1000 nodes ran for about three seconds to produce an interesting result.

The full paper can be obtained from http://doi.ieeecomputersociety.org/10.1109/CEC-EEE.2006.81

This is the official URL, and IEEE charges non-members for downloads.



Tuesday, June 20, 2006

CPU Power Management

AMD PowerNow! for the Opteron series of server CPUs dynamically manages the CPU clock speed based on utilization. The speed takes a few milliseconds to change, and it is not clear exactly what speeds are supported, but one report stated that the normal speed of 2.6GHz would reduce to as low as 1.2GHz under a light load. This report also shows detailed CPU configuration and power savings. http://www.gamepc.com/labs/view_content.asp?id=opteron285&page=3

The problem with this for capacity management is that there is no indication of the average clock rate in the standard system metrics collected by capacity planning tools. PowerNow! is described by AMD at http://www.amd.com/us-en/0,,3715_12353,00.html and drivers for Linux and Windows are available from http://www.amd.com/us-en/Processors/TechnicalResources/0,,30_182_871_9033,00.html. In the future, operating systems may be able to take the current speed into account, and estimate the capability utilization, but the service time is higher at low clock rates, so we will still see some confusing metrics.
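
As a sketch of what such an estimate might look like, here is a first-order correction of my own devising - not a standard metric, and it assumes work done scales linearly with clock rate - that scales each measured busy fraction by the ratio of the current clock speed to the nominal one:

  # Sketch of a first-order "capability utilization" estimate. This is
  # my own correction, not a standard metric, and it assumes work done
  # scales linearly with clock rate.
  def capability_utilization(samples, nominal_ghz=2.6):
      # samples: (busy_fraction, clock_ghz) pairs, one per interval
      scaled = sum(busy * (ghz / nominal_ghz) for busy, ghz in samples)
      return scaled / len(samples)

  # 50% busy at 1.2GHz delivers far less work than 50% busy at 2.6GHz,
  # even though both intervals report the same utilization:
  print(capability_utilization([(0.5, 1.2), (0.5, 2.6)]))   # about 0.37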

The current PowerNow! implementation works on a per-chip basis, and Opterons have two complete CPU cores per chip that share a common clock rate. In a multiprocessor system made up of several chips, each pair of processing cores could be running at a different speed, and their speed can change several times a second.

Our basic assumption for well-behaved workloads, that mean service time is a constant quantity, is invalidated in a very non-linear manner, and utilization measurements will also move in mysterious ways...

Monday, June 12, 2006

eBay AdContext Contextual Adverts

I've been running contextual ads on this site for a few months, using the CTXbay service that won an eBay developers program award. I don't think I've had many people click through, since the ads aren't very relevant, and seem fixated on the word "Adrian".

Now that eBay has announced its own AdContext service is on the way, I'm planning to replace CTXbay with the official service as soon as I can get access to it. The eBay service has access to a lot more information and is quite customizable. I asked about it at the Developers Conference and was told that I could set a default category and control what kind of items appear.

eBay Wireless WAP Access | by Adrian Cockcroft | June 12th, 2006

I was at the eBay developers conference showing some proof-of-concept prototypes of new mobile applications, and I found that hardly anyone knew that eBay already has a WAP-based mobile version of the site. It loads in seconds on any phone that has any kind of web browser, but there is no automatic redirect from the main eBay site. You should bookmark this on your phone's web browser:

http://wap.ebay.com

You can also use this site to back-end other mobile applications. If your own code helps a user find an item on eBay, then you can form a URL that contains the item id and go directly into the official eBay WAP-based site. It handles user login, My eBay, watchlists etc. The main problem with the WAP site is that the search functionality is too simplistic. The prototypes we were showing (no, you can't access them, you should have been there...) are aimed at fixing the finding experience.
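
As a purely hypothetical illustration - the path and parameter name here are my assumptions, not eBay's documented WAP URL format - forming such a deep link is just string construction:

  # Hypothetical sketch only: the path and parameter name below are my
  # assumptions, not eBay's documented WAP URL format.
  def wap_item_url(item_id):
      return "http://wap.ebay.com/item?id=%s" % item_id

  print(wap_item_url("1234567890"))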

Tuesday, June 06, 2006

Part 3: Disruptive Innovation viewed as a Maturity Model | by Adrian Cockcroft | June 6th, 2006

This time I'll take a more abstract view of a maturing market as each phase evolves, and refer to the development of in-home movie watching as an example.

  1. An emerging market is characterized by competition on the basis of technology. Early adopters like to play with new technology and are able to cope with its issues. Many different products compete for market share on the basis of "my features are better". Think of the early days of the VCR, with VHS vs. Betamax. In a mature market, few people worry about features; most VCR or DVD players have the same feature set and very good picture quality at a very low price. If you want to be sure you get a good one, you are most likely to buy using brand name (e.g. Sony) rather than poring over detailed specifications. Margins are low, but volume is high, and margins can be better if you have won the brand battle.
  2. The next phase in the market is characterized by competition on the basis of service. Think of the video rental store as a service. You visit the store and pay rental according to how much you use the service. As an emerging service, anyone could set up shop to rent videos and DVDs. As the market matured, larger stores with a bigger selection and more centralized buying power provided a better service, and video rental chains such as Blockbuster took over the market. Again, the power of a dominant brand became the primary differentiator as the service market matured.
  3. The third phase in the market is the evolution of a service into a utility. A utility provides a more centralized set of resources, and a regular subscription or monthly bill. It can provide similar services, but in a more automated manner. Netflix is my example of a utility-based DVD provider service. You pay a monthly fee, which encourages steady consumption, and Netflix has automated the recommendation system, which replaces asking the counter clerk in a video rental store for advice. The recommendations are the result of many people's opinions, so are likely to be less biased and better informed, but the most important difference in the utility approach is that it doesn't need people to provide the service directly to the customer. This makes it fundamentally cheaper. Many traditional services were transformed into utilities by the arrival of the Internet, which allows consumers to access information-based utilities in a generic and efficient manner. The network effect benefit of having a large user base also causes dominant brand names to emerge. Netflix leads mindshare in this space; despite attempts by Blockbuster to copy its business model, Netflix can grow faster with fewer people as a pure utility.
  4. The final phase in the evolution of a market occurs as the cost of replication and distribution of the product approaches zero. For digital content the end customer already has a computer and an Internet connection. There is no additional cost to use it to download a movie. A central utility such as YouTube can use a mixture of advertising and premium services (for a minority of power users) to offset its own costs. Peer-to-peer systems distribute the load so that there is no central site and no incremental cost in the system. The only service that is needed is some kind of search, so that peers can find each other's content to exchange it. PirateBay is primarily a search engine, and search engines become dominant when the brand gets well known and they find what you are looking for because they have a comprehensive index.
So the evolution of a marketplace goes from competing on the basis of technology, to competing on service, to competing as a utility, to competing for free. In each step of the evolution, competitors shake out over time and a dominant brand emerges.

To use this as a maturity model, take a market and figure out whether the primary competition is on the basis of technology, service, utility or search, and consider whether a dominant brand has emerged in that phase. The model should then indicate what the next step is likely to be, so you can try to find the right disruptive innovation to get you there. Good luck!

Saturday, June 03, 2006

Part 2: Moving Pictures - disruptive innovation from the Cinema to PirateBay | by Adrian Cockcroft | June 3rd, 2006

Let's look at the history of movies. The initial technology to capture and replay moving pictures was developed around 100 years ago, and the initial competition between inventors went through its first transition when movie theaters became established and began to settle on a standard form of projector. The inventors who had alternative camera/recording/projector technology died out. Consumers wanted to go see movies, and the movie industry formed to provide content for that market.

The next innovation was being able to watch movies at home on film, then there were movies on TV. The movie theaters had far bigger screens, better sound and color, but the technology at home gradually caught up in features and reduced in cost, and a market transition to home viewing occurred. The total market size for equipment bought to watch movies at home is huge. It's important to note that the primary vendors in each phase of the market are different. The movie theater business is very different to the home video equipment supplier business. The early battles in the home were over the standard formats - famously, Betamax failed to win over VHS for video tape - and there are continuing battles over DVD formats, but Sony is a dominant brand name in a crowded market for home video equipment.

The next innovation was video rental, and Blockbuster ended up as a major player in this market, with a presence on every high street. However, that presence became unnecessary as Netflix shipped DVDs directly to consumers and took over a large share of the market.

Finally, video is available directly over the Internet. It's being viewed on PCs rather than TV sets, anyone can create and upload it, and YouTube is this year's hot market-leading name in this space for all kinds of short videos. It's also trivially easy to take a full-length movie or TV program and share it using one of the many BitTorrent services, and a growing proportion of movies are being watched for free, to the consternation of the movie industry.

The PirateBay site in Sweden was recently shut down, and its operators charged with copyright violation, but it appears that a significant proportion of the population of Sweden were users, and they got upset because they had got used to exchanging content for free. After three days the site came back up, hosted in Holland, and with even more users due to the publicity.

Unlike YouTube, BitTorrent sites such as PirateBay don't host the actual content; they just connect individual users who exchange it. They don't need to provide storage or bandwidth, just a searchable database of small index files. Each index file configures a BitTorrent transfer between a large number of seeders, who already have some or all of the file, and leechers, who want to get the file and can in turn become seeders.
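
A minimal sketch of what such an index amounts to is shown below; the data structures are illustrative only, since real trackers speak a specific HTTP/UDP announce protocol. The point is that the site stores small metadata records and peer lists, never the content itself:

  # Minimal sketch of a BitTorrent-style index: small metadata records
  # and peer lists, never the content itself. Real trackers speak a
  # specific HTTP/UDP announce protocol; this only shows the shape.
  index = {}   # info_hash -> {"name": ..., "seeders": set(), "leechers": set()}

  def publish(info_hash, name):
      index[info_hash] = {"name": name, "seeders": set(), "leechers": set()}

  def announce(info_hash, peer, have_all):
      # Seeders already have the whole file; leechers are still fetching.
      entry = index[info_hash]
      (entry["seeders"] if have_all else entry["leechers"]).add(peer)
      # Return the other known peers so this one can start exchanging pieces.
      return (entry["seeders"] | entry["leechers"]) - {peer}

  publish("abc123", "example.iso")
  print(announce("abc123", "leecher-1", False))   # empty: no other peers yet
  print(announce("abc123", "seeder-1", True))     # {'leecher-1'}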

The publicity gained as a side effect of trying to shut down the PirateBay site may even have the opposite effect of cementing the PirateBay brand as a market leader and accelerating growth in this space.

Every step in this history involves a disruptive innovation. There is a fundamental reduction in cost, offset by a large increase in unit volume, which has often increased the overall revenue using a new way to monetize the market for moving pictures. Each time the previous market leader is left behind (often kicking and screaming) as the new larger market emerges. Each time a new brand captures the attention span and trust of the consumer, and dominates the market.

Part 1: Disruptive Innovation in the path from technology to brand - a maturity model | by Adrian Cockcroft | June 3rd, 2006

Products aim to fill a need in a market; products that are disruptive innovations also reshape the market, and markets tend to evolve in a series of discontinuous steps as they mature. The phrase "crossing the chasm" has been used to describe these changes, and "early adopters" are the people who first move a market to a new phase.

In the next few posts I'm going to describe a generic maturity model that applies to many markets, and show how disruptive innovations may drive a market into a more mature phase. I got the initial idea of looking at markets in this way from Dave Nocera of Innovativ in a presentation he gave at SUPerG in early 2004. He used video as an example, with the move from VCR to Video rental to Online. I have extended that example, and come up with a generic maturity model based on it, which I also apply to the Telco industry.

Monday, May 15, 2006

See you at the developers conference? | by Adrian Cockcroft | May 15th, 2006

The combined eBay, PayPal and Skype developer conference is coming up, June 10-12 in Las Vegas. I missed the event last year, but I will be staffing it this year! A few of us are being let out of the mysterious eBay Research Labs for the occasion. They told us to get to work on the future of e-commerce, and have kept us locked up for months, shipping in occasional supplies of Starbucks and fresh interns. I've been writing serious amounts of code for the first time in years; in fact I'm too busy writing Java to have time to go to JavaOne this week.

The conference is supposed to illuminate questions such as:

- What will the next technology revolution be?

- How will it impact commerce and communications on the web?

- And what opportunities will it provide for developers and technology innovators?

- How will the Long Tail Theory play out?

- Web 2.0, and how to build revenue streams

It's become a common joke to keep incrementing this (Web 2.1, Web 3.0, etc.), but personally I think the most interesting developments aren't even web-based. For example, Skype isn't a web application; it defines its own virtual private peer-to-peer fabric that overlays the Internet.

See you in Vegas!