Friday, August 18, 2006

Web vs. Skype, a paradigm shift

The essential characteristics of the HTTP-based web are that by default everyone is anonymous, and everyone can get to everything. It's "free", and the trend is for the parts that are not free and anonymous to move in that direction. For example, you can now buy stuff on eBay Express without having to sign up for an eBay account, and there are fewer newspaper sites requiring paid subscriptions, since they are losing audience to the free sites.

However, the essential characteristics of the Skype peer-to-peer network are the opposite of the web. Everyone has a clear identity, and no one can get to anything without asking for permission or being invited. I think this is truly a different paradigm for building systems.

Everyone on Skype is plugged into the public key infrastructure (PKI), which provides a secure identity as well as secure communications between peers. However, to communicate with other peers you need to know them and have permission. For me the most interesting capability on Skype is the application-to-application messaging API (ap2ap), which enables a new class of distributed applications that leverage the social network formed by the mesh of Skype contact lists.
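To make the contrast concrete, here is a toy model of permission-gated messaging. It is only a sketch: the `Peer` class and its methods are made up for illustration and are not the Skype API, and it ignores the encryption side entirely. It just shows delivery depending on the recipient's contact list rather than being open to anyone.

```python
# Toy model only: these names are hypothetical and are not the Skype API.
from dataclasses import dataclass, field


@dataclass
class Peer:
    name: str
    contacts: set = field(default_factory=set)   # peers this user has authorized
    inbox: list = field(default_factory=list)

    def authorize(self, other: "Peer") -> None:
        """Accept a contact request, so `other` may message this peer."""
        self.contacts.add(other.name)

    def send(self, recipient: "Peer", payload: str) -> bool:
        """Deliver only if the recipient has authorized the sender."""
        if self.name not in recipient.contacts:
            return False                          # no permission: dropped
        recipient.inbox.append((self.name, payload))
        return True


alice, bob, mallory = Peer("alice"), Peer("bob"), Peer("mallory")
bob.authorize(alice)

delivered = alice.send(bob, "hello")   # alice is on bob's contact list
blocked = mallory.send(bob, "spam")    # mallory is not, so nothing arrives
```

On the web the default would be the reverse: anyone can send a request, and the server has to work to keep people out.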

The upshot of this is that some things that are easy on the web are difficult on Skype, and vice versa. There is a temptation to take something that we know works on the web and try to make something similar on Skype ap2ap, but that is pointless. Just use the web! Look for things that really don't work well on the web, or look for web-based systems that connect a few people but need an expensive back-end or don't scale. This is the start of something interesting....


Sunday, August 06, 2006

Solaris Internals and Performance 2nd Edition

Richard and Jim have finally finished their updated book and got it published. Rush out and buy a copy! I just listened to a podcast where they talked about it and mentioned that you can get it for 30% off from Strangely, my own Sun Performance and Tuning book isn't listed there, although my BluePrint books on Capacity Planning and Resource Management are.

I was also amused to see that in the slide deck they use to launch the book they reference Adrian's Rule of book writing (book size grows faster than you can write).


Now do I have time to start another book? I'm not sure... maybe.


Sun ZFS and Thumper (x4500)

I was one of the beta testers for Sun's new x4500 high density storage server, and it turned out pretty well. I was able to hire Dave Fisk as a consultant to help me do the detailed evaluation using his in-depth tools, and it turned into a fascinating investigation of the detailed behavior of the ZFS file system.

ZFS is simple to use, has lots of extremely useful features, and the price is right (bundled with Solaris 10 6/06 or OpenSolaris). However, it's doing lots of clever things under the hood and it behaves like nothing else. It's far more complicated to predict its performance than any other file system we've looked at. It even baffled Dave at first; he had to change his tools to support ZFS, but he's got it pretty well figured out now.

For a start, it uses a write-anywhere file system layout, which is similar in some ways to the WAFL (Write Anywhere File Layout) of a NetApp filer. This means that random writes are batched up, sorted by file, file system, etc., and every few seconds a big burst of sequential writes commits the data to disk as a transaction. Since sequential writes to disk are always much more efficient than random writes, this means that it gets much more performance per disk than UFS, VxFS, etc. for random writes.
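The batching idea can be sketched in a few lines. This is a toy model, not ZFS code: the `BatchingWriter` class, its block numbering, and the sort key are all assumptions made for illustration. Random writes are buffered in memory, then committed in sorted order to a contiguous run of fresh blocks.

```python
# Toy model of write batching; not ZFS code, and all names are hypothetical.
from typing import Dict, List, Tuple


class BatchingWriter:
    def __init__(self, next_free_block: int = 0) -> None:
        self.pending: Dict[Tuple[str, int], bytes] = {}    # (file, offset) -> data
        self.next_free_block = next_free_block
        self.disk: Dict[int, Tuple[str, int, bytes]] = {}  # block -> record

    def write(self, file: str, offset: int, data: bytes) -> None:
        """Buffer a write in memory; a later write to the same spot wins."""
        self.pending[(file, offset)] = data

    def commit(self) -> List[int]:
        """Flush all pending writes as one sequential run of new blocks."""
        blocks = []
        for key in sorted(self.pending):          # sort by file, then offset
            block = self.next_free_block
            self.disk[block] = (key[0], key[1], self.pending[key])
            blocks.append(block)
            self.next_free_block += 1
        self.pending.clear()
        return blocks


w = BatchingWriter(next_free_block=100)
w.write("b.dat", 4096, b"x")      # three writes arriving in random order
w.write("a.dat", 8192, b"y")
w.write("a.dat", 0, b"z")
blocks = w.commit()               # they land on consecutive blocks
```

However scattered the incoming offsets are, the disk only ever sees one sequential burst per commit, which is where the per-disk performance gain comes from.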

The combination of the x4500 and ZFS works well, since ZFS knows that the 48 SATA drives in the x4500 have write caches that can safely be enabled and flushed on demand. This greatly improves performance and fixes an issue that I have been complaining about for years. Finally, a safe way to use the write caches that exist in every modern drive.
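Why an on-demand flush makes the write cache safe can be shown with another toy model. The `CachedDrive` class and `commit_transaction` function are hypothetical and do not model real drive firmware; the point is only that data counts as committed once the flush has completed, so a power loss cannot take a committed transaction with it.

```python
# Toy model of a drive write cache; names are hypothetical, not a real driver.
class CachedDrive:
    def __init__(self):
        self.cache = {}     # volatile: lost on power failure
        self.platter = {}   # durable storage

    def write(self, block, data):
        """Fast path: the write lands in the volatile cache only."""
        self.cache[block] = data

    def flush(self):
        """Force everything in the cache onto durable storage."""
        self.platter.update(self.cache)
        self.cache.clear()

    def power_loss(self):
        """Simulate a power failure: unflushed data disappears."""
        self.cache.clear()


def commit_transaction(drive, writes):
    """Issue all writes, then flush once so the whole batch is durable."""
    for block, data in writes:
        drive.write(block, data)
    drive.flush()


drive = CachedDrive()
commit_transaction(drive, [(1, b"a"), (2, b"b")])   # durable after the flush
drive.write(3, b"c")                                # cached, never flushed
drive.power_loss()
```

After the simulated power loss, blocks 1 and 2 are still on the platter and block 3 is gone, which is acceptable because nothing ever claimed block 3 was committed.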

It's actually easier to list the things that ZFS on the x4500 doesn't have.

  • No extra cost - it's bundled in a free OS
  • No volume manager - it's built in
  • No space management - file systems use a common pool
  • No long wait for newfs to finish - we created a 3TB file system in a second
  • No fsck - its transactional commit means it's consistent on disk
  • No rsync - snapshots can be differenced and replicated remotely
  • No silent data corruption - every block's checksum is verified as it is read
  • No bad archives - all the data in the file system is scrubbed regularly
  • No penalty for software RAID - RAID-Z has a clever optimization
  • No downtime - mirroring, RAID-Z and hot spares
  • No immediate maintenance - double parity disks if you need them
  • No hardware failures in our testing - we didn't get to try out some of these features!

and finally, on the downside

  • No way to know how much performance headroom you have
  • No way to get at the disks without taking the top off the x4500
  • No clustering support - I guess they couldn't put everything on the wish list...
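The "no silent data corruption" item in the first list is worth a small illustration. This is a sketch under stated assumptions, not ZFS code: the `ChecksummedStore` class is hypothetical, and SHA-256 is used simply because it is in the Python standard library. The checksum is stored apart from the data, so a block that rots on disk is caught on read instead of being returned as if it were good.

```python
# Toy model of verify-on-read checksumming; names are hypothetical.
import hashlib


class ChecksummedStore:
    def __init__(self):
        self.blocks = {}      # block id -> raw bytes (may get corrupted)
        self.checksums = {}   # block id -> digest, kept apart from the data

    def write(self, block_id, data):
        """Store the data and record its checksum separately."""
        self.blocks[block_id] = data
        self.checksums[block_id] = hashlib.sha256(data).hexdigest()

    def read(self, block_id):
        """Return the data only if it still matches its recorded checksum."""
        data = self.blocks[block_id]
        if hashlib.sha256(data).hexdigest() != self.checksums[block_id]:
            raise IOError(f"checksum mismatch on block {block_id}")
        return data


store = ChecksummedStore()
store.write(7, b"important data")
good = store.read(7)                    # verifies and returns the data

store.blocks[7] = b"important dat\x00"  # simulate silent corruption on disk
try:
    store.read(7)
    detected = False
except IOError:
    detected = True                     # the bad block is caught on read
```

With mirroring or RAID-Z there is a good copy to fall back on, so a failed check can trigger a repair instead of just an error.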

The performance is actually very good, and in normal use it's going to be fine, but when we tried to drive ZFS to its limit, we found that the results were less consistent and less predictable than with more conventional file systems. Some of the issues we ran into are present in the Solaris 10 6/06 release, but when the x4500 ships it will have an update to ZFS that includes performance fixes to speed things up in general and reduce the impact of the worst-case issues, so it should be more consistent.

We've put ZFS on some of our internal file servers, to see how it goes in light usage. However, it always takes a while to build up confidence in a large body of new code, especially if it's storage related. If we can add this one to the list:

  • No nasty bugs or surprises?

Then ZFS looks like a good way to take a lot of cost out of the storage tier.

I'm interested to hear how other people are getting on with ZFS, especially mission critical production uses.
