By Larry Dignan
Tuesday, May 26 2009 12:27 PM
URL: http://www.zdnetasia.com/news/business/0,39044229,62054360,00.htm
When Amazon Web Services' latest--and arguably most valuable--service is a system that lets you ship terabytes of data to the cloud via snail mail, you just have to chuckle. Yes, folks: for all the fancy talk of cloud computing, terabytes--not to mention petabytes--of data, and technological advancement, the Sneakernet is alive and kicking.
The Sneakernet, where someone puts data on a disk, flash drive, etc. and
carries it to another computer, is arguably one of our most enduring networks.
I still use it all the time. I’m sure I could network my home devices together,
but the Sneakernet works just fine.
Scale the Sneakernet up and you understand why Amazon is launching a service
called Import/Export. There's too much data to move to the cloud and not enough
bandwidth to get it there quickly. Why take five days to move data--and hog all
your bandwidth--when you can toss it on a storage brick of some sort and just
overnight it?
Amazon CTO Werner Vogels explained:
In some ways the computing world has changed dramatically; networks have
become ubiquitous and the latency and bandwidth capabilities have improved
immensely. Next to this growth in network capabilities we have been able to grow
something else to even bigger proportions, namely our datasets. Gigabyte data
sets are considered small, terabyte sets are common place, and we see several
customers working with petabyte size datasets.
No matter how much we have improved our network throughput in the past 10
years, our datasets have grown faster, and this is likely to be a pattern that
will only accelerate in the coming years. While network may improve another
order of magnitude in throughput, it is certain that datasets will grow two or
more orders of magnitude in the same period of time.
Simply put, if you want to move a terabyte data set to EC2, it will take you
a while. On an enterprise scale, this data-moving problem is yet another
hindrance to cloud computing adoption.
Microsoft Research notes that you still have to maintain the network, and then there’s labor and support.
Microsoft Research’s Jim Gray concluded that Sneakernets are the answer to
this conundrum:
What is the best way to move a terabyte from place to place? The
Next Generation Internet (NGI) promised gigabit per second bandwidth
desktop-to-desktop by the year 2000. So, if you have the Next Generation
Internet, then this transfer is just 8 trillion bits, or about 8,000
seconds--a few hours wait. Unfortunately, most of us are still waiting for the
Next Generation Internet--we measure bandwidth among our colleagues at between
1 megabit per second (mbps) and 100 mbps. So, it takes us days or months to
move a terabyte from place to place using the Last Generation
Internet.
That passage was written in 2002. And guess what? We’re still waiting. Simply
put, the Sneakernet is the most efficient means of moving a terabyte of data around.
Given that fact, Amazon’s Sneakernet, the Import/Export service, may become
its most appreciated, if not its most technologically advanced, feature. Go figure.
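Gray's arithmetic is easy to check. Here is a minimal sketch in Python, assuming a terabyte is 10^12 bytes (8 trillion bits, as in his figure) and that the link runs flat out with no protocol overhead, which real networks never quite manage. The function names are made up for illustration.

```python
# 1 TB = 10^12 bytes = 8 trillion bits, matching Gray's "8 trillion bits".
TERABYTE_BITS = 8 * 10**12


def transfer_seconds(size_bits, link_bits_per_second):
    """Idealized transfer time: size divided by raw link speed (no overhead)."""
    return size_bits / link_bits_per_second


def describe(seconds):
    """Express a duration in seconds, hours, and days."""
    hours = seconds / 3600
    days = hours / 24
    return f"{seconds:,.0f} s = {hours:,.1f} h = {days:,.1f} days"


# The link speeds Gray mentions: the promised gigabit, and the 1-100 mbps reality.
for label, bps in [("1 Gbps", 10**9), ("100 Mbps", 10**8), ("1 Mbps", 10**6)]:
    print(label, "->", describe(transfer_seconds(TERABYTE_BITS, bps)))
```

At a gigabit per second the terabyte moves in 8,000 seconds, a bit over two hours. At 100 mbps it takes roughly 22 hours, and at 1 mbps about 93 days, which is where Gray's "days or months" comes from.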
In a nutshell, Import/Export allows you to ship data on storage devices with a
manifest that explains how and where to load the data and map it to Amazon's storage system.
Now, there are costs. Amazon will charge you US$80 per storage device handled
and US$2.49 per data loading hour. And then there’s the usual storage pricing. But
add it up and it’s cheaper per terabyte than waiting a week for a dataset to
move.
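To put rough numbers on the handling charges, here is a small sketch using only the two fees quoted above. How many loading hours a given device needs is not stated here, so it is left as an input, and the usual storage pricing and shipping are left out. The function name is hypothetical, and the sketch assumes fractional hours are billed proportionally, which may not match Amazon's actual billing.

```python
# Fees as quoted in the article (US dollars).
DEVICE_FEE_USD = 80.00
LOADING_FEE_USD_PER_HOUR = 2.49


def import_export_fee(devices, loading_hours):
    """Handling plus loading charges only; storage and shipping are excluded.

    Assumption: loading time is billed proportionally to the hours given.
    """
    total = devices * DEVICE_FEE_USD + loading_hours * LOADING_FEE_USD_PER_HOUR
    return round(total, 2)


# Hypothetical job: one device that takes ten hours to load.
print(import_export_fee(devices=1, loading_hours=10))
```

For that made-up job the bill is US$80 for the device plus US$24.90 for ten loading hours, or US$104.90 before storage.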
Will the Sneakernet ever go away? Nope. Gray sums it up:
Until we all have inexpensive end-to-end gigabit speed networks, terascale
datasets will have to move over some form of sneaker net. We suspect that by the
time the promised end-to-end gigabit (next generation Internet) arrives, we will
be moving petabyte scale datasets and so will still need a Sneakernet
solution.
This article was first published as a blog post on ZDNet.