Monday, July 19, 2010

The bandwidth of a fully laden 747

Never underestimate the speed of shipping physical stuff when you want to move large amounts of data. The Internet is actually horribly slow even at 'high' speeds. That's why Amazon Web Services offers an Import/Export service that involves shipping physical disks around.

Back in 1999 I wrote an article for The Guardian explaining latency and bandwidth for modem users. In it I used a jet plane full of people flying across the Atlantic to illustrate the difference.

The analogy still works today, and brings me to the farcical question: what's the bandwidth of a fully laden 747?

Assuming I fill it with DAT 320 cartridges, each of which can contain 160GB of uncompressed data and each of which weighs about 50g, then I can fit about 2.8 million cartridges in the plane (a single 747 can lift 140 tonnes). That's about 427 PB of storage in the plane. (I'm not sure that many cartridges will actually fit inside the 747, but you get the idea.)

Now assume it flies from San Francisco to London in 10 hours. That's a bandwidth of about 12 TB/s.
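The arithmetic behind those two figures can be checked with a short sketch. The function name and the use of binary (1024-based) units are assumptions made here for illustration; the binary convention is simply the one that reproduces the 427 PB figure from the cartridge count.

```python
# Back-of-envelope check of the 747 figures quoted in the post.
# Assumption: binary units (1 PB = 1024 * 1024 GB), which is what
# reproduces the ~427 PB total from 2.8 million 160GB cartridges.

PAYLOAD_G = 140 * 1_000_000   # 140 tonnes of lift, expressed in grams
CARTRIDGE_G = 50              # approximate weight of one DAT 320 cartridge
CARTRIDGE_GB = 160            # uncompressed capacity of one cartridge
FLIGHT_S = 10 * 3600          # San Francisco to London in 10 hours


def plane_bandwidth():
    """Return (cartridge count, total PB carried, TB per second in flight)."""
    cartridges = PAYLOAD_G // CARTRIDGE_G
    total_gb = cartridges * CARTRIDGE_GB
    total_pb = total_gb / 1024**2
    tb_per_s = (total_gb / 1024) / FLIGHT_S
    return cartridges, total_pb, tb_per_s


cartridges, total_pb, tb_per_s = plane_bandwidth()
print(f"{cartridges:,} cartridges, {total_pb:.0f} PB, {tb_per_s:.1f} TB/s")
```

This counts flight time only; the weight limit is the only constraint modelled, not whether the cartridges physically fit.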

Which brings me to today's announcement from Rackspace about OpenStack. One of the goals of OpenStack is to remove lock-in to specific providers. That's a noble goal, but if you store a lot of data in the cloud you might find yourself needing a 747 (or at least FedEx) if you decide to change providers.

PS Many people have commented that it would take a while to fill up the DAT tapes. Clearly the solution, as one commenter suggests, is to use a 747 as your data centre and fly it where you need.


Joe said...

Assuming you had 2.8m DAT cartridges already. Otherwise you would have to factor in copying on and off of the cartridges.

Xavier said...

Not to be pedantic, but that is also assuming that loading your data onto 2.8M cartridges, loading all the cartridges onto the plane, unloading them, and then loading all the data from each of them onto your destination network takes no time. In this particular case, you might be better off uploading that data.

Ryan Dietrich (dextius) said...

I disagree. You'd have to account for the time it would take to load those tapes into a tape drive and read them to your 427PB array.

How long does it take to read 160 gig off a tape? Multiply that by 2.8 million, and then add the travel time, unloading off the plane time, and the shipping time to your terminal.

Sounds like a toss up now?

MLeo said...

This worked for SETI@home: the people at SETI didn't have that much bandwidth to spare, so they shipped their data through "sneakerweb" to Berkeley, where they loaded the data into packets for use in SETI@home.

wcarss said...

No, don't you all see? He's saying that you have to be using the 747 as an active data center. You fly it to wherever you have the highest data needs, to bump up local maximums.

leenoox said...

Let's get out an envelope, and turn it over...

LTO-4 cartridges -- the largest presently available -- are 800GB native, and 1.6TB compressed, and are about 4x4x1", or 16 in³. (Yes, I know about Ampex DLT; this is denser.)

One ft³ is 1728 in³, so you can fit 12 layers of 9 tapes -- 108 tapes or 172.8TB -- in a cubic foot of space.

Assuming a spherical cow... no, wait. :-)

I have a source which suggests that the 747-400F is good for 65,155 ft³ of cargo, so assuming maximum packing density, and that the density of packed tapes doesn't take the airframe over gross or out of flyable CG, then the total number of terabytes you could load would be...

11,258,784, which is 11.3 exabytes, if I remember my prefix table properly.

Now, you're assuming the cost of the airframe, fuel, pilot salaries and airport fees, as well as the cost of 7.04M LTO-4 cartridges, so clearly this is not an inexpensive project...

And you have to divide that into the total overall number of seconds from the time you write the first byte to the time you read the last one, so -- in the long run -- you might be better served to pay someone to trench OC-192 fiber (roughly 10Gbps) from one point to the other -- especially if you're going to do it more than once.

Chris P said...

Hey, I wrote a calculator for just such a thing. Ryan, I was considering throwing in just what you were talking about so that you could get the "real" picture.

Chris P said...

I wrote a calculator for just such a thing, but I didn't get a chance to write in the "transferring on or off of media" portion of the exercise. My assumption would be that they could just use the media in question. Which is pretty much like someone was saying -- it's like moving the entire data center.

Michael said...

Hmmm...I read a similar article not too long ago that used Micro SD cards instead of the ubiquitous backup tapes, and a humble 1985 Volvo station wagon on the transport layer.

500 gig per second, if we don't get a flat

Again, the main argument against it was the time & effort involved, especially the amount of time needed to load data onto and read off of the cards. But, if you're willing to accept that kind of latency then it's certainly a robust way of moving huge amounts of data around.

Pádraig Brady said...

Why don't you use 2TB hard disks as the example? Aren't they as cheap & dense as tapes, and don't they also have benefits like, I dunno, random access and stuff?

Covarde_Anonimo said...

leenoox, i don't know where you got 65000 cu ft of space. a 747-8F (bigger than the 400 model) has only 30000 cu ft (854 m³) of space.

that's enough for 14.7 million DAT cartridges, or 3686.5 tons.

but since the 747-8f is limited to 160 tons, it'd take 23 planes to carry all that volume. the maximum you can take in a single jumbo would be 3.5 million tapes, assuming 45 grams (shipping weight i found on amazon for an HP branded DAT 160) and the capacity of a 747-8f from boeing's web site.

still, a lot of data. which would take a while to save on tapes. assuming maxell's datasheet as reliable, a drive can put 24MB/s on a tape, or 41s to save a gigabyte.

which means 474 days for a single drive to record a mere petabyte. with a thousand drives working in parallel, it'd take 202 days to save john's estimated 427PB, plus one day for loading the tapes and flying them, 203 days. same 202+1 day to unload and restore the tapes, we're talking 406 days to physically move all the data to the destination system.

that's about 1 petabyte a day. with 86400 seconds in a day, that's 12.2 GIGABYTES/SECOND.

it'd take nearly 100 gigabit ethernet cards to match that.

in other words: never underestimate the bandwidth of a thousand DAT drives working together with a 747-8f
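The end-to-end estimate in the comment above can be sketched roughly as follows. The function and parameter names are hypothetical, and decimal units (1 PB = 1,000,000 GB) are an assumption chosen because they match the 474-days-per-petabyte figure. The sketch does not round the per-gigabyte time to 41 seconds, so it lands a little above the 406 days and a little below the 12.2 GB/s quoted.

```python
# Effective throughput once tape write and read time is included.
# Inputs are the figures quoted in the comment; decimal units
# (1 PB = 1,000,000 GB, 1 GB = 1000 MB) are an assumption.

SECONDS_PER_DAY = 86_400


def effective_gb_per_s(total_pb, drive_mb_per_s, drives, handling_days=1):
    """Return (total days, GB/s) for write + ship + read of total_pb.

    handling_days is the per-side allowance for loading and flying
    (outbound) or unloading (inbound), as in the comment's "plus one day".
    """
    total_gb = total_pb * 1_000_000
    # Time for all drives, working in parallel, to copy every gigabyte once.
    copy_s = total_gb * 1000 / drive_mb_per_s / drives
    copy_days = copy_s / SECONDS_PER_DAY
    # The same copy time is paid twice: writing at the source, reading at the destination.
    total_days = 2 * (copy_days + handling_days)
    return total_days, total_gb / (total_days * SECONDS_PER_DAY)


days, gb_per_s = effective_gb_per_s(total_pb=427, drive_mb_per_s=24, drives=1000)
print(f"{days:.0f} days end to end, {gb_per_s:.1f} GB/s")
```

Raising `drives` shortens the copy phases almost linearly, which is why the number of tape drives, not the flight, dominates the result.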

jbooks said...

does anyone here really believe there is any company in the world that would HAVE that much data, or even a small fraction of it?