Monday, 19 March 2012

Are We Running Out Of Space For All Of Our Data?

One of the most famous quotes in the history of the computing industry is the assertion that “640KB ought to be enough for anybody”, allegedly made by Bill Gates at a computer trade show in 1981 just after the launch of the IBM PC. The context was that the design of the original PC, built around the Intel 8088 processor, limited the memory available to programs to 640 kilobytes of Random Access Memory (RAM), and people were questioning whether that limit wasn’t a mite restrictive.

Gates has always denied making the statement and I believe him; he’s much too smart to make a mistake like that. He would have known that just as you can never be too rich or too thin, you can also never have too much RAM. The computer on which I’m writing this has four gigabytes (GB) of it, roughly 6,500 times the working memory of the original PC, but even so it sometimes struggles with the software it has to run.

But even Gates could not have foreseen the amount of data computers would be called upon to handle within three decades. We’ve had to coin a whole new set of multiples to describe the explosion – from megabytes to gigabytes, terabytes and petabytes, on to exabytes, zettabytes and yottabytes (a yottabyte being two to the power of 80 bytes, or roughly 10 followed by 23 noughts).
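
To make the scale concrete, here is a small Python sketch – my own illustration rather than anything from the original piece – that prints each binary multiple from kilobyte to yottabyte; the last line bears out the figure above, since two to the power of 80 is roughly 1.2 × 10²⁴ bytes.

    # Each binary multiple is another factor of 1,024 (2**10) over the last.
    UNITS = ["kilobyte", "megabyte", "gigabyte", "terabyte",
             "petabyte", "exabyte", "zettabyte", "yottabyte"]

    for i, unit in enumerate(UNITS, start=1):
        size = 2 ** (10 * i)
        print(f"1 {unit:<9} = 2**{10 * i:<2} = {size:,} bytes")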

This escalating numerology has been necessitated by an explosion in the volume of data surging round our digital ecosystem from developments in science, technology, networking, government and business. From science, we have sources such as astronomy, particle physics and genomics. The Sloan Digital Sky Survey, for example, began amassing data in 2000 and collected more in its first few weeks than had been gathered in the entire previous history of astronomy. Its archive now stands at 140 terabytes and counting, and when its successor comes online in 2016 it will collect that amount of data every five days. Then there’s the Large Hadron Collider (LHC), which in 2010 alone spewed out 13 petabytes – that’s 13 million gigabytes – of data.
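
A quick back-of-the-envelope check – my own arithmetic, using only the figures quoted above – puts those volumes in daily terms:

    # Back-of-the-envelope conversion of the quoted volumes into daily rates.
    TB = 1.0                                 # work in terabytes for readability
    sloan_archive  = 140 * TB                # the survey's archive so far
    successor_rate = sloan_archive / 5       # the same volume every five days
    lhc_2010       = 13_000 * TB             # 13 petabytes in a single year

    print(f"Successor survey: ~{successor_rate:.0f} TB per day")
    print(f"LHC in 2010:      ~{lhc_2010 / 365:.0f} TB per day on average")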

The story is the same wherever you look. Retailers such as Walmart, Tesco and Amazon process millions of transactions every hour and store all the data relating to each one in colossal databases, which they then “mine” for information about market trends, consumer behaviour and other things. The same goes for Google, Facebook, Twitter et al. For these outfits, data is the new gold.

Meanwhile, out in the non-virtual world, technology has produced sensors of all descriptions that are cheap and small enough to be placed anywhere. And IPv6, the new internet addressing protocol, provides an address space big enough to give every one of them a unique address, so they can feed back daily, hourly or even minute-by-minute data to a mother ship somewhere on the net.
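
To see why that claim about IPv6 holds, here is a short sketch – my own, with an assumed round figure for world population – comparing the 32-bit IPv4 address space with the 128-bit IPv6 one:

    ipv4_addresses = 2 ** 32    # 32-bit addresses: about 4.3 billion
    ipv6_addresses = 2 ** 128   # 128-bit addresses: about 3.4 x 10**38

    print(f"IPv4 address space: {ipv4_addresses:,}")
    print(f"IPv6 address space: {ipv6_addresses:,}")

    # Even a trillion sensors for every person on Earth (population assumed
    # to be about eight billion) would use a vanishingly small fraction.
    sensors = 8_000_000_000 * 1_000_000_000_000
    print(f"Fraction of IPv6 space used: {sensors / ipv6_addresses:.2e}")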

To call what’s happening a torrent or an avalanche of data is to use entirely inadequate metaphors. This is a development on an astronomical scale. And it’s presenting us with a predictable but very hard problem: our capacity to collect digital data has outrun our capacity to archive, curate and – most importantly – analyse it. Data in itself doesn’t tell us much. In order to convert it into useful or meaningful information, we have to be able to analyse it. It turns out that our tools for doing so are currently pretty inadequate, in most cases limited to programs such as Matlab and Microsoft Excel, which are excellent for small datasets but cannot handle the data volumes that science, technology and government are now producing.
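
For illustration, the sketch below – entirely my own, with placeholder file and column names – shows the kind of streaming approach such volumes demand: the file is read one row at a time and only running totals are held in memory, rather than loading everything at once as a spreadsheet would.

    import csv

    def running_mean(path, column):
        """Mean of one numeric column, reading the file one row at a time."""
        total, count = 0.0, 0
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                total += float(row[column])
                count += 1
        return total / count if count else float("nan")

    # Hypothetical usage; the file and column names are placeholders.
    # print(running_mean("transactions.csv", "amount"))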
