“Open source has always provided a number of benefits, including easing adoption costs, propagating a better understanding of the technology, and allowing for faster evolution and commercialization of products and services based on it,” noted Terry Woloszyn, Founder & CEO, Leeward Security Ltd., in this exclusive Q&A with Cloud Expo Conference Chair Jeremy Geelan. “This is clearly evident with the OpenStack and CloudStack,” Woloszyn continued, “and others that have been quickly commercialized as...| By Jason Bloomberg | Article Rating: |
|
| February 28, 2013 08:00 AM EST | Reads: |
1,627 |
Scenario #1: out of the blue, your boss calls, looking for some long-forgotten entry in a spreadsheet from 1989. Where do you look? Or consider scenario #2: said boss calls again, only this time she wants you to analyze customer purchasing behavior...going back to 1980. Similar problem, only instead of finding a single datum, you must find years of ancient information and prepare it for analysis with a modern business intelligence tool.
The answer, of course, is archiving. Fortunately, you (or your predecessor, or predecessor's predecessor) have been archiving important-or potentially important-corporate data since your organization first started using computers back in the 1960s. So all you have to do to keep your boss happy is find the appropriate archives, recover the necessary data, and you're good to go, right?

Not so fast. There are a number of gotchas to this story, some more obvious than others. Cloud to the rescue? Perhaps, but many archiving challenges remain, and the Cloud actually introduces some new speed bumps as well. Now factor in Big Data. Sure, Big Data are big, so archiving Big Data requires a big archive. Lucky you-vendors have already been knocking on your door peddling Big Data archiving solutions. Now can you finally breathe easy? Maybe, maybe not. Here's why.
Archiving: The Long View
So much of our digital lives have taken place over the last twenty years or so that we forget that digital computing dates back to the 1940s-and furthermore, we forget that this sixty-odd year lifetime of the Information Age is really only the first act of perhaps centuries of computing before humankind either evolves past zeroes and ones altogether or kills itself off in the process. Our technologies for archiving information, however, are woefully shortsighted, for several reasons:
- Hardware obsolescence (three to five years) - Using a hard drive or tape drive for archiving? It won't be long till the hardware is obsolete. You may get more life out of the gear you own, but one it wears out, you'll be stuck. Anyone who archived to laser disk in the 1980s has been down this road.
- File format obsolescence (five to ten years) - True, today's Office products can probably read that file originally saved in the Microsoft Excel version 1 file format back in the day, but what about those VisiCalc or Lotus 123 files? Tools that will convert such files to their modern equivalents will eventually grow increasingly scarce, and you always risk the possibility that they won't handle the conversion properly, leading to data corruption. If your data are encrypted, then your encryption format falls into the file format obsolescence bucket as well. And what about the programs themselves? From simple spreadsheet formulas to complex legacy spaghetti code, how do you archive algorithms in an obsolescence-proof format?
- Media obsolescence (ten to fifteen years) - CD-ROMs and digital backup tapes have an expected lifetime. Keeping them cool and dry can extend their life, but actually using them will shorten it. Do you really want to rely upon a fifteen-year-old backup tape for critical information?
- Computing paradigm obsolescence (fifty years perhaps; it's anybody's guess) - will quantum computing or biological processors or some other futuristic gear drive binary digital technologies into the Stone Age? Only time will tell. But if you are forward thinking enough to archive information for the 22nd century, there's no telling what you'll need to do to maintain the viability of your archives in a post-binary world.
Cloud to the Rescue?
On the surface, letting your Cloud Service Provider (CSP) archive your data solves many of these issues. Not only are the new archiving services like Amazon Glacier impressively cost-effective, but we can feel reasonably comfortable counting on today's CSPs to migrate our data from one hardware/media platform to the next over time as technology advances. So, can Cloud solve all your archiving issues?
At some point the answer may be yes, but Cloud Computing is still far too immature to jump to such a conclusion. Will your CSP still be in business decades from now? As the CSP market undergoes its inevitable consolidation phase, will the new CSP who bought out your old CSP handle your archive properly? Only time will tell.
But even if the CSPs rise to the archiving challenge, you may still have the file format challenge. Sure, archiving those old Lotus 123 files in the Cloud is a piece of cake, but that doesn't mean that your CSP will return them in Excel version 21.3 format ten years hence-an unfortunate and unintentional example of garbage in the Cloud.
The Big Data Old Tail
You might think that the challenges inherent in archiving Big Data are simply a matter of degree: bigger storage for bigger data sets, right? But thinking of Big Data as little more than extra-large data sets misses the big picture of the importance of Big Data.
The point to Big Data is that the indicated data sets continue to grow in size on an ongoing basis, continually pushing the limits of existing technology. The more capacity available for storage and processing, the larger the data sets we end up with. In other words, Big Data are by definition a moving target.
One familiar estimate states that the quantity of data in the world doubles every two years. Your organization's Big Data may grow somewhat faster or slower than this convenient benchmark, but in any case, the point is that Big Data growth is exponential. So, taking the two-year doubling factor as a rule of thumb, we can safely say that at any point in time, half of your Big Data are less than two years old, while the other half of your Big Data are more than two years old. And of course, this ZapFlash is concerned with the older half.
The Big Data archiving challenge, therefore, is breaking down the more-than-two-years-old Big Data sets. Remember that this two-year window is true at any point in time. Thinking about the problem mathematically, then, you can conclude that a quarter of your Big Data are more than four years old, an eighth are more than six years old, etc.
Combine this math with the lesson of the first part of this ZapFlash, and a critical point emerges: byte for byte, the cost of maintaining usable archives increases the older those archives become. And yet, the relative size of those archives is vanishingly small relative to today's and tomorrow's Big Data. Furthermore, this problem will only get worse over time, because the size of the Old Tail continues to grow exponentially.
We call this Big Data archiving problem the Big Data Old Tail. Similar to the Long Tail argument, which focuses on the value inherent in summing up the Long Tail of customer demand for niche products, the Big Data Old Tail focuses on the costs inherent in maintaining archives of increasingly small, yet increasingly costly data as we struggle to deal with older and older information. True, perhaps the fact that the Old Tail data sets from a particular time period are small will compensate for the fact that they are costly to archive, but remember that the Old Tail continues to grow over time. Unless we deal with the Old Tail, it threatens to overwhelm us.
The ZapThink Take
The obvious question that comes to mind is whether we need to save all those old data sets anyway. After all, who cares about, say, purchasing data from 1982? And of course, you may have a business reason for deleting old information. Since information you preserve may be subject to lawsuits or other unpleasantness, you may wish to delete data once it's legal to do so.
Fair enough. But there are perhaps far more examples of Big Data sets that your organization will wish to preserve indefinitely than data sets you're happy to delete. From scientific data to information on market behavior to social trends, the richness of our Big Data do not simply depend on the information from the last year or two or even ten. After all, if we forget the mistakes of the past then we are doomed to repeat them. Crunching today's Big Data can give us business intelligence, but only by crunching yesterday's Big Data as well can we ever expect to glean wisdom from our information.
Published February 28, 2013 Reads 1,627
Copyright © 2013 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Jason Bloomberg
Jason Bloomberg is President of ZapThink, a Dovel Technologies Company. He is a global thought leader in the areas of Cloud Computing, Enterprise Architecture, and Service-Oriented Architecture. He created the Licensed ZapThink Architect (LZA) SOA course and associated credential, and runs the LZA course as well as his Cloud Computing for Architects course around the world. He is a frequent conference speaker, and prolific writer. He also serves as an analyst for GigaOM and blogger for DevX.
Mr. Bloomberg is one of the original Managing Partners of ZapThink LLC, the leading SOA advisory and analysis firm, which was acquired by Dovel Technologies in August 2011. His book, Service Orient or Be Doomed! How Service Orientation Will Change Your Business (John Wiley & Sons, 2006, coauthored with Ron Schmelzer), is recognized as the leading business book on Service Orientation. His new book, The Agile Architecture Revolution: How Cloud Computing, REST-based SOA, and Mobile Computing are Changing Enterprise IT (John Wiley & Sons), was published in March 2013.
Mr. Bloomberg has a diverse background in eBusiness technology management and industry analysis, including serving as a senior analyst in IDC’s eBusiness Advisory group, as well as holding eBusiness management positions at USWeb/CKS (later marchFIRST) and WaveBend Solutions (now Hitachi Consulting). He also co-authored the books XML and Web Services Unleashed (SAMS Publishing, 2002), and Web Page Scripting Techniques (Hayden Books, 1996).
“Open source has always provided a number of benefits, including easing adoption costs, propagating a better understanding of the technology, and allowing for faster evolution and commercialization of products and services based on it,” noted Terry Woloszyn, Founder & CEO, Leeward Security Ltd., in this exclusive Q&A with Cloud Expo Conference Chair Jeremy Geelan. “This is clearly evident with the OpenStack and CloudStack,” Woloszyn continued, “and others that have been quickly commercialized as...May. 23, 2013 03:00 PM EDT Reads: 1,276 |
By Liz McMillan SYS-CON Events announced today that OpenStack will exhibit at SYS-CON's 12th International Cloud Expo, which will take place on June 10–13, 2013, at the Javits Center in New York City, New York. OpenStack software controls large pools of compute, storage, and networking resources throughout a datacenter, all managed by a dashboard that gives administrators control while empowering their users to provision resources through a web interface.
OpenStack powers some of the most widely-used SaaS app...May. 23, 2013 02:00 PM EDT Reads: 1,049 |
By Elizabeth White SYS-CON Events announced today that Wowrack will exhibit at SYS-CON's 12th International Cloud Expo, which will take place on June 10–13, 2013, at the Javits Center in New York City, New York.
Wowrack’s core expertise lies in high-availability Private and Public Cloud IaaS Hosting Solutions. Wowrack provides a true Hybrid service – where business release all IT management and hardware provisioning – taking the data center and server system administrative headaches off our customer’s shoulders. ...May. 23, 2013 12:15 PM EDT Reads: 1,053 |
By Liz McMillan Many have heard of OAuth but are unsure of how it might apply to their business.
In his session at the 12th International Cloud Expo, Alistair Farquharson, CTO of SOA Software, will describe how OAuth can be used to facilitate certain business models and simplify the sharing of private data.
Alistair Farquharson is a visionary industry veteran focused on using disruptive technologies to drive business growth and improve efficiency and agility within organizations. As the CTO of SOA Software A...May. 23, 2013 11:14 AM EDT Reads: 529 |
By Elizabeth White May. 23, 2013 11:00 AM EDT Reads: 1,146 |
By Pat Romanski SYS-CON Events announced today that nfina Technologies, a provider of highly reliable cloud server products, will exhibit at SYS-CON's 12th International Cloud Expo, which will take place on June 10–13, 2013, at the Javits Center in New York City, New York.
nfina Technologies develops, manufactures, and markets highly reliable cloud server products, designed to solve the most demanding data center requirements in mission-critical cloud applications. Nfina’s staff has decades of experience in co...May. 23, 2013 11:00 AM EDT Reads: 1,011 |
By Liz McMillan “Social, mobile, analytics and cloud can’t be looked at as distinct technology trends; they are facets of the same movement and an everyday reality for consumers and businesses alike,” said Craig Sowell, IBM VP of SmartCloud Marketing, in this exclusive Q&A with Cloud Expo Conference Chair Jeremy Geelan. “This means that businesses need to start looking at trends as one: cloud is the delivery, analytics is the unique insight, social is a shareable service, and mobile is the ubiquitous access.”
...May. 23, 2013 10:00 AM EDT Reads: 1,018 |
By Pat Romanski In his session at the 12th International Cloud Expo, Dave Eichorn, Global Data Center Practice Head at Zensar, will share a case study describing how a utility services company handled the migration of its Microsoft platform to the cloud. Challenged with the time-consuming task of opening operations out of temporary offices, this company struggled with the need to simultaneously access data that was accumulated from a vast amount of data-intensive jobs. Zensar migrated the company’s application ...May. 23, 2013 10:00 AM EDT Reads: 1,016 |
By Liz McMillan Organizations across the world are increasingly starting to see the benefits of moving more and more services to the cloud. The focus on the cost-saving potential of cloud is rapidly shifting to completely transforming the business with cloud. As organizations are investing enormous sums on technology they are starting to realize that in order to maximize the return on investment and accelerate the business transformation process the first area of focus should be people. By ensuring the organiza...May. 23, 2013 09:45 AM EDT Reads: 900 |
By Elizabeth White You're getting pitched every day from your legacy enterprise software and hardware vendors about "cloud." They're doing an amazing job of convincing your CIO and CTO about what cloud is and how you should use it. The reality is they're defending their shrinking market share and keeping you on the legacy treadmill for as long as they can by selling you solutions that aren't "cloud."
In her session at the 12th International Cloud Expo, Niki Acosta, Cloud Evangelista for Rackspace, will talk thro...May. 23, 2013 09:38 AM EDT Reads: 452 |
- Cloud Expo New York: Cloud Is Changing the Economics of Business
- Enterasys Spotlights SDN's Impact on Traditional Networking in Upcoming Webinar
- Cloud Expo New York: Deploying Hybrid Cloud for Performance and Uptime
- Cloud Expo New York: Delivering Digital Marketing on the Cloud
- Cloud Expo New York: Rethink IT and Reinvent Business with IBM SmartCloud
- Cloudant to Exhibit at Cloud Expo & Big Data Expo New York
- The Accessibility of the Cloud
- Cloud Expo New York | Danger Ahead: Why File Sync Is NOT Endpoint Backup
- Cloud Expo NY: Best Practices for Delivering Oracle Database as a Service
- Cloud Expo New York: Basics of SSD Technology and Its Use in Cloud
- Cloud Computing Is Simplifying Things
- Cloud Expo New York: Developing the World’s First IaaS Marketplace
- Cloud Expo New York: Best CIO Practices Shared from SHI’s Customers
- Cloud Expo New York: Cloud Is Changing the Economics of Business
- Cloud Expo New York: How to Use Google Apps Script
- Enterasys Spotlights SDN's Impact on Traditional Networking in Upcoming Webinar
- Rackspace Hosting Named “Platinum Plus Sponsor” of Cloud Expo New York
- Cloud Expo New York: Why Big Data Is Really About Small Data
- Cloud Expo New York: Deploying Hybrid Cloud for Performance and Uptime
- Cloud Expo New York: Delivering Digital Marketing on the Cloud
- Cloud Expo New York: Requirements of a Cloud Database
- Cloud Expo New York: Rethink IT and Reinvent Business with IBM SmartCloud
- Cloudant to Exhibit at Cloud Expo & Big Data Expo New York
- Cloud Expo New York: Time to Mission @ the Speed of Cloud
- Cloud Expo New York: Best CIO Practices Shared from SHI’s Customers
- AMD Hires New PC General Manager
- Cloud Expo New York: Cloud Is Changing the Economics of Business
- Cloud Expo New York: How to Use Google Apps Script
- Enterasys Spotlights SDN's Impact on Traditional Networking in Upcoming Webinar
- ScaleOut Software to Exhibit at Cloud Expo New York
- Web Host Industry Review “Media Sponsor” of Cloud Expo NY & Silicon Valley
- Speed-up and Simplify Backup and Restores
- Software Defined Networking – A Paradigm Shift
- MokaFive Gets New CEO
- Code 42 Software to Exhibit at Cloud Expo New York
- Appcore Named “Bronze Sponsor” of Cloud Expo New York








SYS-CON Events announced today that OpenStack will exhibit at SYS-CON's 12th International Cloud Expo, which will take place on June 10–13, 2013, at the Javits Center in New York City, New York. OpenStack software controls large pools of compute, storage, and networking resources throughout a datacenter, all managed by a dashboard that gives administrators control while empowering their users to provision resources through a web interface.
OpenStack powers some of the most widely-used SaaS app...
SYS-CON Events announced today that Wowrack will exhibit at SYS-CON's 12th International Cloud Expo, which will take place on June 10–13, 2013, at the Javits Center in New York City, New York.
Wowrack’s core expertise lies in high-availability Private and Public Cloud IaaS Hosting Solutions. Wowrack provides a true Hybrid service – where business release all IT management and hardware provisioning – taking the data center and server system administrative headaches off our customer’s shoulders. ...
Many have heard of OAuth but are unsure of how it might apply to their business.
In his session at the 12th International Cloud Expo, Alistair Farquharson, CTO of SOA Software, will describe how OAuth can be used to facilitate certain business models and simplify the sharing of private data.
Alistair Farquharson is a visionary industry veteran focused on using disruptive technologies to drive business growth and improve efficiency and agility within organizations. As the CTO of SOA Software A...
SYS-CON Events announced today that nfina Technologies, a provider of highly reliable cloud server products, will exhibit at SYS-CON's 12th International Cloud Expo, which will take place on June 10–13, 2013, at the Javits Center in New York City, New York.
nfina Technologies develops, manufactures, and markets highly reliable cloud server products, designed to solve the most demanding data center requirements in mission-critical cloud applications. Nfina’s staff has decades of experience in co...
“Social, mobile, analytics and cloud can’t be looked at as distinct technology trends; they are facets of the same movement and an everyday reality for consumers and businesses alike,” said Craig Sowell, IBM VP of SmartCloud Marketing, in this exclusive Q&A with Cloud Expo Conference Chair Jeremy Geelan. “This means that businesses need to start looking at trends as one: cloud is the delivery, analytics is the unique insight, social is a shareable service, and mobile is the ubiquitous access.”
...
In his session at the 12th International Cloud Expo, Dave Eichorn, Global Data Center Practice Head at Zensar, will share a case study describing how a utility services company handled the migration of its Microsoft platform to the cloud. Challenged with the time-consuming task of opening operations out of temporary offices, this company struggled with the need to simultaneously access data that was accumulated from a vast amount of data-intensive jobs. Zensar migrated the company’s application ...
Organizations across the world are increasingly starting to see the benefits of moving more and more services to the cloud. The focus on the cost-saving potential of cloud is rapidly shifting to completely transforming the business with cloud. As organizations are investing enormous sums on technology they are starting to realize that in order to maximize the return on investment and accelerate the business transformation process the first area of focus should be people. By ensuring the organiza...
You're getting pitched every day from your legacy enterprise software and hardware vendors about "cloud." They're doing an amazing job of convincing your CIO and CTO about what cloud is and how you should use it. The reality is they're defending their shrinking market share and keeping you on the legacy treadmill for as long as they can by selling you solutions that aren't "cloud."
In her session at the 12th International Cloud Expo, Niki Acosta, Cloud Evangelista for Rackspace, will talk thro...
Hyper-V Replica is our included asynchronous site-to-site VM replication capability for Windows Server 2012 and our free Hyper-V Server 2012 bare-metal enterprise-grade hypervisor. Using Hyper-V Replica, you can quickly implement a cost-effective disaster recovery plan for your business critical VM...
Imagine if you could take a time machine five years into the future, so that you would know which of today’s new technologies panned out and which did not.
Most companies have only started using cloud in the past two years. But there are some companies that have been using cloud for five years or...
Don and I have four children, all of whom have had the fortune to take piano lessons (I'm not sure if the youngest would agree he's fortunate at this point in his life but at five, he's not really able to answer the question with any degree of wisdom, anyway. Come to think of it, not sure the other ...
Our prior post, A Roadmap to High-Value Cloud Infrastructure: Disaster Recovery and Data Protection, discussed both the benefits and limitations of a cloud-based disaster recovery (DR) strategy. As we highlighted last week, traditional disaster recovery options leave open a huge hole: At one extreme...
Online collaboration has evolved during the last decade, delivering even greater value -- thanks to a new generation of business technology applications. Forbes Insights released "Collaborating in the Cloud," a Cisco-sponsored study examining the ways business leaders increasingly look at cloud coll...
New technologies allow schools, colleges and universities to analyze absolutely everything that happens. From student behavior, testing results, career development of students as well as educational needs based on changing societies. A lot of this data has already been stored and is used for statist...
A recent Gartner study states that the function of the modern CIO is in flux and that his or her future focus must incorporate digital assets (aka cloud-based data and applications) to remain relevant. Towards the goal of riding the sea change a compiler of stacks to a broker of business needs, secu...
In the coming years, big data will change the way organisations and societies are operated and managed. Big data however, is not the only trend that will impact significantly how organisations operate. Another major trend at the moment is gamification. Gamification will change the way organisations ...
We all talk about cloud differently, but is there a way we should be speaking about this tech?
Cloud computing is now a widely reported, if not accepted, IT movement that, depending on who you talk to, has changed or is changing the way businesses utilize infrastructure.











