Organizing Big Data

Though your company likely has enough types of data to earn the moniker "big," you can't just throw it all in a single database

Big Data is bandied around so much that it seems to have lost what little meaning it had in the first place. What's so special about it, anyway? Some people think it's just a lot of data, more than most companies are used to (Google doesn't count). The real definition isn't much more … well, definite. It is, as Tech Target's glossary says, "… the voluminous amount of unstructured and semi-structured data a company creates — data that would take too much time and cost too much money to load into a relational database for analysis."

How much is "too much"? That's the tricky part; it largely depends on how you set up your database(s) and the hardware you use. For practicality's sake, we'll just say that when you move your siloed data into a non-relational database, you're working with big data. Enough businesses are switching to big-data-capable infrastructure that Gartner Research predicts the shift will drive $232 billion in spending through 2016. Why are companies spending so much on a single area?
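To make the "relational vs. non-relational" distinction above concrete, here is a minimal Python sketch. The event records and field names are invented for illustration; a real deployment would use a document store such as MongoDB rather than an in-memory list, but the point is the same: each record carries its own fields, so no single fixed table layout fits.

```python
# Semi-structured records resist a fixed relational schema: each "document"
# carries its own fields. A relational table would need a column for every
# field ever used (mostly NULL); a document store keeps each record as-is.
# (Hypothetical data for illustration only.)

events = [
    {"type": "page_view", "user": "u1", "url": "/pricing"},
    {"type": "purchase", "user": "u2", "sku": "A-100", "amount": 49.99},
    {"type": "support_ticket", "user": "u1", "subject": "Login issue",
     "messages": ["Can't log in", "Resolved by password reset"]},
]

def fields_used(records):
    """Return the union of field names across all records."""
    names = set()
    for r in records:
        names.update(r.keys())
    return names

# Three records already use nine distinct fields between them.
print(sorted(fields_used(events)))
```

Loading these into one relational table means deciding the schema up front; a non-relational store defers that decision until query time, which is exactly the trade-off the definition above describes.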

Though your company likely has enough types and volume of data to earn the moniker "big," you can't just throw it all in a single database and expect to see quick, reliable benefits. Hence, the hundreds of billions of dollars companies will spend in the next few years just to support these massive non-relational databases. You can't dump your data onto a traditional hard disk NAS and run reports in the time it takes to get yourself a cup of tea. Hard drives are notorious for poor random I/O performance, so your big data solution needs a solid flash NAS or SAN array that can access non-sequential data quickly.

You'll have to beef up your servers, too. Running a report against a relational sales database is easy for most servers; running the same report against a big data store is not. Make life easier for everyone in your organization by running your database on a cloud server — public, private, in-house, outsourced, hybrid, whatever type is right for you.

But why is big data so important? What makes it attractive to so many companies? Again, we'll turn to Tech Target, though I won't quote it directly:

· Patterns: Whether you're a B2B shipping container manufacturer or a B2C online retailer, you need more information about your business, your customers and your industry. The more data you collect in one place, the easier it is to see patterns. What's your busiest time of year? Your bestselling product/service? Are most of your customers from the same region? Are people more likely to buy from you after reading 12 or more pages on your website? Having that data available and waiting for you to dig into it is invaluable.

· Space: How much duplicate data is your business storing that you could eliminate by combining all your disparate databases? Are you perhaps collecting too much data? I know my previous point harped on collecting more data, but if you're storing information that serves no purpose, it's a lot easier to eliminate if it's all in a single location.

· Legal Compliance: If you don't have a complete view of the data you retain, it's hard to make sure you're complying with all applicable laws and policies. If someone sues your company or you undergo a compliance audit, you need to have easy access to all the information you're storing.
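The first two payoffs above — deduplication and pattern-spotting — can be sketched in a few lines of Python. The record shape (order_id, month) and the two "silos" are hypothetical; the point is that once the data sits in one place, both operations become one-liners.

```python
# Two departmental silos holding overlapping sales records
# (hypothetical field names and data, for illustration only).
from collections import Counter

sales_crm = [{"order_id": 1, "month": "Nov"}, {"order_id": 2, "month": "Dec"}]
sales_erp = [{"order_id": 2, "month": "Dec"}, {"order_id": 3, "month": "Dec"}]

# Space: combine the silos and drop duplicates keyed on order_id.
merged = {r["order_id"]: r for r in sales_crm + sales_erp}.values()

# Patterns: with everything in one place, "busiest time of year"
# falls out of a one-line aggregation.
busiest = Counter(r["month"] for r in merged).most_common(1)[0]
print(busiest)  # ('Dec', 2)
```

At real scale the same two steps run as distributed jobs over a non-relational store, but the logic — merge, dedupe, aggregate — is unchanged.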

It's easy to dismiss big data as some overhyped phrase people throw around, but when you dive deeper, you realize that it serves a practical purpose. The days of every department having its own siloed database aren't over yet, but they should be. The advantages of big data ensure that.

More Stories By Joseph Parker

Joseph Parker has worked in management, supply chain metrics, and business/marketing strategy with small and large businesses for more than 10 years. His experience in development is personal, stemming from his work in mobile marketing and application technology. He is an avid reader of industry publications and follows the ongoing technological trends stemming from software and product development. He is an inbound marketer, avid blogger, and content provider for many business blogs.
