Welcome!

SDN Journal Authors: Pat Romanski, Elizabeth White, Carmen Gonzalez, Liz McMillan, Sanjeev Sharma

Related Topics: Big Data Journal, Java, SOA & WOA, Linux, Cloud Expo, SDN Journal

Big Data Journal: Blog Feed Post

Big Data Monitoring

At the heart of any big data architecture is going to be some sort of NoSQL data repository

The term "Big Data" is quite possibly one of the most difficult IT-related terms to pin down ever. There are so many potential types of, and applications for Big Data that it can be a bit daunting to consider all of the possibilities. Thankfully, for IT operations staff, Big Data is mostly a bunch of new technologies that are being used together to solve some sort of business problem. In this blog post I'm going to focus on what IT Operations teams need to know about big data technology and support.

Big Data Repositories
At the heart of any big data architecture is going to be some sort of NoSQL data repository. If you're not very familiar with the various types of NoSQL databases that are out there today I recommend reading this article on the MongoDB website. These repositories are designed to run in a distributed/clustered manner so they they can process incoming queries as fast as possible on extremely large data sets.

MongoDB Request Diagram

Source: MongoDB

An important concept to understand when discussing big data repositories is the concept of sharding. Sharding is when you take a large database and break it down into smaller sets of data which are distributed across server instances. This is done to improve performance as your database can be highly distributed and the amount of data to query is less than the same database without sharding. It also allows you to keep scaling horizontally and that is usually much easier than having to scale vertically. If you want more details on sharding you can reference this Wikipedia page.

Application Performance Considerations
Monitoring the performance of big data repositories is just as important as monitoring the performance of any other type of database. Applications that want to use the data stored in these repositories will submit queries in much the same way as traditional applications querying relational databases like Oracle, SQL Server, Sybase, DB2, MySQL, PostgreSQL, etc... Let's take a look at more information from the MongoDB website. In their documentation there is a section on monitoring MongoDB that states "Monitoring is a critical component of all database administration." This is a simple statement that is overlooked all too often when deploying new technology in most organizations. Monitoring is usually only considered once major problems start to crop up and by that time there has already been impact to the users and the business.

RedisDashboard

Dashboard showing Redis key metrics.

One thing that we can't forget is just how important it is to monitor not only the big data repository, but to also monitor the applications that are querying the repository. After all, those applications are the direct clients that could be responsible for creating a performance issue and that certainly rely on the repository to perform well when queried. The application viewpoint is where you will first discover if there is a problem with the data repository that is actually impacting the performance and/or functionality of the app itself.

Monitoring Examples
Now that we have built a quick foundation of big data knowledge, how do we monitor them in the real world?

End to end flow - As we already discussed, you need to understand if your big data applications are being impacted by the performance of your big data repositories. You do that by tracking all of the application transactions across all of the application tiers and analyzing their response times. Given this information it's easy to identify exactly which components are experiencing problems at any given time.

FS Transaction View

Code level details - When you've identified that there is a performance problem in your big data application you need to understand what portion of the code is responsible for the problems. The only way to do this is by using a tool that provides deep code diagnostics and is capable of showing you the call stack of your problematic transactions.

Cassandra_Call_Stack

Back end processing - Tracing transactions from the end user, through the application tier, and into the backend repository is required to properly identify and isolate performance problems. Identification of poor performing backend tiers (big data repositories, relational databases, etc...) is easy if you have the proper tools in place to provide the view of your transactions.

Backend_Detection

AppDynamics detects and measures the response time of all backend calls.

Big data metrics - Each big data technology has it's own set of relevant KPIs just like any other technology used in the enterprise. The important part is to understand what is normal behavior for each metric while performance is good and then identify when KPIs are deviating from normal. This combined with the end to end transaction tracking will tell you if there is a problem, where the problem is, and possibly the root cause. AppDynamics currently has monitoring extensions for HBase, MongoDB, Redis, Hadoop, Cassandra, CouchBase, and CouchDB. You can find all AppDynamics platform extensions by clicking here.

HadoopDashboard

Hadoop KPI Dashboard 1

HadoopDashboard2

Hadoop KPI Dashboard 2

Big data deep dive - Sometimes KPIs aren't enough to help solve your big data performance issues. That's when you need to pull out the big guns and use a deep dive tool to assist with troubleshooting. Deep dive tools will be very detailed and very specific to the big data repository type that you are using/monitoring. In the screen shots below you can see details of AppDynamics monitoring for MongoDB.

MongoDB Monitoring 1

MongoDB Monitoring 2

MongoDB Monitoring 3

MongoDB Monitoring 4

If your company is using big data technology, it's IT operations' responsibility to deploy and support a cohesive performance monitoring strategy for the inevitable performance degradation that will cause business impact. See what AppDynamics has to offer by signing up for our free trial today.

The post Big Data Monitoring written by Jim Hirschauer appeared first on Application Performance Monitoring Blog from AppDynamics.

Read the original blog entry...

More Stories By Sandi Mappic

Sandi Mappic has a passion for making apps go faster. She works with AppDynamics around the clock to help customers resolve performance pain and master application performance management. (This is AppDynamics blog feed written by several different AppDynamics bloggers.)

@CloudExpo Stories
When an enterprise builds a hybrid IaaS cloud connecting its data center to one or more public clouds, security is often a major topic along with the other challenges involved. Security is closely intertwined with the networking choices made for the hybrid cloud. Traditional networking approaches for building a hybrid cloud try to kludge together the enterprise infrastructure with the public cloud. Consequently this approach requires risky, deep "surgery" including changes to firewalls, subnets...
Ixia develops amazing products so its customers can connect the world. Ixia helps its customers provide an always-on user experience through fast, secure delivery of dynamic connected technologies and services. Through actionable insights that accelerate and secure application and service delivery, Ixia's customers benefit from faster time to market, optimized application performance and higher-quality deployments.
SYS-CON Events announced today that Stratogent will exhibit at SYS-CON's 15th International Cloud Expo®, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Stratogent is a custom managed services organization based in San Mateo, California. We design, implement, and support mission critical infrastructure 24x7 on premises, in datacenters and in the Cloud. Since 2005, we have acted as an extension of internal IT teams, achieving a customer reten...
SYS-CON Events announced today that Harbinger Systems will exhibit at SYS-CON's 15th International Cloud Expo®, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Harbinger Systems is a global company providing software technology services. Since 1990, Harbinger has developed a strong customer base worldwide. Its customers include software product companies ranging from hi-tech start-ups in Silicon Valley to leading product companies in the US a...
SYS-CON Events announces a new pavilion on the Cloud Expo floor where WebRTC converges with the Internet of Things. Pavilion will showcase WebRTC and the Internet of Things. The Internet of Things (IoT) is the most profound change in personal and enterprise IT since the creation of the Worldwide Web more than 20 years ago. All major researchers estimate there will be tens of billions devices--computers, smartphones, tablets, and sensors – connected to the Internet by 2020. This number will con...
The only place to be June 9-11 is Cloud Expo & @ThingsExpo 2015 East at the Javits Center in New York City. Join us there as delegates from all over the world come to listen to and engage with speakers & sponsors from the leading Cloud Computing, IoT & Big Data companies. Cloud Expo & @ThingsExpo are the leading events covering the booming market of Cloud Computing, IoT & Big Data for the enterprise. Speakers from all over the world will be hand-picked for their ability to explore the economic...
SYS-CON Events announced today that Cloudian, Inc., the leading provider of hybrid cloud storage solutions, has been named “Bronze Sponsor” of SYS-CON's 15th International Cloud Expo®, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Cloudian is a Foster City, Calif.-based software company specializing in cloud storage. Cloudian HyperStore® is an S3-compatible cloud object storage platform that enables service providers and enterprises to bui...
SYS-CON Events announced today that Gridstore™, the leader in software-defined storage (SDS) purpose-built for Windows Servers and Hyper-V, will exhibit at SYS-CON's 15th International Cloud Expo®, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Gridstore™ is the leader in software-defined storage purpose built for virtualization that is designed to accelerate applications in virtualized environments. Using its patented Server-Side Virtual C...
As the Internet of Things unfolds, mobile and wearable devices are blurring the line between physical and digital, integrating ever more closely with our interests, our routines, our daily lives. Contextual computing and smart, sensor-equipped spaces bring the potential to walk through a world that recognizes us and responds accordingly. We become continuous transmitters and receivers of data. In his session at Internet of @ThingsExpo, Andrew Bolwell, Director of Innovation for HP’s Printing a...
SAP is delivering break-through innovation combined with fantastic user experience powered by the market-leading in-memory technology, SAP HANA. In his General Session at 15th Cloud Expo, Thorsten Leiduck, VP ISVs & Digital Commerce, SAP, will discuss how SAP and partners provide cloud and hybrid cloud solutions as well as real-time Big Data offerings that help companies of all sizes and industries run better. SAP launched an application challenge to award the most innovative SAP HANA and SAP ...
The Internet of Things (IoT) promises to evolve the way the world does business; however, understanding how to apply it to your company can be a mystery. Most people struggle with understanding the potential business uses or tend to get caught up in the technology, resulting in solutions that fail to meet even minimum business goals. In his session at Internet of @ThingsExpo, Jesse Shiah, CEO / President / Co-Founder of AgilePoint Inc., will show what is needed to leverage the IoT to transform...
SYS-CON Events announced today that AIC, a leading provider of OEM/ODM server and storage solutions, will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. AIC is a leading provider of both standard OTS, off-the-shelf, and OEM/ODM server and storage solutions. With expert in-house design capabilities, validation, manufacturing and production, AIC's broad selection of products are highly flexible and are conf...
We are all here because we are sold on the transformative promise of The Cloud. But what good is all of this ephemeral, on-demand infrastructure if your usage doesn't actually improve the agility and speed of your business? How must Operations adapt in order to avoid stifling your Cloud initiative? In his session at DevOps Summit, Damon Edwards, co-founder and managing partner of the DTO Solutions, will highlight the successful organizational, process, and tooling patterns of high-performing c...
SYS-CON Events announced today that O'Reilly Media has been named “Media Sponsor” of SYS-CON's 15th International Cloud Expo®, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. O'Reilly Media spreads the knowledge of innovators through its books, online services, magazines, and conferences. Since 1978, O'Reilly Media has been a chronicler and catalyst of cutting-edge development, homing in on the technology trends that really matter and spurri...
SYS-CON Events announced today that Aria Systems, the recurring revenue expert, has been named "Bronze Sponsor" of SYS-CON's 15th International Cloud Expo®, which will take place on November 4-6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Aria Systems helps leading businesses connect their customers with the products and services they love. Industry leaders like Pitney Bowes, Experian, AAA NCNU, VMware, HootSuite and many others choose Aria to power their recurring revenue bu...
The Transparent Cloud-computing Consortium (abbreviation: T-Cloud Consortium) will conduct research activities into changes in the computing model as a result of collaboration between "device" and "cloud" and the creation of new value and markets through organic data processing High speed and high quality networks, and dramatic improvements in computer processing capabilities, have greatly changed the nature of applications and made the storing and processing of data on the network commonplace. ...
Seagate has a strong track record of collaborating with others to develop better cloud solutions. The Seagate Cloud Builder Alliance program, for example, leverages the company’s knowledge of storage and cloud-optimized solutions to give cloud service providers the customized, flexible and scalable server and storage solutions to meet the high levels of service their customers demand. Seagate also is a member of the OpenStack Foundation and Open Compute Project to help define and promote open-so...
The Internet of Things (IoT) is going to require a new way of thinking and of developing software for speed, security and innovation. This requires IT leaders to balance business as usual while anticipating for the next market and technology trends. Cloud provides the right IT asset portfolio to help today’s IT leaders manage the old and prepare for the new. Today the cloud conversation is evolving from private and public to hybrid. This session will provide use cases and insights to reinforce t...
What process has your provider undertaken to ensure that the cloud tenant will receive predictable performance and service? What was involved in the planning? Who owns and operates the data center? What technology is being used? How is it being supported? In his session at 14th Cloud Expo, Dave Weisbrot, Cloud Business Manager for QTS, will provide the attendees a look into what it takes to stand up and stand behind a highly available certified cloud IaaS.
SYS-CON Events announced today that Gigaom Research has been named "Media Sponsor" of SYS-CON's 15th International Cloud Expo®, which will take place on November 4-6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Ashar Baig, Research Director, Cloud, at Gigaom Research, will also lead a Power Panel on the topic "Choosing the Right Cloud Option." Gigaom Research provides timely, in-depth analysis of emerging technologies for individual and corporate subscribers. Gigaom Research'...