Welcome!

SDN Journal Authors: Chris Colosimo, Jason Bloomberg, Scott Davis, Vinod Mohan, Mike Wood

Related Topics: @BigDataExpo, Java IoT, Microservices Expo, Linux Containers, @CloudExpo, SDN Journal

@BigDataExpo: Blog Feed Post

Big Data Monitoring

At the heart of any big data architecture is going to be some sort of NoSQL data repository

The term "Big Data" is quite possibly one of the most difficult IT-related terms to pin down ever. There are so many potential types of, and applications for Big Data that it can be a bit daunting to consider all of the possibilities. Thankfully, for IT operations staff, Big Data is mostly a bunch of new technologies that are being used together to solve some sort of business problem. In this blog post I'm going to focus on what IT Operations teams need to know about big data technology and support.

Big Data Repositories
At the heart of any big data architecture is going to be some sort of NoSQL data repository. If you're not very familiar with the various types of NoSQL databases that are out there today I recommend reading this article on the MongoDB website. These repositories are designed to run in a distributed/clustered manner so they they can process incoming queries as fast as possible on extremely large data sets.

MongoDB Request Diagram

Source: MongoDB

An important concept to understand when discussing big data repositories is the concept of sharding. Sharding is when you take a large database and break it down into smaller sets of data which are distributed across server instances. This is done to improve performance as your database can be highly distributed and the amount of data to query is less than the same database without sharding. It also allows you to keep scaling horizontally and that is usually much easier than having to scale vertically. If you want more details on sharding you can reference this Wikipedia page.

Application Performance Considerations
Monitoring the performance of big data repositories is just as important as monitoring the performance of any other type of database. Applications that want to use the data stored in these repositories will submit queries in much the same way as traditional applications querying relational databases like Oracle, SQL Server, Sybase, DB2, MySQL, PostgreSQL, etc... Let's take a look at more information from the MongoDB website. In their documentation there is a section on monitoring MongoDB that states "Monitoring is a critical component of all database administration." This is a simple statement that is overlooked all too often when deploying new technology in most organizations. Monitoring is usually only considered once major problems start to crop up and by that time there has already been impact to the users and the business.

RedisDashboard

Dashboard showing Redis key metrics.

One thing that we can't forget is just how important it is to monitor not only the big data repository, but to also monitor the applications that are querying the repository. After all, those applications are the direct clients that could be responsible for creating a performance issue and that certainly rely on the repository to perform well when queried. The application viewpoint is where you will first discover if there is a problem with the data repository that is actually impacting the performance and/or functionality of the app itself.

Monitoring Examples
Now that we have built a quick foundation of big data knowledge, how do we monitor them in the real world?

End to end flow - As we already discussed, you need to understand if your big data applications are being impacted by the performance of your big data repositories. You do that by tracking all of the application transactions across all of the application tiers and analyzing their response times. Given this information it's easy to identify exactly which components are experiencing problems at any given time.

FS Transaction View

Code level details - When you've identified that there is a performance problem in your big data application you need to understand what portion of the code is responsible for the problems. The only way to do this is by using a tool that provides deep code diagnostics and is capable of showing you the call stack of your problematic transactions.

Cassandra_Call_Stack

Back end processing - Tracing transactions from the end user, through the application tier, and into the backend repository is required to properly identify and isolate performance problems. Identification of poor performing backend tiers (big data repositories, relational databases, etc...) is easy if you have the proper tools in place to provide the view of your transactions.

Backend_Detection

AppDynamics detects and measures the response time of all backend calls.

Big data metrics - Each big data technology has it's own set of relevant KPIs just like any other technology used in the enterprise. The important part is to understand what is normal behavior for each metric while performance is good and then identify when KPIs are deviating from normal. This combined with the end to end transaction tracking will tell you if there is a problem, where the problem is, and possibly the root cause. AppDynamics currently has monitoring extensions for HBase, MongoDB, Redis, Hadoop, Cassandra, CouchBase, and CouchDB. You can find all AppDynamics platform extensions by clicking here.

HadoopDashboard

Hadoop KPI Dashboard 1

HadoopDashboard2

Hadoop KPI Dashboard 2

Big data deep dive - Sometimes KPIs aren't enough to help solve your big data performance issues. That's when you need to pull out the big guns and use a deep dive tool to assist with troubleshooting. Deep dive tools will be very detailed and very specific to the big data repository type that you are using/monitoring. In the screen shots below you can see details of AppDynamics monitoring for MongoDB.

MongoDB Monitoring 1

MongoDB Monitoring 2

MongoDB Monitoring 3

MongoDB Monitoring 4

If your company is using big data technology, it's IT operations' responsibility to deploy and support a cohesive performance monitoring strategy for the inevitable performance degradation that will cause business impact. See what AppDynamics has to offer by signing up for our free trial today.

The post Big Data Monitoring written by Jim Hirschauer appeared first on Application Performance Monitoring Blog from AppDynamics.

Read the original blog entry...

More Stories By Jyoti Bansal

In high-production environments where release cycles are measured in hours or minutes — not days or weeks — there's little room for mistakes and no room for confusion. Everyone has to understand what's happening, in real time, and have the means to do whatever is necessary to keep applications up and running optimally.

DevOps is a high-stakes world, but done well, it delivers the agility and performance to significantly impact business competitiveness.

@CloudExpo Stories
SYS-CON Events announced today that Juniper Networks (NYSE: JNPR), an industry leader in automated, scalable and secure networks, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Juniper Networks challenges the status quo with products, solutions and services that transform the economics of networking. The company co-innovates with customers and partners to deliver automated, scalable and secure network...
SYS-CON Events announced today that CA Technologies has been named "Platinum Sponsor" of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, New York, and 21st International Cloud Expo, which will take place in November in Silicon Valley, California.
NHK, Japan Broadcasting, will feature the upcoming @ThingsExpo Silicon Valley in a special 'Internet of Things' and smart technology documentary that will be filmed on the expo floor between November 3 to 5, 2015, in Santa Clara. NHK is the sole public TV network in Japan equivalent to the BBC in the UK and the largest in Asia with many award-winning science and technology programs. Japanese TV is producing a documentary about IoT and Smart technology and will be covering @ThingsExpo Silicon Val...
As DevOps methodologies expand their reach across the enterprise, organizations face the daunting challenge of adapting related cloud strategies to ensure optimal alignment, from managing complexity to ensuring proper governance. How can culture, automation, legacy apps and even budget be reexamined to enable this ongoing shift within the modern software factory?
New competitors, disruptive technologies, and growing expectations are pushing every business to both adopt and deliver new digital services. This ‘Digital Transformation’ demands rapid delivery and continuous iteration of new competitive services via multiple channels, which in turn demands new service delivery techniques – including DevOps. In this power panel at @DevOpsSummit 20th Cloud Expo, moderated by DevOps Conference Co-Chair Andi Mann, panelists will examine how DevOps helps to meet th...
SYS-CON Events announced today that Hitachi Data Systems, a wholly owned subsidiary of Hitachi LTD., will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City. Hitachi Data Systems (HDS) will be featuring the Hitachi Content Platform (HCP) portfolio. This is the industry’s only offering that allows organizations to bring together object storage, file sync and share, cloud storage gateways, and sophisticated search an...
SYS-CON Events announced today that Hitachi, the leading provider the Internet of Things and Digital Transformation, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Hitachi Data Systems, a wholly owned subsidiary of Hitachi, Ltd., offers an integrated portfolio of services and solutions that enable digital transformation through enhanced data management, governance, mobility and analytics. We help globa...
@DevOpsSummit has been named the ‘Top DevOps Influencer' by iTrend. iTred processes millions of conversations, tweets, interactions, news articles, press releases, blog posts - and extract meaning form them and analyzes mobile and desktop software platforms used to communicate, various metadata (such as geo location), and automation tools. In overall placement, @DevOpsSummit ranked as the number one ‘DevOps Influencer' followed by @CloudExpo at third, and @MicroservicesE at 24th.
Blockchain is a shared, secure record of exchange that establishes trust, accountability and transparency across supply chain networks. Supported by the Linux Foundation's open source, open-standards based Hyperledger Project, Blockchain has the potential to improve regulatory compliance, reduce cost and time for product recall as well as advance trade. Are you curious about Blockchain and how it can provide you with new opportunities for innovation and growth? In her session at 20th Cloud Exp...
Cloud promises the agility required by today’s digital businesses. As organizations adopt cloud based infrastructures and services, their IT resources become increasingly dynamic and hybrid in nature. Managing these require modern IT operations and tools. In his session at 20th Cloud Expo, Raj Sundaram, Senior Principal Product Manager at CA Technologies, will discuss how to modernize your IT operations in order to proactively manage your hybrid cloud and IT environments. He will be sharing be...
While some vendors scramble to create and sell you a fancy solution for monitoring your spanking new Amazon Lambdas, hear how you can do it on the cheap using just built-in Java APIs yourself. By exploiting a little-known fact that Lambdas aren’t exactly single threaded, you can effectively identify hot spots in your serverless code. In his session at 20th Cloud Expo, David Martin, Principal Product Owner at CA Technologies, will give a live demonstration and code walkthrough, showing how to o...
The explosion of new web/cloud/IoT-based applications and the data they generate are transforming our world right before our eyes. In this rush to adopt these new technologies, organizations are often ignoring fundamental questions concerning who owns the data and failing to ask for permission to conduct invasive surveillance of their customers. Organizations that are not transparent about how their systems gather data telemetry without offering shared data ownership risk product rejection, regu...
In his keynote at 19th Cloud Expo, Sheng Liang, co-founder and CEO of Rancher Labs, discussed the technological advances and new business opportunities created by the rapid adoption of containers. With the success of Amazon Web Services (AWS) and various open source technologies used to build private clouds, cloud computing has become an essential component of IT strategy. However, users continue to face challenges in implementing clouds, as older technologies evolve and newer ones like Docker c...
SYS-CON Events announced today that Hitachi, the leading provider the Internet of Things and Digital Transformation, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Hitachi Data Systems, a wholly owned subsidiary of Hitachi, Ltd., offers an integrated portfolio of services and solutions that enable digital transformation through enhanced data management, governance, mobility and analytics. We help globa...
Cloud Expo, Inc. has announced today that Aruna Ravichandran, vice president of DevOps Product and Solutions Marketing at CA Technologies, has been named co-conference chair of DevOps at Cloud Expo 2017. The @DevOpsSummit at Cloud Expo New York will take place on June 6-8, 2017, at the Javits Center in New York City, New York, and @DevOpsSummit at Cloud Expo Silicon Valley will take place Oct. 31-Nov. 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA.
SYS-CON Events announced today that SoftLayer, an IBM Company, has been named “Gold Sponsor” of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2016, at the Javits Center in New York, New York. SoftLayer, an IBM Company, provides cloud infrastructure as a service from a growing number of data centers and network points of presence around the world. SoftLayer’s customers range from Web startups to global enterprises.
@DevOpsSummit at Cloud taking place June 6-8, 2017, at Javits Center, New York City, is co-located with the 20th International Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long developm...
With major technology companies and startups seriously embracing IoT strategies, now is the perfect time to attend @ThingsExpo 2016 in New York. Learn what is going on, contribute to the discussions, and ensure that your enterprise is as "IoT-Ready" as it can be! Internet of @ThingsExpo, taking place June 6-8, 2017, at the Javits Center in New York City, New York, is co-located with 20th Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry p...
SYS-CON Events announced today that Progress, a global leader in application development, has been named “Bronze Sponsor” of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Enterprises today are rapidly adopting the cloud, while continuing to retain business-critical/sensitive data inside the firewall. This is creating two separate data silos – one inside the firewall and the other outside the firewall. Cloud ISVs oft...
Five years ago development was seen as a dead-end career, now it’s anything but – with an explosion in mobile and IoT initiatives increasing the demand for skilled engineers. But apart from having a ready supply of great coders, what constitutes true ‘DevOps Royalty’? It’ll be the ability to craft resilient architectures, supportability, security everywhere across the software lifecycle. In his keynote at @DevOpsSummit at 20th Cloud Expo, Jeffrey Scheaffer, GM and SVP, Continuous Delivery Busine...