Welcome!

SDN Journal Authors: Lori MacVittie, Carmen Gonzalez, Paul Speciale, Sandi Mappic, Elizabeth White

Related Topics: Big Data Journal, Java, SOA & WOA, .NET

Big Data Journal: Article

Detecting Anomalies that Matter!

Like needles in a haystack

As Netuitive's Chief Data Scientist, I am fortunate to work closely with some of the worlds' largest banks, telcos, and eCommerce companies. Increasingly the executives that I speak with at these companies are no longer focused on just detecting application performance anomalies - they want to understand the impact this has on the business.  For example - "is the current slowdown in the payment service impacting sales?"

You can think of it as detecting IT operations anomalies that really matter - but this is easier said than done.

Like Needles in a Haystack
When it comes to IT analytics, there is a general notion that the more monitoring data you are able to consume, analyze, and correlate, the more accurate your results will be. Just pile all that infrastructure, application performance, and business metric data together and good things are bound to happen, right?

Larger organizations typically have access to voluminous data being generated from dozens of monitoring tools that are tracking thousands of infrastructure and application components.  At the same time, these companies often track hundreds of business metrics using a totally different set of tools.

The problem is that, collectively, these monitoring tools do not communicate with each other.  Not only is it hard to get holistic visibility into the performance and health of a particular business service, it's even harder to discover complex anomalies that have business impact.

Anomalies are Like Snowflakes
Compounding the challenge is the fact that no two anomalies are alike.  Anomalies that matter have multiple facets.  They reflect a composite behavior of many layers of interacting and inter-dependent components.  Additionally, they can be cleverly disguised or hidden in a haze of visible but insignificant noise.  No matter how many graphs and charts you display on the largest LCD monitor you can find - the type of scalable real-time analysis required to find and expose what's important is humanly impossible.

Enter IT Operations Analytics
Analytics such as statistical machine learning allow us to understand the "normal" behavior of each resource we are tracking - be it a single IT component, web service, application, or business process. Additional algorithms help us find patterns and correlations between the thousands of IT and business metrics that matter in a critical service.

The Shift Towards IT Operations Analytics is Already Happening
This is not about the future.  It's about what companies are doing today.

Several years ago thought-leading enterprises (primarily large banks with critical revenue driving services) began experimenting with a new breed of IT analytics platform. These companies' electronic and web facing businesses had so much revenue (and reputation) at stake that they needed to find the anomalies that matter -- the ones that were truly indicative of current or impending problems.

Starting with an almost "blank slate", these forward-thinking companies began developing open IT analytics platforms that easily integrated any type of data source in real time to provide a comprehensive view of patterns and relationships between IT infrastructure and business service performance. This was only possible with technologies that leveraged sophisticated data integration, knowledge modeling, and analytics to discover and capture the unique behavior of complex business services.  Anything less would fail, because, like snowflakes, no two anomalies are alike.

The Continuous Need for Algorithm Research
The online banking system at one bank is different than the online system at the next bank.  And the transaction slowdown that occurred last week may have a totally different root cause than the one two months ago.  Even more interesting are external factors such as seasonality and its effects on demand.  For example, payment companies see increased workload around holidays such as Thanksgiving and Mother's Day whereas gaming/betting companies' demand is driven more by factors such as the NFL Playoffs or the World Series.

For this reason, analytics research is an ongoing endeavor at Netuitive - part driven by customer needs and in part by advances in technology.   Once Netuitive technology is installed in an enterprise and integrating data collected across multiple layers in the service stack, behavior learning begins immediately.  As time passes, the statistical algorithms have more observations to feed their results and this leads to increasing confidence in both anomalies detected and proactive forecasts.  Additionally, customer domain knowledge can be layered in to Netuitive's real-time analysis in the form of knowledge bases and supervised learning algorithms.  The Research Group at Netuitive works closely with our Professional Services Group as well as directly with customers to regularly review actual delivered alarm quality to tune the algorithms that we have as well as identify new algorithms that would deliver greater value in an actionable timeframe.

Since Netuitive's software architecture allows for "pluggable" algorithms, we can incrementally introduce new analytics capabilities easily, at first in an experimental or laboratory setting and ultimately, once verified, into production.

The IT operations management market has matured over the past two decades to the point that most critical components are well instrumented.  The data is there and mainstream IT organizations (not just visionary early adopters) realize that analytics deliver measurable and tangible value.   My vision and challenge is to get our platform to the point where customers can easily customize the algorithms on their own, as their needs and IT infrastructure evolve over time.  This is where platforms need to get to because of the endless variety of ways that enterprises must discover and remediate "anomalies that matter".

Stay tuned.  In an upcoming blog I will drill down on some specific industry examples of algorithms we developed as part of some large enterprise IT analytic platform solutions.

More Stories By Elizabeth A. Nichols, Ph.D

As Chief Data Scientist for Netuitive, Elizabeth A. Nichols, Ph.D. leads development of algorithms, models, and analytics. This includes both enriching the company’s current portfolio as well as developing new analytics to support current and emerging technologies and IT-dependent business services across multiple industry sectors.

Previously, Dr. Nichols co-founded PlexLogic, a provider of open analytics services for quantitative data analysis, risk modeling and data visualization. In her role as CTO and Chief Data Scientist, she developed a cloud platform for collecting, cleansing and correlating data from heterogeneous sources, computing metrics, applying algorithms and models, and visualizing results. Prior to Plexlogic, Dr. Nichols co-founded and served as CTO for ClearPoint Metrics, a security metrics software platform that was eventually sold to nCircle. Prior to ClearPoint Metrics, Dr. Nichols served in technical advisory and leadership positions at CA, Legent Corp, BladeLogic, and Digital Analysis Corp. At CA, she was VP of Research and Development and Lead Architect for agent instrumentation and analytics for CA Unicenter. After receiving a Ph.D. in Mathematics from Duke University, she began her career as an operations research analyst developing war gaming models for the US Army.

@CloudExpo Stories
The major cloud platforms defy a simple, side-by-side analysis. Each of the major IaaS public-cloud platforms offers their own unique strengths and functionality. Options for on-site private cloud are diverse as well, and must be designed and deployed while taking existing legacy architecture and infrastructure into account. Then the reality is that most enterprises are embarking on a hybrid cloud strategy and programs. In this Power Panel at 15th Cloud Expo, moderated by Ashar Baig, Research ...
In her General Session at 15th Cloud Expo, Anne Plese, Senior Consultant, Cloud Product Marketing, at Verizon Enterprise, will focus on finding the right mix of renting vs. buying Oracle capacity to scale to meet business demands, and offer validated Oracle database TCO models for Oracle development and testing environments. Anne Plese is a marketing and technology enthusiast/realist with over 19+ years in high tech. At Verizon Enterprise, she focuses on driving growth for the Verizon Cloud pla...
StackIQ offers a comprehensive software suite that automates the deployment, provisioning, and management of Big Infrastructure. With StackIQ’s software, you can spin up fully configured big data clusters, quickly and consistently — from bare-metal up to the applications layer — and manage them efficiently. Our software’s modular architecture allows customers to integrate nearly any application with the StackIQ software stack.
As Platform as a Service (PaaS) matures as a category, developers should have the ability to use the programming language of their choice to build applications and have access to a wide array of services. Bluemix is IBM's open cloud development platform that enables users to easily build cloud-based, creative mobile and web applications without having to spend large amounts of time and resources on configuring infrastructure and multiple software licenses. In this track, you will learn about the...
The Internet of Things will greatly expand the opportunities for data collection and new business models driven off of that data. In her session at Internet of @ThingsExpo, Esmeralda Swartz, CMO of MetraTech, will discuss how for this to be effective you not only need to have infrastructure and operational models capable of utilizing this new phenomenon, but increasingly service providers will need to convince a skeptical public to participate. Get ready to show them the money! Speaker Bio: ...
Compute virtualization has been transformational, yet security policy implementation and enforcement has lagged behind in agility and automation. There are a number of key considerations when implementing policy in private and hybrid clouds. In his session at 15th Cloud Expo, Holland Barry, VP of Technology at Catbird, will discuss the impact of this new paradigm and what organizations can do today to safely move to software-defined network and compute architectures, including: How normal ope...
Samsung VP Jacopo Lenzi, who headed the company's recent SmartThings acquisition under the auspices of Samsung's Open Innovaction Center (OIC), answered a few questions we had about the deal. This interview was in conjunction with our interview with SmartThings CEO Alex Hawkinson. IoT Journal: SmartThings was developed in an open, standards-agnostic platform, and will now be part of Samsung's Open Innovation Center. Can you elaborate on your commitment to keep the platform open? Jacopo Lenzi: S...
How do APIs and IoT relate? The answer is not as simple as merely adding an API on top of a dumb device, but rather about understanding the architectural patterns for implementing an IoT fabric. There are typically two or three trends: Exposing the device to a management framework Exposing that management framework to a business centric logic • Exposing that business layer and data to end users. This last trend is the IoT stack, which involves a new shift in the separation of what stuff hap...
SYS-CON Events announced today that SOA Software, an API management leader, will exhibit at SYS-CON's 15th International Cloud Expo®, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. SOA Software is a leading provider of API Management and SOA Governance products that equip business to deliver APIs and SOA together to drive their company to meet its business strategy quickly and effectively. SOA Software’s technology helps businesses to accel...
SYS-CON Events announced today that Utimaco will exhibit at SYS-CON's 15th International Cloud Expo®, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Utimaco is a leading manufacturer of hardware based security solutions that provide the root of trust to keep cryptographic keys safe, secure critical digital infrastructures and protect high value data assets. Only Utimaco delivers a general-purpose hardware security module (HSM) as a customiz...
Almost everyone sees the potential of Internet of Things but how can businesses truly unlock that potential. The key will be in the ability to discover business insight in the midst of an ocean of Big Data generated from billions of embedded devices via Systems of Discover. Businesses will also need to ensure that they can sustain that insight by leveraging the cloud for global reach, scale and elasticity.
SYS-CON Events announced today that ElasticBox is holding a Hackathon at DevOps Summit, November 6 from 12 pm -4 pm at the Santa Clara Convention Center in Santa Clara, CA. You can enter as an individual or team of up to 10 developers. A New Star Is Born Every Month! All completed ElasticBoxes will then be sent to a judging panel - 12 winners will be featured on the ElasticBox website in 2015. All entrants will receive five full enterprise licenses for one year + ElasticBox headphones + Elasti...
Once the decision has been made to move part or all of a workload to the cloud, a methodology for selecting that workload needs to be established. How do you move to the cloud? What does the discovery, assessment and planning look like? What workloads make sense? Which cloud model makes sense for each workload? What are the considerations for how to select the right cloud model? And how does that fit in with the overall IT tranformation? In his session at 15th Cloud Expo, John Hatem, head of V...
Cloud services are the newest tool in the arsenal of IT products in the market today. These cloud services integrate process and tools. In order to use these products effectively, organizations must have a good understanding of themselves and their business requirements. In his session at 15th Cloud Expo, Brian Lewis, Principal Architect at Verizon Cloud, will outline key areas of organizational focus, and how to formalize an actionable plan when migrating applications and internal services to...
SAP is delivering break-through innovation combined with fantastic user experience powered by the market-leading in-memory technology, SAP HANA. In his General Session at 15th Cloud Expo, Thorsten Leiduck, VP ISVs & Digital Commerce, SAP, will discuss how SAP and partners provide cloud and hybrid cloud solutions as well as real-time Big Data offerings that help companies of all sizes and industries run better. SAP launched an application challenge to award the most innovative SAP HANA and SAP ...
Ixia develops amazing products so its customers can connect the world. Ixia helps its customers provide an always-on user experience through fast, secure delivery of dynamic connected technologies and services. Through actionable insights that accelerate and secure application and service delivery, Ixia's customers benefit from faster time to market, optimized application performance and higher-quality deployments.
SYS-CON Events announced today that Calm.io has been named “Bronze Sponsor” of DevOps Summit Silicon Valley, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Calm.io is a cloud orchestration platform for AWS, vCenter, OpenStack, or bare metal, that runs your CL tools puppet, Chef, shell, git, Jenkins, nagios, and will soon support New Relic and Docker. It can run hosted, or on premise and provides VM automation / expiry, self-service portals,...
SYS-CON Events announced today that Aria Systems, the recurring revenue expert, has been named "Bronze Sponsor" of SYS-CON's 15th International Cloud Expo®, which will take place on November 4-6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Aria Systems helps leading businesses connect their customers with the products and services they love. Industry leaders like Pitney Bowes, Experian, AAA NCNU, VMware, HootSuite and many others choose Aria to power their recurring revenue bu...
The Internet of Things (IoT) is going to require a new way of thinking and of developing software for speed, security and innovation. This requires IT leaders to balance business as usual while anticipating for the next market and technology trends. Cloud provides the right IT asset portfolio to help today’s IT leaders manage the old and prepare for the new. Today the cloud conversation is evolving from private and public to hybrid. This session will provide use cases and insights to reinforce t...
Blue Box has closed a $10 million Series B financing. The round was led by a strategic investor and included participation from prior investors including Voyager Capital and Founders Collective, as well as the Blue Box executive team. This round follows a $4.3 million Series A closed in December of 2012 and led by Voyager Capital. In May of this year, the company announced general availability of its private cloud as a service offering, Blue Box Cloud. Since that release, the company has dem...