@CloudExpo: Article

Log Management 101: Where Do Logs Come From?

Logs are machine data generated by any sort of application or the infrastructure used to run that application

We’ve had a lot of people asking for the Log Management Primer for a while now. And, surprisingly, many of these folks have a strong technical background, including developers. Some want it for themselves, and some want it to pass on to a colleague, manager, etc. I’m going to explain what logs are, where they come from and how you can get your logs.

If you’re a developer, this post probably isn’t for you, as we don’t dig into the code-level nitty-gritty, but it will give you a high-level overview of logs, where they come from and how they get sent to a third-party service.

Where Do Logs Come From?
Logs are machine data generated by any sort of application or the infrastructure used to run that application. They create a record of all the events that happen in an application system. They are also usually unstructured or semi-structured and often contain common parameters such as a time stamp, IP address, process ID, etc.
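To make "semi-structured" a bit more concrete, here is a minimal Python sketch that pulls the common parameters out of a single log line. The line format, the `LINE_PATTERN` regex and the `parse_line` helper are all invented for illustration; real logs vary a lot from one source to the next.

```python
import re
from datetime import datetime

# Hypothetical line format, made up for this example:
# "<ISO timestamp> <client IP> <program>[<pid>]: <free-text message>"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+"
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"(?P<program>[\w.-]+)\[(?P<pid>\d+)\]:\s+"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return the common fields as a dict, or None if the line does not match."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the structured parts; the message stays as free text.
    fields["timestamp"] = datetime.fromisoformat(fields["timestamp"])
    fields["pid"] = int(fields["pid"])
    return fields


sample = "2014-03-01T12:30:45 192.168.1.10 webapp[4321]: user login failed for alice"
print(parse_line(sample))
```

The time stamp, IP address and process ID come out as typed fields, while the message itself remains unstructured text, which is the usual shape of log data.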

Logs can come from throughout the application stack, including:

  • Mobile Apps/Devices
  • Client Browser
  • Web Application code
  • Platform as a Service
    • Application server
    • Database
    • Router
  • Infrastructure as a Service
    • Operating System
    • Other Services (think AWS S3, RDS etc…)
  • On Premise
    • Virtual Machines
    • Hypervisor
    • Network
  • Hardware
    • Server
    • Etc…

[Figure: application stack graphic]

How to Get Your Logs
Pretty much all levels of the application stack kick out log data. Different levels lend themselves to different methods of gaining access, though there are some similarities across the levels.

File System
For the most part, logs are sent to a file system by default. Without further action, that’s where they’d meet an untimely demise as they’re deleted to make way for new logs. The file system is not an ideal place for long-term storage of this data, and often only a relatively small amount of data is kept there because logs get rotated once they reach a certain size. If you want to store more of your log data, or if you want to perform any analysis, graphing, alerting, tagging, etc., then you’ll need more than just the file system. Often people will archive logs periodically (e.g., to S3) or will send them to a third-party service.
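Size-based rotation is easy to see in action. The sketch below uses `RotatingFileHandler` from Python's standard `logging.handlers` module; the tiny 200-byte limit, the two-backup setting and the temporary directory are assumptions chosen only to make rotation visible quickly, not recommended values.

```python
import logging
import logging.handlers
import os
import tempfile


def build_rotating_logger(path, max_bytes=200, backup_count=2):
    """Create a logger whose file is rotated once it reaches max_bytes."""
    logger = logging.getLogger("rotation_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backup_count
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger, handler


log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "app.log")
logger, handler = build_rotating_logger(log_path)

# Write far more data than three 200-byte files can hold.
for i in range(50):
    logger.info("event number %d", i)

handler.close()
logger.removeHandler(handler)

# Only the live file and two backups remain; the oldest events are gone.
print(sorted(os.listdir(log_dir)))
```

Fifty events go in, but only what fits in the current file plus two backups survives. That is the "untimely demise" in miniature, and it is why anything you want to keep has to leave the file system.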

Everything from the application down through the hardware level can send logs to the file system…it’s just a matter of how. For example, your applications, the app server, database, OS, and VMs will all normally send data straight to a file system.

So, now that you’ve got your logs flying to the file system, what do you do to get them out of there and into somewhere with a bit more longevity? When sending them on to a third-party logging tool, there are two main ways to do this: via syslog or via a collector agent.

Syslog
Syslog is the protocol you’ll use, most likely, if you’re running a Linux setup. If you’re not running Linux, move on; Syslog is the domain of Linus Torvalds and those who use his creation.

Once you’ve sent logs to the file system, Syslog will step in and forward these to your log analysis tool. Note – you’ll need to configure syslog accordingly. Or, in some cases, you can use it to forward directly from the PaaS layer (e.g. logplex from Heroku supports this).

There are several flavors of Syslog, including syslogd and rsyslog.

The benefits of Syslog are:

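  • It’s available out of the box on all Linux distros
  • It can be secured
    • You can send logs via TCP (reliable delivery) or UDP (no delivery guarantee); neither encrypts on its own, so use syslog over TLS when logs cross an untrusted network
  • It’s a known, standard protocol that is widely supported (for instance, see our documentation on Syslog)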
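To give a feel for what actually goes over the wire, here is a small Python sketch using the standard library's `SysLogHandler`. It sends one message over UDP to a throwaway local socket that stands in for a real syslog receiver; the `myapp` tag, the logger name and the message text are made up for the example.

```python
import logging
import logging.handlers
import socket

# Stand-in for a syslog receiver: a UDP socket bound to a free local port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
host, port = receiver.getsockname()

# Standard-library handler that formats records as syslog messages.
handler = logging.handlers.SysLogHandler(
    address=(host, port),
    facility=logging.handlers.SysLogHandler.LOG_USER,
    socktype=socket.SOCK_DGRAM,  # UDP: fire and forget
)
handler.setFormatter(logging.Formatter("myapp: %(levelname)s %(message)s"))

logger = logging.getLogger("syslog_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

logger.info("disk usage at 91 percent")

# The datagram starts with the syslog priority value in angle brackets.
datagram, _ = receiver.recvfrom(4096)
print(datagram)

logger.removeHandler(handler)
handler.close()
receiver.close()
```

Passing `socktype=socket.SOCK_STREAM` instead makes the same handler use TCP, provided the receiver accepts TCP connections. Neither option encrypts the message, which is worth bearing in mind before sending logs across a network you don't control.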

The downsides? It can be challenging to configure if you don’t know what you are doing, although if you are following a good set of docs it should take no longer than a few minutes. Also, older flavors of Linux can have limitations: with syslogd, for example, you cannot send data from non-syslog log files that live elsewhere on your file system, outside the /var/log folder where all your syslog logs live. Rsyslog solves this, however, and ships with most distros these days. Finally, although Snare is a Windows equivalent of syslog, this approach is largely only used on Linux systems.

Agent
An agent is a lightweight application that runs on your server and (in this case) forwards your logs from the file system to your log management tool. Agents are great for when you’re not just pulling logs out of your Linux box (where you may just use Syslog). That said, you can still use them on Linux if you’re so inclined. They’ll usually send logs to your cloud log solution via an API.

The benefits of agents (or at least the Logentries agent):

  • Quick
  • Easy to set up
  • Secure (they send over TCP, which can be wrapped in TLS for encryption)
  • You can modify the source to filter sensitive data from being logged

The downsides are:

  • It must be updated appropriately, although the Logentries agent is plugged into the relevant Linux package management systems, so this is taken care of in this instance
  • Sometimes people are reluctant to run unknown pieces of code on their systems. We’ve open sourced our agent for this reason, so you can look at exactly what is running on your machine. That being said, you may not have the time or the inclination to do this and may prefer a more tried and tested approach like syslog.
  • Scaling issues can also arise when deploying agents, e.g. when you’re trying to deploy on ~100 servers. Do you want to do that manually? Luckily, there are tools like Puppet or Chef to automate this.
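If you're curious what an agent does under the hood, the following Python sketch shows the basic idea: remember how far into the file you have read, forward any new complete lines, and filter sensitive data on the way out. It is not the Logentries agent; `TinyAgent`, the `password=` redaction rule and the `send` callable (a stand-in for the real network call) are all hypothetical.

```python
import os
import re
import tempfile

# Hypothetical redaction rule: mask anything that looks like "password=<value>".
SENSITIVE = re.compile(r"(password=)\S+")


class TinyAgent:
    """Forwards lines appended to a log file since the last poll."""

    def __init__(self, path, send):
        self.path = path
        self.send = send   # callable taking one string; stands in for the network call
        self.offset = 0    # byte position already forwarded

    def poll(self):
        """Forward any complete new lines and return how many were sent."""
        if os.path.getsize(self.path) < self.offset:
            self.offset = 0  # file was truncated or rotated; start over
        sent = 0
        with open(self.path, "rb") as handle:
            handle.seek(self.offset)
            while True:
                raw = handle.readline()
                if not raw.endswith(b"\n"):
                    break  # end of file or a partially written line; retry next poll
                self.offset = handle.tell()
                line = raw.decode("utf-8", errors="replace").rstrip("\r\n")
                self.send(SENSITIVE.sub(r"\1[REDACTED]", line))
                sent += 1
        return sent


log_path = os.path.join(tempfile.mkdtemp(), "app.log")
forwarded = []

with open(log_path, "w", encoding="utf-8") as f:
    f.write("login ok user=alice password=hunter2\n")
agent = TinyAgent(log_path, forwarded.append)
agent.poll()

with open(log_path, "a", encoding="utf-8") as f:
    f.write("cache miss key=42\n")
agent.poll()

print(forwarded)
```

A real agent would run `poll` on a timer or react to file system events, and would replace `forwarded.append` with a call to the logging service. The offset tracking and the filtering step are the parts that carry over.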

Libraries
Libraries can be set up to send logs to a logging service from the application layer via an API. Each library supports a specific language (e.g. Java, Ruby, Node.js, C#, Python, etc.). The benefit of libraries is that you can still get your logs even if you only have access at the application code level. Many PaaS providers do not provide file system access or a way to forward logs to a third-party service, so libraries are a must in this case.
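As a rough illustration of the library approach, here is a Python sketch of a custom `logging.Handler` that turns each log record into JSON and hands it to a transport function. `ApiLogHandler`, the payload fields and the `EXAMPLE-TOKEN` value are invented for this example and do not describe any particular vendor's library; the `post` callable stands in for the HTTPS request a real library would make.

```python
import json
import logging


class ApiLogHandler(logging.Handler):
    """Hypothetical handler: serializes each record to JSON and hands it to `post`."""

    def __init__(self, post, token):
        super().__init__()
        self.post = post     # callable(bytes); a real library would send this to an API
        self.token = token   # made-up account token identifying the sender

    def emit(self, record):
        try:
            payload = {
                "token": self.token,
                "level": record.levelname,
                "logger": record.name,
                "message": record.getMessage(),
                "created": record.created,
            }
            self.post(json.dumps(payload).encode("utf-8"))
        except Exception:
            # Logging should never crash the application.
            self.handleError(record)


outbox = []  # collects what would have been sent over the network
logger = logging.getLogger("library_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(ApiLogHandler(outbox.append, token="EXAMPLE-TOKEN"))

logger.warning("payment retry %d for order %s", 2, "A17")
print(json.loads(outbox[0])["message"])
```

Because everything happens inside the application process, this works even when you have no access to the file system, which is exactly the PaaS situation described above.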

Client-side libraries also allow you to get a view into what is happening from an end user’s perspective. For example, they can allow you to log from your end user’s browser so that you can get a full end-to-end view of your system. You can use our le.js library to do just that!

Libraries can also be used to log from your mobile apps – check out our Android library for this.

Conclusion
So there you have it: now you know where logs come from and how your logs get from the different parts of your application stack to your log analyzer, from the browser, to the backend, to the log management solution of your choice.

More Stories By Trevor Parsons

Trevor Parsons is Chief Scientist and Co-founder of Logentries. Trevor has over 10 years’ experience in enterprise software and, in particular, has specialized in developing enterprise monitoring and performance tools for distributed systems. He is also a research fellow at the Performance Engineering Lab Research Group and was formerly a Scientist at the IBM Center for Advanced Studies. Trevor holds a PhD from University College Dublin, Ireland.
