What’s New in the World of OpenTelemetry?
Lately, OpenTelemetry seems to have become the de facto standard for aggregating and processing telemetry data, especially with the introduction of some of its latest features. At the time of writing, OpenTelemetry is the second most active Cloud Native Computing Foundation (CNCF) project in terms of GitHub activity, right behind Kubernetes. That’s impressive for an incubating project, especially given the wealth of other observability-related open source projects.
The OpenTelemetry project boasts thousands of contributors and thousands more adopters. Aside from usage growth, what’s new with this project? Well, a lot. For one thing, with the recent general availability of the logging signal, OpenTelemetry now encompasses complete telemetry, including metrics, logs and traces. There has also been an expanding collection of libraries, services and apps now integrating OpenTelemetry in production.
At KubeCon + CloudNativeCon 2023, OpenTelemetry took the main stage and was referenced across many sessions. I also had the opportunity to chat with Morgan McLean, director of product management at Splunk and co-founder of OpenTelemetry. Below, we’ll check in on the status of OTel, learn more about the latest advancements and consider the project’s outlook for the future.
OpenTelemetry Support For Logging
OpenTelemetry was christened in 2019 when the OpenTracing and OpenCensus projects were merged, effectively creating a one-stop shop for distributed tracing and metrics collection. The metrics and traces capabilities saw good support through OTel, explained McLean, but you always needed to accompany OTel with something else to enable logs.
Now, the stable integration of logs into the project “should signal even greater acceleration,” he said. We’ve already seen vendor support for OTel’s logging functionality, such as within Splunk’s OTel Collector for Kubernetes.
OpenTelemetry’s native support for logs could also replace the need for log-specific open source projects, most notably Fluentd or Fluent Bit—although McLean doesn’t see the need to jump ship immediately. There is still a bridge for OpenTelemetry with the Fluent ecosystem, and the two can continue to work together, especially to support legacy use cases, he said.
Standardized HTTP Semantics
Many other minor updates have been made to the OpenTelemetry project recently. One such subtle change with a big impact, noted McLean, is the standardization of HTTP semantic conventions, which was recently declared a stable feature. One HTTP metric is latency, and it’s common for systems to express this metric in disparate formats.
Essentially, standardizing semantics for HTTP metrics will ensure the metadata generated around HTTP is coherent, said McLean, which should make it easier to compare and analyze metrics across environments. Since many of the connected tools we use converse over HTTP, this stable feature should give DevOps and SRE professionals more insight into traffic behaviors and overall stability.
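To make the latency example concrete, here is a minimal Python sketch of what that kind of normalization looks like. It is not OpenTelemetry code: the raw records, field names and helper functions are all hypothetical. The target metric name (http.server.request.duration, in seconds) and the attribute keys reflect my reading of the stable HTTP semantic conventions, so check them against the specification before relying on them.

```python
# Hypothetical raw records: each service reports latency under its own
# field name and unit. None of these names come from a real system.
RAW_RECORDS = [
    {"service": "checkout", "latency_ms": 120, "method": "GET", "status": 200},
    {"service": "search", "duration_us": 45000, "verb": "POST", "code": 201},
    {"service": "auth", "elapsed_s": 0.3, "http_method": "GET", "status_code": 401},
]

# Conversion factor from each ad hoc duration field into seconds.
DURATION_FIELDS = {"latency_ms": 1e-3, "duration_us": 1e-6, "elapsed_s": 1.0}
METHOD_FIELDS = ("method", "verb", "http_method")
STATUS_FIELDS = ("status", "code", "status_code")


def first_present(record, fields):
    """Return the value of the first field in `fields` that the record contains."""
    for name in fields:
        if name in record:
            return record[name]
    raise KeyError(f"none of {fields} found in record")


def normalize(record):
    """Map one ad hoc record onto a single shared name, unit and attribute set."""
    for field, factor in DURATION_FIELDS.items():
        if field in record:
            seconds = record[field] * factor
            break
    else:
        raise KeyError("no known duration field in record")
    return {
        # Assumed to match the stable HTTP semantic conventions.
        "name": "http.server.request.duration",
        "unit": "s",
        "value": seconds,
        "attributes": {
            "http.request.method": first_present(record, METHOD_FIELDS),
            "http.response.status_code": first_present(record, STATUS_FIELDS),
        },
    }


normalized = [normalize(r) for r in RAW_RECORDS]
for point in normalized:
    print(point)
```

Once every service emits the same name, unit and attribute keys, a dashboard or query can compare latency across them without per-service translation logic like the tables above.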
Outlook For the Future
A CNCF microsurvey on observability conducted in late 2021 found that 49% of end users were using OpenTelemetry for observability. That percentage has likely increased over the last two years since the project has matured with more production use cases. For example, Tyk API Gateway recently declared first-class support for OpenTelemetry.
So, what should we expect from OpenTelemetry in the near future? Well, McLean shared some exciting goalposts that the OTel community is working toward.
Support for the Frontend
Most observability solutions have historically focused on backend services, but OpenTelemetry’s ambition is to be a common source of telemetry data for anything, said McLean. As such, the community is looking into expanding OTel to support client applications like Android, iOS or web-based frontends. By aggregating common metrics for the frontend, engineers could gain complete visibility into the entire customer experience.
Expanding to the Mainframe
Another piece of the observability puzzle is how we deal with mainframe servers. In the past, teams have used specialized tool sets to gain visibility into the mainframe. However, the OpenTelemetry community is working on expanding into this area so the project can support legacy or on-premises environments, which, let’s face it, are still pretty pervasive throughout enterprises’ hybrid multi-cloud setups.
Profiling: The Fourth Signal Type
You thought telemetry data was only metrics, logs and traces, did you? Well, think again. Profiling is the new signal type on the scene, and it’s set to empower us with deep performance information—at the level of function calls in code. Profiling could be useful for pinpointing slow speeds or errors produced by specific code expressions.
“It’s amazing the performance gains and cost savings you can find using profiling,” said McLean, who estimated that this feature could save companies double-digit percentages on their compute spend, opening a new frontier for FinOps. “Going down all the way to code is somewhat new and unexplored, so that’s exciting.”
Setting precise timelines for open source features is tricky, but there is a dedicated OpenTelemetry Profiling working group, and McLean estimates something tangible could be realized in late 2024.
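For readers who have not used a profiler, the short Python sketch below shows the kind of function-level data the term refers to. It uses Python’s built-in cProfile module, not anything from OpenTelemetry, whose profiling signal was still being defined at the time of writing. The functions slow_sum and handle_request are hypothetical stand-ins for application code.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical hot spot: a deliberately naive loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def handle_request():
    """Hypothetical request handler that calls the hot spot."""
    return slow_sum(200_000)


# Record which functions run and how long each one takes.
profiler = cProfile.Profile()
profiler.enable()
result = handle_request()
profiler.disable()

# Render the five most expensive entries, sorted by cumulative time.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

The report lists each function with its call count and time spent, which is the level of detail that lets a team trace a slow request back to a specific piece of code.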
Prediction: More Collector Pre-Processing
In addition to the features on the roadmap, McLean predicts we could soon see more emphasis on OpenTelemetry collector mechanics, such as data pre-processing, filtering or routing. For example, a bank operating thousands of VMs might expose sensitive information in its logs. Such information would need to be removed before being sent to downstream observability user interfaces and monitoring tools, while another unredacted version would need to be stored somewhere for safekeeping.
OTel might be used for this sort of extraction and reprocessing and potentially pointing to different destinations on demand, said McLean. Using collectors to process data could also root out noisy, unhelpful data, thus reducing storage and processing costs.
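The bank scenario can be sketched in a few lines of plain Python. This is an illustration of the idea only, not the OpenTelemetry Collector’s configuration or API. The log records, the regular expression and the process function are all hypothetical, and the rule that any run of 12 to 16 digits is sensitive is an assumption made for the example.

```python
import re

# Hypothetical rule: treat any run of 12 to 16 digits as an account or card number.
SENSITIVE = re.compile(r"\b\d{12,16}\b")


def process(logs):
    """Split logs into a redacted stream for tooling and a raw archive copy."""
    redacted, archive = [], []
    for entry in logs:
        # Route 1: keep an unredacted copy of everything for safekeeping.
        archive.append(dict(entry))
        # Filter: drop noisy records before they reach downstream tools.
        if entry["severity"] == "DEBUG":
            continue
        # Route 2: mask sensitive values before forwarding.
        clean = dict(entry)
        clean["body"] = SENSITIVE.sub("[REDACTED]", entry["body"])
        redacted.append(clean)
    return redacted, archive


# Hypothetical log records.
logs = [
    {"severity": "INFO", "body": "card 4111111111111111 declined"},
    {"severity": "DEBUG", "body": "cache hit for key 42"},
    {"severity": "ERROR", "body": "transfer failed for account 123456789012"},
]

redacted, archive = process(logs)
for entry in redacted:
    print(entry)
```

Here the archive keeps all three records untouched, while the downstream stream receives only the two non-debug records with the digit runs masked. That matches the split McLean describes: one cleaned, smaller feed for observability tools and one complete copy stored elsewhere.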
Cloud-Native, Open Source Observability
Some caution against a one-size-fits-all approach to observability and advocate treating it more as a discipline. “More companies will realize that observability is a practice, not a product,” noted Milin Desai, CEO of Sentry. “The current craze for one-size-fits-all observability products ignores the specialized needs of specific personas and their workflows in areas like security, data, SRE and software development.”
This underscores the need for open source software that is tooling- and vendor-agnostic. And it seems like OpenTelemetry is bridging this gap quite well. “We have gotten lucky that many end users are contributing to the project,” said McLean. It’s hard to get data out of systems, and this just speaks to the value of OpenTelemetry for solving a problem that needs solving, he said.