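包阅 reading guide summary
1. `Span Events`、`TraceQL`、`Error Tracking`、`Performance Monitoring`、`Structured Logging`
2. This article introduces span events: what they are and what they do, when to add them (error tracking, performance monitoring, sampled structured logging), and how to query them with TraceQL. It emphasizes how span events help with debugging and improving application performance.
3. Outline:
– Span events overview
– A span event is a specific point in time during a span's duration. It is lighter than adding more spans and, unlike a span attribute, it carries a timestamp, so it provides a deeper level of granularity
– When to add span events
– Error tracking: record exception events to trace the exact flow and sequence of actions that led to an error
– Performance monitoring: capture precise timestamps to measure the performance of specific steps within a span
– Sampled structured logging: events act as structured logs; because they are part of a trace, the flow of execution is easy to follow, and tracing offers richer sampling than traditional logs
– How to query span events with TraceQL
– TraceQL supports an event scope, so event names and attributes can be queried
– Examples illustrate the query syntax and the results
– Summary
– Span events can improve debugging and application performance, the latest TraceQL can query events, and Grafana Cloud is one way to get started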
Article URL: https://grafana.com/blog/2024/08/15/all-about-span-events-what-they-are-and-how-to-query-them/
Source: grafana.com
Author: Jennie Pham
Published: 2024/8/15 14:12
Language: English
Word count: 1050
Estimated reading time: 5 minutes
Score: 90
Tags: distributed tracing, span events, TraceQL, Grafana Cloud Traces, performance monitoring
The original article follows
If you’re already familiar with distributed tracing, you know that spans are the building blocks of traces. But are you sleeping on what span events can do for you?
First, you may need a wake-up call as to what a span event even is. While a span represents a unit of work or an operation within a trace, a span event is a unique point in time during the span’s duration. While spans help build the structural hierarchy of your services, span events can provide a deeper level of granularity to help debug your application faster and maintain optimal performance.
Why use span events instead of just adding more spans or span attributes? While you can always add more spans, span events are much more lightweight and won’t significantly increase the size of your traces. Span attributes are also lightweight, but they lack the ability to store timestamps, which is where events really shine.
In this blog post, we’ll cover when to add span events to gain different insights and how to query span events using TraceQL in Grafana Cloud Traces.
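To make the difference between span attributes and span events concrete, here is a minimal sketch of the data model. It uses only the Python standard library; the `Span` and `SpanEvent` classes are hypothetical stand-ins written for illustration, not types from any tracing SDK.

```python
from dataclasses import dataclass, field
import time


@dataclass
class SpanEvent:
    """A named point in time inside a span (hypothetical model, not an SDK type)."""
    name: str
    timestamp_ns: int
    attributes: dict = field(default_factory=dict)


@dataclass
class Span:
    """A unit of work with static attributes and timestamped events."""
    name: str
    start_ns: int
    end_ns: int = 0
    attributes: dict = field(default_factory=dict)  # key/value pairs, no timestamps
    events: list = field(default_factory=list)      # each entry has its own timestamp

    def add_event(self, name, attributes=None, timestamp_ns=None):
        # Default to "now"; an explicit timestamp is only needed when replaying data.
        ts = time.time_ns() if timestamp_ns is None else timestamp_ns
        event = SpanEvent(name, ts, dict(attributes or {}))
        self.events.append(event)
        return event


span = Span(name="update-user", start_ns=time.time_ns())
span.attributes["db.system"] = "postgresql"      # describes the span as a whole
span.add_event("updating database")              # marks a moment within the span
span.add_event("updated database", {"rows": 1})  # events can carry attributes too
span.end_ns = time.time_ns()
```

The attribute says something about the whole span, while each event records when something happened inside it.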
When to add span events
Now that we understand the benefits of span events, here are some ways to work them into your tracing strategy and gain more valuable insights from your tracing data.
Error tracking
Let’s be honest: People don’t typically check dashboards, logs, or traces when everything is running smoothly. But more eyes will pay attention when there is an alert that one of your pods is crashlooping because of an error in a function that is shared across multiple services with various entry points.
While traditional logs can capture errors, they sometimes lack the necessary context to help you fully chase down the issue. With tracing, however, you can record an exception event. This event becomes part of a span, which in turn is part of a comprehensive trace, allowing you to track the exact flow and sequence of actions leading up to the error.
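As a rough illustration of what an exception event can contain, the sketch below builds one by hand with the standard library. The event name `exception` and the `exception.type`, `exception.message`, and `exception.stacktrace` keys follow the OpenTelemetry semantic conventions; the `exception_event` helper, the `nth_item` function, and the plain list standing in for the active span's events are assumptions made for this example.

```python
import time
import traceback


def exception_event(exc):
    """Build an event dict describing `exc` (hypothetical helper)."""
    return {
        "name": "exception",
        "timestamp_ns": time.time_ns(),
        "attributes": {
            "exception.type": type(exc).__name__,
            "exception.message": str(exc),
            "exception.stacktrace": "".join(
                traceback.format_exception(type(exc), exc, exc.__traceback__)
            ),
        },
    }


def nth_item(items, index):
    # Shared function that fails when the index is out of bounds.
    return items[index]


span_events = []  # stands in for the event list of the active span
try:
    nth_item([1, 2, 3], 7)
except IndexError as exc:
    span_events.append(exception_event(exc))
```

In a real service a tracing SDK would typically attach this event to the current span for you; the point here is only the shape of the data that ends up in the trace.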
Performance monitoring
A major benefit of using span events is their ability to capture precise timestamps. This allows us to measure the performance of specific steps within a span.
For instance, if a request is taking longer than expected, we can log all intermediate steps as span events. We can track the time to establish a connection with an external service, the time to acquire a read lock, or even the time to execute a database query. This detailed information can help us pinpoint whether the delay is due to network latency, a data locking issue, or the need for database query optimization.
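The arithmetic behind this is simple: sort the events by timestamp and subtract neighbors. Here is a small sketch, assuming events are available as dictionaries with a `name` and a `timestamp_ns`; the event names and timestamps are illustrative, not measured data.

```python
def step_durations_ms(events):
    """Return (from_event, to_event, elapsed_ms) for each consecutive pair of events."""
    ordered = sorted(events, key=lambda e: e["timestamp_ns"])
    return [
        (a["name"], b["name"], (b["timestamp_ns"] - a["timestamp_ns"]) / 1e6)
        for a, b in zip(ordered, ordered[1:])
    ]


# Illustrative timestamps in nanoseconds since the start of the span.
events = [
    {"name": "connection established", "timestamp_ns": 12_000_000},
    {"name": "read lock acquired", "timestamp_ns": 15_000_000},
    {"name": "query executed", "timestamp_ns": 95_000_000},
]

for start, end, elapsed in step_durations_ms(events):
    print(f"{start} -> {end}: {elapsed:.1f} ms")
```

With these made-up numbers, most of the time falls between acquiring the lock and finishing the query, which would point toward the database query rather than the network or the lock.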
Sampled structured logging
Unlike span attributes, span events can be viewed as structured logs. One advantage of using span events over traditional logs is that, because they are part of a trace, they allow us to easily follow the flow of execution.
Another often overlooked benefit is that tracing has richer sampling capabilities than logs. Let’s say you want to monitor the inputs and outputs of a function that is executed across multiple entry points. But in this scenario, you don’t need the information about all the entry points; you only need to know about one or two specific entry points. With tracing, we can control head sampling rates for different entry points, so that we can isolate the data for the entry points we want, at the rate we need.
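One way to picture per-entry-point head sampling is a ratio check on the trace ID, made once when the trace starts. The sketch below is a simplified illustration rather than the configuration of any particular sampler; the entry points, the rates, and the `should_sample` function are all hypothetical.

```python
import random

# Hypothetical entry points and rates; anything unlisted uses DEFAULT_RATE.
SAMPLE_RATES = {"/checkout": 1.0, "/search": 0.1}
DEFAULT_RATE = 0.0


def should_sample(entry_point, trace_id):
    """Head-sampling decision, made once at the start of a trace."""
    rate = SAMPLE_RATES.get(entry_point, DEFAULT_RATE)
    # Compare the low 64 bits of the trace ID with the rate, so the decision
    # depends only on the trace ID and is the same wherever it is evaluated.
    bound = int(rate * (1 << 64))
    return (trace_id & ((1 << 64) - 1)) < bound


rng = random.Random(42)  # seeded only so the demo is repeatable
trace_ids = [rng.getrandbits(128) for _ in range(10_000)]
kept = {
    entry_point: sum(should_sample(entry_point, t) for t in trace_ids)
    for entry_point in ("/checkout", "/search", "/health")
}
print(kept)
```

Every trace entering through `/checkout` is kept, roughly one in ten through `/search`, and none through `/health`, so the span events recorded inside the shared function arrive only for the entry points we care about.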
How to query span events with TraceQL
The initial versions of TraceQL — the query language designed for selecting traces in Grafana Tempo, which also powers Grafana Cloud Traces — brought in the concept of scopes, supporting span, resource, and trace scopes. With the latest iteration, we’ve added support for event scope, allowing Tempo users to query for span events. More specifically, TraceQL now offers support for span event names and attributes.
Note: This new feature is currently available in Grafana Cloud Traces and will ship in the next Grafana Tempo OSS release, which we will announce soon.
Event name
To query for event names, simply use the syntax event:name. Below is an example query for spansets where there is at least one event with a name value of updated database.
{ event:name = "updated database" }
The query returned this matching trace where there is one span with one event matching the name updated database. By clicking on the span id, we can see more information about this event.
Here we can see that there were actually two events within this span: one event with the name updating database and another with the name updated database, which is the one we searched for. Based on the two events’ timestamps, we can see that it took about 60ms for the database operation to finish.
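To show what the event name filter selects and how the 60ms figure falls out of the timestamps, here is a small in-memory emulation. It is not how Tempo evaluates TraceQL; the span dictionaries, helper names, and timestamps are assumptions chosen to mirror the example above.

```python
def spans_with_event_name(spans, name):
    """Keep spans that have at least one event named exactly `name`."""
    return [s for s in spans if any(e["name"] == name for e in s["events"])]


def ms_between(span, first, second):
    """Elapsed milliseconds between the first events named `first` and `second`."""
    stamps = {}
    for event in span["events"]:
        # setdefault keeps the earliest occurrence of each event name.
        stamps.setdefault(event["name"], event["timestamp_ns"])
    return (stamps[second] - stamps[first]) / 1e6


# Illustrative spans; the timestamps are invented so the gap is 60 ms.
spans = [
    {"name": "GET /users", "events": []},
    {"name": "update-user", "events": [
        {"name": "updating database", "timestamp_ns": 1_000_000_000},
        {"name": "updated database", "timestamp_ns": 1_060_000_000},
    ]},
]

matches = spans_with_event_name(spans, "updated database")
gap_ms = ms_between(matches[0], "updating database", "updated database")
print(matches[0]["name"], gap_ms)
```

Only the span that contains an updated database event is returned, and the sibling updating database event in the same span is what lets us compute the duration.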
Event attribute
To query for event attributes, simply use the event. prefix followed by the attribute key. Below is an example of a query for an event attribute key called exception.message with the value partially matching index out of range.
{ event.exception.message =~ ".*index out of range.*" }
Here, we were able to find a matching trace with spans containing an event where the attribute exception.message partially matched index out of range. By clicking on the corresponding span id, we can see a detailed view of the trace.
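The same kind of emulation works for the attribute query. This sketch assumes the pattern has to match the whole attribute value, which is consistent with the .* on both sides of the query above; treat that as an assumption and check the TraceQL documentation for the exact regex semantics. The span data and the `spans_with_event_attr` helper are hypothetical.

```python
import re


def spans_with_event_attr(spans, key, pattern):
    """Keep spans with at least one event whose `key` attribute fully matches `pattern`."""
    regex = re.compile(pattern)
    matches = []
    for span in spans:
        for event in span["events"]:
            value = event["attributes"].get(key)
            if isinstance(value, str) and regex.fullmatch(value):
                matches.append(span)
                break  # one matching event is enough to select the span
    return matches


# Illustrative spans with exception events.
spans = [
    {"name": "list-items", "events": [
        {"name": "exception", "attributes": {
            "exception.message": "runtime error: index out of range [7] with length 3"}},
    ]},
    {"name": "get-item", "events": [
        {"name": "exception", "attributes": {"exception.message": "connection refused"}},
    ]},
]

hits = spans_with_event_attr(spans, "exception.message", r".*index out of range.*")
unanchored = spans_with_event_attr(spans, "exception.message", r"index out of range")
print([s["name"] for s in hits], [s["name"] for s in unanchored])
```

Under the whole-value assumption, the bare pattern selects nothing because the message has text before and after the phrase, which is why the wildcard wrappers matter for a partial match.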
With the trace view open, we can see the sequence of steps that took place up until the error.
Learn more about span events and TraceQL
Integrating span events into your tracing strategy can help you improve your debugging and application performance. They add an extra layer of granularity without blowing up your trace size. Additionally, tracing offers richer sampling capabilities compared to traditional logs, allowing you to selectively capture relevant data within span events. With the latest updates to TraceQL, you can now query these events to help you monitor performance and troubleshoot issues. So, what are you waiting for?
To learn more about tracing and TraceQL, check out our on-demand webinar “Getting started with tracing and Grafana Tempo.”
You can also reach out to the Grafana Tempo team on the Grafana Labs Community Slack channel or find Grafana Tempo in GitHub.
The easiest way to get started with tracing and TraceQL is with Grafana Cloud. We have a generous forever-free tier and plans for every use case. Sign up for free now!