Observability

Distributed Tracing Explained: OpenTelemetry & Jaeger Tutorial

Your users are complaining that your application is slow. Sometimes it takes 8 seconds to respond, other times 2 seconds. But when you check your metrics, everything looks fine. Average response times are acceptable. All services report healthy. Your dashboards are green.

So either your users are idiots, or you’re not capable of capturing what’s actually happening with their requests. Now, I tend to assume users are right. Which means I’d have to call you… Well… I’m not going to do that. Instead, I’m going to show you why you can’t see what’s really happening.

Here’s what you’re about to learn. You’ll see exactly how to track requests as they flow through dozens of microservices, identify which specific operation is causing delays, and understand why your traditional observability tools are lying to you. By the end of this video, you’ll know how to implement distributed tracing that actually shows you what’s happening in your system.

Let’s start with why this problem exists in the first place.

Testing in Production! Progressive Delivery with Canary Deployments Explained!

Today I will make an outrageous claim. Ready? Here it goes… The only testing that truly matters is testing in production. The only way to truly verify that a release is working as expected is to run it in production with “real” users and a “real” workload. Testing a release before it reaches production is helpful, and I am certainly not going to tell you to stop writing and running your unit tests, functional tests, integration tests, and whatever other types of testing you might normally do. What I am going to tell you is that you have to test your releases in production. Confirmation that “real” users got what they expected is the only thing that truly matters.

Mastering Kubernetes Debugging: Leveraging eBPF with Inspektor Gadget

The Kubernetes ecosystem is one of the most extensive, if not the most extensive, we’ve ever seen. There are tools for everything, including observability. We can collect metrics, logs, and traces for almost anything, we can query them, and we can see them in dashboards. There are hundreds of solutions for that alone, yet sometimes I miss the simplicity of the tools I would normally use in Linux. Sometimes I crave simple commands similar to those I would use when trying to figure out what’s going on in a single server.

…and a few others.