Keep an eye inside your critical apps

If you run a large technical solution, you probably have many applications in your production environments. Not all of them are critical, but some are, and those critical apps need deep, comprehensive observability.

I see two levels to monitor: the low level and the product (or high) level.

The low level is common in modern, mature apps; it includes metrics such as RAM usage, CPU usage, thread counts, network traffic, garbage collection (GC), disk usage, log errors, and so on. These metrics show how stressed a host is and tell you when to adjust resource allocation. Important: don’t deceive yourself. Be realistic, and check metrics under both low- and high-load conditions. If you’re not familiar with observability, now is a good time to start.

Using .NET instrumentation, Prometheus, and Grafana, you can create dashboards such as the “ASP.NET OTEL Metrics” dashboard.
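To make the low-level metrics concrete, here is a minimal sketch of a snapshot function. It is written in Python with only the standard library, purely for illustration; it is not the .NET instrumentation mentioned above, and the function name and metric keys are my own assumptions. In a real setup you would expose values like these to a collector rather than build dictionaries by hand.

```python
import gc
import os
import shutil
import threading
import time


def collect_low_level_metrics(path="."):
    """Return a snapshot of a few process- and host-level metrics.

    The metric names are hypothetical; pick names that match your dashboards.
    """
    disk = shutil.disk_usage(path)  # total, used, free (in bytes)
    return {
        "timestamp": time.time(),
        "cpu_count": os.cpu_count(),
        # CPU time consumed by this process so far, in seconds.
        "process_cpu_seconds": time.process_time(),
        # Threads currently alive in this process.
        "thread_count": threading.active_count(),
        # Number of youngest-generation GC runs since the process started.
        "gc_gen0_collections": gc.get_stats()[0]["collections"],
        # Share of the disk holding `path` that is in use (0.0 to 1.0).
        "disk_used_ratio": disk.used / disk.total if disk.total else 0.0,
    }


if __name__ == "__main__":
    for name, value in collect_low_level_metrics().items():
        print(f"{name}: {value}")
```

Taking a snapshot like this while the app is idle and again under load gives you the two views the paragraph above asks for.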

The product/high level is less common. It measures how your app performs its business functions and must answer business questions. This is very specific to your context: think about which HTTP requests, message-processing flows, or internal actions must complete within a target time or maintain an acceptable error rate. Those targets are your SLAs. Create mechanisms to measure them and an alerting system so you can react before it’s too late.
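One possible shape for such a mechanism is sketched below in Python. Everything here is an assumption made for illustration: the `SlaTracker` class, its thresholds, and the sliding window size are hypothetical, not part of any library. The idea is to time each business action, record whether it succeeded, and report when the recent samples break the latency or error-rate target.

```python
import time
from collections import deque
from contextlib import contextmanager


class SlaTracker:
    """Tracks latency and error rate for one business action (hypothetical helper)."""

    def __init__(self, name, max_latency_s, max_error_rate, window=100):
        self.name = name
        self.max_latency_s = max_latency_s
        self.max_error_rate = max_error_rate
        # Keep only the most recent `window` samples: (duration_s, succeeded).
        self.samples = deque(maxlen=window)

    def record(self, duration_s, succeeded):
        self.samples.append((duration_s, succeeded))

    @contextmanager
    def measure(self):
        """Time the wrapped block; an exception counts as a failure and is re-raised."""
        start = time.perf_counter()
        try:
            yield
        except Exception:
            self.record(time.perf_counter() - start, False)
            raise
        else:
            self.record(time.perf_counter() - start, True)

    def error_rate(self):
        if not self.samples:
            return 0.0
        failures = sum(1 for _, ok in self.samples if not ok)
        return failures / len(self.samples)

    def violations(self):
        """Return human-readable descriptions of the targets currently broken."""
        problems = []
        if not self.samples:
            return problems
        slowest = max(duration for duration, _ in self.samples)
        if slowest > self.max_latency_s:
            problems.append(f"{self.name}: slowest call took {slowest:.3f}s")
        if self.error_rate() > self.max_error_rate:
            problems.append(f"{self.name}: error rate is {self.error_rate():.1%}")
        return problems


if __name__ == "__main__":
    checkout = SlaTracker("checkout", max_latency_s=0.5, max_error_rate=0.05)
    with checkout.measure():
        time.sleep(0.01)  # stand-in for the real business action
    for problem in checkout.violations():
        print("ALERT:", problem)  # in practice, send this to your alerting system
```

Checking the slowest call is deliberately strict; depending on the SLA, a percentile over the window may be a better fit.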

Once you have one or both levels implemented, use them to monitor apps in production and as part of QA regression testing before a release. Keep in mind that observability has a cost (development, infrastructure, maintenance). Collect and store metrics and logs sensibly.
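For the QA regression use case, a simple gate can compare latency samples from the release candidate against a baseline from the current production version. The Python sketch below is one way to do it, under my own assumptions: the function names and the 10% tolerance are hypothetical, and each sample list needs at least two values.

```python
import statistics


def p95(samples):
    """95th percentile of a list of durations (needs at least two samples)."""
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    return statistics.quantiles(samples, n=20)[18]


def latency_regressed(baseline, candidate, tolerance=0.10):
    """True if the candidate's p95 exceeds the baseline's p95 by more than `tolerance`."""
    return p95(candidate) > p95(baseline) * (1 + tolerance)


if __name__ == "__main__":
    baseline = [0.10, 0.12, 0.11, 0.13, 0.10, 0.12, 0.11, 0.14]
    candidate = [0.20, 0.22, 0.21, 0.25, 0.19, 0.23, 0.22, 0.24]
    if latency_regressed(baseline, candidate):
        print("Latency regression detected: block the release")
```

Both sets of samples should come from comparable load, otherwise the comparison says more about the test run than about the release.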

Do you use any form of observability? Leave a comment below.
