Spotting and Fixing Monitoring Smells: A Guide to Reliable Systems
Keeping production systems healthy and reliable is a challenge. Are your services running well? During high-traffic periods, can your system handle the load without bottlenecks or failures? What about dependencies? Are your third-party dependencies working fine, or is your cloud service provider having an outage? These are everyday challenges for DevOps and SRE teams alike.
Just as messy code gives clues about deeper issues (code smells), a monitoring system can give off “monitoring smells” when something isn’t right. These are signs that your monitoring setup isn’t as good as it should be. While I won’t be diving into how to build the perfect monitoring system, this article will explain what these smells are, how to notice them, and how to fix them.