The book describes a data-driven approach to optimal monitoring and alerting in distributed computer systems. It interprets monitoring as a continuous process aimed at extracting meaning from a system's data. The resulting wisdom drives effective maintenance and fast recovery, the bread and butter of web operations. The book gives a scalable perspective on the following topics: anatomy of monitoring and alerting; conclusive interpretation of time series; a data-driven approach to setting up monitors; addressing system failures by their impact; applications of monitoring in automation; reporting on quality with quantitative means; and more!
About the Author
Slawek Ligus works as a consultant and freelance developer based in Dublin, Ireland. He previously held the position of Web Operations Engineer at A9.com, where he contributed to the automation of monitoring and infrastructure behind Amazon's powerful search engine.
Book Information
ISBN 9781449333522
Author Slawek Ligus
Format Paperback
Page Count 140
Imprint O'Reilly Media
Publisher O'Reilly Media