I have just finished reading Systems Performance: Enterprise and the Cloud by Brendan Gregg.
As expected, this book is aimed primarily at Linux and Solaris system administrators. Even so, I found it very useful for understanding how to measure and diagnose issues with CPU, memory, storage, and networking.
Gregg promotes his USE methodology for investigating performance issues:
- Utilisation: how busy is the resource, as a share of time or capacity?
- Saturation: how much extra work is waiting (often queued) because the resource cannot service it yet?
- Errors: are there errors concerning the resource? Some resource managers fail requests rather than queue them.
He provides a whole chapter on how he solved a difficult performance problem through the use of this method.
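To make the three questions concrete for myself, here is a minimal sketch of a USE-style check in Python. It is not from the book: the `ResourceSample` type, the `use_findings` function, and the 90% utilisation threshold are all my own assumptions, and the numbers are invented for illustration. In practice the inputs would come from whatever tools report them on your system.

```python
from dataclasses import dataclass


@dataclass
class ResourceSample:
    """One observation of a resource (hypothetical structure, not from the book)."""
    name: str
    busy_fraction: float  # utilisation: share of time the resource was busy, 0.0 to 1.0
    queue_length: int     # saturation: work waiting that the resource cannot service yet
    error_count: int      # errors: failed operations seen during the interval


def use_findings(sample, busy_threshold=0.9):
    """Return a list of USE findings for one sample.

    The 0.9 threshold is an arbitrary assumption; pick one that suits the resource.
    """
    findings = []
    if sample.busy_fraction >= busy_threshold:
        findings.append(f"{sample.name}: high utilisation ({sample.busy_fraction:.0%})")
    if sample.queue_length > 0:
        findings.append(f"{sample.name}: saturated ({sample.queue_length} queued)")
    if sample.error_count > 0:
        findings.append(f"{sample.name}: {sample.error_count} errors")
    return findings


if __name__ == "__main__":
    # Invented sample values, one per resource.
    samples = (
        ResourceSample("cpu", 0.95, 4, 0),
        ResourceSample("disk", 0.30, 0, 2),
    )
    for sample in samples:
        for finding in use_findings(sample):
            print(finding)
```

The point of the sketch is that each resource gets the same three checks, so nothing is skipped just because one metric looks healthy: the disk above is barely utilised but still reports errors.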
Gregg also covers benchmarking:
"Benchmarking is surprisingly difficult to do well, with many opportunities for mistakes and oversights."
Gregg is a big fan of DTrace and provides numerous scripts throughout the book and in the appendix.
I will have to read the book again soon to pick up more ideas about performance tuning and diagnosis.