Read in Mar 2020
Book by Brendan Gregg
Gregg's previous book, Systems Performance, manages to be an excellent book on both operating systems and observability tooling. If he wrote Systems Performance today, it'd use BPF tools instead, which frankly would make it _the_ book. For now, you'll have to read both -- and read Systems Performance first.
I see this book as an amendment to Systems Performance that says "hey, we have BPF now, it's mega-powerful, and you should use that instead of SystemTap / whatever." It explains what BPF is: finally, we have a way to run user code in the mainline kernel, which can aggregate whatever metrics we like with minimal overhead.
He explains the different types of probes, how BCC and bpftrace add value on top of BPF, and techniques for using them efficiently. I think the level of detail here was great. The rest of the book is essentially a reference: each tool gets a short description of how it works. I'm not sure how valuable I find this, given that the tools are all open source and that little beyond each tool's name seems worth remembering. I skimmed through most of it and don't see myself referencing it again, since all of that is more readily available with Google.
Again, as was my pet peeve with Systems Performance, there's nothing about historical tooling. I can't not give four stars, though. It's hard to see who else could write this book. It's a joy to read something by someone who's such an expert in his field. The exercises are fantastic, and doing a few of them was the most value I derived from the book.