Reading the Linux Performance Tools Map: A Closer Look at the Heavy Hitters

Linux observability tools

After the basic commands, the next layer is made up of tools that are much more capable—and in some cases much more dangerous if used carelessly in production. The main ones worth knowing here are:

sar
netstat
pidstat
strace
tcpdump
blktrace
iotop
slabtop
sysctl
/proc

sar

sar stands for System Activity Reporter, and it is absurdly powerful. Many of the areas usually covered by separate tools—CPU, memory, disk, and network activity—can be observed through sar alone. If you want a quick sense of its scope, just run sar -A 1 1 and look at how much it reports.

One of its biggest strengths is that it can collect statistics periodically. Because of that, a lot of system-level monitoring is built on top of sar. There are also extended versions built around the same idea, adding application-level monitoring on top of the original system metrics.
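
As a rough illustration of interval-based collection, the sketch below asks sar for a few samples from three different subsystems. The helper name sar_sample is hypothetical (it is not part of sysstat), and the choice of reports is only an assumption about what is worth watching first.

```shell
#!/bin/sh
# Hypothetical helper: sample a few subsystems with sar.
# Arguments: interval in seconds (default 1), number of samples (default 5).
sar_sample() {
    interval=${1:-1}
    count=${2:-5}
    sar -u "$interval" "$count"       # CPU utilization
    sar -r "$interval" "$count"       # memory utilization
    sar -n DEV "$interval" "$count"   # per-interface network traffic
}

# Example: three samples, two seconds apart, for each report.
# sar_sample 2 3
```

Each call blocks for roughly interval × count seconds, so the three reports are taken one after another rather than at the same moment.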

netstat

netstat is another network-focused tool. It is useful for examining socket connections: what TCP connections exist, what state they are in, and how many connections a process owns.

In practice, netstat is often only the first step. Its output is commonly piped into grep or awk for further filtering and counting, to the point that processing netstat output has become a classic example in some awk tutorials.

It also has a -s option, which summarizes packet statistics for different protocols.
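
The grep and awk post-processing mentioned above might look something like the following sketch, which tallies TCP connections by state. The function name count_tcp_states is made up for illustration, and it assumes the usual netstat -ant layout where the state is the sixth column.

```shell
#!/bin/sh
# Hypothetical helper: count TCP connections per state.
# Reads `netstat -ant` style output on stdin.
count_tcp_states() {
    # Keep only rows whose first column starts with "tcp" (this skips the
    # header lines), tally column 6, then sort by count, largest first.
    awk '$1 ~ /^tcp/ { n[$6]++ } END { for (s in n) print n[s], s }' | sort -rn
}

# Example:
# netstat -ant | count_tcp_states
```

Because the function only reads standard input, the same filter can be run against saved netstat output when comparing snapshots.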

pidstat

The name makes it sound like it should be related to ps, and in a way it is. The information it shows is similar, but the important difference is in how it is collected.

ps gives you a one-time snapshot of process state. pidstat, on the other hand, can target a process and report its statistics repeatedly over time. That makes it a much finer-grained way to watch process behavior.

For monitoring, this is especially useful when a particular process needs special attention.
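
A minimal sketch of watching one process over time could look like this. The wrapper name watch_pid is hypothetical; the pidstat options used are -u (CPU), -r (memory), -d (disk I/O), and -p (target PID), followed by an interval and a count.

```shell
#!/bin/sh
# Hypothetical helper: report CPU, memory, and I/O statistics for one process.
# Arguments: PID, interval in seconds (default 1), number of reports (default 5).
watch_pid() {
    pid=$1
    interval=${2:-1}
    count=${3:-5}
    pidstat -u -r -d -p "$pid" "$interval" "$count"
}

# Example: watch the current shell, once per second, five times.
# watch_pid $$ 1 5
```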

strace

strace feels more like an application-facing performance tool than a purely system-wide one. It shows which system calls a process makes and which signals it handles.

That also makes it a great tool for understanding how a program works internally. For example, if you want to know where pidstat gets its information, you can trace which files it opens:

strace pidstat |& grep open

Since strace writes its output to standard error, the & in the pipeline matters: in Bash, |& is shorthand for 2>&1 |, so grep receives standard error as well as standard output.

This command is extremely powerful, but it is also extremely expensive. A slowdown of roughly an order of magnitude is not unusual. It is something to use during analysis, not something to leave enabled during normal execution.
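
To go one step beyond grep, the sketch below pulls just the file paths out of a saved trace. The function name opened_paths is hypothetical, and it assumes plain strace output where each line starts with the system call name (no PID prefix from -f, no timestamps).

```shell
#!/bin/sh
# Hypothetical helper: list paths that were successfully opened.
# Reads strace output on stdin.
opened_paths() {
    # Split on double quotes so field 2 is the first quoted argument (the path).
    # Keep open()/openat() lines and drop calls that returned -1.
    awk -F'"' '/^open(at)?\(/ && !/ = -1 / { print $2 }'
}

# Example: save the trace with -o, then filter it.
# strace -o trace.txt pidstat
# opened_paths < trace.txt
```

Writing the trace to a file with -o also sidesteps the standard error redirection entirely.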

tcpdump

tcpdump is one of the best-known packet capture tools. It can save packets passing through a network interface so they can later be examined with software such as Wireshark.

If the network interface is put into promiscuous mode, it can also capture traffic from other machines on the same network segment. That alone explains why it has long been a favorite tool in offensive as well as defensive work.

Like strace, tcpdump is powerful and costly. Even with filters enabled, it can still consume a lot of resources. On a busy production machine, packet capture can become too heavy to tolerate. Some organizations do need complete network packet archives, but that kind of collection is more likely handled with dedicated hardware than with a general-purpose server doing everything else.
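
One way to keep the cost bounded is to limit how many packets are captured and how much of each packet is kept. The sketch below is only an illustration: bounded_capture is a hypothetical name, the 96-byte snapshot length is an arbitrary assumption, and the capture normally has to be run as root.

```shell
#!/bin/sh
# Hypothetical helper: capture a limited number of truncated packets to a file.
# Arguments: interface, packet count, output file, filter expression.
bounded_capture() {
    iface=$1
    count=$2
    outfile=$3
    filter=$4
    # -n      do not resolve addresses to names
    # -c N    stop after N packets
    # -s 96   keep only the first 96 bytes of each packet (assumed value)
    # -w FILE write raw packets to FILE for later analysis
    tcpdump -n -i "$iface" -c "$count" -s 96 -w "$outfile" "$filter"
}

# Example (interface name and port are placeholders):
# bounded_capture eth0 1000 /tmp/sample.pcap 'tcp port 443'
```

The resulting file can then be opened in Wireshark on another machine, which keeps the analysis load off the server being observed.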

blktrace

This tool is easier to understand if you think of it as the I/O counterpart to strace. As the name suggests, it traces block-layer activity and can show disk I/O requests in real time: what the request is, how long it takes, where it happens, and other details.

If viewed through btrace, a wrapper script that runs blktrace and pipes its output to blkparse, the results are more intuitive to read. But the usual rule applies: anything with trace in the name tends to come with noticeable overhead.

iotop

This one is exactly what the name suggests: an I/O version of top.

Its interface is simple and familiar, and even the display style closely resembles top. If you run it once, you will immediately understand what it is for.

slabtop

Once you know what a slab is, slabtop becomes straightforward.

A slab is an object cache. Instead of returning frequently used small object structures directly to the system every time they are freed, the kernel keeps them cached. That avoids the performance cost of repeatedly requesting memory from the system for small allocations.

With that in mind, the output of slabtop is much easier to make sense of.
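
For a feel of the underlying data, the following sketch ranks caches by an approximate size taken from /proc/slabinfo style input. The function name top_slab_caches is hypothetical, and multiplying the object count by the object size is a simplification that ignores per-slab overhead.

```shell
#!/bin/sh
# Hypothetical helper: rank slab caches by approximate size in bytes.
# Reads /proc/slabinfo style input on stdin; optional argument: rows to show.
top_slab_caches() {
    # Columns: name active_objs num_objs objsize ...
    # Skip the two header lines, print num_objs * objsize and the cache name.
    awk '!/^(slabinfo|#)/ { printf "%d %s\n", $3 * $4, $1 }' |
        sort -rn | head -n "${1:-10}"
}

# Example (reading /proc/slabinfo usually requires root):
# top_slab_caches 5 < /proc/slabinfo
```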

sysctl

sysctl exposes a set of kernel and system parameters whose impact on server performance can be significant. This is deep water: many important tuning options live here, including settings such as net.ipv4.tcp_tw_reuse and net.ipv4.tcp_tw_recycle (the latter was removed in Linux 4.12).

There is a lot to understand before changing anything, and this area alone could occupy days of discussion. A good starting point is simply to inspect what is available:

sysctl -a
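
Each sysctl key corresponds to a file under /proc/sys, with the dots in the key becoming directory separators. The sketch below shows that mapping; sysctl_path is a hypothetical helper, and the simple substitution assumes no component of the key contains a dot itself (interface names such as eth0.100 would break it).

```shell
#!/bin/sh
# Hypothetical helper: translate a sysctl key into its /proc/sys path.
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Example: these two commands should print the same value.
# sysctl -n net.ipv4.tcp_tw_reuse
# cat "$(sysctl_path net.ipv4.tcp_tw_reuse)"
```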

/proc

Many of the tools above are, in the end, reading from files under /proc and presenting the contents in a more usable form.

You can parse those files yourself, but doing it well takes real effort. In most cases, it is better to lean on the tools that already know how to interpret the data. Still, if you want to understand how those tools work internally, combining /proc exploration with strace is a very practical way to uncover the details.
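
As a small taste of why parsing /proc by hand takes care, the sketch below extracts a process's total CPU time from a /proc/<pid>/stat line. The function name stat_cpu_ticks is hypothetical. The awkward part is that the command name in field 2 is wrapped in parentheses and may contain spaces, so naive whitespace splitting can shift every later field.

```shell
#!/bin/sh
# Hypothetical helper: print utime + stime (in clock ticks) for one process.
# Reads a single /proc/<pid>/stat line on stdin.
stat_cpu_ticks() {
    # Remove everything up to the LAST ") " so the command name cannot
    # disturb field positions. What remains starts at field 3 (state),
    # which puts utime (field 14) and stime (field 15) at positions 12 and 13.
    sed 's/^.*) //' | awk '{ print $12 + $13 }'
}

# Example: CPU ticks used so far by the current shell.
# stat_cpu_ticks < /proc/$$/stat
```

Sampling this value twice and dividing the difference by the elapsed time gives a rough per-process CPU rate, which is the general idea behind interval-based process reports.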