Linux
There are many tools for monitoring disk I/O on Linux, such as ‘iostat’, ‘sar’, and ‘sadc’. These tools make it easy to find out which device (or hard disk) has the most I/O activity, but it is harder to find out which program (the exact PID) is actually using a particular device. This task is easier on AIX with ‘filemon’ (see the AIX section). The following example is one way to determine which program may be generating the most disk I/O on Linux:
# sar -d
Linux 2.4.21-27.ELsmp (pw101) 12/04/2005
12:00:00 AM DEV tps rd_sec/s wr_sec/s
….
Average: dev8-128 7.16 5.37 75.07
Average: dev8-129 7.16 5.37 75.07
Average: dev8-130 0.00 0.00 0.00
….
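The above command shows the busiest devices. sar names each device devM-N, where M is the major number and N is the minor number, so dev8-128 is major 8, minor 128. To see which driver owns that major number, do: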
# more /proc/devices
Character devices:
1 mem
2 pty
3 ttyp
4 ttyS
5 cua
7 vcs
…
Block devices:
1 ramdisk
2 fd
3 ide0
7 loop
8 sd
…
71 sd
129 sd
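The two steps above (pick the busiest device from the sar averages, then work out what it is) can be sketched as a pair of small shell helpers. This is only a rough sketch: the function names busiest_dev and dev_name are made up for illustration, the first assumes the "Average:" column layout shown in the sar sample above, and the second assumes a /proc/partitions table with the usual "major minor #blocks name" columns.

```shell
#!/bin/sh
# Hypothetical helper: read `sar -d` output on stdin and print the
# "Average:" device with the highest tps (third column of those lines).
busiest_dev() {
    awk '$1 == "Average:" && $2 ~ /^dev[0-9]+-[0-9]+$/ {
             if (best == "" || $3 + 0 > max) { max = $3 + 0; best = $2 }
         }
         END { if (best != "") print best }'
}

# Hypothetical helper: resolve a sar name such as dev8-128 to a kernel
# device name by matching major and minor in a /proc/partitions-style
# table. The table path is the optional second argument.
dev_name() {
    majmin=${1#dev}        # strip the "dev" prefix  -> 8-128
    major=${majmin%%-*}    # text before the dash    -> 8
    minor=${majmin##*-}    # text after the dash     -> 128
    awk -v ma="$major" -v mi="$minor" \
        '$1 == ma && $2 == mi { print $4 }' "${2:-/proc/partitions}"
}
```

For example, `sar -d | busiest_dev` prints a name such as dev8-128, and `dev_name dev8-128` then prints the matching device name if the kernel lists one. With the standard sd numbering of 16 minors per disk, major 8 minor 128 would normally be the whole disk sdi and minor 129 its first partition.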
The trouble I find with the tools listed above is that you need a different tool for each thing you want to look at (though sar is more general), and each tool produces output in a different format. To make a long story short, go to http://collectl.sourceforge.net/ and download collectl. It lets you monitor virtually everything in the format of your choice, even a plottable one! It’s low overhead, and you can even monitor at fractional intervals. For instance, if you monitor network traffic at 1-second intervals you get the wrong numbers, but if you use an interval of 0.9765 you get much better numbers. See the writeup on my website. But don’t take my word for it; see for yourself.
-mark
Thanks for the tips, Mark. I will try it out and test it when I have some free time.
It’s been a long time and collectl has gone through a lot of changes, all for the better. It’s even part of Fedora 10 now. Have you had a chance to try it yet?
Wow, I just stumbled on this very old note, and it’s now 2-1/2 years later. Since then collectl has been included in SuSE and Debian as well! I even released a new package called collectl-utils which, when combined with collectl, allows you to generate plots with colplot and multiplex multiple connections to other systems via colmux. Like top, colmux allows you to display any collectl command across hundreds of systems and sort by any column; in other words, it is a ‘cluster-top’ utility.
-mark