Tru64 - Collecting performance data with the collect utility
Description
Collect is a tool that collects operating system and process data under HP Tru64 UNIX Versions 4.x and 5.x. Collect is designed for high reliability and low system-resource overhead.
This article explains how to gather performance data from a Tru64 system.
Note: The "collect" utility is provided by default with Tru64 V4.x and V5.x and updated with patches.
Step-by-step guide
Collecting data
Interactive mode
The simplest approach is to run "collect" in interactive mode, specifying only the output data file, and to press <CTRL-C> when done:
|
If you don't specify anything else, collect will use default values. The most important are:
- the collection interval (time between samples) is 10 seconds
- data is collected for all subsystems (Proc,Mem,Disk,Tape,Lsm,Net,Cpu,Filesys,mQueue,ttY)
When you write to a file using the above command, the data is written in binary form. You must use playback mode to display it (see below). If you do not specify an output file, collect will write the data in a human-readable format directly to standard output (STDOUT).
You can choose specific subsystems for which you want to collect data using "-s <list>", where <list> is a sequence composed of the letters pmdtlncfqy. Each letter is the capitalized letter in the corresponding subsystem name above (for example, q for mQueue and y for ttY). For example:
|
would collect data for the memory, disk, network and cpu subsystems.
To specify the interval, use "-i <seconds>". For example:
# collect -i 1 -f /tmp/foobar.cgz
This will collect data once a second, and write it to /tmp/foobar.cgz.
To stop the collect utility, press <CTRL-C> or send the process a TERM signal. Do not use the KILL signal: collect then cannot write out the data in its buffer, so you risk losing data.
You can also specify a duration so that collect stops by itself after a given amount of time.
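The difference between the two signals can be reproduced without collect. The sketch below uses a stand-in process (the hypothetical function fake_collector, not part of collect) that buffers samples in memory and writes them out only when it receives TERM. It illustrates the general mechanism, assuming collect behaves in a comparable way; it does not show collect's actual implementation.

```shell
#!/bin/sh
# Stand-in for collect (hypothetical): buffers samples in memory and only
# writes them to "$out" when it receives TERM.
flush_and_exit() {
    printf '%s' "$buffer" > "$out"
    exit 0
}

fake_collector() {
    buffer=""
    trap flush_and_exit TERM
    while :; do
        buffer="${buffer}sample "
        sleep 1
    done
}

# Start the stand-in, let it gather a few samples, stop it with the given
# signal, then report whether the buffered data reached the output file.
run_demo() {
    sig=$1
    out="${TMPDIR:-/tmp}/stop_demo.$sig.$$"
    rm -f "$out"
    fake_collector &
    pid=$!
    sleep 2
    kill "-$sig" "$pid"
    wait "$pid" 2>/dev/null
    if [ -s "$out" ]; then
        echo "$sig: data flushed"
    else
        echo "$sig: data lost"
    fi
    rm -f "$out"
}

term_result=$(run_demo TERM)
kill_result=$(run_demo KILL)
echo "$term_result"
echo "$kill_result"
```

TERM can be trapped, so the stand-in gets a chance to write its buffer before exiting. KILL cannot be trapped, so the process ends with its buffer still in memory.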
Example with 30 minutes:
|
Example with 12 hours:
|
Example with 1 day:
|
Batch mode
If you prefer not to have a terminal connected for the collect duration, you can use the following commands:
|
This starts the collect utility immediately and runs it for 1 day.
You can also run this command without a time limit, for example to collect performance data while a job is running and you do not know its duration:
|
To stop the collect process, identify its process ID and send it a TERM signal. Do not use a KILL signal, otherwise the output data file will be empty.
|
Notes:
- The "grep [c]ollect" command produces the same output as "grep collect | grep -v grep".
- You can use "kill -TERM" or "kill -15", but not "kill -KILL" or "kill -9".
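The equivalence of the two grep forms can be checked on sample text. The sketch below uses stand-in "ps" output with hypothetical PIDs and arguments rather than a live process list, so it runs on any system. The bracket form works because the grep process's own argument list contains the literal text "[c]ollect", which the pattern [c]ollect does not match.

```shell
#!/bin/sh
# Stand-in process listings (hypothetical PIDs and arguments). The last line
# of each mimics the grep process itself appearing in the ps output.
ps_bracket='root  1234     1  0 10:00:00 ?      0:01 collect -i 1
root  5678  4321  0 10:05:00 pts/1  0:00 grep [c]ollect'

ps_plain='root  1234     1  0 10:00:00 ?      0:01 collect -i 1
root  5678  4321  0 10:05:00 pts/1  0:00 grep collect'

# Form 1: the bracket expression keeps grep from matching its own entry.
pid_a=$(printf '%s\n' "$ps_bracket" | grep '[c]ollect' | awk '{print $2}')

# Form 2: match everything, then drop the grep entry explicitly.
pid_b=$(printf '%s\n' "$ps_plain" | grep collect | grep -v grep | awk '{print $2}')

echo "bracket form: $pid_a"
echo "grep -v form: $pid_b"

# On a real system the PID found this way would then be stopped with:
#   kill -TERM "$pid_a"
```

Both forms return only the PID of the collect process. The bracket form needs one process fewer and cannot accidentally hide a collect process whose arguments contain the word "grep".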
Displaying data
The basic command to read (play back) a collect data file is shown below. It is best to send the output to a file (with the -f option) or to pipe it to "more" or "grep":
|
Other parameters restrict the analysis to a time interval, given as start and stop timestamps (with the -C option), or to a specific subsystem (with the -s option).
Examples:
Restricting the analysis to an interval and to the memory subsystem:
|
In the example above, the time interval is specified as minutes:seconds.
The example below demonstrates how to display data for only a specific disk, dsk0:
|
Analyzing Data
There are a few tools available for analyzing collect data:
- cfilt can filter arbitrary fields out of collect's ASCII output and create a CSV file suitable for further processing.
- collgui is a Perl script with accompanying modules that allows collect data to be graphed. It requires Perl/Tk, X Window and various utilities such as gnuplot and netpbm.
Both cfilt and collgui, plus Perl/Tk and utilities, are available in Rob Urban's collgui-kit for Tru64 V5.1B. Ask him for it.
cfilt example:
|
This selects 3 values from the CPU subsystem, "user", "sys", and the sum of "user" and "sys", 2 values from the DISK subsystem, "rkb/s" and "wkb/s", and 2 values from the NET subsystem, "ikb" and "okb". It writes these to standard output with the format below. Note: in order to avoid filtering unnecessary data, I passed collect the parameter "-scdn" to only extract cpu, disk, and network data from pluto.cgz.
Output of above example:
in the above case:
or a real example:
|
As you can see, cfilt parameters start with the first three letters of the subsystem, followed by the names of fields in collect's output for that subsystem. You can perform simple arithmetic on the field values using +, -, *, and ~ (for division). You cannot specify the same subsystem in several cfilt parameters. cfilt can do more than this. See the man page for details, or the link below.
Getting start and stop time of the collect data file
If you want to know the start time of the collect process, you can use the following commands:
|
To get the timestamp of the first sample:
# collect -p pluto.cgz | grep "^#### RECORD" | head -1
And to get the timestamp of the last sample:
|
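Both timestamps can be taken from a single playback pass by keeping the record header lines and then reading the first and the last one. The sketch below runs on stand-in playback text. Only the "#### RECORD" line prefix comes from the command above; the rest of each sample line is a hypothetical placeholder, not collect's real record format.

```shell
#!/bin/sh
# Stand-in for the output of "collect -p pluto.cgz" (hypothetical content).
playback='# header line
#### RECORD 1 (placeholder timestamp A) ####
sample data
#### RECORD 2 (placeholder timestamp B) ####
sample data
#### RECORD 3 (placeholder timestamp C) ####
sample data'

# Keep only the record header lines, then pick the first and the last one.
records=$(printf '%s\n' "$playback" | grep '^#### RECORD')
first=$(printf '%s\n' "$records" | head -n 1)
last=$(printf '%s\n' "$records" | tail -n 1)

echo "first sample: $first"
echo "last sample:  $last"
```

On a real system, replace the printf of the stand-in text with "collect -p pluto.cgz". Saving the header lines once avoids decoding a large data file twice.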
More information on collect options can be found in the collect(8) man page (see the links below).
Links
HP Tru64 UNIX collect overview
Rob Urban's collect Information
© Stromasys, 1999-2024 - All the information is provided on the best effort basis, and might be changed anytime without notice. Information provided does not mean Stromasys commitment to any features described.