Thursday, April 28, 2005

Automated rotating exdump

I've added an option that automates the processing of exacct data to text file format and rotates the logs, it is hard wired to create a new log file in /var/adm/exacct (which is an empty directory created during Solaris install as standard) and it includes the hostname and the date+time in the new file.

The previous log name is used as the base of the output file, which is written to a specified directory with a .txt extension.

The file /etc/acctadm.conf is maintained by the acctadm command, I read the log file names from it, and if there is no log file I don't start one. i.e. you need to manually start accounting with acctadm the first time to decide which logs you want to generate. The command syntax now looks like this:

# ./exdump
Usage: exdump [-vwr] [ file | -a dir ]
-v: verbose
-w: wracct all processes first
-r: rotate logs
-a dir: use acctadm.conf to get input logs, and write output files to dir

The usual way to run the command will be from cron as shown

0 * * * * /opt/exdump/exdump -war /var/tmp/exacct > /dev/null 2>&1
2 * * * * /bin/find /var/adm/exacct -ctime +7 -exec rm {} \;

This also shows how to clean up old log files, I only delete the binary files in this example, and I created /var/tmp/exacct to hold the text files. The process data in the text file looks like this:

timestamp locltime duration procid ppid uid usr sys majf rwKB vcxK icxK sigK sycK arMB mrMB command
1114734370 17:26:10 0.0027 16527 16526 0 0.000 0.002 0 0.53 0.00 0.00 0.00 0.1 0.7 28.9 acctadm
1114734370 17:26:10 0.0045 16526 16525 0 0.000 0.001 0 0.00 0.00 0.00 0.00 0.1 1.1 28.9 sh
1114734370 17:26:10 0.0114 16525 8020 0 0.001 0.005 0 1.71 0.00 0.00 0.00 0.3 1.0 28.9 exdump
1109786952 10:09:12 -1.0000 0 0 0 0.012 118.883 131 0.00 0.33 0.00 0.00 0.0 0.0 0.0 sched
1109786952 10:09:12 -1.0000 0 0 0 0.012 118.883 131 0.00 0.33 0.00 0.00 0.0 0.0 0.0 sched
1109786952 10:09:12 -1.0000 0 0 0 0.012 118.883 131 0.00 0.33 0.00 0.00 0.0 0.0 0.0 sched
1109786959 10:09:19 -1.0000 1 0 0 4.311 3.066 96 47504.69 49.85 0.18 0.34 456.2 0.9 1.0 init
1109786959 10:09:19 -1.0000 2 0 0 0.000 0.000 0 0.00 0.00 0.00 0.00 0.0 0.0 0.0 pageout

Since the -w option forced long running processes to write an accounting record they have not finished, and I chose to mark this by showing the duration as -1. One odd thing shown here is that the sched process generated three identical accounting records, this needs some more investigation.

I'm going to run this for a while, and install it on a few other machines including SPARC Solaris 10 and Solaris 8 to make sure its robust.



1 comment:

Note: Only a member of this blog may post a comment.