--
VolkanSevim - 6 Nov 2007
Advanced Scripting: Perl and Gnuplot
For a brief introduction to Perl and Gnuplot see
perl.odp.
A Simple Project: Text Extraction and Plotting
Suppose we want to extract some information from this text file
analyze1.text, which is a report generated by an e-mail server. We want to find the numbers of "received" and "rejected" messages, and to generate a graph of the "Per-Day Traffic Summary".
Step One: Open and read the text file
This Perl script
ws00.pl opens the file analyze1.text for input, reads it line by line, and prints each line on the screen.
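As a rough illustration of this step (not the contents of ws00.pl), the sketch below reads its input line by line and prints each line. To keep it self-contained it reads from an in-memory string, an assumption made only for this example; passing the file name 'analyze1.text' to open reads the real report instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for analyze1.text so that the sketch runs on its own.
my $report = "6636 received\n0 bytes received\n";

# Open for input. To read the real file, use 'analyze1.text' in place of \$report.
open(my $in, '<', \$report) or die "Cannot open report: $!";

my $count = 0;
while (my $line = <$in>) {    # one line per iteration
    print $line;              # the line still carries its trailing newline
    $count++;
}
close($in);
```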
Step Two: Get the number of received messages
This script
ws01.pl prints only the lines ending with "received":
if($line =~ m/received$/i){       # Check whether the line ends with the word "received" (case-insensitive).
@tmp = split(/\s+/, $line);     # Split the line into an array of strings separated by whitespace.
$received = $tmp[1];            # Get the number ($tmp[0] is empty when the line starts with whitespace).
}
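The fragment above can be tried out with a small self-contained sketch. The sample lines are an assumption: they are indented, which is what makes $tmp[1] (rather than $tmp[0]) the number. The array @numbers is a name introduced only for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed sample lines, indented as in a typical report.
my @lines = ("   6636   received", "      0   bytes received", "  61151   rejected");

my @numbers;
foreach my $line (@lines) {
    if ($line =~ m/received$/i) {         # line ends with "received"
        my @tmp = split(/\s+/, $line);    # leading whitespace gives an empty $tmp[0]
        push @numbers, $tmp[1];           # first non-empty field
        print "matched: $line\n";
    }
}
print "numbers: @numbers\n";
```

Both lines ending in "received" are picked up, so two numbers are collected.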
Step Three: Match only the number of messages
ws02.pl
The code in step two will match two lines:
6636 received
0 bytes received
We want only the first one, the number of messages. Therefore, the code looks for a line in which a number is followed by whitespace and then by "received" at the end of the line. The second line no longer matches, because "bytes" stands between the number and "received". The regular expression is
m/\d+\s+received$/
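A short sketch of the stricter pattern, applied to the two lines quoted above (the indentation of the sample lines and the name @matched are assumptions of this example):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @lines = ("   6636   received", "      0   bytes received");

# Keep only lines where digits, whitespace and "received" follow each other directly.
my @matched = grep { m/\d+\s+received$/ } @lines;

print "$_\n" for @matched;
```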
Step Four: Get the number of rejected messages
ws03.pl
This time we use a different method. We capture the number that precedes "rejected" using the regular expression
m/(\d+)\s+rejected.*$/
After a successful match, the variable
$1 holds the text matched by the group in parentheses (in this case the number 61151).
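A minimal sketch of the capture, assuming a report line that consists of the number followed by "rejected" (the exact line in analyze1.text may carry more text after it, which the trailing .* allows for):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "  61151   rejected";    # assumed form of the report line

my $rejected;
if ($line =~ m/(\d+)\s+rejected.*$/) {
    $rejected = $1;    # copy $1 right away; a later successful match overwrites it
}
print "rejected = $rejected\n" if defined $rejected;
```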
Step Five: Get the "received" and "rejected" columns from the summary
ws04.pl
The script gets to the summary by skipping all the lines that do not match "Per-Day Traffic Summary":
while(<IN> !~ /^Per-Day Traffic Summary$/) {};
Here IN stands for the input filehandle; the name actually used in ws04.pl may differ. The script then skips the next two lines in the file and reads the following lines, splitting each into an array of strings, until it encounters a blank line. Finally, it creates a data file with the columns "Day of Month", "Received", and "Rejected". The output looks like this:
output.text.
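The flow of this step might be sketched as follows. This is not ws04.pl: the sample input, its column layout, and its numbers are invented placeholders, so the field indexes in @f[1, 3, 4] would need adjusting to the real report. The sketch prints the three columns to the screen; redirecting that output to a file gives a data file of the kind described above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Invented sample input: layout and numbers are placeholders, not real server data.
my $report = <<'END';
Grand Totals
Per-Day Traffic Summary
    date          received  rejected
    --------------------------------
    Nov  1 2007        10        20
    Nov  2 2007        30        40

Per-Hour Traffic Daily Average
END

open(my $in, '<', \$report) or die "Cannot open report: $!";

# Skip everything up to and including the section title (also stops at end of input).
while (defined(my $line = <$in>)) {
    last if $line =~ /^Per-Day Traffic Summary$/;
}

# Skip the column header and the dashed rule.
for (1 .. 2) { my $skipped = <$in>; }

my @rows;
while (defined(my $line = <$in>)) {
    last if $line =~ /^\s*$/;               # a blank line ends the table
    my @f = split(' ', $line);              # split(' ') ignores leading whitespace
    # Assumed fields: month, day, year, received, rejected
    push @rows, join("\t", @f[1, 3, 4]);    # day of month, received, rejected
}
close($in);

print "$_\n" for @rows;
```

Unlike the one-line while loop, this version tests for end of input, so it terminates even when the section title is missing.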
Step Six: Generate a Plot
We can generate a plot from this data file using Gnuplot:
plot "output.text" using 1:2
This command plots the second column ("Received") as a function of the first column ("Day of Month"). To connect the symbols with lines and change the legend:
plot "output.text" using 1:2 with lines title "received"
We can put two graphs in the same plot:
plot "output.text" using 1:2 with lines title "received", "output.text" using 1:3 with lines title "rejected"
To plot the "rejected" column as a bar graph:
plot "output.text" using 1:2 with lines title "received", "output.text" using 1:3 with boxes title "rejected"
Use
set boxwidth .5 to draw narrow bars.
To change the axis labels:
set xlabel "Day of Month"
set ylabel "Number of Messages"
To write the plot to an EPS file, set the terminal and the output file before issuing the plot command:
set term postscript eps enhanced color
set output "plot.eps"
Let us put all of these in a Gnuplot script:
genplot.plt
Run the script with
gnuplot genplot.plt
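The contents of genplot.plt are not reproduced on this page. As a rough sketch of what such a script could contain, the Perl snippet below assembles the Gnuplot commands from this section into one string and prints it; redirecting the output to genplot.plt gives a script to run. The ordering (terminal and output first, plot last) is this example's choice, and the variable name $script is introduced only here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Gnuplot commands collected from the steps above.
# Terminal, output file, labels and box width are set before the plot command.
my $script = <<'END';
set term postscript eps enhanced color
set output "plot.eps"
set xlabel "Day of Month"
set ylabel "Number of Messages"
set boxwidth .5
plot "output.text" using 1:2 with lines title "received", "output.text" using 1:3 with boxes title "rejected"
END

print $script;
```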