r1 - 06 Nov 2007 - 20:04:01 - VolkanSevim
-- VolkanSevim - 6 Nov 2007

Advanced Scripting: Perl and Gnuplot

For a brief introduction to Perl and Gnuplot see perl.odp.

A Simple Project: Text Extraction and Plotting

Suppose we want to extract some information from the text file analyze1.text, a report generated by an e-mail server. We want to know the number of "received" and "rejected" messages, and to generate a graph of the "Per-Day Traffic Summary".

Step One: Open and read the text file

The Perl script ws00.pl opens the file analyze1.text for input, reads it line by line, and prints each line on the screen.
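
A minimal sketch of what such a script might look like is given here; the attached ws00.pl may well differ. The sub name print_file and the lexical filehandle $fh are illustrative choices, not taken from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: print every line of a file and return the line count.
sub print_file {
    my ($path) = @_;
    # Three-argument open with a lexical filehandle; report the OS error on failure.
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my $count = 0;
    while (my $line = <$fh>) {    # one line per iteration, newline included
        print $line;
        $count++;
    }
    close($fh);
    return $count;
}

# Run only if the report is present in the current directory.
print_file('analyze1.text') if -e 'analyze1.text';
```

Checking the return value of open is worth the extra line: without it, a missing file silently produces no output.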

Step Two: Get the number of received messages

This script ws01.pl prints only the lines ending with "received":

if ($line =~ m/received$/i) {        # Check whether the line ends with the word "received"
    @tmp = split(/\s+/, $line);      # Split the line into an array of strings separated by whitespace
    $received = $tmp[1];             # Get the number ($tmp[0] is empty when the line starts with whitespace)
}
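
The following sketch wraps the same test and split in a small sub so that the behavior can be tried on individual lines. The sub name field_if_received and the sample lines are made up for illustration; in particular, the leading whitespace in the samples is an assumption about the layout of analyze1.text.

```perl
use strict;
use warnings;

# Hypothetical helper: return the second field of a line that ends with
# "received", or nothing (undef) if the line does not match.
sub field_if_received {
    my ($line) = @_;
    return unless $line =~ m/received$/i;
    my @tmp = split(/\s+/, $line);   # a leading-whitespace match yields an empty first field
    return $tmp[1];
}

# Made-up sample lines; the leading spaces are an assumption.
for my $line ("   6636   received\n", "  61151   rejected\n") {
    my $n   = field_if_received($line);
    my $msg = defined($n) ? "matched: $n\n" : "skipped\n";
    print $msg;
}
```

Note that $ also matches just before a trailing newline, so the line does not need to be chomped for the test to succeed.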

Step Three: Match only the number of messages

ws02.pl The code in step two matches two lines:
6636 received
0 bytes received

We want only the first one, the number of messages. Therefore, the code looks for a line that contains a number followed by whitespace and then ends with "received", using the regular expression m/\d+\s+received$/.
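
As a quick check, the sketch below applies this regular expression to the two lines listed in this step. The sub name is_message_count is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if the line ends with "<digits><whitespace>received", else 0.
sub is_message_count {
    my ($line) = @_;
    return $line =~ m/\d+\s+received$/ ? 1 : 0;
}

for my $line ("6636 received", "0 bytes received") {
    my $verdict = is_message_count($line) ? "match" : "no match";
    print "$line: $verdict\n";
}
```

The second line fails because the digit is followed by "bytes" rather than directly by "received". The pattern is not anchored at the start of the line, so a stricter variant could begin with ^\s* if the report contained other lines ending the same way.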

Step Four: Get the number of rejected messages

ws03.pl This time we use a different method. We capture the number at the beginning of the line using the regular expression m/(\d+)\s+rejected.*$/. After a successful match, the variable $1 holds the text matched by the group in parentheses (in this case the number 61151).
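
A small sketch of the capture in isolation follows. The sub name rejected_count and the sample line, including its trailing "(90%)", are invented for illustration and are not taken from analyze1.text.

```perl
use strict;
use warnings;

# Hypothetical helper: return the captured number, or nothing (undef) if there is no match.
sub rejected_count {
    my ($line) = @_;
    if ($line =~ m/(\d+)\s+rejected.*$/) {
        return $1;    # text matched by the first parenthesized group
    }
    return;
}

# The trailing "(90%)" is made up to show what ".*" absorbs.
my $count = rejected_count("  61151   rejected (90%)");
print "rejected = $count\n" if defined $count;
```

Reading $1 only inside the if block matters: after a failed match, $1 still holds whatever an earlier successful match left there.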

Step Five: Get the "received" and "rejected" columns from the summary

ws04.pl The script gets to the summary by skipping all the lines that do not match "Per-Day Traffic Summary" (the filehandle is written here as IN; the name used in ws04.pl may differ):
while (<IN> !~ /^Per-Day Traffic Summary$/) {};

Then it skips the next two lines in the file, and reads the following lines, splitting each into an array of strings, until it encounters a blank line. Finally, it creates a data file with the columns "Day of Month", "Received", and "Rejected". The output looks like output.text.
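
The steps described above can be sketched as follows. This is not the attached ws04.pl: the sub name read_summary, the column positions, and the sample report are all assumptions, and the sample reads from an in-memory string instead of analyze1.text so that it runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: read from an open filehandle and return rows of
# [day, received, rejected]. Taking the first three fields is an assumption.
sub read_summary {
    my ($fh) = @_;
    my $line;
    # Skip everything up to and including the section heading.
    while (defined($line = <$fh>) && $line !~ /^Per-Day Traffic Summary$/) { }
    # Skip the next two lines (assumed here to be column headings and a rule).
    for (1 .. 2) { $line = <$fh>; }
    my @rows;
    while (defined($line = <$fh>)) {
        last if $line =~ /^\s*$/;        # a blank line ends the table
        my @f = split(' ', $line);       # split(' ') ignores leading whitespace
        push @rows, [ @f[0 .. 2] ];
    }
    return @rows;
}

# Made-up report fragment; the real layout of analyze1.text may differ.
my $report = <<'END';
Grand Totals
   6636   received
Per-Day Traffic Summary
    day   received   rejected
    -------------------------
      1        210       1900
      2        305       2100

Per-Hour Traffic Summary
END

open(my $in, '<', \$report) or die "Cannot open in-memory report: $!";
print join(" ", @$_), "\n" for read_summary($in);
close($in);
```

The defined(...) test in the first loop stops it at end of file if the heading never appears; a bare pattern test against the read would keep looping there.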

Step Six: Generate a Plot

We can generate a plot from this data file using Gnuplot: plot "output.text" using 1:2. This command plots the second column (received) as a function of the first column (Day of Month). To connect the symbols with lines and change the legend: plot "output.text" using 1:2 w lines title "received" (w is short for with).

We can put two curves in the same plot:
plot "output.text" using 1:2 with lines title "received", "output.text" using 1:3 with lines title "rejected"

To plot the "rejected" column as a bar graph:
plot "output.text" using 1:2 with lines title "received", "output.text" using 1:3 with boxes title "rejected"
Use set boxwidth .5 before the plot command to draw narrow bars.

To change the axis labels:
set xlabel "Day of Month"
set ylabel "Number of Messages"

To create an EPS file, set the terminal and the output file before the plot command:
set term postscript eps enhanced color
set output "plot.eps"

Let us put all of these in a Gnuplot script: genplot.plt

Run the script with gnuplot genplot.plt.
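
Since the data file is produced by Perl, the Gnuplot script could also be written out by Perl. The sketch below only assembles the commands shown in this step into one script; it is not the attached genplot.plt, and the sub name gnuplot_script is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: assemble the Gnuplot commands from this section into one script.
sub gnuplot_script {
    my ($data, $eps) = @_;
    # Terminal and output are set before "plot" so that the graph goes to the file.
    return join("\n",
        'set term postscript eps enhanced color',
        qq{set output "$eps"},
        'set xlabel "Day of Month"',
        'set ylabel "Number of Messages"',
        'set boxwidth .5',
        qq{plot "$data" using 1:2 with lines title "received", }
          . qq{"$data" using 1:3 with boxes title "rejected"},
    ) . "\n";
}

print gnuplot_script('output.text', 'plot.eps');
```

Redirecting the output of this sketch to a file and passing that file to gnuplot would then produce plot.eps, assuming output.text is in the current directory.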

Topic attachments

Attachment       Size     Date                  Who
analyze1.text    30.3 K   06 Nov 2007 - 19:43   VolkanSevim
genplot.plt      0.2 K    06 Nov 2007 - 19:45   VolkanSevim
output.text      0.2 K    06 Nov 2007 - 19:45   VolkanSevim
ws00.pl.txt      0.1 K    06 Nov 2007 - 19:43   VolkanSevim
ws01.pl.txt      0.2 K    06 Nov 2007 - 19:43   VolkanSevim
ws02.pl.txt      0.3 K    06 Nov 2007 - 19:43   VolkanSevim
ws03.pl.txt      0.4 K    06 Nov 2007 - 19:44   VolkanSevim
ws04.pl.txt      0.8 K    06 Nov 2007 - 19:44   VolkanSevim