r21 - 07 Nov 2006 - 13:49:58 - JimWilgenbusch

Running PAUP Jobs in the Vanilla Universe (no checkpointing)

Prepare a PAUP batch file

For this example you'll need two files: one containing a NEXUS data set and the other containing the PAUP commands. The data set looks like this:

#NEXUS 
Begin data;
        Dimensions ntax=8 nchar=200;
        Format datatype=dna interleave;
        Matrix
A CGAATATAACGGAGCCAGTACTCAGACGCACTGCCAACCCAGCGAAGCCCGATACGCCGT
B CGAATATAACGAAGCCAGTATTCAGACGCACTGCTAACCCAGCGGAGCCCGGTACGCCGT
C CGAATATAACAAAGCCAGTACTCGGACGCACTACCAACCCAGCGGAGCCCGATACGCCAT
D CGAATACAACAAAGCCAGTATTCAGACGCACTGCCAACCCAACAGAGACCGGCGTGCTAT
E CGAATACAACAAAGCCAGTATTCAGACGCACTGCCAACCCAGCAGAGACCCCCACGCTAT
F CGAATACAACAAAGCCAGTATTCAGACGCACTGCCAACCCAGCAGAGACCCACACGCTAT
G CGAATACAACAAAGCCAATATTCAGACGGACTGCCAACCCAGCAGAGACCGACACGTCAT
H CGAATACAACAAAGCCAATATTCAGACGGACTGCCAACCCGGCAGAGACCGACGCGTCAT 

...  

;
end;

Download a copy of the sample data file here.

The commands used to run a specific analysis are kept in a separate NEXUS file, which references the data set given above.

#NEXUS

begin paup; 
    set autoclose=yes warnreset=no increase=auto;

    [tell paup the data file name]
    execute example_data.nex;

    [log file]
    log file= example_paup.log  replace;
    
    [reconstruct a neighbour-joining tree]
    nj; 

    [ save the tree to a file]
    savetrees file= example_paup.tre replace;        
end;

Copy the paup block given above and paste it into a new file named example_paup.nex.

Test the batch file

Before launching your job under Condor, test the batch file at the console to make sure that it works properly. At this point you should have two files: example_data.nex and example_paup.nex.

Type:

paup example_paup.nex

Because this is a short analysis, the program should execute and terminate within a second or two, saving a single tree file and a log file to the current directory. In practice, you will be testing analyses that might run for several hours or days before completing. In that case, terminate the analysis once you are sure that PAUP executes the file properly. To interrupt a PAUP process, press Ctrl+C.
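A quick check that the expected output files were actually written can save a wasted submission. The following is a minimal sketch and not part of the original instructions: check_outputs is a hypothetical helper name, and the file names in the usage comment are the ones set by the log and savetrees commands in example_paup.nex.

```shell
# check_outputs FILE...: succeed only if every named file exists and is non-empty.
# (Hypothetical helper for illustration.)
check_outputs() {
    rc=0
    for f in "$@"; do
        if [ -s "$f" ]; then
            echo "ok: $f"
        else
            echo "missing or empty: $f"
            rc=1
        fi
    done
    return $rc
}

# After a test run such as "paup example_paup.nex", one might call:
#   check_outputs example_paup.log example_paup.tre
```

The function returns a nonzero status if any file is missing or empty, so it can be used in a script before calling condor_submit.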

Create a submit file

Now that you know your PAUP job will run without errors, you need to create a Condor submit file. More specific information on how to create a submit description file is given in the UsingCondor topic.

########################################
#
# PAUP run in the Condor vanilla universe
#
#########################################

Universe     = vanilla  

InitialDir   = /home/u5/users/yanfeng/condor_dir/run1
              
Executable   = /usr/common/i686-linux/bin/paup
 # Use command "which paup" to find out the path of "paup"
Arguments    = example_paup.nex -n -f
requirements = (OpSys =="LINUX" && Arch =="INTEL")

should_transfer_files = YES
when_to_transfer_output = ON_EXIT_OR_EVICT
transfer_input_files = example_data.nex, example_paup.nex

output       = example_paup_condor.out
error        = example_paup_condor.error
log          = example_paup_condor.log

Queue

Copy the text given above and paste it into a new file named example_paup.cmd.
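Because the job runs on a remote node, any input file that is not listed in transfer_input_files will not be available there. The sketch below is one way to confirm, before submitting, that every file on that line exists. It is an illustration only: check_inputs is a hypothetical helper, it assumes the command is written in lower case as in this example, that it is run from the directory named by InitialDir, and that the file names contain no macros such as $(Process).

```shell
# check_inputs SUBMIT_FILE: verify that each file named on the
# transfer_input_files line exists relative to the current directory.
# (Hypothetical helper for illustration.)
check_inputs() {
    rc=0
    # Keep the text after "=", then turn the comma-separated list into words.
    for f in $(sed -n 's/^[[:space:]]*transfer_input_files[[:space:]]*=//p' "$1" | tr ',' ' '); do
        if [ -e "$f" ]; then
            echo "found: $f"
        else
            echo "missing: $f"
            rc=1
        fi
    done
    return $rc
}

# Usage:
#   cd /home/u5/users/yanfeng/condor_dir/run1
#   check_inputs example_paup.cmd
```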

Log on to an SCS submit node

SCS maintains several submit nodes that give users a way to access SCS computer resources. The general access submit node is named phoenix. There are also two other submit nodes (anfinsen and petal), which are part of restricted access research clusters. Special permission from the resource owner is required to access the petal and anfinsen submit nodes.

$ ssh <username>@phoenix.scs.fsu.edu

Submit the job

To submit a job to the Condor cluster, use the condor_submit command. For example:

$ condor_submit example_paup.cmd

You should see the following output:

 
Submitting job(s).
Logging submit event(s).
1 job(s) submitted to cluster 56776.
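The cluster number in the last line identifies your job in later commands. If you submit from a script, the number can be captured from the condor_submit output. The following is a minimal sketch that assumes the output has the form shown above; cluster_id is a hypothetical helper name.

```shell
# cluster_id: read condor_submit output on standard input and print the
# cluster number from the "... submitted to cluster N." line.
# (Hypothetical helper for illustration.)
cluster_id() {
    sed -n 's/.*submitted to cluster \([0-9][0-9]*\).*/\1/p'
}

# Usage:
#   id=$(condor_submit example_paup.cmd | cluster_id)
#   condor_q "$id"
```

For the sample output above, the function prints 56776.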

Check the status of a job

After submitting your job to the Condor cluster, you can check its status with the condor_q command. For example:

$ condor_q <your user name>

The output from this command should look something like this:

-- Submitter: petal017.csit.fsu.edu : <144.174.160.147:10076> : petal017.csit.fsu.edu
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
56776.0   yanfeng         5/22 18:22   0+00:00:00 R  0   1.8  paup exa

1 jobs; 0 idle, 1 running, 0 held

Don't be surprised if your job remains idle (designated by an I under ST) for several minutes or longer. If your job does not run right away, it most likely means that you have a low priority and the cluster is heavily used, or that someone else's lower-priority job is taking a while to vacate a node so that your job can run. Remember, Condor is based on a High Throughput Computing (HTC) model, not a High Performance Computing (HPC) model.

You can also see what has happened to your job by looking at the Condor log file. Remember that the name of this file was set by the log line in the Condor submit file. To look at the file you might use the cat command. For example:

$ cat example_paup_condor.log

The output from this command should look something like this:

000 (617.000.000) 10/19 14:34:02 Job submitted from host: <144.174.160.169:11297>
...
001 (617.000.000) 10/19 14:34:32 Job executing on host: <144.174.160.207:9705>
...
005 (617.000.000) 10/19 14:34:32 Job terminated.
        (1) Normal termination (return value 1)
                Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
        3346  -  Run Bytes Sent By Job
        2000108  -  Run Bytes Received By Job
        3346  -  Total Bytes Sent By Job
        2000108  -  Total Bytes Received By Job
...
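For a long log, it can help to reduce the file to one line per event plus the reported return value. The awk sketch below assumes the log layout shown above (a three-digit event code, the job ID in parentheses, then date, time and a description); summarize_condor_log is a hypothetical helper name.

```shell
# summarize_condor_log FILE: print one line per event header, followed by
# the return value reported on termination, if any.
# (Hypothetical helper for illustration.)
summarize_condor_log() {
    awk '
        # Event headers start with a three-digit code and "(cluster.proc.subproc)".
        /^[0-9][0-9][0-9] \(/ {
            desc = $0
            # Drop the first four fields; what remains is the description.
            sub(/^[^ ]+ [^ ]+ [^ ]+ [^ ]+ /, "", desc)
            print $1, $3, $4, desc
        }
        # Termination details include "(return value N)".
        /return value/ {
            match($0, /return value -?[0-9]+/)
            print "    exit status:", substr($0, RSTART + 13, RLENGTH - 13)
        }
    ' "$1"
}

# Usage:
#   summarize_condor_log example_paup_condor.log
```

Applied to the sample log above, this prints the 000, 001 and 005 event lines followed by "exit status: 1". How PAUP sets its exit status is not covered here, so if the value is not what you expect, the *.error file and the PAUP log are the places to look next.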

Moving on

After the analysis is complete, you will find the *.out and *.error files in your directory. These files contain the standard output and standard error generated by the executable.

This is a bare-bones example submit file. See the UsingCondor topic for more information on creating submit files.

Condor in a heterogeneous environment

  # paup running in Condor vanilla Universe 

  initialdir   = /home/u5/users/johndo/condor/ 
  Rank = kflops 

  Executable   = /usr/common/i686-linux/bin/paup.$$(OpSys).$$(Arch) 
  Universe     = vanilla 
  requirements = (OpSys =="OSX" && Arch =="PPC") || \
                 (OpSys =="WINNT51" && Arch =="INTEL") || \
                 (OpSys =="LINUX" && Arch =="INTEL") || \
                 (OpSys =="LINUX" && Arch =="ALPHA")  

  should_transfer_files = YES
  when_to_transfer_output = ON_EXIT_OR_EVICT
  transfer_input_files = primates.nex, rep.$(Process)

  notification = NEVER
  arguments = rep.$(Process) -n -f
  output    = rep_out.$(Process)
  error     = rep_error.$(Process)
  log       = rep.log

  Queue 100
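The submit file above queues 100 jobs, and each job receives its own PAUP batch file named rep.$(Process), where $(Process) runs from 0 to 99. Those files must exist before the job is submitted. The sketch below is one way to generate them. It is an illustration only: make_reps is a hypothetical helper, the output names rep_N.log and rep_N.tre are assumptions, and the nj command is a placeholder borrowed from the earlier example that should be replaced with the analysis you want each replicate to run.

```shell
# make_reps N: write rep.0 .. rep.(N-1), one PAUP batch file per queued job.
# (Hypothetical helper; the analysis commands are placeholders.)
make_reps() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        # Unquoted EOF so that $i is expanded inside the batch file.
        cat > "rep.$i" <<EOF
#NEXUS

begin paup;
    set autoclose=yes warnreset=no increase=auto;
    execute primates.nex;
    log file=rep_$i.log replace;
    [placeholder analysis; replace with the commands for this replicate]
    nj;
    savetrees file=rep_$i.tre replace;
end;
EOF
        i=$((i + 1))
    done
}

# Usage, run in the directory named by initialdir:
#   make_reps 100
```

Giving each replicate its own log and tree file name keeps the results from overwriting one another when they are returned to the submit directory.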
Computing.PAUPOnVanillaDetail moved from TechHelp.PAUPOnVanillaDetail on 07 Nov 2006 - 13:56 by JimWilgenbusch
 