---+ *MrBayes on the SCS Condor cluster (Vanilla universe)*
1) Log in to one of the SCS master nodes
SCS has several master nodes in the Condor pool. You may or may not be eligible for access to all of them. Master nodes available for general use include phoenix000 (Phoenix Cluster), tempest01 (Tempest Cluster), and prism01 (Prism Cluster). The following nodes require permission from their respective owners: anfinsen098 (Anfinsen Cluster), petal017 (Petal Cluster), grove000 (Grove Cluster), and waterfall000 (Waterfall Cluster). For more information, please refer to the SCS High Performance Clusters page.
So far, MrBayes is installed only on the Petal Cluster and the Phoenix Cluster. If you want to run MrBayes under Condor, you can therefore only run it on the Petal Cluster (which requires permission) or the Phoenix Cluster.
2) Check whether MrBayes is installed
The MrBayes executable is simply named mb. Type mb at the command prompt.
You should see something like this:
MrBayes v3.1.2
(Bayesian Analysis of Phylogeny)
by
Fredrik Ronquist and John P. Huelsenbeck
School of Computational Science
Florida State University
ronquist@csit.fsu.edu
Section of Ecology, Behavior and Evolution
Division of Biological Sciences
University of California, San Diego
johnh@biomail.ucsd.edu
Distributed under the GNU General Public License
Type "help" or "help <command>" for information
on the commands that are available.
MrBayes >
To leave the program, just type quit and you should get the Unix prompt back.
3) Prepare your MrBayes batch file
Before we start practicing MrBayes on Condor, it is useful to set up a directory to work in. You need to know three simple terminal commands: mkdir, cd, and cp. If you have problems with them, use man mkdir, man cd, and man cp to study them first.
Now create a directory called condor_dir in your home directory. Go to that directory and create another directory called mb. (Omit this step if they already exist.)
Download the MrBayes example data file and save it as "primates.nex" in your mb directory. Go to the mb directory and create a new directory called mb_1_vanilla. Copy the example data file to the mb_1_vanilla directory. To see the contents of the file, use "cat primates.nex". This is the structure you will see:
#NEXUS
begin data;
dimensions ntax=12 nchar=898;
format datatype=dna interleave=no gap=-;
matrix
Tarsius_syrichta AAGTTTCATTGGAGCCACCACTCTTATAATTG......
Lemur_catta AAGCTTCATAGGAGCAACCATTCTAATAATCG......
Homo_sapiens AAGCTTCACCGGCGCAGTCATTCTCATAATCG......
Pan AAGCTTCACCGGCGCAATTATCCTCATAATCG......
Gorilla AAGCTTCACCGGCGCAGTTGTTCTTATAATTG......
Pongo AAGCTTCACCGGCGCAACCACCCTCATGATTG......
Hylobates AAGCTTTACAGGTGCAACCGTCCTCATAATCG......
Macaca_fuscata AAGCTTTTCCGGCGCAACCATCCTTATGATCG......
M_mulatta AAGCTTTTCTGGCGCAACCATCCTCATGATTG......
M_fascicularis AAGCTTCTCCGGCGCAACCACCCTTATAATCG......
M_sylvanus AAGCTTCTCCGGTGCAACTATCCTTATAGTTG......
Saimiri_sciureus AAGCTTCACCGGCGCAATGATCCTAATAATCG......
;
end;
download the data file primates.nex
This is a Nexus file. A Nexus file can be divided into blocks, each of which begins with the statement 'begin <block name>;' and ends with 'end;'. A file can contain many blocks; for instance, MrBayes will read and understand a "MrBayes" block that contains commands given in the same manner as if they were typed in from the keyboard, except that each command line needs to be terminated by a semicolon if it is contained in a MrBayes block. This allows a user to start MrBayes in batch mode, executing an analysis without user input. For more information, please refer to the paper "NEXUS: an extensible file format for systematic information" (Maddison DR, Swofford DL, Maddison WP. 1997).
This file contains a single block, a "data" block, so we need to add a MrBayes block to it.
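To illustrate the block structure described above, here is a minimal Python sketch that lists the block names found in a Nexus file. The function name nexus_blocks and the tiny two-taxon example are made up for illustration; the sketch assumes simple, non-nested bracket comments and is not a full Nexus parser.

```python
import re

def nexus_blocks(text):
    """Return the lower-cased names of the blocks found in Nexus text."""
    # Drop [bracketed comments] first so that words inside them are ignored.
    # Assumption: comments are not nested.
    without_comments = re.sub(r"\[[^\]]*\]", "", text)
    # A block opens with 'begin <name>;' (keywords are case-insensitive).
    pattern = re.compile(r"\bbegin\s+(\w+)\s*;", re.IGNORECASE)
    return [m.group(1).lower() for m in pattern.finditer(without_comments)]

# Hypothetical miniature file with a data block and a MrBayes block.
example = """#NEXUS
begin data;
dimensions ntax=2 nchar=4;
format datatype=dna interleave=no gap=-;
matrix
A ACGT
B ACGA
;
end;
begin mrbayes;
[begin to run MrBayes]
mcmc;
end;
"""

print(nexus_blocks(example))  # ['data', 'mrbayes']
```

Running the same function on primates.nex before and after editing is a quick way to confirm that the MrBayes block was added where you intended.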
Tips
Commands can be added to the data file itself; MrBayes then starts the analysis directly after the data file is loaded into the program. Alternatively, commands can be given in a separate file, and the analysis is started by loading that file into MrBayes. Such a file must always start with the following command lines:
begin mrbayes;
execute [name of data-file];
[remaining commands.....];
end;
Let's begin with the simplest one. Open the example data file with your favorite editor (vi, pico, or notepad). Add the following to the data file after the 'data' block.
begin mrbayes;
[to ensure that MrBayes does not stop during an analysis to wait for confirmation from the user]
set autoclose=yes nowarn = yes ;
[set the evolutionary model to the GTR model with gamma-distributed rate variation across sites]
lset nst = 6 rates = gamma;
[ensure you get at least 1,000 samples from the posterior probability distribution]
mcmcp ngen =10000 samplefreq = 10;
[begin to run MrBayes]
mcmc;
[summarize the parameter values]
sump burnin = 250;
[summarize the trees]
sumt burnin = 250;
[quit automatically when the analysis is done]
quit;
end;
download the data file with MrBayes block added primates_mbbatch.nex
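The numbers in the MrBayes block above are related: ngen and samplefreq determine how many samples are written, and burnin determines how many of them sump and sumt discard. The following Python sketch does that arithmetic. It assumes that MrBayes also records the starting state of the chain and that burnin is counted in samples rather than generations; the function name retained_samples is made up for illustration.

```python
def retained_samples(ngen, samplefreq, burnin):
    """Return (total samples written, samples left after discarding the burn-in)."""
    # Assumption: one sample for the starting state plus one every samplefreq generations.
    total = ngen // samplefreq + 1
    # Assumption: burnin counts samples, not generations.
    kept = max(total - burnin, 0)
    return total, kept

# Values taken from the MrBayes block above.
print(retained_samples(ngen=10000, samplefreq=10, burnin=250))  # (1001, 751)
```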
Alternatively, you can put the MrBayes block in a separate file:
#nexus;
begin mrbayes;
[to ensure that MrBayes does not stop during an analysis to wait for confirmation from the user]
set autoclose=yes nowarn = yes ;
[tell MrBayes the data file name]
execute primates.nex;
[set the evolutionary model to the GTR model with gamma-distributed rate variation across sites]
lset nst = 6 rates = gamma;
[ensure you get at least 1,000 samples from the posterior probability distribution]
mcmcp ngen =10000 samplefreq = 10;
[begin to run MrBayes]
mcmc;
[summarize the parameter values]
sump burnin = 250;
[summarize the trees]
sumt burnin = 250;
[quit automatically when the analysis is done]
quit;
end;
download batch block file only_mbblock.nex
Tips
A good manual and other resources on how to use MrBayes are available at the MrBayes website.
The most often used commands are listed below.
help - used to get a list of the available commands.
help <command> - used to see the help information for, and a description of, that command.
lset - used to define the structure of the model.
prset - used to define probability distributions on the parameters of the model.
showmodel - used to check the current model.
mcmcp - used to set up the analysis.
mcmc - used to start the analysis.
sump - used to summarize samples of substitution model parameters.
sumt - used to summarize samples of trees and branch lengths
Tips
This example is a small and simple one, but I still suggest running it locally (not on Condor) before you submit your job to Condor. If it crashes, fix the problem first; do not wait for Condor to terminate your job abnormally before looking for the problem.
On the Petal Cluster, MrBayes (which has not been relinked against Condor) is also installed. You can use the commands "mb primates_mbbatch.nex" and "mb only_mbblock.nex" to test whether it runs correctly. Then you can use "ctrl+c" to terminate it and delete all the files MrBayes just created. If there is no crash or other problem, run it on Condor.
4) Create your submit description file
Now you can create a Condor submit description file. If you have general questions about how to create a Condor submit description file, please refer to the General Program Tutorial.
########################################
#
# a simple MrBayes job on condor vanilla universe
#
#########################################
InitialDir = /home/u5/users/yanfeng/condor_dir/mb/mb_1_vanilla
Universe = vanilla
Executable = /usr/common/i686-linux/bin/mb
# Use command "which mb" to find out the path of "mb"
Arguments = primates_mbbatch.nex
Requirements = (OpSys =="LINUX" && Arch =="INTEL")
output = primates_mbbatch.output
error = primates_mbbatch.error
log = primates_mbbatch.log
should_transfer_files = YES
WhenToTransferOutput = ON_EXIT_OR_EVICT
transfer_input_files= ./primates_mbbatch.nex
Queue
# If you put the MrBayes block in a separate file, the submit description file looks like this:
# InitialDir = /home/u5/users/yanfeng/condor_dir/mb/mb_1_vanilla
# Universe = vanilla
# Executable = /usr/common/i686-linux/bin/mb
# Arguments = only_mbblock.nex
# Requirements = (OpSys =="LINUX" && Arch =="INTEL")
# output = only_mbblock.output
# error = only_mbblock.error
# log = only_mbblock.log
# should_transfer_files = YES
# WhenToTransferOutput = ON_EXIT_OR_EVICT
# The block file executes primates.nex, so transfer both files
# transfer_input_files= ./only_mbblock.nex, ./primates.nex
# Queue
download submit description file mb_1_vanilla.submit
Tips
Under Unix, Condor presumes a shared file system for vanilla jobs, so normally you do not need to specify InitialDir. However, the automounter on some Condor nodes does not like how Condor references the files on it. This is due to how your shell's pwd command reports paths: some pwd implementations display /home... and some display /a..., depending on your shell. Forcing Condor to use an InitialDir value of /home/...... makes the automounter behave correctly.
Take my account as an example. With the "pwd" command, I find that my program file and submit description file are located at "/a/fs/u5/users/yanfeng/condor_dir/mb/mb_1_vanilla", so I set the InitialDir attribute in my submit file to "/home/u5/users/yanfeng/condor_dir/mb/mb_1_vanilla".
If you have files that need to be transferred, you should also use Condor's file transfer mechanism:
should_transfer_files = YES
when_to_transfer_output = ON_EXIT_OR_EVICT
transfer_input_files = file1,file2
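The InitialDir tip above amounts to rewriting the path that pwd reports. A minimal Python sketch of that rewrite follows. It assumes that the automounter prefix is exactly /a/fs and that it corresponds to /home, as in the example account above; other accounts or file servers may use a different mapping. The function name automounter_safe is made up for illustration.

```python
def automounter_safe(path, real_prefix="/a/fs", home_prefix="/home"):
    """Rewrite an automounter path such as /a/fs/... to its /home/... form."""
    # Only rewrite when the path really starts with the prefix as a whole component.
    if path == real_prefix or path.startswith(real_prefix + "/"):
        return home_prefix + path[len(real_prefix):]
    # Paths that already start with /home (or anything else) are left alone.
    return path

reported = "/a/fs/u5/users/yanfeng/condor_dir/mb/mb_1_vanilla"
print("InitialDir =", automounter_safe(reported))
# InitialDir = /home/u5/users/yanfeng/condor_dir/mb/mb_1_vanilla
```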
5) Submit your job and manage it
Use the condor_submit command to submit your job, and condor_q, condor_rm, etc. to manage it.
| [yanfeng@petal017 mb_1_vanilla]$ condor_submit mb_1_vanilla.submit |
Submitting job(s).
Logging submit event(s).
1 job(s) submitted to cluster 117.
| [yanfeng@petal017 mb_1_vanilla]$ condor_q yanfeng |
-- Submitter: phoenix000.csit.fsu.edu : <144.174.160.169:10945> : phoenix000.csit.fsu.edu
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
117.0 yanfeng 5/26 14:44 0+00:00:00 I 0 0.8 mb primates_mbbatc
1 jobs; 1 idle, 0 running, 0 held
| [yanfeng@petal017 mb_1_vanilla]$ ls -l |
total 238
-rw-r--r-- 1 yanfeng student 1210 May 26 14:44 mb_1_vanilla.submit [submit description file]
-rw-r--r-- 1 yanfeng student 718 May 26 14:46 only_mbblock.nex [MrBayes batch block]
-rw-r--r-- 1 yanfeng student 0 May 26 14:52 primates_mbbatch.error [condor error file]
-rw-r--r-- 1 yanfeng student 698 May 26 14:52 primates_mbbatch.log [condor log file]
-rw-r--r-- 1 yanfeng student 11703 May 26 14:43 primates_mbbatch.nex [data file with batch block]
-rw------- 1 yanfeng student 434 May 26 14:52 primates_mbbatch.nex.con [contains two consensus trees]
-rw------- 1 yanfeng student 121583 May 26 14:52 primates_mbbatch.nex.p [samples of substitution model parameters]
-rw------- 1 yanfeng student 989 May 26 14:52 primates_mbbatch.nex.parts [contains the list of taxon bipartitions, their posterior probabilities and the branch lengths]
-rw------- 1 yanfeng student 67270 May 26 14:52 primates_mbbatch.nex.t [samples of trees and branch lengths]
-rw------- 1 yanfeng student 1005 May 26 14:52 primates_mbbatch.nex.trprobs [contains the trees found during the MCMC search]
-rw-r--r-- 1 yanfeng student 22911 May 26 14:52 primates_mbbatch.output [condor output file]
-rw-r--r-- 1 yanfeng student 11052 May 26 14:38 primates.nex [the original data file]
You can open the log file to check the status of your job:
cat primates_mbbatch.log
000 (117.000.000) 05/26 14:44:44 Job submitted from host: <144.174.160.169:10945>
...
001 (117.000.000) 05/26 14:51:43 Job executing on host: <144.174.160.156:9709>
...
006 (117.000.000) 05/26 14:51:51 Image size of job updated: 8924
...
005 (117.000.000) 05/26 14:52:49 Job terminated.
(1) Normal termination (return value 1)
Usr 0 00:01:01, Sys 0 00:00:00 - Run Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
Usr 0 00:01:01, Sys 0 00:00:00 - Total Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
214192 - Run Bytes Sent By Job
864802 - Run Bytes Received By Job
214192 - Total Bytes Sent By Job
864802 - Total Bytes Received By Job
...
The "Job terminated" event (code 005) means that Condor has finished your job.
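Instead of reading the log by eye, you can scan it for the termination event. The Python sketch below is based only on the line layout visible in the log above (a three-digit event code, the job ID in parentheses, a date, a time, and a message); it is not an official Condor log parser, and the function names job_events and finished are made up for illustration.

```python
import re

# Layout assumed from the log excerpt: "005 (117.000.000) 05/26 14:52:49 Job terminated."
EVENT = re.compile(
    r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\) (\d\d/\d\d) (\d\d:\d\d:\d\d) (.*)$"
)

def job_events(log_text):
    """Return (event code, 'cluster.process', message) for every event header line."""
    events = []
    for line in log_text.splitlines():
        match = EVENT.match(line.strip())
        if match:
            code, cluster, proc, _sub, _date, _time, message = match.groups()
            events.append((code, f"{int(cluster)}.{int(proc)}", message))
    return events

def finished(log_text, job_id):
    """True if the log contains a 'Job terminated' event (code 005) for job_id."""
    return any(code == "005" and jid == job_id
               for code, jid, _ in job_events(log_text))

sample = """000 (117.000.000) 05/26 14:44:44 Job submitted from host: <144.174.160.169:10945>
...
001 (117.000.000) 05/26 14:51:43 Job executing on host: <144.174.160.156:9709>
...
005 (117.000.000) 05/26 14:52:49 Job terminated.
(1) Normal termination (return value 1)
...
"""

print(finished(sample, "117.0"))  # True
```

In practice you would pass the contents of primates_mbbatch.log, for example open("primates_mbbatch.log").read(), instead of the inline sample.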
6) More simple examples
Example: run two MrBayes jobs
Suppose you want to run MrBayes twice (or more) with the same or different parameters to compare the results. You can describe all the runs in one submit description file and have your jobs run on different cluster nodes.
Go to the mb directory and create another directory called mb_2_vanilla. Copy the example data file "primates_mbbatch.nex" to the new directory. Let us use the same parameters and run it twice. You could also put the MrBayes block in a separate file; here we only show the way with the MrBayes block added to the data file.
1) Make two copies of the data file with the MrBayes batch block.
| [yanfeng@petal017 mb_2_vanilla]$ cp primates_mbbatch.nex primates_mbbatch_1.nex |
| [yanfeng@petal017 mb_2_vanilla]$ cp primates_mbbatch.nex primates_mbbatch_2.nex |
2) Your submit description file should look like this (file name mb_2_vanilla.submit)
####################################################
#
# example, run two MrBayes jobs on condor vanilla universe
#
#####################################################
InitialDir = /home/u5/users/yanfeng/condor_dir/mb/mb_2_vanilla
Universe = vanilla
Executable = /usr/common/i686-linux/bin/mb
Requirements = (OpSys =="LINUX" && Arch =="INTEL")
should_transfer_files = YES
WhenToTransferOutput = ON_EXIT_OR_EVICT
transfer_input_files= ./primates_mbbatch_1.nex, ./primates_mbbatch_2.nex
Arguments = primates_mbbatch_1.nex
output = primates_mbbatch_1.output
error = primates_mbbatch_1.error
log = primates_mbbatch_1.log
Queue
Arguments = primates_mbbatch_2.nex
output = primates_mbbatch_2.output
error = primates_mbbatch_2.error
log = primates_mbbatch_2.log
Queue
download here mb_2_vanilla.submit
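When the number of runs grows, writing one Arguments/output/error/log/Queue group per run by hand becomes tedious. The Python sketch below generates a submit description of the same shape as mb_2_vanilla.submit for any list of batch files. The function name submit_description is made up for illustration, and the paths are the ones from the example account above; adjust them to your own account.

```python
def submit_description(initial_dir, executable, batch_files):
    """Build a vanilla-universe submit description with one Queue per batch file."""
    lines = [
        f"InitialDir = {initial_dir}",
        "Universe = vanilla",
        f"Executable = {executable}",
        'Requirements = (OpSys == "LINUX" && Arch == "INTEL")',
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT_OR_EVICT",
        "transfer_input_files = " + ", ".join(batch_files),
    ]
    for name in batch_files:
        # Derive the output/error/log names from the batch file name.
        stem = name[:-len(".nex")] if name.endswith(".nex") else name
        lines += [
            "",
            f"Arguments = {name}",
            f"output = {stem}.output",
            f"error = {stem}.error",
            f"log = {stem}.log",
            "Queue",
        ]
    return "\n".join(lines) + "\n"

text = submit_description(
    "/home/u5/users/yanfeng/condor_dir/mb/mb_2_vanilla",
    "/usr/common/i686-linux/bin/mb",
    ["primates_mbbatch_1.nex", "primates_mbbatch_2.nex"],
)
print(text)
```

Saving the printed text to a file and passing that file to condor_submit should behave like the hand-written version, but check the generated file before submitting.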
3) You can use condor_q yourusername to check the status of your jobs.
| [yanfeng@petal017 mb_2_vanilla]$ condor_q yanfeng |
-- Submitter: phoenix000.csit.fsu.edu : <144.174.160.169:10945> : phoenix000.csit.fsu.edu
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
127.0 yanfeng 5/26 17:04 0+00:00:00 I 0 0.8 mb primates_mbbatc
127.1 yanfeng 5/26 17:04 0+00:00:00 I 0 0.8 mb primates_mbbatc
After the jobs have finished, use ls -l to check the files in your directory.
| [yanfeng@petal017 mb_2_vanilla]$ ls -l |
total 461
-rw-r--r-- 1 yanfeng student 835 May 26 15:24 mb_2_vanilla.submit
-rw-r--r-- 1 yanfeng student 0 May 26 15:32 primates_mbbatch_1.error
-rw-r--r-- 1 yanfeng student 697 May 26 15:32 primates_mbbatch_1.log
-rw-r--r-- 1 yanfeng student 11703 May 26 15:18 primates_mbbatch_1.nex
-rw------- 1 yanfeng student 434 May 26 15:32 primates_mbbatch_1.nex.con
-rw------- 1 yanfeng student 121200 May 26 15:32 primates_mbbatch_1.nex.p
-rw------- 1 yanfeng student 989 May 26 15:32 primates_mbbatch_1.nex.parts
-rw------- 1 yanfeng student 67270 May 26 15:32 primates_mbbatch_1.nex.t
-rw------- 1 yanfeng student 1005 May 26 15:32 primates_mbbatch_1.nex.trprobs
-rw-r--r-- 1 yanfeng student 22923 May 26 15:32 primates_mbbatch_1.output
-rw-r--r-- 1 yanfeng student 0 May 26 15:33 primates_mbbatch_2.error
-rw-r--r-- 1 yanfeng student 698 May 26 15:33 primates_mbbatch_2.log
-rw-r--r-- 1 yanfeng student 11703 May 26 15:18 primates_mbbatch_2.nex
-rw------- 1 yanfeng student 434 May 26 15:33 primates_mbbatch_2.nex.con
-rw------- 1 yanfeng student 121259 May 26 15:33 primates_mbbatch_2.nex.p
-rw------- 1 yanfeng student 865 May 26 15:33 primates_mbbatch_2.nex.parts
-rw------- 1 yanfeng student 67270 May 26 15:33 primates_mbbatch_2.nex.t
-rw------- 1 yanfeng student 597 May 26 15:33 primates_mbbatch_2.nex.trprobs
-rw-r--r-- 1 yanfeng student 22923 May 26 15:33 primates_mbbatch_2.output
-rw-r--r-- 1 yanfeng student 11703 May 26 15:18 primates_mbbatch.nex
Tips
Here every queued job gets its own output file, error file, and log file, each with a different name. Alternatively, the jobs can share one log file, which records the events of all queued jobs, and take their output and error file names from a common base name followed by $(Process), the number of the job within the cluster. The submit description file then looks like this (file name mb_2_vanilla_share.submit).
########################################
#
# example, run two MrBayes jobs on condor vanilla universe
#
#########################################
InitialDir = /home/u5/users/yanfeng/condor_dir/mb/mb_2_vanilla/share
Universe = vanilla
Executable = /usr/common/i686-linux/bin/mb
Requirements = (OpSys =="LINUX" && Arch =="INTEL")
should_transfer_files = YES
WhenToTransferOutput = ON_EXIT_OR_EVICT
transfer_input_files= ./primates_mbbatch_1.nex, ./primates_mbbatch_2.nex
output = primates_mbbatch.output.$(Process)
error = primates_mbbatch.error.$(Process)
log = primates_mbbatch.log
Arguments = primates_mbbatch_1.nex
Queue
Arguments = primates_mbbatch_2.nex
Queue
download here mb_2_vanilla_share.submit
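To see which file names the $(Process) macro produces, the following Python sketch mimics the substitution for a given number of queued jobs. It assumes that process numbers start at 0 and increase by one per Queue statement, as the job IDs 127.0 and 127.1 above suggest; the function name expand_process is made up for illustration.

```python
def expand_process(template, n_jobs):
    """Return the names obtained by replacing $(Process) with 0 .. n_jobs-1."""
    return [template.replace("$(Process)", str(i)) for i in range(n_jobs)]

print(expand_process("primates_mbbatch.output.$(Process)", 2))
# ['primates_mbbatch.output.0', 'primates_mbbatch.output.1']
```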
Or you may want the output file, error file, log file, and MrBayes result files to have the same names for both jobs. Then you need to create two directories (let us call them batch_1 and batch_2) with the command "mkdir batch_1 batch_2" and move your data files there (renaming them to a common file name) with the commands "mv primates_mbbatch_1.nex ./batch_1/primates_mbbatch.nex" and "mv primates_mbbatch_2.nex ./batch_2/primates_mbbatch.nex". Change your submit description file to something like the one below.
########################################
#
# example, run two MrBayes jobs on condor vanilla universe
#
#########################################
InitialDir = /home/u5/users/yanfeng/condor_dir/mb/mb_2_vanilla
Universe = vanilla
Executable = /usr/common/i686-linux/bin/mb
Requirements = (OpSys =="LINUX" && Arch =="INTEL")
output = primates_mbbatch.output
error = primates_mbbatch.error
log = primates_mbbatch.log
should_transfer_files = YES
WhenToTransferOutput = ON_EXIT_OR_EVICT
InitialDir =./batch_1
transfer_input_files= ./primates_mbbatch.nex
Arguments = primates_mbbatch.nex
Queue
InitialDir =./batch_2
transfer_input_files= ./primates_mbbatch.nex
Arguments = primates_mbbatch.nex
Queue
download here mb_2_van.submit
-- YanfengShi - 24 May 2005