---+ *Migrate on the SCS Condor cluster (Standard universe)*
1) Log in to one of the SCS master nodes
SCS has several master nodes in the Condor pool. You may or may not be eligible for access to all of them. Master nodes available for general use include phoenix000 (Phoenix Cluster), tempest01 (Tempest Cluster), and prism01 (Prism Cluster). The following nodes require permission from their respective owners: anfinsen098 (Anfinsen Cluster), petal017 (Petal Cluster), grove000 (Grove Cluster), and waterfall000 (Waterfall Cluster). For more information, please refer to the SCS High Performance Clusters page.
Migrate has not yet been relinked against the Condor library on these clusters, so log in to one of the general-use clusters (phoenix, tempest, or prism) and relink it yourself in a directory of your choice.
2) Relink Migrate against the condor library
Before you can run Migrate in the Condor standard universe, you need to relink it against the Condor library. The General Program Tutorial and the Condor Project website explain how to relink a program against the Condor library, or you can follow the steps below.
a) Download the source for your operating system from Dr. Beerli's Migrate website.
b) Assuming a Linux (Intel) operating system, unpack the archive with
gunzip -c migrate-2.0.3.src.tar.gz | tar xf - or use tar xvfz migrate-2.0.3.src.tar.gz
[this creates a directory "migrate-2.0.3" with subdirectories "src", and "examples" in it.]
c) cd migrate-2.0.3/src
d) Type "./configure". This creates the Makefile.
e) Use the "condor_compile make" command to relink it. (It took about 2 minutes on my computer.) It prints messages such as "LINKING FOR CONDOR : ........" on the screen.
f) Use "ls -l" to check your directory; you will find an executable file named "migrate-n" there.
g) Run it with the "./migrate-n" command and you will see something like the output below.
[yanfeng@phoenix000 src]$ ./migrate-n
Condor: Notice: Will checkpoint to ./migrate-n.ckpt
Condor: Notice: Remote system calls disabled.
=============================================
MIGRATION RATE AND POPULATION SIZE ESTIMATION
using Markov Chain Monte Carlo simulation
=============================================
Version 2.0.6
Program started at Thu Jun 2 18:52:54 2005
Settings for this run:
D Data type currently set to: DNA sequence model
I Input/Output formats
P Parameters [start, migration model]
S Search strategy
W Write a parmfile
Q Quit the program
Are the settings correct?
(Type Y or the letter for one to change)
To leave the program, just type quit or q and you should get the Unix prompt back.
h) It is not convenient to use the relinked program from this directory, so copy it to a directory that is easy to find, or create a link to it there. In my home directory I created a directory called "condor_dir", and inside it a directory called "mig". I copied the relinked program there under the new name "condor_mig" to remind me that it is the relinked (Condor) version of Migrate. To do the same, return to the directory that holds your relinked Migrate (migrate-n) and run: cp migrate-n ~/condor_dir/mig/condor_mig
i) Alternatively, create a symbolic link to your relinked Migrate. From the same directory, run: ln -s "$PWD/migrate-n" ~/condor_dir/mig/condor_mig
The link target must be an absolute path, because a relative target is resolved from the directory that contains the link, not from the directory where you ran the command. (Use "man ln" to learn more about the ln command.)
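Steps b) through h) can be collected in one small shell function. This is only a sketch under stated assumptions: the function name relink_migrate is made up for illustration, the source archive from step a) is already in the current directory, and condor_compile is on your PATH (which is normally the case only on a master node).

```shell
#!/bin/sh
# Sketch of steps b) to h): unpack, configure, relink with condor_compile, copy.
# relink_migrate is a hypothetical helper, not part of Migrate or Condor.
relink_migrate() {
    tarball=${1:-migrate-2.0.3.src.tar.gz}   # source archive from step a)
    dest=${2:-$HOME/condor_dir/mig}          # where the relinked binary goes

    command -v condor_compile >/dev/null 2>&1 || {
        echo "condor_compile not found; log in to a master node first" >&2
        return 1
    }
    [ -f "$tarball" ] || { echo "missing $tarball" >&2; return 1; }

    gunzip -c "$tarball" | tar xf - || return 1        # step b)
    (
        cd "${tarball%.src.tar.gz}/src" || exit 1      # step c)
        ./configure || exit 1                          # step d)
        condor_compile make || exit 1                  # step e)
        mkdir -p "$dest" || exit 1
        cp migrate-n "$dest/condor_mig"                # step h)
    )
}
```

A typical call would be: relink_migrate migrate-2.0.3.src.tar.gz ~/condor_dir/mig. The function stops at the first failing step, so a broken build never leaves a stale condor_mig behind.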
3) Prepare your Migrate input file (and parmfile)
Before we start practicing Migrate on Condor, it is useful to set up a working directory. You need to know three simple terminal commands: mkdir, cd, and cp. If you have problems with them, use man mkdir, man cd, and man cp to study them first.
Now create a directory called condor_dir in your home directory. Go to that directory and create another directory called mig. (Omit this step if you have already done it above.)
There are several input data types, such as electrophoretic marker data, microsatellite data, and sequence data. We will practice on sequence data. You can find the data in the examples directory of the Migrate package, or download it below. Save it to your mig directory. Go to the mig directory and create a new directory called mig_1_standard. Copy the data file to the mig_1_standard directory. To see the contents of the file, use the command "cat infile.seq".
4 2 Example: sequence data set wit two loci [simulated data]
145 345
25 Africa
0BAA AAAGCTTTGGAAAAACATTGGACGCAAGGATACGG ......
0BAN AAAGCTTTGGAAAAACATTCGAGGCAAGGATACGG ......
......
25 Americas
1BAX AAAGCTTTGGAAAAACATTCGAGGCAAGGATACGG ......
1BAK AAAGCTTTGGAAAAACATTCGAGGCAAGGATACGG ......
......
25 Pacific
2BAP AAAGCTTTGGAAAAACATTGGACGCAAGGATACGG ......
2BAQ AAAGCTTTGGAAAAACATTGGACGCAAGGATACGG ......
......
25 Asia
3BAH AAAGCTTTGGAAAAACATTGGACGCAAGGATACG ......
3BAR AAAGCTTTGGAAAAACATTGGACGCAAGGATACG ......
......
download infile.seq
I will not explain the data file format here. Please read the chapter "Data file specification" in the documentation, which can be downloaded from Dr. Beerli's website.
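The directory setup described in this step can be done in a few commands. The sketch below builds the layout under a scratch directory so that it can be tried safely; the variable names base and work are made up for illustration, and the empty infile.seq is only a stand-in for the real data file.

```shell
#!/bin/sh
# Build the tutorial's working layout under a scratch directory.
# Assumption: base is a throwaway location; use base="$HOME" for the real layout.
base=$(mktemp -d)
work="$base/condor_dir/mig/mig_1_standard"

mkdir -p "$work"     # creates condor_dir, mig and mig_1_standard in one go

# Empty stand-in for the data file; the real infile.seq comes from the
# Migrate examples directory or from the download link.
touch "$base/condor_dir/mig/infile.seq"

cp "$base/condor_dir/mig/infile.seq" "$work/"   # copy the data into the job directory
ls -l "$work"
```

Using mkdir -p avoids the separate mkdir and cd steps and does not fail if some of the directories already exist.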
Besides the input data file, the analysis may use some other files, such as parmfile, geofile, and seedfile. The parmfile deserves special mention because it holds the menu options. Without it, Migrate displays a menu in which you can change all the relevant options. With it, you edit the parmfile to choose the options instead of making the changes in the menu. It is much like a MrBayes or PAUP batch block in a Nexus file.
So what does a parmfile look like? There are several parmfiles in the Migrate package, or you can download one below. Use the command cat parmfile.seq to view it.
#########################################################
# Parmfile for Migrate 0.9.1
# generated automatically on
# 04/13/00 12:39:45
#
# please report problems to Peter beerli
# email: beerli@genetics.washington.edu
# http://evolution.genetics.washington.edu/lamarc.html
#########################################################
# General options ---------------------------------------
nmlength=10
# data options ------------------------------------------
datatype=SequenceData
# Sequence data options------------
ttratio=0.600000 15.000000
freqs-from-data=YES
categories=1
rates=1:1.000000
prob-rates=1:1.000000
autocorrelation=NO
weights=NO
interleaved=NO
usertree=NO
distfile=NO
# input/output options ----------------------------------
menu=NO # Change YES to NO
# input formats
infile=infile.seq
random-seed=AUTO #OWN:955654772
# output formats
progress=YES
print-data=NO
outfile=outfile.seq
plot=NO
profile=ALL:FAST
print-tree=NONE
mathfile=mathfile
write-summary=NO
# likelihood-ratio test
# parameter options ------------------------------------
theta=FST
migration=FST
mutation=NOGAMMA
fst-type=THETA
custom-migration={**}
# search strategies ------------------------------------
short-chains=10
short-inc=20
short-sample=500
long-chains=3
long-inc=20
long-sample=5000
#obscure options
burn-in=10000
heating=NO
moving-steps=NO
long-chain-epsilon=100.000000
gelman-convergence=No
replicate=Yes:LastChains
end
download the parmfile.seq here
For the meaning of the options, please refer to the chapter "Menu and Options" in the documentation, which can be downloaded from Dr. Beerli's website; I cannot explain them better than that. Make sure that "menu" is set to "NO" so that the program does not show the menu and instead runs directly with the options set in the parmfile. The default is "YES". Also set "infile" to the name of your data file so that Migrate can find it. The default name is "infile". (Alternatively, rename your data file to "infile".)
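If you prefer not to open an editor, the two settings mentioned here can be changed with sed. This is a minimal sketch: it writes a tiny stand-in parmfile to a temporary file so the commands can be tried in isolation, and the variable name parm is made up for illustration. On a real run, skip the printf line and point parm at your own parmfile.

```shell
#!/bin/sh
# Sketch: force menu=NO and point infile at the data file in a parmfile.
# Assumption: the three-line file below is a stand-in, not a complete parmfile.
parm=$(mktemp)
printf 'menu=YES\ninfile=infile\noutfile=outfile\n' > "$parm"

# Rewrite only the two lines that matter; every other line is copied unchanged.
sed -e 's/^menu=.*/menu=NO/' \
    -e 's/^infile=.*/infile=infile.seq/' "$parm" > "$parm.new" &&
    mv "$parm.new" "$parm"

grep -E '^(menu|infile)=' "$parm"   # show the result
```

Note that the substitution replaces the whole line, so a trailing comment on the menu or infile line is dropped.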
There are two ways to get a parmfile. The first is to edit one of the example parmfiles in the program package. The second, which I prefer, is to run Migrate (the local version or the Condor version) and set (1) Data type, (2) Input/Output formats, (3) Parameters, and (4) Search strategy as you want them. Then choose the option "W" (write a parmfile) and you will see "+++ Parmfile written to current directory +++" on the screen. Quit Migrate with "Q", check your directory with the "ls" command, and you will find a newly created file called "parmfile". (If you already have a file named "parmfile", it will be overwritten.) The parmfile written this way contains more detailed explanations of the options than the example parmfile above.
Before you submit your job to Condor, I suggest running it locally first to see whether it crashes immediately or soon after it starts. Make sure it runs correctly for a while, then stop it with Ctrl+C and delete all the files the program just created. After that, submit your job to Condor.

Tips
The README file in the Migrate package will tell you how to install Migrate locally. Run it with "./migrate-n parmfile.seq" to see if the program will crash or not.
4) Create your submit description file
Now you can create a condor submit description file. If you have general questions about how to create a condor submit description file, please refer to the
General Program Tutorial .
########################################
#
# a simple Migrate job on condor standard universe
#
#########################################
Universe = standard
Executable = /home/u5/users/yanfeng/condor_dir/mig/condor_mig
Arguments = parmfile.seq
output = condor_mig_seq.output
error = condor_mig_seq.error
log = condor_mig_seq.log
Queue
download submit description file mig_1_standard.submit
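Two things commonly go wrong with a submit description file like this one: the Executable path is wrong, or the parmfile named in Arguments is not in the submit directory. The sketch below checks both before you submit. The function check_submit is a made-up helper, not a Condor tool, and it assumes the simple layout of this example, where the parmfile sits next to the submit file and no InitialDir is used.

```shell
#!/bin/sh
# Pre-flight check for a simple submit description file.
# check_submit is a hypothetical helper, not part of Condor.
check_submit() {
    submit=$1
    dir=$(dirname "$submit")
    status=0

    # Value of the first "Executable = ..." line, with surrounding blanks removed.
    exe=$(awk -F= 'tolower($1) ~ /^[ \t]*executable[ \t]*$/ {
              v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v); print v; exit }' "$submit")
    if [ ! -x "$exe" ]; then
        echo "executable missing or not executable: $exe"
        status=1
    fi

    # Every word on an "Arguments = ..." line is treated as a parmfile name.
    for parmfile in $(awk -F= 'tolower($1) ~ /^[ \t]*arguments[ \t]*$/ { print $2 }' "$submit"); do
        if [ ! -f "$dir/$parmfile" ]; then
            echo "parmfile not found: $dir/$parmfile"
            status=1
        fi
    done
    return "$status"
}
```

For example, check_submit mig_1_standard.submit prints nothing and returns 0 when both files are in place, and otherwise prints one line per problem.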
5) Submit your job and manage it
Use the condor_submit command to submit your job, and use condor_q, condor_rm, and related commands to manage it.
| [yanfeng@phoenix000 mig_1_standard]$ condor_submit mig_1_standard.submit |
Submitting job(s).
Logging submit event(s).
1 job(s) submitted to cluster 215.
| [yanfeng@phoenix000 mig_1_standard]$ condor_q yanfeng |
-- Submitter: phoenix000.csit.fsu.edu : <144.174.160.169:11226> : phoenix000.csit.fsu.edu
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
215.0 yanfeng 6/6 11:39 0+00:00:00 I 0 1.8 condor_mig parmfil
1 jobs; 1 idle, 0 running, 0 held
| [yanfeng@phoenix000 mig_1_standard]$ ls -l |
total 156
-rw-r--r-- 1 yanfeng student 0 Jun 6 11:38 condor_mig_seq.error [condor error file]
-rw-r--r-- 1 yanfeng student 641 Jun 6 11:58 condor_mig_seq.log [condor log file]
-rw-r--r-- 1 yanfeng student 11418 Jun 6 11:58 condor_mig_seq.output [condor output file]
-rwxr--r-- 1 yanfeng student 51322 Jun 6 11:35 infile.seq [migrate infile]
-rw-r--r-- 1 yanfeng student 8293 Jun 6 11:58 logfile.seq [migrate log file]
-rw-r--r-- 1 yanfeng student 369 Jun 6 11:38 mig_1_standard.submit [condor submit file]
-rw-r--r-- 1 yanfeng student 62315 Jun 6 11:58 outfile.seq [migrate outfile]
-rw-r--r-- 1 yanfeng student 19908 Jun 6 11:35 parmfile.seq [migrate parmfile]
You can open the log file to check the status of your job.
cat condor_mig_seq.log
000 (215.000.000) 06/06 11:39:08 Job submitted from host: <144.174.160.169:11226>
...
001 (215.000.000) 06/06 11:40:57 Job executing on host: <144.174.160.202:9711>
...
005 (215.000.000) 06/06 11:58:54 Job terminated.
(1) Normal termination (return value 0)
Usr 0 00:04:37, Sys 0 00:01:30 - Run Remote Usage
Usr 0 00:00:43, Sys 0 00:01:11 - Run Local Usage
Usr 0 00:04:37, Sys 0 00:01:30 - Total Remote Usage
Usr 0 00:00:43, Sys 0 00:01:11 - Total Local Usage
50973144 - Run Bytes Sent By Job
4090667520 - Run Bytes Received By Job
50973144 - Total Bytes Sent By Job
4090667520 - Total Bytes Received By Job
...
The "Normal termination (return value 0)" line means that Condor finished your job without error.
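When a log covers several jobs, reading it by eye becomes tedious. The following sketch counts the relevant events with grep. It assumes the text log format shown above, where lines starting with 000 record a submission and lines starting with 005 record a termination; the function name summarise_log is made up for illustration.

```shell
#!/bin/sh
# Sketch: summarise a Condor user log by counting submit and termination events.
# summarise_log is a hypothetical helper; it only reads the log file.
summarise_log() {
    log=$1
    submitted=$(grep -c '^000 ' "$log" || true)
    terminated=$(grep -c '^005 ' "$log" || true)
    ok=$(grep -c 'Normal termination (return value 0)' "$log" || true)
    echo "submitted=$submitted terminated=$terminated ok=$ok"
    # Succeed only if at least one job was submitted and all of them ended normally.
    [ "$submitted" -gt 0 ] && [ "$submitted" -eq "$ok" ]
}
```

Running summarise_log condor_mig_seq.log prints one summary line and returns a non-zero status while jobs are still pending or if any job ended abnormally.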
6) More simple examples
Example: run two Migrate jobs
Go to the mig directory and create another directory called mig_2_standard. Copy the example data file "infile.seq" to the new directory. Suppose we want to change the random number seed to check whether different seeds give similar results. You can use the parmfile.seq above. Use the "cp" and "mv" commands to create two copies of the parmfile named parmfile.seq.1 and parmfile.seq.2 ("cp parmfile.seq parmfile.seq.1" and "mv parmfile.seq parmfile.seq.2"). Open them with vi or another editor. In parmfile.seq.1, change "random-seed=AUTO" to "random-seed=OWN:279454829", and in parmfile.seq.2, change "random-seed=AUTO" to "random-seed=OWN:279454825". (The seed can be any number, but one of the form 4n+1 is preferred.) Also, in parmfile.seq.1, change the output file name from "outfile.seq" to "outfile.seq.1" and the log file name from "logfile.seq" to "logfile.seq.1". Make the corresponding changes in parmfile.seq.2 so that the two jobs do not overwrite each other's files.
download here parmfile.seq.1
download here parmfile.seq.2
download here infile.seq
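The same edits can be scripted instead of made by hand in an editor. This is a sketch only: make_parmfile is a made-up helper, and it assumes the template parmfile has lines beginning with random-seed=, outfile= and logfile=. A line that is absent from the template is simply left out of the copy as well.

```shell
#!/bin/sh
# Sketch: derive a per-run parmfile that differs only in seed and output names.
# make_parmfile is a hypothetical helper; it writes <template>.<n>.
make_parmfile() {
    template=$1   # e.g. parmfile.seq
    n=$2          # run number, used as the file-name suffix
    seed=$3       # random number seed for this run
    sed -e "s/^random-seed=.*/random-seed=OWN:$seed/" \
        -e "s/^outfile=.*/outfile=outfile.seq.$n/" \
        -e "s/^logfile=.*/logfile=logfile.seq.$n/" \
        "$template" > "$template.$n"
}
```

With parmfile.seq in the current directory, the two files of this example would be produced by make_parmfile parmfile.seq 1 279454829 and make_parmfile parmfile.seq 2 279454825. Unlike the cp and mv approach, this leaves the original parmfile.seq in place.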
2) Your submit description file should look something like the one below (file name mig_2_standard.submit)
########################################
#
# two simple Migrate job on condor standard universe
#
#########################################
Universe = standard
Executable = /home/u5/users/yanfeng/condor_dir/mig/condor_mig
output = condor_mig_seq.output.$(Process)
error = condor_mig_seq.error.$(Process)
log = condor_mig_seq.log
Arguments = parmfile.seq.1
Queue
Arguments = parmfile.seq.2
Queue
download here mig_2_standard.submit
3) You can use condor_q yourusername to check the status of your jobs. After they have finished, use ls -l to check the files in your directory.
| [yanfeng@phoenix000 mig_2_standard]$ condor_q yanfeng |
-- Submitter: phoenix000.csit.fsu.edu : <144.174.160.169:11226> : phoenix000.csit.fsu.edu
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
217.0 yanfeng 6/6 11:50 0+00:25:13 R 0 1.8 condor_mig parmfil
217.1 yanfeng 6/6 11:50 0+00:22:09 R 0 1.8 condor_mig parmfil
2 jobs; 0 idle, 2 running, 0 held
| [yanfeng@phoenix000 mig_2_standard]$ ls -l |
total 248
-rw-r--r-- 1 yanfeng student 0 Jun 6 11:50 condor_mig_seq.error
-rw-r--r-- 1 yanfeng student 1283 Jun 6 12:26 condor_mig_seq.log
-rw-r--r-- 1 yanfeng student 11412 Jun 6 12:25 condor_mig_seq.output
-rwxr--r-- 1 yanfeng student 51322 Jun 3 16:27 infile.seq
-rw-r--r-- 1 yanfeng student 8287 Jun 6 12:20 logfile.seq.1
-rw-r--r-- 1 yanfeng student 8287 Jun 6 12:25 logfile.seq.2
-rw-r--r-- 1 yanfeng student 651 Jun 6 11:49 mig_2_standard.submit
-rw-r--r-- 1 yanfeng student 62451 Jun 6 12:20 outfile.seq.1
-rw-r--r-- 1 yanfeng student 62442 Jun 6 12:25 outfile.seq.2
-rw-r--r-- 1 yanfeng student 19906 Jun 3 16:40 parmfile.seq.1
-rw-r--r-- 1 yanfeng student 19906 Jun 3 16:41 parmfile.seq.2

Tips
Sometimes you may want the same outfile and logfile names for your two Condor jobs. In that case, create two directories so that the jobs do not overwrite each other's files. Create two directories called "parm1" and "parm2". Copy one parmfile (parmfile.seq.1 or parmfile.seq.2) and the infile into each directory. Change the output file name back to "outfile.seq" and the log file name back to "logfile.seq" in both parmfiles, but keep the seeds different. (Because these parmfile names conflicted with those of the example above when I uploaded the files to this webpage, I renamed them "parmfile.seq.a" and "parmfile.seq.b".)
-----|
     |- mig_2_standard_same.submit (file)
     |- parm1 (folder)
     |    |- infile.seq (file)
     |    |- parmfile.seq.a (file)
     |
     |- parm2 (folder)
          |- infile.seq (file)
          |- parmfile.seq.b (file)
download here parmfile.seq.a
download here parmfile.seq.b
download here infile.seq
Then the submit description file will look like the one below (file name mig_2_standard_same.submit).
########################################
#
# 2 simple Migrate job on condor standard universe
#
#########################################
Universe = standard
Executable = /home/u5/users/yanfeng/condor_dir/mig/condor_mig
output = condor_mig_seq.output
error = condor_mig_seq.error
log = condor_mig_seq.log
InitialDir = /home/u5/users/yanfeng/condor_dir/mig/mig_2_standard/two/parm1
Arguments = parmfile.seq.a
Queue
InitialDir = /home/u5/users/yanfeng/condor_dir/mig/mig_2_standard/two/parm2
Arguments = parmfile.seq.b
Queue
download here mig_2_standard_same.submit
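With more than two run directories, typing one InitialDir stanza per job gets repetitive. The sketch below prints a submit description file of the same shape from a list of directory and parmfile pairs. The function write_submit and its dir:parmfile argument format are made up for illustration; the submit commands it emits are the ones used in the file above.

```shell
#!/bin/sh
# Sketch: print a submit file with one InitialDir/Arguments/Queue stanza per run.
# write_submit is a hypothetical helper; each extra argument is "dir:parmfile".
write_submit() {
    exe=$1   # full path of the relinked Migrate binary
    top=$2   # directory that contains the run directories
    shift 2
    echo "Universe   = standard"
    echo "Executable = $exe"
    echo "output     = condor_mig_seq.output"
    echo "error      = condor_mig_seq.error"
    echo "log        = condor_mig_seq.log"
    for pair in "$@"; do
        echo "InitialDir = $top/${pair%%:*}"   # part before the colon
        echo "Arguments  = ${pair#*:}"         # part after the colon
        echo "Queue"
    done
}
```

For the two-directory case, redirect the output into the submit file: write_submit ~/condor_dir/mig/condor_mig "$PWD" parm1:parmfile.seq.a parm2:parmfile.seq.b > mig_2_standard_same.submit. Because output, error, and log are relative names, Condor places them inside each job's InitialDir, which is why the two jobs do not collide.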
After the jobs have finished, you will find files like those below.
-----|
     |- mig_2_standard_same.submit (file)
     |- parm1 (folder)
     |    |- condor_mig_seq.error (file)
     |    |- condor_mig_seq.log (file)
     |    |- condor_mig_seq.output (file)
     |    |- infile.seq (file)
     |    |- logfile.seq (file)
     |    |- outfile.seq (file)
     |    |- parmfile.seq.a (file)
     |
     |- parm2 (folder)
          |- condor_mig_seq.error (file)
          |- condor_mig_seq.log (file)
          |- condor_mig_seq.output (file)
          |- infile.seq (file)
          |- logfile.seq (file)
          |- outfile.seq (file)
          |- parmfile.seq.b (file)
7) About parallel Migrate
Page 60 of the Migrate documentation has a tutorial on how to run parallel Migrate. One way is to use the standard Message Passing Interface (MPI); we will not discuss it here. The other way is by hand: instead of securing several computers for the analysis, we can use Condor to run several jobs. This is very similar to the example above (running two jobs), so you can practice it yourself.
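As a starting point for that practice, the sketch below prepares any number of independent runs at once. It is an illustration under stated assumptions, not the procedure from the Migrate documentation: prepare_runs is a made-up helper, the template parmfile is assumed to contain random-seed=, outfile= and logfile= lines, and the seeds are simply successive numbers of the form 4n+1 starting from an arbitrary n of 1000. It relies on Condor's $(Process) macro, which expands to 0, 1, 2 and so on for the jobs created by a single Queue statement.

```shell
#!/bin/sh
# Sketch: prepare N runs that differ only in their random seed, and print a
# submit file that queues all of them. prepare_runs is a hypothetical helper.
prepare_runs() {
    template=$1   # e.g. parmfile.seq
    count=$2      # number of runs
    exe=$3        # full path of the relinked Migrate binary
    i=0
    while [ "$i" -lt "$count" ]; do
        seed=$(( 4 * (1000 + i) + 1 ))     # 4n+1, with an arbitrary starting n
        sed -e "s/^random-seed=.*/random-seed=OWN:$seed/" \
            -e "s/^outfile=.*/outfile=outfile.seq.$i/" \
            -e "s/^logfile=.*/logfile=logfile.seq.$i/" \
            "$template" > "$template.$i" || return 1
        i=$(( i + 1 ))
    done
    # The backslashes keep $(Process) literal so that Condor, not the shell, expands it.
    cat <<EOF
Universe   = standard
Executable = $exe
output     = condor_mig_seq.output.\$(Process)
error      = condor_mig_seq.error.\$(Process)
log        = condor_mig_seq.log
Arguments  = $(basename "$template").\$(Process)
Queue $count
EOF
}
```

For five runs you might call prepare_runs parmfile.seq 5 ~/condor_dir/mig/condor_mig > mig_n_standard.submit (the submit file name is arbitrary) and then submit that file with condor_submit. Note that the parmfile suffixes start at 0 here, to match the numbering of $(Process).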
--
YanfengShi - 09 May 2005