Elementolab/Job submission on Panda
- Access the ElementoLab 300TB /zenodotus file system
Create a tunnel to a machine that mounts /zenodotus
ssh -l username -L 9999:paris.pbtech:22 -N descartes.med.cornell.edu
In a new window, you can now scp/ssh through the tunnel:
scp -P 9999 username@localhost:/zenodotus/abc/quota* .
ssh -p 9999 username@localhost
- (Important!!) Managing large files
If you want to use unzip/gunzip, wget, tar, ftp, sftp, or rsync on large files, run them on one of the four dedicated machines (boreas, notos, zephyros, and euros). Please do not run them on the Panda login node.
[From PBtech:] we ask that all I/O intensive operations be done on the "four winds" machines (boreas, notos, zephyros, and euros). This includes operations such as unzip/gunzip, wget, tar, and rsync. All file systems that are mounted on panda are available on these machines as well. If you are unable to log into them, please email pbtech@med.cornell.edu, and we will grant you access. Please use the panda login node for sge-specific tasks (qsub, qstat, etc.) only.
- SGE documentation (PBtech) http://pbtech.med.cornell.edu/pbwiki/index.php/Sun_Grid_Engine
Note: currently, the max number of jobs allowed per user is 400, and this includes jobs waiting in the queue. To submit a large number of jobs, use the array job option (-t).
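An array job can be sketched as follows; the script name (chunk.sh) and the echoed command are hypothetical placeholders for your per-task work:

```shell
#!/bin/bash
#$ -j y
#$ -cwd
# Submitted with "qsub -t 1-100 chunk.sh", SGE queues 100 tasks under a
# single job ID and sets $SGE_TASK_ID (1..100) in each task's environment.
echo "Processing chunk $SGE_TASK_ID"
```

Each task can then use $SGE_TASK_ID to pick its own slice of the input (e.g. a per-task input file).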
- Interdependency between SGE jobs
Sometimes you want one job to start only after other prerequisite jobs have finished. You can arrange this with the qsub and qalter commands. E.g., you want to summarize your BWA (qBWA.sh) and BLAST (qBLAST.sh) results using qSum.sh. You can do:
qsub -N qBWA.sh qBWA.sh # submit qBWA.sh and give it a job name (-N)
qsub -N qBLAST.sh qBLAST.sh # submit qBLAST.sh and give it a job name
qsub -h -N qSum.sh qSum.sh # submit qSum.sh but put it on hold (-h)
qalter -hold_jid qBLAST.sh,qBWA.sh qSum.sh # change the dependency requirement for qSum.sh so it starts only when qBLAST.sh and qBWA.sh are finished
qalter -h U qSum.sh # remove the user hold so qSum.sh can run
Note: if qSum.sh is an array job, use -hold_jid_ad instead of -hold_jid.
- How to log into a node
qrsh -l h_vmem=4g -l h_rt=8:00:00 -now yes
qrsh -l h_vmem=16g -l h_rt=8:00:00 -pe smp 2 -now no
where:
-l h_vmem=16g # means you request 16GB of RAM
-l h_rt=8:00:00 # means you request 8h of runtime (wall clock)
-pe smp 2 # means you need 2 CPUs
- Basic script parameters
#! /bin/bash -l
#$ -j y
#$ -cwd
#$ -m a
#$ -M ole2001@med.cornell.edu
#$ -N ole_job
#$ -l h_rt=20:00:00 # runtime (20h here); jobs that exceed their runtime will be killed
#$ -pe smp 10
date
cp -v $SGE_O_WORKDIR/wg.fa $TMPDIR
# ... insert as many commands as needed ... this is basically like a bash script
Once you have created a script, e.g. test.sh, simply use the following command to execute it:
qsub test.sh
- Using PBS.pm
use PBS;
my $pbs = PBS->new;
$pbs->setPlatform("panda");
$pbs->setScriptName("$ARGV[1].alnjob");
$pbs->setWallTime("4:00:00");
$pbs->setNumCPUs(24);
$pbs->setEmail("ole2001@med.cornell.edu");
$pbs->addCmd("cd $ENV{PWD}");
$pbs->addCmd("cp -v \$SGE_O_WORKDIR/$ARGV[1] \$TMPDIR");
$pbs->addCmd("...");
$pbs->submit; # alternatives: $pbs->print; or $pbs->execute;
- Basic commands
qstat # show the status of your jobs (e.g. if your job is running or waiting in the queue)
qstat -u "*" # show jobs from all users
qstat -g c # show an overview of the cluster queues
qhold [job id] # put a pending job on hold
qconf -sc # show the resource ("complex") configuration
- Useful combinations:
# show usage by user
qstat -u "*"|grep "^[ 0-9]" |sed -r 's/\s+/\t/g'|cut -f 5,6|sort|uniq -c|sort -rn
# remove all of your jobs named "job name"
qstat |grep "job name"|awk '{print "qdel " $1}'|sh
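To see what the "usage by user" pipeline extracts, here is a sketch run on mock qstat output (the job lines below are invented; real qstat output also has header lines, which the grep "^[ 0-9]" step filters out):

```shell
# After sed collapses each whitespace run into a tab, field 5 holds the user
# and field 6 the job state, so cut/sort/uniq count jobs per (user, state) pair.
printf '%s\n' \
  '  101 0.55500 qBWA.sh    alice   r    01/01 10:00:00 all.q@node1  1' \
  '  102 0.55500 qBWA.sh    alice   r    01/01 10:00:00 all.q@node2  1' \
  '  103 0.00000 qSum.sh    bob     qw   01/01 10:05:00              1' \
  | sed -r 's/\s+/\t/g' | cut -f 5,6 | sort | uniq -c | sort -rn
```

This prints one count per user/state combination: here, two running (r) jobs for alice and one queued (qw) job for bob.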