Elementolab/Job submission on Panda

From Icbwiki
  • Accessing the ElementoLab 300 TB /zenodotus file system

Create a tunnel to a machine that mounts /zenodotus

ssh -l username -L 9999:paris.pbtech:22 -N descartes.med.cornell.edu

In a new window, you can now scp/ssh

scp -P 9999 username@localhost:/zenodotus/abc/quota* .
  • (Important!) Managing large files

If you want to use unzip/gunzip, wget, tar, ftp, sftp, or rsync on large files, run them on one of the four dedicated machines (boreas, notos, zephyros, and euros). Please do not run them on the Panda login node.

[From PBtech:] we ask that all I/O intensive operations be done on the "four winds" machines (boreas, notos, zephyros, and euros). This includes operations such as unzip/gunzip, wget, tar, and rsync. All file systems that are mounted on panda are available on these machines as well. If you are unable to log into them, please email pbtech@med.cornell.edu, and we will grant you access. Please use the panda login node for sge-specific tasks (qsub, qstat, etc.) only.


Note: currently, the maximum number of jobs allowed per user is 400, including jobs waiting in the queue. To submit a large number of jobs, use the array job option (-t).
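An array job counts as a single job even though it runs many tasks. A minimal sketch of the pattern (the script name qarray.sh and the sample list are made up for illustration): submit with "qsub -t 1-3 qarray.sh", and SGE runs one task per value of $SGE_TASK_ID.

```shell
#!/bin/bash
#$ -cwd
# qarray.sh -- illustrative array-job script; submit with: qsub -t 1-3 qarray.sh
TASK=${SGE_TASK_ID:-1}                                # SGE sets SGE_TASK_ID; default to 1 for local testing
printf 'sample_a\nsample_b\nsample_c\n' > samples.txt # stand-in task list, one input per line
SAMPLE=$(sed -n "${TASK}p" samples.txt)               # each task handles one line of the list
echo "task ${TASK}: processing ${SAMPLE}"
```

Each task then processes a different input file, so 3 (or 300) inputs need only one qsub.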


  • Interdependency between SGE jobs
Sometimes you want one job to start only after other prerequisite jobs have finished. You can arrange this with the qsub and qalter commands. For example, to summarize your BWA (qBWA.sh) and BLAST (qBLAST.sh) results with qSum.sh:

qsub -N qBWA.sh qBWA.sh                                        # submit qBWA.sh and give it a job name (-N)
qsub -N qBLAST.sh qBLAST.sh                                    # submit qBLAST.sh and give it a job name
qsub -h -N qSum.sh qSum.sh                                     # submit qSum.sh but put it on hold (-h)
qalter -hold_jid qBLAST.sh,qBWA.sh qSum.sh                     # make qSum.sh start only after qBLAST.sh and qBWA.sh have finished
qalter -h U qSum.sh                                            # release the hold so qSum.sh can run once its dependencies finish
Note: if qSum.sh is an array job, use -hold_jid_ad instead of -hold_jid .
  • How to log into a node
qrsh -l h_vmem=4g -l h_rt=8:00:00 -now yes
qrsh -l h_vmem=16g -l h_rt=8:00:00 -pe smp 2 -now no

where:

-l h_vmem=16g   # request 16 GB of RAM (per slot)
-l h_rt=8:00:00 # request 8 hours of wall-clock runtime
-pe smp 2       # request 2 CPU slots (cores)
-now yes        # start immediately or fail; -now no waits in the queue instead


  • Basic script parameters
#! /bin/bash -l
#$ -j y                         # merge stderr into stdout
#$ -cwd                         # run the job in the directory it was submitted from
#$ -m a                         # send email if the job aborts
#$ -M ole2001@med.cornell.edu   # email address for notifications
#$ -N ole_job                   # job name
#$ -l h_rt=20:00:00             # runtime limit (20 h here); jobs that exceed it are killed
#$ -pe smp 10                   # request 10 CPU slots
date
cp -v $SGE_O_WORKDIR/wg.fa $TMPDIR   # stage input to node-local scratch
# ... insert as many commands as needed ... the rest is an ordinary bash script

Once you have created a script, e.g. test.sh, submit it with:

qsub test.sh
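Copying input to $TMPDIR (node-local scratch) as in the script above keeps heavy I/O off the shared file system; results written there should be copied back before the job ends. A minimal sketch of the full pattern, with fallbacks so it also runs outside SGE (the file names and the tr command standing in for real work are illustrative):

```shell
#!/bin/bash
TMPDIR=${TMPDIR:-$(mktemp -d)}               # SGE sets TMPDIR on the compute node; fall back for local testing
SGE_O_WORKDIR=${SGE_O_WORKDIR:-$PWD}         # SGE sets this to the submission directory
echo "input data" > "$SGE_O_WORKDIR/wg.fa"   # stand-in input file
cp "$SGE_O_WORKDIR/wg.fa" "$TMPDIR/"         # stage input to local scratch
tr 'a-z' 'A-Z' < "$TMPDIR/wg.fa" > "$TMPDIR/results.txt"   # do the work in scratch
cp "$TMPDIR/results.txt" "$SGE_O_WORKDIR/"   # copy results back before the job ends
```

$TMPDIR is cleaned up when the job finishes, so the final copy back is essential.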


  • Using PBS.pm
use PBS;
my $pbs = PBS->new;

$pbs->setPlatform("panda");
$pbs->setScriptName("$ARGV[1].alnjob");
$pbs->setWallTime("4:00:00");
$pbs->setNumCPUs(24);
$pbs->setEmail("ole2001@med.cornell.edu");
$pbs->addCmd("cd $ENV{PWD}");
$pbs->addCmd("cp -v \$SGE_O_WORKDIR/$ARGV[1] \$TMPDIR");
$pbs->addCmd("...");
$pbs->submit;  # alternatives: $pbs->print; or $pbs->execute;
  • Basic commands
qstat                                      # show your jobs and whether they are running or waiting in the queue
qstat -u "*"                               # show jobs from all users
qstat -g c                                 # show a per-queue summary of cluster load
qhold [job id]                             # put a queued job on hold
qconf -sc                                  # list the requestable resources (complexes), e.g. h_vmem and h_rt
  • Useful combinations:
# show usage by user
qstat -u "*"|grep "^[ 0-9]" |sed -r 's/\s+/\t/g'|cut -f 5,6|sort|uniq -c|sort -rn
 
# remove all of your jobs named "job name"
qstat |grep "job name"|awk '{print "qdel " $1}'|sh
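Before piping generated qdel commands into sh, it is worth inspecting them first. A dry run of the same grep/awk step on canned qstat-style output (the job IDs and names below are made up):

```shell
# Dry run: generate qdel commands without executing them (fake qstat output).
sample='  101 0.50000 job name   user1  r
  102 0.50000 other_job  user1  r
  103 0.50000 job name   user2  qw'
cmds=$(echo "$sample" | grep "job name" | awk '{print "qdel " $1}')
echo "$cmds"    # inspect the list; append "| sh" only when it looks right
```

Only the two lines matching "job name" survive the grep, so the output is "qdel 101" and "qdel 103".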