
qstat Basics


As previously stated, the qstat command displays the status of the queue(s). Its most common use is to retrieve information about the jobs in the scheduling system. Depending on the cluster and its configuration, the command may return only the jobs you submitted yourself or all submitted jobs, as the examples below show. Axiom is configured to report only jobs submitted by the user, while umms-amino reports all jobs currently submitted to the system.



Note: Once a job is complete, it is no longer reported by qstat. Completed jobs will be discussed in another section.

Example:

[jdpoisso@axiom ~]$ qstat 
Job id                    Name             User            Time Use S Queue 
------------------------- ---------------- --------------- -------- - ----- 
1027.axiom                15-34            jdpoisso               0 Q first     
1028.axiom                16-34            jdpoisso               0 Q first     
1029.axiom                17-34            jdpoisso               0 Q first     
1030.axiom                18-34            jdpoisso               0 Q first     
1031.axiom                19-34            jdpoisso               0 Q first     
1032.axiom                20-34            jdpoisso               0 Q first     
1033.axiom                21-34            jdpoisso               0 Q first  
1034.axiom                22-34            jdpoisso               0 Q first
[jdpoisso@axiom ~]$

Example:

[jdpoisso@umms-amino ~]$ qstat 
Job id                    Name             User            Time Use S Queue 
------------------------- ---------------- --------------- -------- - ----- 
1036842.umms-amino        T10372_2_AMF_Z   yzhang          00:05:55 R casp  
1036843.umms-amino        T10372_2_A_closc yzhang          00:05:53 R casp    
1036850.umms-amino        d1l5ja2          jinrui          00:03:28 R default 
1036852.umms-amino        S46334_2F_1_run  zhanglabs       00:03:01 R urgent  
1036853.umms-amino        S46334_3F_1_run  zhanglabs       00:02:51 R urgent   
1036854.umms-amino        d1l5ja3          jinrui          00:02:43 R default
[jdpoisso@umms-amino ~]$


When run with no arguments, the qstat command returns information in six columns. The first column is the Job ID: the unique job number, followed by the scheduling server the job was submitted to. Each submitted job is given a unique number that the scheduler uses to reference it. This job number is required by many other TORQUE commands, and can be given to the qstat command itself to get more information about a particular job.
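As an illustration of the six-column layout, the following minimal Python sketch splits one data row of default qstat output into its fields and separates the Job ID into its number and server parts. The names QstatRow and parse_qstat_line are hypothetical helpers, not part of TORQUE, and the sketch assumes that no column contains embedded whitespace, as in the examples above.

```python
from typing import NamedTuple


class QstatRow(NamedTuple):
    """One data row of default qstat output (hypothetical helper type)."""
    job_number: int
    server: str
    name: str
    user: str
    time_use: str
    state: str
    queue: str


def parse_qstat_line(line: str) -> QstatRow:
    """Split one data row into its six columns.

    Assumption: columns are separated by whitespace and no column
    contains whitespace itself.
    """
    job_id, name, user, time_use, state, queue = line.split()
    # The Job ID has the form <number>.<server>, e.g. 1027.axiom
    number, _, server = job_id.partition(".")
    return QstatRow(int(number), server, name, user, time_use, state, queue)


row = parse_qstat_line(
    "1027.axiom                15-34            jdpoisso               0 Q first"
)
print(row.job_number, row.server, row.state)  # 1027 axiom Q
```

Header and separator lines are not data rows and would need to be skipped before calling the helper.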


The second column, Name, is the job's defined name. Defining names will be covered when discussing the qsub command. For now it is enough to know that each job may optionally be given a non-unique name that describes it in the queue. This is useful because it lets you label the job with a word, phrase, or input name that means more to a human reader than a job number. Also, for future reference, job outputs returned by the scheduler are written to files using this defined name.


The third column, User, lists the username that submitted the job. This lets you see who submitted each job to the queue, and search for jobs submitted by collaborators or by yourself. The system does not allow you to submit on behalf of another user, so every entry in this field accurately reflects who ran the qsub command.


The fourth column, Time Use, lists how long your job has been running. Until the job actually starts running, the value is 0. Depending on how the job is run and the arguments used, this can be displayed either as CPU time (the default) or as walltime. CPU time is the aggregate amount of time each CPU in the system has devoted to running your job, so if one of your jobs used two CPUs for thirty seconds (00:00:30) each, its CPU time is one minute (00:01:00). Walltime, on the other hand, is how long the job took to run according to the clock on the wall, i.e. a normal clock. So if that same job used its two CPUs at exactly the same time, it finishes in thirty seconds, and its walltime is thirty seconds (00:00:30).
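The two-CPU example above can be worked through in a few lines of Python. This is only a sketch of the arithmetic, not of how the scheduler measures time; the function names are hypothetical, and the walltime calculation assumes all CPUs start working at the same moment.

```python
from datetime import timedelta


def cpu_time(per_cpu_busy):
    """CPU time: the sum of the time each CPU devoted to the job."""
    return sum(per_cpu_busy, timedelta())


def walltime_if_concurrent(per_cpu_busy):
    """Walltime, assuming every CPU starts at the same moment:
    the job is done when the longest-running CPU finishes."""
    return max(per_cpu_busy)


# Two CPUs, each busy for thirty seconds
busy = [timedelta(seconds=30), timedelta(seconds=30)]

print(cpu_time(busy))                # 0:01:00
print(walltime_if_concurrent(busy))  # 0:00:30
```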


Note: Walltime is also considered a resource. As briefly mentioned before, the system runs jobs based on the availability of resources. If you expect your job to run for a long period, say twenty hours, you must request a walltime resource of twenty hours (20:00:00), and the cluster will schedule your job when it can best allocate twenty hours according to its queue and configuration. If no walltime resource is specified when you submit your job, the default walltime allotment is used, the exact amount of which varies by system and by queue. Exceeding the walltime allotment may cause the scheduler to forcibly terminate your job, whether it is complete or not.
Walltime will be covered further in sections about advanced job submission.
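Walltime values such as 20:00:00 are written as hours, minutes, and seconds. The short sketch below builds such a string from a duration; format_walltime is a hypothetical helper, and job.sh is a placeholder script name. The printed command line assumes the common TORQUE form of requesting the resource with qsub -l walltime=HH:MM:SS, so check your site's documentation before relying on it.

```python
def format_walltime(hours: int, minutes: int = 0, seconds: int = 0) -> str:
    """Return a duration as an HH:MM:SS string, e.g. 20:00:00."""
    total = hours * 3600 + minutes * 60 + seconds
    h, rem = divmod(total, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"


request = format_walltime(20)
print(request)  # 20:00:00

# Assumed usage only: job.sh is a placeholder, and the -l walltime=...
# form should be confirmed against your cluster's documentation.
print(f"qsub -l walltime={request} job.sh")
```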


The fifth column, S, represents the status of the job. This is a one-letter code for the job's current state. The possible job states are:



Q - Queued, the job is waiting to be run

R - Running, the job has been assigned resources and is running

E - Exiting, the job is complete and is cleaning up and copying files

H - Held, the job has been held, either by the submitter or by an administrator
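To summarise a listing by the state codes above, a small Python sketch can tally the S column of captured qstat output. The count_states function and the STATE_NAMES table are hypothetical helpers; the sample text reuses rows from the earlier examples, and the parsing assumes whitespace-separated columns with a Job ID of the form number.server.

```python
from collections import Counter

# One-letter state codes and their meanings, as listed above
STATE_NAMES = {"Q": "Queued", "R": "Running", "E": "Exiting", "H": "Held"}


def count_states(qstat_output: str) -> Counter:
    """Count jobs per state in default (six-column) qstat output."""
    counts = Counter()
    for line in qstat_output.splitlines():
        fields = line.split()
        # Data rows have six fields and a Job ID starting with a number;
        # this skips the header, the dashed separator, and blank lines.
        if len(fields) == 6 and fields[0].split(".")[0].isdigit():
            code = fields[4]
            counts[STATE_NAMES.get(code, code)] += 1
    return counts


sample = """\
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
1027.axiom                15-34            jdpoisso               0 Q first
1036850.umms-amino        d1l5ja2          jinrui          00:03:28 R default
1036852.umms-amino        S46334_2F_1_run  zhanglabs       00:03:01 R urgent
"""

for state, number in count_states(sample).items():
    print(state, number)
```

Any state code not in the table is counted under its raw letter, so the tally does not silently drop jobs in states this section does not cover.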


The sixth column, Queue, lists the current job queue that the job is submitted to. As each queue has varying resources and rules, you are able to use this information to help deduce what resources the job is using, or what resources the job is waiting to become available for.


2010-08-27