r1 - 06 Nov 2006 - 22:36:57 - JimWilgenbusch

Running Jobs on Tempest


The Tempest cluster is a general access SCS resource configured to serve long-running, batch-style jobs through either the Sun Grid Engine (SGE) or Condor. Jobs run outside of these two systems will be terminated after 30 minutes. SGE provides the principal submission method, and Condor acts as an opportunistic backfilling mechanism. Jobs submitted to SGE have priority over Condor jobs, so the number of SGE jobs each user may submit and run at one time is limited. See the Resource Policy section below for more details. Jobs submitted to the Condor queue, while secondary to the SGE queue, are not restricted to the Tempest cluster: they may be matched with a machine in another cluster or with a desktop or classroom workstation.

Resource Policy

  • Users are not permitted to log in to the Tempest compute nodes.
  • Idle sessions are automatically logged out after two hours.
  • Jobs run outside of SGE and Condor will be terminated after 30 minutes. Use the Applications cluster for short interactive jobs.
  • No more than four jobs per user can be submitted to the SGE queues.
  • No more than two SGE jobs per user will run at the same time.
  • SGE jobs may run for a maximum of four CPU days.

Software Resources

A complete description of the software available on this cluster and other SCS computational resources can be found at: AvailableSoftware.

Primary SGE Queue

The primary job submission queue is managed by the Sun Grid Engine (SGE). The following link leads to a quick start guide, which was taken from the Rocks Cluster documentation and modified only slightly for the SCS environment.

  • Click here for general SGE information and examples -- ScsSGE.

The original Rocks documentation can be found at http://www.rocksclusters.org/roll-documentation/sge/4.2.1/. For more extensive SGE documentation, check out the official SGE web site: http://gridengine.sunsource.net/documentation.html.
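As a rough illustration of what an SGE submission looks like, the Python sketch below builds a small job script with embedded `#$` directives and hands it to `qsub`. It is not taken from the ScsSGE quick start. The job name, the program `./my_program`, and the use of the `h_cpu` resource to stay within the four CPU day policy limit are all assumptions for illustration; the options actually required on Tempest may differ.

```python
import subprocess
from pathlib import Path

# Policy limit stated on this page: SGE jobs run a maximum of four CPU days.
MAX_CPU_DAYS = 4


def cpu_limit_hms(days: int) -> str:
    """Format a whole number of days as an hours:minutes:seconds string."""
    return f"{days * 24}:00:00"


def build_job_script(job_name: str, command: str, cpu_days: int = MAX_CPU_DAYS) -> str:
    """Return the text of an SGE job script with embedded '#$' directives."""
    if not 1 <= cpu_days <= MAX_CPU_DAYS:
        raise ValueError(f"cpu_days must be between 1 and {MAX_CPU_DAYS}")
    lines = [
        "#!/bin/bash",
        f"#$ -N {job_name}",  # job name shown by qstat
        "#$ -S /bin/bash",    # shell used to interpret the script
        "#$ -cwd",            # run from the directory the job was submitted from
        "#$ -j y",            # merge standard error into standard output
        # Hard CPU-time limit. Assumption: Tempest honours the h_cpu resource.
        f"#$ -l h_cpu={cpu_limit_hms(cpu_days)}",
        command,
    ]
    return "\n".join(lines) + "\n"


def submit(script_path: Path) -> str:
    """Submit the script with qsub and return qsub's confirmation message."""
    result = subprocess.run(
        ["qsub", str(script_path)], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

Writing the text returned by `build_job_script("myjob", "./my_program input.dat")` to a file such as `myjob.sh` and passing that path to `submit` would hand the job to SGE, after which `qstat` should list it.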

Secondary Condor Queue

The secondary job submission queue is managed by Condor. Condor acts as an opportunistic backfilling mechanism and lets jobs run on general access clusters, our research clusters, classroom workstations, and desktop machines when they are not in use.

  • Click here for general Condor information and examples -- ScsCondor.
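For comparison, a Condor job is described by a submit description file rather than a script with embedded directives. The Python sketch below writes a minimal one and passes it to `condor_submit`. It is not taken from the ScsCondor page. The executable name, the file names, and the choice of the vanilla universe with file transfer enabled are assumptions; file transfer is turned on here only because a job matched to a desktop or classroom workstation may not share a file system with Tempest.

```python
import subprocess
from pathlib import Path


def build_submit_description(executable: str, arguments: str, job_name: str) -> str:
    """Return the text of a Condor submit description file for a single job."""
    lines = [
        "universe = vanilla",           # assumed universe; no relinking required
        f"executable = {executable}",
        f"arguments = {arguments}",
        f"output = {job_name}.out",     # the job's standard output
        f"error = {job_name}.err",      # the job's standard error
        f"log = {job_name}.log",        # Condor's own event log for the job
        # Assumption: the matched machine may not share a file system.
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        "queue",                        # queue one instance of the job
    ]
    return "\n".join(lines) + "\n"


def submit(description_path: Path) -> str:
    """Submit the description with condor_submit and return its message."""
    result = subprocess.run(
        ["condor_submit", str(description_path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Saving the text returned by `build_submit_description("my_program", "input.dat", "myjob")` to a file such as `myjob.sub` and passing that path to `submit` would place the job in the Condor queue, where `condor_q` should show it until it is matched with an idle machine.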