W. M. Keck High Performance Compute Cluster

Cluster Machine List and Queue overview

Click here for a list of names and specifications for each node in the cluster, along with an overview of the queues you can submit jobs to.

View the current cluster usage

See here for graphs of current usage. This is an external link to the cluster's Ganglia page.


Overview

This compute cluster is part of the College of Engineering's high performance computing (HPC) resources for scientific research and teaching. It consists of state-of-the-art computing nodes with substantial compute power and memory, job-scheduling software, and general-purpose engineering applications. If you have any questions, or need help submitting jobs to the cluster, please email ETS at gridhelp@engr.colostate.edu.

General Specifications

For the complete machine list, please see here.

  • This cluster consists of 750 CPU cores and a total of 7936 GB RAM including:
    • A Master Node to submit projects with 2 Intel Xeon 6-core processors and 128 GB RAM
    • A Sandbox Node to test code and other projects with 2 Intel Xeon 6-core processors and 256 GB RAM
    • 42 regular compute nodes
      • 22 nodes with 2 x Intel Xeon 6-core processors and 256 GB RAM
      • 7 nodes with 2 x Intel Xeon 6-core processors and 64 GB RAM
      • 1 node with 2 x Intel Xeon 14-core processors and 64 GB RAM
      • 12 nodes with 2 x Intel Xeon 8-core processors and 64 GB RAM
    • 12 GPU compute nodes
      • 6 nodes with 2 x Intel Xeon 6-core processors, 64 GB RAM, and 3 x GTX 780 GPUs
      • 4 nodes with 2 x Intel Xeon 6-core processors, 64 GB RAM, and 4 x GTX 1080 GPUs
      • 1 node with 2 x Intel Xeon 6-core processors, 64 GB RAM, 3 x GTX 780 GPUs, and 1 x Titan X
      • 1 node with 2 x Intel Xeon 6-core processors, 64 GB RAM, 3 x GTX 780 GPUs, and 1 x Tesla K40
  • Storage
    • 44 TB storage
  • Connectivity
    • Infiniband interconnect
  • Scheduler
    • Univa Grid Engine

Software

Scheduler

The Keck cluster uses Univa Grid Engine, a variant of Grid Engine, as its general job scheduler. This is the software that reserves your resources and runs the jobs you submit. When we speak of "using the cluster," we usually mean using the job scheduler to interact with it. There are multiple online tutorials for Grid Engine, or you can consult Univa's User's Guide directly. However, the best way to get started is to come talk with us in person; we can walk you through logging in and running your first few jobs.
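As a rough sketch, a Grid Engine submit file is an ordinary shell script whose lines beginning with `#$` carry scheduler directives. The job name, output filenames, and time limit below are illustrative placeholders, not site defaults; check with ETS or the Univa User's Guide for the queues and limits that apply on this cluster.

```shell
#!/bin/bash
#$ -N my_first_job        # job name shown in qstat output
#$ -cwd                   # run the job from the directory you submitted from
#$ -o my_first_job.out    # file for the job's standard output
#$ -e my_first_job.err    # file for the job's standard error
#$ -l h_rt=00:10:00       # request a 10-minute wall-clock limit (placeholder value)

# Everything below the directives is what actually runs on the compute node.
echo "Running on $(hostname) as $USER"
```

You would submit a script like this with `qsub`, which queues it and prints the job ID it was assigned.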

General Applications

Software is installed as users request it. For non-freeware software, users should ensure that the software is licensed appropriately; ETS can help with this and with obtaining software. Most software can be found in /usr/local. Programs in bold are loaded and accessed with environment modules, and the available versions can be listed by typing "module avail" at the command prompt.

  • ANSYS (research license) including:
    • Fluent
    • CFD
  • Atlas
  • Boost/Bjam
  • Blitz
  • Cuda for GPU programming
  • GCC (Multiple Versions)
  • GROMACS
  • LAMMPS (with GPU support)
  • MATLAB (Multiple Versions) (research license)
  • MPI
    • Mpich2
    • Mpich3
    • Mvapich2
    • Openmpi (Multiple Versions)
  • NWChem
  • PETSc
  • Phenix
  • Python with numerous modules such as scipy, numpy, etc.
  • StarCCM
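The module commands mentioned above follow the usual environment-modules pattern. The version string in the example below is a placeholder; `apps/matlab` is the module name used later in this document, and the real names and versions are whatever `module avail` reports on the cluster.

```shell
# List every module (and version) installed on the cluster
module avail

# Load the default version of a package, or pin a specific version
module load apps/matlab
module load gcc/7.2.0      # placeholder version; pick one shown by `module avail`

# See what is currently loaded, and unload a module when finished
module list
module unload apps/matlab
```

Loading a module adjusts your PATH and related environment variables so the chosen version of the program is the one your jobs find.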

Getting Started

Getting Started Overview (also available as a flow chart)

  1. Request an account by emailing gridhelp@engr.colostate.edu. This user account is separate from your Engineering account.
  2. Once you receive an account, connect to the submit host for the cluster.
  3. Write your code or create input files for the applications you wish to use.
  4. Write a submit file to submit the job.
  5. Submit the job using the qsub command.
  6. (Optional) Check the status of the job while it's running using the qstat command.
  7. (Optional) Log out, and your job will continue to run.
  8. Check your output.
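From a terminal, the steps above look roughly like the session sketched below. The submit-host name is a placeholder (ETS will tell you the real one when you get your account), and the job ID shown is just an example of the kind of number `qsub` reports.

```shell
# Step 2: connect to the submit host (hostname is a placeholder)
ssh username@submit-host.engr.colostate.edu

# Steps 4-5: submit a job described by a submit file
qsub my_job.sh        # reports the job ID it assigns, e.g. job 12345

# Step 6 (optional): check on the job while it runs
qstat                 # lists your pending and running jobs
qstat -j 12345        # detailed status for one job, by ID

# Step 8: after it finishes, inspect the output files the scheduler wrote
cat my_job.sh.o12345  # standard output; my_job.sh.e12345 holds standard error
```

You can log out between submitting and checking output (step 7); the scheduler keeps the job running without you.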

Try it yourself!

Connect to the cluster and then enter the commands below to submit a simple test MATLAB job.

  • module load apps/matlab
  • mkdir ~/my_matlab_job && cd ~/my_matlab_job
  • cp /usr/local/examples/matlab/* ~/my_matlab_job
  • qsub sample_submission.sh

That's it. Check my_matlab_output.txt for the results. You can use the copied files as a starting point, or find more examples by navigating to /usr/local/examples on the cluster.

Useful Links

General

MPICH

OpenMP

Parallel Jobs in MATLAB

Parallel Jobs in Fluent

  • The user guide for Fluent is not available online; it can be accessed via the Help button after starting Fluent. It gives a detailed and clear explanation of running jobs in Fluent.

GPU Computing

Univa Grid Engine