...
Farm Name | Cores (or GPUs) | Cores (or GPUs) Per Node | Memory Per Core (or GPU) | Interconnect | Notes |
---|---|---|---|---|---|
suncat | 2272 Nehalem X5550 | 8 | 3GB | 1Gbit Ethernet | |
suncat2 | 768 Westmere X5650 | 12 | 4GB | 2Gbit Ethernet | |
suncat3 | 512 Sandy Bridge E5-2670 | 16 | 4GB | 40Gbit QDR Infiniband | |
suncat4 | 1024 Sandy Bridge E5-2680 | 16 | 2GB | 1Gbit Ethernet | LIMITED MEMORY AND NO LOCAL DISK |
gpu | 119 Nvidia M2090 | 7 | 6GB | 40Gbit QDR Infiniband | |
Jobs should typically request a multiple of the number of cores per node.
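As a rough illustration of this rule, the Python sketch below rounds a desired core count up to a whole number of nodes. The per-node counts are copied from the table above; the helper name `round_up_to_nodes` and the `CORES_PER_NODE` dictionary are hypothetical and not part of any site tooling.

```python
# Cores (or GPUs) per node, copied from the farm table above.
CORES_PER_NODE = {"suncat": 8, "suncat2": 12, "suncat3": 16, "suncat4": 16, "gpu": 7}


def round_up_to_nodes(farm, wanted_cores):
    """Return the smallest multiple of the farm's cores-per-node >= wanted_cores."""
    if wanted_cores < 1:
        raise ValueError("wanted_cores must be at least 1")
    per_node = CORES_PER_NODE[farm]
    nodes = -(-wanted_cores // per_node)  # ceiling division without floats
    return nodes * per_node


if __name__ == "__main__":
    for farm, wanted in [("suncat", 20), ("suncat2", 12), ("suncat3", 17)]:
        print(farm, wanted, "->", round_up_to_nodes(farm, wanted))
```

For example, asking for 20 cores on suncat (8 cores per node) would be rounded up to 24, i.e. three full nodes.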
...
Code Block |
---|
bjobs (shows your current list of batch jobs and jobIds) |
bjobs -d (shows list of your recently completed batch jobs) |
bqueues suncat-long (shows number of cores pending and running) |
bjobs -u all \| grep suncat (shows jobs of all users in the suncat queues) |
bpeek <jobId> (examines logfile output from a job that may not have been flushed to disk) |
bkill <jobId> (kills job) |
btop <jobId> (moves job priority to the top) |
bbot <jobId> (moves job priority to the bottom) |
bsub -w "ended(12345)" (waits for job id 12345 to be EXITed or DONE before running) |
bmod [options] <jobId> (modifies job parameters after submission, e.g. priority using the -sp flag) |
bswitch suncat-xlong 12345 (moves running job id 12345 to the suncat-xlong queue) |
bmod -n 12 12345 (changes number of cores of pending job 12345 to 12) |
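If you chain jobs from a script, the dependency option above can be assembled programmatically. The following is a minimal sketch, assuming Python 3.8 or later; the helper name `bsub_after` and the job command `python relax.py` are hypothetical, while `-q`, `-n` and `-w` are the standard `bsub` options for queue, core count and dependency condition. It only builds the argument list and does not submit anything.

```python
import shlex


def bsub_after(job_id, queue, ncores, command):
    """Build a bsub argument list that waits for job_id to end before running."""
    return [
        "bsub",
        "-q", queue,                 # queue name, e.g. suncat-long
        "-n", str(ncores),           # number of cores requested
        "-w", f"ended({job_id})",    # run only after job_id is EXITed or DONE
        *shlex.split(command),       # the (hypothetical) job command itself
    ]


if __name__ == "__main__":
    args = bsub_after(12345, "suncat-long", 16, "python relax.py")
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(args))
```

Passing the resulting list to `subprocess.run` on a login node would submit the job; printing it first makes it easy to check the quoting of the `-w` expression.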
suncat4 Guidelines
These experimental computing nodes have relatively little memory and no local disk. Please use the following guidelines when submitting jobs:
- if you exceed the 2GB/core memory limit, the node will crash. Plane-wave codes (espresso, dacapo/jacapo, vasp) tend to use less memory; if you use GPAW, make sure you check the memory estimate before submitting your job.
- you can observe the memory usage of the nodes for your job with "lsload psanacs002" (if your job uses node "psanacs002"). The last column shows the free memory.
- if you run espresso, you must use the following options, since there is no local disk:
Code Block output = {'avoidio':True, 'removewf':True, 'wf_collect':False},
- use the same job submission commands that you would use for suncat/suncat2
- use queue name "suncat4-long"
- the "-N" batch option (to receive email on job completion) does not work
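The memory guideline above can be checked with a few lines of Python before submitting. This is a minimal sketch, assuming you already have a total memory estimate (in GB) from your code; the function name `fits_suncat4` and the 10% safety margin are hypothetical choices, not site policy.

```python
SUNCAT4_GB_PER_CORE = 2.0  # memory per core on suncat4, from the farm table


def fits_suncat4(estimated_total_gb, ncores, safety_margin=0.9):
    """Return True if the estimated memory per core stays under the limit.

    safety_margin is an assumed headroom factor: with 0.9, a job is only
    accepted if it needs at most 90% of the 2GB/core limit.
    """
    if ncores < 1:
        raise ValueError("ncores must be at least 1")
    per_core_gb = estimated_total_gb / ncores
    return per_core_gb <= SUNCAT4_GB_PER_CORE * safety_margin


if __name__ == "__main__":
    for total_gb, ncores in [(24.0, 16), (30.0, 16), (40.0, 16)]:
        print(total_gb, "GB on", ncores, "cores:", fits_suncat4(total_gb, ncores))
```

If the check fails, requesting more cores (in multiples of 16 on suncat4) lowers the per-core footprint only if the code distributes its memory across cores.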