Resource Limits
Here is a summary of the system limits for users of our computing resources. Memory is the maximum you can allocate per node in practice. Nodes is the total number of nodes in the partition, and cores is the maximum total per user, not the number of cores per node.
These are the default values; individual users and units may have limits that are higher or lower than these (hardware permitting).
Deigo
Deigo has a number of partitions, each with its own restrictions on maximum cores and time.
Compute Partition | Memory per node | Nodes total | Cores per user | Memory per user | Runtime per job | Notes |
---|---|---|---|---|---|---|
Compute | 500G | 356 | 2000 | 7500G | 4 days | (1) |
Short | 500G | 648 | 4000 | 6500G | 2 hours | (1) |
Largemem | 500G | 84 | 5 nodes | — | ∞ | unrestricted (2) |
Largemem | 750G | 14 | 5 nodes | — | ∞ | unrestricted (2) |
Bigmem | 2978G | 1 | 8 | — | ∞ | unrestricted (2) |
Largejob | 500G | 100 | 50 nodes | — | 2 days | Special (3) |
datacp | 185G | 4 | 4 | 19G | ∞ | for moving data (4) |
Storage System | Amount | Notes |
---|---|---|
/home | 50G | Per-user limit |
/flash | 10T | Per-unit limit |
/bucket | 50T | Per-unit limit, expandable |
naruto | — | Tape for archiving; expands as needed |
1. If you need more compute time than the current limit, you can ask us to increase that limit in exchange for fewer cores, as detailed here.
2. Largemem and Bigmem have no fixed maximum time. However, it's a bad idea to run for more than about a month: the longer a job runs, the greater the risk that a hardware or software error, an electrical outage, or emergency maintenance will kill it prematurely.
3. Largejob is for many-core computations, and requires an application from the user's unit leader. You can apply using this form.
4. The datacp partition has four nodes, and is only for transferring large data volumes between Bucket and Flash. Do not use these nodes for any kind of computation.
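As an illustration of how the per-user core and runtime defaults in the Deigo table combine, here is a minimal Python sketch that checks a planned job against them before you submit. The names `PartitionLimit`, `DEIGO_LIMITS` and `check_request` are hypothetical, and the lowercase partition keys are an assumption. Largemem and Largejob are left out because their limits are given in nodes rather than cores. The numbers are the defaults listed above; your own limits may be higher or lower.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PartitionLimit:
    max_cores: int               # default "Cores per user" value from the table
    max_hours: Optional[float]   # "Runtime per job" in hours; None = no fixed maximum


# Default values copied from the Deigo table; individual limits may differ.
DEIGO_LIMITS = {
    "compute": PartitionLimit(max_cores=2000, max_hours=4 * 24),
    "short": PartitionLimit(max_cores=4000, max_hours=2),
    "bigmem": PartitionLimit(max_cores=8, max_hours=None),
    "datacp": PartitionLimit(max_cores=4, max_hours=None),
}


def check_request(partition: str, cores: int, hours: float) -> List[str]:
    """Return a list of problems; an empty list means the request fits the defaults."""
    limit = DEIGO_LIMITS.get(partition.lower())
    if limit is None:
        return [f"unknown partition: {partition}"]
    problems = []
    if cores > limit.max_cores:
        problems.append(
            f"{cores} cores exceeds the default of {limit.max_cores} cores per user"
        )
    if limit.max_hours is not None and hours > limit.max_hours:
        problems.append(
            f"{hours} hours exceeds the default runtime of {limit.max_hours} hours"
        )
    return problems
```

For example, `check_request("short", 5000, 3)` reports two problems, one for the core count and one for the runtime, while `check_request("compute", 128, 24)` returns an empty list. The check only covers a single request; cores used by your other running jobs also count toward the per-user total.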
Saion
Saion is a general accelerated computing system with three subsystems. You can read more about this on this page.
Partition | Memory per node | Nodes per partition | Cores per user | GPUs per node | Runtime | Notes |
---|---|---|---|---|---|---|
test-gpu | 497G | 6 | 8 | 4 | — | preemptible (1) |
Kannel | 120G | 16 | 4096 | — | 7 days | KNL nodes (2) |
GPU | 497G | 16 | 16 | 4 | 7 days | P100, V100 (3) |
PowerNV | 500G | 8 | 1216 | 4 | 7 days | P100, V100 (4) |
Storage System | Amount | Notes |
---|---|---|
/home | 50G | Per-user limit; same system as Deigo |
/bucket | 50T | Per-unit limit; same system as Deigo |
/work | 10T | Per-unit limit |
1. The test-gpu partition is accessible to anybody. However, it is a low-priority partition; if somebody using a restricted partition wants to use the hardware, your job may be suspended or killed. This is best used for development and testing, not long computations.
2. You can use all of the KNL nodes at once for massively parallel computations. This is an especially good resource for developing and testing parallel code. However, we may ask you to reduce the number of nodes if other users are waiting for the resource.
3. The core limit applies to your total across nodes: you could take 16 cores on a single node, or up to 16 nodes with a single core each. This is a popular system. If you use too many resources at once and block other users from doing their work, we may terminate your job. Please be mindful of how much you use.
4. The PowerNV system is meant for experienced users, and we keep a hands-off approach to maintaining it. If you are a PowerNV user and have specific needs that the default settings do not cover, you are welcome to come by and discuss this with us.
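The 16-core default on the GPU partition can be spread over nodes in several ways. The sketch below is only an arithmetic illustration of that point; the function names are hypothetical, and the two constants are taken from the GPU row of the table above.

```python
GPU_MAX_CORES = 16  # "Cores per user" for the GPU partition in the table
GPU_NODES = 16      # number of nodes in the GPU partition


def fits_gpu_limit(nodes: int, cores_per_node: int) -> bool:
    """True if the layout stays within the default per-user core total."""
    if nodes < 1 or cores_per_node < 1:
        return False
    return nodes <= GPU_NODES and nodes * cores_per_node <= GPU_MAX_CORES


def even_splits():
    """All (nodes, cores_per_node) layouts that use exactly the full core budget."""
    return [
        (n, GPU_MAX_CORES // n)
        for n in range(1, GPU_NODES + 1)
        if GPU_MAX_CORES % n == 0
    ]
```

`even_splits()` returns `[(1, 16), (2, 8), (4, 4), (8, 2), (16, 1)]`, which covers both extremes mentioned in the note. A layout that fits the limit can still crowd out other users, so request only what the job needs.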
Storage
This summarizes the public storage systems at OIST, and which computing systems use each one. Also see our pages on the research storage system for more up-to-date information.
System | Storage | Amount | Notes |
---|---|---|---|
Deigo, Saion | /home | 50G | Per-user |
Deigo | /flash | 10T | Per-unit |
Deigo, Saion | /bucket | 50T | Per-unit |
Saion | /work | 10T | Per-unit |
All | naruto | — | Tape for archiving |
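To get a rough idea of how close a directory is to one of the limits above, you could sum its file sizes and compare the total with the table value. This is a minimal sketch with hypothetical function names. It assumes that `G` and `T` in the tables are binary units (1024-based), and it counts apparent file sizes, so the figure the storage system itself reports may differ.

```python
import os

# Assumption: "G" and "T" in the tables are 1024-based units.
UNIT_FACTORS = {"G": 1024 ** 3, "T": 1024 ** 4}


def parse_size(text: str) -> int:
    """Convert a table value such as '50G' or '10T' to bytes."""
    text = text.strip().upper()
    return int(float(text[:-1]) * UNIT_FACTORS[text[-1]])


def directory_size(path: str) -> int:
    """Sum the apparent sizes of all files below path, without following symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # The file vanished or is unreadable; skip it.
                pass
    return total


def usage_fraction(path: str, limit: str) -> float:
    """Fraction of the given limit used by everything below path."""
    return directory_size(path) / parse_size(limit)
```

For example, `usage_fraction(os.path.expanduser("~"), "50G")` estimates how much of the default /home limit your home directory uses. Walking a large directory tree this way can be slow and puts load on the file system, so run it sparingly.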