Use the module system
At OIST we install most scientific software on the clusters as modules. That way we can have several versions of the same software. You can use the latest release or you can stay with a trusted older version, and you can switch between versions with a single command.
You can create your own modules for yourself or for your unit. The details of doing that are available here.
NOTE: The Deigo and Saion clusters use different versions of the module system. The basic 'module' command works the same, but Deigo has some additional useful features. We discuss Deigo below, and note where Saion differs.
Load Software
You use the module system to list the available modules and to load the software that you need. Use the 'module' command to interact with the module system. All commands are of the form 'module <command>'. The longer commands can be abbreviated, usually to 2-3 characters.
In addition, on Deigo (but not Saion) there is a very convenient short-form command 'ml' that you can use instead of 'module'. We will show both interchangeably below.
First, let us see what modules are available. We do that with 'module avail' (short for "available") or 'ml av':
$ module avail
---------------------- /apps/.metamodules81 ----------------------
amd-modules sango-legacy-modules
intel-modules user-modules
---------------------- /apps/.modulefiles81 ----------------------
BUSCO/3.0.2 java-jdk/11
BUSCO/4.0.6 (D) java-jdk/14 (D)
Gaussian/09RE01 jellyfish/2.2.7
HTSeq/0.9.1 julia/1.3.1
MaterialsStudio/2016 julia/1.4.1 (D)
MrBayes.mpi/3.2.3 jupyter/3.7
'avail' or 'av' gives us a list of all software modules on the system. As you can see, each package is listed by name and version, with many packages having multiple versions. Each package version is a separate module.
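The full list can get long, so it can help to filter it down to one package. The following is a minimal sketch, not part of the official setup: 'list_versions' is a hypothetical helper name, and it assumes that your module system accepts the terse flag '-t' and writes its listing to standard error, as module systems commonly do.

```shell
# Hypothetical helper: print every available version of one package.
list_versions() {
    # Assumption: the listing goes to stderr, so merge it into stdout first.
    # awk keeps only the lines that start with "<name>/", matched literally.
    module -t avail 2>&1 | awk -v p="$1/" 'index($0, p) == 1'
}

# Only run the example where a 'module' command exists (that is, on the cluster).
if type module >/dev/null 2>&1; then
    list_versions julia
fi
```

On the cluster you would call it as 'list_versions julia'. You can also pass a name directly to 'module avail' to narrow the listing.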
On Deigo, we have "metamodules", listed at the top. Metamodules are collections of modules that belong together in some way. We will talk more about these further down.
You load a module with the 'load' command:
$ module load julia
The short form on Deigo is simply:
$ ml julia
Let's see what version of Julia we loaded:
$ julia --version
julia version 1.4.1
The 'load' command will normally load the latest version of the software by default. If you want a specific version, you give the module name and the version separated with a slash:
$ module load julia/1.3.1
$ julia --version
julia version 1.3.1
You can see what modules you have loaded with 'list' or 'li':
$ module li
# or short form:
$ ml
Currently Loaded Modules:
1) julia/1.3.1
You can remove a module again with the 'unload' command:
$ module unload julia
$ module list
No Modulefiles Currently Loaded.
With the short form we can list modules with just 'ml', and we can unload modules by adding a minus sign "-" in front:
# load julia, ruse and bamtools at once
$ ml julia ruse bamtools
$ ml
Currently Loaded Modules:
1) julia/1.4.1 2) ruse/1.0 3) bamtools/2.5.1
# unload ruse and bamtools, and switch to Julia 1.3.1
$ ml -ruse -bamtools julia/1.3.1
Some modules depend on other modules, and will load them automatically. Take, for instance, the BUSCO application:
$ ml BUSCO
$ ml
Currently Loaded Modules:
1) openmpi.gcc/4.0.3 4) hmmer/3.1b2 7) Prodigal/2.6.2
2) python/3.7.3 5) bamtools/2.4.1 8) BUSCO/4.0.6
3) ncbi-blast/2.7.1+ 6) augustus/3.3
BUSCO depends on a number of other modules, some of which depend on others in turn. The module system loads all the modules we need for us. The module system keeps track of how a module was loaded, so when you unload BUSCO these dependencies will also be unloaded.
Sometimes you want to clear everything out. Instead of unloading modules one by one you can use the 'purge' command to unload all loaded modules:
$ module purge
$ module list
No modules loaded
This is also useful in scripts when you want to make sure that you're starting from a clean slate. Begin the script with a 'module purge' and you won't have any other modules interfering by accident.
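A script preamble along the following lines is one way to apply this. It is only a sketch: 'load_clean' is a hypothetical helper name, the pinned Julia version is just an example, and the guard at the end lets the snippet run harmlessly on a machine where no 'module' command is defined.

```shell
# Hypothetical helper: clear all loaded modules, then load the ones given.
load_clean() {
    module purge || return 1          # start from a clean slate
    for m in "$@"; do
        module load "$m" || return 1  # stop at the first module that fails to load
    done
}

# Only run the example where a 'module' command exists (that is, on the cluster).
if type module >/dev/null 2>&1; then
    load_clean julia/1.3.1
    module list
fi
```

Giving the version explicitly, as in 'julia/1.3.1', means the script keeps using the same release even if the default version changes later.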
A Summary
Here's a summary of commands, with the 'ml' command, the equivalent 'module' command, and the effect:
command | full module command | effect |
---|---|---|
ml | module list | lists modules you have loaded |
ml <module> | module load <module> | loads <module> |
ml -<module> | module unload <module> | unloads module (note the minus sign for ml) |
ml av | module av | lists available modules (also 'avail' and 'available') |
ml purge | module purge | removes all loaded modules |
The 'ml' command can take all the same subcommands as 'module'. The one thing it can't do is load a module that happens to have the same name as a module subcommand. If you have a module named "purge" for instance, 'ml purge' would purge all loaded modules, not load the "purge" module. In such a case, use 'module load purge' instead.
The Metamodules
We have organised the modules on Deigo into separate areas. We now have a common area where most software is installed, and four specialised areas.
Module area | Purpose |
---|---|
common | The default area. Most software is installed here. |
intel-modules | Software that runs best or only on the Intel nodes. |
amd-modules | Software that runs best or only on the AMD nodes. |
sango-legacy-modules | The old Sango modules and the "sango" container. |
user-modules | User-maintained modules. |
To use, say, AMD-specific software, you load the "amd-modules" metamodule:
$ module load amd-modules
If you then look at available modules:
$ module av
------------------------------ /apps/.amd-modulefiles81 --------------------
aocc/2.1.0 aocl.gcc/2.1 gmap-gsnap/2020-04-08 (D)
aocl.aocc/2.1 gmap-gsnap/2014-10-16 gromacs/2020.1
-------------------------------- /apps/.metamodules81 ----------------------
amd-modules (L) intel-modules sango-legacy-modules user-modules
-------------------------------- /apps/.modulefiles81 ----------------------
BUSCO/3.0.2 comsol/51 matlab/R2013a
Gaussian/09RE01 comsol/52 matlab/R2013b
HTSeq/0.9.1 comsol/54 (D) matlab/R2014a
MaterialsStudio/2016 crystalwave/4.9 matlab/R2014b
MrBayes.mpi/3.2.3 csds/2016.0 matlab/R2015a
...
The AMD-specific modules are now listed first, followed by the metamodules for the different areas, and then the common modules. You could now load, say, "gromacs" for AMD.
You can load multiple metamodules at the same time. If they contain modules of the same name, the one in the most recently loaded area is picked. If you were to load "intel-modules" now and then load "gromacs", you would get the version built for the Intel nodes.
NOTE: Modules in "amd-modules" will work on all systems. Modules in "intel-modules" will be fast on the Intel nodes, but will crash on AMD nodes. The Intel compiler in "intel-modules" is an exception and does work everywhere.
Make sure that you use the appropriate version of your software when running your jobs. If you are using the "short" partition it's probably best to stick with the common modules and those in "amd-modules". You can read more about this on the Deigo page.
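If a job may land on either kind of node, one option is to pick the metamodule at run time. The sketch below is an illustration under assumptions, not an official recipe: 'pick_metamodule' is a hypothetical helper, and it relies on the 'vendor_id' field in /proc/cpuinfo, which reads "GenuineIntel" on Intel CPUs and "AuthenticAMD" on AMD CPUs. Anything that is not recognised as Intel falls back to "amd-modules", since those modules work on all systems.

```shell
# Hypothetical helper: map a CPU vendor string to a metamodule name.
pick_metamodule() {
    case "$1" in
        GenuineIntel) echo "intel-modules" ;;
        *)            echo "amd-modules" ;;   # safe fallback: works on all nodes
    esac
}

# Vendor of the node this script runs on; empty if it cannot be determined.
vendor=$(awk '/^vendor_id/ {print $3; exit}' /proc/cpuinfo 2>/dev/null || true)

# Only load anything where a 'module' command exists (that is, on the cluster).
if type module >/dev/null 2>&1; then
    module load "$(pick_metamodule "$vendor")"
fi
```

Run this at the start of the job itself, not on the login node, so that the check sees the CPU of the node that actually runs your software.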
Finding Information
How do you find the module you want? You may know what you need, but not the name of the module. Or maybe you want to know more about how a module is installed.
First, the 'whatis' command will show you a brief description of a module:
$ ml whatis augustus
augustus/3.3.3 : Name: augustus
augustus/3.3.3 : Version: 3.3.3
augustus/3.3.3 : URL: http://bioinf.uni-greifswald.de/augustus/
augustus/3.3.3 : Category: bioinformatics
augustus/3.3.3 : Keywords: sequencing, analysis
augustus/3.3.3 : Description: Predict genes in eukaryotic genomic sequences.
On Deigo, the 'ml spider' subcommand will let you search for modules by name:
# 'spider' searches for module names
$ ml spider tri
----------------------------------------------------------------------------------------------------------------------------------------------------------
Trimmomatic: Trimmomatic/0.33
----------------------------------------------------------------------------------------------------------------------------------------------------------
Description:
A flexible trimmer for Illumina sequence data.
...
# 'spider' by itself lists all modules with a short description:
$ module spider
--------------------------------------------------------------------------
The following is a list of the modules and extensions currently available:
--------------------------------------------------------------------------
BUSCO: BUSCO/3.0.2, BUSCO/4.0.6
Assess genome assembly and annotation completeness with benchmarking
universal single-copy orthologs.
Gaussian: Gaussian/09RE01
HTSeq: HTSeq/0.9.1
High-throughput sequencing data analysis with Python.
...
'module help' will give you in-depth information on a single module:
$ ml help qiime2
---------------------- Module Specific Help for "qiime2/2019.1" ----------------------
Powerful, extensible, and decentralized microbiome analysis package with a
focus on data and analysis transparency. A complete redesign and rewrite of
QIIME 1.
QIIME 2 comes distributed as a container. We add a small script that lets you
run it as 'qiime' without having to deal with the container directly.
Some modules have only a brief description. Some have more information, including helpful tips for running them on the cluster. If you feel a module description could be improved, please let us know!
The 'key' subcommand (on Saion use 'apropos' instead) searches the tags and the description of every module. This is useful when you don't really know what you are looking for:
$ ml key numeric
----------------------------------------------------------------------------------
The following modules match your search criteria: "numeric"
----------------------------------------------------------------------------------
OpenBLAS.gcc: OpenBLAS.gcc/0.3.9
An optimized BLAS and lapack library.
R: R/3.4.2
A popular software environment for statistical computing and graphics.
aocl.aocc: aocl.aocc/2.1
A set of numerical libraries tuned specifically for the AMD EPYC processor
family. This is the AOCC version.
...
Unit-specific modules
Your unit may have software installed in your own unit-specific area. If you want to use that software, you need to tell the module system where to find those module files. If they have been installed according to our instructions, you can add your software modules with the 'module use' command:
$ module use /apps/unit/[unit name]U/.modulefiles/
The module commands will now look in that directory as well for module files, and you will be able to use any software you have installed there. If you use this often, it might be a good idea to add this command to your .bashrc file so it gets run each time you log in.
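A guarded version for your .bashrc could look like the sketch below. "ExampleU" is a hypothetical unit name used only for illustration, so replace the path with your unit's actual directory. 'add_unit_modules' is likewise a hypothetical helper name. The checks keep your login from printing errors if the directory is missing or the 'module' command is not defined.

```shell
# Sketch for ~/.bashrc. "ExampleU" is a hypothetical unit name; use your own.
unit_modules="/apps/unit/ExampleU/.modulefiles"

# Hypothetical helper: register a module directory only when it is usable.
add_unit_modules() {
    if [ -d "$1" ] && type module >/dev/null 2>&1; then
        module use "$1"
    fi
}

add_unit_modules "$unit_modules"
```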
If you want to remove the unit-specific modules again, the 'module unuse' command will do that for you:
$ module unuse /apps/unit/[unit name]U/.modulefiles/
Common module commands
Find modules:
- av, avail
  Show all available modules.
- whatis [module name]
  Show a brief description of a module. Without a module name you get a list of all modules and their descriptions.
- show [module name]
  Show the details of one module.
- key [keyword] (Deigo), apropos [keyword] (Saion)
  Search all module descriptions for [keyword].
Use modules:
- load
  Load a module so you can use it.
- unload
  Remove the module again.
- purge
  Remove all loaded modules.
- list
  List the modules you have loaded.
Use your own modules:
- use
  Add a directory to look for modulefiles.
- unuse
  Remove a module file directory.