Use the module system

At OIST we install most scientific software on the clusters as modules. That way we can have several versions of the same software. You can use the latest release or you can stay with a trusted older version, and you can switch between versions with a single command.

You can create your own modules for yourself or for your unit. The details of doing that are available here.

NOTE: The Deigo and Saion clusters use different versions of the module system. The basic 'module' command works the same, but Deigo has some additional useful features. We discuss Deigo below, and note where Saion differs.

Load Software

You use the module system to list the available modules and to load the software that you need. Use the 'module' command to interact with the module system. All commands are of the form 'module <command>'. The longer commands can be abbreviated, usually to 2-3 characters.

In addition, on Deigo (but not Saion) there is a very convenient short-form command 'ml' that you can use instead of 'module'. We will show both interchangeably below.

First, let us see what modules are available. We do that with 'module avail' (short for "available") or 'ml av':

 

$ module avail
---------------------- /apps/.metamodules81 ----------------------
   amd-modules      sango-legacy-modules
   intel-modules    user-modules

---------------------- /apps/.modulefiles81 ----------------------
   BUSCO/3.0.2                 java-jdk/11
   BUSCO/4.0.6          (D)    java-jdk/14        (D)
   Gaussian/09RE01             jellyfish/2.2.7
   HTSeq/0.9.1                 julia/1.3.1
   MaterialsStudio/2016        julia/1.4.1        (D)
   MrBayes.mpi/3.2.3           jupyter/3.7

 

'avail' or 'av' gives us a list of all software modules on the system. As you can see, each software package is listed by name and version, and many packages have multiple versions. Each package version is a separate module.

On Deigo, we have "metamodules", listed at the top. Metamodules are collections of modules that belong together in some way. We will talk more about these further down.

You load a module with the 'load' command:
 

$ module load julia

The short form on Deigo is simply:

$ ml julia

Let's see what version of Julia we loaded:

$ julia --version

julia version 1.4.1

The load command loads the default version of the software, which is normally the latest one; the default is marked with (D) in the 'avail' listing. If you want a specific version, give the module name and the version separated by a slash:

$ module load julia/1.3.1
$ julia --version

julia version 1.3.1
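
In a script it can be worth confirming that the pinned version is really the one in use. The following is a minimal sketch, not part of the cluster setup: 'want_version' is a hypothetical helper, and it assumes the tool reports its version with '--version', as julia does above.

```shell
#!/bin/bash
# want_version is a hypothetical helper, not part of the module system.
# It succeeds if the first line printed by "<tool> --version" contains
# the expected string.
want_version() {
    local tool="$1" expected="$2"
    "$tool" --version 2>&1 | head -n 1 | grep -qF -- "$expected"
}

# Only touch modules where the 'module' command exists, such as on the cluster.
if command -v module >/dev/null 2>&1; then
    module load julia/1.3.1
    want_version julia 1.3.1 || echo "warning: julia is not version 1.3.1" >&2
fi
```

The check is a plain substring match on the version line, so it is only a sanity check and not a strict comparison.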

You can see what modules you have loaded with 'list' or 'li':

$ module li
# or short form:
$ ml li

Currently Loaded Modules:
  1) julia/1.3.1

You can remove a module again with the 'unload' command:

$ module unload julia
$ module list
No modules loaded

With the short form we can list modules with just 'ml', and we can unload modules by adding a minus sign "-" in front:
 

# load julia, ruse and bamtools at once
$ ml julia ruse bamtools
$ ml
Currently Loaded Modules:
  1) julia/1.4.1   2) ruse/1.0   3) bamtools/2.5.1

# unload ruse and bamtools, and switch to Julia 1.3.1
$ ml -ruse -bamtools julia/1.3.1


Some modules depend on other modules, and will load them automatically. Take, for instance, the BUSCO application:
 

$ ml BUSCO
$ ml

Currently Loaded Modules:
  1) openmpi.gcc/4.0.3   4) hmmer/3.1b2      7) Prodigal/2.6.2
  2) python/3.7.3        5) bamtools/2.4.1   8) BUSCO/4.0.6
  3) ncbi-blast/2.7.1+   6) augustus/3.3


BUSCO depends on a number of other modules, some of which depend on others in turn. The module system loads all the modules we need for us. The module system keeps track of how a module was loaded, so when you unload BUSCO these dependencies will also be unloaded.

Sometimes you want to clear everything out. Instead of unloading modules one by one you can use the 'purge' command to unload all loaded modules:
 

$ module purge
$ module list
No modules loaded


This is also useful in scripts when you want to make sure that you're starting from a clean slate. Begin the script with a 'module purge' and you won't have any other modules interfering by accident.
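
A script preamble along these lines is one way to do that. It is a sketch only: 'setup_modules' is a hypothetical helper, the module names are examples taken from the listings above, and it assumes that 'module load' returns a non-zero status when a module cannot be loaded, which you should verify on your system.

```shell
#!/bin/bash
# setup_modules is a hypothetical helper: purge everything, then load
# exactly the modules given as arguments, stopping at the first failure.
# Assumption: 'module load' returns non-zero when loading fails.
setup_modules() {
    local m
    module purge || return 1
    for m in "$@"; do
        module load "$m" || { echo "could not load $m" >&2; return 1; }
    done
}

# Only run where the 'module' command exists, such as on the cluster.
if command -v module >/dev/null 2>&1; then
    setup_modules julia/1.3.1 bamtools
fi
```

Listing the versions explicitly means the script keeps using the same software even if the default version changes later.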

A Summary

Here's a summary of commands, with the 'ml' command, the equivalent 'module' command, and the effect:

command        full module command      effect
ml             module list              lists modules you have loaded
ml <module>    module load <module>     loads <module>
ml -<module>   module unload <module>   unloads module (note the minus sign for ml)
ml av          module av                lists available modules (also 'avail' and 'available')
ml purge       module purge             removes all loaded modules


The 'ml' command can take all the same subcommands as 'module'. The one thing it can't do is load a module that happens to have the same name as a module subcommand. If you have a module named "purge" for instance, 'ml purge' would purge all loaded modules, not load the "purge" module. In such a case, use 'module load purge' instead.

 

The Metamodules

We have organised the modules on Deigo into separate areas. We now have a common area where most software is installed, and four specialised areas.

Module area            Purpose
common                 The default area. Most software is installed here.
intel-modules          Software that runs best or only on the Intel nodes.
amd-modules            Software that runs best or only on the AMD nodes.
sango-legacy-modules   The old Sango modules and the "sango" container.
user-modules           User-maintained modules.


To use, say, amd-specific software, you load the "amd-modules" metamodule:

$ module load amd-modules

If you then look at available modules:

$ module av

------------------------------ /apps/.amd-modulefiles81 --------------------
   aocc/2.1.0       aocl.gcc/2.1             gmap-gsnap/2020-04-08 (D)
   aocl.aocc/2.1    gmap-gsnap/2014-10-16    gromacs/2020.1

-------------------------------- /apps/.metamodules81 ----------------------
   amd-modules (L)    intel-modules    sango-legacy-modules    user-modules

-------------------------------- /apps/.modulefiles81 ----------------------
   BUSCO/3.0.2                 comsol/51                matlab/R2013a
   Gaussian/09RE01             comsol/52                matlab/R2013b
   HTSeq/0.9.1                 comsol/54         (D)    matlab/R2014a
   MaterialsStudio/2016        crystalwave/4.9          matlab/R2014b
   MrBayes.mpi/3.2.3           csds/2016.0              matlab/R2015a

...

The AMD-specific modules are now listed first, followed by the metamodules for the different areas, and then the common modules. You could now load, say, "gromacs" for AMD.

You can load multiple metamodules at the same time. If they contain modules of the same name, the module from the most recently loaded area is picked. If you loaded "intel-modules" now and then loaded "gromacs", you would get the version built for the Intel nodes.

NOTE: Modules in "amd-modules" will work on all systems. Modules in "intel-modules" will be fast on the Intel nodes, but will crash on AMD nodes. The Intel compiler in "intel-modules" is an exception and does work everywhere.

Make sure that you use the appropriate version of your software when running your jobs. If you are using the "short" partition it's probably best to stick with the common modules and those in "amd-modules". For more information, see the Deigo page.
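
If a job may land on either kind of node, the choice of metamodule can be made at run time. The sketch below is an illustration under assumptions, not a recommended site procedure: 'pick_area' and 'cpu_vendor' are hypothetical helpers, and reading the vendor from /proc/cpuinfo assumes Linux on x86 nodes.

```shell
#!/bin/bash
# pick_area is a hypothetical helper: it maps a CPU vendor string to a
# metamodule name. Anything that is not recognisably Intel falls back to
# "amd-modules", which the note above says works on all systems.
pick_area() {
    case "$1" in
        *Intel*) echo "intel-modules" ;;
        *)       echo "amd-modules" ;;
    esac
}

# cpu_vendor is a hypothetical helper: it prints the vendor_id field from
# /proc/cpuinfo (an assumption about the nodes), or nothing if unavailable.
cpu_vendor() {
    [ -r /proc/cpuinfo ] || return 0
    awk -F': ' '/^vendor_id/ { print $2; exit }' /proc/cpuinfo
}

# Only run where the 'module' command exists, such as on the cluster.
if command -v module >/dev/null 2>&1; then
    module load "$(pick_area "$(cpu_vendor)")"
fi
```

This has to run inside the job itself, on the compute node, since the login node may have a different CPU from the node that runs the job.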

Finding Information

How do you find the module you want? You may know what you need, but not the name of the module. Or maybe you want to know more about how a module is installed.

First, the 'whatis' command will show you a brief description of a module:

$ ml whatis augustus
augustus/3.3.3      : Name: augustus
augustus/3.3.3      : Version: 3.3.3
augustus/3.3.3      : URL: http://bioinf.uni-greifswald.de/augustus/
augustus/3.3.3      : Category: bioinformatics
augustus/3.3.3      : Keywords: sequencing, analysis
augustus/3.3.3      : Description: Predict genes in eukaryotic genomic sequences.


On Deigo, the 'ml spider' subcommand will let you search for modules by name:

# 'spider' searches for module names
$ ml spider tri
----------------------------------------------------------------------------------------------------------------------------------------------------------
  Trimmomatic: Trimmomatic/0.33
----------------------------------------------------------------------------------------------------------------------------------------------------------
    Description:
      A flexible trimmer for Illumina sequence data.
...

# 'spider' by itself lists all modules with a short description:
$ module spider
--------------------------------------------------------------------------
The following is a list of the modules and extensions currently available:
--------------------------------------------------------------------------
  BUSCO: BUSCO/3.0.2, BUSCO/4.0.6
    Assess genome assembly and annotation completeness with benchmarking
    universal single-copy orthologs.

  Gaussian: Gaussian/09RE01

  HTSeq: HTSeq/0.9.1
    High-throughput sequencing data analysis with Python.
...

'module help' will give you in-depth information on a single module:

$ ml help qiime2

---------------------- Module Specific Help for "qiime2/2019.1" ----------------------
Powerful, extensible, and decentralized microbiome analysis package with a
focus on data and analysis transparency. A complete redesign and rewrite of
QIIME 1.

QIIME 2 comes distributed as a container. We add a small script that lets you
run it as 'qiime' without having to deal with the container directly. 

Some modules have only a brief description. Some have more information, including helpful tips for running them on the cluster. If you feel a module description could be improved, please let us know!

The 'key' subcommand (on Saion use 'apropos' instead) searches the tags and the description text of every module. This is useful when you don't know exactly what you are looking for:

$ ml key numeric
----------------------------------------------------------------------------------

The following modules match your search criteria: "numeric"
----------------------------------------------------------------------------------

  OpenBLAS.gcc: OpenBLAS.gcc/0.3.9
    An optimized BLAS and lapack library.

  R: R/3.4.2
    A popular software environment for statistical computing and graphics.

  aocl.aocc: aocl.aocc/2.1
    A set of numerical libraries tuned specifically for the AMD EPYC processor
    family. This is the AOCC version.
...
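
The search commands above are meant for interactive use. In a script, a rough availability check can be built on 'avail' instead. This is a sketch under assumptions: 'has_module' is a hypothetical helper, and it relies on the terse option '-t', which this page does not otherwise cover, and on merging standard error into the output, since the module command commonly prints its listings there.

```shell
#!/bin/bash
# has_module is a hypothetical helper. It succeeds if the terse 'avail'
# listing has a line that starts with the given name. This is a prefix
# match, so "julia" also matches "julia/1.3.1".
has_module() {
    module -t avail "$1" 2>&1 |
        awk -v want="$1" 'index($0, want) == 1 { found = 1 } END { exit !found }'
}

# Only run where the 'module' command exists, such as on the cluster.
if command -v module >/dev/null 2>&1; then
    if has_module julia/1.3.1; then
        module load julia/1.3.1
    else
        echo "julia/1.3.1 is not available here" >&2
    fi
fi
```

A literal prefix comparison is used instead of a regular expression so that names such as "ncbi-blast/2.7.1+" need no escaping.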

 

Unit-specific modules

Your unit may have software installed in your own unit-specific area. If you want to use that software, you need to tell the module system where to find those module files. If they have been installed according to our instructions, you can add your software modules with the 'module use' command:

$ module use /apps/unit/[unit name]U/.modulefiles/

The module commands will now look in that directory as well for module files, and you will be able to use any software you have installed there. If you use this often, it might be a good idea to add this command to your .bashrc file so it gets run each time you log in.
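
A guarded version for .bashrc could look like the sketch below. 'add_unit_modules' and the 'UNIT_MODULES' variable are hypothetical names introduced for illustration; set the variable to your unit's module directory as described above.

```shell
# Sketch for ~/.bashrc. add_unit_modules and UNIT_MODULES are hypothetical
# names, not something the cluster defines for you.
add_unit_modules() {
    local dir="$1"
    # Skip quietly if the directory is missing or 'module' is not defined,
    # so the same .bashrc also works on machines without the module system.
    [ -d "$dir" ] || return 0
    command -v module >/dev/null 2>&1 || return 0
    module use "$dir"
}

# Does nothing unless UNIT_MODULES has been set to a directory.
if [ -n "${UNIT_MODULES:-}" ]; then
    add_unit_modules "$UNIT_MODULES"
fi
```

The guards keep login from printing errors when the directory is temporarily unavailable.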

If you want to remove the unit-specific modules again, the 'module unuse' command will do that for you:

$ module unuse /apps/unit/[unit name]U/.modulefiles/

 

Common module commands

Find modules

  • av, avail
    Show all available modules

  • whatis [module name]
    Shows you a brief description of a module. Without a module name you get a
    list of all modules and their descriptions.

  • show [module name]
    Shows you the details of one module

  • key [keyword] (Deigo), apropos [keyword] (Saion)
    Searches all module descriptions for [keyword].

Use modules:

  • load
    Load a module so you can use it.

  • unload
    Remove the module again.

  • purge
    Remove all loaded modules.

  • list
    List the modules you have loaded.

Use your own modules:

  • use
    Add a directory to look for modulefiles.

  • unuse
    Remove a module file directory.