Running Comsol jobs

If you want to run Comsol on your local computer you can get access from this form.

When you want to use Comsol on the cluster, you first create your model interactively on your own computer; then you copy the model files to your home and use Comsol in batch mode to simulate the model on the cluster.

Please see here for information about using Slurm to run your jobs. The Comsol site has a page here with options for running Comsol on Linux on a cluster.

Comsol and temporary data

Important: When running on the cluster, Comsol creates files in the hidden ".comsol/" directory in your home. Under ".comsol/v61/recoveries" it stores snapshots of your simulation and under ".comsol/v61/configuration" it stores per-process configuration data. ".comsol/v61/logs" keeps a log file for every Comsol job you run.

If your job fails to finish, Comsol will not remove these files. You should periodically clean up these directories, or you may completely fill up your home with this data.
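For example, you can check how big these directories have become and clear out old recovery files and logs. This is a minimal sketch; the v61 paths match Comsol 6.1 as above, so adjust them for your version, and make sure none of your Comsol jobs are still running before you delete anything:

# How much space is Comsol's hidden directory using?
du -sh ~/.comsol

# Remove old recovery snapshots and logs
rm -rf ~/.comsol/v61/recoveries/*
rm -rf ~/.comsol/v61/logs/*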

Comsol and Shared Memory Multiprocessing

You run a serial or shared-memory multithreaded Comsol job with a script like the one below. It uses the workflow that we recommend: create and use a temporary directory on /flash, copy your result files to Bucket, then delete the temporary directory.

You ask Slurm for the number of cores you need with --cpus-per-task. Comsol will automatically use these cores, so you don’t need to set any specific options. Set the model file with “modelfile”.

#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=8G
#SBATCH --partition=compute
#SBATCH --time=0-1

module load comsol/61

# The name and location of your model file.
modelfile="/bucket/MyunitU/rf_heating.mph"

# Set Comsol options. For multithreading we need none.
comsol_options=""

# create a temporary directory for this job and save the name
# Change "MyunitU" to reflect your unit name
tempdir=$(mktemp -d /flash/MyunitU/Comsoljob.XXXXXX)

# copy our model file to the temporary directory
cp ${modelfile} ${tempdir}

# Go to our temporary directory
cd ${tempdir}

# Run Comsol on the local copy of the model. We get the model output
# and the log file. We break the line for readability
comsol ${comsol_options} batch -inputfile $(basename ${modelfile}) -outputfile model_out.mph \
                               -study std1 -batchlog output.log

# Copy results back to Bucket. Replace "MyunitU" with your unit directory.
scp model_out.mph output.log deigo:/bucket/MyunitU/

# Clean our temporary directory
rm -r ${tempdir}

You can set --cpus-per-task to any value from 1 to 128. With 1 core the job runs as a serial job; with 128 cores it uses all the cores on a single node.
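You can also change the value at submission time without editing the script, since sbatch options given on the command line override the #SBATCH lines in the script. For example, assuming you saved the script above as comsol_smp.slurm (a file name we made up here):

sbatch --cpus-per-task=16 comsol_smp.slurm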

Please note that more cores is not necessarily better! Almost all jobs have a limit on how many cores they can use, and if you give them more than that, your computation will slow down. Don’t just set this to 128 and assume this is the fastest setting - it probably is not.

Comsol Distributed Processing (MPI jobs)

To run a Comsol job with MPI, you need a few extra options. The script is almost the same as for the multithreaded job above; we only need to change our Slurm settings and add a couple of options to comsol_options.

#!/bin/bash 
#SBATCH --ntasks=8
#SBATCH --cpus-per-task=1
#SBATCH --mem=8G
#SBATCH --partition=compute
#SBATCH --time=0-1

module load comsol/61

# The name and location of your model file.
modelfile="/bucket/MyunitU/rf_heating.mph"

# Set Comsol options for MPI distributed processing
comsol_options="-mpibootstrap slurm -nn ${SLURM_NPROCS} -mpiroot ${I_MPI_ROOT}/"

# create a temporary directory for this job and save the name
# Change "MyunitU" to reflect your unit name
tempdir=$(mktemp -d /flash/MyunitU/Comsoljob.XXXXXX)

# copy our model file to the temporary directory
cp ${modelfile} ${tempdir}

# Go to our temporary directory
cd ${tempdir}

# Run Comsol on the local copy of the model. We get the model output
# and the log file. We break the line for readability
comsol ${comsol_options} batch -inputfile $(basename ${modelfile}) -outputfile model_out.mph \
                               -study std1 -batchlog output.log

# Copy results back to Bucket. Replace "MyunitU" with your unit directory.
scp model_out.mph output.log deigo:/bucket/MyunitU/

# Clean our temporary directory
rm -r ${tempdir}

We are starting 8 separate Comsol processes, and they use MPI to communicate with each other. This allows them to run on separate nodes if needed, and can potentially scale to several hundred processes.
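If you want to control how the tasks are spread over the nodes, you can do so with standard Slurm options. As a sketch, this hypothetical variation of the header above spreads the 8 tasks evenly over two nodes:

#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=1

The rest of the script stays the same, since comsol_options picks up the total task count from ${SLURM_NPROCS}.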

However, this is generally less efficient than shared-memory multiprocessing; for many models shared-memory multiprocessing is preferable.

Hybrid Comsol Jobs

You can set both the number of tasks and the number of cpus (cores) per task to more than 1. For instance, if you decide you want to use 16 cores in total, you could allocate them as:

ntasks    cpus-per-task
    16                1
     8                2
     4                4
     2                8
     1               16

Which combination is fastest? This depends a lot on the size of your model, and also on what kind of model it is. The only way to know is to try the different combinations; but as a rule of thumb, the larger the model, the more it benefits from increasing the number of cores per task.
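For example, to try the 4 tasks with 4 cores each combination from the table, you would change the Slurm settings in the MPI script above and also tell Comsol how many cores each process may use. As far as we know, -np is the relevant Comsol option for this; treat the following as a sketch and check the Comsol documentation for your version:

#SBATCH --ntasks=4
#SBATCH --cpus-per-task=4

# As in the MPI script, plus -np to set the cores per process
comsol_options="-mpibootstrap slurm -nn ${SLURM_NPROCS} -np ${SLURM_CPUS_PER_TASK} -mpiroot ${I_MPI_ROOT}/"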