The Alvis Cluster
The Alvis cluster is a national NAISS resource dedicated to Artificial Intelligence and Machine Learning research. The Alvis profile is configured to use the GPU resources there.
TensorFlow and PiNN
Since TensorFlow is already installed on Alvis as a module, it is recommended to run PiNN on top of it. To do so, create a Python virtual environment with the supplied TensorFlow (TF) module loaded:
ml TensorFlow/2.6.0-foss-2021a-CUDA-11.3.1
python -m venv $HOME/pinn-tf26
source $HOME/pinn-tf26/bin/activate
pip install git+https://github.com/teoroo-cmc/pinn.git
The above creates a Python virtual environment based on the system TF module. To run PiNN manually in a new bash session, load the same module and activate the environment again:
ml TensorFlow/2.6.0-foss-2021a-CUDA-11.3.1
source $HOME/pinn-tf26/bin/activate
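Once the environment is active, a quick sanity check can confirm that TensorFlow sees a GPU and that PiNN imports cleanly. The sketch below wraps both checks in a helper; the function name `check_pinn_env` is hypothetical, and on a node without GPUs the printed device list may simply be empty.

```shell
# Hypothetical helper (not part of PiNN or Alvis): run inside the activated venv.
check_pinn_env() {
    # List the GPUs visible to TensorFlow; an empty list means none were found.
    python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))" || return 1
    # Confirm that the pip-installed PiNN package can be imported.
    python -c "import pinn; print('pinn imported from', pinn.__file__)" || return 1
}
```

Calling `check_pinn_env` after loading the module and activating the environment returns a non-zero status if either import fails.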
You might also want to make this environment available in the Alvis OnDemand portal. After activating your environment, register it as a Jupyter kernel:
pip install ipykernel
python -m ipykernel install --user --name pinn-tf26 --display-name "pinn-tf26"
CP2K
The container image in NGC for CP2K supports acceleration through CUDA. You
will need to build the Singularity image following the NGC instructions and
point the profile to your image. The accelerators should be picked up
automatically; you can verify this by looking for the ACC: tags in the
CP2K log file.
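As a rough sketch of these two steps, the image can be built from the NGC registry and the log checked with `grep`. The tag `v9.1.0` matches the container named in the profile; the file names `cp2k_v9.1.0.sif` and `cp2k.log` are placeholders, the helper names are hypothetical, and the NGC instructions remain the authoritative source for the build command.

```shell
# Build a local Singularity image from the NGC container (run once).
# Assumption: the `singularity` command is available; the .sif name is a placeholder.
build_cp2k_image() {
    singularity build cp2k_v9.1.0.sif docker://nvcr.io/hpc/cp2k:v9.1.0
}

# Print the ACC: lines of a CP2K log file given as the first argument.
# Returns a non-zero status when no such line is present.
show_acc_lines() {
    grep 'ACC:' "$1"
}
```

For example, `show_acc_lines cp2k.log` prints the accelerator lines if CP2K picked up the GPUs and exits with an error status otherwise.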
Profile
profiles {
  alvis {
    params {
      cp2k_cmd = 'OMP_NUM_THREADS=2 mpirun -n 4 cp2k.psmp'
    }
    executor {
      name = 'slurm'
      queueSize = 100
      submitRateLimit = '120 min'
    }
    process {
      time = '3d'
      executor = 'slurm'
      // failed tasks are ignored instead of stopping the workflow
      errorStrategy = 'ignore'
      withLabel: 'tips|pinn' {
        beforeScript = 'source $HOME/pinn-tf26/bin/activate'
        // same module as the one used to create the virtual environment
        module = 'TensorFlow/2.6.0-foss-2021a-CUDA-11.3.1'
      }
      withLabel: 'tips|molutils' { executor = 'local' }
      withLabel: 'pinn' {
        scratch = true
        clusterOptions = '--gres=gpu:T4:1'
        container = 'teoroo/pinnacle:pinn'
      }
      withLabel: 'cp2k' {
        scratch = true
        clusterOptions = '--gres=gpu:T4:2'
        container = 'nvcr.io/hpc/cp2k:v9.1.0'
      }
    }
  }
}
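To use this profile, select it by name when launching Nextflow. A minimal sketch, assuming the profile is part of a configuration file that Nextflow already loads and that the workflow entry point is `main.nf` (a placeholder, as is the wrapper name `run_on_alvis`):

```shell
# Hypothetical wrapper: launch a workflow with the alvis profile.
# The first argument is the workflow script or project; defaults to main.nf.
run_on_alvis() {
    nextflow run "${1:-main.nf}" -profile alvis
}
```

With this profile, Nextflow submits tasks through Slurm, apart from those whose labels match `tips|molutils`, which run locally.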