PiNN workflows
pinnTrain
The pinnTrain process runs a training given a dataset and a PiNN input file.
The input file can optionally be a model folder, in which case the process
continues training from the last available checkpoint. For a list of available
flags, see the PiNN documentation.
Input Channels
Channel | Type | i/o | Note |
---|---|---|---|
name | val | in[0] | an id to identify the process |
dataset | file | in[1] | a dataset recognizable by PiNN |
input | file | in[2] | a PiNN .yml input file, or a model folder |
flags | val | in[3] | flags for pinn train |
name | val | out.*[0] | same as input |
model | file | out.model[1] | trained PiNN model |
log | file | out.log[1] | PiNN log (the evaluation errors) |
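The train process listed under Source code pulls a `--seed` option out of the flag string and hands it to `pinn convert`, while the remaining flags go to `pinn train`. The Python sketch below mirrors that split with the same regular expression. The function name `split_seed` and the flag `--other-option` are hypothetical, and returning an empty string when no seed is present is a choice made for this sketch rather than the behaviour of the Groovy code.

```python
import re

# Same pattern as in the train process: "--seed", one separator, then digits.
SEED_PATTERN = r"--seed[\s,=]\d+"


def split_seed(flags):
    """Split a flag string into (convert_flag, train_flags)."""
    match = re.search(SEED_PATTERN, flags)
    # Assumption of this sketch: no seed means nothing is passed to the conversion.
    convert_flag = match.group(0) if match else ""
    # Everything except the seed option is kept for the training step.
    train_flags = re.sub(SEED_PATTERN, "", flags)
    return convert_flag, train_flags


# "--other-option" is a placeholder, not a documented PiNN flag.
convert_flag, train_flags = split_seed("--seed 1 --other-option 5")
print(repr(convert_flag), repr(train_flags))
```

The leftover leading space in the training flags is harmless once the string is interpolated into a shell command.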
pinnMD
The pinnMD process takes a trained model and runs an MD trajectory, with some
limited options regarding the dynamics. The process supports only PiNN models
for now. A list of models can be supplied to run an ensemble MD. For more
complex processes, consider writing a customized MD process and including it in
your workflow.
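As a rough illustration of what an ensemble of models offers, the sketch below computes the average and spread of per-model energy predictions for one configuration. It is an assumption about how ensemble statistics are commonly formed, not the implementation of the EnsembleBiasedCalculator used by the process. The function name `ensemble_energy` and the energy values are hypothetical.

```python
from statistics import mean, pstdev


def ensemble_energy(energies):
    """Return (average, spread) of per-model energy predictions.

    The spread is the population standard deviation, which is one possible
    choice of uncertainty estimate (an assumption of this sketch).
    """
    return mean(energies), pstdev(energies)


# Hypothetical energies (in eV) predicted by three models for the same geometry.
avg, std = ensemble_energy([-10.2, -10.0, -9.8])
print(f"average = {avg:.3f} eV, spread = {std:.3f} eV")
```

A large spread relative to the average signals that the models disagree on that configuration.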
Input Channels
Channel | Type | i/o | Note |
---|---|---|---|
name | val | in[0] | an id to identify the process |
model | file | in[1] | trained PiNN model, can be a list of models |
init | file | in[2] | initial geometry, in any ASE recognizable format |
flags | val | in[3] | a string specifying the MD simulation, see options below |
name | val | out.*[0] | same as input |
traj | file | out.traj[1] | output trajectory |
log | file | out.log[1] | MD log |
Options
The flags for pinnMD are specified in the form --option1 val1 --option2 val2;
the available options and their default values are listed below.
Option | Default | Note |
---|---|---|
--ensemble | 'nvt' | 'npt' or 'nvt' |
--T | 273 | Temperature in K |
--t | 100 | Time in ps |
--dt | 0.5 | Time step in fs |
--taut | 100 | Damping factor for thermostat in steps |
--taup | 1000 | Damping factor for barostat in steps |
--log-every | 5 | Log interval in steps |
--pressure | 1 | Pressure in bar |
--compressibility | 4.57e-5 | Compressibility in bar^-1 |
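To make the flag format concrete, here is a minimal Python sketch of how such a string can be turned into settings and how the number of MD steps follows from the total time and the time step. It uses the same regular expression as the md process listed under Source code. The names `DEFAULTS`, `parse_flags`, and `n_steps` are hypothetical, and only a few options are included, with defaults taken from the table above.

```python
import re

# A subset of the options, with the defaults listed in the table above.
DEFAULTS = {
    "ensemble": "nvt",
    "T": 273,    # temperature in K
    "t": 100,    # total time in ps
    "dt": 0.5,   # time step in fs
}


def parse_flags(flags, defaults=DEFAULTS):
    """Overlay '--key value' (or '--key=value') pairs on top of the defaults."""
    setup = dict(defaults)
    # Each match is a (key, value) pair; values stay strings until converted.
    setup.update(re.findall(r"--(.*?)[\s,=]([^\s]*)", flags))
    return setup


def n_steps(setup):
    """Number of MD steps: total time (ps -> fs) divided by the time step (fs)."""
    t_fs = float(setup["t"]) * 1000.0
    dt_fs = float(setup["dt"])
    return int(t_fs / dt_fs)


setup = parse_flags("--ensemble npt --T 300 --t 10")
print(setup["ensemble"], setup["T"], n_steps(setup))  # npt 300 20000
```

Because parsed values are strings, each one has to be converted (for example with `float` or `int`) before use, as the process itself does.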
Source code
nextflow.enable.dsl=2
params.publish = 'pinn'
process train {
  label 'pinn'
  publishDir "$params.publish/$name"
  input:
    tuple val(name), path(dataset), path(input, stageAs:'input'), val(flags)
  output:
    tuple val(name), path('model', type:'dir'), emit: model
    tuple val(name), path('pinn.log'), emit: log
  script:
    convert_flag = "${(flags =~ /--seed[\s,\=]\d+/)[0]}"
    train_flags = "${flags.replaceAll(/--seed[\s,\=]\d+/, '')}"
    dataset = (dataset instanceof Path) ? dataset : dataset[0].baseName+'.yml'
    """
    #!/bin/bash
    pinn convert $dataset -o 'train:9,eval:1' $convert_flag
    if [ ! -f $input/params.yml ]; then
        mkdir -p model; cp $input model/params.yml
    else
        cp -rL $input model
    fi
    pinn train model/params.yml --model-dir='model'\
        --train-ds='train.yml' --eval-ds='eval.yml'\
        $train_flags
    pinn log model/eval > pinn.log
    """
}
process md {
  label 'pinn'
  publishDir "$params.publish/$name"
  input:
    tuple val(name), path(model,stageAs:'model*'), path(init, stageAs:'init*'), val(flags)
  output:
    tuple val(name), path('asemd.traj'), emit: traj
    tuple val(name), path('asemd.log'), emit: log
  script:
    """
    #!/usr/bin/env python
    import re
    import pinn
    import tensorflow as tf
    from ase import units
    from ase.io import read
    from ase.io.trajectory import Trajectory
    from ase.md import MDLogger
    from ase.md.velocitydistribution import MaxwellBoltzmannDistribution
    from ase.md.nptberendsen import NPTBerendsen
    from ase.md.nvtberendsen import NVTBerendsen
    from tips.bias import EnsembleBiasedCalculator
    # ------------ patch ase properties to write extra cols --------------------
    from ase.calculators.calculator import all_properties
    all_properties+=[f'{prop}_{extra}' for prop in ['energy', 'forces', 'stress'] for extra in ['avg','std','bias']]
    # --------------------------------------------------------------------------
    setup = {
        'ensemble': 'nvt', # ensemble
        'T': 340, # temperature in K
        't': 50, # time in ps
        'dt': 0.5, # timestep in fs
        'taut': 100, # thermostat damping in steps
        'taup': 1000, # barostat damping in steps
        'log-every': 20, # log interval in steps
        'pressure': 1, # pressure in bar
        'compressibility': 4.57e-4, # compressibility in bar^{-1}
        'bias': None,
        'kb': 0,
        'sigma0': 0,
    }
    flags = {
        k: v for k,v in
        re.findall('--(.*?)[\\s,\\=]([^\\s]*)', "$flags")
    }
    setup.update(flags)
    ensemble=setup['ensemble']
    T=float(setup['T'])
    t=float(setup['t'])*units.fs*1e3
    dt=float(setup['dt'])*units.fs
    taut=int(setup['taut'])
    taup=int(setup['taup'])
    every=int(setup['log-every'])
    pressure=float(setup['pressure'])
    compressibility=float(setup['compressibility'])
    ${(model instanceof Path) ?
      "calc = pinn.get_calc('$model')" :
      """
    models = ["${model.join('", "')}"]
    calcs = [pinn.get_calc(model) for model in models]
    if len(calcs) == 1:
        calc = calcs[0]
    else:
        calc = EnsembleBiasedCalculator(calcs,
                                        bias=setup['bias'],
                                        kb=float(setup['kb']),
                                        sigma0=float(setup['sigma0']))
    """}
    atoms = read("$init")
    atoms.set_calculator(calc)
    if not atoms.has('momenta'):
        MaxwellBoltzmannDistribution(atoms, T*units.kB)
    if ensemble == 'npt':
        dyn = NPTBerendsen(atoms, timestep=dt, temperature=T, pressure=pressure,
                           taut=dt * taut, taup=dt * taup, compressibility=compressibility)
    if ensemble == 'nvt':
        dyn = NVTBerendsen(atoms, timestep=dt, temperature=T, taut=dt * taut)
    dyn.attach(
        MDLogger(dyn, atoms, 'asemd.log',stress=True, mode="w"),
        interval=int(every))
    dyn.attach(
        Trajectory('asemd.traj', 'w', atoms).write,
        interval=int(every))
    dyn.run(int(t/dt))
    """
}