TIPS module

The tips.nf module contains several processes supplied by the TIPS library.

convertDS

The convertDS process converts a dataset from one format to another. The input and output formats are controlled by the flag element of the input channel.

Channel specification

Element Type i/o Note
name val in[0] an id to identify the process
input path in[1] input dataset
flag val in[2] flags for tips convert
name val out[0] same as input
converted path out[1] converted dataset [converted.*]
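
As a rough illustration of how the input paths and the flag string end up on one command line, here is a minimal Python sketch. The helper name `build_convert_command` is hypothetical and is not part of TIPS; the `-f`, `-o` and `-of` flags are the ones that appear in the module's source listing.

```python
import shlex

def build_convert_command(inputs, flags):
    """Assemble a `tips convert` argument list (hypothetical helper, not a TIPS API)."""
    # A single path is wrapped in a list; several paths are passed one after another.
    if isinstance(inputs, str):
        inputs = [inputs]
    # The flag string is split the way a shell would split it.
    return ["tips", "convert", *inputs, *shlex.split(flags)]

cmd = build_convert_command(["a.traj", "b.traj"], "-f asetraj -o merged -of asetraj")
print(" ".join(cmd))
```

The module itself performs the same joining inside the Nextflow script block, so the sketch only shows the shape of the resulting command.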

mergeDS

The mergeDS process merges a number of single-point calculations into one dataset. Note that the process also expects an idx element in the input channel, which gives the index of each corresponding single-point calculation; these indices are saved to the merged.idx file.

Channel specification

Element Type i/o Note
name val in[0] an id to identify the process
idx val in[1] indices of single point simulations
logs path in[2] logs from single point computations
flags val in[3] flags for tips convert
name val out[0] same as input
idx path out[1] file that records the indices [merged.idx]
merged path out[2] merged dataset [merged.traj]
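
The index file is plain text with one index per line. The following Python sketch shows that layout under this assumption; `write_idx` and `read_idx` are hypothetical helper names, and the module itself writes the file with a shell `printf`.

```python
import os
import tempfile

def write_idx(path, indices):
    # One index per line, without a trailing newline.
    with open(path, "w") as fh:
        fh.write("\n".join(str(i) for i in indices))

def read_idx(path):
    # Split on any whitespace so a trailing newline would not matter.
    with open(path) as fh:
        return [int(token) for token in fh.read().split()]

with tempfile.TemporaryDirectory() as tmp:
    idx_file = os.path.join(tmp, "merged.idx")
    write_idx(idx_file, [0, 50, 100])
    recovered = read_idx(idx_file)
```

Reading the file back yields the same indices in the same order, which is what allows a later step to match each merged frame to its position in the sampled trajectory.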

mixDS

The mixDS process takes two datasets, newDS and oldDS, and two flags, newFlag and oldFlag. Each dataset is first subsampled with its corresponding flag, and the two are then merged. This process is mainly used to update a training set in an active learning loop.

Channel specification

Element Type i/o Note
name val in[0] an id to identify the process
newDS path in[1] new dataset
oldDS path in[2] old dataset
newFlag val in[3] subsample flag for newDS
oldFlag val in[4] subsample flag for oldDS
name val out[0] same as input
mixed path out[1] mixed dataset [mix-ds.{tfr,yml}]
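
The subsample-then-merge idea can be sketched in a few lines of Python. This is an illustration only: the function names and the fixed sample sizes are assumptions standing in for whatever the subsample flags request, and plain lists stand in for datasets.

```python
import random

def subsample(dataset, size, rng):
    """Pick at most `size` items at random (hypothetical stand-in for a subsample flag)."""
    return rng.sample(dataset, min(size, len(dataset)))

def mix_datasets(new_ds, old_ds, new_size, old_size, seed=0):
    rng = random.Random(seed)
    # Subsample each dataset separately, then concatenate the two samples.
    mixed = subsample(new_ds, new_size, rng) + subsample(old_ds, old_size, rng)
    # Shuffle so that new and old samples are interleaved.
    rng.shuffle(mixed)
    return mixed

new_ds = [f"new-{i}" for i in range(10)]
old_ds = [f"old-{i}" for i in range(100)]
mixed = mix_datasets(new_ds, old_ds, new_size=5, old_size=20)
```

Subsampling the two datasets separately lets the ratio of new to old data in the training set be chosen independently of their original sizes.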

checkConverge

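checkConverge compares a sampled trajectory to labelled data. The output geometry is the labelled frame with the largest index if the check converges, and the labelled frame with the smallest index otherwise.

Convergence is controlled by the parameters listed under Parameters below.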

Channel specification

Element Type i/o Note
name val in[0] an id to identify the process
idx path in[1] index of labels in the trajectory
label path in[2] labelled data set
traj path in[3] sampled trajectory
name val out[0] same as input
geo path out[1] geometry
out val out[2] a string of convergence information
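
The choice of output geometry can be sketched as follows, assuming the rule used in the source listing: the frame with the largest index when converged, the frame with the smallest index otherwise. `pick_restart_frame` is a hypothetical name, and strings stand in for geometries.

```python
def pick_restart_frame(idx, frames, converged):
    """Return the labelled frame to continue from (illustrative sketch)."""
    pairs = list(zip(idx, frames))
    # Converged: take the frame with the largest index; otherwise the smallest.
    chooser = max if converged else min
    return chooser(pairs, key=lambda pair: pair[0])[1]

frames = ["frame-a", "frame-b", "frame-c"]
latest = pick_restart_frame([10, 30, 20], frames, converged=True)
earliest = pick_restart_frame([10, 30, 20], frames, converged=False)
```

Here `latest` is the frame labelled at index 30 and `earliest` is the frame labelled at index 10.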

Parameters

Parameter Default Description
fmaxtol 2.0 Max error on forces
emaxtol 0.02 Max error on energy
frmsetol 0.15 Tolerance for force RMSE
ermsetol 0.005 Tolerance for energy RMSE
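
To show how the four tolerances combine, here is a minimal pure-Python sketch of the check. It assumes per-atom energies and flattened force components given as plain lists; the names `error_stats` and `is_converged` are hypothetical. The source listing additionally requires the number of labelled points to equal `params.sp_points`, which is left out here.

```python
import math

# Default tolerances taken from the parameter table above.
DEFAULT_TOL = {"emaxtol": 0.02, "fmaxtol": 2.0, "ermsetol": 0.005, "frmsetol": 0.15}

def error_stats(pred, label):
    """Return (max absolute error, RMSE) for two equally long sequences."""
    diffs = [p - l for p, l in zip(pred, label)]
    max_err = max(abs(d) for d in diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return max_err, rmse

def is_converged(e_pred, e_label, f_pred, f_label, tol=DEFAULT_TOL):
    emax, ermse = error_stats(e_pred, e_label)
    fmax, frmse = error_stats(f_pred, f_label)
    # Every error measure must be strictly below its tolerance.
    return (emax < tol["emaxtol"] and fmax < tol["fmaxtol"]
            and ermse < tol["ermsetol"] and frmse < tol["frmsetol"])

forces_pred, forces_label = [0.1, -0.2, 0.05], [0.0, -0.1, 0.0]
ok = is_converged([-1.001, -2.002], [-1.0, -2.0], forces_pred, forces_label)
bad = is_converged([-1.05, -2.0], [-1.0, -2.0], forces_pred, forces_label)
```

In the second call the maximum energy error exceeds `emaxtol`, so the check fails even though the force errors are within tolerance.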

Source code

nextflow.enable.dsl=2

params.publish = "."

// Return a single path unchanged; join a list of paths with spaces.
def space_sep(in) { (in instanceof Path) ? in : in.join(' ') }

process convert {
  label 'tips'
  publishDir "$params.publish/$name"

  input:
    tuple val(name), path(in, stageAs:'.in*/*'), val(flags)

  output:
    tuple val(name), path('*')

  script:
    """
    tips convert ${space_sep(in)} $flags
    """
}

process dsmix {
  label 'tips'
  publishDir "$params.publish/$name"
  input: tuple val(name), path(newDS, stageAs:'*.traj'), path(oldDS, stageAs:'old/*'), val(newFlag), val(oldFlag)
  output: tuple val(name), path('mix-ds.{tfr,yml}')

  script:
  """
  tips convert old/${oldDS[0].baseName}.yml -f pinn -o old-ds -of asetraj $oldFlag
  tips convert ${space_sep(newDS)} -f asetraj -o tmp.traj -of asetraj
  tips convert tmp.traj -f asetraj -o new-ds -of asetraj $newFlag
  tips convert new-ds.traj old-ds.traj -f asetraj -o mix-ds -of pinn --shuffle $params.filters
  rm {new-ds,old-ds,tmp}.*
  """
}

process merge {
  label 'tips'
  publishDir "$params.publish/$name"
  input: tuple val(name), val(idx), path(in, stageAs:'.in*/*'), val(flags)
  output: tuple val(name), path('merged.idx'), path('merged.traj')

  script:
  """
  printf "${idx.join('\\n')}" > merged.idx
  tips convert ${space_sep(in)} -o merged -of asetraj $flags
  """
}

process check {
  label 'tips'
  publishDir "$params.publish/$name"

  input:
  tuple val(name), path(idx), path(logs), path(traj)

  output:
  tuple val(name), path('*.xyz'), stdout

  script:
  fmaxtol = params.fmaxtol
  emaxtol = params.emaxtol
  frmsetol = params.frmsetol
  ermsetol = params.ermsetol
  sp_points = params.sp_points
  """
  #!/usr/bin/env python
  import numpy as np
  from ase import Atoms
  from ase.io import read, write
  from tips.io import load_ds
  from tips.io.filter import filters2fn

  filters = "$params.filters".replace("'", '').split(' ')[1::2]
  filter_fn = filters2fn(filters) # ^ a crude extractor

  # ndmin=1 keeps a single-entry index file iterable
  idx = [int(i) for i in np.loadtxt("$idx", ndmin=1)]
  logs = load_ds("$logs", fmt='asetraj')
  traj = load_ds("$traj", fmt='asetraj')

  idx, logs = tuple(zip(*(
      (i, datum) for (i, datum) in zip(idx, logs) if filter_fn(datum))))

  e_label = np.array([datum['energy']/len(datum['elem']) for datum in logs])
  f_label = np.array([datum['force'] for datum in logs])
  e_pred = np.array([traj[i]['energy']/len(traj[i]['elem']) for i in idx])
  f_pred = np.array([traj[i]['force'] for i in idx])

  ecnt = np.sum(np.abs(e_pred-e_label)>$emaxtol)
  fcnt = np.sum(np.any(np.abs(f_pred-f_label)>$fmaxtol,axis=(1,2)))
  emax = np.max(np.abs(e_pred-e_label))
  fmax = np.max(np.abs(f_pred-f_label))
  ermse = np.sqrt(np.mean((e_pred-e_label)**2))
  frmse = np.sqrt(np.mean((f_pred-f_label)**2))
  converged = (emax<$emaxtol) and (fmax<$fmaxtol) and (ermse<$ermsetol) and (frmse<$frmsetol) and (len(idx)==$sp_points)

  geoname = "$name".split('/')[1]
  if converged:
      msg = 'Converged; will restart from latest frame.'
      new_geo = logs[np.argmax(idx)]
  else:
      msg = f'energy: {ecnt}/{len(idx)} failed, max={emax:.2f} rmse={ermse:.2f}; '\
            f'force: {fcnt}/{len(idx)} failed, max={fmax:.2f} rmse={frmse:.2f}.'
      new_geo = logs[np.argmin(idx)]
  atoms = Atoms(new_geo['elem'], positions=new_geo['coord'], cell=new_geo['cell'],
                pbc=True)
  write(f'{geoname}.xyz', atoms)
  print(msg)
  """
}