Migrating to PiNN 1.x (TF2)
Starting with version 1.x, PiNN uses TensorFlow 2 as its backend, which introduces changes to the API. This document describes those changes and provides guidance for migration.
New features
CLI: PiNN 1.x introduces a new entry point, pinn, as the command line interface. The trainer module will be replaced with the pinn train sub-command. The CLI also exposes utilities such as dataset conversion for easier usage.
Parameter file: in PiNN 1.0 the parameter file serves as a comprehensive input for PiNN models. The structure of the parameter file has changed; see the documentation for more information.
Extended Kalman filter: an experimental extended Kalman filter (EKF) optimizer is implemented.
Notes for developers
- Documentation is now built with mkdocs.
- Documentation is moved to GitHub Pages.
- Continuous integration is moved to GitHub Actions.
- The Docker Hub repo is now teoroo/pinn.
Datasets: dataset loaders should be mostly compatible with PiNN 0.x. With the TF2 update, datasets may be inspected interactively with eager execution. The splitting option is simplified (see below), and splitting with load_tfrecord becomes possible.
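To make the simplified splitting option concrete, here is a minimal sketch of what a flat mapping of subset names to numbers could mean, assuming the values are relative weights. This is not PiNN's splitting code: split_indices is a hypothetical, standard-library-only helper written for illustration, and the shuffling and rounding choices are assumptions.

```python
import random


def split_indices(n_items, splits, seed=0):
    """Partition range(n_items) into named subsets.

    Hypothetical illustration, not part of PiNN. `splits` is a flat dict
    that maps subset names to relative weights, e.g. {'train': 8, 'test': 2}.
    """
    total = sum(splits.values())
    indices = list(range(n_items))
    # Shuffle reproducibly so that each subset is a random sample.
    random.Random(seed).shuffle(indices)

    result, start, accumulated = {}, 0, 0
    for name, weight in splits.items():
        accumulated += weight
        # Cumulative boundaries guarantee every index is used exactly once.
        stop = round(n_items * accumulated / total)
        result[name] = indices[start:stop]
        start = stop
    return result


parts = split_indices(10, {'train': 8, 'test': 2})
print({name: len(subset) for name, subset in parts.items()})
```

Computing the boundaries from cumulative weights, rather than rounding each subset size separately, ensures the subsets always add up to the full dataset.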
Networks: following the TF2 guidelines, networks in PiNN 1.x are now Keras models and layers become Keras layers. This means the PiNN networks can be used to perform some simple prediction tasks. Note that PiNN models are still implemented as TensorFlow estimators, since they provide better control over the training and prediction behavior. As in PiNN 0.x, the models interpret the predictions of PiNN networks as physical quantities and interface them to atomic simulation packages.
Models: new helper function export_mode and class MetricsCollector are implemented to simplify the implementation of models, see the source of dipole model for an example.
Breaking changes
- Models trained in PiNN 0.x will not be usable in PiNN 1.x.
- Model parameters need to be adapted to the new parameter format.
- For dataset loaders load_*:
  - the split argument is renamed to splits;
  - splitting is disabled by default;
  - nested splits like {'train':1, 'test':[1,2,3]} are not supported anymore;
  - the format_dict is renamed to ds_spec to be consistent with TensorFlow.
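The loader changes listed above can be sketched as a small translation step for scripts written against PiNN 0.x. The helper below is hypothetical and not part of PiNN: migrate_loader_kwargs is an illustrative name, and it only renames the two keyword arguments mentioned above and rejects nested splits. It does not call any loader.

```python
def migrate_loader_kwargs(old_kwargs):
    """Translate PiNN 0.x-style loader keyword arguments to the 1.x names.

    Hypothetical helper for illustration, not part of PiNN.
    """
    # Renames described in the breaking changes list.
    renames = {'split': 'splits', 'format_dict': 'ds_spec'}

    new_kwargs = {}
    for key, value in old_kwargs.items():
        new_key = renames.get(key, key)
        if new_key in new_kwargs:
            raise ValueError(f'both old and new names given for {new_key!r}')
        new_kwargs[new_key] = value

    # Nested splits such as {'train': 1, 'test': [1, 2, 3]} must be flattened
    # by hand, so fail loudly instead of guessing what was intended.
    splits = new_kwargs.get('splits')
    if isinstance(splits, dict):
        nested = [name for name, weight in splits.items()
                  if isinstance(weight, (list, tuple, dict))]
        if nested:
            raise ValueError(f'nested splits are no longer supported: {nested}')
    return new_kwargs


print(migrate_loader_kwargs({'split': {'train': 8, 'test': 2}}))
```

Because splitting is disabled by default in 1.x, a script that relied on the old default will need to pass splits explicitly; this sketch leaves that decision to the caller.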