The PiNet network
The PiNet network implements the architecture described in our paper.1 The network features a graph convolution that recursively generates atomic properties from the local environment. One distinctive feature of PiNet is that the convolution operation is realized with pairwise functions whose forms are determined by the pair of atoms involved, called pairwise interactions.
Network architecture
The overall architecture of PiNet is illustrated in the figure below:
The preprocess part of the network is implemented with shared layers (see
Layers). The graph-convolution (GC) block is further divided
into PI and IP operations, each consisting of several layers. These operations are
applied recursively to update the latent variables, and the output is updated
after each iteration (OutLayer).
In our notation, we classify the latent variables into atom-centered "properties"
(\(\mathbb{P}\)) and pairwise "interactions" (\(\mathbb{I}\)).
Since the layers that transform \(\mathbb{P}\) to \(\mathbb{P}\) or \(\mathbb{I}\) to
\(\mathbb{I}\) are usually standard feed-forward neural networks (FFLayer), the
special parts of PiNet are the PILayer and IPLayer, which transform between
these two types of variables.
We use superscripts to identify each tensor, and subscripts to differentiate indices of different types for each variable, following the convention:
- \(b\): basis function index;
- \(\alpha,\beta,\gamma,\ldots\): feature channels;
- \(i,j,k,\ldots\): atom indices;
- \(x,y,z\): Cartesian coordinate indices.
\(\mathbb{P}^{t}_{i\alpha}\) thus denotes the value of the \(\alpha\)-th channel of the \(i\)-th atom in the tensor \(\mathbb{P}^{t}\). We always write out all the subscripts of a given tensor in the equations below, so that the dimensionality of each tensor is unambiguous.
For instance, \(r_{ij}\) denotes a scalar distance defined for each pair of atoms, indexed by \(i,j\); \(\mathbb{P}_{i\alpha}\) denotes the atomic feature vectors, indexed by \(i\) for the atom and \(\alpha\) for the channel. The equations that define each of the above layers and the hyperparameters available for the PiNet network are detailed below.
The parameters for PiNet are outlined in the network specification and can be set in the configuration file as shown in the following snippet:
"network": {
"name": "PiNet",
"params": {
"atom_types": [1, 8],
"basis_type": "gaussian",
"depth": 5,
"ii_nodes": [16, 16, 16, 16],
"n_basis": 10,
"out_nodes": [16],
"pi_nodes": [16],
"pp_nodes": [16, 16, 16, 16],
"rc": 6.0,
}
},
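The same parameters can also be passed to the class constructor directly. Below is a minimal sketch (not part of the original documentation), assuming `PiNet` is imported from the `pinn.networks.pinet` module referenced in the specification below:

```python
# Minimal sketch: constructing PiNet with the parameters from the snippet above.
from pinn.networks.pinet import PiNet

network = PiNet(
    atom_types=[1, 8],               # H and O
    basis_type="gaussian",
    depth=5,
    ii_nodes=[16, 16, 16, 16],
    n_basis=10,
    out_nodes=[16],
    pi_nodes=[16],
    pp_nodes=[16, 16, 16, 16],
    rc=6.0,
)
```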
Network specification
pinet.PiNet
Bases: Model
This class implements the Keras Model for the PiNet network.
__init__(atom_types=[1, 6, 7, 8], rc=4.0, cutoff_type='f1', basis_type='polynomial', n_basis=4, gamma=3.0, center=None, pp_nodes=[16, 16], pi_nodes=[16, 16], ii_nodes=[16, 16], out_nodes=[16, 16], out_units=1, out_pool=False, act='tanh', depth=4)
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`atom_types` | list | elements for the one-hot embedding | `[1, 6, 7, 8]` |
`pp_nodes` | list | number of nodes for PPLayer | `[16, 16]` |
`pi_nodes` | list | number of nodes for PILayer | `[16, 16]` |
`ii_nodes` | list | number of nodes for IILayer | `[16, 16]` |
`out_nodes` | list | number of nodes for OutLayer | `[16, 16]` |
`out_pool` | str | pool atomic outputs, see ANNOutput | `False` |
`depth` | int | number of interaction blocks | `4` |
`rc` | float | cutoff radius | `4.0` |
`basis_type` | string | basis function, can be "polynomial" or "gaussian" | `'polynomial'` |
`n_basis` | int | number of basis functions to use | `4` |
`gamma` | float or array | width of gaussian function for gaussian basis | `3.0` |
`center` | float or array | center of gaussian function for gaussian basis | `None` |
`cutoff_type` | string | cutoff function to use with the basis | `'f1'` |
`act` | string | activation function to use | `'tanh'` |
call(tensors)
PiNet takes batches of atomic data as input; the following keys are required in the input dictionary of tensors (a minimal example dictionary is sketched below):

- `ind_1`: sparse indices for the batched data, with shape `(n_atoms, 1)`;
- `elems`: elements (atomic numbers) for each atom, with shape `(n_atoms)`;
- `coord`: coordinates for each atom, with shape `(n_atoms, 3)`.
Optionally, the input dataset can be processed with
PiNet.preprocess(tensors), which adds the following tensors to the
dictionary:

- `ind_2`: sparse indices for the neighbour list, with shape `(n_pairs, 2)`;
- `dist`: distances from the neighbour list, with shape `(n_pairs)`;
- `diff`: distance vectors from the neighbour list, with shape `(n_pairs, 3)`;
- `prop`: initial properties, with shape `(n_atoms, n_elems)`.
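For illustration, here is a minimal sketch (with arbitrary coordinates) of the required input dictionary for a single water molecule; the tensor shapes follow the key descriptions above:

```python
import tensorflow as tf

# Sketch of the required inputs for one H2O structure in a batch of one.
tensors = {
    "ind_1": tf.zeros([3, 1], dtype=tf.int32),         # all three atoms belong to structure 0
    "elems": tf.constant([8, 1, 1], dtype=tf.int32),    # atomic numbers: O, H, H
    "coord": tf.constant([[0.000,  0.000, 0.000],
                          [0.000,  0.757, 0.587],
                          [0.000, -0.757, 0.587]]),     # coordinates in Angstrom
}
# output = network(tensors)   # with `network` being a PiNet instance
```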
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`tensors` | dict of tensors | input tensors | required |
Returns:

Name | Type | Description |
---|---|---|
`output` | tensor | output tensor with shape |
Layer specifications
pinet.FFLayer
Bases: Layer
FFLayer is a shortcut to create a multi-layer perceptron (MLP), or
feed-forward network. An FFLayer takes one tensor of arbitrary shape as
input and passes it through a stack of tf.keras.layers.Dense layers, specified
by n_nodes. Each dense layer transforms the input variable as:

\[
\mathbb{X}_{\beta} = h\left(\sum_{\alpha} \mathbb{X}_{\alpha} W_{\alpha\beta} + b_{\beta}\right),
\]

where \(W_{\alpha\beta}\) and \(b_{\beta}\) are the learnable weights and biases,
\(h\) is the activation function, and \(\mathbb{X}\) can be
\(\mathbb{P}_{i\alpha}\) or \(\mathbb{I}_{ij\alpha}\), with \(\alpha,\beta\) being
the indices of the input/output channels. Keyword arguments are passed to
the class and can be used to specify the bias, activation function, etc.
of the dense layers. FFLayer outputs a tensor with the shape [...,
n_nodes[-1]].
In the PiNet architecture, PPLayer and IILayer are both instances of the
FFLayer class, namely:

\[
\mathbb{P}^{t}_{i\beta} = \mathrm{FF}\left(\mathbb{P}^{t}_{i\alpha}\right), \qquad
\mathbb{I}^{t}_{ij\beta} = \mathrm{FF}\left(\mathbb{I}^{t}_{ij\alpha}\right),
\]

with the difference that IILayers have their biases set to zero to avoid
discontinuities in the model output.
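The transformation above amounts to a stack of dense layers acting on the last dimension of the input. A minimal sketch of the same idea with plain Keras layers (illustrative only, not the library implementation):

```python
import tensorflow as tf

n_nodes = [16, 16]
# A feed-forward stack in the spirit of FFLayer: each Dense layer acts on the
# last dimension of the input tensor, whatever its leading shape is.
mlp = tf.keras.Sequential(
    [tf.keras.layers.Dense(n, activation="tanh") for n in n_nodes]
)

prop = tf.random.normal([5, 8])   # e.g. P_{i,alpha}: 5 atoms, 8 channels
out = mlp(prop)                   # shape [5, n_nodes[-1]]
```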
__init__(n_nodes=[64, 64], **kwargs)
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`n_nodes` | list | dimension of the layers | `[64, 64]` |
`**kwargs` | dict | options to be passed to the dense layers | `{}` |
call(tensor)
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`tensor` | tensor | input tensor | required |
Returns:

Name | Type | Description |
---|---|---|
`tensor` | tensor | tensor with shape `[..., n_nodes[-1]]` |
pinet.PILayer
Bases: Layer
PILayer takes the properties (\(\mathbb{P}_{i\alpha},
\mathbb{P}_{j\alpha}\)) of a pair of atoms as input and outputs a set of
interactions for each pair. The inputs are broadcast and concatenated
as the input of a feed-forward neural network (FFLayer), and the
interactions are generated by using the output of the FFLayer as weights
of the radial basis functions, i.e.:

\[
\mathbb{I}_{ij\beta} = \sum_{b} w_{ij(b\beta)}\, e_{ijb},
\]

where \(e_{ijb}\) are the radial basis functions evaluated for each pair and
\(w_{ij(b\beta)}\) is an intermediate weight tensor for the
radial basis functions, output by the FFLayer; its output channel is
reshaped into two dimensions, where \(b\) is the index of the basis function
and \(\beta\) is the index of the output interaction.

n_nodes specifies the number of nodes in the FFLayer. Note that the last
element of n_nodes specifies the number of output channels after applying
the basis functions (\(\beta\) instead of \(b\beta\)), i.e. the output dimension of
the FFLayer is [n_pairs, n_nodes[-1]*n_basis]; this output is then contracted with
the basis over \(b\) to form the output interactions.
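The steps above can be summarized in a short sketch (illustrative only, not the library code): gather the property vectors of both atoms in each pair, concatenate them, run them through a dense layer with `n_basis * n_out` outputs (here `n_out` stands for `n_nodes[-1]`), reshape into basis and channel dimensions, and contract with the radial basis:

```python
import tensorflow as tf

n_atoms, n_pairs, n_prop, n_basis, n_out = 5, 6, 8, 4, 16
ind_2 = tf.random.uniform([n_pairs, 2], maxval=n_atoms, dtype=tf.int32)  # pair indices (i, j)
prop  = tf.random.normal([n_atoms, n_prop])                              # P_{i,alpha}
basis = tf.random.normal([n_pairs, n_basis])                             # e_{ij,b}

# Broadcast and concatenate the properties of atoms i and j for each pair.
pair_feat = tf.concat([tf.gather(prop, ind_2[:, 0]),
                       tf.gather(prop, ind_2[:, 1])], axis=-1)           # [n_pairs, 2*n_prop]
# The dense layer produces the intermediate weights w_{ij,(b,beta)} ...
w = tf.keras.layers.Dense(n_basis * n_out)(pair_feat)                    # [n_pairs, n_basis*n_out]
w = tf.reshape(w, [n_pairs, n_basis, n_out])
# ... which are contracted with the basis over b to give the interactions.
inter = tf.einsum("pbo,pb->po", w, basis)                                # [n_pairs, n_out]
```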
__init__(n_nodes=[64], **kwargs)
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`n_nodes` | list of int | number of nodes to use | `[64]` |
`**kwargs` | dict | keyword arguments passed to the feed-forward layers | `{}` |
build(shapes)
call(tensors)
PILayer takes a list of three tensors as input:

- `ind_2`: sparse indices of pairs, with shape `(n_pairs, 2)`
- `prop`: property tensor, with shape `(n_atoms, n_prop)`
- `basis`: basis tensor, with shape `(n_pairs, n_basis)`
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`tensors` | list of tensors | list of `[ind_2, prop, basis]` tensors | required |
Returns:

Name | Type | Description |
---|---|---|
`inter` | tensor | interaction tensor with shape `(n_pairs, n_nodes[-1])` |
pinet.IPLayer
Bases: Layer
The IPLayer transforms pairwise interactions to atomic properties.

The IPLayer has no learnable variables and simply sums up the pairwise interactions. The returned property thus has the same channel dimension as the input interaction, i.e.:

\[
\mathbb{P}_{i\alpha} = \sum_{j} \mathbb{I}_{ij\alpha}.
\]
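A minimal sketch of this summation (illustrative; the segment-sum call is an assumption about one way to implement the sum over neighbours, not a quote of the library code):

```python
import tensorflow as tf

n_atoms, n_pairs, n_inter = 5, 6, 16
ind_2 = tf.random.uniform([n_pairs, 2], maxval=n_atoms, dtype=tf.int32)  # pair indices (i, j)
inter = tf.random.normal([n_pairs, n_inter])                             # I_{ij,alpha}

# Sum each pair's interaction onto its central atom i to obtain P_{i,alpha}.
prop = tf.math.unsorted_segment_sum(inter, ind_2[:, 0], num_segments=n_atoms)
# prop has shape [n_atoms, n_inter]
```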
__init__()
IPLayer does not require any parameters; initialize as IPLayer().
call(tensors)
IPLayer takes a list of three tensors as input:

- `ind_2`: sparse indices of pairs, with shape `(n_pairs, 2)`
- `prop`: property tensor, with shape `(n_atoms, n_prop)`
- `inter`: interaction tensor, with shape `(n_pairs, n_inter)`
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`tensors` | list of tensors | list of `[ind_2, prop, inter]` tensors | required |
Returns:

Name | Type | Description |
---|---|---|
`prop` | tensor | new property tensor with shape `(n_atoms, n_inter)` |
pinet.ResUpdate
Bases: Layer
The ResUpdate layer implements a ResNet-like update of the properties that
addresses vanishing/exploding gradient problems (see
arXiv:1512.03385).

It takes two tensors (old and new) as input; the tensors should have the same shape except for the last dimension, and a tensor with the shape of the new tensor is always returned.

If the shapes of the two tensors match, their sum is returned. If the two tensors' shapes differ in the last dimension, the old tensor is added to the new one after a learnable linear transformation that matches its shape to the new tensor, i.e., according to the flowchart above:

\[
\mathbb{P}^{\mathrm{out}}_{i\beta} = \sum_{\alpha} \mathbb{P}^{\mathrm{old}}_{i\alpha} W_{\alpha\beta} + \mathbb{P}^{\mathrm{new}}_{i\beta},
\]

where \(W_{\alpha\beta}\) is a learnable weight matrix, used only if needed.

In the PiNet architecture above, ResUpdate is only used to update the
properties after the IPLayer; when ii_nodes[-1]==pp_nodes[-1], the
weight matrix is only necessary at \(t=0\).
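A minimal sketch of this update rule as a small Keras layer (illustrative; the bias-free Dense projection stands in for the learnable linear transformation described above):

```python
import tensorflow as tf

class SimpleResUpdate(tf.keras.layers.Layer):
    """Illustrative residual update: new + (optionally projected) old."""

    def build(self, shapes):
        old_shape, new_shape = shapes
        # A linear projection is only needed when the channel counts differ.
        self.transform = None
        if old_shape[-1] != new_shape[-1]:
            self.transform = tf.keras.layers.Dense(new_shape[-1], use_bias=False)

    def call(self, tensors):
        old, new = tensors
        if self.transform is not None:
            old = self.transform(old)
        return old + new

# usage: updated = SimpleResUpdate()([old_prop, new_prop])
```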
__init__()
ResUpdate does not require any parameters; initialize as ResUpdate().
build(shapes)
call(tensors)
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`tensors` | list of tensors | two tensors with matching shapes except for the last dimension | required |
Returns:

Name | Type | Description |
---|---|---|
`tensor` | tensor | updated tensor with the same shape as the second input tensor |
pinet.OutLayer
Bases: Layer
OutLayer updates the network output with an FFLayer, where
out_units controls the dimension of the output. In addition to the FFLayer
specified by n_nodes, the OutLayer has one additional bias-less linear
layer that scales the output to out_units dimensions.
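A minimal sketch of this idea (illustrative, not the library implementation): the property tensor passes through the FFLayer-style MLP, then through a bias-free linear layer with out_units outputs, and the result is added to the running output:

```python
import tensorflow as tf

n_nodes, out_units = [16], 1
ff = tf.keras.Sequential(
    [tf.keras.layers.Dense(n, activation="tanh") for n in n_nodes]
)
out_linear = tf.keras.layers.Dense(out_units, use_bias=False)

def out_layer(prop, prev_output):
    # prop: [n_atoms, n_prop]; prev_output: [n_atoms, out_units]
    return prev_output + out_linear(ff(prop))
```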
__init__(n_nodes, out_units, **kwargs)
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`n_nodes` | list | dimension of the hidden layers | required |
`out_units` | int | dimension of the output units | required |
`**kwargs` | dict | options to be passed to the dense layers | `{}` |
call(tensors)
OutLayer takes a list of three tensors as input:

- `ind_1`: sparse indices of atoms, with shape `(n_atoms, 2)`
- `prop`: property tensor, with shape `(n_atoms, n_prop)`
- `prev_output`: previous output, with shape `(n_atoms, out_units)`
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`tensors` | list of tensors | list of `[ind_1, prop, prev_output]` tensors | required |
Returns:

Name | Type | Description |
---|---|---|
`output` | tensor | an updated output tensor with shape `(n_atoms, out_units)` |
1. Y. Shao, M. Hellström, P. D. Mitev, L. Knijff, and C. Zhang, "PiNN: A Python library for building atomic neural networks of molecules and materials," J. Chem. Inf. Model. 60(3), 1184–1193 (2020).