Parameterized Kinetic Functionals ================================= This example illustrates how we can extend the :ref:`kinetic functional template ` class, ``KineticFunctional``, to create parameterized functionals (e.g. neural network functionals), which can be fitted to data. :math:`\mu` TF + :math:`\lambda` vW Functional and Linear Response Fitting -------------------------------------------------------------------------- Let us start with a simple example to show how the ``KineticFunctional`` class can be used to define the :math:`\mu` TF + :math:`\lambda` vW functional. :: class TFvW(KineticFunctional): def __init__(self, init_args=None): super().__init__() # init_args is a tuple if init_args is None: mu, lamb = 1, 1 # defauts to TF + vW else: mu, lamb = init_args # make µ and λ trainable parameters self.mu = torch.nn.Parameter(torch.tensor([mu], dtype=torch.double, device=self.device)) self.lamb = torch.nn.Parameter(torch.tensor([lamb], dtype=torch.double, device=self.device)) self.initialize() def forward(self, box_vecs, den): return self.mu * Weizsaecker(box_vecs, den) + self.lamb * ThomasFermi(box_vecs, den) Now let us use the ``get_inv_G()`` utility function to fit the (:math:`\mu`, :math:`\lambda`) parameters to reproduce the Lindhard function. (A simpler example of how the ``get_inv_G()`` function can be used can be found :ref:`here `.) :: shape = (61, 61, 61) box_vecs = 8 * torch.eye(3, dtype=torch.double) den = torch.ones(shape, dtype=torch.double) # compute Lindhard response function eta, G_inv_lind = G_inv_lindhard(box_vecs, den) # initialize µ TF + λ vW functional TFvW_train = TFvW() # set its parameters to have requires_grad = True to make them trainable TFvW_train.param_grad(True) print('Initial (µ, λ) = ({:.5g}, {:.5g})\n'.format(TFvW_train.mu.item(), TFvW_train.lamb.item())) for i in range(20): eta, G_inv = get_inv_G(box_vecs, den, TFvW_train.forward, requires_grad=True) # computes the loss function and performs optimization to minimize it loss = TFvW_train.grid_error(G_inv_lind, G_inv) # these are inherited helper functions from the TFvW_train.update_params(loss) # KineticFunctional class to facilitate training print('Epoch = {}, Loss = {:.5g}'.format(i, loss.item())) TFvW_train.param_grad(False) print('\nOptimized (µ, λ) = ({:.5g}, {:.5g})'.format(TFvW_train.mu.item(), TFvW_train.lamb.item())) eta, G_inv_opt = get_inv_G(box_vecs, den, TFvW_train.forward) # make plot to compare against Lindhard response plt.rc('font', family='serif') plt.subplots(figsize=(5, 4)) plt.plot(eta[0, 0, :], G_inv_lind[0, 0, :], '-k') plt.plot(eta[0, 0, :], G_inv_opt[0, 0, :], '--r') plt.xlim([0, eta[0, 0, -1]]) plt.ylim([0, 1.05]) plt.xlabel(r'$\eta$', fontsize=12) plt.ylabel(r'$G^{-1}(\eta)$', fontsize=12) plt.xticks(fontsize=12) plt.yticks(fontsize=12) labels = ['Lindhard', r'Optimized $\mu$TF + $\lambda$vW'] plt.legend(labels=labels, loc="upper right", borderaxespad=0.4, ncol=1, prop={'size': 12}) plt.tight_layout() plt.show() This results in :: Initial (µ, λ) = (1, 1) Epoch = 0, Loss = 0.0014799 Epoch = 1, Loss = 0.0011208 Epoch = 2, Loss = 0.00074545 Epoch = 3, Loss = 0.00059983 Epoch = 4, Loss = 0.00059983 Epoch = 5, Loss = 0.00060346 Epoch = 6, Loss = 0.00060346 Epoch = 7, Loss = 0.00057833 Epoch = 8, Loss = 0.00056777 Epoch = 9, Loss = 0.00056521 Epoch = 10, Loss = 0.00056278 Epoch = 11, Loss = 0.0005615 Epoch = 12, Loss = 0.00056022 Epoch = 13, Loss = 0.00056301 Epoch = 14, Loss = 0.00056301 Epoch = 15, Loss = 0.0005608 Epoch = 16, Loss = 0.0005603 Epoch = 17, Loss = 0.0005603 Epoch = 18, Loss = 0.00056026 Epoch = 19, Loss = 0.00056026 Optimized (µ, λ) = (0.70015, 0.60223) and makes the plot .. image:: images/TFvW_invG.png :width: 450 Neural Network Functional and Kinetic Potential Fitting ------------------------------------------------------- Let us now consider how the ``KineticFunctional`` class can be used to define a toy neural network functional. We use a simple Laplacian-level semi-local functional of the form .. math:: T_\text{S}[n] = T_\text{vW}[n] + \int_\Omega d^3\mathbf{r}~F_\text{enh}(s,q) \tau_\text{TF}(\mathbf{r}) where :math:`T_\text{vW}` is the :ref:`von Weizsaecker functional `, :math:`\tau_\text{TF}(\mathbf{r})` is the :ref:`Thomas-Fermi ` kinetic energy density, :math:`s` is the :ref:`reduced gradient ` and :math:`q` is the :ref:`reduced Laplacian `. We shall use a neural network to approximate the Pauli enhancement factor, :math:`F_\text{enh}(s,q)`. :: class NeuralNetworkFunctional(KineticFunctional): def __init__(self, inner_layer_sizes): super().__init__() self.init_args = inner_layer_sizes layer_sizes = [2] + self.init_args + [1] self.nn = torch.nn.Sequential() for i in range(len(layer_sizes) - 1): self.nn.add_module('Linear_{}'.format(i), torch.nn.Linear(layer_sizes[i], layer_sizes[i + 1], dtype=torch.double)) if i != len(layer_sizes) - 2: self.nn.add_module('Activation_{}'.format(i), torch.nn.SiLU()) self.nn.add_module('Activation_-1', torch.nn.Softplus()) self.initialize() def forward(self, box_vecs, den): # getting descriptors kx, ky, kz, k2 = wavevecs(box_vecs, den.shape) s = reduced_gradient(kx, ky, kz, den) q = reduced_laplacian(k2, den) # compute Pauli enhancement factor Fenh = torch.squeeze(self.nn(torch.cat((s.unsqueeze(-1), q.unsqueeze(-1)), dim=-1))) # assembling the terms vol = torch.abs(torch.linalg.det(box_vecs)) TF_ked = 0.3 * (3 * np.pi * np.pi)**(2 / 3) * den.pow(5 / 3) Pauli_T = torch.mean(Fenh * TF_ked) * vol return Weizsaecker(box_vecs, den) + Pauli_T We now give an example of how the ``get_functional_derivative()`` utility function can be used to facilitate the fitting of this neural network functional's functional derivative (or kinetic potential) to some reference kinetic potential. (A simpler example of how the ``get_functional_derivative()`` function can be used can be found :ref:`here `.) As this is just a toy model to illustrate how one might use PROFESS-AD's utilities for such training, we shall fit the neural network functional's kinetic potential to that of the Wang-Teter functional's for a face-centred cubic (fcc) aluminium system, where the density was optimized using the Wang-Teter functional. Of course, this is just a toy example to explain how PROFESS-AD can be used for such training - it is uncertain if the model trained is transferable enough to be used for density optimizations due to overfitting on a single example. :: # generate an optimized density to be used terms = [IonIon, IonElectron, Hartree, WangTeter, PerdewBurkeErnzerhof] box_vecs, frac_ion_coords = get_cell('fcc', vol_per_atom=16.9, coord_type='fractional') ions = [['Al', 'al.gga.recpot', frac_ion_coords]] shape = System.ecut2shape(2000, box_vecs) system = System(box_vecs, shape, ions, terms, units='a', coord_type='fractional') system.optimize_density(ntol=1e-10) # extract optimized density and lattice vectors (in atomic units) den = system.density() box_vecs = system.lattice_vectors('b') # as a toy example, let's try to fit the neural network functional's kinetic potential # to that of the Wang-Teter functional for this optimized density # the "target" WT_kp = get_functional_derivative(box_vecs, den, WangTeter) # create the neural network functional with 3 hidden layers having 8 nodes each model = NeuralNetworkFunctional([8, 8, 8]) # explicitly defining the optimizer and its parameters model.optimizer = torch.optim.Rprop(model.parameters(), lr=0.01, step_sizes=(1e-8, 50)) model.param_grad(True) for iter in range(201): NN_kp = get_functional_derivative(box_vecs, den, model.forward, requires_grad=True) loss = model.grid_error(WT_kp, NN_kp) model.update_params(loss) if iter % 20 == 0: print('Iteration {}, RMSE = {:.5g}'.format(iter, torch.sqrt(loss).item())) model.param_grad(False) NN_kp = get_functional_derivative(box_vecs, den, model.forward) # make plot plt.rc('font', family='serif') fig, axs = plt.subplots(figsize=(5, 5), nrows=2, sharex=True, gridspec_kw={'hspace': 0}) r = np.linspace(0, 1, shape[0]) axs[0].plot(r, [den[i, i, i] for i in range(den.shape[0])], '-k') axs[1].plot(r, [WT_kp[i, i, i] for i in range(den.shape[0])], '-b') axs[1].plot(r, [NN_kp[i, i, i] for i in range(den.shape[0])], '--r') axs[0].set_xlim([0, 1]) axs[1].set_xlim([0, 1]) axs[0].set_ylabel('Electron Density (a.u.)') axs[1].set_ylabel('Kinetic Potential (a.u.)') axs[1].set_xlabel(r'[111] direction ($a_0 \sqrt{3}$)') labels = ['WT', 'NN'] plt.legend(labels=labels, loc="lower center", borderaxespad=0.4, ncol=1, prop={'size': 10}) plt.tight_layout() plt.show() This results in :: Iteration 0, RMSE = 0.1618 Iteration 20, RMSE = 0.05397 Iteration 40, RMSE = 0.029365 Iteration 60, RMSE = 0.022676 Iteration 80, RMSE = 0.021627 Iteration 100, RMSE = 0.020801 Iteration 120, RMSE = 0.020157 Iteration 140, RMSE = 0.019637 Iteration 160, RMSE = 0.01925 Iteration 180, RMSE = 0.018822 Iteration 200, RMSE = 0.018475 and makes the plot .. image:: images/nn_kp.png :width: 450 As the neural network parameters are initialized randomly, running the above code may lead to different results each time.