{ "cells": [ { "cell_type": "markdown", "id": "2de07083", "metadata": { "vscode": { "languageId": "plaintext" } }, "source": [ "# Conceptual Overview of ase_uhal\n", "\n", "This notebook serves to provide a visual overview of the content included in the written theory section. Rather than present concepts at full MLIP scale, we consider building a surrogate model for a much simpler 1D potential:" ] }, { "cell_type": "code", "execution_count": null, "id": "730587ce", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "from scipy.optimize import minimize\n", "import scipy\n", "np.random.seed(42)" ] }, { "cell_type": "code", "execution_count": null, "id": "64138877", "metadata": {}, "outputs": [], "source": [ "roots = [-2, -1, -0.5, 2]\n", "\n", "def f(x):\n", " '''\n", " \"True\" potential\n", " '''\n", " return np.prod([x-root for root in roots], axis=0)\n", "\n", "def f_grad(x):\n", " '''\n", " Gradient of the true potential\n", " '''\n", "\n", " rt = np.array(roots)\n", " grad = 0\n", "\n", " for i in range(len(roots)):\n", " mask = np.ones(len(roots), dtype=bool)\n", " mask[i] = 0\n", " r = rt[mask]\n", " grad += np.prod([x-root for root in r])\n", " return grad\n", "\n", "\n", "x_ref = np.linspace(-3, 2.5, 100)\n", "\n", "fig, ax = plt.subplots(1, figsize=(8, 6))\n", "\n", "ax.plot(x_ref, f(x_ref))\n", "\n", "plt.ylabel(\"f(x)\")\n", "plt.xlabel(\"x\")\n", "plt.title(\"True Potential\")\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "68aea3c3", "metadata": {}, "source": [ "## Least-squares fitting of a simple model\n", "\n", "For this example, we use a set of squared-exponential basis functions:\n", "$$\n", "\\begin{equation}\n", "m(x, \\theta) = \\sum_i \\phi_i(x) \\theta_i\n", "\\end{equation}\n", "$$\n", "$$\n", "\\begin{equation}\n", "\\phi_i = \\mathrm{e}^{-\\frac{(x - c_i)^2}{2l}}\n", "\\end{equation}\n", "$$\n", "\n", "We can fit this model using least squares, by minimising the squared 
difference between model and target over the training samples $x_j$:\n", "$$\n", "\\begin{equation}\n", "\\theta_\\text{LS} = \\arg\\min_\\theta \\sum_j \\left(m(x_j, \\theta) - f(x_j)\\right)^2\n", "\\end{equation}\n", "$$" ] }, { "cell_type": "code", "execution_count": null, "id": "71cf22b9", "metadata": {}, "outputs": [], "source": [ "def phi(x, nbasis=10, l=0.5):\n", " '''\n", " Returns a vector of phi_i(x) for all i\n", " '''\n", " centers = np.linspace(-2.5, 2.5, nbasis)\n", "\n", " return np.array([np.exp(-(x-c)**2/(2*l)) for c in centers])\n", "\n", "def phi_grad(x, nbasis=10, l=0.5):\n", " '''\n", " Gradient of each of the phi functions:\n", " d/dx exp(-(x-c)^2/(2l)) = -(x-c)/l * exp(-(x-c)^2/(2l))\n", " '''\n", " centers = np.linspace(-2.5, 2.5, nbasis)\n", " return np.array([-(x-c)/l * np.exp(-(x-c)**2/(2*l)) for c in centers])\n", "\n", "def m(x, weights, nbasis=10, l=0.5):\n", " '''\n", " Main model\n", " '''\n", " return phi(x, nbasis, l).T @ weights\n", "\n", "def m_grad(x, weights, nbasis=10, l=0.5):\n", " '''\n", " Gradient of the model\n", " '''\n", " return phi_grad(x, nbasis, l).T @ weights\n", "\n", "def loss(weights, x_samples, y_samples):\n", " '''\n", " Loss function, for a given set of model weights\n", " '''\n", " return np.sum((y_samples - m(x_samples, weights))**2)\n", "\n", "# Train initial model from 3 random samples near the true minimum\n", "x_samples = np.random.random(3) * 2\n", "y_samples = f(x_samples)\n", "res = minimize(loss, np.ones(10), args=(x_samples, y_samples))\n", "\n", "w0 = res.x # Initial weights\n", "\n", "fig, ax = plt.subplots(1, figsize=(8, 6))\n", "plt.plot(x_ref, f(x_ref), label=\"True Potential\")\n", "plt.plot(x_ref, m(x_ref, w0), label=\"Model Prediction\")\n", "plt.scatter(x_samples, y_samples, label=\"Samples\", color=\"C1\", marker=\"x\")\n", "plt.legend()" ] }, { "cell_type": "markdown", "id": "fd14e88a", "metadata": {}, "source": [ "We can see that the model fits the truth decently well in the region where there is data, but it clearly struggles to extrapolate away from the data.\n", "\n", "Now 
let's try to gather some more training data. Because this toy system is very simple, we could sample directly in 1D, or run velocity Verlet dynamics on the true potential. However, in the development of MLIP models, we are typically working with much higher-dimensional problems, where the ground truth (usually DFT) is significantly more expensive to compute.\n", "\n", "Instead, we will use our surrogate model to drive Verlet MD, and take every 100th step as a new training point." ] }, { "cell_type": "code", "execution_count": null, "id": "87fe7d63", "metadata": {}, "outputs": [], "source": [ "def verlet_step(grad_func, x, v, dt):\n", " a_t = - grad_func(x) # a(t)\n", "\n", " v_half = v + 0.5 * a_t * dt # v(t+0.5dt)\n", " x_new = x + v_half * dt # x(t+dt)\n", "\n", " a_t_plus = - grad_func(x_new) # a(t+dt)\n", " v_new = v_half + 0.5 * a_t_plus * dt # v(t+dt)\n", "\n", " return x_new, v_new\n", "\n", "def verlet(grad_func, x0, v0, dt, nsteps):\n", " x = np.zeros(nsteps+1)\n", " v = np.zeros_like(x)\n", "\n", " x[0] = x0\n", " v[0] = v0\n", "\n", " for i in range(nsteps):\n", " x[i+1], v[i+1] = verlet_step(grad_func, x[i], v[i], dt)\n", "\n", " return x, v\n", "\n", "\n", "def mg(x):\n", " return m_grad(x, weights=w0)\n", "# Run Verlet MD driven by the surrogate model's gradient\n", "\n", "x0 = 1.0\n", "v0 = -4.0\n", "\n", "x, v = verlet(mg, x0, v0, 0.01, 300)\n", "\n", "# Choose some of the trajectory as new training data\n", "new_x_samples = x[::100]\n", "new_y_samples = f(new_x_samples)\n", "\n", "# Retrain the model\n", "x_train = np.concatenate([x_samples, new_x_samples])\n", "y_train = np.concatenate([y_samples, new_y_samples])\n", "\n", "res = minimize(loss, w0, args=(x_train, y_train))\n", "\n", "w = res.x # Updated weights\n", "\n", "fig, ax = plt.subplots(1, figsize=(8, 6))\n", "plt.plot(x_ref, f(x_ref), label=\"True Potential\")\n", "plt.plot(x_ref, m(x_ref, w0), label=\"Initial Model Prediction\")\n", "plt.plot(x_ref, m(x_ref, w), label=\"Updated Model 
Prediction\")\n", "plt.scatter(x_samples, y_samples, label=\"Initial Samples\", color=\"C1\", marker=\"x\")\n", "plt.scatter(new_x_samples, new_y_samples, label=\"New Samples\", color=\"C2\", marker=\"x\")\n", "plt.legend()" ] }, { "cell_type": "markdown", "id": "d818f129", "metadata": {}, "source": [ "We can see that the updated model is a closer fit to the truth than the old model in the potential well where the MD started. However, the MD has been unable to effectively explore the available space, and so our new model is unable to learn anything about the other minimum.\n", "\n", "## Bayesian Regression\n", "Instead of using a least-squares approach, we could use Bayesian regression. The main difference is that a Bayesian treatment returns a probability distribution over the weights, known as the _posterior_ distribution, rather than just a single \"best\" point.\n", "\n", "Here, we assume that any noise present on our observations follows a normal distribution with standard deviation y_err. We also place a zero-mean normal prior on the weights, whose width is controlled by the regularisation strength $\\alpha$ (in the code below, larger $\\alpha$ corresponds to a tighter prior). This means that the resulting posterior distribution is also a normal distribution:\n", "$$\n", "\\begin{equation}\n", "\\theta_\\text{Bayes} \\sim \\mathcal{N}(\\bar{\\theta}, \\Sigma)\n", "\\end{equation}\n", "$$\n", "where $\\bar{\\theta}$ is the posterior mean of the weights, $\\Sigma$ is the posterior covariance, and $\\theta_\\text{Bayes}$ is a sample from the posterior distribution. For details of the linear algebra involved in solving for this posterior distribution, see equations 4-9 of https://pubs.aip.org/aip/jcp/article/159/17/174108/2919934/Gaussian-approximation-potentials-Theory-software.\n", "\n", "Once we have found the posterior distribution, we can take the mean of the distribution to be our optimal model. 
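For our linear model, the posterior has the standard Bayesian linear regression closed form\n", "$$\n", "\\begin{equation}\n", "\\Sigma = \\left(\\frac{\\Phi^T \\Phi}{\\sigma^2} + \\alpha^2 I\\right)^{-1}, \\qquad \\bar{\\theta} = \\frac{1}{\\sigma^2} \\Sigma \\Phi^T y\n", "\\end{equation}\n", "$$\n", "where $\\Phi_{ji} = \\phi_i(x_j)$ is the design matrix, $y$ is the vector of observations, and $\\sigma$ is the observation noise (y_err in the code). Rather than forming these products explicitly, the code below works with a QR factorisation of the stacked matrix $A = [\\Phi/\\sigma;\\ \\alpha I]$, since $A^T A = R^T R$ and hence $\\Sigma = (R^T R)^{-1}$. 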
However, we can also draw samples from the distribution to form a committee of models, which are all compatible with the training data:" ] }, { "cell_type": "code", "execution_count": null, "id": "e8c0beb5", "metadata": {}, "outputs": [], "source": [ "def make_posterior_covariance(x_samples, alpha=0.1, y_err=1, nbasis=10, l=0.5):\n", " '''\n", " QR factorisation of the stacked matrix A = [Phi/y_err; alpha*I],\n", " so that the posterior covariance is (R^T R)^{-1}\n", " '''\n", " N = x_samples.shape[0]\n", "\n", " A = np.zeros((N+nbasis, nbasis))\n", " A[0:N, :] = phi(x_samples, nbasis, l).T / y_err\n", " A[N:, :] = np.eye(nbasis) * alpha\n", " Q, R = np.linalg.qr(A)\n", "\n", " return Q, R\n", "\n", "def solve_posterior(x_samples, y_samples, alpha=0.1, y_err=1, nbasis=10, l=0.5):\n", " '''\n", " Posterior mean weights, and the triangular factor R of the posterior precision\n", " '''\n", " N = x_samples.shape[0]\n", "\n", " Q, R = make_posterior_covariance(x_samples, alpha, y_err, nbasis, l)\n", "\n", " b = np.zeros(N + nbasis)\n", " b[:N] = y_samples / y_err\n", "\n", " weights = scipy.linalg.solve_triangular(R, Q.T @ b, lower=False)\n", "\n", " return weights, R\n", "\n", "w0, R = solve_posterior(x_samples, y_samples)\n", "\n", "# Sample a committee of models: theta = mean + R^{-1} z, with z ~ N(0, I)\n", "ncomm = 15\n", "z = np.random.normal(size=(10, ncomm))\n", "wcomm = scipy.linalg.solve_triangular(R, z, lower=False).T\n", "for i in range(ncomm):\n", " wcomm[i, :] += w0\n", "\n", "fig, ax = plt.subplots(2, figsize=(8, 12), sharex=True)\n", "ax[0].plot(x_ref, f(x_ref), label=\"True Potential\")\n", "ax[0].plot(x_ref, m(x_ref, w0), label=\"Bayesian Model Prediction\")\n", "ax[0].scatter(x_samples, y_samples, label=\"Samples\", color=\"C1\", marker=\"x\")\n", "\n", "for i in range(ncomm-1):\n", " ax[0].plot(x_ref, m(x_ref, wcomm[i, :]), color=\"C2\", alpha=0.2)\n", "ax[0].plot(x_ref, m(x_ref, wcomm[-1, :]), color=\"C2\", alpha=0.2, label=\"Committee Predictions\")\n", "ax[0].legend()\n", "\n", "\n", "## Plot the standard deviation of the committee\n", "def comm_std(x, comm_weights):\n", " ncomm = comm_weights.shape[0]\n", " if isinstance(x, np.ndarray):\n", " N = x.shape[0]\n", " else:\n", " N = 1\n", " comm_pred = 
np.zeros((ncomm, N))\n", "\n", " for i in range(ncomm):\n", " comm_pred[i, :] = m(x, comm_weights[i, :])\n", "\n", " return np.std(comm_pred, axis=0)\n", "\n", "std = comm_std(x_ref, wcomm)\n", "\n", "ax[1].plot(x_ref, std, color=\"C2\", label=\"Std of Committee Predictions\")\n", "\n", "ax[1].axvline(x_samples[0], color=\"C1\", linestyle=\"dashed\", label=\"Samples\")\n", "for xs in x_samples[1:]:\n", " ax[1].axvline(xs, color=\"C1\", linestyle=\"dashed\")\n", "\n", "ax[1].legend()" ] }, { "cell_type": "markdown", "id": "bdcfd0e4", "metadata": {}, "source": [ "We can see that the committee agrees rather well in the potential well close to the training data, but that the standard deviation grows rapidly as we move further from the training data.\n", "\n", "In the HyperActive Learning (HAL) approach of van der Oord et al., the standard deviation of a committee of ACE MLIP models is used to form a biasing potential, so that the biased energy $E_\\text{Bias}$ is given by\n", "$$\n", "\\begin{equation}\n", "E_\\text{Bias}(x) = E_\\text{ACE}(x, \\bar{\\theta}) - \\tau \\,\\text{std}_i \\, E_\\text{ACE}(x, \\theta_i)\n", "\\end{equation}\n", "$$\n", "where $\\tau$ is the strength of the committee biasing. The effect of this biased energy is to artificially lower the potential energy of structures where the committee strongly disagrees, which makes it more likely for an MD calculation to explore regions far from the training data. 
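To drive MD on the biased potential, we also need its gradient. Writing the committee variance as $\\frac{1}{N}\\sum_i \\left(m(x, \\theta_i) - \\bar{m}(x)\\right)^2$, where $\\bar{m}$ is the committee mean, and noting that $\\sum_i \\left(m(x, \\theta_i) - \\bar{m}(x)\\right) = 0$, the chain rule gives\n", "$$\n", "\\begin{equation}\n", "\\frac{\\mathrm{d}}{\\mathrm{d}x} \\, \\text{std}_i \\, m(x, \\theta_i) = \\frac{1}{N \\, \\text{std}_i \\, m(x, \\theta_i)} \\sum_i \\left(m(x, \\theta_i) - \\bar{m}(x)\\right) \\frac{\\mathrm{d} m(x, \\theta_i)}{\\mathrm{d}x}\n", "\\end{equation}\n", "$$\n", "In the code below, $\\bar{m}$ is approximated by the posterior-mean model prediction. 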
Applying this concept to our toy problem, we get: " ] }, { "cell_type": "code", "execution_count": null, "id": "e372cd65", "metadata": {}, "outputs": [], "source": [ "def m_biased(x, comm_weights, mean_weights, tau):\n", " std = comm_std(x, comm_weights)\n", " mean = m(x, mean_weights)\n", "\n", " return mean - tau * std\n", "\n", "tau = 1.0\n", "\n", "fig, ax = plt.subplots(1, figsize=(8, 6))\n", "plt.plot(x_ref, f(x_ref), label=\"True Potential\")\n", "plt.plot(x_ref, m(x_ref, w0), label=\"Bayesian Model Prediction\")\n", "plt.plot(x_ref, m_biased(x_ref, wcomm, w0, tau), label=\"Biased Potential\")\n", "plt.legend()\n" ] }, { "cell_type": "markdown", "id": "bdc58ffa", "metadata": {}, "source": [ "The introduction of the biasing here has lowered the potential energy barrier between minima, and has also made the other minimum much lower in energy, since it is very far from the training data points.\n", "\n", "To show how the biasing affects MD, let's run Verlet dynamics with both approaches, and plot a histogram of the samples found in both runs." ] }, { "cell_type": "code", "execution_count": null, "id": "9ac060bf", "metadata": {}, "outputs": [], "source": [ "x0 = 0.0\n", "v0 = -2.0\n", "tau = 1.0\n", "\n", "def mg_bias(x):\n", " m_bar = m(x, w0)\n", " dm_bar = m_grad(x, w0)\n", "\n", " N = wcomm.shape[0]\n", " std = comm_std(x, wcomm)\n", "\n", " # Biasing potential = std(m(x, w_i))\n", " # Derivative w.r.t. 
x is (1/(N * std)) * sum_i (m(x, w_i) - m(x, w_mean)) * m_grad(x, w_i)\n", "\n", " s = np.sum([(m(x, wcomm[i, :]) - m_bar) * m_grad(x, wcomm[i, :]) for i in range(N)])\n", "\n", " s /= (N * std)\n", "\n", " return (dm_bar - tau * s)[0]\n", "\n", "x, v = verlet(mg, x0, v0, 0.01, 1_000)\n", "x_bias, v_bias = verlet(mg_bias, x0, v0, 0.01, 1_000)\n", "\n", "fig, ax = plt.subplots(2, figsize=(8, 12), sharex=True)\n", "ax[0].plot(x_ref, f(x_ref), label=\"True Potential\")\n", "ax[0].plot(x_ref, m(x_ref, w0), label=\"Bayesian Model Prediction\")\n", "ax[0].plot(x_ref, m_biased(x_ref, wcomm, w0, tau), label=\"Biased Potential\")\n", "\n", "ax[1].hist(x, bins=30, density=True, label=\"Unbiased Dynamics\", alpha=0.4, color=\"C1\")\n", "ax[1].hist(x_bias, bins=30, density=True, label=\"Biased Dynamics\", alpha=0.4, color=\"C2\")\n", "ax[1].legend()" ] }, { "cell_type": "markdown", "id": "c62beb6e", "metadata": {}, "source": [ "The biasing has clearly allowed the MD to overcome the barrier between minima, enabling sampling over a broader range of the input space $x$, for this choice of initial position and velocity.\n", "\n", "Let's again draw samples from both MD trajectories to try to fit improved models:" ] }, { "cell_type": "code", "execution_count": null, "id": "f3cf2e8b", "metadata": {}, "outputs": [], "source": [ "new_x_samples = x[::100]\n", "new_x_samples_bias = x_bias[::100]\n", "\n", "new_y_samples = f(new_x_samples)\n", "new_y_samples_bias = f(new_x_samples_bias)\n", "\n", "x_train = np.concatenate([x_samples, new_x_samples])\n", "y_train = np.concatenate([y_samples, new_y_samples])\n", "\n", "weights_unbiased, _ = solve_posterior(x_train, y_train)\n", "\n", "x_train_bias = np.concatenate([x_samples, new_x_samples_bias])\n", "y_train_bias = np.concatenate([y_samples, new_y_samples_bias])\n", "\n", "weights_biased, R_biased = solve_posterior(x_train_bias, y_train_bias)\n", "\n", "fig, ax = plt.subplots(1, figsize=(8, 6))\n", "plt.plot(x_ref, 
f(x_ref), label=\"True Potential\")\n", "plt.plot(x_ref, m(x_ref, w0), label=\"Old Model Prediction\")\n", "plt.plot(x_ref, m(x_ref, weights_unbiased), label=\"Unbiased Improved Model Prediction\")\n", "plt.plot(x_ref, m(x_ref, weights_biased), label=\"Biased Improved Model Prediction\")\n", "plt.legend()\n" ] }, { "cell_type": "markdown", "id": "d0ab3bee", "metadata": {}, "source": [ "The biasing has clearly allowed the training data samples to cover a wider range of inputs, and thus the bias-improved model is a significantly better surrogate model for the true potential." ] }, { "cell_type": "code", "execution_count": null, "id": "888d8627", "metadata": {}, "outputs": [], "source": [ "z = np.random.normal(size=(10, ncomm))\n", "wcomm_biased = scipy.linalg.solve_triangular(R_biased, z, lower=False).T\n", "for i in range(ncomm):\n", " wcomm_biased[i, :] += weights_biased\n", "\n", "fig, ax = plt.subplots(2, figsize=(8, 12), sharex=True)\n", "\n", "ax[0].plot(x_ref, f(x_ref), label=\"True Potential\")\n", "ax[0].plot(x_ref, m(x_ref, weights_biased), label=\"Updated Model Prediction\")\n", "\n", "for i in range(ncomm-1):\n", " ax[0].plot(x_ref, m(x_ref, wcomm_biased[i, :]), color=\"C1\", alpha=0.2)\n", "ax[0].plot(x_ref, m(x_ref, wcomm_biased[-1, :]), color=\"C1\", alpha=0.2, label=\"Updated Committee Predictions\")\n", "ax[0].scatter(x_train_bias, y_train_bias, color=\"C0\", label=\"Training Data\", marker=\"x\")\n", "ax[0].legend()\n", "\n", "std = comm_std(x_ref, wcomm_biased)\n", "\n", "ax[1].plot(x_ref, std, color=\"C2\", label=\"Std of Updated Committee\")\n", "ax[1].axvline(x_train_bias[0], color=\"C1\", linestyle=\"dashed\", label=\"Training Data\")\n", "for xs in x_train_bias[1:]:\n", " ax[1].axvline(xs, color=\"C1\", linestyle=\"dashed\")\n" ] }, { "cell_type": "markdown", "id": "5f4c9711", "metadata": {}, "source": [ "We can see that the committee of this improved model is in much tighter agreement. 
We can see that the structure of the standard deviation has changed significantly, and thus a new error-biased potential would again favour exploration of data-sparse regions.\n", "\n", "### On-the-fly updates to the committee\n", "One quirk of the Bayesian posterior in our case is that the posterior covariance (which defines the variance of the committee) only depends on the $\\phi_i(x)$ values, not on the $f(x)$ observations. This means that we can update the biasing potential without needing to evaluate $f(x)$. In this simple example, $f(x)$ is trivial to compute, but this is not always true (e.g. when $f(x)$ is a DFT calculation).\n", "\n", "The limitation is that, without including new $f(x)$ observations, we cannot update the committee mean, and therefore cannot improve the quality of the underlying potential.\n", "\n", "Let's demonstrate this visually, by selecting a new data point at $x=2.3$ and updating the posterior variance while keeping the posterior mean the same:" ] }, { "cell_type": "code", "execution_count": null, "id": "e9a7d990", "metadata": {}, "outputs": [], "source": [ "R_old = R_biased.copy()\n", "\n", "new_x_point = 2.3\n", "\n", "new_x_train = np.append(x_train_bias, new_x_point)\n", "\n", "_, R_updated = make_posterior_covariance(new_x_train)\n", "\n", "ncomm = 40\n", "z = np.random.normal(size=(10, ncomm))\n", "wcomm_old = scipy.linalg.solve_triangular(R_old, z, lower=False).T\n", "wcomm_updated = scipy.linalg.solve_triangular(R_updated, z, lower=False).T\n", "for i in range(ncomm):\n", " wcomm_old[i, :] += weights_biased\n", " wcomm_updated[i, :] += weights_biased\n", "\n", "fig, ax = plt.subplots(1, figsize=(8, 6))\n", "\n", "std = comm_std(x_ref, wcomm_old)\n", "std_updated = comm_std(x_ref, wcomm_updated)\n", "\n", "plt.plot(x_ref, std, color=\"C2\", label=\"Std of Old Committee\")\n", "plt.plot(x_ref, std_updated, color=\"C3\", label=\"Std of Updated 
Committee\")\n", "plt.axvline(new_x_point, color=\"C0\", linestyle=\"dashed\", label=\"Newly added point\")\n", "plt.axvline(x_train_bias[0], color=\"C1\", linestyle=\"dashed\", label=\"Training Data\")\n", "for xs in x_train_bias[1:]:\n", " plt.axvline(xs, color=\"C1\", linestyle=\"dashed\")\n", "\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "5476fec5", "metadata": {}, "source": [ "It is clear that the newly added data point has reduced the committee standard deviation close to this point, but that the standard deviation far from the new point is largely unaffected. In this way, resampling without needing a refit allows us to update the biasing potential to remove the bias towards the points we have just sampled.\n", "\n", "We are of course free at any point to stop sampling new data points, perform the missing $f(x)$ calculations, and fully refit the model using the conventional Bayesian approach.\n", "\n", "Below is the intended usage of the ase_uhal package, where biased MD simulations are used to rapidly select a small set of new training structures, and those structures are then all evaluated with DFT in parallel.\n", "
\n", "\n", "
\n", "\n", "### Selection scores\n", "Up until this point, we have selected new data points randomly from a biased or unbiased MD trajectory. Because we are able to update the committee after each selection, if we used some metric based off the committee error to score how much we should want to select a structure, we can also update this score on the fly, and avoid resampling the same space multiple times.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "d0539ab6", "metadata": {}, "outputs": [], "source": [ "start_xtrain = x_train.copy()\n", "\n", "x_traj = x_bias.copy()\n", "\n", "min_selections = 1\n", "max_selections = 2\n", "\n", "x_select = []\n", "comm_weights = []\n", "ncomm = 40\n", "\n", "for nselect in range(max_selections+1):\n", "\n", " new_xtrain = np.append(start_xtrain, x_select)\n", "\n", " _, R = make_posterior_covariance(new_xtrain)\n", " z = np.random.normal(size=(10, ncomm))\n", " wcomm = scipy.linalg.solve_triangular(R, z, lower=False).T\n", " for i in range(ncomm):\n", " wcomm[i, :] += weights_biased\n", "\n", " stds = comm_std(x_traj, wcomm)\n", "\n", " new_x_selected = x_traj[np.argmax(stds)]\n", " x_select.append(new_x_selected)\n", " comm_weights.append(wcomm)\n", "\n", "\n", "# Precompute std values for more responsive slider\n", "stds = [comm_std(x_ref, wcomm) for wcomm in comm_weights] " ] }, { "cell_type": "code", "execution_count": null, "id": "bbc684a0", "metadata": {}, "outputs": [], "source": [ "\n", "fig, ax = plt.subplots(1, figsize=(8, 6))\n", "\n", "for i in range(max_selections+1): \n", " ax.plot(x_ref, stds[i], color=f\"C{i}\", label=f\"{i} Selections\")\n", " ax.axvline(x_select[i], color=f\"C{i}\", linestyle=\"dashed\")\n", "ax.set_yscale(\"log\")\n", "ax.legend()\n", "ax.set_xlabel(\"x\")\n", "ax.set_ylabel(\"Committee Variance\")" ] }, { "cell_type": "markdown", "id": "d5383f32", "metadata": {}, "source": [ "We can see from the plot that updating the committee after each selection lowers the local committee 
standard deviation around that selection, and thus reduces the chance of oversampling in that region." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 5 }