Update exercises and slides

timmens · timmens · commit bed23a900d3f · 2022-07-11T19:21:28.000-05:00
diff --git a/src/scipy_dev/notebooks/07_automatic_differentiation.ipynb b/src/scipy_dev/notebooks/07_automatic_differentiation.ipynb
@@ -69,8 +69,16 @@
     "\n",
     "## Task 2 (Windows): Gradient\n",
     "\n",
-    "- Compute the gradient of the criterion (the whole function) analytically\n",
-    "- Implement the analytical gradient"
+    "The analytical gradient of the function is given by:\n",
+    "\n",
+    "- $\\partial_a f(a, b, C) = 2 (a - \\pi)$\n",
+    "- $\\partial_b f(a, b, C) = 2 (b - \\begin{pmatrix}0,1,2\\end{pmatrix}^\\top)$\n",
+    "- $\\partial_C f(a, b, C) = 2 (C - I_2)$\n",
+    "\n",
+    "---\n",
+    "\n",
+    "- Implement the analytical gradient\n",
+    "    - return the gradient in the form of `{\"a\": ..., \"b\": ..., \"c\": ...}`"
    ]
   },
   {
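Review note on the Task 2 (Windows) cell: the analytical gradient stated there can be sanity-checked without JAX by comparing it against central finite differences. The sketch below is an illustration under assumptions, not part of the commit: `numerical_gradient` is a hypothetical helper, the criterion is rewritten with NumPy only, and the start values mirror those used in the solutions notebook.

```python
import numpy as np


def criterion_windows(x):
    # Same criterion as in the notebook, written with NumPy only.
    first = (x["a"] - np.pi) ** 2
    second = np.sum((x["b"] - np.arange(3)) ** 2)
    third = np.sum((x["c"] - np.eye(2)) ** 2)
    return first + second + third


def gradient(params):
    # Analytical gradient, one entry per parameter block.
    return {
        "a": 2 * (params["a"] - np.pi),
        "b": 2 * (params["b"] - np.arange(3)),
        "c": 2 * (params["c"] - np.eye(2)),
    }


def numerical_gradient(func, params, eps=1e-6):
    # Hypothetical helper: central finite differences, one element at a time.
    out = {}
    for key, value in params.items():
        shape = np.shape(value)
        arr = np.atleast_1d(np.asarray(value, dtype=float))
        grad = np.zeros_like(arr)
        for idx in np.ndindex(arr.shape):
            step = np.zeros_like(arr)
            step[idx] = eps
            plus = {**params, key: (arr + step).reshape(shape)}
            minus = {**params, key: (arr - step).reshape(shape)}
            grad[idx] = (func(plus) - func(minus)) / (2 * eps)
        out[key] = grad.reshape(shape)
    return out


start_params = {"a": 1.0, "b": np.ones(3), "c": np.ones((2, 2))}

analytical = gradient(start_params)
numerical = numerical_gradient(criterion_windows, start_params)

# Both gradients should agree for every parameter block.
for key in start_params:
    print(key, np.allclose(analytical[key], numerical[key], atol=1e-6))
```

Because the criterion is quadratic, central differences are exact up to rounding, so any mismatch would point to an error in the analytical formulas rather than to the step size.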
diff --git a/src/scipy_dev/notebooks/solutions/07_automatic_differentiation.ipynb b/src/scipy_dev/notebooks/solutions/07_automatic_differentiation.ipynb
@@ -9,8 +9,6 @@
     "\n",
     "In this exercise you will use automatic differentiation in JAX and estimagic to solve the previous problem.\n",
     "\n",
-    "> Note. Here you will only find the solution for Unix and Linux.\n",
-    "\n",
     "## Resources\n",
     "\n",
     "- https://jax.readthedocs.io/en/latest/jax.numpy.html\n",
@@ -49,7 +47,7 @@
    "outputs": [],
    "source": [
     "def criterion(x):\n",
-    "    first = (x[\"a\"] - jnp.pi) ** 4 \n",
+    "    first = (x[\"a\"] - jnp.pi) ** 2\n",
     "    second =  jnp.sum((x[\"b\"] - jnp.arange(3)) ** 2)\n",
     "    third = jnp.sum((x[\"c\"] - jnp.eye(2)) ** 2)\n",
     "    return first + second + third\n",
@@ -71,7 +69,7 @@
     {
      "data": {
       "text/plain": [
-       "DeviceArray(25.0352401, dtype=float64)"
+       "DeviceArray(8.58641909, dtype=float64)"
       ]
      },
      "execution_count": 3,
@@ -83,32 +81,63 @@
     "criterion(start_params)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "c690e3bf",
+   "metadata": {},
+   "source": [
+    "## Solution, Task 1 (Windows):"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "22bfb278",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "\n",
+    "def criterion_windows(x):\n",
+    "    first = (x[\"a\"] - np.pi) ** 2\n",
+    "    second =  np.sum((x[\"b\"] - np.arange(3)) ** 2)\n",
+    "    third = np.sum((x[\"c\"] - np.eye(2)) ** 2)\n",
+    "    return first + second + third\n",
+    "    \n",
+    "    \n",
+    "start_params_windows = {\n",
+    "    \"a\": 1.,\n",
+    "    \"b\": np.ones(3).astype(float),\n",
+    "    \"c\": np.ones((2, 2)).astype(float)\n",
+    "}"
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "9c2814c9",
    "metadata": {},
    "source": [
-    "## Task 2: Gradient\n",
+    "## Solution, Task 2: Gradient\n",
     "\n",
     "- Compute the gradient of the criterion (the whole function). Hint: look at the [`autodiff_cookbook` documentation](https://jax.readthedocs.io/en/latest/notebooks/autodiff_cookbook.html) and slides if you have any questions."
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 5,
    "id": "122f2831",
    "metadata": {},
    "outputs": [
     {
      "data": {
       "text/plain": [
-       "{'a': DeviceArray(-39.28896575, dtype=float64, weak_type=True),\n",
+       "{'a': DeviceArray(-4.28318531, dtype=float64, weak_type=True),\n",
        " 'b': DeviceArray([ 2.,  0., -2.], dtype=float64),\n",
        " 'c': DeviceArray([[0., 2.],\n",
        "              [2., 0.]], dtype=float64)}"
       ]
      },
-     "execution_count": 4,
+     "execution_count": 5,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -120,15 +149,15 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 6,
    "id": "7aefa2e9",
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "8.25 ms ± 975 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
+      "11.5 ms ± 2.05 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
      ]
     }
    ],
@@ -138,15 +167,15 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 7,
    "id": "dd8ffcc6",
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "10.7 µs ± 248 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)\n"
+      "17.2 µs ± 7.57 µs per loop (mean ± std. dev. of 7 runs, 100,000 loops each)\n"
      ]
     }
    ],
@@ -155,12 +184,46 @@
     "%timeit jitted_gradient(start_params)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "92f96dcc",
+   "metadata": {},
+   "source": [
+    "## Solution, Task 2 (Windows):\n",
+    "\n",
+    "The analytical gradient of the function is given by:\n",
+    "\n",
+    "- $\\partial_a f(a, b, C) = 2 (a - \\pi)$\n",
+    "- $\\partial_b f(a, b, C) = 2 (b - \\begin{pmatrix}0,1,2\\end{pmatrix}^\\top)$\n",
+    "- $\\partial_C f(a, b, C) = 2 (C - I_2)$\n",
+    "\n",
+    "---\n",
+    "\n",
+    "- Implement the analytical gradient\n",
+    "    - return the gradient in the form of `{\"a\": ..., \"b\": ..., \"c\": ...}`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "id": "2201091d",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def gradient(params):\n",
+    "    return {\n",
+    "        \"a\": 2 * (params[\"a\"] - np.pi),\n",
+    "        \"b\": 2 * (params[\"b\"] - np.array([0, 1, 2])),\n",
+    "        \"c\": 2 * (params[\"c\"] - np.eye(2))\n",
+    "    }"
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "e9e578b5",
    "metadata": {},
    "source": [
-    "## Task 3: Minimize\n",
+    "## Solution, Task 3: Minimize\n",
     "\n",
     "- Use estimagic to minimize the criterion\n",
     "    - pass the gradient function you computed above to the minimize call.\n",
@@ -169,20 +232,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": 9,
    "id": "f23ead7a",
    "metadata": {},
    "outputs": [
     {
      "data": {
       "text/plain": [
-       "{'a': 3.1292550669508072,\n",
-       " 'b': DeviceArray([-4.86427306e-06,  1.00000000e+00,  1.99999782e+00], dtype=float64),\n",
-       " 'c': DeviceArray([[ 1.00000000e+00, -4.86427306e-06],\n",
-       "              [-4.86427306e-06,  1.00000000e+00]], dtype=float64)}"
+       "{'a': 3.141592653589793,\n",
+       " 'b': DeviceArray([3.33066907e-16, 1.00000000e+00, 2.00000000e+00], dtype=float64),\n",
+       " 'c': DeviceArray([[1.00000000e+00, 3.33066907e-16],\n",
+       "              [3.33066907e-16, 1.00000000e+00]], dtype=float64)}"
       ]
      },
-     "execution_count": 7,
+     "execution_count": 9,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -197,6 +260,37 @@
     "\n",
     "res.params"
    ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "id": "1ef9fc6e",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "{'a': 3.141592653589793,\n",
+       " 'b': array([3.33066907e-16, 1.00000000e+00, 2.00000000e+00]),\n",
+       " 'c': array([[1.00000000e+00, 3.33066907e-16],\n",
+       "        [3.33066907e-16, 1.00000000e+00]])}"
+      ]
+     },
+     "execution_count": 10,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "res = em.minimize(\n",
+    "    criterion=criterion_windows,\n",
+    "    derivative=gradient,\n",
+    "    params=start_params_windows,\n",
+    "    algorithm=\"scipy_lbfgsb\",\n",
+    ")\n",
+    "\n",
+    "res.params"
+   ]
   }
  ],
  "metadata": {
diff --git a/src/scipy_dev/presentation/main.md b/src/scipy_dev/presentation/main.md
@@ -1770,8 +1770,8 @@ Economic problem:
 >>> import jax.numpy as jnp
 >>> from jaxopt import LBFGS
 
->>> x0 = jnp.array([1.0, 2, 3])
->>> shift = x0.copy()
+>>> x0 = jnp.array([1.0, -2, -5])
+>>> shift = jnp.array([-2.0, -4, -6])
 
 >>> def criterion(x, shift):
 ...     return jnp.vdot(x, x + shift)
@@ -1780,7 +1780,7 @@ Economic problem:
 
 >>> result = solver.run(init_params=x0, shift=shift)
 >>> result.params
-DeviceArray([-0.5, -1. , -1.5], dtype=float64)
+DeviceArray([1., 2., 3.], dtype=float64)
 ```
 
 </div>
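Review note on the slide example: `criterion(x, shift)` equals `x·x + x·shift`, a convex quadratic with gradient `2 * x + shift`, so its minimizer is `-shift / 2`. The NumPy sketch below is an illustration only (it uses `np.vdot` in place of `jnp.vdot` and runs no solver); it checks that the new `shift` leads to the minimizer shown on the slide.

```python
import numpy as np

shift = np.array([-2.0, -4.0, -6.0])


def criterion(x, shift):
    # Same expression as on the slide, with NumPy instead of jax.numpy.
    return np.vdot(x, x + shift)


# Setting the gradient 2 * x + shift to zero gives the closed-form minimizer.
x_star = -shift / 2
value_at_minimum = criterion(x_star, shift)

# Moving away from x_star in any coordinate direction increases the criterion.
perturbed_values = [
    criterion(x_star + delta, shift)
    for delta in 0.1 * np.vstack([np.eye(3), -np.eye(3)])
]

print(x_star)            # [1. 2. 3.]
print(value_at_minimum)  # -14.0
```

A start vector that differs from the solution, as in the updated `x0`, makes the example more convincing because the solver has to move to reach `[1, 2, 3]`.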
diff --git a/src/scipy_dev/source_repo/test_installation.py b/src/scipy_dev/source_repo/test_installation.py
@@ -10,7 +10,7 @@ def is_installed(executable):
 # Check Installations
 # ======================================================================================
 
-required_estimagic_version = "0.3.2"
+required_estimagic_version = "0.4.0"
 
 try:
     import estimagic  # noqa: F401