|
| 1 | +.. Copyright (c) 2023 - 2023, Oracle and/or its affiliates. All rights reserved. |
| 2 | +.. Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/. |
| 3 | +
|
| 4 | +.. References/links |
| 5 | +.. _Witness: https://github.com/testifysec/witness |
| 6 | +.. _SLSA: https://slsa.dev |
| 7 | + |
| 8 | + |
| 9 | +.. _tutorials: |
| 10 | + |
| 11 | +========= |
| 12 | +Tutorials |
| 13 | +========= |
| 14 | + |
| 15 | +On this page, you will find tutorials to get you started with Macaron. The tutorials show Macaron |
| 16 | +in action, analyzing a software component and its dependencies that are built using GitHub Actions |
| 17 | +or GitLab. Macaron supports artifacts published to GitHub Releases, `Maven Central <https://central.sonatype.com>`_, |
| 18 | +or privately hosted registries, such as `JFrog <https://jfrog.com/>`_. |
| 19 | + |
| 20 | +--------------------------------------------------------------------- |
| 21 | +Detect a malicious Java dependency uploaded manually to Maven Central |
| 22 | +--------------------------------------------------------------------- |
| 23 | + |
| 24 | +In this tutorial we show how Macaron can determine whether the dependencies of a Java project are built |
| 25 | +and published via transparent CI workflows or manually uploaded to Maven Central. When an artifact is |
| 26 | +manually uploaded, the artifact distributor can modify the artifact and potentially include malicious |
| 27 | +code without being detected. |
| 28 | + |
| 29 | +The example project we analyze in this tutorial is `example-maven-app <https://github.com/behnazh-w/example-maven-app>`_, |
| 30 | +which is hosted on GitHub. This example application uses Maven to build and manage dependencies, and has two |
| 31 | +dependencies: |
| 32 | + |
| 33 | + |
| 34 | +.. list-table:: |
| 35 | + :widths: 25 50 |
| 36 | + :header-rows: 1 |
| 37 | + |
| 38 | + * - Artifact name |
| 39 | + - `Package URL (PURL) <https://github.com/package-url/purl-spec>`_ |
| 40 | + * - `guava <https://central.sonatype.com/artifact/com.google.guava/guava>`_ |
| 41 | + - ``pkg:maven/com.google.guava/guava@32.1.2-jre?type=jar`` |
| 42 | + * - `jackson-databind <https://central.sonatype.com/artifact/io.github.behnazh-w.demo/jackson-databind>`_ |
| 43 | + - ``pkg:maven/io.github.behnazh-w.demo/jackson-databind@1.0?type=jar`` |
| 44 | + |
| 45 | +While the ``guava`` dependency follows best practices to publish artifacts automatically with minimal human |
| 46 | +intervention, ``jackson-databind`` is a malicious dependency that pretends to provide data-binding functionalities |
| 47 | +like `the official jackson-databind <https://github.com/FasterXML/jackson-databind>`_ library (note that |
| 48 | +this artifact is created for demonstration purposes and is not actually malicious). |
| 49 | + |
| 50 | +Now let's see how Macaron can help us with evaluating the supply chain security posture of |
| 51 | +``example-maven-app`` and its dependencies. |
| 52 | + |
| 53 | +************ |
| 54 | +Installation |
| 55 | +************ |
| 56 | + |
| 57 | +Please follow the instructions :ref:`here <installation-guide>`. In summary, you need: |
| 58 | + |
| 59 | +* Docker |
| 60 | +* the ``run_macaron.sh`` script to run the Macaron image. |
| 61 | + |
| 62 | +.. note:: At the moment, Docker alternatives (e.g. podman) are not supported. |
| 63 | + |
| 64 | +************* |
| 65 | +Prerequisites |
| 66 | +************* |
| 67 | + |
| 68 | +You need to provide Macaron with a GitHub token through the ``GITHUB_TOKEN`` environment variable. |
| 69 | + |
| 70 | +To obtain a GitHub Token: |
| 71 | + |
| 72 | +* Go to ``GitHub settings`` → ``Developer Settings`` (at the bottom of the left side pane) → ``Personal Access Tokens`` → ``Fine-grained personal access tokens`` → ``Generate new token``. Give your token a name and an expiry period. |
| 73 | +* Under ``"Repository access"``, choosing ``"Public Repositories (read-only)"`` is sufficient in most cases. |
| 74 | + |
| 75 | +Now you should be good to run Macaron. For more details, see the documentation :ref:`here <prepare-github-token>`. |
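As a minimal sketch of the token setup described above, the snippet below exports ``GITHUB_TOKEN`` and checks that it is non-empty before Macaron is invoked. The ``require_github_token`` helper is a hypothetical name introduced here for illustration and is not part of Macaron, and the placeholder value must be replaced with your own token.

```shell
# Hypothetical helper (not part of Macaron): fail early if GITHUB_TOKEN is unset or empty.
require_github_token() {
    if [ -z "${GITHUB_TOKEN:-}" ]; then
        echo "GITHUB_TOKEN is not set" >&2
        return 1
    fi
    echo "GITHUB_TOKEN is set"
}

# Replace the placeholder with the fine-grained token generated above.
export GITHUB_TOKEN="<your-token>"
require_github_token
```

Running the check before ``run_macaron.sh`` surfaces a missing token immediately instead of partway through an analysis.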
| 76 | + |
| 77 | + |
| 78 | +*********************** |
| 79 | +Run ``analyze`` command |
| 80 | +*********************** |
| 81 | + |
| 82 | +First, we need to run the ``analyze`` command of Macaron to run a number of :ref:`checks <checks>` and collect evidence for ``example-maven-app`` and its dependencies. |
| 83 | + |
| 84 | +.. code-block:: shell |
| 85 | +
|
| 86 | + ./run_macaron.sh analyze -rp https://github.com/behnazh-w/example-maven-app |
| 87 | +
|
| 88 | +.. note:: By default, Macaron clones the repositories and creates output files under the ``output`` directory. To understand the structure of this directory please see :ref:`Output Files Guide <output_files_guide>`. |
| 89 | + |
| 90 | +By default, this command analyzes the latest commit of the default branch of the repository. You can also analyze the repository |
| 91 | +at a specific commit by providing the branch and commit digest. See the :ref:`CLI options<analyze-action-cli>` of the ``analyze`` command for more information. |
| 92 | +After running the ``analyze`` command, we can view the data that Macaron has gathered about the ``example-maven-app`` repository in an HTML report. |
| 93 | + |
| 94 | +.. code-block:: shell |
| 95 | +
|
| 96 | + open output/reports/github_com/behnazh-w/example-maven-app/example-maven-app.html |
| 97 | +
|
| 98 | +.. _fig_example-maven-app: |
| 99 | + |
| 100 | +.. figure:: ../../_static/images/tutorial_example_maven_app_report.png |
| 101 | + :alt: HTML report for ``example-maven-app`` |
| 102 | + :align: center |
| 103 | + |
| 104 | +| |
| 105 | +
|
| 106 | +The image above shows the results of the checks for the `example-maven-app <https://github.com/behnazh-w/example-maven-app>`_ repository itself. |
| 107 | +As you can see, some of the checks are passing and some are failing. In summary, this project |
| 108 | + |
| 109 | +* is not producing any `SLSA`_ or `Witness`_ provenances (``mcn_provenance_available_1``) |
| 110 | +* is using GitHub Actions to build and test using ``mvnw`` (``mcn_build_service_1``) |
| 111 | +* but it is not deploying any artifacts automatically (``mcn_build_as_code_1``) |
| 112 | +* and no CI workflow runs are detected that automatically publish artifacts (``mcn_infer_artifact_pipeline_1``) |
| 113 | + |
| 114 | +As you scroll down in the HTML report, you will see a section for the dependencies that were automatically identified: |
| 115 | + |
| 116 | +.. _fig_example-maven-app-deps: |
| 117 | + |
| 118 | +.. figure:: ../../_static/images/tutorial_example_maven_app_report_dependencies.png |
| 119 | + :alt: HTML report for dependencies of ``example-maven-app`` |
| 120 | + :align: center |
| 121 | + |
| 122 | +| |
| 123 | +| Macaron has found the two dependencies as expected: |
| 124 | +
|
| 125 | +* ``io.github.behnazh-w.demo:jackson-databind:1.0`` |
| 126 | +* ``com.google.guava:guava:32.1.2-jre`` |
| 127 | + |
| 128 | +When we open the reports for each dependency, we see that ``mcn_infer_artifact_pipeline_1`` passes for ``com.google.guava:guava:32.1.2-jre`` |
| 129 | +and a GitHub Actions workflow run is found for publishing version ``32.1.2-jre``. However, this check is failing for ``io.github.behnazh-w.demo:jackson-databind:1.0``. |
| 130 | +This means that ``io.github.behnazh-w.demo:jackson-databind:1.0`` could have been built and published manually to Maven Central |
| 131 | +and could potentially be malicious. |
| 132 | + |
| 133 | +.. _fig_infer_artifact_pipeline_guava: |
| 134 | + |
| 135 | +.. figure:: ../../_static/images/tutorial_guava_infer_pipeline.png |
| 136 | + :alt: mcn_infer_artifact_pipeline_1 for com.google.guava:guava:32.1.2-jre |
| 137 | + :align: center |
| 138 | + |
| 139 | + ``com.google.guava:guava:32.1.2-jre`` |
| 140 | + |
| 141 | +.. _fig_infer_artifact_pipeline_bh_jackson_databind: |
| 142 | + |
| 143 | +.. figure:: ../../_static/images/tutorial_bh_jackson_databind_infer_pipeline.png |
| 144 | + :alt: mcn_infer_artifact_pipeline_1 for io.github.behnazh-w.demo:jackson-databind:1.0 |
| 145 | + :align: center |
| 146 | + |
| 147 | + ``io.github.behnazh-w.demo:jackson-databind:1.0`` |
| 148 | + |
| 149 | +| |
| 150 | +
|
| 151 | +After running the ``analyze`` command, all the check results are stored in ``output/macaron.db``. |
| 152 | +Next, we show how to use the policy engine to detect if the dependencies of ``example-maven-app`` |
| 153 | +are not published from a publicly available CI workflow run. |
| 154 | + |
| 155 | +***************************** |
| 156 | +Run ``verify-policy`` command |
| 157 | +***************************** |
| 158 | + |
| 159 | +While the ``analyze`` command shown in the previous section collects information, |
| 160 | +it does not automatically confirm whether a repository satisfies **your** security requirements. |
| 161 | +This is where the ``verify-policy`` command comes in. With Macaron, you can use `Soufflé Datalog <https://souffle-lang.github.io/index.html>`_ |
| 162 | +to express your security requirements and let Macaron automatically validate them against the collected data. |
| 163 | +Datalog is very similar to SQL and allows writing declarative queries for the |
| 164 | +results collected by the ``analyze`` command. We use such queries as policy rules as described next. |
| 165 | + |
| 166 | +The security requirement in this tutorial is that the dependencies of our project are published by |
| 167 | +transparent CI workflows. To write a policy for this requirement, first we need to |
| 168 | +revisit the checks shown in the HTML report in the previous :ref:`step <fig_example-maven-app>`. |
| 169 | +The result of each of the checks can be queried by the check ID in the first column. For the policy in this tutorial, |
| 170 | +we are interested in the ``mcn_infer_artifact_pipeline_1`` and ``mcn_provenance_level_three_1`` checks: |
| 171 | + |
| 172 | +.. code-block:: c++ |
| 173 | + |
| 174 | + #include "prelude.dl" |
| 175 | + |
| 176 | + Policy("detect-malicious-upload", component_id, "") :- |
| 177 | + is_component(component_id, _), |
| 178 | + !violating_dependencies(component_id). |
| 179 | + |
| 180 | + .decl violating_dependencies(parent: number) |
| 181 | + violating_dependencies(parent) :- |
| 182 | + transitive_dependency(parent, dependency), |
| 183 | + !check_passed(dependency, "mcn_infer_artifact_pipeline_1"), |
| 184 | + !check_passed(dependency, "mcn_provenance_level_three_1"). |
| 185 | + |
| 186 | + apply_policy_to("detect-malicious-upload", component_id) :- |
| 187 | + is_repo(_, "github.com/behnazh-w/example-maven-app", component_id). |
| 188 | + |
| 189 | + |
| 190 | +This policy requires that all the dependencies |
| 191 | +of repository ``github.com/behnazh-w/example-maven-app`` either pass the ``mcn_provenance_level_three_1`` (have non-forgeable |
| 192 | +`SLSA`_ provenances) or ``mcn_infer_artifact_pipeline_1`` check. Note that if an artifact already has a non-forgeable provenance, it means it is produced |
| 193 | +by a hosted build platform, such as GitHub Actions CI workflows. So, the ``mcn_infer_artifact_pipeline_1`` needs to pass |
| 194 | +only if ``mcn_provenance_level_three_1`` fails. |
| 195 | + |
| 196 | +Let's take a closer look at this policy to understand what each line means. |
| 197 | + |
| 198 | +.. code-block:: c++ |
| 199 | + |
| 200 | + #include "prelude.dl" |
| 201 | + |
| 202 | +This line imports the predefined Datalog relations into your Datalog specification. These relations |
| 203 | +can be thought of as select statements specifically provided by Macaron to make it easier for you |
| 204 | +to write policies. In our example policy, the following relations are pre-defined: |
| 205 | + |
| 206 | +* ``Policy(policy_id: symbol, target_id: number, message: symbol)`` |
| 207 | +* ``is_component(component_id: number, purl: symbol)`` |
| 208 | +* ``transitive_dependency(parent: number, dependency: number)`` |
| 209 | +* ``check_passed(component_id: number, check_name: symbol)`` |
| 210 | +* ``apply_policy_to(policy_id: symbol, component_id: number)`` |
| 211 | +* ``is_repo(repo_id: number, repo_complete_name: symbol, component_id: number)`` |
| 212 | + |
| 213 | +And the following relation is declared in this policy: |
| 214 | + |
| 215 | +* ``violating_dependencies(parent: number)`` |
| 216 | + |
| 217 | +Feel free to browse through the available |
| 218 | +relations `here <https://github.com/oracle/macaron/blob/main/src/macaron/policy_engine/prelude/>`_ |
| 219 | +to see how they are constructed before moving on. |
| 220 | + |
| 221 | +.. code-block:: c++ |
| 222 | + |
| 223 | + Policy("detect-malicious-upload", component_id, "") :- |
| 224 | + is_component(component_id, _), |
| 225 | + !violating_dependencies(component_id). |
| 226 | + |
| 227 | +This rule populates the ``Policy`` relation if ``component_id`` exists in the database and |
| 228 | +the ``violating_dependencies`` relation contains no entry for this component. |
| 229 | + |
| 230 | +.. code-block:: c++ |
| 231 | + |
| 232 | + .decl violating_dependencies(parent: number) |
| 233 | + violating_dependencies(parent) :- |
| 234 | + transitive_dependency(parent, dependency), |
| 235 | + !check_passed(dependency, "mcn_infer_artifact_pipeline_1"), |
| 236 | + !check_passed(dependency, "mcn_provenance_level_three_1"). |
| 237 | + |
| 238 | +This is the rule that the user needs to design to detect dependencies that violate a security requirement. |
| 239 | +Here we declare a relation called ``violating_dependencies`` and populate it with each parent component that has |
| 240 | +a dependency in the ``transitive_dependency`` relation passing neither the ``mcn_infer_artifact_pipeline_1`` nor the |
| 241 | +``mcn_provenance_level_three_1`` check. |
| 242 | + |
| 243 | +.. code-block:: c++ |
| 244 | + |
| 245 | + apply_policy_to("detect-malicious-upload", component_id) :- |
| 246 | + is_repo(_, "github.com/behnazh-w/example-maven-app", component_id). |
| 247 | + |
| 248 | +Finally, the ``apply_policy_to`` rule applies the policy ``detect-malicious-upload`` to the |
| 249 | +repository ``github.com/behnazh-w/example-maven-app``. Note that each run of Macaron analyzes a repository at a specific |
| 250 | +commit. So, the database can include more than one result for a repository and this policy will be |
| 251 | +validated on all commits available in the database. |
| 252 | + |
| 253 | +Save this policy as ``example-maven-app.dl``. To verify it, run: |
| 254 | + |
| 255 | +.. code-block:: shell |
| 256 | +
|
| 257 | + ./run_macaron.sh verify-policy --database ./output/macaron.db --file ./example-maven-app.dl |
| 258 | +
|
| 259 | +You can see the policy result both in the console and in ``output/policy_report.json``. The results |
| 260 | +printed to the console will look like the following: |
| 261 | + |
| 262 | +.. code-block:: javascript |
| 263 | +
|
| 264 | + passed_policies |
| 265 | + component_satisfies_policy |
| 266 | + failed_policies |
| 267 | + ['detect-malicious-upload'] |
| 268 | + component_violates_policy |
| 269 | + ['1', 'pkg:github.com/behnazh-w/example-maven-app@34c06e8ae3811885c57f8bd42db61f37ac57eb6c', 'detect-malicious-upload'] |
| 270 | +
|
| 271 | +As you can see, the policy has failed because the ``io.github.behnazh-w.demo:jackson-databind:1.0`` |
| 272 | +dependency is manually uploaded to Maven Central and does not meet the security requirement. |
| 273 | + |
| 274 | +You can use this policy in your GitHub Actions workflows to prevent a deployment or fail a CI test during |
| 275 | +development. Alternatively, you can treat the result as a warning and manually investigate the |
| 276 | +dependencies to make sure they are secure and can be trusted. |
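One way to wire the policy into a CI step is sketched below. It assumes that ``verify-policy`` exits with a non-zero status when a policy fails; this tutorial does not state that, so confirm it for your Macaron version before relying on it. The ``policy_gate`` wrapper is a hypothetical helper, not a Macaron command.

```shell
# Hypothetical wrapper (not a Macaron command): run the given command and
# turn its exit status into a clear pass/fail message for the CI log.
policy_gate() {
    if "$@"; then
        echo "policy check passed"
    else
        echo "policy check failed" >&2
        return 1
    fi
}

# Assumption: verify-policy exits non-zero when a policy is violated.
# In a CI step, a failing gate would then fail the job:
# policy_gate ./run_macaron.sh verify-policy --database ./output/macaron.db --file ./example-maven-app.dl
```

To treat the result as a warning instead, the caller can ignore the wrapper's return status and only log the message.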