Adds new artifacts colab. #526

katjacksonWB · 2024-05-15T22:06:41Z

Adds a new Artifacts colab to replace the old, outdated one linked on the Artifacts landing page.

review-notebook-app · 2024-05-15T22:06:46Z

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

github-actions · 2024-05-15T22:07:53Z

Thanks for contributing to wandb/examples!
We appreciate your efforts in opening a PR for the examples repository. Our goal is to ensure a smooth and enjoyable experience for you 😎.

Guidelines

The examples repo is regularly tested against the ever-evolving ML stack. To facilitate our work, please adhere to the following guidelines:

Notebook naming: You can use a combination of snake_case and CamelCase for your notebook name. Avoid using spaces (replace them with _) and special characters (&%$?). For example:

Cool_Keras_integration_example_with_weights_and_biases.ipynb

is acceptable, but

Cool Keras Example with W&B.ipynb

is not. Avoid spaces and the & character. To refer to W&B, you can use: weights_and_biases or just wandb (it's our library, after all!)
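
The naming rules above could be applied automatically with a small helper. This is only an illustrative sketch: `sanitize_notebook_name` is a hypothetical function (it is not part of wandb/nb_helpers), and mapping "W&B" to "wandb" simply follows the suggestion in the guideline.

```python
import re


def sanitize_notebook_name(name: str) -> str:
    """Return a notebook file name without spaces or special characters."""
    stem, dot, ext = name.rpartition(".")
    if not dot:  # no extension present, treat the whole name as the stem
        stem, ext = name, ""
    # Refer to W&B by the library name, as the guideline suggests.
    stem = stem.replace("W&B", "wandb")
    # Runs of whitespace become a single underscore.
    stem = re.sub(r"\s+", "_", stem.strip())
    # Drop anything that is not a letter, digit, underscore, or hyphen.
    stem = re.sub(r"[^A-Za-z0-9_-]", "", stem)
    return f"{stem}.{ext}" if ext else stem
```

For example, `sanitize_notebook_name("Cool Keras Example with W&B.ipynb")` gives `Cool_Keras_Example_with_wandb.ipynb`, while a name that already follows the rules is returned unchanged.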

Managing dependencies within the notebook: You may need to set up dependencies to ensure that your code works. Please avoid the following practices:
- Docker-related activities. If Docker installation is required, consider adding a full example with the corresponding Dockerfile to the wandb/examples/examples folder (where non-Colab examples reside).
- Mixing pip install with other code in one cell. Use pip install as the primary method to install packages, but when calling pip in a cell, avoid performing other tasks in that cell. We automatically filter these cells, and executing other actions in them might break the automatic testing of the notebooks. For example,
```
pip install -qU wandb transformers gpt4
```
is acceptable, but
```
pip install -qU wandb
import wandb
```
is not.
- Installing packages from a cloned GitHub repo. Although it's acceptable 😎 to obtain the latest bleeding-edge libraries directly from GitHub, you can install them in one step like this:
```
!pip install -q git+https://github.com/huggingface/transformers
```
You don't need to clone, then cd into the repo and install it in editable mode.
- Referencing specific Colab directories. Google Colab has a /content directory where everything resides. Avoid explicitly referencing this directory because we test our notebooks with pure Jupyter (without Colab). Instead, use relative paths to make the notebook reproducible.
The Jupyter notebook file .ipynb is nothing more than a JSON file with primarily two types of cells: markdown and code. There is also a bunch of other metadata specific to Google Colab. We have a set of tools to ensure proper notebook formatting. These tools can be found at wandb/nb_helpers.
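
Because a notebook is plain JSON, the "pip cells should do nothing else" rule can be checked with the standard library alone. The sketch below is an assumption about how such a check might look, not how wandb/nb_helpers implements its filtering; `is_pip_line` and `find_mixed_pip_cells` are hypothetical names.

```python
import json


def is_pip_line(line: str) -> bool:
    """True for lines such as `!pip install x`, `%pip install x`, or `pip install x`."""
    return line.lstrip("!%").startswith("pip install")


def find_mixed_pip_cells(notebook_json: str) -> list[int]:
    """Return indices of code cells that mix `pip install` with other code."""
    notebook = json.loads(notebook_json)
    mixed = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # markdown cells are never executed
        source = cell.get("source", [])
        # The source is stored either as a list of lines or as one string.
        text = "".join(source) if isinstance(source, list) else source
        lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
        pip_lines = [ln for ln in lines if is_pip_line(ln)]
        # A cell is flagged when it has pip lines and anything else besides.
        if pip_lines and len(pip_lines) != len(lines):
            mixed.append(index)
    return mixed
```

Note that this sketch also flags a pip cell that contains a comment line, which may or may not be what the real tooling does.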

Before merging, wait for a maintainer to clean and format the notebooks you're adding. You can tag @tcapelle.

Before marking the PR as ready for review, please run your notebook one more time. Restart the Colab and run all. We will provide you with links to open the Colabs below.

The following colabs were changed:
- colabs/wandb-artifacts/Artifact_fundamentals.ipynb

noaleetz · 2024-05-16T21:34:29Z

Hey @katjacksonWB - some feedback from reviewing the whole thing:

- I think it's important to cover how to version an artifact, because otherwise the colab doesn't really show the utility of logging your stuff to an artifact. It can be a minimal example, like adding a few new images and showing that a new version is created.
- The colab should also showcase how someone can navigate to the UI for the logged artifact, and link to a public project that the user can look at to understand how the actions done in the colab are reflected in the UI. (The SDK prints out a URL, so we should show the user how to find it to get to the UI from their log_artifact command.)

noaleetz

left a comment with some requested changes!

colabs/artifact_basics/Artifact_Basics.ipynb

noaleetz

left another round of comments!

Co-authored-by: Noa <[email protected]>

tcapelle · 2024-05-27T08:07:38Z

Ping me when ready for final review/merge

colabs/artifact_basics/Artifact_Basics.ipynb

tcapelle

Why don't we show the classic:

# you can log using the one-liner:
wandb.log_artifact("file.csv", name="my_artifact", type="data")

# or build the artifact explicitly
at = wandb.Artifact(name="my_artifact", type="data")
at.add_file("file.csv")
# add_dir(...)
wandb.log_artifact(at)

Shouldn't this live in wandb-artifacts?
Please also replace the "old outdated one"

noaleetz · 2024-06-10T17:22:09Z

> Why don't we show the classic:
> # you can log using the one-liner:
> wandb.log_artifact("file.csv", name="my_artifact", type="data")
>
> # or build the artifact explicitly
> at = wandb.Artifact(name="my_artifact", type="data")
> at.add_file("file.csv")
> # add_dir(...)
> wandb.log_artifact(at)
> Shouldn't this live in wandb-artifacts?
>
> Please also replace the "old outdated one"

Is this referring to a specific line?

noaleetz

last request on lineage

noaleetz · 2024-06-10T22:17:58Z

colabs/artifact_basics/Artifact_Basics.ipynb

+      "source": [
+        "You can also manage your Artifacts via the W&B platform. This can give you insight into your model's performance or dataset versioning. To navigate to the relevant information, click this [link](https://wandb.ai/wandb/artifact-basics/overview), then click on the **Artifacts** tab.\n",
+        "\n",
+        "Navigating to the **Lineage** section in the tab will show the dependency graph formed by calling `run.use_artifact()` when an Artifact is an input to a run, and `run.log_artifact()` when an Artifact is output to a run. This helps visualize the relationship between different model versions and other objects like datasets and jobs in your project. Click [this](https://wandb.ai/wandb/artifact-basics/artifacts/dataset/my_first_artifact/v0/lineage) link to navigate to the project's lineage page."


Can you make sure we include a screenshot of a more complex lineage for a user to explore, and also link the relevant project (probably the artifact workflow project)?

Yep, I am also missing some screenshots. Either add them from the docs or upload them as files in the same folder.

rymc · 2024-06-11T07:52:05Z

colabs/artifact_basics/Artifact_Basics.ipynb

+        "                    inplace=True)\n",
+        "csvData.to_csv(\"/content/sample_data/california_housing_test.csv\") # overwrites file with the sorted data\n",
+        "# adds the new file to the artifact\n",
+        "run = wandb.init(project=\"artifact-basics\")\n",


I think it would be better to init the run at the start of the code block here.

Maybe split into 2 cells.

Use csv_data instead of csvData.

rymc · 2024-06-11T07:54:07Z

colabs/artifact_basics/Artifact_Basics.ipynb

+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "Now the sorted file will be logged in `my_first_artifact`. Any changes you log to an artifact will overwrite any older version. \n",


Hmm, the wording about overwriting old versions (lines 228 and 236) may confuse users: we don't really overwrite the old version, we create a new version instead. To me, overwriting implies replacing the previous version.
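
To make the distinction concrete, here is a toy in-memory model of "create a new version" as opposed to "overwrite". It is not W&B's implementation and uses no W&B API; `ToyArtifactStore` is a hypothetical class, and the rule that unchanged contents reuse the latest version is my understanding of how Artifacts behave, stated here as an assumption.

```python
import hashlib


class ToyArtifactStore:
    """In-memory stand-in that keeps every logged version of a named artifact."""

    def __init__(self) -> None:
        # name -> list of (digest, data), where the list index is the version number
        self._versions: dict[str, list[tuple[str, bytes]]] = {}

    def log(self, name: str, data: bytes) -> str:
        """Store data under name and return its version label, e.g. 'v1'."""
        versions = self._versions.setdefault(name, [])
        digest = hashlib.sha256(data).hexdigest()
        if versions and versions[-1][0] == digest:
            # Unchanged contents: reuse the latest version (assumed behavior).
            return f"v{len(versions) - 1}"
        # Changed contents: append a new version, never replace an old one.
        versions.append((digest, data))
        return f"v{len(versions) - 1}"

    def get(self, name: str, version: str = "latest") -> bytes:
        """Return the data for 'latest' or for a label such as 'v0'."""
        versions = self._versions[name]
        if version == "latest":
            return versions[-1][1]
        return versions[int(version.lstrip("v"))][1]
```

After logging an unsorted file and then a sorted one under the same name, `get(name, "v0")` still returns the unsorted contents, which is the behavior the notebook text should describe.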

rymc · 2024-06-11T07:55:20Z

colabs/artifact_basics/Artifact_Basics.ipynb

+        "artifact = run.use_artifact(artifact_or_name=\"my_first_artifact:latest\")\n",
+        "# This will download the specified artifact to where your code is running\n",
+        "datadir = artifact.download()\n",
+        "run.finish()\n",


I would suggest moving the run.finish() to the end of the block as it is usually better practice (e.g., in this case, it would capture what is printed out)
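
One way to guarantee that the run is finished last is the context-manager form of `wandb.init`, which finishes the run when the block exits. The sketch below assumes a logged-in W&B account and an existing `my_first_artifact`; `download_latest` is a hypothetical helper, and `wandb` is imported inside it only so the snippet can be loaded without W&B installed.

```python
def download_latest(name: str = "my_first_artifact",
                    project: str = "artifact-basics") -> str:
    """Download the latest version of an artifact and return its local directory."""
    import wandb  # imported lazily; requires `pip install wandb`

    with wandb.init(project=project) as run:
        artifact = run.use_artifact(f"{name}:latest")

        # This downloads the artifact to where the code is running.
        datadir = artifact.download()
        print(f"Downloaded {artifact.name} to {datadir}")
    # Leaving the `with` block finishes the run, after the print above.
    return datadir
```

An explicit `run.finish()` as the final statement of the cell achieves the same ordering; the `with` form just makes it hard to get wrong.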

rymc · 2024-06-11T07:56:56Z

colabs/artifact_basics/Artifact_Basics.ipynb

+      "source": [
+        "You can also manage your Artifacts via the W&B platform. This can give you insight into your model's performance or dataset versioning. To navigate to the relevant information, click this [link](https://wandb.ai/wandb/artifact-basics/overview), then click on the **Artifacts** tab.\n",
+        "\n",
+        "Navigating to the **Lineage** section in the tab will show the dependency graph formed by calling `run.use_artifact()` when an Artifact is an input to a run, and `run.log_artifact()` when an Artifact is output to a run. This helps visualize the relationship between different model versions and other objects like datasets and jobs in your project. Click [this](https://wandb.ai/wandb/artifact-basics/artifacts/dataset/my_first_artifact/v0/lineage) link to navigate to the project's lineage page."


I think the following wording here may be a little confusing to some users:

"run.log_artifact() when an Artifact is output to a run"

I think "to a run" should be "of a run"?

tcapelle · 2024-06-12T09:46:27Z

> > Why don't we show the classic:
> > # you can log using the one-liner:
> > wandb.log_artifact("file.csv", name="my_artifact", type="data")
> >
> > # or build the artifact explicitly
> > at = wandb.Artifact(name="my_artifact", type="data")
> > at.add_file("file.csv")
> > # add_dir(...)
> > wandb.log_artifact(at)
> > Shouldn't this live in wandb-artifacts?
> >
> > Please also replace the "old outdated one"
>
> Is this referring to a specific line?

This plan and the code don't match:

You are doing 2+3+4 in one line.

tcapelle · 2024-06-12T09:48:12Z

I would also use this PR to remove/replace old stuff in wandb-artifacts (and put this file in there as a getting started)

tcapelle · 2024-06-12T09:58:28Z

colabs/artifact_basics/Artifact_Basics.ipynb

+      "outputs": [],
+      "source": [
+        "!pip install wandb\n",
+        "import wandb\n",


split pip install from import wandb please

tcapelle · 2024-06-12T09:59:02Z

colabs/artifact_basics/Artifact_Basics.ipynb

+        "The general workflow for creating an Artifact is:\n",
+        "\n",
+        "\n",
+        "1.   Intialize a run.\n",


I just feel like the code is not following this plan.
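
One way to make the code follow the plan is to give each step its own statement. The diff above only shows step 1, so the sketch below assumes the remaining steps are: create an Artifact, add a file to it, and log it. `log_csv_artifact` is a hypothetical helper, calling it needs a logged-in W&B account, and `wandb` is imported inside the function only so the snippet loads without W&B installed.

```python
def log_csv_artifact(csv_path: str, project: str = "artifact-basics"):
    """Log a CSV file as a dataset artifact, one plan step per statement."""
    import wandb  # imported lazily; requires `pip install wandb`

    # 1. Initialize a run.
    run = wandb.init(project=project)
    try:
        # 2. Create an Artifact object (assumed step).
        artifact = wandb.Artifact(name="my_first_artifact", type="dataset")

        # 3. Add the file to the artifact (assumed step).
        artifact.add_file(csv_path)

        # 4. Log the artifact to the run (assumed step).
        run.log_artifact(artifact)
    finally:
        # Finish last, even if one of the steps above fails.
        run.finish()
    return artifact
```

The one-liner `run.log_artifact(csv_path, name=..., type=...)` does steps 2 to 4 at once, which is convenient but hides the steps the markdown cell describes.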

tcapelle · 2024-06-12T10:00:36Z

colabs/artifact_basics/Artifact_Basics.ipynb

+        "                    inplace=True)\n",
+        "csvData.to_csv(\"/content/sample_data/california_housing_test.csv\") # overwrites file with the sorted data\n",
+        "# adds the new file to the artifact\n",
+        "run = wandb.init(project=\"artifact-basics\")\n",


Maybe split into 2 cells.

Use csv_data instead of csvData.

tcapelle · 2024-06-12T10:01:52Z

colabs/artifact_basics/Artifact_Basics.ipynb

+      "source": [
+        "run = wandb.init(project=\"artifact-basics\")\n",
+        "artifact = run.use_artifact(artifact_or_name=\"my_first_artifact:latest\")\n",
+        "# This will download the specified artifact to where your code is running\n",


Add a blank line before the comment for readability.

tcapelle · 2024-06-12T10:03:06Z

colabs/artifact_basics/Artifact_Basics.ipynb

+      "source": [
+        "You can also manage your Artifacts via the W&B platform. This can give you insight into your model's performance or dataset versioning. To navigate to the relevant information, click this [link](https://wandb.ai/wandb/artifact-basics/overview), then click on the **Artifacts** tab.\n",
+        "\n",
+        "Navigating to the **Lineage** section in the tab will show the dependency graph formed by calling `run.use_artifact()` when an Artifact is an input to a run, and `run.log_artifact()` when an Artifact is output to a run. This helps visualize the relationship between different model versions and other objects like datasets and jobs in your project. Click [this](https://wandb.ai/wandb/artifact-basics/artifacts/dataset/my_first_artifact/v0/lineage) link to navigate to the project's lineage page."


Yep, I am also missing some screenshots. Either add them from the docs or upload them as files in the same folder.

noaleetz · 2024-07-07T00:20:26Z

Hey @rymc - would you be up to revising the changes you proposed directly? Katherine is out on medical leave so I am trying to get some support with wrapping up her in-flight docs PR so we can get the new and improved Artifacts colab out. If it is easy enough to make those fixes directly that would be a huge help so I can focus on some of the other docs work.

rymc · 2024-07-08T09:52:27Z

Hey @noaleetz done. Addressed comments and confirmed working on Colab.

noaleetz · 2024-07-08T20:34:36Z

> Hey @noaleetz done. Addressed comments and confirmed working on Colab.

Ryan you are awesome, thank you so so much. I will give the colab a final run myself, but we should be good to merge. @ngrayluna can I ask you to give your review and stamp as well?

ngrayluna

Blocker: Notebook needs to be executable

ngrayluna · 2024-07-09T22:08:48Z

colabs/wandb-artifacts/Artifact_Basics.ipynb

+      "outputs": [],
+      "source": [
+        "run = wandb.init(project=\"artifact-basics\")\n",
+        "run.log_artifact(artifact_or_path=\"/content/sample_data/mnist_test.csv\", name=\"my_first_artifact\", type=\"dataset\")\n",


Notebooks need to be executable...we'll want to use a real dataset before merging this in.

rymc · 2024-07-10T08:07:42Z

Ah, good point @ngrayluna. I've pushed a new version that makes the Colab notebook executable regardless of where it runs.
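
For readers wondering what "executable regardless of where it runs" can look like, one option is to have the notebook create its own small data file under a relative path instead of reading from Colab's /content/sample_data. This is a sketch of that idea only, not necessarily what the pushed commit does; `write_sample_csv`, the file name, and the rows are all made up for illustration.

```python
import csv
from pathlib import Path


def write_sample_csv(directory: str = "sample_data") -> Path:
    """Create a tiny CSV under a relative directory and return its path."""
    data_dir = Path(directory)  # resolved against the current working directory
    data_dir.mkdir(parents=True, exist_ok=True)
    csv_path = data_dir / "sample.csv"
    # Made-up rows; any small dataset would do.
    rows = [("id", "value"), (1, 0.5), (2, 1.5), (3, 2.5)]
    with csv_path.open("w", newline="") as handle:
        csv.writer(handle).writerows(rows)
    return csv_path
```

The returned path can then be handed to the artifact-logging cell, so the same notebook runs in Colab, in plain Jupyter, and in automated tests.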

ngrayluna · 2024-07-22T15:42:49Z

PR for small nits: #548

colabs/wandb-artifacts/Artifact_Basics.ipynb

tcapelle · 2024-07-24T21:46:09Z

can you make both wandbcode consistent?

ngrayluna · 2024-07-24T21:51:12Z

> can you make both wandbcode consistent?

Not sure I follow?

Adds new artifacts colab.

bd188ed

katjacksonWB requested review from moredatarequired, ngrayluna and noaleetz May 15, 2024 22:07

Finishes all the runs in the notebook.

e2f825c

noaleetz requested changes May 16, 2024

katjacksonWB added 2 commits May 20, 2024 17:48

Adds new section.

4d9573f

Adds a new section.

bbb30b2

noaleetz reviewed May 21, 2024 (eight review comments on colabs/artifact_basics/Artifact_Basics.ipynb, all outdated and resolved)

noaleetz requested changes May 21, 2024

katjacksonWB and others added 2 commits May 23, 2024 12:15

Update colabs/artifact_basics/Artifact_Basics.ipynb

01edf7c

Co-authored-by: Noa <[email protected]>

Added more specific text.

663e980

katjacksonWB requested a review from noaleetz May 23, 2024 19:59

katjacksonWB and others added 3 commits May 23, 2024 15:00

Update colabs/artifact_basics/Artifact_Basics.ipynb

46cc6e9

Co-authored-by: Noa <[email protected]>

New header.

eefe863

Lineage changes.

65639ab

noaleetz reviewed May 28, 2024

colabs/artifact_basics/Artifact_Basics.ipynb (outdated, resolved)

noaleetz requested review from rymc and removed request for moredatarequired May 28, 2024 19:10

tcapelle reviewed Jun 3, 2024

noaleetz requested changes Jun 10, 2024

rymc reviewed Jun 11, 2024

tcapelle requested changes Jun 12, 2024

addressing outstanding issues with the new basic Artifacts colab

66dc524

ngrayluna requested changes Jul 9, 2024

making the artifact-basics.ipynb colab environment agnostic

6b8bf2b

removing output from the artifact-basics.ipynb colab

7cee2f6

noaleetz approved these changes Jul 14, 2024

ngrayluna reviewed Jul 22, 2024

colabs/wandb-artifacts/Artifact_Basics.ipynb (outdated, resolved)

ngrayluna added 3 commits July 24, 2024 11:19

Added Pandas import, removed extra img tags (#548)

84750d8

Fixed typo

01abaac

Renamed notebook

60ffbc7

ngrayluna approved these changes Jul 24, 2024

clean up

0398013

tcapelle approved these changes Jul 25, 2024

ngrayluna merged commit f46ecee into master Jul 25, 2024
2 checks passed
