DOC-753 | Graph ML UI #709

Merged

nerpaula merged 16 commits into main from DOC-753

Jun 19, 2025

+264 −0

Contributor

bluepal-thirumala-thotapalli commented Jun 10, 2025 •

edited by Simran-B

Description

TODO: Update screenshots due to name change Data Science (Suite) -> GenAI Suite

Upstream PRs

3.10:
3.11:
3.12:
3.13:

Thirumala added 3 commits

June 9, 2025 20:48


          create md file for graph ml ui

21ec540


          Reduce the size of images

8c818b5


          Changes in documentation

aeb5c96

bluepal-thirumala-thotapalli requested a review from Simran-B

June 10, 2025 07:11

bluepal-thirumala-thotapalli self-assigned this

Contributor

arangodb-docs-automation bot commented Jun 10, 2025

Deploy Preview Available Via
https://deploy-preview-709--docs-hugo.netlify.app

Simran-B changed the title ~~Doc 753~~ DOC-753 | Graph ML UI

Simran-B requested changes

site/content/3.13/data-science/arangographml/arangograph-ml.md Outdated

Comment on lines 2 to 3

		title: ArangoGraphML Web Interface
		menuTitle: ArangoGraphML Web Interface

Contributor

Simran-B Jun 11, 2025

Title to be discussed (we might rename it to just GraphML)

Contributor Author

bluepal-thirumala-thotapalli Jun 16, 2025

Yes, I’ve updated the title and menuTitle to "GraphML" as suggested.

site/content/3.13/data-science/arangographml/arangograph-ml.md Outdated

+              aliases:
+                - getting-started-with-arangographml
+              ---
+              Solve high-computational graph problems with Graph Machine Learning. Apply ML on a selected graph to predict connections, get better product recommendations, classify nodes, and perform node embeddings. Configure and run the whole machine learning flow entirely in the web interface.

Contributor

Simran-B Jun 11, 2025

We only have node classification and embeddings available as immediate options. If we mention something like link predictions, we should at least outline how to achieve that.

Would also be good to have a more technical explanation here about how GraphML works (GraphSAGE, using depth 2 neighborhood, as mentioned in Slack team channel).

Please also add an overview of the process instead of immediately starting with project creation etc. Users should first get an understanding of the hierarchy and steps involved.

Contributor Author

bluepal-thirumala-thotapalli Jun 16, 2025

I’ve addressed the points as suggested:

Mentioned only node classification and embeddings as the currently available options.

Added a brief technical explanation of how GraphML works, referencing GraphSAGE with depth 2 neighborhood, based on our Slack discussion and information from the official GraphSAGE site.

Included an overview section at the beginning to explain the overall process, hierarchy, and steps before diving into project creation.
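
To make the depth-2 neighborhood idea from this thread concrete, here is a minimal Python sketch of GraphSAGE-style mean aggregation applied twice. It is an illustration only and not ArangoDB's implementation: real GraphSAGE also samples neighbors and applies learned weight matrices and a non-linearity in each layer. The toy graph, feature values, and function names are all hypothetical.

```python
from statistics import fmean


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    return [fmean(column) for column in zip(*vectors)]


def aggregate_once(features, neighbors):
    """One GraphSAGE-style layer without learned weights (a simplification):
    concatenate each node's own vector with the mean of its neighbors' vectors."""
    updated = {}
    for node, own in features.items():
        adjacent = [features[n] for n in neighbors.get(node, [])]
        # Isolated nodes fall back to a zero neighborhood vector.
        pooled = mean_vector(adjacent) if adjacent else [0.0] * len(own)
        updated[node] = own + pooled
    return updated


def two_hop_representation(features, neighbors):
    """Applying the layer twice lets information from depth-2 neighbors reach each node."""
    return aggregate_once(aggregate_once(features, neighbors), neighbors)


# Hypothetical toy graph: a path a - b - c with one numeric feature per node.
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
features = {"a": [1.0], "b": [0.0], "c": [5.0]}
representation = two_hop_representation(features, neighbors)
print(representation["a"])
```

In this toy graph, node a is not directly connected to node c, yet the last component of a's representation is the mean of b's neighbors (a and c), so c's feature reaches a after two rounds.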

site/content/3.13/data-science/arangographml/arangograph-ml.md Outdated


		## Prediction Phase

		Once the best-performing model has been selected, the final step of the GraphML pipeline is to generate predictions for new or unlabeled data

Contributor

Simran-B Jun 11, 2025

As I explained, we don't have the capability to only process new/unlabeled data

Contributor Author

bluepal-thirumala-thotapalli Jun 16, 2025

Updated – Rewrote the section to remove the inaccurate reference to “new or unlabeled data” as suggested.
Replaced it with:

After selecting a model, you can create a Prediction Job. The Prediction Job generates predictions and persists them to the source graph, either in a new collection or within the source documents.
Let me know if any further adjustments are needed.


          Changes in documentation based review

9f32a3d


          Changes in documentation based review

ddce412


          Changes in documentation based review

6b23778


          Changes in documentation based review

0e21c13


          Changes in documentation based on review

9d8106c

Simran-B reviewed

site/content/images/datascience-intro.jpgZone.Identifier Outdated


          Changes in documentation based on review

5f6acb0

Simran-B requested changes

site/content/3.13/data-science/arangographml/ui.md Outdated

Comment on lines 7 to 8

		aliases:
		- getting-started-with-arangographml

Contributor

Simran-B Jun 16, 2025

This seems to be copied from the getting-started.md file, needs to be removed

Contributor Author

bluepal-thirumala-thotapalli Jun 19, 2025

I have removed the aliases section from the ui.md file, including the alias getting-started-with-arangographml

site/content/3.13/data-science/arangographml/ui.md

Comment on lines +2 to +3

		title: GraphML
		menuTitle: GraphML

Contributor

Simran-B Jun 17, 2025

Looking at the bigger picture, I think we need to make some structural changes to the parent chapter to accommodate the new content.

We have a Deploy subpage for the setup, but the other page is Getting Started and covers the usage of ArangoGraphML at the API level only. I think the web interface is a lot more suitable for getting started, but perhaps it's better to just split it into a UI and an API page? Having both on one page (with tabs) would work for the steps (Featurization, Training, ...) but the rest of the UI-related content would be on its own. The benefit I see is that we could have just a single description of the options for UI and API. Should be discussed in the team.

site/content/3.13/data-science/arangographml/ui.md Outdated

+              menuTitle: GraphML
+              weight: 15
+              description: >-
+               Enterprise-ready, graph-powered machine learning as a cloud service or self-managed

Contributor

Simran-B Jun 17, 2025

This doesn't say anything about the content being about the UI for GraphML

Contributor Author

bluepal-thirumala-thotapalli Jun 19, 2025

I have added UI-related content and removed the previous content.

site/content/3.13/data-science/arangographml/ui.md Outdated


		GraphML directly supports two primary machine learning tasks:

		* Node Classification: Automatically assign a category or label to nodes in your graph. For example, you can classify customers as "likely to churn" or "high value," or identify fraudulent transactions.

Contributor

Simran-B Jun 17, 2025

I think formally, it's always a label (a single categorical value chosen from the predicted likelihoods, namely the most likely one).

Contributor Author

bluepal-thirumala-thotapalli Jun 19, 2025

I've updated the document to use the more formal term "label" as you recommended.
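
As a small illustration of the point about labels, the following Python sketch picks the single most likely label from a set of predicted likelihoods. The label names and numbers are hypothetical, and the function is not taken from the GraphML code.

```python
def predicted_label(likelihoods):
    """Return the label with the highest predicted likelihood."""
    return max(likelihoods, key=likelihoods.get)


# Hypothetical class likelihoods for one node.
likelihoods = {"likely to churn": 0.15, "high value": 0.70, "neutral": 0.15}
label = predicted_label(likelihoods)
print(label)
```

The model may score every class, but only one categorical value is stored as the prediction, which is why "label" is the more precise term.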

site/content/3.13/data-science/arangographml/ui.md Outdated

+              GraphML directly supports two primary machine learning tasks:
+              *   **Node Classification:** Automatically assign a category or label to nodes in your graph. For example, you can classify customers as "likely to churn" or "high value," or identify fraudulent transactions.
+              *   **Node Embeddings:** Generate powerful numerical representations (vectors) for each node. These embeddings capture a node's features as well as its unique structural position within the graph.

Contributor

Simran-B Jun 17, 2025

Calling the numerical representations themselves powerful seems odd.

I think there should be a mention here that it's about node similarity. Otherwise, it could be misunderstood as something that captures features and positions in an absolute way, but it's more about proximity in a high-dimensional space where closeness stands for (assumed) semantic similarity.

Contributor Author

bluepal-thirumala-thotapalli Jun 19, 2025

You were right: focusing on similarity makes the definition much clearer and more accurate. I've rewritten the section as you recommended.
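
The similarity reading of embeddings can be sketched in a few lines of Python. This example assumes cosine similarity as the closeness measure, which is a common choice but not something this thread confirms for GraphML. The node names and vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(node, embeddings):
    """Return the other node whose embedding is closest to the given node's."""
    others = (name for name in embeddings if name != node)
    return max(others, key=lambda name: cosine_similarity(embeddings[node], embeddings[name]))


# Hypothetical three-dimensional embeddings (real ones have many more dimensions).
embeddings = {
    "customer_1": [0.9, 0.1, 0.0],
    "customer_2": [0.8, 0.2, 0.1],
    "product_7": [0.0, 0.1, 0.9],
}
print(most_similar("customer_1", embeddings))
```

The individual numbers in a vector carry no absolute meaning. Only the relative closeness between vectors is interpreted, as a stand-in for assumed semantic similarity.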

site/content/3.13/data-science/arangographml/ui.md

Comment on lines +203 to +206

+              **Featurize new documents:** Enable this option to generate features for documents that have been added since the model was trained. This is useful for getting predictions on new data without having to retrain the model.
+              **Featurize outdated documents:** Enable this option to re-generate features for documents that have been modified since the last featurization. This ensures your predictions reflect the latest changes to your data.
+              In addition to these settings, you will also define the target data, where to store results, and whether to run the job on a recurring schedule.

Contributor

Simran-B Jun 17, 2025

Should be an unordered list

site/content/3.13/data-science/arangographml/ui.md

Comment on lines +203 to +205

		Featurize new documents: Enable this option to generate features for documents that have been added since the model was trained. This is useful for getting predictions on new data without having to retrain the model.

		Featurize outdated documents: Enable this option to re-generate features for documents that have been modified since the last featurization. This ensures your predictions reflect the latest changes to your data.

Contributor

Simran-B Jun 17, 2025

Should be an unordered list

site/content/3.13/data-science/arangographml/ui.md

Comment on lines +224 to +228

+              **Featurize New Documents:**
+              This option controls whether newly added documents are automatically featurized. It is useful when new data arrives after training, allowing predictions to continue without requiring a full retraining process.
+              **Featurize Outdated Documents:**
+              Enable or disable the featurization of outdated documents. Outdated documents are those whose attributes (used during featurization) have changed since the last feature computation. This ensures prediction results are based on up-to-date information.

Contributor

Simran-B Jun 17, 2025

This is already described above

site/content/3.13/data-science/arangographml/ui.md

+              **Featurize outdated documents:** Enable this option to re-generate features for documents that have been modified since the last featurization. This ensures your predictions reflect the latest changes to your data.
+              In addition to these settings, you will also define the target data, where to store results, and whether to run the job on a recurring schedule.
+              In addition to these settings, you also define the target data, where to store results, and whether to run the job on a recurring schedule.

Contributor

Simran-B Jun 17, 2025

Should have an internal link to the scheduling details

site/content/3.13/data-science/arangographml/ui.md


		When scheduling is turned on, predictions run automatically based on a set CRON expression. This helps keep prediction results up to date as new data is added to the system.

		#### Schedule (CRON expression)

Contributor

Simran-B Jun 17, 2025

Can we do without a subheading?
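
For readers unfamiliar with CRON expressions, here is a minimal Python sketch that checks whether a point in time matches an expression. It assumes the common five-field format (minute, hour, day of month, month, day of week with 0 = Sunday). The exact dialect GraphML accepts is not stated in this thread. The matcher is deliberately simplified: it supports `*`, single values, ranges, comma lists, and steps, combines all fields with a logical AND, and does not validate value ranges.

```python
from datetime import datetime

# (minimum, maximum) per field: minute, hour, day of month, month, day of week.
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def expand_field(field, low, high):
    """Expand one cron field into the set of integers it allows."""
    allowed = set()
    for part in field.split(","):
        base, _, step = part.partition("/")
        step = int(step) if step else 1
        if base == "*":
            start, end = low, high
        elif "-" in base:
            start, end = (int(x) for x in base.split("-"))
        else:
            start = end = int(base)
        allowed.update(range(start, end + 1, step))
    return allowed


def matches(expression, moment):
    """Return True if the datetime falls on a minute selected by the expression."""
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected five space-separated fields")
    # Python counts Monday as 0; cron counts Sunday as 0.
    weekday = (moment.weekday() + 1) % 7
    values = [moment.minute, moment.hour, moment.day, moment.month, weekday]
    return all(
        value in expand_field(field, low, high)
        for field, value, (low, high) in zip(fields, values, FIELD_RANGES)
    )


# "0 0 * * *" selects midnight every day.
print(matches("0 0 * * *", datetime(2025, 6, 19, 0, 0)))
```

With this reading, an expression such as `0 0 * * *` would trigger a prediction run once per day at midnight.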


          removed aliases

ab49682

cla-bot bot commented Jun 19, 2025

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Thirumala.
This is most likely caused by a git client misconfiguration; please make sure to:

* Check if your git client is configured with an email to sign commits: git config --list | grep email
* If not, set it up using: git config --global user.email [email protected]
* Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails


          Removed old description and Added new descrption

22cb510


          Removed term category and added lable

559c41d


          changed Node Embedding definition

27c734b


          Aded graph link

c71ba70


          Apply suggestions from code review

352259e

Co-authored-by: Simran <[email protected]>


          Merge branch 'main' into DOC-753

e58b5e0

nerpaula approved these changes

Contributor

nerpaula left a comment

Further changes will be treated in a separate PR

nerpaula merged commit 737a3e6 into main

4 of 5 checks passed

nerpaula deleted the DOC-753 branch

June 19, 2025 12:07
