
Commit 1c03c7f

Merge pull request #1 from oracle-samples/fde
Fde
2 parents 9bcd86d + cb61262 commit 1c03c7f

3 files changed: +149 −0 lines changed
Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
# Databricks to AIDP migration utility

Utility to export Databricks files (non-data) and notebooks to Oracle AI Data Platform (AIDP).

* Preserves folder structure
* Converts notebooks to .ipynb
* No code translation; files are moved as-is
* Optional plain string replacement from a provided mapping. Replacement is simple find/replace (no parsing)
* Not intended for data files

Run this notebook from AIDP. The user must have read permission on the Databricks path and write permission on the AIDP destination path.
## Running the Samples

Before running the notebook, replace the following placeholders with your environment-specific values.

Required parameters:

* DATABRICKS_WORKSPACE_URL: Your Databricks workspace URL
* DATABRICKS_TOKEN: Your Databricks personal access token
* DATABRICKS_PATH: Source path in the Databricks workspace to export
* AIDP_PATH: Target directory path in AIDP
* dbx_to_aidp_replacement_mappings: Optional string-replacement map applied during export. A basic use is rewriting path prefixes of referenced files/notebooks
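To illustrate how the replacement map behaves, the sketch below applies each entry as a plain `str.replace` call, the way the notebook does. The mapping keys/values and the `apply_replacements` helper name are hypothetical, chosen only for this example:

```python
# Hypothetical mapping: rewrite a Databricks path prefix to an AIDP location.
# These paths are illustrative placeholders, not real values.
dbx_to_aidp_replacement_mappings = {
    "/Workspace/Users/alice": "/aidp/project",
}

def apply_replacements(text: str, mappings: dict) -> str:
    # Plain find/replace applied in mapping order; no parsing is performed,
    # so overly broad source strings can corrupt code.
    for source, target in mappings.items():
        text = text.replace(source, target)
    return text

cell = "%run /Workspace/Users/alice/utils/helpers"
print(apply_replacements(cell, dbx_to_aidp_replacement_mappings))
# → %run /aidp/project/utils/helpers
```

Because replacement is purely textual, choose source strings specific enough not to match unintended content.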
## Documentation

### Recursive Export
Traverses nested directory structures in the Databricks workspace.

### Format Preservation
Exports notebooks as Jupyter (.ipynb) files and other files as-is.

### String Replacement
Supports source-to-target string mapping during export.

### Structure Maintenance
Recreates the original folder hierarchy in AIDP.

### Multiple File Types
Handles both notebooks and regular files.

### No Code Conversion
Performs no code conversion; file contents are copied verbatim, apart from the optional string replacement.

### Permissions
Requires read permission on Databricks and write permission on AIDP.
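The recursive export and structure maintenance described above can be sketched as a local directory mirror. This is a simplified stand-in, not the notebook's code: the real utility lists and exports objects through `databricks-sdk` and converts notebooks to .ipynb, while `mirror_tree` below (a hypothetical helper name) only recreates a folder hierarchy and copies files as-is:

```python
import os
import shutil

def mirror_tree(src: str, dst: str) -> None:
    # Recreate each directory, then copy files unchanged; the actual
    # utility additionally exports notebooks in Jupyter (.ipynb) format.
    os.makedirs(dst, exist_ok=True)
    for entry in os.scandir(src):
        target = os.path.join(dst, entry.name)
        if entry.is_dir():
            mirror_tree(entry.path, target)  # recurse into subfolders
        else:
            shutil.copyfile(entry.path, target)
```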
## Get Support

## Security

Please consult the [security guide](/SECURITY.md) for our responsible security vulnerability disclosure process.

## Contributing

This project welcomes contributions from the community. Before submitting a pull request, please [review our contribution guide](/CONTRIBUTING.md).

## License

See [LICENSE](/LICENSE.txt)
Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
{
  "metadata": {
    "kernelspec": {
      "name": "notebook",
      "display_name": "Python 3"
    },
    "language_info": {
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python"
    },
    "Last_Active_Cell_Index": 5
  },
  "nbformat_minor": 5,
  "nbformat": 4,
  "cells": [
    {
      "id": "aedb2989-04d1-4e7d-894d-aff632ce0297",
      "cell_type": "markdown",
      "source": "Oracle AI Data Platform v1.0\n\nCopyright © 2025, Oracle and/or its affiliates.\n\nLicensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/",
      "metadata": {
        "type": "markdown"
      }
    },
    {
      "id": "a7f00219-2c74-4cba-a12b-67d2eb829168",
      "cell_type": "markdown",
      "source": "### Sample Code: Exporting Databricks Files to AIDP\n\nThis example demonstrates how to export files recursively from a Databricks workspace using the `databricks-sdk` library and write them to an **AIDP** directory.\n\n**Note:**\n\n- Replace all placeholders (e.g., `<DATABRICKS_WORKSPACE_URL>`, `<DATABRICKS_TOKEN>`, `<DATABRICKS_PATH>`, `<AIDP_PATH>`, etc.) with values specific to your environment before running the notebook.\n- Optionally provide source-to-target string replacements to apply while importing to AIDP.\n- Use with caution: the notebook is designed for exporting notebooks and code-related files only.",
      "metadata": {
        "type": "markdown"
      }
    },
    {
      "id": "6df91459-a48a-4920-9d16-973c61bee150",
      "cell_type": "code",
      "source": "import os\nimport base64\nfrom databricks.sdk import WorkspaceClient\nfrom databricks.sdk.service import workspace",
      "metadata": {
        "type": "python",
        "trusted": true
      },
      "outputs": [],
      "execution_count": null
    },
    {
      "id": "4a233ed1-458c-47bd-9444-63622ea8cf6b",
      "cell_type": "code",
      "source": "# Databricks workspace URL\ndatabricks_workspace_url = \"DATABRICKS_WORKSPACE_URL\"\n# Databricks token\ndatabricks_token = \"DATABRICKS_TOKEN\"\n# Define the Databricks folder you want to export\ndatabricks_path = \"DATABRICKS_PATH\"\n# Define the local AIDP directory to write the exported content\naidp_path = \"AIDP_PATH\"",
      "metadata": {
        "type": "python",
        "trusted": true
      },
      "outputs": [],
      "execution_count": null
    },
    {
      "id": "4b8ab52e-0885-4ae2-8864-5b7563d20b79",
      "cell_type": "code",
      "source": "# Provide a comma-separated mapping to replace each source string with its target string. These are plain string replacements, so provide mappings carefully.\ndbx_to_aidp_replacement_mappings = {\n    \"SOURCE_STR_1\": \"TARGET_STR_1\",\n    \"SOURCE_STR_2\": \"TARGET_STR_2\"\n}",
      "metadata": {
        "type": "python",
        "trusted": true
      },
      "outputs": [],
      "execution_count": null
    },
},
66+
{
67+
"id": "c9ae41ae-4e57-4fd9-8a59-f360a3cb60ad",
68+
"cell_type": "code",
69+
"source": "#Recursively exports a Databricks workspace folder to a local directory, preserving the nested folder structure and exporting notebooks as .ipynb files.\n\ndef export_folder_recursively(databricks_path: str , aidp_path: str , w: WorkspaceClient):\n\n try:\n # List contents of the current workspace path\n contents = w.workspace.list(path=databricks_path)\n except Exception as e:\n print(f\"Failed to list contents of Databricks path {databricks_path}: {e}\")\n return\n\n for item in contents:\n dbx_item_path = item.path\n\n # Determine the relative path to maintain the nested structure\n dbx_relative_path = os.path.relpath(dbx_item_path , databricks_path)\n aidp_full_path = os.path.join(aidp_path , dbx_relative_path)\n\n if item.object_type == workspace.ObjectType.DIRECTORY:\n # Create the local directory and recurse into it\n os.makedirs(aidp_full_path , exist_ok=True)\n print(f\"Created local directory: {aidp_full_path}\")\n export_folder_recursively(dbx_item_path , aidp_full_path , w)\n elif item.object_type == workspace.ObjectType.FILE or item.object_type == workspace.ObjectType.NOTEBOOK:\n file_name = os.path.basename(dbx_item_path)\n if item.object_type == workspace.ObjectType.NOTEBOOK:\n local_file_path = os.path.join(os.path.dirname(aidp_full_path) , f\"{file_name}.ipynb\")\n format = workspace.ExportFormat.JUPYTER\n else:\n local_file_path = os.path.join(os.path.dirname(aidp_full_path) , file_name)\n format = workspace.ExportFormat.SOURCE\n\n try:\n # Export the file/notebook content\n print(f\"Exporting File/Notebook: {dbx_item_path} to {local_file_path}\")\n dbx_file_content = w.workspace.export(\n path=dbx_item_path ,\n format=format\n )\n\n \n binary_content = base64.b64decode(dbx_file_content.content)\n code_string = binary_content.decode('utf-8')\n \n # Iterate through the mapping and replace content\n for dbx_str, aidp_str in dbx_to_aidp_replacement_mappings.items():\n code_string = code_string.replace(dbx_str, aidp_str)\n \n 
modified_binary_content = code_string.encode('utf-8')\n\n with open(local_file_path , \"wb\") as f:\n f.write(modified_binary_content)\n\n print(f\"Downloaded File: {file_name} as {local_file_path}\")\n\n except Exception as export_error:\n print(f\"Failed to export notebook {dbx_item_path}: {export_error}\")\n\n else:\n print(f\"Skipping unsupported object type: {item.object_type} at {dbx_item_path}\")",
70+
"metadata": {
71+
"type": "python",
72+
"trusted": true
73+
},
74+
"outputs": [],
75+
"execution_count": null
76+
},
    {
      "id": "adaeed13-c355-4503-90bc-9aa8262c30cb",
      "cell_type": "code",
      "source": "# Initialize the WorkspaceClient\nw = WorkspaceClient(\n    host=databricks_workspace_url,\n    token=databricks_token,\n)\n\nprint(f\"Starting export from Databricks path '{databricks_path}' to local path '{aidp_path}'\")\n\n# Create the AIDP local directory if it does not exist.\nos.makedirs(aidp_path, exist_ok=True)\n\n# Start the recursive export\nexport_folder_recursively(databricks_path, aidp_path, w)\n\nprint(\"\\nExport process finished.\")",
      "metadata": {
        "type": "python",
        "trusted": true
      },
      "outputs": [],
      "execution_count": null
    }
  ]
}
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
databricks-sdk
