Merged
22 changes: 17 additions & 5 deletions .github/workflows/java-publish.yml
@@ -22,13 +22,17 @@ on:
workflow_dispatch:
inputs:
mode:
description: "dry_run: build & package only, release: build & deploy to OSSRH"
description: 'Release mode'
required: true
default: "dry_run"
type: choice
default: dry_run
options:
- dry_run
- release
ref:
description: 'The branch, tag or SHA to checkout'
required: false
type: string

jobs:
publish:
@@ -37,8 +41,14 @@ jobs:
run:
working-directory: java
steps:
- uses: actions/checkout@v4

- name: Checkout repository
uses: actions/checkout@v4
with:
# When triggered by a release, use the release tag
# When triggered manually with a ref, use the provided ref
# Otherwise (PR or manual without ref), use the default branch
ref: ${{ github.event.release.tag_name || inputs.ref || '' }}

- name: Set up Java sdk
uses: actions/setup-java@v4
with:
@@ -56,7 +66,9 @@ jobs:
git config --global user.email "dev+gha@lance.org"

- name: Dry run
if: github.event_name == 'pull_request'
if: |
github.event_name == 'pull_request' ||
inputs.mode == 'dry_run'
run: |
./mvnw --batch-mode -DskipTests package
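The `ref` expression in the checkout step relies on `||` yielding its first truthy operand. A minimal Python sketch of that fallback order follows. `resolve_checkout_ref` and its parameters are hypothetical names used only for illustration, and Python's `or` is assumed to be a close enough stand-in for the expression's `||` in this case.

```python
from typing import Optional


def resolve_checkout_ref(release_tag: Optional[str], input_ref: Optional[str]) -> str:
    """Mirror `github.event.release.tag_name || inputs.ref || ''` (illustrative only)."""
    # `or` returns the first truthy operand, so both None and '' fall through.
    return release_tag or input_ref or ""


if __name__ == "__main__":
    # Release event: the release tag wins even if a manual ref were also present.
    print(resolve_checkout_ref("v0.1.3", None))
    # Manual dispatch with a ref: the provided ref is used.
    print(resolve_checkout_ref(None, "my-branch"))
    # Neither is set: the result is an empty string.
    print(repr(resolve_checkout_ref(None, None)))
```

An empty result leaves `ref` unset, so `actions/checkout` falls back to its own default for the triggering event.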

15 changes: 12 additions & 3 deletions .github/workflows/python-publish.yml
@@ -22,13 +22,17 @@ on:
workflow_dispatch:
inputs:
mode:
description: "dry_run: build & test only, release: build & publish to PyPI"
description: 'Release mode'
required: true
default: "dry_run"
type: choice
default: dry_run
options:
- dry_run
- release
ref:
description: 'The branch, tag or SHA to checkout'
required: false
type: string

jobs:
publish:
@@ -37,8 +41,13 @@ jobs:
id-token: write # Required for PyPI trusted publishing
contents: read
steps:
- name: Checkout code
- name: Checkout repository
uses: actions/checkout@v4
with:
# When triggered by a release, use the release tag
# When triggered manually with a ref, use the provided ref
# Otherwise (PR or manual without ref), use the default branch
ref: ${{ github.event.release.tag_name || inputs.ref || '' }}

- name: Set up Python
uses: actions/setup-python@v5
22 changes: 16 additions & 6 deletions .github/workflows/release.yml
@@ -99,6 +99,7 @@ jobs:
id: versions
run: |
BASE_VERSION="${{ steps.base_version.outputs.version }}"
CURRENT_VERSION="${{ steps.current_version.outputs.version }}"
if [ "${{ inputs.release_channel }}" == "stable" ]; then
TAG="v${BASE_VERSION}"
POM_VERSION="${BASE_VERSION}"
@@ -117,13 +118,22 @@ jobs:
POM_VERSION="${BASE_VERSION}-beta.${BETA_NUM}"
fi

# Check if version actually changes (needed for commit/push decisions)
if [ "$CURRENT_VERSION" != "$POM_VERSION" ]; then
VERSION_CHANGED="true"
else
VERSION_CHANGED="false"
fi

echo "tag=$TAG" >> $GITHUB_OUTPUT
echo "pom_version=$POM_VERSION" >> $GITHUB_OUTPUT
echo "version_changed=$VERSION_CHANGED" >> $GITHUB_OUTPUT
echo "Tag will be: $TAG"
echo "POM version will be: $POM_VERSION"
echo "Version changed: $VERSION_CHANGED"

- name: Update version (when version changes)
if: inputs.release_type != 'current'
if: steps.versions.outputs.version_changed == 'true'
run: |
python ci/bump_version.py --version "${{ steps.versions.outputs.pom_version }}"
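The switch from `inputs.release_type != 'current'` to `version_changed` matters when a `current` release still alters the version string, for example when a beta is finalized to stable. The Python sketch below illustrates this under stated assumptions: `pom_version` and `version_changed` are hypothetical helpers, any channel other than `stable` is assumed to take the `-beta.N` branch, and `beta_num` is passed in because the diff does not show how `BETA_NUM` is computed.

```python
def pom_version(base_version: str, channel: str, beta_num: int = 1) -> str:
    """Stable releases use the base version; other channels append -beta.N (assumption)."""
    if channel == "stable":
        return base_version
    return f"{base_version}-beta.{beta_num}"


def version_changed(current_version: str, new_pom_version: str) -> bool:
    """Same string comparison as `[ "$CURRENT_VERSION" != "$POM_VERSION" ]`."""
    return current_version != new_pom_version


if __name__ == "__main__":
    current = "0.1.3-beta.1"
    # release_type == 'current', channel == 'stable': the suffix is dropped,
    # so a version bump commit is still needed.
    print(version_changed(current, pom_version("0.1.3", "stable")))
    # Re-releasing the same beta: nothing to bump, so the commit and push are skipped.
    print(version_changed(current, pom_version("0.1.3", "preview", beta_num=1)))
```

Gating the bump, regenerate, commit and push steps on the comparison rather than on the release type keeps them in step with whether the files actually need to change.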

@@ -133,20 +143,20 @@ jobs:
git config user.email 'dev+gha@lance.org'

- name: Regenerate modules after version update
if: inputs.release_type != 'current'
if: steps.versions.outputs.version_changed == 'true'
run: |
git diff
make clean
make gen

- name: Update Cargo lock version (when version changes)
if: inputs.release_type != 'current'
if: steps.versions.outputs.version_changed == 'true'
working-directory: rust
run: |
make build

- name: Create release commit (when version changes)
if: inputs.release_type != 'current'
if: steps.versions.outputs.version_changed == 'true'
run: |
git add -A
git commit -m "chore: release version ${{ steps.versions.outputs.pom_version }}" || echo "No changes to commit"
@@ -163,7 +173,7 @@ jobs:
# Configure git to use the token for authentication
git remote set-url origin "https://x-access-token:${GITHUB_TOKEN}@github.com/${{ github.repository }}.git"

if [ "${{ inputs.release_type }}" != "current" ]; then
if [ "${{ steps.versions.outputs.version_changed }}" == "true" ]; then
# Push the version bump commit
git push origin main
fi
@@ -188,7 +198,7 @@ jobs:
echo "- **Release Type:** ${{ inputs.release_type }}" >> $GITHUB_STEP_SUMMARY
echo "- **Release Channel:** ${{ inputs.release_channel }}" >> $GITHUB_STEP_SUMMARY
echo "- **Current Version:** ${{ steps.current_version.outputs.version }}" >> $GITHUB_STEP_SUMMARY
if [ "${{ inputs.release_type }}" != "current" ]; then
if [ "${{ steps.versions.outputs.version_changed }}" == "true" ]; then
echo "- **New Version:** ${{ steps.versions.outputs.pom_version }}" >> $GITHUB_STEP_SUMMARY
fi
echo "- **Tag:** ${{ steps.versions.outputs.tag }}" >> $GITHUB_STEP_SUMMARY
34 changes: 27 additions & 7 deletions .github/workflows/rust-publish.yml
@@ -22,13 +22,17 @@ on:
workflow_dispatch:
inputs:
mode:
description: "dry_run: build & test only, release: build & publish to crates.io"
description: 'Release mode'
required: true
default: "dry_run"
type: choice
default: dry_run
options:
- dry_run
- release
ref:
description: 'The branch, tag or SHA to checkout'
required: false
type: string

env:
# This env var is used by Swatinem/rust-cache@v2 for the cache
@@ -48,8 +52,13 @@ jobs:
runs-on: ubuntu-24.04
timeout-minutes: 60
steps:
- name: Checkout code
- name: Checkout repository
uses: actions/checkout@v4
with:
# When triggered by a release, use the release tag
# When triggered manually with a ref, use the provided ref
# Otherwise (PR or manual without ref), use the default branch
ref: ${{ github.event.release.tag_name || inputs.ref || '' }}
- name: Install dependencies
run: |
sudo apt update
@@ -63,10 +72,21 @@ jobs:
- uses: Swatinem/rust-cache@v2
with:
workspaces: rust
- uses: katyo/publish-crates@v2
- name: Dry run (build and package only)
if: |
github.event_name == 'pull_request' ||
(github.event_name == 'workflow_dispatch' && github.event.inputs.mode == 'dry_run')
working-directory: rust
run: |
cargo build --all-features
cargo package --all-features --allow-dirty

- name: Publish to crates.io
if: |
(github.event_name == 'release' && github.event.action == 'released') ||
(github.event_name == 'workflow_dispatch' && github.event.inputs.mode == 'release')
uses: katyo/publish-crates@v2
with:
# registry-token: ${{ steps.auth.outputs.token }}
registry-token: ${{ secrets.CARGO_REGISTRY_TOKEN }}
args: "--all-features"
path: rust
dry-run: ${{ github.event_name == 'pull_request' || (github.event_name == 'workflow_dispatch' && github.event.inputs.mode == 'dry_run') }}
path: rust
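The two `if:` conditions in this hunk split the former single `katyo/publish-crates` step into a dry-run step and a publish step. A small Python sketch of that gating follows. `select_step` is a hypothetical helper, and its arguments stand in for `github.event_name`, `github.event.action` and `github.event.inputs.mode`.

```python
from typing import Optional


def select_step(event_name: str,
                action: Optional[str] = None,
                mode: Optional[str] = None) -> Optional[str]:
    """Return which of the two gated steps would run, or None if neither does."""
    # Dry run: pull requests, or a manual dispatch in dry_run mode.
    if event_name == "pull_request" or (
        event_name == "workflow_dispatch" and mode == "dry_run"
    ):
        return "dry_run"
    # Publish: a 'released' release event, or a manual dispatch in release mode.
    if (event_name == "release" and action == "released") or (
        event_name == "workflow_dispatch" and mode == "release"
    ):
        return "publish"
    return None


if __name__ == "__main__":
    print(select_step("pull_request"))
    print(select_step("workflow_dispatch", mode="release"))
    print(select_step("release", action="released"))
```

No combination of inputs satisfies both conditions, so at most one of the two steps runs per workflow run.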
5 changes: 3 additions & 2 deletions ci/calculate_version.py
@@ -32,10 +32,11 @@ def calculate_next_version(current_version, release_type, channel):
elif release_type == 'patch':
new_version = f"{major}.{minor}.{patch + 1}"
elif release_type == 'current':
# Keep current version - used for:
# Keep current base version - used for:
# - Subsequent preview releases (v0.0.16-beta.2, beta.3, etc.)
# - Finalizing preview to stable (v0.0.16-beta.X -> v0.0.16)
new_version = current_version
# Strip any pre-release suffix to get base version (e.g., 0.1.3-beta.1 -> 0.1.3)
new_version = f"{major}.{minor}.{patch}"
else:
raise ValueError(f"Unknown release type: {release_type}")
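The effect of the `current` branch change can be sketched in isolation. `base_version` below is a hypothetical helper, not part of `ci/calculate_version.py`, and the parsing is an assumption: the hunk does not show how `major`, `minor` and `patch` are derived from `current_version`.

```python
def base_version(current_version: str) -> str:
    """Drop any pre-release suffix, e.g. '0.1.3-beta.1' -> '0.1.3' (illustrative parsing)."""
    # Everything after the first '-' is treated as the pre-release suffix.
    core = current_version.split("-", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return f"{major}.{minor}.{patch}"


if __name__ == "__main__":
    print(base_version("0.1.3-beta.1"))
    print(base_version("0.0.16"))
```

Returning the base version rather than `current_version` means a later `-beta.N` suffix is appended to a clean `major.minor.patch` string instead of to a version that already carries one.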

39 changes: 19 additions & 20 deletions docs/src/client/operations/index.md
@@ -108,34 +108,33 @@ These operations provide the foundational metadata management capabilities neede
without requiring data or index operation support. With the namespace able to provide basic information about the table,
the Lance SDK can be used to fulfill the other operations.

### Why Not CreateTable and DropTable?
### Why Not `CreateTable` and `DropTable`?

`CreateTable` and `DropTable` are intentionally excluded from the recommended basic operations because they involve
`CreateTable` and `DropTable` are common in most catalog systems,
but are intentionally excluded from the recommended basic operations because they involve
data operations that present challenges for catalog implementations:

**Data Operation Complexity:**
Both `CreateTable` and `DropTable` are considered data operations rather than pure metadata operations.
They can be long-running, especially when dealing with large datasets or remote storage systems.
This makes them difficult to implement reliably in catalog systems that are designed for fast metadata lookups.
- **Data Operation Complexity:**
Both `CreateTable` and `DropTable` are data operations rather than pure metadata operations.
They can be long-running, especially when dealing with large datasets or remote storage systems.
This makes them difficult to implement reliably in catalog systems designed for fast metadata lookups.

**Atomicity Guarantees:**
Data operations require careful handling of atomicity. A failed `CreateTable` or `DropTable` operation
can leave the system in an inconsistent state with partially created or deleted data files.
Catalog implementations would need to implement complex cleanup and recovery mechanisms.
- **Atomicity Guarantees:**
Data operations require careful handling of atomicity. A failed `CreateTable` or `DropTable` operation
can leave the system in an inconsistent state with partially created or deleted data files.
Catalog implementations would need to implement complex cleanup and recovery mechanisms.

**CreateTable Challenges:**
`CreateTable` is particularly difficult for catalogs to fully implement because features like
CREATE TABLE AS SELECT (CTAS) require either complicated staging mechanisms or multi-table
multi-statement transaction support. Most catalog systems are not designed to handle such complex workflows.
- **CreateTable Challenges:**
`CreateTable` is particularly difficult for catalogs to fully implement because features like
CREATE TABLE AS SELECT (CTAS) require either complicated staging mechanisms or multi-statement
transaction support.

While some catalog systems can handle these complex workflows,
doing so typically requires deep, dedicated integration.
Lance Namespace aims to enable as many catalogs as possible to adopt Lance format. By focusing on
`DeclareTable` and `DeregisterTable` instead of `CreateTable` and `DropTable`, namespace implementations only
need to handle metadata operations that are always fast and atomic across all catalog solutions.

**Recommended Approach:**
- Use **DeclareTable** to reserve a table name and location, then use the Lance SDK to write data
- Use **DeregisterTable** to unregister a table while preserving its data for potential re-registration
- Use the Lance SDK directly for data operations when full control over the data lifecycle is needed
need to handle metadata operations that are simple, fast and atomic across all catalog solutions.
`CreateTable` and `DropTable` can then be fulfilled by combining these metadata operations with the Lance SDK.
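As a rough illustration of that composition, the sketch below pairs a metadata-only namespace with caller-supplied data callbacks. Everything in it is hypothetical: `InMemoryNamespace`, `create_table` and `drop_table` are toy names, not the Lance Namespace or Lance SDK API, and the `write_data` and `delete_data` callbacks merely stand in for whatever the SDK would do.

```python
from typing import Callable, Dict


class InMemoryNamespace:
    """Hypothetical stand-in for a catalog; it stores only name -> location metadata."""

    def __init__(self) -> None:
        self._tables: Dict[str, str] = {}

    def declare_table(self, name: str, location: str) -> None:
        # Metadata-only step: reserve the name and location, write no data.
        if name in self._tables:
            raise ValueError(f"table already declared: {name}")
        self._tables[name] = location

    def deregister_table(self, name: str) -> str:
        # Metadata-only step: forget the table but leave its data untouched.
        return self._tables.pop(name)

    def location_of(self, name: str) -> str:
        return self._tables[name]


def create_table(ns: InMemoryNamespace, name: str, location: str,
                 write_data: Callable[[str], None]) -> None:
    """Compose a create: declare in the namespace, then write data via the callback."""
    ns.declare_table(name, location)
    write_data(location)


def drop_table(ns: InMemoryNamespace, name: str,
               delete_data: Callable[[str], None]) -> None:
    """Compose a drop: deregister from the namespace, then delete data via the callback."""
    location = ns.deregister_table(name)
    delete_data(location)


if __name__ == "__main__":
    storage: Dict[str, list] = {}  # toy replacement for real table storage
    ns = InMemoryNamespace()
    create_table(ns, "events", "mem://events", lambda loc: storage.update({loc: [1, 2, 3]}))
    print(ns.location_of("events"), storage)
    drop_table(ns, "events", lambda loc: storage.pop(loc))
    print(storage)
```

The namespace only ever performs quick dictionary updates here, while the potentially long-running work lives in the callbacks, which is the separation the paragraph above describes.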

## Operation Versioning
