Add OpenSearch Serverless (AOSS) support #802

Open
norrishuang wants to merge 4 commits into zilliztech:main from norrishuang:aoss-serverless

@norrishuang

Contributor

Summary

Add support for Amazon OpenSearch Serverless (AOSS) to the awsopensearch client.

Changes

  • config.py: Add is_serverless and aws_region fields; use AWS SigV4 authentication via requests-aws4auth for AOSS connections
  • cli.py: Add --serverless flag and --aws-region option
  • aws_opensearch.py: Adapt client for AOSS constraints:
    • Skip unsupported operations (cluster settings, force merge, manual refresh, replica updates, warmup API)
    • Use smaller batch size (100) for bulk inserts
    • Store id as a document field (AOSS doesn't support custom _id)
    • Retrieve id from _source in search results
    • Remove engine and encoder from method config (AOSS manages these internally)
    • Disable http_compress to avoid SigV4 checksum verification failures
  • README.md: Add OpenSearch Serverless section with usage example and notes
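
The two id-related changes above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the helper names `build_bulk_body` and `extract_ids`, and the field names `id` and `embedding`, are assumptions made for the example.

```python
from __future__ import annotations

from typing import Any


def build_bulk_body(index: str, ids: list[int], vectors: list[list[float]],
                    is_serverless: bool, id_field: str = "id",
                    vector_field: str = "embedding") -> list[dict[str, Any]]:
    """Build the alternating action/document list used by the bulk API."""
    body: list[dict[str, Any]] = []
    for doc_id, vec in zip(ids, vectors):
        if is_serverless:
            # No custom _id on AOSS: omit it from the action line and
            # carry the benchmark id as an ordinary document field.
            body.append({"index": {"_index": index}})
            body.append({id_field: doc_id, vector_field: vec})
        else:
            body.append({"index": {"_index": index, "_id": doc_id}})
            body.append({vector_field: vec})
    return body


def extract_ids(response: dict[str, Any], is_serverless: bool,
                id_field: str = "id") -> list[int]:
    """Read result ids from _source on AOSS, from _id otherwise."""
    hits = response["hits"]["hits"]
    if is_serverless:
        return [int(hit["_source"][id_field]) for hit in hits]
    return [int(hit["_id"]) for hit in hits]
```

With this shape, the search request on the serverless path has to return the id field in `_source`, since the service-generated `_id` no longer matches the dataset id.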

Usage

vectordbbench awsopensearch --db-label aoss \
  --serverless --aws-region us-east-1 \
  --host <collection-id>.aoss.us-east-1.on.aws --port 443 \
  --case-type Performance768D1M \
  --m 16 --ef-construction 200 --ef-search 40 \
  --number-of-shards 8 --number-of-replicas 0 \
  --engine faiss --metric-type cosine \
  --num-concurrency 80,100,120
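
For readers who want to see what `--serverless --aws-region` implies on the client side, here is a minimal sketch of a SigV4-signed connection. It is not the PR's implementation: the function names are hypothetical, and resolving credentials through `boto3` is an assumption (the PR only states that AWS credentials must be configured and that `requests-aws4auth` is used).

```python
from __future__ import annotations

from typing import Any


def serverless_client_kwargs(host: str, port: int = 443) -> dict[str, Any]:
    """Connection settings for an AOSS endpoint (hypothetical helper)."""
    return {
        "hosts": [{"host": host, "port": port}],
        "use_ssl": True,
        "verify_certs": True,
        # The PR disables compression to avoid SigV4 checksum failures.
        "http_compress": False,
    }


def make_serverless_client(host: str, aws_region: str, port: int = 443):
    """Create an opensearch-py client that signs requests for service 'aoss'."""
    # Imported lazily so the kwargs helper works without these packages.
    import boto3  # assumption: credentials come from the default boto3 chain
    from opensearchpy import OpenSearch, RequestsHttpConnection
    from requests_aws4auth import AWS4Auth

    creds = boto3.Session().get_credentials()
    auth = AWS4Auth(
        creds.access_key,
        creds.secret_key,
        aws_region,
        "aoss",  # SigV4 service name for OpenSearch Serverless
        session_token=creds.token,
    )
    return OpenSearch(
        http_auth=auth,
        connection_class=RequestsHttpConnection,  # needed for requests-style auth
        **serverless_client_kwargs(host, port),
    )
```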

Prerequisites

  • AWS credentials configured
  • requests-aws4auth installed
  • IAM identity policy with aoss:APIAccessAll
  • AOSS Data Access Policy granting index/collection permissions
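
As a rough guide to the last prerequisite, the sketch below builds a data access policy document granting index and collection permissions to one principal. The helper name, the collection name, the principal ARN, and the exact set of permissions the benchmark needs are all assumptions; adjust them to your environment.

```python
import json


def build_data_access_policy(collection: str, principal_arn: str) -> str:
    """Return an AOSS data access policy document as a JSON string."""
    policy = [
        {
            "Description": "Benchmark access (example only)",
            "Rules": [
                {
                    "ResourceType": "collection",
                    "Resource": [f"collection/{collection}"],
                    "Permission": ["aoss:DescribeCollectionItems"],
                },
                {
                    "ResourceType": "index",
                    "Resource": [f"index/{collection}/*"],
                    # Assumed to cover index creation, loading, and search.
                    "Permission": [
                        "aoss:CreateIndex",
                        "aoss:DeleteIndex",
                        "aoss:UpdateIndex",
                        "aoss:DescribeIndex",
                        "aoss:ReadDocument",
                        "aoss:WriteDocument",
                    ],
                },
            ],
            "Principal": [principal_arn],
        }
    ]
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    # Placeholder values; replace with your collection and IAM principal.
    print(build_data_access_policy(
        "my-collection", "arn:aws:iam::123456789012:role/benchmark-role"))
```

The IAM identity policy with `aoss:APIAccessAll` is separate from this document and is still required.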

Testing

Tested against a live AOSS collection with 1M 768-dimensional vectors (Cohere dataset). Data loading and index creation were verified successfully.

- Add --serverless and --aws-region CLI options
- Use AWS SigV4 authentication via requests-aws4auth for AOSS
- Skip unsupported operations for serverless: cluster settings, force merge,
  manual refresh, replica updates, warmup API
- Use smaller batch size (100) for serverless bulk inserts
- Store id as document field (serverless doesn't support custom _id)
- Retrieve id from _source in search results for serverless
- Remove 'engine' and 'encoder' from index method config for serverless
  (AOSS manages these internally)
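
Two of the serverless adjustments listed above, the smaller bulk batches and the trimmed index method config, can be sketched as below. The helper names are hypothetical, and the position of `encoder` in the config (top level or under `parameters`) is an assumption, so the sketch removes it from both places.

```python
from __future__ import annotations

import copy
from typing import Any, Iterator, Sequence

SERVERLESS_BATCH_SIZE = 100  # batch size stated in this PR for AOSS


def batched(items: Sequence[Any], size: int = SERVERLESS_BATCH_SIZE) -> Iterator[Sequence[Any]]:
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def strip_managed_fields(method: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the index method config without engine/encoder."""
    cleaned = copy.deepcopy(method)  # leave the caller's config untouched
    cleaned.pop("engine", None)
    cleaned.pop("encoder", None)
    if isinstance(cleaned.get("parameters"), dict):
        cleaned["parameters"].pop("encoder", None)
    return cleaned
```

For example, `strip_managed_fields({"name": "hnsw", "engine": "faiss", "parameters": {"m": 16}})` keeps `name` and `parameters` and drops `engine`.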
@sre-ci-robot


[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: norrishuang
To complete the pull request process, please assign xuanyang-cn after the PR has been reviewed.
You can assign the PR to them by writing /assign @xuanyang-cn in a comment when ready.

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
