Is your feature request related to a problem? Please describe
OpenSearch is a distributed search and analytics engine designed to handle large volumes of data with high performance and availability. A critical aspect of its architecture is data replication, which ensures data durability and enables fast query performance across the cluster.
Traditionally, OpenSearch has used document-level replication. In this model:
When a document is indexed, it's first processed by the primary shard.
The document is then sent to each replica shard.
Each replica independently processes the document, creating its own segment.
While effective, this approach has some drawbacks:
CPU Intensity: Each replica performs the same compute-intensive tasks as the primary, including text analysis, inverted index creation, and codec operations.
Network Overhead: Individual documents are sent over the network, which can be inefficient for bulk operations.
Replication Lag: Under high indexing loads, replicas can fall behind the primary, potentially affecting search consistency.
To address these challenges, segment replication has been introduced:
Instead of replicating individual documents, entire Lucene segments are copied from the primary to replica shards.
Replicas receive fully-formed segments, eliminating the need to repeat CPU-intensive indexing operations.
This approach significantly reduces CPU usage on replica nodes, as they no longer need to analyze and index each document individually.
Network efficiency is improved, especially for bulk indexing operations, as segments can be transferred in larger, optimized chunks.
Replication lag is reduced, helping to maintain better consistency between primary and replica shards.
For remote-store-backed clusters, segment replication is the replication strategy. With segment replication, segments are created only on the primary shard and then copied to the replica shards. Because segment creation is CPU intensive, we have observed CPU skew between nodes of the same cluster when primary shards are not balanced.
Earlier attempts to rebalance primary shards across nodes (per index and across all indices) do help reduce the skew, but they work on a best-effort basis and do not enforce any hard constraint.
Describe the solution you'd like
Implement two new settings in OpenSearch:
index.routing.allocation.total_primary_shards_per_node: An index-level setting to limit primary shards per node for a specific index. Store this limit (indexTotalPrimaryShardsPerNodeLimit) in index metadata, similar to indexTotalShardsPerNodeLimit.
cluster.routing.allocation.total_primary_shards_per_node: A cluster-level setting to limit total primary shards per node across all indices.
These settings will enhance control over primary shard distribution, improving cluster balance and performance management.
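As a rough sketch, the two settings might be declared with OpenSearch's Setting API as shown below. The field names, the -1 "unlimited" default/minimum, and the property flags are assumptions modeled on the existing total_shards_per_node settings, not a confirmed design:

```java
// Hypothetical declarations, mirroring the existing
// index.routing.allocation.total_shards_per_node /
// cluster.routing.allocation.total_shards_per_node pair.
// -1 is assumed to mean "no limit", as with the existing settings.
public static final Setting<Integer> INDEX_TOTAL_PRIMARY_SHARDS_PER_NODE_SETTING =
    Setting.intSetting(
        "index.routing.allocation.total_primary_shards_per_node",
        -1, -1,
        Setting.Property.Dynamic, Setting.Property.IndexScope);

public static final Setting<Integer> CLUSTER_TOTAL_PRIMARY_SHARDS_PER_NODE_SETTING =
    Setting.intSetting(
        "cluster.routing.allocation.total_primary_shards_per_node",
        -1, -1,
        Setting.Property.Dynamic, Setting.Property.NodeScope);
```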
The existing ShardsLimitAllocationDecider class already contains the necessary infrastructure to evaluate shard allocation constraints: it has access to the current cluster state, routing information, and methods to check shard counts per node. Given this, we propose implementing the new primary shard limit settings within this class. Extending ShardsLimitAllocationDecider keeps the new checks consistent with existing allocation rules, avoids code duplication, and integrates the primary shard limits directly into the existing allocation decision process.
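A minimal sketch of what the cluster-level check inside the decider could look like. The method shape follows AllocationDecider#canAllocate, but the clusterPrimaryShardLimit field is hypothetical, and a real implementation would also merge this with the existing total-shards check, handle the index-level limit, and account for relocating shards:

```java
// Hypothetical addition inside ShardsLimitAllocationDecider; assumes a
// clusterPrimaryShardLimit field backed by the new cluster setting (-1 = unlimited).
@Override
public Decision canAllocate(ShardRouting shardRouting, RoutingNode node, RoutingAllocation allocation) {
    if (shardRouting.primary() && clusterPrimaryShardLimit > 0) {
        // Count primaries already assigned to this node, the same way the
        // existing total-shards check counts all shards on the node.
        int primariesOnNode = 0;
        for (ShardRouting assigned : node) {
            if (assigned.primary() && assigned.relocating() == false) {
                primariesOnNode++;
            }
        }
        if (primariesOnNode >= clusterPrimaryShardLimit) {
            return allocation.decision(Decision.NO, NAME,
                "too many primary shards [%d] allocated to this node, cluster setting "
                    + "[cluster.routing.allocation.total_primary_shards_per_node=%d]",
                primariesOnNode, clusterPrimaryShardLimit);
        }
    }
    return allocation.decision(Decision.YES, NAME,
        "primary shard count on this node is under the per-node limit");
}
```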
Related component
No response
Describe alternatives you've considered
No response
Additional context
No response