-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Astra multi-tenancy - Shard Assignment #25
base: airbnb-main
Are you sure you want to change the base?
Conversation
long startTimeEpochMs, | ||
long endTimeEpochMs, | ||
List<String> partitions, | ||
boolean usingDedicatedPartition) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need usingDedicatedPartition
field here? I think we can drop it.
import com.slack.astra.metadata.core.AstraMetadata; | ||
import java.util.Objects; | ||
|
||
/** PartitionMetadata Object to track the utilization and isPartitionShared in zookeeper */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please expand on the class doc. PartitionMetadata Object is used to track the utilization of a kafka partition in Zookeeper. isPartitionShared
field is set when this partition is shared by multiple tenants. If false, this partition is used by a single tenant.
public class PartitionMetadata extends AstraMetadata { | ||
public final String partitionId; | ||
public final long utilization; | ||
public final boolean isPartitionShared; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor nit but isSharedPartition may be more appropriate.
* @param requireDedicatedPartition - if dedicated partitions are required | ||
* @return list of string of partition ids | ||
*/ | ||
public List<String> findPartition(long requiredThroughput, boolean requireDedicatedPartition) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PartitionMetadataStore should only contain the logic related to ZK interactions. Move the business logic to calculate the partitions to the Cluster manager logic. Keep this class for updating ZK structs only.
public class PartitionMetadataStore extends AstraMetadataStore<PartitionMetadata> { | ||
public static final String PARTITION_MAP_METADATA_STORE_ZK_PATH = "/partition_map"; | ||
|
||
public static final int PARTITION_CAPACITY = 5000000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a TODO to make this a config.
int numberOfPartitions = getPartitionCount(requiredThroughput); | ||
|
||
long perPartitionCapacity = requiredThroughput / numberOfPartitions; | ||
// we want minimum of two partitions assigned to a tenant for redundancy purpose |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make the minimum number of partitions also configurable. Some use cases may need 3.
new PartitionMetadata(partitionMetadata.partitionId, perPartitionCapacity, false)); | ||
} | ||
} else { | ||
// add logic for shared partition here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this code can use a comment. Clean up comments?
return partitionIdsList; | ||
} | ||
} | ||
Thread.sleep(500); // to handle local cache sync |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sleep is an anti-pattern since it can take longer to update the ZK data. Can we re-write this code without using sleep? It may be better to limit the manager grpc such that there is one of these operations happening at a time, so we can change this utilization without the sleep.
} | ||
} | ||
Thread.sleep(500); // to handle local cache sync | ||
// if we reached here means we did not find required number of partitions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this comment be before return? Is List.of()
better than collections.emptyList()
here?
rpc CreatePartition(CreatePartitionRequest) returns (PartitionMetadata) {} | ||
|
||
// GetPartition returns a single partition metadata by partition_id | ||
rpc GetPartition(GetPartitionRequest) returns (PartitionMetadata) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ListPartitions
to return all partitions may be a better API here since user needs to know how many partitions there are. We can keep the GetPartition API also, but we can skip it in favor of ListPartitions
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a new GRPC API ListPartition
Summary
Proposal doc
This PR implements 5 GRPC API calls in cluster manager.
Implementation to manage PartitionMetadata related information is also added. The PartitionMetadata tracks each partitions utilization and isShared status.
CreatePartition --> Creates a Partition in ASTRA zookeeper map.
GetPartition --> Get Partition metadata information from zookeeper about a request partition ID
CreateTenant --> Create a new Tenant in ASTRA. Find partitions which has capacity and assign to the new tenant programmatically.
RemoveTenant --> remove a tenant currently taking traffic from ASTRA. Indirectly, manages the partitions utilization also.
UpdateTenant --> update throughput and / or dedicated partition requirement for a tenant. Indirectly, manages the partitions utilization also.
Note: Currently, I have not incorporated per tenant schema validation when sharing partitions since we do not have per tenant schema implemented. Once, we have that in future, we will need to account for that also when assigning shared partitions.
Testing
Requirements