Glossary
This article lists the technical terms used in AutoMQ and gives a brief overview of the modules involved, grouped as follows:
- Cloud Service Concepts: Covers the cloud services and product components used by AutoMQ. Users can refer to the documentation of each cloud provider for more detailed information.
- Apache Kafka Concepts: Covers some existing concepts of Apache Kafka that may vary due to AutoMQ's implementation.
- AutoMQ Concepts: Covers the new concepts defined within the various modules of AutoMQ.
EBS (Elastic Block Store) is a high-performance, scalable, durable, and low-latency block storage service. In AutoMQ's system design, EBS temporarily stores message data that has not yet been uploaded to object storage, which lowers the latency of sending and receiving messages. Cloud providers may use different product names for their block storage services.
S3 (Simple Storage Service) is a secure, durable, and highly scalable object storage service. In AutoMQ's system design, object storage is the primary storage medium for messages. It is consumed on demand with pay-as-you-go pricing, reducing storage costs by 90% compared to Apache Kafka. In the rest of this documentation, "S3" also refers to object storage in general; cloud providers may use different product names for their object storage services.
Bucket is the fundamental container in object storage services, used to organize and manage data. Before deploying AutoMQ, one or more Buckets must be created in advance and configured as message storage.
Auto Scaling Group (ASG) is a service that can automatically adjust computing resources to meet application load demands. ASG can automatically increase or decrease the number of instances in a group of virtual hosts, ensuring high availability of applications and optimizing costs. AutoMQ uses ASG to implement automatic elasticity and scaling features. Different cloud providers may use different product names for ASG.
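As a rough illustration of how an ASG-style policy might choose an instance count, the sketch below scales the group in proportion to how far the observed per-instance load is from a target, then clamps the result to the group's size limits. The function name, parameters, and proportional rule are assumptions made for illustration; they are not AutoMQ's or any cloud provider's actual scaling policy.

```python
import math


def desired_instances(current: int, load_per_instance: float,
                      target_load: float, min_size: int, max_size: int) -> int:
    """Hypothetical proportional scaling rule, clamped to [min_size, max_size]."""
    if current <= 0 or target_load <= 0:
        raise ValueError("current and target_load must be positive")
    # Scale the group so that per-instance load moves back toward the target.
    wanted = math.ceil(current * load_per_instance / target_load)
    # Never go below the minimum or above the maximum group size.
    return max(min_size, min(max_size, wanted))
```

For example, with 4 instances each at a load of 90 against a target of 60, the rule asks for 6 instances; a very low load falls back to `min_size`, and a very high load is capped at `max_size`.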
Broker is the logical role in the Apache Kafka system responsible for processing, storing, and transmitting messages. Multiple Broker nodes together form a Kafka cluster. In AutoMQ's system design, Broker specifically refers to the logical role that handles routine message sending and receiving, excluding the Controller role used for scheduling and allocation.
Controller is the logical role in the Apache Kafka system responsible for scheduling and coordinating task allocation among multiple nodes. Its implementation differs between Kafka versions. In AutoMQ's system design, the Controller is built on KRaft mode and no longer relies on ZooKeeper. Among multiple Controller nodes, one Active Controller serves as the primary decision-making node.
Partition is the logical shard of an Apache Kafka Topic, used to achieve parallel data processing and increase throughput. Each Partition is an ordered, immutable sequence of messages. In AutoMQ's system design, a Partition retains its original functional definition but no longer stores data on local disks. Instead, it relies on object storage, which gives it practically unlimited capacity and on-demand scalability.
AutoMQ is a next-generation Apache Kafka distribution redesigned around cloud-native principles. It offers up to a tenfold cost advantage and elasticity improved by a factor of hundreds, while remaining 100% compatible with the Apache Kafka protocol.
S3Stream is a low-latency, high-throughput, elastic, and cost-effective stream storage library built on EBS and object storage. It exposes its functionality to external components through the Stream interface. AutoMQ replaces Apache Kafka's Log storage with S3Stream, keeping 100% compatibility with Apache Kafka's upper-layer functionality while offering up to a tenfold cost advantage and elasticity improved by a factor of hundreds.
S3Url is a unified configuration item used by AutoMQ for rapid cluster deployment. It contains information such as the object storage endpoint and identity credentials. It is recommended to generate the S3Url with the installation tool, which validates the parameters and checks resource compatibility in advance and avoids the cumbersome cluster ID generation and storage formatting steps required by Apache Kafka.
WAL (Write-Ahead Log) is a high-throughput, low-latency, persistent cache based on EBS in the S3Stream library. It temporarily caches data not yet committed to object storage. In AutoMQ, WAL is allocated at the Broker level. When a Broker receives a message, it first writes the message sequentially to the WAL and immediately returns a client response, then asynchronously uploads the WAL data to object storage.
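The write path described above can be sketched as a toy in-memory model. This is only an illustration under stated assumptions: the class and method names are hypothetical, a Python list stands in for the EBS-backed WAL, another list stands in for object storage, and the asynchronous upload is represented by an explicit `upload_wal()` call rather than a background task.

```python
class BrokerWriteSketch:
    """Toy model of the WAL write path: append, acknowledge, upload later."""

    def __init__(self):
        self.wal = []             # stands in for the EBS-backed WAL
        self.object_storage = []  # stands in for objects uploaded to S3
        self.next_offset = 0

    def produce(self, message: bytes) -> int:
        """Append the message sequentially to the WAL and acknowledge at once."""
        offset = self.next_offset
        self.wal.append((offset, message))
        self.next_offset += 1
        # Returning here models the immediate client response:
        # nothing has been uploaded to object storage yet.
        return offset

    def upload_wal(self) -> int:
        """Move everything currently in the WAL to object storage.

        In a real system this would run asynchronously in the background.
        """
        if not self.wal:
            return 0
        batch = list(self.wal)
        self.object_storage.append(batch)
        self.wal.clear()
        return len(batch)
```

The point of the sketch is the ordering: `produce` returns as soon as the WAL append succeeds, and the upload to object storage happens later and independently of the client response.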
Stream Object is the smallest unit for storing Stream data in S3Stream. Data from each Stream is distributed across multiple Stream Objects, which collectively simulate an infinite Stream.
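To make the idea of many Stream Objects forming one logical Stream concrete, the sketch below locates the object that holds a given offset. It assumes, purely for illustration, that each Stream Object covers a contiguous, non-overlapping offset range and that the objects are kept sorted by start offset; the `StreamObject` fields and the `find_stream_object` helper are hypothetical and not taken from the S3Stream code.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class StreamObject:
    start_offset: int  # inclusive
    end_offset: int    # exclusive
    key: str           # hypothetical object storage key


def find_stream_object(objects: List[StreamObject],
                       offset: int) -> Optional[StreamObject]:
    """Return the object covering `offset`, or None if no object covers it.

    `objects` must be sorted by start_offset and must not overlap.
    """
    starts = [o.start_offset for o in objects]
    # Index of the last object whose start_offset is <= offset.
    i = bisect_right(starts, offset) - 1
    if i >= 0 and offset < objects[i].end_offset:
        return objects[i]
    return None
```

A reader of the Stream only sees offsets; the lookup hides the fact that the data is spread across separate objects.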
Stream Set Object is a temporary data structure in S3Stream used to merge scattered Stream write requests. When uploading the temporary data from WAL to object storage, data from multiple scattered Streams is merged into a single Stream Set Object before uploading. Subsequently, the Stream Set Object is asynchronously classified and organized into regular Stream Objects.
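The two steps described above, merging on upload and reorganizing afterwards, can be sketched as follows. This is a simplified model under assumptions: WAL records are represented as `(stream_id, offset, payload)` tuples, the merged object is just a sorted list, and both function names are hypothetical rather than part of S3Stream's API.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Record = Tuple[int, int, bytes]  # (stream_id, offset, payload)


def build_stream_set_object(wal_records: List[Record]) -> List[Record]:
    """Merge records from many Streams into one object for a single upload.

    Records are grouped by stream and ordered by offset within each stream.
    """
    return sorted(wal_records, key=lambda r: (r[0], r[1]))


def split_into_stream_objects(
    stream_set_object: List[Record],
) -> Dict[int, List[Tuple[int, bytes]]]:
    """Later reorganization: one per-Stream object for each stream in the set."""
    per_stream: Dict[int, List[Tuple[int, bytes]]] = defaultdict(list)
    for stream_id, offset, payload in stream_set_object:
        per_stream[stream_id].append((offset, payload))
    return dict(per_stream)
```

Merging first keeps the number of upload requests low when many Streams each write only a little data; the later split restores a per-Stream layout for reads.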