A scalable Distributed Logging System that collects, processes, and stores logs generated by multiple services in a centralized, searchable store.
This project demonstrates how a modern logging pipeline is built with Python, Kafka, Fluentd, and Elasticsearch.
In distributed systems, logs are generated across multiple services and machines.
Without a centralized logging mechanism, debugging, monitoring, and analysis become difficult and error-prone.
This project solves that problem by:
- Collecting logs from multiple services
- Streaming logs reliably using Kafka
- Processing and forwarding logs via Fluentd
- Storing logs in Elasticsearch for querying and analysis
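The first step, log generation, can be sketched as a small Python service that emits structured JSON log lines to stdout for Fluentd to collect. This is a minimal sketch: the field names (`ts`, `service`, `level`, `message`) are an illustrative schema, not necessarily the one used by the services in this repo.

```python
import json
import sys
import time


def make_log(service, level, message, **fields):
    """Build one structured log record as a JSON string.

    Extra keyword arguments become additional context fields,
    so callers can attach request IDs, SKUs, and so on.
    """
    record = {
        "ts": time.time(),   # epoch seconds, captured at emit time
        "service": service,
        "level": level,
        "message": message,
    }
    record.update(fields)    # optional per-event context
    return json.dumps(record)


def emit(service, level, message, **fields):
    """Write one record to stdout, where Fluentd can tail it."""
    sys.stdout.write(make_log(service, level, message, **fields) + "\n")


if __name__ == "__main__":
    emit("inventory_service", "INFO", "stock level checked", sku="A-123", qty=7)
```

Emitting one JSON object per line keeps the downstream parsing step trivial: Fluentd and the consumers can treat each line as an independent event.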
```
[ Python Services ]
         |
         v
 Log Files / Stdout
         |
         v
      Fluentd
         |
         v
       Kafka
         |
         v
   Log Consumers
         |
         v
  Elasticsearch
```
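The Fluentd hop in the diagram above (tail service log files, forward to Kafka) might be configured roughly as follows. This is a sketch, not the repo's actual `conf` files: the paths, tag, topic, and broker address are placeholders, and the output assumes the commonly used `fluent-plugin-kafka` plugin.

```
<source>
  @type tail
  path /var/log/services/*.log
  pos_file /var/log/fluentd/services.pos
  tag services.raw
  <parse>
    @type json
  </parse>
</source>

<match services.**>
  @type kafka2
  brokers localhost:9092
  default_topic logs
  <format>
    @type json
  </format>
  <buffer topic>
    flush_interval 3s
  </buffer>
</match>
```

Buffering in the Kafka output is what gives the pipeline its fault tolerance: if the broker is briefly unavailable, Fluentd retries from its buffer instead of dropping events.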
```
distributedloggingsystem
├── README.md
├── services
│   ├── service1.py
│   ├── service2.py
│   └── service3.py
├── fluentd
│   ├── conf1.conf
│   ├── conf2.conf
│   └── conf3.conf
├── consumers
│   ├── consumer_first.py
│   └── consumer_final.py
├── storage
│   └── logs_indexing_storage.py
└── samples
    ├── inventory_service.log
    └── sample_initial_consumer_output.json
```
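A first-stage consumer, in the spirit of `consumers/consumer_first.py` (the actual file may differ), reads raw records from Kafka, parses them, and enriches them before handing off. A sketch using the `kafka-python` client, with the topic and broker address as placeholders:

```python
import json
import time


def enrich(raw: bytes) -> dict:
    """Parse one raw Kafka message value and add an ingest timestamp.

    Assumes log lines are JSON objects; malformed lines are wrapped
    rather than dropped, so no event is silently lost.
    """
    try:
        record = json.loads(raw)
        if not isinstance(record, dict):
            record = {"message": record}
    except ValueError:
        record = {
            "message": raw.decode("utf-8", errors="replace"),
            "parse_error": True,
        }
    record["ingested_at"] = time.time()
    return record


def consume_forever():
    # Requires a running broker and the kafka-python package.
    from kafka import KafkaConsumer
    consumer = KafkaConsumer(
        "logs",                              # placeholder topic name
        bootstrap_servers="localhost:9092",  # placeholder broker
        group_id="log-consumers",
    )
    for msg in consumer:
        print(json.dumps(enrich(msg.value)))

# With a broker running: consume_forever()
```

Keeping the parsing/enrichment logic in a pure function separate from the Kafka loop makes it easy to unit-test and to reuse in later consumer stages.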
Tech stack:

- Python – Log generation and consumers
- Apache Kafka – Message streaming
- Fluentd – Log collection and forwarding
- Elasticsearch – Log storage and search
- Git – Version control
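On the storage side, logs are typically written to date-based Elasticsearch indices so they can be queried and expired per day. A sketch of that pattern, assuming the official `elasticsearch-py` client; the `logs-<service>-<date>` naming and the host are illustrative conventions, not necessarily what `storage/logs_indexing_storage.py` does:

```python
import time


def daily_index(service: str, epoch: float) -> str:
    """Build a per-service, per-day index name.

    The 'logs-<service>-<YYYY.MM.DD>' convention keeps each day's
    logs in a separate index for easy retention management.
    """
    day = time.strftime("%Y.%m.%d", time.gmtime(epoch))
    return f"logs-{service}-{day}"


def store(record: dict) -> None:
    # Requires a reachable cluster and the elasticsearch package;
    # the host URL is a placeholder.
    from elasticsearch import Elasticsearch
    es = Elasticsearch("http://localhost:9200")
    es.index(index=daily_index(record["service"], record["ts"]),
             document=record)

# With a cluster running:
# store({"service": "inventory", "ts": time.time(),
#        "level": "INFO", "message": "stock level checked"})
```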
Key features:

- Centralized logging for distributed services
- Decoupled producer-consumer architecture
- Fault-tolerant log streaming
- Scalable and extensible design
- Easily integrable with monitoring tools
This project helps you understand:
- Distributed system observability
- Message-driven architectures
- Real-world logging pipelines
- Event streaming with Kafka
- Log aggregation and indexing
Kavya Samhitha – Computer Science Student | Backend & Systems Enthusiast