For detailed information about the system architecture, please refer to the overview report.
Additional information about each service is given below:
This API performs CRUD operations for images using MongoDB and pushes changes to Kafka topics.
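The exact endpoints, collection names, and topic names are not documented here, so the following is only a minimal sketch of the general write path (store an image document in MongoDB, then publish a change event to Kafka), assuming pymongo and kafka-python; the 'images' collection and 'image-events' topic are hypothetical placeholders.

```python
# Minimal sketch of the image API's write path: persist an image document
# in MongoDB and publish a change event to Kafka. The collection name
# ("images") and topic name ("image-events") are hypothetical placeholders.
import json
from datetime import datetime, timezone

from kafka import KafkaProducer
from pymongo import MongoClient

mongo = MongoClient("mongodb://127.0.0.1:27017")  # producer_db port in the staging setup
images = mongo["plant_db"]["images"]              # hypothetical collection name

producer = KafkaProducer(
    bootstrap_servers="127.0.0.1:9093",           # kafka port in the staging setup
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def create_image(doc: dict) -> None:
    """Insert an image document and emit a 'created' event."""
    result = images.insert_one(doc)
    event = {
        "action": "created",
        "image_id": str(result.inserted_id),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    producer.send("image-events", event)          # hypothetical topic name
    producer.flush()
```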
This API contains ML models that analyze leaf diseases based on different image representations.
basic_services/api/image_analyzer_api
This job automatically generates and uploads new pepper, potato, or tomato photos every 30 seconds.
basic_services/jobs/inner_jobs/camera
This job identifies diseases for one image every 100 seconds.
basic_services/jobs/inner_jobs/leaf_disease_recognizer
This service continuously updates image information and can also activate and delete images. It comprises three distinct jobs (a minimal scheduling sketch follows the path below):
- Job to activate or deactivate images, which runs every 300 seconds.
- Job to update image metadata, which runs every 300 seconds.
- Job to delete images, which runs every 700 seconds.
basic_services/jobs/inner_jobs/users
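The intervals above suggest simple time-driven loops. Below is a minimal sketch of such a periodic job, assuming the work itself is just a placeholder function; the real jobs presumably call the image API or the database directly.

```python
# Minimal sketch of a periodic job that runs a task at a fixed interval,
# as the activate/update/delete jobs above do. The task body is a placeholder.
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("users-job")

def update_image_metadata() -> None:
    # Placeholder for the real work (e.g. calling the image API).
    log.info("updating image metadata")

def run_periodically(task, interval_seconds: int) -> None:
    while True:
        started = time.monotonic()
        try:
            task()
        except Exception:
            log.exception("job iteration failed")
        # Sleep for whatever is left of the interval.
        elapsed = time.monotonic() - started
        time.sleep(max(0.0, interval_seconds - elapsed))

if __name__ == "__main__":
    run_periodically(update_image_metadata, 300)  # 300 s, as for the metadata job
```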
This job synchronizes the producer and consumer databases using a batch approach.
basic_services/jobs/outer_jobs/db_synchronizer
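The synchronization logic itself is not shown here; a minimal sketch of one batch pass, assuming both MongoDB instances are reachable on the staging ports and that documents share a unique _id, could look like this (the 'images' collection name is a hypothetical placeholder):

```python
# Minimal sketch of one batch synchronization pass from the producer to the
# consumer MongoDB instance. Ports match the staging setup; the collection
# name ("images") is a hypothetical placeholder.
from pymongo import MongoClient, ReplaceOne

producer = MongoClient("mongodb://127.0.0.1:27017")["plant_db"]["images"]
consumer = MongoClient("mongodb://127.0.0.1:27018")["plant_db"]["images"]

BATCH_SIZE = 500

def sync_once() -> None:
    """Copy documents from the producer to the consumer in fixed-size batches."""
    batch = []
    for doc in producer.find({}):
        batch.append(ReplaceOne({"_id": doc["_id"]}, doc, upsert=True))
        if len(batch) >= BATCH_SIZE:
            consumer.bulk_write(batch)
            batch = []
    if batch:
        consumer.bulk_write(batch)

if __name__ == "__main__":
    sync_once()
```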
The databases store images along with their related information.
This database is part of the Leaf Image Management System and stores produced images along with their related information.
basic_services/mongodb/producer_db/plant_db
This database stores consumed images along with their related information.
basic_services/mongodb/consumer_db/plant_db
Metrics allow us to monitor important aspects of the system related to performance, availability, and reliability.
This service collects metrics from the system. In our case, we collect metrics from image-api for stream processing and from db-synchronizer for batch processing. The following metrics are exposed (an exposure sketch follows the list):
- image_api_image - number of images according to their plant, id and disease
- image_api_image_size - size of images in bytes with metadata
- db_synchronizer_job_image - number of images according to their plant, id and disease
- db_synchronizer_job_image_size - size of images in bytes with metadata
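All four metrics are labelled by plant, id, and disease. A minimal sketch of how such metrics could be exposed from a Python service with prometheus_client is shown below; the Gauge metric type and the sample label values are assumptions, not taken from the actual implementation.

```python
# Minimal sketch of exposing the image metrics with prometheus_client.
# Metric and label names follow the list above; the metric type (Gauge)
# and the sample values are illustrative assumptions.
import time

from prometheus_client import Gauge, start_http_server

image_count = Gauge(
    "image_api_image",
    "Number of images by plant, id and disease",
    ["plant", "id", "disease"],
)
image_size = Gauge(
    "image_api_image_size",
    "Size of images in bytes with metadata",
    ["plant", "id", "disease"],
)

if __name__ == "__main__":
    start_http_server(8050)  # image_api_prometheus port in the staging setup
    # Illustrative values; a real service would set these from its own data.
    image_count.labels(plant="tomato", id="42", disease="early_blight").set(1)
    image_size.labels(plant="tomato", id="42", disease="early_blight").set(204800)
    while True:
        time.sleep(60)
```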
Prometheus is deployed using a Helm chart. Remember to install Helm before deploying Prometheus.
This service visualizes metrics from Prometheus. To log in to Grafana, use the following credentials: admin (login) / admin (password).
Here is how to import the dashboard into Grafana (a scripted alternative is sketched after these steps):
- Open Grafana and log in. Then click the '+' button and select the 'Import dashboard' option.
- Add the .json file from the metrics/grafana/dashboard folder and click the 'Load' button.
- Skip the 'Options' section and click the 'Import' button.
- You should now see the imported dashboard.
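If you prefer to script the import instead of clicking through the UI, the Grafana HTTP API can create the dashboard from the same .json file. The sketch below assumes Grafana on localhost:3000 with the default admin/admin credentials and a hypothetical file name inside metrics/grafana/dashboard.

```python
# Minimal sketch of importing a dashboard JSON through the Grafana HTTP API
# instead of the UI. Assumes Grafana at localhost:3000 with admin/admin;
# the exact file name inside metrics/grafana/dashboard is a placeholder.
import json

import requests

GRAFANA_URL = "http://localhost:3000"
DASHBOARD_FILE = "metrics/grafana/dashboard/dashboard.json"  # placeholder file name

with open(DASHBOARD_FILE) as f:
    dashboard = json.load(f)

payload = {
    "dashboard": {**dashboard, "id": None},  # let Grafana assign a new id
    "overwrite": True,
}

resp = requests.post(
    f"{GRAFANA_URL}/api/dashboards/db",
    json=payload,
    auth=("admin", "admin"),
    timeout=10,
)
resp.raise_for_status()
print(resp.json())
```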
For this project we use .stg files for staging and .prod files for production.
- Install minikube
- Start minikube
minikube start
You can change the context to minikube using the following commands:
kubectl config view
kubectl config current-context
kubectl config use-context minikube
- Run the following script:
sh start.stg.sh
All the services will be deployed locally in 'leaf-image-management-system' namespace.
- Wait 2-3 minutes until all pods are running and all the data has been loaded into the databases
- You can expose the ports of all services using the following script: open.stg.sh
sh open.stg.sh
Each service has been deployed to 127.0.0.1 with the ClusterIP type (a connectivity check sketch follows the list):
- image_api: 8080
- image_api_prometheus: 8050
- image_analizer_api: 8081
- camera: 5050
- leaf_disease_recognizer: 5051
- users: 5052
- db_synchronizer: 5053
- db_synchronizer_prometheus: 8051
- producer_db: 27017
- consumer_db: 27018
- kafka: 9093
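A quick way to verify that the port forwarding worked is to probe the ports above over TCP; the sketch below only checks reachability and does not depend on any service-specific endpoints.

```python
# Minimal sketch: probe the locally exposed staging ports over TCP to check
# that the port forwarding worked. Ports are taken from the list above.
import socket

SERVICES = {
    "image_api": 8080,
    "image_api_prometheus": 8050,
    "image_analizer_api": 8081,
    "camera": 5050,
    "leaf_disease_recognizer": 5051,
    "users": 5052,
    "db_synchronizer": 5053,
    "db_synchronizer_prometheus": 8051,
    "producer_db": 27017,
    "consumer_db": 27018,
    "kafka": 9093,
}

for name, port in SERVICES.items():
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=2):
            status = "reachable"
    except OSError:
        status = "unreachable"
    print(f"{name:28s} 127.0.0.1:{port:<6d} {status}")
```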
- You can close the ports using the following script:
sh close.stg.sh
- After you have implemented your consumer, you can deploy the Prometheus cluster with Grafana by running the following script:
sh start-metrics.stg.sh
All the services from this script will be deployed locally in the 'metrics' namespace. Prometheus will be deployed using a Helm chart. Remember to install Helm before deploying Prometheus.
If you have already installed Helm, update the Helm repositories by running the following command before starting the script:
helm repo update
- Wait 7-9 minutes until all pods are running and all the data has been loaded into the databases.
- You can expose the ports of all services using the following script: open-metrics.stg.sh
sh open-metrics.stg.sh
Each service has been deployed to 127.0.0.1 with the ClusterIP type:
- prometheus: 9090
- grafana: 3000
- You can close the ports using the following script:
sh close-metrics.stg.sh
- In Grafana you can see the dashboard showing the batch and stream processing metrics.
- Create a GKE cluster. You can read the description file here.
- Run the following script:
sh start.prod.sh your_project_id
All the services will be deployed to 'leaf-image-management-system' namespace.
- Wait 7-9 minutes until all pods are running and all the data has been loaded into the databases.
- GKE deploys the services with the NodePort type. The node ports of the services are:
- image_api: 30080
- image_api_prometheus: 30051
- image_analizer_api: - (not deployed)
- camera: 30550
- leaf_disease_recognizer: - (not deployed)
- users: 30551
- db_synchronizer: 30552
- db_synchronizer_prometheus: 30051
- producer_db: 30017
- consumer_db: 30018
As you can see, image_analizer_api and leaf_disease_recognizer are not deployed to GKE. This is because we do not use a GPU for this project.
You can connect to the services using the following steps:
4.1 Find out the external IP address of a node: kubectl get nodes -o wide
4.2 Connect to the service using the address your_external_ip:service_port
- For port exposure, this setup uses an Ingress.
- After you have implemented your consumer, you can deploy the Prometheus cluster with Grafana by running the following script:
sh start-metrics.prod.sh
All the services from this script will be deployed to 'metrics' namespace. Prometheus will be deployed using Helm chart.
- Wait 2-3 minutes until all pods are running and all the data has been loaded into the databases.
- GKE deploys Grafana with LoadBalancer type.
- In Grafana you can add the dashboard showing the batch and stream processing metrics.
- For the entire system:
sh stop.stg.sh
- For metrics:
sh stop-metrics.stg.sh
- For the entire system:
sh stop.prod.sh
- For metrics:
sh stop-metrics.prod.sh
Based on the overview report and the deployment instructions, you will develop a service that interacts with this system using an event-driven approach.
This assignment is divided into three parts:
- Task 1 [55%]: Develop a service for consuming and logging data from a Kafka cluster. Test and run the application locally. For this purpose you should use .stg files to deploy the system (a minimal consumer sketch follows this list).
- Task 2 [30%]: Test and run the existing application, the Kafka cluster, and your service on Google Cloud Platform (GCP) using Google Kubernetes Engine (GKE). For this purpose you should use .prod files to deploy the system on GCP.
- Task 3 [15%]: Compare the bandwidth between the pre-implemented batch processing application and the stream processing service you are tasked to implement. For this task you can see the difference between the two approaches by opening the Grafana dashboards for batch and stream processing.
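As a starting point for Task 1, here is a minimal sketch of a Kafka consumer that logs incoming messages, assuming kafka-python and the staging Kafka address from the port list above; the topic name 'image-events' is a hypothetical placeholder, so check the image API configuration for the real one.

```python
# Minimal sketch of a Kafka consumer that logs incoming image events (Task 1).
# The bootstrap address matches the staging port list; the topic name
# "image-events" is a hypothetical placeholder.
import json
import logging

from kafka import KafkaConsumer

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("image-consumer")

consumer = KafkaConsumer(
    "image-events",                      # hypothetical topic name
    bootstrap_servers="127.0.0.1:9093",
    group_id="image-consumer-group",
    auto_offset_reset="earliest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for message in consumer:
    log.info(
        "topic=%s partition=%d offset=%d value=%s",
        message.topic, message.partition, message.offset, message.value,
    )
```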