Update metric documentation and several related subjects
jeroenvandisseldorp committed Jun 27, 2024
1 parent 320b1b8 commit 336a765
Showing 15 changed files with 286 additions and 139 deletions.
42 changes: 42 additions & 0 deletions docs/functions.md
@@ -9,6 +9,7 @@
* [Function Types](#function-types)
* [Function parameters](#function-parameters)
* [Logger](#logger)
* [Metrics](#metrics)
* [State stores](#state-stores)

## Introduction
@@ -161,6 +162,47 @@ Output of the above statements looks like:
[LOG TIMESTAMP] DEBUG function.name I'm printing five variables here: 1, 2, 3, text, {"json":"is cool"}. Lovely isn't it?
```

### Metrics

KSML supports metric collection and exposure through JMX and a built-in Prometheus agent. Metrics for Python functions are
automatically generated and collected, but users can also specify their own metrics. For an example,
see `17-example-inspect-with-metrics.yaml` in the `examples` directory.

KSML supports the following metric types:

* Counter: an increasing integer that counts, for example, the number of calls made to a Python function.
* Meter: used for periodically updating a measurement value. Preferred over Counter when you don't care too much about
  exact averages, but want to monitor trends instead.
* Timer: measures the time spent by processes or functions that get called internally.

Every Python function in KSML can use the `metrics` variable, which is made available by KSML. The object supports the
following methods to create your own metrics:

* counter(name: str, tags: dict) -> Counter
* counter(name: str) -> Counter
* meter(name: str, tags: dict) -> Meter
* meter(name: str) -> Meter
* timer(name: str, tags: dict) -> Timer
* timer(name: str) -> Timer

In turn these objects support the following:

#### Counter

* increment()
* increment(delta: int)

#### Meter

* mark()
* mark(nrOfEvents: int)

#### Timer

* updateSeconds(valueSeconds: int)
* updateMillis(valueMillis: int)
* updateNanos(valueNanos: int)
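
A minimal sketch of how these methods might be combined inside a Python function. Outside a KSML runner there is no
`metrics` variable, so the classes `FakeCounter`, `FakeMeter`, `FakeTimer` and `FakeMetrics` below are hypothetical
stand-ins that only mimic the method names listed above. The function `inspect_message`, the metric names and the tags
are illustrative assumptions as well, and reusing one metric object per name and tag set is a choice of the stub, not a
documented KSML guarantee.

```python
import time


class FakeCounter:
    """Hypothetical stand-in for the Counter returned by metrics.counter()."""

    def __init__(self):
        self.count = 0

    def increment(self, delta=1):
        self.count += delta


class FakeMeter:
    """Hypothetical stand-in for the Meter returned by metrics.meter()."""

    def __init__(self):
        self.events = 0

    def mark(self, nrOfEvents=1):
        self.events += nrOfEvents


class FakeTimer:
    """Hypothetical stand-in for the Timer returned by metrics.timer()."""

    def __init__(self):
        self.total_nanos = 0

    def updateSeconds(self, valueSeconds):
        self.total_nanos += valueSeconds * 1_000_000_000

    def updateMillis(self, valueMillis):
        self.total_nanos += valueMillis * 1_000_000

    def updateNanos(self, valueNanos):
        self.total_nanos += valueNanos


class FakeMetrics:
    """Hypothetical stand-in for the `metrics` variable that KSML provides."""

    def __init__(self):
        self._registry = {}

    def _get(self, kind, factory, name, tags):
        # Assumption of this stub: same kind, name and tags yield the same object.
        key = (kind, name, tuple(sorted((tags or {}).items())))
        if key not in self._registry:
            self._registry[key] = factory()
        return self._registry[key]

    def counter(self, name, tags=None):
        return self._get("counter", FakeCounter, name, tags)

    def meter(self, name, tags=None):
        return self._get("meter", FakeMeter, name, tags)

    def timer(self, name, tags=None):
        return self._get("timer", FakeTimer, name, tags)


# Inside KSML this variable already exists; here it is the stub.
metrics = FakeMetrics()


def inspect_message(key, value):
    start = time.perf_counter_ns()
    # Count calls per sensor and track the overall message rate.
    metrics.counter("messages-inspected", {"sensor": key}).increment()
    metrics.meter("message-rate").mark()
    # ... actual message handling would go here ...
    elapsed = time.perf_counter_ns() - start
    metrics.timer("inspect-duration").updateNanos(elapsed)


inspect_message("sensor2", {"type": "HUMIDITY", "value": "66"})
inspect_message("sensor2", {"type": "HUMIDITY", "value": "67"})
```

In a real KSML function only the body of `inspect_message` would be needed, since `metrics` is supplied by the runner.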

### State stores

Some functions are allowed to access local state stores. These functions specify the
17 changes: 9 additions & 8 deletions docs/index.md
@@ -3,22 +3,23 @@
Welcome to the KSML documentation. Use the menu on the left to navigate through the various sections.

## Quick Start

If you want to get going quickly, go to the KSML Quickstart.

## Introduction
KSML allows anyone to specify a powerful Kafka Streams application in just a few lines of YAML and Python snippets.

KSML allows anyone to specify a powerful Kafka Streams application in just a few lines of YAML and Python snippets.

## Contents

1. [Introduction](introduction.md)
1. [Stream Types](streams.md)
1. [Functions](functions.md)
1. [Pipelines](pipelines.md)
1. [Operations](operations.md)
1. [Data Types](types.md)
1. [Runners](runners.md)
1. [Language specification](ksml-language-spec)
2. [Stream Types](streams.md)
3. [Functions](functions.md)
4. [Pipelines](pipelines.md)
5. [Operations](operations.md)
6. [Data Types](types.md)
7. [Runners](runners.md)
8. [Language specification](ksml-language-spec.md)

[Getting Started](quick-start)

127 changes: 88 additions & 39 deletions docs/introduction.md

Large diffs are not rendered by default.

56 changes: 36 additions & 20 deletions docs/notations.md
@@ -1,53 +1,62 @@
# Notations

### Table of Contents

1. [Introduction](#introduction)
1. [Avro](#avro)
1. [CSV](#csv)
1. [JSON](#json)
1. [SOAP](#soap)
1. [XML](#xml)
2. [Avro](#avro)
3. [CSV](#csv)
4. [JSON](#json)
5. [SOAP](#soap)
6. [XML](#xml)

## Introduction

KSML is able to express its internal data types in a number of external representations. Internally these are called _notations_.
The different notations are described below.
KSML is able to express its internal data types in a number of external representations. Internally these are called
_notations_. The different notations are described below.

## AVRO

Avro types are supported through the "avro" prefix in types. The notation is ```avro:schema```, where schema is the schema fqdn, or just the schema name itself.
Avro types are supported through the "avro" prefix in types. The notation is ```avro:schema```, where schema is the
schema fqdn, or just the schema name itself.

On Kafka topics, Avro types are serialized in binary format. Internally they are represented as structs.

Examples
Examples:

```
avro:SensorData
avro:io.axual.ksml.example.SensorData
```

Note: when referencing an AVRO schema, you have to ensure that the respective schema file can be found in the KSML working directory and has the .avsc file extension.
Note: when referencing an AVRO schema, you have to ensure that the respective schema file can be found in the KSML
working directory and has the .avsc file extension.

## CSV

Comma-separated values are supported through the "csv" prefix in types. The notation is ```csv:schema```, where schema is the schema fqdn, or just the schema name itself.
Comma-separated values are supported through the "csv" prefix in types. The notation is ```csv:schema```, where schema
is the schema fqdn, or just the schema name itself.

On Kafka topics, CSV types are serialized as `string`. Internally they are represented as structs.

Examples
Examples:

```
csv:SensorData
csv:io.axual.ksml.example.SensorData
```

Note: when referencing a CSV schema, you have to ensure that the respective schema file can be found in the KSML working directory and has the .csv file extension.
Note: when referencing a CSV schema, you have to ensure that the respective schema file can be found in the KSML
working directory and has the .csv file extension.

## JSON

JSON types are supported through the "json" prefix in types. The notation is ```json:schema```, where `schema` is the schema fqdn, or just the schema name itself.
JSON types are supported through the "json" prefix in types. The notation is ```json:schema```, where `schema` is the
schema fqdn, or just the schema name itself.

On Kafka topics, JSON types are serialized as `string`. Internally they are represented as structs or lists.

Examples
Examples:

```
json:SensorData
json:io.axual.ksml.example.SensorData
@@ -58,20 +67,27 @@ If you want to use JSON without a schema, you can leave out the colon and schema
```
json
```
Note: when referencing a JSON schema, you have to ensure that the respective schema file can be found in the KSML working directory and has the .json file extension.

Note: when referencing a JSON schema, you have to ensure that the respective schema file can be found in the KSML
working directory and has the .json file extension.

## SOAP

SOAP is supported through built-in serializers and deserializers. The representation on Kafka will always be ```string```. Internally SOAP objects are structs with their own schema. Field names are derived from the SOAP standards.
SOAP is supported through built-in serializers and deserializers. The representation on Kafka will always
be ```string```. Internally SOAP objects are structs with their own schema. Field names are derived from the SOAP
standards.

## XML

XML is supported through built-in serializers and deserializers. The representation on Kafka will always be ```string```. Internally XML objects are structs.
XML is supported through built-in serializers and deserializers. The representation on Kafka will always
be ```string```. Internally XML objects are structs.

Examples:

Examples
```
xml:SensorData
xml:io.axual.ksml.example.SensorData
```

Note: when referencing an XML schema, you have to ensure that the respective schema file can be found in the KSML working directory and has the .xsd file extension.
Note: when referencing an XML schema, you have to ensure that the respective schema file can be found in the KSML
working directory and has the .xsd file extension.
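
As a rough illustration of the `notation:schema` convention described above, the sketch below splits a type string into
its notation and optional schema name, and looks up the schema file extension named in the notes. The helper names
`split_notation` and `schema_extension` are hypothetical and not part of KSML; how KSML itself resolves a schema name
to a file name is not stated here, so the sketch only returns the extension.

```python
# Schema file extensions as stated in the notes of this document.
SCHEMA_EXTENSIONS = {
    "avro": ".avsc",
    "csv": ".csv",
    "json": ".json",
    "xml": ".xsd",
}


def split_notation(type_name):
    """Split 'notation:schema' into (notation, schema); schema is None if absent."""
    notation, separator, schema = type_name.partition(":")
    return notation, (schema if separator else None)


def schema_extension(type_name):
    """Return the schema file extension for a type, or None when no schema is named."""
    notation, schema = split_notation(type_name)
    if schema is None:
        return None
    # Notations without a documented extension also yield None.
    return SCHEMA_EXTENSIONS.get(notation)


print(split_notation("avro:io.axual.ksml.example.SensorData"))
print(split_notation("json"))
print(schema_extension("xml:SensorData"))
```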
6 changes: 3 additions & 3 deletions docs/pipelines.md
@@ -67,14 +67,14 @@ four sink types in KSML:

| Sink type | Description |
|-----------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `as` | Allows the pipeline result to be saved under an internal referenceable name. Pipelines defined after this point may refer to this name in their `from` statement. |
| `as` | Allows the pipeline result to be saved under an internal name, which can later be referenced. Pipelines defined after this point may refer to this name in their `from` statement. |
| `branch` | This statement allows the pipeline to be split up in several branches. Each branch filters messages with an `if` statement. Messages will be processed only by the first branch of which the `if` statement is true. |
| `forEach` | Sends every message to a function, without expecting any return type. Because there is no return type, the pipeline always stops after this statement. |
| `print` | Prints out every message according to a given output specification. |
| `to` | Sends all output messages to a specific target. This target can be a pre-defined `stream`, `table` or `globalTable`, an inline-defined topic, or a special function called a `topicNameExtractor`. |

For more information, see the respective documentation
on [pipeline definitions](specifications.md#definitions/PipelineDefinition).
For more information, see the respective documentation on pipeline definitions in
the [definitions section of the KSML language spec](ksml-language-spec.md#definitions).
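
The first-match rule of the `branch` sink in the table above can be sketched in plain Python. This is only an
illustration of the routing idea, not KSML syntax: the function `route_to_first_match`, the branch names and the
conditions are hypothetical, and what KSML does with a message that matches no branch is not covered here (the sketch
simply returns `None`).

```python
def route_to_first_match(message, branches):
    """Return the name of the first branch whose condition accepts the message."""
    for name, condition in branches:
        if condition(message):
            return name
    return None  # no branch matched


# Ordered list of (branch name, condition); order matters for first-match routing.
branches = [
    ("humidity", lambda m: m["type"] == "HUMIDITY"),
    ("blue", lambda m: m["color"] == "blue"),
    ("default", lambda m: True),
]

message = {"type": "HUMIDITY", "color": "blue"}
# Both the first and second condition hold, but only the first branch handles it.
print(route_to_first_match(message, branches))
```

Because evaluation stops at the first condition that holds, more specific conditions belong before more general ones.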

## Duration

11 changes: 8 additions & 3 deletions docs/quick-start.md
@@ -1,23 +1,27 @@
# Quick start

### Table of Contents

1. [Introduction](#introduction)
2. [Starting a demo setup](#starting-a-demo-setup)
3. [Starting a KSML runner](#starting-a-ksml-runner)
4. [Next steps](#next-steps)

## Introduction

KSML comes with example definitions, which contain a producer that outputs SensorData messages to Kafka,
and several pipelines, which each independently consume and process the produced messages.

## Starting a demo setup

After checking out the repository, go to the KSML directory and execute the following:

```
docker compose up -d
```

This will start Zookeeper, Kafka and a Schema Registry in the background. It will also start the demo producer, which outputs two random messages per second on a `ksml_sensordata_avro` topic.
This will start Zookeeper, Kafka and a Schema Registry in the background. It will also start the demo producer, which
outputs two random messages per second on a `ksml_sensordata_avro` topic.

You can check that these containers started correctly using the following command:

@@ -26,6 +30,7 @@ docker compose logs -f
```

Press CTRL-C once you have verified that data is being produced. This typically looks like this:

```
example-producer-1 | 2024-03-06T20:24:49,480Z INFO i.a.k.r.backend.KafkaProducerRunner Calling generate_sensordata_message
example-producer-1 | 2024-03-06T20:24:49,480Z INFO i.a.k.r.backend.ExecutableProducer Message: key=sensor2, value=SensorData: {"city":"Utrecht", "color":"white", "name":"sensor2", "owner":"Alice", "timestamp":1709756689480, "type":"HUMIDITY", "unit":"%", "value":"66"}
@@ -42,7 +47,6 @@ example-producer-1 | 2024-03-06T20:24:50,035Z INFO i.a.k.r.backend.ExecutableP
```


## Starting a KSML runner

To start a container which executes the example KSML definitions, type
@@ -70,6 +74,7 @@ This will start the KSML docker container. You should see the following typical

## Next steps

Check out the examples in the [Examples]({{ site.github.repository_url }}/tree/main/examples/) directory. By modifying the file `examples/ksml-runner.yaml` you can select the example(s) to run.
Check out the examples in the `examples` directory of the project. By modifying the file `examples/ksml-runner.yaml` you
can select the example(s) to run.

For a more elaborate introduction, you can start [here](introduction.md) or refer to the [documentation](index.md).
87 changes: 56 additions & 31 deletions docs/release-notes.md
@@ -2,45 +2,70 @@

## Releases

<!-- TOC -->
* [Release Notes](#release-notes)
* [Releases](#releases)
* [0.8.0 (2024-03-08)](#080-2024-03-08)
* [0.2.2 (2024-01-30)](#022-2024-01-30)
* [0.2.1 (2023-12-20)](#021-2023-12-20)
* [0.2.0 (2023-12-07)](#020-2023-12-07)
* [0.1.0 (2023-03-15)](#010-2023-03-15)
* [0.0.4 (2022-12-02)](#004-2022-12-02)
* [0.0.3 (2021-07-30)](#003-2021-07-30)
* [0.0.2 (2021-06-28)](#002-2021-06-28)
* [0.0.1 (2021-04-30)](#001-2021-04-30)
<!-- TOC -->
* [Releases](#releases)
* [1.0.0 (2024-06-28)](#100-2024-06-28)
* [0.9.1 (2024-06-21)](#091-2024-06-21)
* [0.9.0 (2024-06-05)](#090-2024-06-05)
* [0.8.0 (2024-03-08)](#080-2024-03-08)
* [0.2.2 (2024-01-30)](#022-2024-01-30)
* [0.2.1 (2023-12-20)](#021-2023-12-20)
* [0.2.0 (2023-12-07)](#020-2023-12-07)
* [0.1.0 (2023-03-15)](#010-2023-03-15)
* [0.0.4 (2022-12-02)](#004-2022-12-02)
* [0.0.3 (2021-07-30)](#003-2021-07-30)
* [0.0.2 (2021-06-28)](#002-2021-06-28)
* [0.0.1 (2021-04-30)](#001-2021-04-30)

### 1.0.0 (2024-06-28)

* Reworked parsing logic, allowing alternatives for operations and other definitions to co-exist in the KSML language
specification. This allows for better syntax checking in IDEs.
* Lots of small fixes and completion modifications.

### 0.9.1 (2024-06-21)

* Fix failing test in GitHub Actions during release
* Unified build workflows

### 0.9.0 (2024-06-05)

* Collectable metrics
* New topology test suite
* Python context hardening
* Improved handling of Kafka tombstones
* Added flexibility to producers (single shot, n-shot, or user condition-based)
* JSON Logging support
* Bumped GraalVM to 23.1.2
* Bumped several dependency versions
* Several fixes and security updates

### 0.8.0 (2024-03-08)

* Reworked all parsing logic, to allow for exporting the JSON schema of the KSML specification:
* docs/specification.md is now derived from internal parser logic, guaranteeing consistency and completeness.
* examples/ksml.json contains the JSON schema, which can be loaded into IDEs for syntax validation and completion.
* docs/specification.md is now derived from internal parser logic, guaranteeing consistency and completeness.
* examples/ksml.json contains the JSON schema, which can be loaded into IDEs for syntax validation and completion.
* Improved schema handling:
* Better compatibility checking between schema fields.
* Better compatibility checking between schema fields.
* Improved support for state stores:
* Update to state store typing and handling.
* Manual state stores can be defined and referenced in pipelines.
* Manual state stores are also available in Python functions.
* State stores can be used 'side-effect-free' (eg. no AVRO schema registration)
* Update to state store typing and handling.
* Manual state stores can be defined and referenced in pipelines.
* Manual state stores are also available in Python functions.
* State stores can be used 'side-effect-free' (e.g. no AVRO schema registration)
* Python function improvements:
* Automatic variable assignment for state stores.
* Every Python function can use a Java Logger, integrating Python output with KSML log output.
* Type inference in situations where parameters or result types can be derived from the context.
* Automatic variable assignment for state stores.
* Every Python function can use a Java Logger, integrating Python output with KSML log output.
* Type inference in situations where parameters or result types can be derived from the context.
* Lots of small language updates:
* Improve readability for store types, filter operations and windowing operations
* Introduction of the "as" operation, which allows for pipeline referencing and chaining.
* Improve readability for store types, filter operations and windowing operations
* Introduction of the "as" operation, which allows for pipeline referencing and chaining.
* Better data type handling:
* Separation of data types and KSML core, allowing for easier addition of new data types in the future.
* Automatic conversion of data types, removing common pipeline failure scenarios.
* New implementation for CSV handling.
* Separation of data types and KSML core, allowing for easier addition of new data types in the future.
* Automatic conversion of data types, removing common pipeline failure scenarios.
* New implementation for CSV handling.
* Merged the different runners into a single runner.
* KSML definitions can now include both producers (data generators) and pipelines (Kafka Streams topologies).
* Removal of Kafka and Axual backend distinctions.
* KSML definitions can now include both producers (data generators) and pipelines (Kafka Streams topologies).
* Removal of Kafka and Axual backend distinctions.
* Configuration file updates, allowing for running multiple definitions in a single runner (each in its own namespace).
* Examples updated to reflect the latest definition format.
* Documentation updated.
@@ -82,7 +107,7 @@
**Changes:**

* Added XML/SOAP support
* Added datagenerator
* Added data generator
* Added Automatic Type Conversion
* Added Schema Support for XML, Avro, JSON, Schema
* Added Basic Error Handling
@@ -102,7 +127,7 @@
* Bug fix for windowed objects
* Store improvements
* Support Liberica NIK
* Switch from Travis CI to Github workflow
* Switch from Travis CI to GitHub workflow
* Build snapshot Docker image on pull request merged

### 0.0.3 (2021-07-30)