diff --git a/content/telegraf/v1/data_formats/input/_index.md b/content/telegraf/v1/data_formats/input/_index.md
deleted file mode 100644
index dfefeefda2..0000000000
--- a/content/telegraf/v1/data_formats/input/_index.md
+++ /dev/null
@@ -1,45 +0,0 @@
---
title: Telegraf input data formats
list_title: Input data formats
description: Telegraf supports parsing input data formats into Telegraf metrics.
menu:
  telegraf_v1_ref:
    name: Input data formats
    weight: 1
    parent: Data formats
---

Telegraf [input plugins](/telegraf/v1/plugins/inputs/) consume data in one or more data formats and
parse the data into Telegraf [metrics][metrics].
Many input plugins use configurable parsers for parsing data formats into metrics.
This allows input plugins such as the [`kafka_consumer` input plugin](/telegraf/v1/plugins/#input-kafka_consumer)
to consume and process different data formats, such as InfluxDB line
protocol or JSON.
Telegraf supports the following input **data formats**:

{{< children >}}

Any input plugin containing the `data_format` option can use it to select the
desired parser:

```toml
[[inputs.exec]]
  ## Commands array
  commands = ["/tmp/test.sh", "/usr/bin/mycollector --foo=bar"]

  ## measurement name suffix (for separating different commands)
  name_suffix = "_mycollector"

  ## Data format to consume.
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  data_format = "json_v2"
```

## Input parser plugins

When you specify a `data_format` in an [input plugin](/telegraf/v1/plugins/inputs/) configuration that supports it, the input plugin uses the associated [parser plugin](https://github.com/influxdata/telegraf/tree/master/plugins/parsers) to convert data from its source format into Telegraf metrics.
Many parser plugins provide additional configuration options for specifying details about your data schema and how it should map to fields in Telegraf metrics.

[metrics]: /telegraf/v1/metrics/
diff --git a/content/telegraf/v1/data_formats/input/avro.md b/content/telegraf/v1/data_formats/input/avro.md
deleted file mode 100644
index 0b882666a8..0000000000
--- a/content/telegraf/v1/data_formats/input/avro.md
+++ /dev/null
@@ -1,105 +0,0 @@
---
title: Avro input data format
list_title: Avro
description: Use the `avro` input data format to parse Avro binary or JSON data into Telegraf metrics.
menu:
  telegraf_v1_ref:
    name: Avro
    weight: 10
    parent: Input data formats
metadata: [Avro Parser Plugin]
---

Use the `avro` input data format to parse binary or JSON [Avro](https://avro.apache.org/) message data into Telegraf metrics.

## Wire format

Avro messages should conform to the Confluent [wire format](https://docs.confluent.io/platform/current/schema-registry/fundamentals/serdes-develop/index.html#wire-format) using the following byte-mapping:

| Bytes | Area       | Description                                                           |
| ----- | ---------- | --------------------------------------------------------------------- |
| 0     | Magic Byte | Confluent serialization format version number; currently always `0`.  |
| 1-4   | Schema ID  | 4-byte schema ID as returned by Schema Registry.                       |
| 5-    | Data       | Serialized data.                                                       |
{{% caption %}}
Source: [Confluent Documentation](https://docs.confluent.io/platform/current/schema-registry/fundamentals/serdes-develop/index.html#wire-format)
{{% /caption %}}

For more information about Avro schema and encodings, see the [specification](https://avro.apache.org/docs/current/specification/) in the Apache Avro documentation.

## Configuration

```toml
[[inputs.kafka_consumer]]
  ## Kafka brokers.
  brokers = ["localhost:9092"]

  ## Topics to consume.
  topics = ["telegraf"]

  ## Maximum length of a message to consume, in bytes (default 0/unlimited);
  ## larger messages are dropped
  max_message_len = 1000000

  ## Avro data format settings
  data_format = "avro"

  ## Avro message format
  ## Supported values are "binary" (default) and "json"
  # avro_format = "binary"

  ## URL of the schema registry; exactly one of schema registry and
  ## schema must be set
  avro_schema_registry = "http://localhost:8081"

  ## Schema string; exactly one of schema registry and schema must be set
  #avro_schema = '''
  #  {
  #    "type":"record",
  #    "name":"Value",
  #    "namespace":"com.example",
  #    "fields":[
  #      {
  #        "name":"tag",
  #        "type":"string"
  #      },
  #      {
  #        "name":"field",
  #        "type":"long"
  #      },
  #      {
  #        "name":"timestamp",
  #        "type":"long"
  #      }
  #    ]
  #  }
  #'''

  ## Measurement string; if not set, determine measurement name from
  ## schema (as "<namespace>.<name>")
  # avro_measurement = "ratings"

  ## Avro fields to be used as tags; optional.
  # avro_tags = ["CHANNEL", "CLUB_STATUS"]

  ## Avro fields to be used as fields; if empty, any Avro fields
  ## detected from the schema, not used as tags, will be used as
  ## measurement fields.
  # avro_fields = ["STARS"]

  ## Avro fields to be used as timestamp; if empty, current time will
  ## be used for the measurement timestamp.
  # avro_timestamp = ""
  ## If avro_timestamp is specified, avro_timestamp_format must be set
  ## to one of 'unix', 'unix_ms', 'unix_us', or 'unix_ns'
  # avro_timestamp_format = "unix"

  ## Used to separate parts of array structures. The default
  ## is the empty string, so a=["a", "b"] becomes a0="a", a1="b".
  ## If this were set to "_", then it would be a_0="a", a_1="b".
  # avro_field_separator = "_"

  ## Default values for given tags: optional
  # tags = { "application": "hermes", "region": "central" }
```
diff --git a/content/telegraf/v1/data_formats/input/binary.md b/content/telegraf/v1/data_formats/input/binary.md
deleted file mode 100644
index 641769a9ae..0000000000
--- a/content/telegraf/v1/data_formats/input/binary.md
+++ /dev/null
@@ -1,355 +0,0 @@
---
title: Binary input data format
list_title: Binary
description:
  Use the `binary` input data format with user-specified configurations to parse binary protocols into Telegraf metrics.
menu:
  telegraf_v1_ref:
    name: Binary
    weight: 10
    parent: Input data formats
metadata: [Binary Parser Plugin]
---

Use the `binary` input data format with user-specified configurations to parse binary protocols into Telegraf metrics.

## Configuration

```toml
[[inputs.file]]
  files = ["example.bin"]

  ## Data format to consume.
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  data_format = "binary"

  ## Do not error-out if none of the filter expressions below matches.
  # allow_no_match = false

  ## Specify the endianness of the data.
  ## Available values are "be" (big-endian), "le" (little-endian) and "host",
  ## where "host" means the same endianness as the machine running Telegraf.
  # endianess = "host"

  ## Interpret input as string containing hex-encoded data.
  # hex_encoding = false

  ## Multiple parsing sections are allowed
  [[inputs.file.binary]]
    ## Optional: Metric (measurement) name to use if not extracted from the data.
    # metric_name = "my_name"

    ## Definition of the message format and the extracted data.
    ## Please note that you need to define all elements of the data in the
    ## correct order with the correct length as the data is parsed in the order
    ## given.
    ## An entry can have the following properties:
    ##  name        --  Name of the element (e.g. field or tag). Can be omitted
    ##                  for special assignments (i.e. time & measurement) or if
    ##                  entry is omitted.
    ##  type        --  Data-type of the entry. Can be "int8/16/32/64", "uint8/16/32/64",
    ##                  "float32/64", "bool" and "string".
    ##                  In case of time, this can be any of "unix" (default), "unix_ms", "unix_us",
    ##                  "unix_ns" or a valid Golang time format.
    ##  bits        --  Length in bits for this entry. If omitted, the length derived from
    ##                  the "type" property will be used. For "time" 64-bit will be used
    ##                  as default.
    ##  assignment  --  Assignment of the gathered data. Can be "measurement", "time",
    ##                  "field" or "tag". If omitted "field" is assumed.
    ##  omit        --  Omit the given data. If true, the data is skipped and not added
    ##                  to the metric. Omitted entries only need a length definition
    ##                  via "bits" or "type".
    ##  terminator  --  Terminator for dynamic-length strings. Only used for "string" type.
    ##                  Valid values are "fixed" (fixed length string given by "bits"),
    ##                  "null" (null-terminated string) or a character sequence specified
    ##                  as HEX values (e.g. "0x0D0A"). Defaults to "fixed" for strings.
    ##  timezone    --  Timezone of "time" entries. Only applies to "time" assignments.
    ##                  Can be "utc", "local" or any valid Golang timezone (e.g. "Europe/Berlin")
    entries = [
      { type = "string", assignment = "measurement", terminator = "null" },
      { name = "address", type = "uint16", assignment = "tag" },
      { name = "value", type = "float64" },
      { type = "unix", assignment = "time" },
    ]

    ## Optional: Filter evaluated before applying the configuration.
    ## This option can be used to manage multiple configurations specific to
    ## a certain message type. If no filter is given, the configuration is applied.
    # [inputs.file.binary.filter]
    #   ## Filter message by the exact length in bytes (default: N/A).
    #   # length = 0
    #   ## Filter the message by a minimum length in bytes.
    #   ## Messages longer or of equal length will pass.
    #   # length_min = 0
    #   ## List of data parts to match.
    #   ## Only if all selected parts match, the configuration will be
    #   ## applied. The "offset" is the start of the data to match in bits,
    #   ## "bits" is the length in bits and "match" is the value to match
    #   ## against. Non-byte boundaries are supported, data is always right-aligned.
    #   selection = [
    #     { offset = 0, bits = 8, match = "0x1F" },
    #   ]
```

In this configuration mode, you explicitly specify the fields and tags
to parse from your data.

A configuration can contain multiple `binary` subsections.
For example, the `file` plugin can process binary data multiple times.
This can be useful (together with _filters_) to handle different message types.
**Note**: The `filter` section needs to be placed _after_ the `entries`
definitions; otherwise, the entries will be assigned to the filter section.

### General options and remarks

#### `allow_no_match` (optional)

By specifying `allow_no_match` you allow the parser to silently ignore data
that does not match _any_ given configuration filter. This can be useful if
you only want to collect a subset of the available messages.

#### `endianness` (optional)

This specifies the endianness of the data. If not specified, the parser
falls back to the "host" endianness, assuming that the message and the
Telegraf machine share the same endianness.
Alternatively, you can explicitly specify big-endian format (`"be"`) or
little-endian format (`"le"`).
Note that the option is spelled `endianess` in the plugin configuration.

#### `hex_encoding` (optional)

If `true`, the input data is interpreted as a string containing hex-encoded
data like `C0 C7 21 A9`. The value is _case insensitive_ and can handle spaces;
however, prefixes like `0x` or `x` are _not_ allowed.

### Non-byte aligned value extraction

In both `filter` and `entries` definitions, values can be extracted at non-byte
boundaries. You can, for example, extract 3 bits starting at bit-offset 8. In those
cases, the result is masked and shifted such that the resulting byte value is
_right_ aligned. If your 3 bits are `101`, the resulting byte value is
`0x05`.

This is especially important when specifying the `match` value in the filter
section.

### Entries definitions

The `entries` array specifies how to parse the message into the measurement name,
timestamp, tags, and fields.

#### `measurement` specification

When setting the `assignment` to `"measurement"`, the extracted value
is used as the metric name, overriding other specifications.
The `type` setting is assumed to be `"string"` and can be omitted, similar
to the `name` option. See [`string` type handling](#string-type-handling)
for details and further options.

#### `time` specification

When setting the `assignment` to `"time"`, the extracted value
is used as the timestamp of the metric. The default is the _current
time_ for all created metrics.

The `type` setting specifies the time format of included timestamps.
Use one of the following:

- `unix` _(default)_
- `unix_ms`
- `unix_us`
- `unix_ns`
- [Go "reference time"][time const]. Consult the Go [time][time parse]
  package for details and additional examples on how to set the time format.

For the `unix` format and derivatives, the underlying value is assumed
to be a 64-bit integer. The `bits` setting can be used to specify other
length settings. All other time formats assume a fixed-length `string`
value to be extracted. The length of the string is automatically
determined using the format setting in `type`.

The `timezone` setting interprets the extracted time in the given
timezone. By default, the time is interpreted as `utc`.
Other valid values are `local` (the local timezone configured for
the machine) or a valid timezone specification (for example, `Europe/Berlin`).

#### `tag` specification

When setting the `assignment` to `"tag"`, the extracted value
is used as a tag. The `name` setting is the name of the tag
and the `type` defaults to `string`. When specifying other types,
the extracted value is first interpreted as the given type and
then converted to `string`.

The `bits` setting can be used to specify the length of the data to
extract and is required for fixed-length `string` types.
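For example, a minimal `entries` sketch (the names are hypothetical) that stores a one-byte status code as a tag might look like the following; the parser reads the value as `uint8` and then converts it to a string for the tag value:

```toml
entries = [
  ## One-byte status code; parsed as uint8, then converted to a string tag value
  { name = "status", type = "uint8", assignment = "tag" },
  ## 32-bit measurement value stored as a field (the default assignment)
  { name = "value", type = "float32" },
]
```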
#### `field` specification

When setting the `assignment` to `"field"` or omitting the `assignment`
setting, the extracted value is used as a field. The `name` setting
is used as the name of the field and the `type` as the type of the field value.

The `bits` setting can be used to specify the length of the data to
extract. By default, the length corresponding to `type` is used.
Please see the [string](#string-type-handling) and [bool](#bool-type-handling)
specific sections when using those types.

#### `string` type handling

Strings are assumed to be fixed-length strings by default. In this case, the
`bits` setting is mandatory to specify the length of the string in _bits_.

To handle dynamic strings, the `terminator` setting can be used to specify
characters to terminate the string. The two named options, `fixed` and `null`,
specify fixed-length and null-terminated strings, respectively.
Any other setting is interpreted as a hexadecimal sequence of bytes
matching the end of the string. The termination sequence is removed from
the result.

#### `bool` type handling

By default, `bool` types are assumed to be _one_ bit in length. You can
specify any other length by using the `bits` setting.
When interpreting values as booleans, any zero value is `false` and
any non-zero value is `true`.

#### Omitting data

Parts of the data can be omitted by setting `omit = true`. In this case,
you only need to specify the length of the chunk to omit by using either
the `type` or `bits` setting. All other options can be skipped.

### Filter definitions

Filters can be used to match the length or the content of the data against
a specified reference. See the [examples section](#examples) for details.
You can also check multiple parts of the message by specifying multiple
`selection` entries for a filter. Each `selection` is then matched separately.
All have to match to apply the configuration.

#### `length` and `length_min` options

Using the `length` option, the filter checks if the parsed data has
exactly the given number of _bytes_. Otherwise, the configuration is not applied.
Similarly, for `length_min` the data has to have _at least_ the given number
of _bytes_ to generate a match.

#### `selection` list

Selections can be used with or without length constraints to match the content
of the data. Here, the `offset` and `bits` properties specify the start
and length of the data to check. Both values are in _bits_, allowing for non-byte
aligned value extraction. The extracted data is checked against the
given `match` value specified in HEX.

If multiple `selection` entries are specified, _all_ of the selections must
match for the configuration to get applied.
## Examples

In the following example, we use a binary protocol with three different messages
in little-endian format.

### Message A definition

```text
+--------+------+------+--------+--------+------------+--------------------+--------------------+
|   ID   | type | len  |  addr  | count  |  failure   |       value        |     timestamp      |
+--------+------+------+--------+--------+------------+--------------------+--------------------+
| 0x0201 | 0x0A | 0x18 | 0x7F01 | 0x2A00 | 0x00000000 | 0x6F1283C0CA210940 | 0x10D4DF6200000000 |
+--------+------+------+--------+--------+------------+--------------------+--------------------+
```

### Message B definition

```text
+--------+------+------+------------+
|   ID   | type | len  |   value    |
+--------+------+------+------------+
| 0x0201 | 0x0B | 0x04 | 0xDEADC0DE |
+--------+------+------+------------+
```

### Message C definition

```text
+--------+------+------+------------+------------+--------------------+
|   ID   | type | len  |  value x   |  value y   |     timestamp      |
+--------+------+------+------------+------------+--------------------+
| 0x0201 | 0x0C | 0x10 | 0x4DF82D40 | 0x5F305C08 | 0x10D4DF6200000000 |
+--------+------+------+------------+------------+--------------------+
```

All messages consist of a 4-byte header containing the _message type_
in the 3rd byte and a message-specific body. To parse those messages,
you can use the following configuration:

```toml
[[inputs.file]]
  files = ["messageA.bin", "messageB.bin", "messageC.bin"]
  data_format = "binary"
  endianess = "le"

  [[inputs.file.binary]]
    metric_name = "messageA"

    entries = [
      { bits = 32, omit = true },
      { name = "address", type = "uint16", assignment = "tag" },
      { name = "count", type = "int16" },
      { name = "failure", type = "bool", bits = 32, assignment = "tag" },
      { name = "value", type = "float64" },
      { type = "unix", assignment = "time" },
    ]

    [inputs.file.binary.filter]
      selection = [{ offset = 16, bits = 8, match = "0x0A" }]

  [[inputs.file.binary]]
    metric_name = "messageB"

    entries = [
      { bits = 32, omit = true },
      { name = "value", type = "uint32" },
    ]

    [inputs.file.binary.filter]
      selection = [{ offset = 16, bits = 8, match = "0x0B" }]

  [[inputs.file.binary]]
    metric_name = "messageC"

    entries = [
      { bits = 32, omit = true },
      { name = "x", type = "float32" },
      { name = "y", type = "float32" },
      { type = "unix", assignment = "time" },
    ]

    [inputs.file.binary.filter]
      selection = [{ offset = 16, bits = 8, match = "0x0C" }]
```

The above configuration has one `[[inputs.file.binary]]` section per
message type and uses a filter in each of those sections to apply
the correct configuration by comparing the 3rd byte (containing
the message type). This results in the following output:

```text
messageA,address=383,failure=false count=42i,value=3.1415 1658835984000000000
messageB value=3737169374i 1658847037000000000
messageC x=2.718280076980591,y=0.0000000000000000000000000000000006626070178575745 1658835984000000000
```

`messageB` uses the parsing time as timestamp due to missing
information in the data. The other two metrics use the timestamp
derived from the data.
[time const]: https://golang.org/pkg/time/#pkg-constants
[time parse]: https://golang.org/pkg/time/#Parse
diff --git a/content/telegraf/v1/data_formats/input/collectd.md b/content/telegraf/v1/data_formats/input/collectd.md
deleted file mode 100644
index 016c58acae..0000000000
--- a/content/telegraf/v1/data_formats/input/collectd.md
+++ /dev/null
@@ -1,69 +0,0 @@
---
title: Collectd input data format
list_title: Collectd
description: Use the `collectd` input data format to parse collectd network binary protocol to create tags for host, instance, type, and type instance.
menu:
  telegraf_v1_ref:
    name: collectd
    weight: 10
    parent: Input data formats
metadata: [Collectd Parser Plugin]
---

Use the `collectd` input data format to parse [collectd binary network protocol](https://collectd.org/wiki/index.php/Binary_protocol) data into Telegraf metrics.
Tags are created for host, instance, type, and type instance.
All collectd values are added as float64 fields.

You can control the cryptographic settings with parser options. Create an
authentication file and set `collectd_auth_file` to the path of the file, then
set the desired security level in `collectd_security_level`.

For additional information, including client setup, see
[Cryptographic setup][1] in the collectd documentation.

You can also change the path to the typesdb or add additional typesdb using
`collectd_typesdb`.

[1]: https://collectd.org/wiki/index.php/Networking_introduction#Cryptographic_setup

## Configuration

```toml
[[inputs.socket_listener]]
  service_address = "udp://:25826"

  ## Data format to consume.
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  data_format = "collectd"

  ## Authentication file for cryptographic security levels
  collectd_auth_file = "/etc/collectd/auth_file"
  ## One of none (default), sign, or encrypt
  collectd_security_level = "encrypt"
  ## Path to TypesDB specifications
  collectd_typesdb = ["/usr/share/collectd/types.db"]

  ## Multi-value plugins can be handled two ways.
  ## "split" will parse and store the multi-value plugin data into separate measurements
  ## "join" will parse and store the multi-value plugin as a single multi-value measurement.
  ## "split" is the default behavior for backward compatibility with previous versions of InfluxDB.
- collectd_parse_multivalue = "split" -``` - -## Example Output - -```text -memory,type=memory,type_instance=buffered value=2520051712 1560455990829955922 -memory,type=memory,type_instance=used value=3710791680 1560455990829955922 -memory,type=memory,type_instance=buffered value=2520047616 1560455980830417318 -memory,type=memory,type_instance=cached value=9472626688 1560455980830417318 -memory,type=memory,type_instance=slab_recl value=2088894464 1560455980830417318 -memory,type=memory,type_instance=slab_unrecl value=146984960 1560455980830417318 -memory,type=memory,type_instance=free value=2978258944 1560455980830417318 -memory,type=memory,type_instance=used value=3707047936 1560455980830417318 -``` diff --git a/content/telegraf/v1/data_formats/input/csv.md b/content/telegraf/v1/data_formats/input/csv.md deleted file mode 100644 index 97217ee673..0000000000 --- a/content/telegraf/v1/data_formats/input/csv.md +++ /dev/null @@ -1,599 +0,0 @@ ---- -title: CSV input data format -list_title: CSV -description: Use the `csv` input data format to parse comma-separated values into Telegraf metrics. -menu: - telegraf_v1_ref: - name: CSV - weight: 10 - parent: Input data formats -metadata: [CSV parser plugin] ---- - -Use the `csv` input data format to parse comma-separated values into Telegraf metrics. - -## Configuration - -```toml -[[inputs.file]] - files = ["example"] - - ## The data format to consume. - ## Type: string - ## Each data format has its own unique set of configuration options. - ## For more information about input data formats and options, - ## see https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md - data_format = "csv" - - ## Specifies the number of rows to treat as the header. - ## Type: integer - ## Default: 0 - ## The value can be 0 or greater. - ## If `0`, doesn't use a header; the parser treats all rows as data and uses the names specified in `csv_column_names`. - ## If `1`, uses the first row as the header. - ## If greater than `1`, concatenates that number of values for each column. - ## Values specified in `csv_column_names` override column names in the header. - csv_header_row_count = 0 - - ## Specifies custom names for columns. - ## Type: []string - ## Default: [] - ## Specify names in order by column; unnamed columns are ignored by the parser. - ## Required if `csv_header_row_count` is set to `0`. - csv_column_names = [] - - ## Specifies data types for columns. - ## Type: []string{"int", "float", "bool", "string"} - ## Default: Tries to convert each column to one of the possible types, in the following order: "int", "float", "bool", "string". - ## Possible values: "int", "float", "bool", "string". - ## Specify types in order by column (for example, `["string", "int", "float"]`). - csv_column_types = [] - - ## Specifies the number of rows to skip before looking for metadata and header information. - ## Default: 0 - csv_skip_rows = 0 - - ## Specifies the number of rows to parse as metadata (before looking for header information). - ## Type: integer - ## Default: 0; no metadata rows to parse. - ## If set, parses the rows using the characters specified in `csv_metadata_separators`, and then adds the - ## parsed key-value pairs as tags in the data. - ## To convert the tags to fields, use the converter processor. - csv_metadata_rows = 0 - - ## Specifies metadata separators, in order of precedence, for parsing metadata rows. - ## Type: []string - ## At least one separator is required if `csv_metadata_rows` is set. 
- ## The specified values set the order of precedence for separators used to parse `csv_metadata_rows` into key-value pairs. - ## Separators are case-sensitive. - csv_metadata_separators = [":", "="] - - ## Specifies a set of characters to trim from metadata rows. - ## Type: string - ## Default: empty; the parser doesn't trim metadata rows. - ## Trim characters are case sensitive. - csv_metadata_trim_set = "" - - ## Specifies the number of columns to skip in header and data rows. - ## Type: integer - ## Default: 0; no columns are skipped - csv_skip_columns = 0 - - ## Specifies the separator for columns in the CSV. - ## Type: string - ## Default: a comma (`,`) - ## If you specify an invalid delimiter (for example, `"\u0000"`), - ## the parser converts commas to `"\ufffd"` and converts invalid delimiters - ## to commas, parses the data, and then reverts invalid characters and commas - ## to their original values. - csv_delimiter = "," - - ## Specifies the character used to indicate a comment row. - ## Type: string - ## Default: empty; no rows are treated as comments - ## The parser skips rows that begin with the specified character. - csv_comment = "" - - ## Specifies whether to remove leading whitespace from fields. - ## Type: boolean - ## Default: false - csv_trim_space = false - - ## Specifies columns (by name) to use as tags. - ## Type: []string - ## Default: empty - ## Columns not specified as tags or measurement name are considered fields. - csv_tag_columns = [] - - ## Specifies whether column tags overwrite metadata and default tags. - ## Type: boolean - ## Default: false - ## If true, the column tag value takes precedence over metadata - ## or default tags that have the same name. - csv_tag_overwrite = false - - ## Specifies the CSV column to use for the measurement name. - ## Type: string - ## Default: empty; uses the input plugin name for the measurement name. - ## If set, the measurement name is extracted from values in the specified - ## column and the column isn't included as a field. - csv_measurement_column = "" - - ## Specifies the CSV column to use for the timestamp. - ## Type: string - ## Default: empty; uses the current system time as the timestamp in metrics - ## If set, the parser extracts time values from the specified column - ## to use as timestamps in metrics, and the column isn't included - ## as a field in metrics. - ## If set, you must also specify a value for `csv_timestamp_format`. - ## For more information, see [timestamps](/telegraf/v1/data_formats/input/csv/#timestamps). - csv_timestamp_column = "" - - ## Specifies the timestamp format for values extracted from `csv_timestamp_column`. - ## Type: string - ## Possible values: "unix", "unix_ms", "unix_us", "unix_ns", the Go reference time in one of the predefined layouts - ## Default: empty - ## Required if `csv_timestamp_column` is specified. - ## For more information, see [timestamps](/telegraf/v1/data_formats/input/csv/#timestamps). - csv_timestamp_format = "" - - ## Specifies the time zone to use and outputs location-specific timestamps in metrics. - ## Only used if `csv_timestamp_format` is the Go reference time in one of the - ## predefined layouts; unix formats are in UTC. - ## Type: string - ## Default: empty - ## Possible values: a time zone name in TZ syntax. For a list of names, see https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List. - csv_timezone = "" - ## For more information, see [timestamps](/telegraf/v1/data_formats/input/csv/#timestamps). 
- - ## Specifies values to skip--for example, an empty string (`""`). - ## Type: []string - ## Default: empty - ## The parser skips field values that match any of the specified values. - csv_skip_values = [] - - ## Specifies whether to skip CSV lines that can't be parsed. - ## Type: boolean - ## Default: false - csv_skip_errors = false - - ## Specifies whether to reset the parser after each call. - ## Type: string - ## Default: "none" - ## Possible values: - ## - "none": Do not reset the parser. - ## - "always": Reset the parser's state after reading each file in the gather - ## cycle. If parsing by line, the setting is ignored. - ## Resetting the parser state after parsing each file is helpful when reading - ## full CSV structures that include headers or metadata. - csv_reset_mode = "none" - ``` - -## Metrics - -With the default configuration, the CSV data format parser creates one metric -for each CSV row, and adds CSV columns as fields in the metric. -A field's data type is automatically determined from its value (unless explicitly defined with `csv_column_types`). - -Data format configuration options let you customize how the parser handles -specific CSV rows, columns, and data types. - -[Metric filtering](/telegraf/v1/configuration/#metric-filtering) and [aggregator and processor plugins](/telegraf/v1/configure_plugins/aggregator_processor/) provide additional data transformation options--for example: -- Use metric filtering to skip columns and rows. -- Use the [converter processor](https://github.com/influxdata/telegraf/tree/master/plugins/processors/converter/) to convert parsed metadata from tags to fields. - -## Timestamps - -Every metric has a timestamp--a date and time associated with the fields. -The default timestamp for created metrics is the _current time_ in UTC. - -To use extracted values from the CSV as timestamps for metrics, specify -the `csv_timestamp_column` and `csv_timestamp_format` options. - -### csv_timestamp_column - -The `csv_timestamp_column` option specifies the key (column name) in the CSV data -that contains the time value to extract and use as the timestamp in metrics. - -A unix time value may be one of the following data types: - -- int64 -- float64 -- string - -If you specify a [Go format](https://go.dev/src/time/format.go) for `csv_timestamp_format`, -values in your timestamp column must be strings. - -When using the [`"unix"` format](#csv_timestamp_format), an optional fractional component is allowed. -Other unix time formats, such as `"unix_ms"`, cannot have a fractional component. - -### csv_timestamp_format - -If specifying `csv_timestamp_column`, you must also specify the format of timestamps in the column. -To specify the format, set `csv_timestamp_format` to one of the following values: - -- `"unix"` -- `"unix_ms"` -- `"unix_us"` -- `"unix_ns"` -- a predefined layout from Go [`time` constants](https://pkg.go.dev/time#pkg-constants) using the - Go _reference time_--for example, `"Mon Jan 2 15:04:05 MST 2006"` (the `UnixDate` format string). - -For more information about time formats, see the following: - -- Unix time documentation -- Go [time][time parse] package documentation - -### Time zone - -Telegraf outputs timestamps in UTC. - -To parse location-aware timestamps in your data, -specify a [`csv_timestamp_format`](#csv_timestamp_format) -that contains time zone information. - -If timestamps in the `csv_timestamp_column` contain a time zone offset, the parser uses the offset to calculate the timestamp in UTC. 
- -If `csv_timestamp_format` and your timestamp data contain a time zone abbreviation, then the parser tries to resolve the abbreviation to a location in the [IANA Time Zone Database](https://www.iana.org/time-zones) and return a UTC offset for that location. -To set the location that the parser should use when resolving time zone abbreviations, specify a value for `csv_timezone`, following the TZ syntax in the [Internet Assigned Numbers Authority time zone database](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List). - -{{% warn %}} -Prior to Telegraf v1.27, the Telegraf parser ignored abbreviated time zones (for example, "EST") in parsed time values, and used UTC for the timestamp location. -{{% /warn %}} - -## Examples - -### Extract timestamps from a time column using RFC3339 format - -Configuration: - -```toml -[agent] - omit_hostname = true -[[inputs.file]] - files = ["example"] - data_format = "csv" - csv_header_row_count = 1 - csv_measurement_column = "measurement" - csv_timestamp_column = "time" - csv_timestamp_format = "2006-01-02T15:04:05Z07:00" -[[outputs.file]] - files = ["metrics.out"] - influx_sort_fields = true -``` - -Input: - -```csv -measurement,cpu,time_user,time_system,time_idle,time -cpu,cpu0,42,42,42,2018-09-13T13:03:28Z -``` - - - -Output: - - - -``` -cpu cpu="cpu0",time_idle=42i,time_system=42i,time_user=42i 1536843808000000000 -``` - -### Parse timestamp abbreviations - -The following example specifies `csv_timezone` for resolving an associated time zone (`EST`) in the input data: - -Configuration: - -```toml -[agent] - omit_hostname = true -[[inputs.file]] - files = ["example"] - data_format = "csv" - csv_header_row_count = 1 - csv_measurement_column = "measurement" - csv_timestamp_column = "time" - csv_timestamp_format = "Mon, 02 Jan 2006 15:04:05 MST" - csv_timezone = "America/New_York" -[[outputs.file]] - files = ["metrics.out"] - influx_sort_fields = true -``` - -Input: - -```csv -measurement,cpu,time_user,time_system,time_idle,time -cpu,cpu1,42,42,42,"Mon, 02 Jan 2006 15:04:05 EST" -cpu,cpu1,42,42,42,"Mon, 02 Jan 2006 15:04:05 GMT" -``` - - - -The parser resolves the `GMT` and `EST` abbreviations and outputs the following: - - - -``` -cpu cpu="cpu1",time_idle=42i,time_system=42i,time_user=42i 1136232245000000000 -cpu cpu="cpu1",time_idle=42i,time_system=42i,time_user=42i 1136214245000000000 -``` - -The timestamps represent the following dates, respectively: - -```text -2006-01-02 20:04:05 -2006-01-02 15:04:05 -``` - -### Parse metadata into tags - -Configuration: - -```toml -[agent] - omit_hostname = true -[[inputs.file]] - files = ["example"] - data_format = "csv" - csv_measurement_column = "measurement" - csv_metadata_rows = 2 - csv_metadata_separators = [":", "="] - csv_metadata_trim_set = "# " - csv_header_row_count = 1 - csv_tag_columns = ["Version","cpu"] - csv_timestamp_column = "time" - csv_timestamp_format = "2006-01-02T15:04:05Z07:00" -[[outputs.file]] - files = ["metrics.out"] - influx_sort_fields = true -``` - -Input: - -```csv -# Version=1.1 -# File Created: 2021-11-17T07:02:45+10:00 -Version,measurement,cpu,time_user,time_system,time_idle,time -1.2,cpu,cpu0,42,42,42,2018-09-13T13:03:28Z -``` - - - -Output: - - - -``` -cpu,File\ Created=2021-11-17T07:02:45+10:00,Version=1.1,cpu=cpu0 time_idle=42i,time_system=42i,time_user=42i 1536843808000000000 -``` - -### Allow tag column values to overwrite parsed metadata - -Configuration: - -```toml -[agent] - omit_hostname = true -[[inputs.file]] - files = ["example"] - data_format = "csv" 
- csv_measurement_column = "measurement" - csv_metadata_rows = 2 - csv_metadata_separators = [":", "="] - csv_metadata_trim_set = " #" - csv_header_row_count = 1 - csv_tag_columns = ["Version","cpu"] - csv_tag_overwrite = true - csv_timestamp_column = "time" - csv_timestamp_format = "2006-01-02T15:04:05Z07:00" -[[outputs.file]] - files = ["metrics.out"] - influx_sort_fields = true -``` - -Input: - -```csv -# Version=1.1 -# File Created: 2021-11-17T07:02:45+10:00 -Version,measurement,cpu,time_user,time_system,time_idle,time -1.2,cpu,cpu0,42,42,42,2018-09-13T13:03:28Z -``` - - - -Output: - - - -``` -cpu,File\ Created=2021-11-17T07:02:45+10:00,Version=1.2,cpu=cpu0 time_idle=42i,time_system=42i,time_user=42i 1536843808000000000 -``` - -### Combine multiple header rows - -Configuration: - -```toml -[agent] - omit_hostname = true -[[inputs.file]] - files = ["example"] - data_format = "csv" - csv_comment = "#" - csv_header_row_count = 2 - csv_measurement_column = "measurement" - csv_timestamp_column = "time" - csv_timestamp_format = "2006-01-02T15:04:05Z07:00" -[[outputs.file]] - ## Files to write to. - files = ["metrics.out"] - ## Use determinate ordering. - influx_sort_fields = true -``` - -Input: - -```csv -# Version=1.1 -# File Created: 2021-11-17T07:02:45+10:00 -Version,measurement,cpu,time,time,time,time -_system,,,_user,_system,_idle, -1.2,cpu,cpu0,42,42,42,2018-09-13T13:03:28Z -``` - - - -Output: - - - -``` -cpu Version_system=1.2,cpu="cpu0",time_idle=42i,time_system=42i,time_user=42i 1536843808000000000 -``` - -[time parse]: https://pkg.go.dev/time#Parse -[metric filtering]: /telegraf/v1/configuration/#metric-filtering diff --git a/content/telegraf/v1/data_formats/input/dropwizard.md b/content/telegraf/v1/data_formats/input/dropwizard.md deleted file mode 100644 index 49ac69e7ed..0000000000 --- a/content/telegraf/v1/data_formats/input/dropwizard.md +++ /dev/null @@ -1,187 +0,0 @@ ---- -title: Dropwizard input data format -list_title: Dropwizard -description: Use the `dropwizard` input data format to parse Dropwizard JSON representations into Telegraf metrics. -menu: - telegraf_v1_ref: - name: Dropwizard - weight: 10 - parent: Input data formats -metadata: [Dropwizard parser plugin] ---- - -Use the `dropwizard` input data format to parse the [JSON Dropwizard][dropwizard] -representation of a single dropwizard metric registry into Telegraf metrics. By default, tags are -parsed from metric names as if they were actual InfluxDB line protocol keys -(`measurement<,tag_set>`) which can be overridden by defining a custom [template -pattern][templates]. All field value types are supported, `string`, `number` and -`boolean`. - -[templates]: https://github.com/influxdata/telegraf/blob/master/docs/TEMPLATE_PATTERN.md -[dropwizard]: https://metrics.dropwizard.io/3.1.0/manual/json/ - -## Configuration - -```toml -[[inputs.file]] - files = ["example"] - - ## Data format to consume. - ## Each data format has its own unique set of configuration options, read - ## more about them here: - ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md - data_format = "dropwizard" - - ## Used by the templating engine to join matched values when cardinality is > 1 - separator = "_" - - ## Each template line requires a template pattern. It can have an optional - ## filter before the template and separated by spaces. It can also have optional extra - ## tags following the template. Multiple tags should be separated by commas and no spaces - ## similar to the line protocol format. 
There can be only one default template. - ## Templates support below format: - ## 1. filter + template - ## 2. filter + template + extra tag(s) - ## 3. filter + template with field key - ## 4. default template - ## By providing an empty template array, templating is disabled and measurements are parsed as InfluxDB line protocol keys (measurement<,tag_set>) - templates = [] - - ## You may use an appropriate [gjson path](https://github.com/tidwall/gjson#path-syntax) - ## to locate the metric registry within the JSON document - # dropwizard_metric_registry_path = "metrics" - - ## You may use an appropriate [gjson path](https://github.com/tidwall/gjson#path-syntax) - ## to locate the default time of the measurements within the JSON document - # dropwizard_time_path = "time" - # dropwizard_time_format = "2006-01-02T15:04:05Z07:00" - - ## You may use an appropriate [gjson path](https://github.com/tidwall/gjson#path-syntax) - ## to locate the tags map within the JSON document - # dropwizard_tags_path = "tags" - - ## You may even use tag paths per tag - # [inputs.exec.dropwizard_tag_paths] - # tag1 = "tags.tag1" - # tag2 = "tags.tag2" -``` - -## Examples - -A typical JSON of a dropwizard metric registry: - -```json -{ - "version": "3.0.0", - "counters" : { - "measurement,tag1=green" : { - "count" : 1 - } - }, - "meters" : { - "measurement" : { - "count" : 1, - "m15_rate" : 1.0, - "m1_rate" : 1.0, - "m5_rate" : 1.0, - "mean_rate" : 1.0, - "units" : "events/second" - } - }, - "gauges" : { - "measurement" : { - "value" : 1 - } - }, - "histograms" : { - "measurement" : { - "count" : 1, - "max" : 1.0, - "mean" : 1.0, - "min" : 1.0, - "p50" : 1.0, - "p75" : 1.0, - "p95" : 1.0, - "p98" : 1.0, - "p99" : 1.0, - "p999" : 1.0, - "stddev" : 1.0 - } - }, - "timers" : { - "measurement" : { - "count" : 1, - "max" : 1.0, - "mean" : 1.0, - "min" : 1.0, - "p50" : 1.0, - "p75" : 1.0, - "p95" : 1.0, - "p98" : 1.0, - "p99" : 1.0, - "p999" : 1.0, - "stddev" : 1.0, - "m15_rate" : 1.0, - "m1_rate" : 1.0, - "m5_rate" : 1.0, - "mean_rate" : 1.0, - "duration_units" : "seconds", - "rate_units" : "calls/second" - } - } -} -``` - -Would get translated into 4 different measurements: - -```text -measurement,metric_type=counter,tag1=green count=1 -measurement,metric_type=meter count=1,m15_rate=1.0,m1_rate=1.0,m5_rate=1.0,mean_rate=1.0 -measurement,metric_type=gauge value=1 -measurement,metric_type=histogram count=1,max=1.0,mean=1.0,min=1.0,p50=1.0,p75=1.0,p95=1.0,p98=1.0,p99=1.0,p999=1.0 -measurement,metric_type=timer count=1,max=1.0,mean=1.0,min=1.0,p50=1.0,p75=1.0,p95=1.0,p98=1.0,p99=1.0,p999=1.0,stddev=1.0,m15_rate=1.0,m1_rate=1.0,m5_rate=1.0,mean_rate=1.0 -``` - -You may also parse a dropwizard registry from any JSON document which contains a -dropwizard registry in some inner field. Eg. to parse the following JSON -document: - -```json -{ - "time" : "2017-02-22T14:33:03.662+02:00", - "tags" : { - "tag1" : "green", - "tag2" : "yellow" - }, - "metrics" : { - "counters" : { - "measurement" : { - "count" : 1 - } - }, - "meters" : {}, - "gauges" : {}, - "histograms" : {}, - "timers" : {} - } -} -``` - -and translate it into: - -```text -measurement,metric_type=counter,tag1=green,tag2=yellow count=1 1487766783662000000 -``` - -you simply need to use the following additional configuration properties: - -```toml -dropwizard_metric_registry_path = "metrics" -dropwizard_time_path = "time" -dropwizard_time_format = "2006-01-02T15:04:05Z07:00" -dropwizard_tags_path = "tags" -## tag paths per tag are supported too, eg. 
-#[inputs.yourinput.dropwizard_tag_paths] -# tag1 = "tags.tag1" -# tag2 = "tags.tag2" -``` diff --git a/content/telegraf/v1/data_formats/input/form_urlencoded.md b/content/telegraf/v1/data_formats/input/form_urlencoded.md deleted file mode 100644 index 2da014c5bf..0000000000 --- a/content/telegraf/v1/data_formats/input/form_urlencoded.md +++ /dev/null @@ -1,71 +0,0 @@ ---- -title: Form URL-encoded input data format -list_title: Form URL-encoded -description: - Use the `form-urlencoded` data format to parse `application/x-www-form-urlencoded` - data, such as HTTP query strings. -menu: - telegraf_v1_ref: - name: Form URL-encoded - weight: 10 - parent: Input data formats -metadata: [Form URLencoded parser plugin] ---- - -Use the `form-urlencoded` data format to parse `application/x-www-form-urlencoded` -data, such as HTTP query strings. - -A common use case is to pair it with the [http_listener_v2](/telegraf/v1/plugins/#input-http_listener_v2) input plugin to parse -the HTTP request body or query parameters. - -## Configuration - -```toml -[[inputs.http_listener_v2]] - ## Address and port to host HTTP listener on - service_address = ":8080" - - ## Part of the request to consume. Available options are "body" and - ## "query". - data_source = "body" - - ## Data format to consume. - ## Each data format has its own unique set of configuration options, read - ## more about them here: - ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md - data_format = "form_urlencoded" - - ## Array of key names which should be collected as tags. - ## By default, keys with string value are ignored if not marked as tags. - form_urlencoded_tag_keys = ["tag1"] -``` - -## Examples - -### Basic parsing - -Config: - -```toml -[[inputs.http_listener_v2]] - name_override = "mymetric" - service_address = ":8080" - data_source = "query" - data_format = "form_urlencoded" - form_urlencoded_tag_keys = ["tag1"] -``` - -Request: - -```bash -curl -i -XGET 'http://localhost:8080/telegraf?tag1=foo&field1=0.42&field2=42' -``` - -Output: - -```text -mymetric,tag1=foo field1=0.42,field2=42 -``` - -[query string]: https://en.wikipedia.org/wiki/Query_string -[http_listener_v2]: /plugins/inputs/http_listener_v2 diff --git a/content/telegraf/v1/data_formats/input/graphite.md b/content/telegraf/v1/data_formats/input/graphite.md deleted file mode 100644 index 915fe3c8b7..0000000000 --- a/content/telegraf/v1/data_formats/input/graphite.md +++ /dev/null @@ -1,59 +0,0 @@ ---- -title: Graphite input data format -list_title: Graphite -description: Use the `graphite` input data format to parse Graphite dot buckets into Telegraf metrics. -menu: - telegraf_v1_ref: - name: Graphite - weight: 10 - parent: Input data formats ---- - -Use the `graphite` input data format to parse graphite _dot_ buckets directly into -Telegraf metrics with a measurement name, a single field, and optional tags. -By default, the separator is left as `.`, but this can be changed using the -`separator` argument. For more advanced options, Telegraf supports specifying -[templates](#templates) to translate graphite buckets into Telegraf metrics. - -## Configuration - -```toml -[[inputs.exec]] - ## Commands array - commands = ["/tmp/test.sh", "/usr/bin/mycollector --foo=bar"] - - ## measurement name suffix (for separating different commands) - name_suffix = "_mycollector" - - ## Data format to consume. 
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  data_format = "graphite"

  ## This string will be used to join the matched values.
  separator = "_"

  ## Each template line requires a template pattern. It can have an optional
  ## filter before the template and separated by spaces. It can also have optional extra
  ## tags following the template. Multiple tags should be separated by commas and no spaces
  ## similar to the line protocol format. There can be only one default template.
  ## Templates support below format:
  ## 1. filter + template
  ## 2. filter + template + extra tag(s)
  ## 3. filter + template with field key
  ## 4. default template
  templates = [
    "*.app env.service.resource.measurement",
    "stats.* .host.measurement* region=eu-east,agent=sensu",
    "stats2.* .host.measurement.field",
    "measurement*"
  ]
```

## Templates

[Template patterns](/telegraf/v1/configure_plugins/template-patterns/) specify how a dot-delimited
string should be mapped to and from [metrics](/telegraf/v1/metrics/).

diff --git a/content/telegraf/v1/data_formats/input/grok.md b/content/telegraf/v1/data_formats/input/grok.md
deleted file mode 100644
index cda0590c8c..0000000000
--- a/content/telegraf/v1/data_formats/input/grok.md
+++ /dev/null
@@ -1,282 +0,0 @@
---
title: Grok input data format
list_title: Grok
description: Use the `grok` data format to parse line-delimited data using a regular expression-like language.
menu:
  telegraf_v1_ref:
    name: Grok
    weight: 10
    parent: Input data formats
---

Use the `grok` data format to parse line-delimited data using a regular expression-like
language.

For an introduction to grok patterns, see [Grok Basics](https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html#_grok_basics)
in the Logstash documentation. The grok parser uses a slightly modified version of logstash **grok**
patterns, using the format:

```text
%{<capture_syntax>[:<semantic_name>][:<modifier>]}
```

The `capture_syntax` defines the grok pattern used to parse the input
line and the `semantic_name` is used to name the field or tag. The extension
`modifier` controls the data type that the parsed item is converted to or
other special handling.

By default all named captures are converted into string fields.
If a pattern does not have a semantic name it will not be captured.
Timestamp modifiers can be used to convert captures to the timestamp of the
parsed metric. If no timestamp is parsed the metric will be created using the
current time.

You must capture at least one field per line.
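For example, the following pattern (the log format and the `client` and `latency` names are hypothetical) uses the `ts-"CUSTOM"`, `tag`, and `float` modifiers described below to parse a line such as `2023-01-02 15:04:05 10.0.0.1 0.42`:

```text
%{TIMESTAMP_ISO8601:timestamp:ts-"2006-01-02 15:04:05"} %{IP:client:tag} %{NUMBER:latency:float}
```

This sketch would produce a metric with a `client` tag, a float `latency` field, and the timestamp taken from the line.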
- Available modifiers:
  - string (default if nothing is specified)
  - int
  - float
  - duration (ie, 5.23ms gets converted to int nanoseconds)
  - tag (converts the field into a tag)
  - drop (drops the field completely)
  - measurement (use the matched text as the measurement name)
- Timestamp modifiers:
  - ts (This will auto-learn the timestamp format)
  - ts-ansic ("Mon Jan _2 15:04:05 2006")
  - ts-unix ("Mon Jan _2 15:04:05 MST 2006")
  - ts-ruby ("Mon Jan 02 15:04:05 -0700 2006")
  - ts-rfc822 ("02 Jan 06 15:04 MST")
  - ts-rfc822z ("02 Jan 06 15:04 -0700")
  - ts-rfc850 ("Monday, 02-Jan-06 15:04:05 MST")
  - ts-rfc1123 ("Mon, 02 Jan 2006 15:04:05 MST")
  - ts-rfc1123z ("Mon, 02 Jan 2006 15:04:05 -0700")
  - ts-rfc3339 ("2006-01-02T15:04:05Z07:00")
  - ts-rfc3339nano ("2006-01-02T15:04:05.999999999Z07:00")
  - ts-httpd ("02/Jan/2006:15:04:05 -0700")
  - ts-epoch (seconds since unix epoch, may contain decimal)
  - ts-epochnano (nanoseconds since unix epoch)
  - ts-epochmilli (milliseconds since unix epoch)
  - ts-syslog ("Jan 02 15:04:05", parsed time is set to the current year)
  - ts-"CUSTOM"

CUSTOM time layouts must be within quotes and be the representation of the
"reference time", which is `Mon Jan 2 15:04:05 -0700 MST 2006`.
To match a comma decimal point, you can use a period in the pattern string.
For example, `%{TIMESTAMP:timestamp:ts-"2006-01-02 15:04:05.000"}` can be used to match `"2018-01-02 15:04:05,000"`.
See https://golang.org/pkg/time/#Parse for more details.

Telegraf has many of its own [built-in patterns](https://github.com/influxdata/telegraf/blob/master/plugins/parsers/grok/influx_patterns.go),
as well as support for most of
[Logstash's core patterns](https://github.com/logstash-plugins/logstash-patterns-core/blob/main/patterns/ecs-v1/grok-patterns).
_Golang regular expressions do not support lookahead or lookbehind.
Logstash patterns that depend on these aren't supported._

For help building and testing patterns, see [tips for creating patterns](#tips-for-creating-patterns).

- [Configuration](#configuration)
  - [Timestamp Examples](#timestamp-examples)
  - [TOML Escaping](#toml-escaping)
  - [Tips for creating patterns](#tips-for-creating-patterns)
  - [Performance](#performance)

## Configuration

```toml
[[inputs.file]]
  ## Files to parse each interval.
  ## These accept standard unix glob matching rules, but with the addition of
  ## ** as a "super asterisk". ie:
  ##   /var/log/**.log     -> recursively find all .log files in /var/log
  ##   /var/log/*/*.log    -> find all .log files with a parent dir in /var/log
  ##   /var/log/apache.log -> only tail the apache log file
  files = ["/var/log/apache/access.log"]

  ## The data format to be read from files
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  data_format = "grok"

  ## This is a list of patterns to check the given log file(s) for.
  ## Note that adding patterns here increases processing time. The most
  ## efficient configuration is to have one pattern.
  ## Other common built-in patterns are:
  ##   %{COMMON_LOG_FORMAT}   (plain apache & nginx access logs)
  ##   %{COMBINED_LOG_FORMAT} (access logs + referrer & agent)
  grok_patterns = ["%{COMBINED_LOG_FORMAT}"]

  ## Full path(s) to custom pattern files.
  grok_custom_pattern_files = []

  ## Custom patterns can also be defined here.
Put one pattern per line. - grok_custom_patterns = ''' - ''' - - ## Timezone allows you to provide an override for timestamps that - ## don't already include an offset - ## e.g. 04/06/2016 12:41:45 data one two 5.43µs - ## - ## Default: "" which renders UTC - ## Options are as follows: - ## 1. Local -- interpret based on machine localtime - ## 2. "Canada/Eastern" -- Unix TZ values like those found in https://en.wikipedia.org/wiki/List_of_tz_database_time_zones - ## 3. UTC -- or blank/unspecified, will return timestamp in UTC - grok_timezone = "Canada/Eastern" - - ## When set to "disable" timestamp will not incremented if there is a - ## duplicate. - # grok_unique_timestamp = "auto" - - ## Enable multiline messages to be processed. - # grok_multiline = false -``` - -### Timestamp Examples - -This example input and config parses a file using a custom timestamp conversion: - -```text -2017-02-21 13:10:34 value=42 -``` - -```toml -[[inputs.file]] - grok_patterns = ['%{TIMESTAMP_ISO8601:timestamp:ts-"2006-01-02 15:04:05"} value=%{NUMBER:value:int}'] -``` - -This example input and config parses a file using a timestamp in unix time: - -```text -1466004605 value=42 -1466004605.123456789 value=42 -``` - -```toml -[[inputs.file]] - grok_patterns = ['%{NUMBER:timestamp:ts-epoch} value=%{NUMBER:value:int}'] -``` - -This example parses a file using a built-in conversion and a custom pattern: - -```text -Wed Apr 12 13:10:34 PST 2017 value=42 -``` - -```toml -[[inputs.file]] - grok_patterns = ["%{TS_UNIX:timestamp:ts-unix} value=%{NUMBER:value:int}"] - grok_custom_patterns = ''' - TS_UNIX %{DAY} %{MONTH} %{MONTHDAY} %{HOUR}:%{MINUTE}:%{SECOND} %{TZ} %{YEAR} - ''' -``` - -This example input and config parses a file using a custom timestamp conversion -that doesn't match any specific standard: - -```text -21/02/2017 13:10:34 value=42 -``` - -```toml -[[inputs.file]] - grok_patterns = ['%{MY_TIMESTAMP:timestamp:ts-"02/01/2006 15:04:05"} value=%{NUMBER:value:int}'] - - grok_custom_patterns = ''' - MY_TIMESTAMP (?:\d{2}.\d{2}.\d{4} \d{2}:\d{2}:\d{2}) - ''' -``` - -For cases where the timestamp itself is without offset, the `timezone` config -var is available to denote an offset. By default (with `timezone` either omit, -blank or set to `"UTC"`), the times are processed as if in the UTC timezone. If -specified as `timezone = "Local"`, the timestamp will be processed based on the -current machine timezone configuration. Lastly, if using a timezone from the -list of Unix -[timezones](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones), grok -will offset the timestamp accordingly. - -#### TOML Escaping - -When saving patterns to the configuration file, keep in mind the different TOML -[string](https://github.com/toml-lang/toml#string) types and the escaping -rules for each. These escaping rules must be applied in addition to the -escaping required by the grok syntax. Using the Multi-line line literal -syntax with `'''` may be useful. - -The following config examples will parse this input file: - -```text -|42|\uD83D\uDC2F|'telegraf'| -``` - -Since `|` is a special character in the grok language, we must escape it to -get a literal `|`. With a basic TOML string, special characters such as -backslash must be escaped, requiring us to escape the backslash a second time. 
- -```toml -[[inputs.file]] - grok_patterns = ["\\|%{NUMBER:value:int}\\|%{UNICODE_ESCAPE:escape}\\|'%{WORD:name}'\\|"] - grok_custom_patterns = "UNICODE_ESCAPE (?:\\\\u[0-9A-F]{4})+" -``` - -We cannot use a literal TOML string for the pattern, because we cannot match a -`'` within it. However, it works well for the custom pattern. - -```toml -[[inputs.file]] - grok_patterns = ["\\|%{NUMBER:value:int}\\|%{UNICODE_ESCAPE:escape}\\|'%{WORD:name}'\\|"] - grok_custom_patterns = 'UNICODE_ESCAPE (?:\\u[0-9A-F]{4})+' -``` - -A multi-line literal string allows us to encode the pattern: - -```toml -[[inputs.file]] - grok_patterns = [''' - \|%{NUMBER:value:int}\|%{UNICODE_ESCAPE:escape}\|'%{WORD:name}'\| - '''] - grok_custom_patterns = 'UNICODE_ESCAPE (?:\\u[0-9A-F]{4})+' -``` - -### Tips for creating patterns -Complex patterns can be difficult to read and write. -For help building and debugging grok patterns, see the following tools: -- [Grok Constructor](https://grokconstructor.appspot.com/) -- [Grok Debugger](https://grokdebugger.com/) - -We recommend the following steps for building and testing a new pattern with Telegraf and your data: - -1. In your Telegraf configuration, do the following to help you isolate and view the captured metrics: - - Configure a file output that writes to stdout: - - ```toml - [[outputs.file]] - files = ["stdout"] - ``` - - - Disable other outputs while testing. - - *Keep in mind that the file output will only print once per `flush_interval`.* - -2. For the input, start with a sample file that contains a single line of your data, - and then remove all but the first token or piece of the line. -3. In your Telegraf configuration, add the section of your pattern that matches the piece of data from the previous step. -4. Run Telegraf and verify that the metric is parsed successfully. -5. If successful, add the next token to the data file, update the pattern configuration in Telegraf, and then retest. -6. Continue one token at a time until the entire line is successfully parsed. - -#### Performance - -Performance depends heavily on the regular expressions that you use, but there -are a few techniques that can help: - -- Avoid using patterns such as `%{DATA}` that will always match. -- If possible, add `^` and `$` anchors to your pattern: - - ```toml - [[inputs.file]] - grok_patterns = ["^%{COMBINED_LOG_FORMAT}$"] - ``` diff --git a/content/telegraf/v1/data_formats/input/influx.md b/content/telegraf/v1/data_formats/input/influx.md deleted file mode 100644 index e04978485f..0000000000 --- a/content/telegraf/v1/data_formats/input/influx.md +++ /dev/null @@ -1,30 +0,0 @@ ---- -title: InfluxDB line protocol input data format -list_title: InfluxDB line protocol -description: Use the `influx` line protocol input data format to parse InfluxDB metrics directly into Telegraf metrics. -menu: - telegraf_v1_ref: - name: InfluxDB line protocol - weight: 10 - parent: Input data formats ---- - -Use the `influx` line protocol input data format to parse InfluxDB [line protocol](/influxdb3/cloud-serverless/reference/syntax/line-protocol/) data into Telegraf [metrics](/telegraf/v1/metrics/). - -## Configuration - -```toml -[[inputs.file]] - files = ["example"] - - ## Data format to consume. - ## Each data format has its own unique set of configuration options, read - ## more about them here: - ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md - data_format = "influx" - - ## Influx line protocol parser - ## 'internal' is the default. 
'upstream' is a newer parser that is faster - ## and more memory efficient. - ## influx_parser_type = "internal" -``` diff --git a/content/telegraf/v1/data_formats/input/json.md b/content/telegraf/v1/data_formats/input/json.md deleted file mode 100644 index ea1ee7a5f2..0000000000 --- a/content/telegraf/v1/data_formats/input/json.md +++ /dev/null @@ -1,278 +0,0 @@ ---- -title: JSON input data format -list_title: JSON -description: | - The `json` input data format parses JSON objects, or an array of objects, into Telegraf metrics. - For most cases, use the JSON v2 input data format instead. -menu: - telegraf_v1_ref: - name: JSON - weight: 10 - parent: Input data formats ---- - -{{% note %}} -The following information applies to the legacy JSON input data format. -For most cases, use the [JSON v2 input data format](/telegraf/v1/data_formats/input/json_v2/) instead. -{{% /note %}} - -The `json` data format parses a [JSON][json] object or an array of objects into -metric fields. - -**NOTE:** All JSON numbers are converted to float fields. JSON strings and -booleans are ignored unless specified in the `tag_key` or `json_string_fields` -options. - -## Configuration - -```toml -[[inputs.file]] - files = ["example"] - - ## Data format to consume. - ## Each data format has its own unique set of configuration options, read - ## more about them here: - ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md - data_format = "json" - - ## When strict is true and a JSON array is being parsed, all objects within the - ## array must be valid - json_strict = true - - ## Query is a GJSON path that specifies a specific chunk of JSON to be - ## parsed, if not specified the whole document will be parsed. - ## - ## GJSON query paths are described here: - ## https://github.com/tidwall/gjson/tree/v1.3.0#path-syntax - json_query = "" - - ## Tag keys is an array of keys that should be added as tags. Matching keys - ## are no longer saved as fields. Supports wildcard glob matching. - tag_keys = [ - "my_tag_1", - "my_tag_2", - "tags_*", - "tag*" - ] - - ## Array of glob pattern strings or booleans keys that should be added as string fields. - json_string_fields = [] - - ## Name key is the key to use as the measurement name. - json_name_key = "" - - ## Time key is the key containing the time that should be used to create the - ## metric. - json_time_key = "" - - ## Time format is the time layout that should be used to interpret the json_time_key. - ## The time must be `unix`, `unix_ms`, `unix_us`, `unix_ns`, or a time in the - ## "reference time". To define a different format, arrange the values from - ## the "reference time" in the example to match the format you will be - ## using. For more information on the "reference time", visit - ## https://golang.org/pkg/time/#Time.Format - ## ex: json_time_format = "Mon Jan 2 15:04:05 -0700 MST 2006" - ## json_time_format = "2006-01-02T15:04:05Z07:00" - ## json_time_format = "01/02/2006 15:04:05" - ## json_time_format = "unix" - ## json_time_format = "unix_ms" - json_time_format = "" - - ## Timezone allows you to provide an override for timestamps that - ## don't already include an offset - ## e.g. 04/06/2016 12:41:45 - ## - ## Default: "" which renders UTC - ## Options are as follows: - ## 1. Local -- interpret based on machine localtime - ## 2. "America/New_York" -- Unix TZ values like those found in https://en.wikipedia.org/wiki/List_of_tz_database_time_zones - ## 3. 
UTC -- or blank/unspecified, will return timestamp in UTC - json_timezone = "" -``` - -### json_query - -The `json_query` is a [GJSON][gjson] path that can be used to transform the -JSON document before being parsed. The query is performed before any other -options are applied and the new document produced will be parsed instead of the -original document, as such, the result of the query should be a JSON object or -an array of objects. - -Consult the GJSON [path syntax][gjson syntax] for details and examples, and -consider using the [GJSON playground][gjson playground] for developing and -debugging your query. - -### json_time_key, json_time_format, json_timezone - -By default the current time will be used for all created metrics, to set the -time using the JSON document you can use the `json_time_key` and -`json_time_format` options together to set the time to a value in the parsed -document. - -The `json_time_key` option specifies the key containing the time value and -`json_time_format` must be set to `unix`, `unix_ms`, `unix_us`, `unix_ns`, or -the Go "reference time" which is defined to be the specific time: -`Mon Jan 2 15:04:05 MST 2006`. - -Consult the Go [time][time parse] package for details and additional examples -on how to set the time format. - -When parsing times that don't include a timezone specifier, times are assumed to -be UTC. To default to another timezone, or to local time, specify the -`json_timezone` option. This option should be set to a [Unix TZ -value](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones), such as -`America/New_York`, to `Local` to utilize the system timezone, or to `UTC`. - -## Examples - -### Basic Parsing - -Config: - -```toml -[[inputs.file]] - files = ["example"] - name_override = "myjsonmetric" - data_format = "json" -``` - -Input: - -```json -{ - "a": 5, - "b": { - "c": 6 - }, - "ignored": "I'm a string" -} -``` - -Output: - -```text -myjsonmetric a=5,b_c=6 -``` - -### Name, Tags, and String Fields - -Config: - -```toml -[[inputs.file]] - files = ["example"] - json_name_key = "name" - tag_keys = ["my_tag_1"] - json_string_fields = ["b_my_field"] - data_format = "json" -``` - -Input: - -```json -{ - "a": 5, - "b": { - "c": 6, - "my_field": "description" - }, - "my_tag_1": "foo", - "name": "my_json" -} -``` - -Output: - -```text -my_json,my_tag_1=foo a=5,b_c=6,b_my_field="description" -``` - -### Arrays - -If the JSON data is an array, then each object within the array is parsed with -the configured settings. - -Config: - -```toml -[[inputs.file]] - files = ["example"] - data_format = "json" - json_time_key = "b_time" - json_time_format = "02 Jan 06 15:04 MST" -``` - -Input: - -```json -[ - { - "a": 5, - "b": { - "c": 6, - "time":"04 Jan 06 15:04 MST" - } - }, - { - "a": 7, - "b": { - "c": 8, - "time":"11 Jan 07 15:04 MST" - } - } -] -``` - -Output: - -```text -file a=5,b_c=6 1136387040000000000 -file a=7,b_c=8 1168527840000000000 -``` - -### Query - -The `json_query` option can be used to parse a subset of the document. 
- -Config: - -```toml -[[inputs.file]] - files = ["example"] - data_format = "json" - tag_keys = ["first"] - json_string_fields = ["last"] - json_query = "obj.friends" -``` - -Input: - -```json -{ - "obj": { - "name": {"first": "Tom", "last": "Anderson"}, - "age":37, - "children": ["Sara","Alex","Jack"], - "fav.movie": "Deer Hunter", - "friends": [ - {"first": "Dale", "last": "Murphy", "age": 44}, - {"first": "Roger", "last": "Craig", "age": 68}, - {"first": "Jane", "last": "Murphy", "age": 47} - ] - } -} -``` - -Output: - -```text -file,first=Dale last="Murphy",age=44 -file,first=Roger last="Craig",age=68 -file,first=Jane last="Murphy",age=47 -``` - -[gjson]: https://github.com/tidwall/gjson -[gjson syntax]: https://github.com/tidwall/gjson#path-syntax -[gjson playground]: https://gjson.dev/ -[json]: https://www.json.org/ -[time parse]: https://golang.org/pkg/time/#Parse diff --git a/content/telegraf/v1/data_formats/input/json_v2.md b/content/telegraf/v1/data_formats/input/json_v2.md deleted file mode 100644 index 269bc9f1f3..0000000000 --- a/content/telegraf/v1/data_formats/input/json_v2.md +++ /dev/null @@ -1,179 +0,0 @@ ---- -title: JSON v2 input data format -list_title: JSON v2 -description: Use the `json_v2` input data format to parse [JSON][json] objects, or an array of objects, into Telegraf metrics. -menu: - telegraf_v1_ref: - name: JSON v2 - weight: 10 - parent: Input data formats ---- - -Use the `json_v2` input data format to parse a [JSON][json] object or an array of objects into Telegraf metrics. - -The parser supports [GJSON Path Syntax](https://github.com/tidwall/gjson/blob/v1.7.5/SYNTAX.md) for querying JSON. - -To test your GJSON path, use [GJSON Playground](https://gjson.dev/). - -You can find multiple examples [here](https://github.com/influxdata/telegraf/tree/master/plugins/parsers/json_v2/testdata) in the Telegraf repository. - - - -## Configuration - -Configure this parser by describing the metric you want by defining the fields and tags from the input. -The configuration is divided into config sub-tables called `field`, `tag`, and `object`. -In the example below you can see all the possible configuration keys you can define for each config table. -In the sections that follow these configuration keys are defined in more detail. 

```toml
[[inputs.file]]
  files = []
  data_format = "json_v2"

  [[inputs.file.json_v2]]
    measurement_name = "" # A string that will become the new measurement name
    measurement_name_path = "" # A string with valid GJSON path syntax, will override measurement_name
    timestamp_path = "" # A string with valid GJSON path syntax to a valid timestamp (single value)
    timestamp_format = "" # A string with a valid timestamp format (see below for possible values)
    timestamp_timezone = "" # A string with a valid timezone (see below for possible values)

    [[inputs.file.json_v2.field]]
      path = "" # A string with valid GJSON path syntax
      rename = "new name" # A string with a new name for the field key
      type = "int" # A string specifying the type (int,uint,float,string,bool)
      optional = false # true: suppress errors if configured path does not exist

    [[inputs.file.json_v2.tag]]
      path = "" # A string with valid GJSON path syntax
      rename = "new name" # A string with a new name for the tag key
      type = "float" # A string specifying the type (int,uint,float,string,bool)
      optional = false # true: suppress errors if configured path does not exist

    [[inputs.file.json_v2.object]]
      path = "" # A string with valid GJSON path syntax
      timestamp_key = "" # A JSON key (for a nested key, prepend the parent keys with underscores) to a valid timestamp
      timestamp_format = "" # A string with a valid timestamp format (see below for possible values)
      timestamp_timezone = "" # A string with a valid timezone (see below for possible values)
      disable_prepend_keys = false # true: don't prepend parent keys to nested key names
      included_keys = [] # List of JSON keys (for a nested key, prepend the parent keys with underscores) that should be the only ones included in the result
      excluded_keys = [] # List of JSON keys (for a nested key, prepend the parent keys with underscores) that shouldn't be included in the result
      tags = [] # List of JSON keys (for a nested key, prepend the parent keys with underscores) to be a tag instead of a field
      optional = false # true: suppress errors if configured path does not exist
      [inputs.file.json_v2.object.renames] # A map of JSON keys (for a nested key, prepend the parent keys with underscores) with a new name for the tag key
        key = "new name"
      [inputs.file.json_v2.object.fields] # A map of JSON keys (for a nested key, prepend the parent keys with underscores) with a type (int,uint,float,string,bool)
        key = "int"
```

### Root configuration options

* **measurement_name (OPTIONAL)**: Will set the measurement name to the provided string.
* **measurement_name_path (OPTIONAL)**: You can define a query with [GJSON Path Syntax](https://github.com/tidwall/gjson/blob/v1.7.5/SYNTAX.md) to set a measurement name from the JSON input.
  The query must return a single data value or it will use the default measurement name.
  This takes precedence over `measurement_name`.
* **timestamp_path (OPTIONAL)**: You can define a query with [GJSON Path Syntax](https://github.com/tidwall/gjson/blob/v1.7.5/SYNTAX.md) to set a timestamp from the JSON input.
  The query must return a single data value or it will default to the current time.
* **timestamp_format (OPTIONAL, but REQUIRED when timestamp_path is defined)**: Must be set to `unix`, `unix_ms`, `unix_us`, `unix_ns`, or
  the Go "reference time", which is defined to be the specific time
  `Mon Jan 2 15:04:05 MST 2006`.
* **timestamp_timezone (OPTIONAL, but REQUIRES timestamp_path)**: This option should be set to a
  [Unix TZ value](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones),
  such as `America/New_York`, to `Local` to utilize the system timezone, or to `UTC`.
  Defaults to `UTC`.

## Arrays and Objects

The following describes the high-level approach when parsing arrays and objects:

- **Array**: Every element in an array is treated as a *separate* metric
- **Object**: Every key-value pair in an object is treated as a *single* metric

When handling nested arrays and objects, the rules above continue to apply as the parser creates metrics.
When an object has multiple arrays as values,
the arrays become separate metrics containing only the non-array values from the object.
Below you can see an example of this behavior,
with an input JSON containing a book object that has nested arrays of chapters, characters, and random values.

**Example JSON:**

```json
{
    "book": {
        "title": "The Lord Of The Rings",
        "chapters": [
            "A Long-expected Party",
            "The Shadow of the Past"
        ],
        "author": "Tolkien",
        "characters": [
            {
                "name": "Bilbo",
                "species": "hobbit"
            },
            {
                "name": "Frodo",
                "species": "hobbit"
            }
        ],
        "random": [
            1,
            2
        ]
    }
}
```

**Example configuration:**

```toml
[[inputs.file]]
  files = ["./testdata/multiple_arrays_in_object/input.json"]
  data_format = "json_v2"

  [[inputs.file.json_v2]]
    [[inputs.file.json_v2.object]]
      path = "book"
      tags = ["title"]
      disable_prepend_keys = true
```

**Expected metrics:**

```text
file,title=The\ Lord\ Of\ The\ Rings author="Tolkien",chapters="A Long-expected Party"
file,title=The\ Lord\ Of\ The\ Rings author="Tolkien",chapters="The Shadow of the Past"
file,title=The\ Lord\ Of\ The\ Rings author="Tolkien",name="Bilbo",species="hobbit"
file,title=The\ Lord\ Of\ The\ Rings author="Tolkien",name="Frodo",species="hobbit"
file,title=The\ Lord\ Of\ The\ Rings author="Tolkien",random=1
file,title=The\ Lord\ Of\ The\ Rings author="Tolkien",random=2
```

You can find more complicated examples under the [`testdata`][] folder in the Telegraf repository.

## Types

For each field you can optionally define a type.
The following rules apply to this configuration:

* If a type is explicitly defined, the parser enforces it and converts the data to the defined type if possible.
  If the type can't be converted, the parser fails.
* If a type isn't defined, the parser uses the default type defined in the JSON (int, float, string).

The type values you can set are as follows; a configuration sketch follows the list:

* `int`: bools, floats, or strings (with valid numbers) can be converted to an int.
* `uint`: bools, floats, or strings (with valid numbers) can be converted to a uint.
* `string`: any data can be formatted as a string.
* `float`: string values (with valid numbers) or integers can be converted to a float.
* `bool`: the string values "true" or "false" (regardless of capitalization) or the integer values `0` or `1` can be converted to a bool.
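
As a minimal sketch of these rules, the following configuration pins the type of one field and leaves another to the JSON default. The file name, measurement name, and JSON shape are hypothetical; the `type` option on the `field` sub-table is what forces the conversion:

```toml
# Assumed input file status.json: {"uptime": "3600", "load": 0.72}
[[inputs.file]]
  files = ["status.json"]
  data_format = "json_v2"

  [[inputs.file.json_v2]]
    measurement_name = "status"

    [[inputs.file.json_v2.field]]
      path = "uptime"
      type = "int"    # "3600" is a JSON string; the parser converts it to an int64 field

    [[inputs.file.json_v2.field]]
      path = "load"   # no type set, so the JSON default (float) is used
```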
- -[json]: https://www.json.org/ -[testdata]: https://github.com/influxdata/telegraf/tree/master/plugins/parsers/json_v2/testdata \ No newline at end of file diff --git a/content/telegraf/v1/data_formats/input/logfmt.md b/content/telegraf/v1/data_formats/input/logfmt.md deleted file mode 100644 index 7a3c6d417f..0000000000 --- a/content/telegraf/v1/data_formats/input/logfmt.md +++ /dev/null @@ -1,42 +0,0 @@ ---- -title: Logfmt input data format -list_title: Logfmt -description: Use the `logfmt` input data format to parse logfmt data into Telegraf metrics. -menu: - telegraf_v1_ref: - name: logfmt - weight: 10 - parent: Input data formats ---- - -Use the `logfmt` data format to parse [logfmt] data into Telegraf metrics. - -[logfmt]: https://brandur.org/logfmt - -## Configuration - -```toml -[[inputs.file]] - files = ["example"] - - ## Data format to consume. - ## Each data format has its own unique set of configuration options, read - ## more about them here: - ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md - data_format = "logfmt" - - ## Array of key names which should be collected as tags. Globs accepted. - logfmt_tag_keys = ["method","host"] -``` - -## Metrics - -Each key/value pair in the line is added to a new metric as a field. The type -of the field is automatically determined based on the contents of the value. - -## Examples - -```text -- method=GET host=example.org ts=2018-07-24T19:43:40.275Z connect=4ms service=8ms status=200 bytes=1653 -+ logfmt,host=example.org,method=GET ts="2018-07-24T19:43:40.275Z",connect="4ms",service="8ms",status=200i,bytes=1653i -``` diff --git a/content/telegraf/v1/data_formats/input/nagios.md b/content/telegraf/v1/data_formats/input/nagios.md deleted file mode 100644 index 74e82e8940..0000000000 --- a/content/telegraf/v1/data_formats/input/nagios.md +++ /dev/null @@ -1,28 +0,0 @@ ---- -title: Nagios input data format -list_title: Nagios -description: Use the `nagios` input data format to parse the output of Nagios plugins into Telegraf metrics. -menu: - telegraf_v1_ref: - name: Nagios - weight: 10 - parent: Input data formats ---- - -Use the `nagios` input data format to parse the output of -[Nagios plugins](https://www.nagios.org/downloads/nagios-plugins/) into -Telegraf metrics. - -## Configuration - -```toml -[[inputs.exec]] - ## Commands array - commands = ["/usr/lib/nagios/plugins/check_load -w 5,6,7 -c 7,8,9"] - - ## Data format to consume. - ## Each data format has its own unique set of configuration options, read - ## more about them here: - ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md - data_format = "nagios" -``` diff --git a/content/telegraf/v1/data_formats/input/opentsdb.md b/content/telegraf/v1/data_formats/input/opentsdb.md deleted file mode 100644 index 1d9b12b644..0000000000 --- a/content/telegraf/v1/data_formats/input/opentsdb.md +++ /dev/null @@ -1,39 +0,0 @@ ---- -title: OpenTSDB Telnet "PUT" API input data format -list_title: OpenTSDB Telnet PUT API -description: - Use the `opentsdb` data format to parse OpenTSDB Telnet `PUT` API data into Telegraf metrics. -menu: - telegraf_v1_ref: - name: OpenTSDB - weight: 10 - parent: Input data formats -metadata: [] ---- - -Use the `opentsdb` data format to parse [OpenTSDB Telnet `PUT` API](http://opentsdb.net/docs/build/html/api_telnet/put.html) data into -Telegraf metrics. There are no additional configuration options for OpenTSDB. 

For more detail on the format, see:

- [OpenTSDB Telnet "PUT" API guide](http://opentsdb.net/docs/build/html/api_telnet/put.html)
- [OpenTSDB data specification](http://opentsdb.net/docs/build/html/user_guide/writing/index.html#data-specification)

## Configuration

```toml
[[inputs.file]]
  files = ["example"]

  ## Data format to consume.
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  data_format = "opentsdb"
```

## Example

```opentsdb
put sys.cpu.user 1356998400 42.5 host=webserver01 cpu=0
```
diff --git a/content/telegraf/v1/data_formats/input/prometheus-remote-write.md b/content/telegraf/v1/data_formats/input/prometheus-remote-write.md
deleted file mode 100644
index 33620938e0..0000000000
--- a/content/telegraf/v1/data_formats/input/prometheus-remote-write.md
+++ /dev/null
@@ -1,69 +0,0 @@
---
title: Prometheus Remote Write input data format
list_title: Prometheus Remote Write
description:
  Use the `prometheusremotewrite` input data format to parse Prometheus Remote Write samples into Telegraf metrics.
menu:
  telegraf_v1_ref:
    name: Prometheus Remote Write
    weight: 10
    parent: Input data formats
---

Use the `prometheusremotewrite` input data format to parse [Prometheus Remote Write](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write) samples into Telegraf metrics.

{{% note %}}
If you are using InfluxDB 1.x and the [Prometheus Remote Write endpoint](https://github.com/influxdata/telegraf/blob/master/plugins/parsers/prometheusremotewrite/README.md)
to write metrics, you can migrate to InfluxDB 2.0 and use this parser.
For the metrics to completely align with the 1.x endpoint, add a Starlark processor as described in the [Starlark processor plugin README](https://github.com/influxdata/telegraf/blob/master/plugins/processors/starlark/README.md).
{{% /note %}}

This parser converts Prometheus Remote Write samples directly into Telegraf metrics. It can
be used with the [http_listener_v2](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/http_listener_v2) input plugin. There are no
additional configuration options for Prometheus Remote Write samples.

## Configuration

```toml
[[inputs.http_listener_v2]]
  ## Address and port to host HTTP listener on
  service_address = ":1234"

  ## Paths to listen to.
  paths = ["/receive"]

  ## Data format to consume.
- data_format = "prometheusremotewrite" -``` - -## Example Input - -```json -prompb.WriteRequest{ - Timeseries: []*prompb.TimeSeries{ - { - Labels: []*prompb.Label{ - {Name: "__name__", Value: "go_gc_duration_seconds"}, - {Name: "instance", Value: "localhost:9090"}, - {Name: "job", Value: "prometheus"}, - {Name: "quantile", Value: "0.99"}, - }, - Samples: []prompb.Sample{ - {Value: 4.63, Timestamp: time.Date(2020, 4, 1, 0, 0, 0, 0, time.UTC).UnixNano()}, - }, - }, - }, - } - -``` - -## Example Output - -```text -prometheus_remote_write,instance=localhost:9090,job=prometheus,quantile=0.99 go_gc_duration_seconds=4.63 1614889298859000000 -``` - -## For alignment with the [InfluxDB v1.x Prometheus Remote Write Spec](/influxdb/v1/supported_protocols/prometheus/#how-prometheus-metrics-are-parsed-in-influxdb) - -- Use the [Starlark processor rename prometheus remote write script](https://github.com/influxdata/telegraf/blob/master/plugins/processors/starlark/testdata/rename_prometheus_remote_write.star) to rename the measurement name to the fieldname and rename the fieldname to value. diff --git a/content/telegraf/v1/data_formats/input/value.md b/content/telegraf/v1/data_formats/input/value.md deleted file mode 100644 index 4387ba6d07..0000000000 --- a/content/telegraf/v1/data_formats/input/value.md +++ /dev/null @@ -1,45 +0,0 @@ ---- -title: Value input data format -list_title: Value -description: Use the `value` input data format to parse single values into Telegraf metrics. -menu: - telegraf_v1_ref: - name: Value - weight: 10 - parent: Input data formats ---- - -Use the `value` input data format to parse single values into Telegraf metrics. - -## Configuration - -Specify the measurement name and a field to use as the parsed metric. - -> To specify the measurement name for your metric, set `name_override`; otherwise, the input plugin name (for example, "exec") is used as the measurement name. - -You **must** tell Telegraf what type of metric to collect by using the -`data_type` configuration option. Available data type options are: - -1. integer -2. float or long -3. string -4. boolean - -```toml -[[inputs.exec]] - ## Commands array - commands = ["cat /proc/sys/kernel/random/entropy_avail"] - - ## override the default metric name of "exec" - name_override = "entropy_available" - - ## override the field name of "value" - # value_field_name = "value" - - ## Data format to consume. - ## Each data format has its own unique set of configuration options, read - ## more about them here: - ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md - data_format = "value" - data_type = "integer" # required -``` diff --git a/content/telegraf/v1/data_formats/input/wavefront.md b/content/telegraf/v1/data_formats/input/wavefront.md deleted file mode 100644 index 8ee1eff375..0000000000 --- a/content/telegraf/v1/data_formats/input/wavefront.md +++ /dev/null @@ -1,29 +0,0 @@ ---- -title: Wavefront input data format -list_title: Wavefront -description: Use the `wavefront` input data format to parse Wavefront data into Telegraf metrics. -menu: - telegraf_v1_ref: - name: Wavefront - weight: 10 - parent: Input data formats ---- - -Use the `wavefront` input data format to parse Wavefront data into Telegraf metrics. -For more information on the Wavefront native data format, see -[Wavefront Data Format](https://docs.wavefront.com/wavefront_data_format.html) in the Wavefront documentation. - -## Configuration - -There are no additional configuration options for Wavefront Data Format line-protocol. 
- -```toml -[[inputs.file]] - files = ["example"] - - ## Data format to consume. - ## Each data format has its own unique set of configuration options, read - ## more about them here: - ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md - data_format = "wavefront" -``` diff --git a/content/telegraf/v1/data_formats/input/xml.md b/content/telegraf/v1/data_formats/input/xml.md deleted file mode 100644 index a34a8a218f..0000000000 --- a/content/telegraf/v1/data_formats/input/xml.md +++ /dev/null @@ -1,59 +0,0 @@ ---- -title: XML input data format -list_title: XML -description: Use the `xml` input data format to parse XML data into Telegraf metrics. -menu: - telegraf_v1_ref: - name: XML - weight: 10 - parent: Input data formats -metadata: [XPath parser plugin] ---- - -Use the `xml` input data format, provided by the [XPath parser plugin](https://github.com/influxdata/telegraf/tree/master/plugins/parsers/xpath), with XPath expressions to parse XML data into Telegraf metrics. - -## Configuration - -```toml -[[inputs.file]] - files = ["example.xml"] - - ## Data format to consume. - ## Each data format has its own unique set of configuration options, read - ## more about them here: - ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md - data_format = "xml" - - ## Multiple parsing sections are allowed - [[inputs.file.xml]] - ## Optional: XPath-query to select a subset of nodes from the XML document. - #metric_selection = "/Bus/child::Sensor" - - ## Optional: XPath-query to set the metric (measurement) name. - #metric_name = "string('example')" - - ## Optional: Query to extract metric timestamp. - ## If not specified the time of execution is used. - #timestamp = "/Gateway/Timestamp" - ## Optional: Format of the timestamp determined by the query above. - ## This can be any of "unix", "unix_ms", "unix_us", "unix_ns" or a valid Golang - ## time format. If not specified, a "unix" timestamp (in seconds) is expected. - #timestamp_format = "2006-01-02T15:04:05Z" - - ## Tag definitions using the given XPath queries. - [inputs.file.xml.tags] - name = "substring-after(Sensor/@name, ' ')" - device = "string('the ultimate sensor')" - - ## Integer field definitions using XPath queries. - [inputs.file.xml.fields_int] - consumers = "Variable/@consumers" - - ## Non-integer field definitions using XPath queries. - ## The field type is defined using XPath expressions such as number(), boolean() or string(). If no conversion is performed the field will be of type string. - [inputs.file.xml.fields] - temperature = "number(Variable/@temperature)" - power = "number(Variable/@power)" - frequency = "number(Variable/@frequency)" - ok = "Mode != 'ok'" -``` diff --git a/content/telegraf/v1/data_formats/input/xpath_json.md b/content/telegraf/v1/data_formats/input/xpath_json.md deleted file mode 100644 index f2fa314ed4..0000000000 --- a/content/telegraf/v1/data_formats/input/xpath_json.md +++ /dev/null @@ -1,629 +0,0 @@ ---- -title: XPath JSON input data format -list_title: XPath JSON -description: - Use the `xpath_json` input data format and XPath expressions to parse JSON into Telegraf metrics. -menu: - telegraf_v1_ref: - name: XPath JSON - weight: 10 - parent: Input data formats -metadata: [XPath parser plugin] ---- - -Use the `xpath_json` input data format, provided by the [XPath parser plugin](https://github.com/influxdata/telegraf/tree/master/plugins/parsers/xpath), with [XPath][xpath] expressions to parse JSON data into Telegraf metrics. 
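
As a quick sketch (the file name and JSON content are hypothetical, and the `xpath` sub-table layout is assumed to match the configuration shown below), a configuration such as the following reads a JSON document and addresses its keys with XPath queries:

```toml
# Assumed input file device.json: {"device": {"name": "gw-1", "temperature": 21.5}}
[[inputs.file]]
  files = ["device.json"]
  data_format = "xpath_json"

  [[inputs.file.xpath]]
    ## JSON keys are addressed like XML nodes
    metric_name = "string('device')"

    [inputs.file.xpath.tags]
      name = "/device/name"

    [inputs.file.xpath.fields]
      temperature = "number(/device/temperature)"
```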
- -For information about supported XPath functions, see [the underlying XPath library][xpath lib]. - -**NOTE:** The type of fields are specified using [XPath functions][xpath -lib]. The only exceptions are _integer_ fields that need to be specified in a -`fields_int` section. - -## Supported data formats - -| name | `data_format` setting | comment | -| --------------------------------------- | --------------------- | ------- | -| [Extensible Markup Language (XML)][xml] | `"xml"` | | -| [JSON][json] | `"xpath_json"` | | -| [MessagePack][msgpack] | `"xpath_msgpack"` | | -| [Protocol-buffers][protobuf] | `"xpath_protobuf"` | [see additional parameters](#protocol-buffers-additional-settings)| - -### Protocol-buffers additional settings - -For using the protocol-buffer format you need to specify additional -(_mandatory_) properties for the parser. Those options are described here. - -#### `xpath_protobuf_file` (mandatory) - -Use this option to specify the name of the protocol-buffer definition file -(`.proto`). - -#### `xpath_protobuf_type` (mandatory) - -This option contains the top-level message file to use for deserializing the -data to be parsed. Usually, this is constructed from the `package` name in the -protocol-buffer definition file and the `message` name as `.`. - -#### `xpath_protobuf_import_paths` (optional) - -In case you import other protocol-buffer definitions within your `.proto` file -(i.e. you use the `import` statement) you can use this option to specify paths -to search for the imported definition file(s). By default the imports are only -searched in `.` which is the current-working-directory, i.e. usually the -directory you are in when starting telegraf. - -Imagine you do have multiple protocol-buffer definitions (e.g. `A.proto`, -`B.proto` and `C.proto`) in a directory (e.g. `/data/my_proto_files`) where your -top-level file (e.g. `A.proto`) imports at least one other definition - -```protobuf -syntax = "proto3"; - -package foo; - -import "B.proto"; - -message Measurement { - ... -} -``` - -You should use the following setting - -```toml -[[inputs.file]] - files = ["example.dat"] - - data_format = "xpath_protobuf" - xpath_protobuf_file = "A.proto" - xpath_protobuf_type = "foo.Measurement" - xpath_protobuf_import_paths = [".", "/data/my_proto_files"] - - ... -``` - -#### `xpath_protobuf_skip_bytes` (optional) - -This option allows to skip a number of bytes before trying to parse -the protocol-buffer message. This is useful in cases where the raw data -has a header e.g. for the message length or in case of GRPC messages. - -This is a list of known headers and the corresponding values for -`xpath_protobuf_skip_bytes` - -| name | setting | comment | -| --------------------------------------- | ------- | ------- | -| [GRPC protocol][GRPC] | 5 | GRPC adds a 5-byte header for _Length-Prefixed-Messages_ | -| [PowerDNS logging][PDNS] | 2 | Sent messages contain a 2-byte header containing the message length | - -[GRPC]: https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md -[PDNS]: https://docs.powerdns.com/recursor/lua-config/protobuf.html - -## Configuration - -```toml -[[inputs.file]] - files = ["example.xml"] - - ## Data format to consume. 
- ## Each data format has its own unique set of configuration options, read - ## more about them here: - ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md - data_format = "xml" - - ## PROTOCOL-BUFFER definitions - ## Protocol-buffer definition file - # xpath_protobuf_file = "sparkplug_b.proto" - ## Name of the protocol-buffer message type to use in a fully qualified form. - # xpath_protobuf_type = "org.eclipse.tahu.protobuf.Payload" - ## List of paths to use when looking up imported protocol-buffer definition files. - # xpath_protobuf_import_paths = ["."] - ## Number of (header) bytes to ignore before parsing the message. - # xpath_protobuf_skip_bytes = 0 - - ## Print the internal XML document when in debug logging mode. - ## This is especially useful when using the parser with non-XML formats like protocol-buffers - ## to get an idea on the expression necessary to derive fields etc. - # xpath_print_document = false - - ## Allow the results of one of the parsing sections to be empty. - ## Useful when not all selected files have the exact same structure. - # xpath_allow_empty_selection = false - - ## Get native data-types for all data-format that contain type information. - ## Currently, protobuf, msgpack and JSON support native data-types - # xpath_native_types = false - - ## Multiple parsing sections are allowed - [[inputs.file.xpath]] - ## Optional: XPath-query to select a subset of nodes from the XML document. - # metric_selection = "/Bus/child::Sensor" - - ## Optional: XPath-query to set the metric (measurement) name. - # metric_name = "string('example')" - - ## Optional: Query to extract metric timestamp. - ## If not specified the time of execution is used. - # timestamp = "/Gateway/Timestamp" - ## Optional: Format of the timestamp determined by the query above. - ## This can be any of "unix", "unix_ms", "unix_us", "unix_ns" or a valid Golang - ## time format. If not specified, a "unix" timestamp (in seconds) is expected. - # timestamp_format = "2006-01-02T15:04:05Z" - ## Optional: Timezone of the parsed time - ## This will locate the parsed time to the given timezone. Please note that - ## for times with timezone-offsets (e.g. RFC3339) the timestamp is unchanged. - ## This is ignored for all (unix) timestamp formats. - # timezone = "UTC" - - ## Optional: List of fields to convert to hex-strings if they are - ## containing byte-arrays. This might be the case for e.g. protocol-buffer - ## messages encoding data as byte-arrays. Wildcard patterns are allowed. - ## By default, all byte-array-fields are converted to string. - # fields_bytes_as_hex = [] - - ## Tag definitions using the given XPath queries. - [inputs.file.xpath.tags] - name = "substring-after(Sensor/@name, ' ')" - device = "string('the ultimate sensor')" - - ## Integer field definitions using XPath queries. - [inputs.file.xpath.fields_int] - consumers = "Variable/@consumers" - - ## Non-integer field definitions using XPath queries. - ## The field type is defined using XPath expressions such as number(), boolean() or string(). If no conversion is performed the field will be of type string. - [inputs.file.xpath.fields] - temperature = "number(Variable/@temperature)" - power = "number(Variable/@power)" - frequency = "number(Variable/@frequency)" - ok = "Mode != 'ok'" -``` - -In this configuration mode, you explicitly specify the field and tags you want -to scrape from your data. 
- -A configuration can contain multiple _xpath_ subsections (for example, the file plugin -to process the xml-string multiple times). Consult the [XPath syntax][xpath] and -the [underlying library's functions][xpath lib] for details and help regarding -XPath queries. Consider using an XPath tester such as [xpather.com][xpather] or -[Code Beautify's XPath Tester][xpath tester] for help developing and debugging -your query. - -## Configuration (batch) - -Alternatively to the configuration above, fields can also be specified in a -batch way. So contrary to specify the fields in a section, you can define a -`name` and a `value` selector used to determine the name and value of the fields -in the metric. - -```toml -[[inputs.file]] - files = ["example.xml"] - - ## Data format to consume. - ## Each data format has its own unique set of configuration options, read - ## more about them here: - ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md - data_format = "xml" - - ## PROTOCOL-BUFFER definitions - ## Protocol-buffer definition file - # xpath_protobuf_file = "sparkplug_b.proto" - ## Name of the protocol-buffer message type to use in a fully qualified form. - # xpath_protobuf_type = "org.eclipse.tahu.protobuf.Payload" - ## List of paths to use when looking up imported protocol-buffer definition files. - # xpath_protobuf_import_paths = ["."] - - ## Print the internal XML document when in debug logging mode. - ## This is especially useful when using the parser with non-XML formats like protocol-buffers - ## to get an idea on the expression necessary to derive fields etc. - # xpath_print_document = false - - ## Allow the results of one of the parsing sections to be empty. - ## Useful when not all selected files have the exact same structure. - # xpath_allow_empty_selection = false - - ## Get native data-types for all data-format that contain type information. - ## Currently, protobuf, msgpack and JSON support native data-types - # xpath_native_types = false - - ## Multiple parsing sections are allowed - [[inputs.file.xpath]] - ## Optional: XPath-query to select a subset of nodes from the XML document. - metric_selection = "/Bus/child::Sensor" - - ## Optional: XPath-query to set the metric (measurement) name. - # metric_name = "string('example')" - - ## Optional: Query to extract metric timestamp. - ## If not specified the time of execution is used. - # timestamp = "/Gateway/Timestamp" - ## Optional: Format of the timestamp determined by the query above. - ## This can be any of "unix", "unix_ms", "unix_us", "unix_ns" or a valid Golang - ## time format. If not specified, a "unix" timestamp (in seconds) is expected. - # timestamp_format = "2006-01-02T15:04:05Z" - - ## Field specifications using a selector. - field_selection = "child::*" - ## Optional: Queries to specify field name and value. - ## These options are only to be used in combination with 'field_selection'! - ## By default the node name and node content is used if a field-selection - ## is specified. - # field_name = "name()" - # field_value = "." - - ## Optional: Expand field names relative to the selected node - ## This allows to flatten out nodes with non-unique names in the subtree - # field_name_expansion = false - - ## Tag specifications using a selector. - ## tag_selection = "child::*" - ## Optional: Queries to specify tag name and value. - ## These options are only to be used in combination with 'tag_selection'! - ## By default the node name and node content is used if a tag-selection - ## is specified. 
- # tag_name = "name()" - # tag_value = "." - - ## Optional: Expand tag names relative to the selected node - ## This allows to flatten out nodes with non-unique names in the subtree - # tag_name_expansion = false - - ## Tag definitions using the given XPath queries. - [inputs.file.xpath.tags] - name = "substring-after(Sensor/@name, ' ')" - device = "string('the ultimate sensor')" - -``` - -**Please note**: The resulting fields are _always_ of type string. - -It is also possible to specify a mixture of the two alternative ways of -specifying fields. In this case, _explicitly_ defined tags and fields take -_precedence_ over the batch instances if both use the same tag or field name. -### metric_selection (optional) - -You can specify a [XPath][xpath] query to select a subset of nodes from the XML -document, each used to generate a new metrics with the specified fields, tags -etc. - -Relative queries in subsequent queries are relative to the -`metric_selection`. To specify absolute paths, start the query with a -slash (`/`). - -Specifying `metric_selection` is optional. If not specified, all relative queries -are relative to the root node of the XML document. - -### metric_name (optional) - -By specifying `metric_name` you can override the metric/measurement name with -the result of the given [XPath][xpath] query. If not specified, the default -metric name is used. - -### timestamp, timestamp_format, timezone (optional) - -By default, the current time is used for all created metrics. To set the -time from values in the XML document you can specify a [XPath][xpath] query in -`timestamp` and set the format in `timestamp_format`. - -The `timestamp_format` can be set to `unix`, `unix_ms`, `unix_us`, `unix_ns`, or -an accepted [Go "reference time"][time const]. Consult the Go [time][time parse] -package for details and additional examples on how to set the time format. If -`timestamp_format` is omitted `unix` format is assumed as result of the -`timestamp` query. - -The `timezone` setting is used to locate the parsed time in the given -timezone. This is helpful for cases where the time does not contain timezone -information, e.g. `2023-03-09 14:04:40` and is not located in _UTC_, which is -the default setting. It is also possible to set the `timezone` to `Local` which -used the configured host timezone. - -For time formats with timezone information, e.g. RFC3339, the resulting -timestamp is unchanged. The `timezone` setting is ignored for all `unix` -timestamp formats. - -### tags sub-section - -[XPath][xpath] queries in the `tag name = query` format to add tags to the -metrics. The specified path can be absolute (starting with `/`) or -relative. Relative paths use the currently selected node as reference. - -__NOTE:__ Results of tag-queries will always be converted to strings. - -### fields_int sub-section - -[XPath][xpath] queries in the `field name = query` format to add integer typed -fields to the metrics. The specified path can be absolute (starting with `/`) or -relative. Relative paths use the currently selected node as reference. - -__NOTE:__ Results of field_int-queries will always be converted to -__int64__. The conversion fails in case the query result is not convertible. - -### fields sub-section - -[XPath][xpath] queries in the `field name = query` format to add non-integer -fields to the metrics. The specified path can be absolute (starting with `/`) or -relative. Relative paths use the currently selected node as reference. 

The type of the field is specified in the [XPath][xpath] query using the type
conversion functions of XPath such as `number()`, `boolean()` or `string()`. If
no conversion is performed in the query, the field will be of type string.

__NOTE: Path conversion functions always succeed even if you convert a text
to float.__

### field_selection, field_name, field_value (optional)

You can specify a [XPath][xpath] query to select a set of nodes forming the
fields of the metric. The specified path can be absolute (starting with `/`) or
relative to the currently selected node. Each node selected by `field_selection`
forms a new field within the metric.

The _name_ and the _value_ of each field can be specified using the optional
`field_name` and `field_value` queries. The queries are relative to the selected
field if not starting with `/`. If not specified, the field's _name_ defaults to
the node name and the field's _value_ defaults to the content of the selected
field node.

__NOTE__: `field_name` and `field_value` queries are only evaluated if a
`field_selection` is specified.

Specifying `field_selection` is optional. This is an alternative way to specify
fields, especially for documents where the node names are not known a priori or
if there is a large number of fields to be specified. These options can also be
combined with the field specifications above.

__NOTE: Path conversion functions always succeed even if you convert a text
to float.__

### field_name_expansion (optional)

When _true_, field names selected with `field_selection` are expanded to a
_path_ relative to the _selected node_. This is necessary if you select all
leaf nodes as fields and those leaf nodes do not have unique names. That is,
if the fields you select have duplicate names, set this option to `true`.

### tag_selection, tag_name, tag_value (optional)

You can specify a [XPath][xpath] query to select a set of nodes forming the tags
of the metric. The specified path can be absolute (starting with `/`) or
relative to the currently selected node. Each node selected by `tag_selection`
forms a new tag within the metric.

The _name_ and the _value_ of each tag can be specified using the optional
`tag_name` and `tag_value` queries. The queries are relative to the selected tag
if not starting with `/`. If not specified, the tag's _name_ defaults to the node
name and the tag's _value_ defaults to the content of the selected tag node.

__NOTE__: `tag_name` and `tag_value` queries are only evaluated if a
`tag_selection` is specified.

Specifying `tag_selection` is optional. This is an alternative way to specify
tags, especially for documents where the node names are not known a priori or if
there is a large number of tags to be specified. These options can also be
combined with the tag specifications above.

### tag_name_expansion (optional)

When _true_, tag names selected with `tag_selection` are expanded to a _path_
relative to the _selected node_. This is necessary if you, for example, select all leaf
nodes as tags and those leaf nodes do not have unique names. That is, if the
tags you select have duplicate names, set this option to `true`.

## Examples

This `example.xml` file is used in the configuration examples below:

```xml
<?xml version="1.0"?>
<Gateway>
  <Name>Main Gateway</Name>
  <Timestamp>2020-08-01T15:04:03Z</Timestamp>
  <Sequence>12</Sequence>
  <Status>ok</Status>
</Gateway>

<Bus>
  <Sensor name="Sensor Facility A">
    <Variable temperature="20.0"/>
    <Variable power="123.4"/>
    <Variable frequency="49.78"/>
    <Variable consumers="3"/>
    <Mode>busy</Mode>
  </Sensor>
  <Sensor name="Sensor Facility B">
    <Variable temperature="23.1"/>
    <Variable power="14.3"/>
    <Variable frequency="49.78"/>
    <Variable consumers="1"/>
    <Mode>standby</Mode>
  </Sensor>
  <Sensor name="Sensor Facility C">
    <Variable temperature="19.7"/>
    <Variable power="0.02"/>
    <Variable frequency="49.78"/>
    <Variable consumers="0"/>
    <Mode>error</Mode>
  </Sensor>
</Bus>
```

### Basic Parsing

This example shows the basic usage of the xml parser.
- -Config: - -```toml -[[inputs.file]] - files = ["example.xml"] - data_format = "xml" - - [[inputs.file.xpath]] - [inputs.file.xpath.tags] - gateway = "substring-before(/Gateway/Name, ' ')" - - [inputs.file.xpath.fields_int] - seqnr = "/Gateway/Sequence" - - [inputs.file.xpath.fields] - ok = "/Gateway/Status = 'ok'" -``` - -Output: - -```text -file,gateway=Main,host=Hugin seqnr=12i,ok=true 1598610830000000000 -``` - -In the _tags_ definition the XPath function `substring-before()` is used to only -extract the sub-string before the space. To get the integer value of -`/Gateway/Sequence` we have to use the _fields_int_ section as there is no XPath -expression to convert node values to integers (only float). - -The `ok` field is filled with a boolean by specifying a query comparing the -query result of `/Gateway/Status` with the string _ok_. Use the type conversions -available in the XPath syntax to specify field types. - -### Time and metric names - -This is an example for using time and name of the metric from the XML document -itself. - -Config: - -```toml -[[inputs.file]] - files = ["example.xml"] - data_format = "xml" - - [[inputs.file.xpath]] - metric_name = "name(/Gateway/Status)" - - timestamp = "/Gateway/Timestamp" - timestamp_format = "2006-01-02T15:04:05Z" - - [inputs.file.xpath.tags] - gateway = "substring-before(/Gateway/Name, ' ')" - - [inputs.file.xpath.fields] - ok = "/Gateway/Status = 'ok'" -``` - -Output: - -```text -Status,gateway=Main,host=Hugin ok=true 1596294243000000000 -``` - -Additionally to the basic parsing example, the metric name is defined as the -name of the `/Gateway/Status` node and the timestamp is derived from the XML -document instead of using the execution time. - -### Multi-node selection - -For XML documents containing metrics for e.g. multiple devices (like `Sensor`s -in the _example.xml_), multiple metrics can be generated using node -selection. This example shows how to generate a metric for each _Sensor_ in the -example. - -Config: - -```toml -[[inputs.file]] - files = ["example.xml"] - data_format = "xml" - - [[inputs.file.xpath]] - metric_selection = "/Bus/child::Sensor" - - metric_name = "string('sensors')" - - timestamp = "/Gateway/Timestamp" - timestamp_format = "2006-01-02T15:04:05Z" - - [inputs.file.xpath.tags] - name = "substring-after(@name, ' ')" - - [inputs.file.xpath.fields_int] - consumers = "Variable/@consumers" - - [inputs.file.xpath.fields] - temperature = "number(Variable/@temperature)" - power = "number(Variable/@power)" - frequency = "number(Variable/@frequency)" - ok = "Mode != 'error'" - -``` - -Output: - -```text -sensors,host=Hugin,name=Facility\ A consumers=3i,frequency=49.78,ok=true,power=123.4,temperature=20 1596294243000000000 -sensors,host=Hugin,name=Facility\ B consumers=1i,frequency=49.78,ok=true,power=14.3,temperature=23.1 1596294243000000000 -sensors,host=Hugin,name=Facility\ C consumers=0i,frequency=49.78,ok=false,power=0.02,temperature=19.7 1596294243000000000 -``` - -Using the `metric_selection` option we select all `Sensor` nodes in the XML -document. Please note that all field and tag definitions are relative to these -selected nodes. An exception is the timestamp definition which is relative to -the root node of the XML document. - -### Batch field processing with multi-node selection - -For XML documents containing metrics with a large number of fields or where the -fields are not known before (e.g. an unknown set of `Variable` nodes in the -_example.xml_), field selectors can be used. 
This example shows how to generate -a metric for each _Sensor_ in the example with fields derived from the -_Variable_ nodes. - -Config: - -```toml -[[inputs.file]] - files = ["example.xml"] - data_format = "xml" - - [[inputs.file.xpath]] - metric_selection = "/Bus/child::Sensor" - metric_name = "string('sensors')" - - timestamp = "/Gateway/Timestamp" - timestamp_format = "2006-01-02T15:04:05Z" - - field_selection = "child::Variable" - field_name = "name(@*[1])" - field_value = "number(@*[1])" - - [inputs.file.xpath.tags] - name = "substring-after(@name, ' ')" -``` - -Output: - -```text -sensors,host=Hugin,name=Facility\ A consumers=3,frequency=49.78,power=123.4,temperature=20 1596294243000000000 -sensors,host=Hugin,name=Facility\ B consumers=1,frequency=49.78,power=14.3,temperature=23.1 1596294243000000000 -sensors,host=Hugin,name=Facility\ C consumers=0,frequency=49.78,power=0.02,temperature=19.7 1596294243000000000 -``` - -Using the `metric_selection` option we select all `Sensor` nodes in the XML -document. For each _Sensor_ we then use `field_selection` to select all child -nodes of the sensor as _field-nodes_ Please note that the field selection is -relative to the selected nodes. For each selected _field-node_ we use -`field_name` and `field_value` to determining the field's name and value, -respectively. The `field_name` derives the name of the first attribute of the -node, while `field_value` derives the value of the first attribute and converts -the result to a number. - -[xpath lib]: https://github.com/antchfx/xpath -[json]: https://www.json.org/ -[msgpack]: https://msgpack.org/ -[protobuf]: https://developers.google.com/protocol-buffers -[xml]: https://www.w3.org/XML/ -[xpath]: https://www.w3.org/TR/xpath/ -[xpather]: http://xpather.com/ -[xpath tester]: https://codebeautify.org/Xpath-Tester -[time const]: https://golang.org/pkg/time/#pkg-constants -[time parse]: https://golang.org/pkg/time/#Parse diff --git a/content/telegraf/v1/data_formats/input/xpath_msgpack.md b/content/telegraf/v1/data_formats/input/xpath_msgpack.md deleted file mode 100644 index fbcf618bdb..0000000000 --- a/content/telegraf/v1/data_formats/input/xpath_msgpack.md +++ /dev/null @@ -1,629 +0,0 @@ ---- -title: XPath MessagePack input data format -list_title: XPath MessagePack -description: - Use the `xpath_msgpack` input data format and XPath expressions to parse MessagePack data into Telegraf metrics. -menu: - telegraf_v1_ref: - name: XPath MessagePack - weight: 10 - parent: Input data formats -metadata: [XPath parser plugin] ---- - -Use the `xpath_msgpack` input data format, provided by the [XPath parser plugin](https://github.com/influxdata/telegraf/tree/master/plugins/parsers/xpath), with XPath expressions to parse MessagePack data into Telegraf metrics. - -For information about supported XPath functions, see [the underlying XPath library][xpath lib]. - -**NOTE:** The type of fields are specified using [XPath functions][xpath -lib]. The only exceptions are _integer_ fields that need to be specified in a -`fields_int` section. 
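
As a minimal, hedged sketch (the file name and payload are invented, and the `xpath` sub-table layout is assumed to match the configuration shown below), a MessagePack map such as `{"value": 42}` could be consumed like this:

```toml
[[inputs.file]]
  files = ["example.msg"]        # assumed MessagePack-encoded file
  data_format = "xpath_msgpack"

  [[inputs.file.xpath]]
    ## MessagePack maps are exposed as a node tree, so ordinary
    ## XPath queries apply
    metric_name = "string('example')"

    [inputs.file.xpath.fields]
      value = "number(/value)"
```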
- -## Supported data formats - -| name | `data_format` setting | comment | -| --------------------------------------- | --------------------- | ------- | -| [Extensible Markup Language (XML)][xml] | `"xml"` | | -| [JSON][json] | `"xpath_json"` | | -| [MessagePack][msgpack] | `"xpath_msgpack"` | | -| [Protocol-buffers][protobuf] | `"xpath_protobuf"` | [see additional parameters](#protocol-buffers-additional-settings)| - -### Protocol-buffers additional settings - -For using the protocol-buffer format you need to specify additional -(_mandatory_) properties for the parser. Those options are described here. - -#### `xpath_protobuf_file` (mandatory) - -Use this option to specify the name of the protocol-buffer definition file -(`.proto`). - -#### `xpath_protobuf_type` (mandatory) - -This option contains the top-level message file to use for deserializing the -data to be parsed. Usually, this is constructed from the `package` name in the -protocol-buffer definition file and the `message` name as `.`. - -#### `xpath_protobuf_import_paths` (optional) - -In case you import other protocol-buffer definitions within your `.proto` file -(i.e. you use the `import` statement) you can use this option to specify paths -to search for the imported definition file(s). By default the imports are only -searched in `.` which is the current-working-directory, i.e. usually the -directory you are in when starting telegraf. - -Imagine you do have multiple protocol-buffer definitions (e.g. `A.proto`, -`B.proto` and `C.proto`) in a directory (e.g. `/data/my_proto_files`) where your -top-level file (e.g. `A.proto`) imports at least one other definition - -```protobuf -syntax = "proto3"; - -package foo; - -import "B.proto"; - -message Measurement { - ... -} -``` - -You should use the following setting - -```toml -[[inputs.file]] - files = ["example.dat"] - - data_format = "xpath_protobuf" - xpath_protobuf_file = "A.proto" - xpath_protobuf_type = "foo.Measurement" - xpath_protobuf_import_paths = [".", "/data/my_proto_files"] - - ... -``` - -#### `xpath_protobuf_skip_bytes` (optional) - -This option allows to skip a number of bytes before trying to parse -the protocol-buffer message. This is useful in cases where the raw data -has a header e.g. for the message length or in case of GRPC messages. - -This is a list of known headers and the corresponding values for -`xpath_protobuf_skip_bytes` - -| name | setting | comment | -| --------------------------------------- | ------- | ------- | -| [GRPC protocol][GRPC] | 5 | GRPC adds a 5-byte header for _Length-Prefixed-Messages_ | -| [PowerDNS logging][PDNS] | 2 | Sent messages contain a 2-byte header containing the message length | - -[GRPC]: https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md -[PDNS]: https://docs.powerdns.com/recursor/lua-config/protobuf.html - -## Configuration - -```toml -[[inputs.file]] - files = ["example.xml"] - - ## Data format to consume. - ## Each data format has its own unique set of configuration options, read - ## more about them here: - ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md - data_format = "xml" - - ## PROTOCOL-BUFFER definitions - ## Protocol-buffer definition file - # xpath_protobuf_file = "sparkplug_b.proto" - ## Name of the protocol-buffer message type to use in a fully qualified form. - # xpath_protobuf_type = "org.eclipse.tahu.protobuf.Payload" - ## List of paths to use when looking up imported protocol-buffer definition files. 
- # xpath_protobuf_import_paths = ["."] - ## Number of (header) bytes to ignore before parsing the message. - # xpath_protobuf_skip_bytes = 0 - - ## Print the internal XML document when in debug logging mode. - ## This is especially useful when using the parser with non-XML formats like protocol-buffers - ## to get an idea on the expression necessary to derive fields etc. - # xpath_print_document = false - - ## Allow the results of one of the parsing sections to be empty. - ## Useful when not all selected files have the exact same structure. - # xpath_allow_empty_selection = false - - ## Get native data-types for all data-format that contain type information. - ## Currently, protobuf, msgpack and JSON support native data-types - # xpath_native_types = false - - ## Multiple parsing sections are allowed - [[inputs.file.xpath]] - ## Optional: XPath-query to select a subset of nodes from the XML document. - # metric_selection = "/Bus/child::Sensor" - - ## Optional: XPath-query to set the metric (measurement) name. - # metric_name = "string('example')" - - ## Optional: Query to extract metric timestamp. - ## If not specified the time of execution is used. - # timestamp = "/Gateway/Timestamp" - ## Optional: Format of the timestamp determined by the query above. - ## This can be any of "unix", "unix_ms", "unix_us", "unix_ns" or a valid Golang - ## time format. If not specified, a "unix" timestamp (in seconds) is expected. - # timestamp_format = "2006-01-02T15:04:05Z" - ## Optional: Timezone of the parsed time - ## This will locate the parsed time to the given timezone. Please note that - ## for times with timezone-offsets (e.g. RFC3339) the timestamp is unchanged. - ## This is ignored for all (unix) timestamp formats. - # timezone = "UTC" - - ## Optional: List of fields to convert to hex-strings if they are - ## containing byte-arrays. This might be the case for e.g. protocol-buffer - ## messages encoding data as byte-arrays. Wildcard patterns are allowed. - ## By default, all byte-array-fields are converted to string. - # fields_bytes_as_hex = [] - - ## Tag definitions using the given XPath queries. - [inputs.file.xpath.tags] - name = "substring-after(Sensor/@name, ' ')" - device = "string('the ultimate sensor')" - - ## Integer field definitions using XPath queries. - [inputs.file.xpath.fields_int] - consumers = "Variable/@consumers" - - ## Non-integer field definitions using XPath queries. - ## The field type is defined using XPath expressions such as number(), boolean() or string(). If no conversion is performed the field will be of type string. - [inputs.file.xpath.fields] - temperature = "number(Variable/@temperature)" - power = "number(Variable/@power)" - frequency = "number(Variable/@frequency)" - ok = "Mode != 'ok'" -``` - -In this configuration mode, you explicitly specify the field and tags you want -to scrape from your data. - -A configuration can contain multiple _xpath_ subsections (for example, the file plugin -to process the xml-string multiple times). Consult the [XPath syntax][xpath] and -the [underlying library's functions][xpath lib] for details and help regarding -XPath queries. Consider using an XPath tester such as [xpather.com][xpather] or -[Code Beautify's XPath Tester][xpath tester] for help developing and debugging -your query. - -## Configuration (batch) - -Alternatively to the configuration above, fields can also be specified in a -batch way. 
Instead of specifying each field in its own section, you can define `name`
-and `value` selectors that determine the name and the value of each field
-in the metric.
-
-```toml
-[[inputs.file]]
-  files = ["example.xml"]
-
-  ## Data format to consume.
-  ## Each data format has its own unique set of configuration options, read
-  ## more about them here:
-  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
-  data_format = "xml"
-
-  ## PROTOCOL-BUFFER definitions
-  ## Protocol-buffer definition file
-  # xpath_protobuf_file = "sparkplug_b.proto"
-  ## Name of the protocol-buffer message type to use in a fully qualified form.
-  # xpath_protobuf_type = "org.eclipse.tahu.protobuf.Payload"
-  ## List of paths to use when looking up imported protocol-buffer definition files.
-  # xpath_protobuf_import_paths = ["."]
-
-  ## Print the internal XML document when in debug logging mode.
-  ## This is especially useful when using the parser with non-XML formats like protocol-buffers
-  ## to get an idea of the expressions necessary to derive fields etc.
-  # xpath_print_document = false
-
-  ## Allow the results of one of the parsing sections to be empty.
-  ## Useful when not all selected files have the exact same structure.
-  # xpath_allow_empty_selection = false
-
-  ## Get native data-types for all data formats that contain type information.
-  ## Currently, protobuf, msgpack and JSON support native data-types.
-  # xpath_native_types = false
-
-  ## Multiple parsing sections are allowed
-  [[inputs.file.xpath]]
-    ## Optional: XPath-query to select a subset of nodes from the XML document.
-    metric_selection = "/Bus/child::Sensor"
-
-    ## Optional: XPath-query to set the metric (measurement) name.
-    # metric_name = "string('example')"
-
-    ## Optional: Query to extract metric timestamp.
-    ## If not specified, the time of execution is used.
-    # timestamp = "/Gateway/Timestamp"
-    ## Optional: Format of the timestamp determined by the query above.
-    ## This can be any of "unix", "unix_ms", "unix_us", "unix_ns" or a valid Golang
-    ## time format. If not specified, a "unix" timestamp (in seconds) is expected.
-    # timestamp_format = "2006-01-02T15:04:05Z"
-
-    ## Field specifications using a selector.
-    field_selection = "child::*"
-    ## Optional: Queries to specify field name and value.
-    ## These options are only to be used in combination with 'field_selection'!
-    ## By default, the node name and node content are used if a field selection
-    ## is specified.
-    # field_name = "name()"
-    # field_value = "."
-
-    ## Optional: Expand field names relative to the selected node.
-    ## This allows flattening out nodes with non-unique names in the subtree.
-    # field_name_expansion = false
-
-    ## Tag specifications using a selector.
-    # tag_selection = "child::*"
-    ## Optional: Queries to specify tag name and value.
-    ## These options are only to be used in combination with 'tag_selection'!
-    ## By default, the node name and node content are used if a tag selection
-    ## is specified.
-    # tag_name = "name()"
-    # tag_value = "."
-
-    ## Optional: Expand tag names relative to the selected node.
-    ## This allows flattening out nodes with non-unique names in the subtree.
-    # tag_name_expansion = false
-
-    ## Tag definitions using the given XPath queries.
-    [inputs.file.xpath.tags]
-      name   = "substring-after(Sensor/@name, ' ')"
-      device = "string('the ultimate sensor')"
-```
-
-**Please note**: The resulting fields are _always_ of type string.
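-
-If you need numeric types downstream, one option is to convert the
-batch-parsed string fields afterwards, for example with Telegraf's
-`converter` processor. A minimal sketch (the field names follow the sensor
-example used later in this document and are illustrative):
-
-```toml
-[[processors.converter]]
-  ## Cast selected string fields to numeric types after parsing.
-  [processors.converter.fields]
-    integer = ["consumers"]
-    float   = ["temperature", "power"]
-```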
-
-It is also possible to specify a mixture of the two alternative ways of
-specifying fields. In this case, _explicitly_ defined tags and fields take
-_precedence_ over the batch instances if both use the same tag or field name.
-
-### metric_selection (optional)
-
-You can specify an [XPath][xpath] query to select a subset of nodes from the
-XML document, each of which is used to generate a new metric with the
-specified fields, tags, etc.
-
-Subsequent relative queries are evaluated relative to the nodes selected by
-`metric_selection`. To specify absolute paths, start the query with a
-slash (`/`).
-
-Specifying `metric_selection` is optional. If not specified, all relative
-queries are relative to the root node of the XML document.
-
-### metric_name (optional)
-
-By specifying `metric_name` you can override the metric/measurement name with
-the result of the given [XPath][xpath] query. If not specified, the default
-metric name is used.
-
-### timestamp, timestamp_format, timezone (optional)
-
-By default, the current time is used for all created metrics. To set the
-time from values in the XML document, specify an [XPath][xpath] query in
-`timestamp` and set the format in `timestamp_format`.
-
-The `timestamp_format` can be set to `unix`, `unix_ms`, `unix_us`, `unix_ns`, or
-an accepted [Go "reference time"][time const]. Consult the Go [time][time parse]
-package for details and additional examples on how to set the time format. If
-`timestamp_format` is omitted, the result of the `timestamp` query is assumed
-to be in `unix` format.
-
-The `timezone` setting locates the parsed time in the given timezone. This is
-helpful for cases where the time does not contain timezone information, e.g.
-`2023-03-09 14:04:40`, and is not located in _UTC_, which is the default
-setting. It is also possible to set `timezone` to `Local`, which uses the
-configured host timezone.
-
-For time formats with timezone information, e.g. RFC3339, the resulting
-timestamp is unchanged. The `timezone` setting is ignored for all `unix`
-timestamp formats.
-
-### tags sub-section
-
-[XPath][xpath] queries in the `tag name = query` format add tags to the
-metrics. The specified path can be absolute (starting with `/`) or
-relative. Relative paths use the currently selected node as reference.
-
-__NOTE:__ Results of tag queries are always converted to strings.
-
-### fields_int sub-section
-
-[XPath][xpath] queries in the `field name = query` format add integer-typed
-fields to the metrics. The specified path can be absolute (starting with `/`) or
-relative. Relative paths use the currently selected node as reference.
-
-__NOTE:__ Results of field_int queries are always converted to __int64__. The
-conversion fails if the query result is not convertible!
-
-### fields sub-section
-
-[XPath][xpath] queries in the `field name = query` format add non-integer
-fields to the metrics. The specified path can be absolute (starting with `/`) or
-relative. Relative paths use the currently selected node as reference.
-
-The type of the field is specified in the [XPath][xpath] query using the type
-conversion functions of XPath such as `number()`, `boolean()` or `string()`. If
-no conversion is performed in the query, the field will be of type string.
-
-__NOTE: XPath conversion functions always succeed, even if you convert text
-to a float!__
-### field_selection, field_name, field_value (optional)
-
-You can specify an [XPath][xpath] query to select a set of nodes forming the
-fields of the metric. The specified path can be absolute (starting with `/`) or
-relative to the currently selected node. Each node selected by `field_selection`
-forms a new field within the metric.
-
-The _name_ and the _value_ of each field can be specified using the optional
-`field_name` and `field_value` queries. The queries are relative to the selected
-field if not starting with `/`. If not specified, the field's _name_ defaults to
-the node name and the field's _value_ defaults to the content of the selected
-field node.
-
-__NOTE__: `field_name` and `field_value` queries are only evaluated if a
-`field_selection` is specified.
-
-Specifying `field_selection` is optional. It is an alternative way to specify
-fields, especially for documents where the node names are not known a priori or
-where a large number of fields must be specified. These options can also be
-combined with the field specifications above.
-
-__NOTE: XPath conversion functions always succeed, even if you convert text
-to a float!__
-
-### field_name_expansion (optional)
-
-When _true_, field names selected with `field_selection` are expanded to a
-_path_ relative to the _selected node_. This is necessary if, for example, you
-select all leaf nodes as fields and those leaf nodes do not have unique names.
-That is, if the fields you select contain duplicate names, set this option to
-`true`.
-
-### tag_selection, tag_name, tag_value (optional)
-
-You can specify an [XPath][xpath] query to select a set of nodes forming the tags
-of the metric. The specified path can be absolute (starting with `/`) or
-relative to the currently selected node. Each node selected by `tag_selection`
-forms a new tag within the metric.
-
-The _name_ and the _value_ of each tag can be specified using the optional
-`tag_name` and `tag_value` queries. The queries are relative to the selected tag
-if not starting with `/`. If not specified, the tag's _name_ defaults to the node
-name and the tag's _value_ defaults to the content of the selected tag node.
-
-__NOTE__: `tag_name` and `tag_value` queries are only evaluated if a
-`tag_selection` is specified.
-
-Specifying `tag_selection` is optional. It is an alternative way to specify
-tags, especially for documents where the node names are not known a priori or
-where a large number of tags must be specified. These options can also be
-combined with the tag specifications above.
-
-### tag_name_expansion (optional)
-
-When _true_, tag names selected with `tag_selection` are expanded to a _path_
-relative to the _selected node_. This is necessary if, for example, you select
-all leaf nodes as tags and those leaf nodes do not have unique names. That is,
-if the tags you select contain duplicate names, set this option to `true`.
-
-## Examples
-
-This `example.xml` file is used in the configuration examples below:
-
-```xml
-<?xml version="1.0"?>
-<Gateway>
-  <Name>Main Gateway</Name>
-  <Timestamp>2020-08-01T15:04:03Z</Timestamp>
-  <Sequence>12</Sequence>
-  <Status>ok</Status>
-</Gateway>
-
-<Bus>
-  <Sensor name="Sensor Facility A">
-    <Variable temperature="20.0"/>
-    <Variable power="123.4"/>
-    <Variable frequency="49.78"/>
-    <Variable consumers="3"/>
-    <Mode>busy</Mode>
-  </Sensor>
-  <Sensor name="Sensor Facility B">
-    <Variable temperature="23.1"/>
-    <Variable power="14.3"/>
-    <Variable frequency="49.78"/>
-    <Variable consumers="1"/>
-    <Mode>standby</Mode>
-  </Sensor>
-  <Sensor name="Sensor Facility C">
-    <Variable temperature="19.7"/>
-    <Variable power="0.02"/>
-    <Variable frequency="49.78"/>
-    <Variable consumers="0"/>
-    <Mode>error</Mode>
-  </Sensor>
-</Bus>
-```
-
-### Basic Parsing
-
-This example shows the basic usage of the XML parser.
- -Config: - -```toml -[[inputs.file]] - files = ["example.xml"] - data_format = "xml" - - [[inputs.file.xpath]] - [inputs.file.xpath.tags] - gateway = "substring-before(/Gateway/Name, ' ')" - - [inputs.file.xpath.fields_int] - seqnr = "/Gateway/Sequence" - - [inputs.file.xpath.fields] - ok = "/Gateway/Status = 'ok'" -``` - -Output: - -```text -file,gateway=Main,host=Hugin seqnr=12i,ok=true 1598610830000000000 -``` - -In the _tags_ definition the XPath function `substring-before()` is used to only -extract the sub-string before the space. To get the integer value of -`/Gateway/Sequence` we have to use the _fields_int_ section as there is no XPath -expression to convert node values to integers (only float). - -The `ok` field is filled with a boolean by specifying a query comparing the -query result of `/Gateway/Status` with the string _ok_. Use the type conversions -available in the XPath syntax to specify field types. - -### Time and metric names - -This is an example for using time and name of the metric from the XML document -itself. - -Config: - -```toml -[[inputs.file]] - files = ["example.xml"] - data_format = "xml" - - [[inputs.file.xpath]] - metric_name = "name(/Gateway/Status)" - - timestamp = "/Gateway/Timestamp" - timestamp_format = "2006-01-02T15:04:05Z" - - [inputs.file.xpath.tags] - gateway = "substring-before(/Gateway/Name, ' ')" - - [inputs.file.xpath.fields] - ok = "/Gateway/Status = 'ok'" -``` - -Output: - -```text -Status,gateway=Main,host=Hugin ok=true 1596294243000000000 -``` - -Additionally to the basic parsing example, the metric name is defined as the -name of the `/Gateway/Status` node and the timestamp is derived from the XML -document instead of using the execution time. - -### Multi-node selection - -For XML documents containing metrics for e.g. multiple devices (like `Sensor`s -in the _example.xml_), multiple metrics can be generated using node -selection. This example shows how to generate a metric for each _Sensor_ in the -example. - -Config: - -```toml -[[inputs.file]] - files = ["example.xml"] - data_format = "xml" - - [[inputs.file.xpath]] - metric_selection = "/Bus/child::Sensor" - - metric_name = "string('sensors')" - - timestamp = "/Gateway/Timestamp" - timestamp_format = "2006-01-02T15:04:05Z" - - [inputs.file.xpath.tags] - name = "substring-after(@name, ' ')" - - [inputs.file.xpath.fields_int] - consumers = "Variable/@consumers" - - [inputs.file.xpath.fields] - temperature = "number(Variable/@temperature)" - power = "number(Variable/@power)" - frequency = "number(Variable/@frequency)" - ok = "Mode != 'error'" - -``` - -Output: - -```text -sensors,host=Hugin,name=Facility\ A consumers=3i,frequency=49.78,ok=true,power=123.4,temperature=20 1596294243000000000 -sensors,host=Hugin,name=Facility\ B consumers=1i,frequency=49.78,ok=true,power=14.3,temperature=23.1 1596294243000000000 -sensors,host=Hugin,name=Facility\ C consumers=0i,frequency=49.78,ok=false,power=0.02,temperature=19.7 1596294243000000000 -``` - -Using the `metric_selection` option we select all `Sensor` nodes in the XML -document. Please note that all field and tag definitions are relative to these -selected nodes. An exception is the timestamp definition which is relative to -the root node of the XML document. - -### Batch field processing with multi-node selection - -For XML documents containing metrics with a large number of fields or where the -fields are not known before (e.g. an unknown set of `Variable` nodes in the -_example.xml_), field selectors can be used. 
This example shows how to generate a metric for each _Sensor_ in the example
-with fields derived from the _Variable_ nodes.
-
-Config:
-
-```toml
-[[inputs.file]]
-  files = ["example.xml"]
-  data_format = "xml"
-
-  [[inputs.file.xpath]]
-    metric_selection = "/Bus/child::Sensor"
-    metric_name = "string('sensors')"
-
-    timestamp = "/Gateway/Timestamp"
-    timestamp_format = "2006-01-02T15:04:05Z"
-
-    field_selection = "child::Variable"
-    field_name = "name(@*[1])"
-    field_value = "number(@*[1])"
-
-    [inputs.file.xpath.tags]
-      name = "substring-after(@name, ' ')"
-```
-
-Output:
-
-```text
-sensors,host=Hugin,name=Facility\ A consumers=3,frequency=49.78,power=123.4,temperature=20 1596294243000000000
-sensors,host=Hugin,name=Facility\ B consumers=1,frequency=49.78,power=14.3,temperature=23.1 1596294243000000000
-sensors,host=Hugin,name=Facility\ C consumers=0,frequency=49.78,power=0.02,temperature=19.7 1596294243000000000
-```
-
-Using the `metric_selection` option, we select all `Sensor` nodes in the XML
-document. For each _Sensor_, we then use `field_selection` to select all child
-nodes of the sensor as _field-nodes_. Please note that the field selection is
-relative to the selected nodes. For each selected _field-node_, we use
-`field_name` and `field_value` to determine the field's name and value,
-respectively. The `field_name` query derives the name of the node's first
-attribute, while `field_value` derives the value of that attribute and
-converts the result to a number.
-
-[xpath lib]: https://github.com/antchfx/xpath
-[json]: https://www.json.org/
-[msgpack]: https://msgpack.org/
-[protobuf]: https://developers.google.com/protocol-buffers
-[xml]: https://www.w3.org/XML/
-[xpath]: https://www.w3.org/TR/xpath/
-[xpather]: http://xpather.com/
-[xpath tester]: https://codebeautify.org/Xpath-Tester
-[time const]: https://golang.org/pkg/time/#pkg-constants
-[time parse]: https://golang.org/pkg/time/#Parse
diff --git a/content/telegraf/v1/data_formats/input/xpath_protobuf.md b/content/telegraf/v1/data_formats/input/xpath_protobuf.md
deleted file mode 100644
index 214fad6631..0000000000
--- a/content/telegraf/v1/data_formats/input/xpath_protobuf.md
+++ /dev/null
@@ -1,629 +0,0 @@
----
-title: XPath Protocol Buffers input data format
-list_title: XPath Protocol Buffers
-description:
-  Use the `xpath_protobuf` input data format and XPath expressions to parse protobuf (Protocol Buffer) data into Telegraf metrics.
-menu:
-  telegraf_v1_ref:
-    name: XPath Protocol Buffers
-    weight: 10
-    parent: Input data formats
-metadata: [XPath parser plugin]
----
-
-Use the `xpath_protobuf` input data format, provided by the [XPath parser plugin](https://github.com/influxdata/telegraf/tree/master/plugins/parsers/xpath), with XPath expressions to parse Protocol Buffer data into Telegraf metrics.
-
-For information about supported XPath functions, see [the underlying XPath library][xpath lib].
-
-**NOTE:** The types of fields are specified using [XPath functions][xpath
-lib]. The only exception is _integer_ fields, which need to be specified in a
-`fields_int` section.
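-
-As a minimal sketch of the two sections (the node and attribute names here
-are illustrative, not part of a real schema):
-
-```toml
-  [inputs.file.xpath.fields_int]
-    ## Always converted to int64.
-    count = "Device/@count"
-
-  [inputs.file.xpath.fields]
-    ## Typed via XPath conversion functions; string if no conversion is used.
-    load  = "number(Device/@load)"
-    state = "Device/@state"
-```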
- -## Supported data formats - -| name | `data_format` setting | comment | -| --------------------------------------- | --------------------- | ------- | -| [Extensible Markup Language (XML)][xml] | `"xml"` | | -| [JSON][json] | `"xpath_json"` | | -| [MessagePack][msgpack] | `"xpath_msgpack"` | | -| [Protocol-buffers][protobuf] | `"xpath_protobuf"` | [see additional parameters](#protocol-buffers-additional-settings)| - -### Protocol-buffers additional settings - -For using the protocol-buffer format you need to specify additional -(_mandatory_) properties for the parser. Those options are described here. - -#### `xpath_protobuf_file` (mandatory) - -Use this option to specify the name of the protocol-buffer definition file -(`.proto`). - -#### `xpath_protobuf_type` (mandatory) - -This option contains the top-level message file to use for deserializing the -data to be parsed. Usually, this is constructed from the `package` name in the -protocol-buffer definition file and the `message` name as `.`. - -#### `xpath_protobuf_import_paths` (optional) - -In case you import other protocol-buffer definitions within your `.proto` file -(i.e. you use the `import` statement) you can use this option to specify paths -to search for the imported definition file(s). By default the imports are only -searched in `.` which is the current-working-directory, i.e. usually the -directory you are in when starting telegraf. - -Imagine you do have multiple protocol-buffer definitions (e.g. `A.proto`, -`B.proto` and `C.proto`) in a directory (e.g. `/data/my_proto_files`) where your -top-level file (e.g. `A.proto`) imports at least one other definition - -```protobuf -syntax = "proto3"; - -package foo; - -import "B.proto"; - -message Measurement { - ... -} -``` - -You should use the following setting - -```toml -[[inputs.file]] - files = ["example.dat"] - - data_format = "xpath_protobuf" - xpath_protobuf_file = "A.proto" - xpath_protobuf_type = "foo.Measurement" - xpath_protobuf_import_paths = [".", "/data/my_proto_files"] - - ... -``` - -#### `xpath_protobuf_skip_bytes` (optional) - -This option allows to skip a number of bytes before trying to parse -the protocol-buffer message. This is useful in cases where the raw data -has a header e.g. for the message length or in case of GRPC messages. - -This is a list of known headers and the corresponding values for -`xpath_protobuf_skip_bytes` - -| name | setting | comment | -| --------------------------------------- | ------- | ------- | -| [GRPC protocol][GRPC] | 5 | GRPC adds a 5-byte header for _Length-Prefixed-Messages_ | -| [PowerDNS logging][PDNS] | 2 | Sent messages contain a 2-byte header containing the message length | - -[GRPC]: https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md -[PDNS]: https://docs.powerdns.com/recursor/lua-config/protobuf.html - -## Configuration - -```toml -[[inputs.file]] - files = ["example.xml"] - - ## Data format to consume. - ## Each data format has its own unique set of configuration options, read - ## more about them here: - ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md - data_format = "xml" - - ## PROTOCOL-BUFFER definitions - ## Protocol-buffer definition file - # xpath_protobuf_file = "sparkplug_b.proto" - ## Name of the protocol-buffer message type to use in a fully qualified form. - # xpath_protobuf_type = "org.eclipse.tahu.protobuf.Payload" - ## List of paths to use when looking up imported protocol-buffer definition files. 
- # xpath_protobuf_import_paths = ["."] - ## Number of (header) bytes to ignore before parsing the message. - # xpath_protobuf_skip_bytes = 0 - - ## Print the internal XML document when in debug logging mode. - ## This is especially useful when using the parser with non-XML formats like protocol-buffers - ## to get an idea on the expression necessary to derive fields etc. - # xpath_print_document = false - - ## Allow the results of one of the parsing sections to be empty. - ## Useful when not all selected files have the exact same structure. - # xpath_allow_empty_selection = false - - ## Get native data-types for all data-format that contain type information. - ## Currently, protobuf, msgpack and JSON support native data-types - # xpath_native_types = false - - ## Multiple parsing sections are allowed - [[inputs.file.xpath]] - ## Optional: XPath-query to select a subset of nodes from the XML document. - # metric_selection = "/Bus/child::Sensor" - - ## Optional: XPath-query to set the metric (measurement) name. - # metric_name = "string('example')" - - ## Optional: Query to extract metric timestamp. - ## If not specified the time of execution is used. - # timestamp = "/Gateway/Timestamp" - ## Optional: Format of the timestamp determined by the query above. - ## This can be any of "unix", "unix_ms", "unix_us", "unix_ns" or a valid Golang - ## time format. If not specified, a "unix" timestamp (in seconds) is expected. - # timestamp_format = "2006-01-02T15:04:05Z" - ## Optional: Timezone of the parsed time - ## This will locate the parsed time to the given timezone. Please note that - ## for times with timezone-offsets (e.g. RFC3339) the timestamp is unchanged. - ## This is ignored for all (unix) timestamp formats. - # timezone = "UTC" - - ## Optional: List of fields to convert to hex-strings if they are - ## containing byte-arrays. This might be the case for e.g. protocol-buffer - ## messages encoding data as byte-arrays. Wildcard patterns are allowed. - ## By default, all byte-array-fields are converted to string. - # fields_bytes_as_hex = [] - - ## Tag definitions using the given XPath queries. - [inputs.file.xpath.tags] - name = "substring-after(Sensor/@name, ' ')" - device = "string('the ultimate sensor')" - - ## Integer field definitions using XPath queries. - [inputs.file.xpath.fields_int] - consumers = "Variable/@consumers" - - ## Non-integer field definitions using XPath queries. - ## The field type is defined using XPath expressions such as number(), boolean() or string(). If no conversion is performed the field will be of type string. - [inputs.file.xpath.fields] - temperature = "number(Variable/@temperature)" - power = "number(Variable/@power)" - frequency = "number(Variable/@frequency)" - ok = "Mode != 'ok'" -``` - -In this configuration mode, you explicitly specify the field and tags you want -to scrape from your data. - -A configuration can contain multiple _xpath_ subsections (for example, the file plugin -to process the xml-string multiple times). Consult the [XPath syntax][xpath] and -the [underlying library's functions][xpath lib] for details and help regarding -XPath queries. Consider using an XPath tester such as [xpather.com][xpather] or -[Code Beautify's XPath Tester][xpath tester] for help developing and debugging -your query. - -## Configuration (batch) - -Alternatively to the configuration above, fields can also be specified in a -batch way. 
Instead of specifying each field in its own section, you can define `name`
-and `value` selectors that determine the name and the value of each field
-in the metric.
-
-```toml
-[[inputs.file]]
-  files = ["example.xml"]
-
-  ## Data format to consume.
-  ## Each data format has its own unique set of configuration options, read
-  ## more about them here:
-  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
-  data_format = "xml"
-
-  ## PROTOCOL-BUFFER definitions
-  ## Protocol-buffer definition file
-  # xpath_protobuf_file = "sparkplug_b.proto"
-  ## Name of the protocol-buffer message type to use in a fully qualified form.
-  # xpath_protobuf_type = "org.eclipse.tahu.protobuf.Payload"
-  ## List of paths to use when looking up imported protocol-buffer definition files.
-  # xpath_protobuf_import_paths = ["."]
-
-  ## Print the internal XML document when in debug logging mode.
-  ## This is especially useful when using the parser with non-XML formats like protocol-buffers
-  ## to get an idea of the expressions necessary to derive fields etc.
-  # xpath_print_document = false
-
-  ## Allow the results of one of the parsing sections to be empty.
-  ## Useful when not all selected files have the exact same structure.
-  # xpath_allow_empty_selection = false
-
-  ## Get native data-types for all data formats that contain type information.
-  ## Currently, protobuf, msgpack and JSON support native data-types.
-  # xpath_native_types = false
-
-  ## Multiple parsing sections are allowed
-  [[inputs.file.xpath]]
-    ## Optional: XPath-query to select a subset of nodes from the XML document.
-    metric_selection = "/Bus/child::Sensor"
-
-    ## Optional: XPath-query to set the metric (measurement) name.
-    # metric_name = "string('example')"
-
-    ## Optional: Query to extract metric timestamp.
-    ## If not specified, the time of execution is used.
-    # timestamp = "/Gateway/Timestamp"
-    ## Optional: Format of the timestamp determined by the query above.
-    ## This can be any of "unix", "unix_ms", "unix_us", "unix_ns" or a valid Golang
-    ## time format. If not specified, a "unix" timestamp (in seconds) is expected.
-    # timestamp_format = "2006-01-02T15:04:05Z"
-
-    ## Field specifications using a selector.
-    field_selection = "child::*"
-    ## Optional: Queries to specify field name and value.
-    ## These options are only to be used in combination with 'field_selection'!
-    ## By default, the node name and node content are used if a field selection
-    ## is specified.
-    # field_name = "name()"
-    # field_value = "."
-
-    ## Optional: Expand field names relative to the selected node.
-    ## This allows flattening out nodes with non-unique names in the subtree.
-    # field_name_expansion = false
-
-    ## Tag specifications using a selector.
-    # tag_selection = "child::*"
-    ## Optional: Queries to specify tag name and value.
-    ## These options are only to be used in combination with 'tag_selection'!
-    ## By default, the node name and node content are used if a tag selection
-    ## is specified.
-    # tag_name = "name()"
-    # tag_value = "."
-
-    ## Optional: Expand tag names relative to the selected node.
-    ## This allows flattening out nodes with non-unique names in the subtree.
-    # tag_name_expansion = false
-
-    ## Tag definitions using the given XPath queries.
-    [inputs.file.xpath.tags]
-      name   = "substring-after(Sensor/@name, ' ')"
-      device = "string('the ultimate sensor')"
-```
-
-**Please note**: The resulting fields are _always_ of type string.
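-
-Batch selectors can also be combined with explicitly defined fields, as the
-next paragraph describes. A minimal sketch (the selectors and field name are
-illustrative):
-
-```toml
-  [[inputs.file.xpath]]
-    ## Batch mode: every child node becomes a string field...
-    field_selection = "child::*"
-
-    ## ...while an explicitly defined field of the same name takes precedence
-    ## and keeps its queried type.
-    [inputs.file.xpath.fields]
-      temperature = "number(Variable/@temperature)"
-```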
-
-It is also possible to specify a mixture of the two alternative ways of
-specifying fields. In this case, _explicitly_ defined tags and fields take
-_precedence_ over the batch instances if both use the same tag or field name.
-
-### metric_selection (optional)
-
-You can specify an [XPath][xpath] query to select a subset of nodes from the
-XML document, each of which is used to generate a new metric with the
-specified fields, tags, etc.
-
-Subsequent relative queries are evaluated relative to the nodes selected by
-`metric_selection`. To specify absolute paths, start the query with a
-slash (`/`).
-
-Specifying `metric_selection` is optional. If not specified, all relative
-queries are relative to the root node of the XML document.
-
-### metric_name (optional)
-
-By specifying `metric_name` you can override the metric/measurement name with
-the result of the given [XPath][xpath] query. If not specified, the default
-metric name is used.
-
-### timestamp, timestamp_format, timezone (optional)
-
-By default, the current time is used for all created metrics. To set the
-time from values in the XML document, specify an [XPath][xpath] query in
-`timestamp` and set the format in `timestamp_format`.
-
-The `timestamp_format` can be set to `unix`, `unix_ms`, `unix_us`, `unix_ns`, or
-an accepted [Go "reference time"][time const]. Consult the Go [time][time parse]
-package for details and additional examples on how to set the time format. If
-`timestamp_format` is omitted, the result of the `timestamp` query is assumed
-to be in `unix` format.
-
-The `timezone` setting locates the parsed time in the given timezone. This is
-helpful for cases where the time does not contain timezone information, e.g.
-`2023-03-09 14:04:40`, and is not located in _UTC_, which is the default
-setting. It is also possible to set `timezone` to `Local`, which uses the
-configured host timezone.
-
-For time formats with timezone information, e.g. RFC3339, the resulting
-timestamp is unchanged. The `timezone` setting is ignored for all `unix`
-timestamp formats.
-
-### tags sub-section
-
-[XPath][xpath] queries in the `tag name = query` format add tags to the
-metrics. The specified path can be absolute (starting with `/`) or
-relative. Relative paths use the currently selected node as reference.
-
-__NOTE:__ Results of tag queries are always converted to strings.
-
-### fields_int sub-section
-
-[XPath][xpath] queries in the `field name = query` format add integer-typed
-fields to the metrics. The specified path can be absolute (starting with `/`) or
-relative. Relative paths use the currently selected node as reference.
-
-__NOTE:__ Results of field_int queries are always converted to __int64__. The
-conversion fails if the query result is not convertible!
-
-### fields sub-section
-
-[XPath][xpath] queries in the `field name = query` format add non-integer
-fields to the metrics. The specified path can be absolute (starting with `/`) or
-relative. Relative paths use the currently selected node as reference.
-
-The type of the field is specified in the [XPath][xpath] query using the type
-conversion functions of XPath such as `number()`, `boolean()` or `string()`. If
-no conversion is performed in the query, the field will be of type string.
-
-__NOTE: XPath conversion functions always succeed, even if you convert text
-to a float!__
-### field_selection, field_name, field_value (optional)
-
-You can specify an [XPath][xpath] query to select a set of nodes forming the
-fields of the metric. The specified path can be absolute (starting with `/`) or
-relative to the currently selected node. Each node selected by `field_selection`
-forms a new field within the metric.
-
-The _name_ and the _value_ of each field can be specified using the optional
-`field_name` and `field_value` queries. The queries are relative to the selected
-field if not starting with `/`. If not specified, the field's _name_ defaults to
-the node name and the field's _value_ defaults to the content of the selected
-field node.
-
-__NOTE__: `field_name` and `field_value` queries are only evaluated if a
-`field_selection` is specified.
-
-Specifying `field_selection` is optional. It is an alternative way to specify
-fields, especially for documents where the node names are not known a priori or
-where a large number of fields must be specified. These options can also be
-combined with the field specifications above.
-
-__NOTE: XPath conversion functions always succeed, even if you convert text
-to a float!__
-
-### field_name_expansion (optional)
-
-When _true_, field names selected with `field_selection` are expanded to a
-_path_ relative to the _selected node_. This is necessary if, for example, you
-select all leaf nodes as fields and those leaf nodes do not have unique names.
-That is, if the fields you select contain duplicate names, set this option to
-`true`.
-
-### tag_selection, tag_name, tag_value (optional)
-
-You can specify an [XPath][xpath] query to select a set of nodes forming the tags
-of the metric. The specified path can be absolute (starting with `/`) or
-relative to the currently selected node. Each node selected by `tag_selection`
-forms a new tag within the metric.
-
-The _name_ and the _value_ of each tag can be specified using the optional
-`tag_name` and `tag_value` queries. The queries are relative to the selected tag
-if not starting with `/`. If not specified, the tag's _name_ defaults to the node
-name and the tag's _value_ defaults to the content of the selected tag node.
-
-__NOTE__: `tag_name` and `tag_value` queries are only evaluated if a
-`tag_selection` is specified.
-
-Specifying `tag_selection` is optional. It is an alternative way to specify
-tags, especially for documents where the node names are not known a priori or
-where a large number of tags must be specified. These options can also be
-combined with the tag specifications above.
-
-### tag_name_expansion (optional)
-
-When _true_, tag names selected with `tag_selection` are expanded to a _path_
-relative to the _selected node_. This is necessary if, for example, you select
-all leaf nodes as tags and those leaf nodes do not have unique names. That is,
-if the tags you select contain duplicate names, set this option to `true`.
-
-## Examples
-
-This `example.xml` file is used in the configuration examples below:
-
-```xml
-<?xml version="1.0"?>
-<Gateway>
-  <Name>Main Gateway</Name>
-  <Timestamp>2020-08-01T15:04:03Z</Timestamp>
-  <Sequence>12</Sequence>
-  <Status>ok</Status>
-</Gateway>
-
-<Bus>
-  <Sensor name="Sensor Facility A">
-    <Variable temperature="20.0"/>
-    <Variable power="123.4"/>
-    <Variable frequency="49.78"/>
-    <Variable consumers="3"/>
-    <Mode>busy</Mode>
-  </Sensor>
-  <Sensor name="Sensor Facility B">
-    <Variable temperature="23.1"/>
-    <Variable power="14.3"/>
-    <Variable frequency="49.78"/>
-    <Variable consumers="1"/>
-    <Mode>standby</Mode>
-  </Sensor>
-  <Sensor name="Sensor Facility C">
-    <Variable temperature="19.7"/>
-    <Variable power="0.02"/>
-    <Variable frequency="49.78"/>
-    <Variable consumers="0"/>
-    <Mode>error</Mode>
-  </Sensor>
-</Bus>
-```
-
-### Basic Parsing
-
-This example shows the basic usage of the XML parser.
- -Config: - -```toml -[[inputs.file]] - files = ["example.xml"] - data_format = "xml" - - [[inputs.file.xpath]] - [inputs.file.xpath.tags] - gateway = "substring-before(/Gateway/Name, ' ')" - - [inputs.file.xpath.fields_int] - seqnr = "/Gateway/Sequence" - - [inputs.file.xpath.fields] - ok = "/Gateway/Status = 'ok'" -``` - -Output: - -```text -file,gateway=Main,host=Hugin seqnr=12i,ok=true 1598610830000000000 -``` - -In the _tags_ definition the XPath function `substring-before()` is used to only -extract the sub-string before the space. To get the integer value of -`/Gateway/Sequence` we have to use the _fields_int_ section as there is no XPath -expression to convert node values to integers (only float). - -The `ok` field is filled with a boolean by specifying a query comparing the -query result of `/Gateway/Status` with the string _ok_. Use the type conversions -available in the XPath syntax to specify field types. - -### Time and metric names - -This is an example for using time and name of the metric from the XML document -itself. - -Config: - -```toml -[[inputs.file]] - files = ["example.xml"] - data_format = "xml" - - [[inputs.file.xpath]] - metric_name = "name(/Gateway/Status)" - - timestamp = "/Gateway/Timestamp" - timestamp_format = "2006-01-02T15:04:05Z" - - [inputs.file.xpath.tags] - gateway = "substring-before(/Gateway/Name, ' ')" - - [inputs.file.xpath.fields] - ok = "/Gateway/Status = 'ok'" -``` - -Output: - -```text -Status,gateway=Main,host=Hugin ok=true 1596294243000000000 -``` - -Additionally to the basic parsing example, the metric name is defined as the -name of the `/Gateway/Status` node and the timestamp is derived from the XML -document instead of using the execution time. - -### Multi-node selection - -For XML documents containing metrics for e.g. multiple devices (like `Sensor`s -in the _example.xml_), multiple metrics can be generated using node -selection. This example shows how to generate a metric for each _Sensor_ in the -example. - -Config: - -```toml -[[inputs.file]] - files = ["example.xml"] - data_format = "xml" - - [[inputs.file.xpath]] - metric_selection = "/Bus/child::Sensor" - - metric_name = "string('sensors')" - - timestamp = "/Gateway/Timestamp" - timestamp_format = "2006-01-02T15:04:05Z" - - [inputs.file.xpath.tags] - name = "substring-after(@name, ' ')" - - [inputs.file.xpath.fields_int] - consumers = "Variable/@consumers" - - [inputs.file.xpath.fields] - temperature = "number(Variable/@temperature)" - power = "number(Variable/@power)" - frequency = "number(Variable/@frequency)" - ok = "Mode != 'error'" - -``` - -Output: - -```text -sensors,host=Hugin,name=Facility\ A consumers=3i,frequency=49.78,ok=true,power=123.4,temperature=20 1596294243000000000 -sensors,host=Hugin,name=Facility\ B consumers=1i,frequency=49.78,ok=true,power=14.3,temperature=23.1 1596294243000000000 -sensors,host=Hugin,name=Facility\ C consumers=0i,frequency=49.78,ok=false,power=0.02,temperature=19.7 1596294243000000000 -``` - -Using the `metric_selection` option we select all `Sensor` nodes in the XML -document. Please note that all field and tag definitions are relative to these -selected nodes. An exception is the timestamp definition which is relative to -the root node of the XML document. - -### Batch field processing with multi-node selection - -For XML documents containing metrics with a large number of fields or where the -fields are not known before (e.g. an unknown set of `Variable` nodes in the -_example.xml_), field selectors can be used. 
This example shows how to generate a metric for each _Sensor_ in the example
-with fields derived from the _Variable_ nodes.
-
-Config:
-
-```toml
-[[inputs.file]]
-  files = ["example.xml"]
-  data_format = "xml"
-
-  [[inputs.file.xpath]]
-    metric_selection = "/Bus/child::Sensor"
-    metric_name = "string('sensors')"
-
-    timestamp = "/Gateway/Timestamp"
-    timestamp_format = "2006-01-02T15:04:05Z"
-
-    field_selection = "child::Variable"
-    field_name = "name(@*[1])"
-    field_value = "number(@*[1])"
-
-    [inputs.file.xpath.tags]
-      name = "substring-after(@name, ' ')"
-```
-
-Output:
-
-```text
-sensors,host=Hugin,name=Facility\ A consumers=3,frequency=49.78,power=123.4,temperature=20 1596294243000000000
-sensors,host=Hugin,name=Facility\ B consumers=1,frequency=49.78,power=14.3,temperature=23.1 1596294243000000000
-sensors,host=Hugin,name=Facility\ C consumers=0,frequency=49.78,power=0.02,temperature=19.7 1596294243000000000
-```
-
-Using the `metric_selection` option, we select all `Sensor` nodes in the XML
-document. For each _Sensor_, we then use `field_selection` to select all child
-nodes of the sensor as _field-nodes_. Please note that the field selection is
-relative to the selected nodes. For each selected _field-node_, we use
-`field_name` and `field_value` to determine the field's name and value,
-respectively. The `field_name` query derives the name of the node's first
-attribute, while `field_value` derives the value of that attribute and
-converts the result to a number.
-
-[xpath lib]: https://github.com/antchfx/xpath
-[json]: https://www.json.org/
-[msgpack]: https://msgpack.org/
-[protobuf]: https://developers.google.com/protocol-buffers
-[xml]: https://www.w3.org/XML/
-[xpath]: https://www.w3.org/TR/xpath/
-[xpather]: http://xpather.com/
-[xpath tester]: https://codebeautify.org/Xpath-Tester
-[time const]: https://golang.org/pkg/time/#pkg-constants
-[time parse]: https://golang.org/pkg/time/#Parse
diff --git a/content/telegraf/v1/data_formats/output/carbon2.md b/content/telegraf/v1/data_formats/output/carbon2.md
deleted file mode 100644
index d592362f5f..0000000000
--- a/content/telegraf/v1/data_formats/output/carbon2.md
+++ /dev/null
@@ -1,61 +0,0 @@
----
-title: Carbon2 output data format
-list_title: Carbon2
-description: Use the `carbon2` output data format (serializer) to format and output Telegraf metrics as Carbon2 format.
-menu:
-  telegraf_v1_ref:
-    name: Carbon2
-    weight: 10
-    parent: Output data formats
----
-
-Use the `carbon2` output data format (serializer) to format and output Telegraf metrics as [Carbon2 format](http://metrics20.org/implementations/).
-
-### Configuration
-
-```toml
-[[outputs.file]]
-  ## Files to write to, "stdout" is a specially handled file.
-  files = ["stdout", "/tmp/metrics.out"]
-
-  ## Data format to output.
-  ## Each data format has its own unique set of configuration options, read
-  ## more about them here:
-  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md
-  data_format = "carbon2"
-```
-
-Standard form:
-
-```
-metric=name field=field_1 host=foo 30 1234567890
-metric=name field=field_2 host=foo 4 1234567890
-metric=name field=field_N host=foo 59 1234567890
-```
-
-### Metrics
-
-The serializer converts each Telegraf metric into one Carbon2 metric per field,
-creating the `intrinsic_tags` from the combination of the metric name and the
-field name. So, if one Telegraf metric has 4 fields, the `carbon2` output will
-be 4 separate metrics. A `metric` tag holds the name of the metric and a
-`field` tag holds the name of the field.
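-
-For instance, a single metric with four fields would be expected to serialize
-to four lines in the standard form above (an illustrative sketch, not taken
-from the plugin docs):
-
-```
-disk,host=foo free=90,inodes=100,total=1000,used=10 1234567890
-=>
-metric=disk field=free host=foo 90 1234567890
-metric=disk field=inodes host=foo 100 1234567890
-metric=disk field=total host=foo 1000 1234567890
-metric=disk field=used host=foo 10 1234567890
-```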
-
-### Example
-
-If we take the following InfluxDB Line Protocol:
-
-```
-weather,location=us-midwest,season=summer temperature=82,wind=100 1234567890
-```
-
-After serializing in Carbon2, the result would be:
-
-```
-metric=weather field=temperature location=us-midwest season=summer 82 1234567890
-metric=weather field=wind location=us-midwest season=summer 100 1234567890
-```
-
-### Fields and tags with spaces
-
-When a field key or a tag key or value contains spaces, the spaces are replaced with `_`.
-
-### Tags with empty values
-
-When a tag's value is empty, it is replaced with `null`.
diff --git a/content/telegraf/v1/data_formats/output/graphite.md b/content/telegraf/v1/data_formats/output/graphite.md
deleted file mode 100644
index 0bdeb492f5..0000000000
--- a/content/telegraf/v1/data_formats/output/graphite.md
+++ /dev/null
@@ -1,60 +0,0 @@
----
-title: Graphite output data format
-list_title: Graphite
-description: Use the `graphite` output data format (serializer) to format and output Telegraf metrics as Graphite Message Format.
-menu:
-  telegraf_v1_ref:
-    name: Graphite
-    weight: 10
-    parent: Output data formats
-    identifier: output-data-format-graphite
----
-
-Use the `graphite` output data format (serializer) to format and output Telegraf metrics as [Graphite Message Format](https://graphite.readthedocs.io/en/latest/feeding-carbon.html#step-3-understanding-the-graphite-message-format).
-
-The serializer uses either the _template pattern_ method (_default_) or the _tag support_ method.
-To use the tag support method, set the [`graphite_tag_support`](#graphite_tag_support) option.
-
-## Configuration
-
-```toml
-[[outputs.file]]
-  ## Files to write to, "stdout" is a specially handled file.
-  files = ["stdout", "/tmp/metrics.out"]
-
-  ## Data format to output.
-  ## Each data format has its own unique set of configuration options, read
-  ## more about them here:
-  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md
-  data_format = "graphite"
-
-  ## Prefix added to each graphite bucket
-  prefix = "telegraf"
-  ## Graphite template pattern
-  template = "host.tags.measurement.field"
-
-  ## Support Graphite tags, recommended to enable when using Graphite 1.1 or later.
-  # graphite_tag_support = false
-```
-
-### graphite_tag_support
-
-When the `graphite_tag_support` option is enabled, the template pattern is not
-used. Instead, tags are encoded using
-[Graphite tag support](http://graphite.readthedocs.io/en/latest/tags.html),
-added in Graphite 1.1. The `metric_path` is a combination of the optional
-`prefix` option, measurement name, and field name.
-
-The tag `name` is reserved by Graphite; any conflicting tags will be encoded as `_name`.
-
-**Example conversion**:
-```
-cpu,cpu=cpu-total,dc=us-east-1,host=tars usage_idle=98.09,usage_user=0.89 1455320660004257758
-=>
-cpu.usage_user;cpu=cpu-total;dc=us-east-1;host=tars 0.89 1455320690
-cpu.usage_idle;cpu=cpu-total;dc=us-east-1;host=tars 98.09 1455320690
-```
-
-### Templates
-
-To learn more about using templates and template patterns, see [Template patterns](/telegraf/v1/configure_plugins/template-patterns/).
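-
-As a sketch of the configuration above (`prefix = "telegraf"` with the
-`host.tags.measurement.field` template), a metric would be expected to
-serialize roughly as follows; the conversion is illustrative, not taken from
-the plugin docs:
-
-```
-cpu,cpu=cpu-total,host=tars usage_idle=98.09 1455320660004257758
-=>
-telegraf.tars.cpu-total.cpu.usage_idle 98.09 1455320660
-```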
diff --git a/content/telegraf/v1/data_formats/output/influx.md b/content/telegraf/v1/data_formats/output/influx.md deleted file mode 100644 index de24aa650f..0000000000 --- a/content/telegraf/v1/data_formats/output/influx.md +++ /dev/null @@ -1,44 +0,0 @@ ---- -title: InfluxDB line protocol output data format -list_title: InfluxDB line protocol -description: Use the `influx` output data format (serializer) to format and output metrics as InfluxDB line protocol format. -menu: - telegraf_v1_ref: - name: InfluxDB line protocol - weight: 10 - parent: Output data formats - identifier: output-data-format-influx ---- - -Use the `influx` output data format (serializer) to format and output metrics as [InfluxDB line protocol][line protocol]. -InfluxData recommends this data format unless another format is required for interoperability. - -## Configuration - -```toml -[[outputs.file]] - ## Files to write to, "stdout" is a specially handled file. - files = ["stdout", "/tmp/metrics.out"] - - ## Data format to output. - ## Each data format has its own unique set of configuration options, read - ## more about them here: - ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md - data_format = "influx" - - ## Maximum line length in bytes. Useful only for debugging. - influx_max_line_bytes = 0 - - ## When true, fields will be output in ascending lexical order. Enabling - ## this option will result in decreased performance and is only recommended - ## when you need predictable ordering while debugging. - influx_sort_fields = false - - ## When true, Telegraf will output unsigned integers as unsigned values, - ## i.e.: `42u`. You will need a version of InfluxDB supporting unsigned - ## integer values. Enabling this option will result in field type errors if - ## existing data has been written. - influx_uint_support = false -``` - -[line protocol]: /influxdb/v1/write_protocols/line_protocol_tutorial/ diff --git a/content/telegraf/v1/data_formats/output/json.md b/content/telegraf/v1/data_formats/output/json.md deleted file mode 100644 index 8666ba24a4..0000000000 --- a/content/telegraf/v1/data_formats/output/json.md +++ /dev/null @@ -1,91 +0,0 @@ ---- -title: JSON output data format -list_title: JSON -description: Use the `json` output data format (serializer) to format and output Telegraf metrics as JSON documents. -menu: - telegraf_v1_ref: - name: JSON - weight: 10 - parent: Output data formats - identifier: output-data-format-json ---- - -Use the `json` output data format (serializer) to format and output Telegraf metrics as [JSON](https://www.json.org/json-en.html) documents. - -## Configuration - -```toml -[[outputs.file]] - ## Files to write to, "stdout" is a specially handled file. - files = ["stdout", "/tmp/metrics.out"] - - ## Data format to output. - ## Each data format has its own unique set of configuration options, read - ## more about them here: - ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md - data_format = "json" - - ## The resolution to use for the metric timestamp. Must be a duration string - ## such as "1ns", "1us", "1ms", "10ms", "1s". Durations are truncated to - ## the power of 10 less than the specified units. 
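-  ## (For example, a setting of "15ms" would be truncated to "10ms".)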
- json_timestamp_units = "1s" -``` - -## Examples - -### Standard format - -```json -{ - "fields": { - "field_1": 30, - "field_2": 4, - "field_N": 59, - "n_images": 660 - }, - "name": "docker", - "tags": { - "host": "raynor" - }, - "timestamp": 1458229140 -} -``` - -### Batch format - -When an output plugin needs to emit multiple metrics at one time, it may use the -batch format. The use of batch format is determined by the plugin -- reference -the documentation for the specific plugin. - -```json -{ - "metrics": [ - { - "fields": { - "field_1": 30, - "field_2": 4, - "field_N": 59, - "n_images": 660 - }, - "name": "docker", - "tags": { - "host": "raynor" - }, - "timestamp": 1458229140 - }, - { - "fields": { - "field_1": 30, - "field_2": 4, - "field_N": 59, - "n_images": 660 - }, - "name": "docker", - "tags": { - "host": "raynor" - }, - "timestamp": 1458229140 - } - ] -} -``` diff --git a/content/telegraf/v1/data_formats/output/messagepack.md b/content/telegraf/v1/data_formats/output/messagepack.md deleted file mode 100644 index 0dc7f9c2f7..0000000000 --- a/content/telegraf/v1/data_formats/output/messagepack.md +++ /dev/null @@ -1,49 +0,0 @@ ---- -title: MessagePack output data format -list_title: MessagePack -description: Use the `msgpack` output data format (serializer) to convert Telegraf metrics into MessagePack format. -menu: - telegraf_v1_ref: - name: MessagePack - weight: 10 - parent: Output data formats ---- - -The `msgpack` output data format (serializer) translates the Telegraf metric format to the [MessagePack](https://msgpack.org/). MessagePack is an efficient binary serialization format that lets you exchange data among multiple languages like JSON. - -### Configuration - -```toml -[[outputs.file]] - ## Files to write to, "stdout" is a specially handled file. - files = ["stdout", "/tmp/metrics.out"] - - ## Data format to output. - ## Each data format has its own unique set of configuration options, read - ## more about them here: - ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md - data_format = "msgpack" -``` - - -### Example output - -Output of this format is MessagePack binary representation of metrics with a structure identical to the below JSON: - -``` -{ - "name":"cpu", - "time": , // https://github.com/msgpack/msgpack/blob/master/spec.md#timestamp-extension-type - "tags":{ - "tag_1":"host01", - ... - }, - "fields":{ - "field_1":30, - "field_2":true, - "field_3":"field_value" - "field_4":30.1 - ... - } -} -``` diff --git a/content/telegraf/v1/data_formats/output/nowmetric.md b/content/telegraf/v1/data_formats/output/nowmetric.md deleted file mode 100644 index 58715b1dcd..0000000000 --- a/content/telegraf/v1/data_formats/output/nowmetric.md +++ /dev/null @@ -1,91 +0,0 @@ ---- -title: ServiceNow metrics output data format -list_title: ServiceNow metrics -description: Use the `nowmetric` ServiceNow metrics output data format (serializer) to output Telegraf metrics as ServiceNow Operational Intelligence format. -menu: - telegraf_v1_ref: - name: ServiceNow metrics - weight: 10 - parent: Output data formats ---- - -The `nowmetric` output data format (serializer) outputs Telegraf metrics as [ServiceNow Operational Intelligence format](https://docs.servicenow.com/bundle/kingston-it-operations-management/page/product/event-management/reference/mid-POST-metrics.html). - -It can be used to write to a file using the File output plugin, or for sending metrics to a MID Server with Enable REST endpoint activated using the standard telegraf HTTP output. 
-If you're using the HTTP output plugin, this serializer knows how to batch the
-metrics so you don't end up with an HTTP POST per metric.
-
-An example event looks like:
-
-```javascript
-[{
-  "metric_type": "Disk C: % Free Space",
-  "resource": "C:\\",
-  "node": "lnux100",
-  "value": 50,
-  "timestamp": 1473183012000,
-  "ci2metric_id": {
-    "node": "lnux100"
-  },
-  "source": "Telegraf"
-}]
-```
-
-## Using with the HTTP output plugin
-
-To send this data to a ServiceNow MID Server with the Web Server extension
-activated, use the HTTP output plugin. You need to add some custom headers to
-manage the MID Web Server authorization; here's a sample config for an HTTP
-output:
-
-```toml
-[[outputs.http]]
-  ## URL is the address to send metrics to
-  url = "http://:9082/api/mid/sa/metrics"
-
-  ## Timeout for HTTP message
-  # timeout = "5s"
-
-  ## HTTP method, one of: "POST" or "PUT"
-  method = "POST"
-
-  ## HTTP Basic Auth credentials
-  username = 'evt.integration'
-  password = 'P@$$w0rd!'
-
-  ## Optional TLS Config
-  # tls_ca = "/etc/telegraf/ca.pem"
-  # tls_cert = "/etc/telegraf/cert.pem"
-  # tls_key = "/etc/telegraf/key.pem"
-  ## Use TLS but skip chain & host verification
-  # insecure_skip_verify = false
-
-  ## Data format to output.
-  ## Each data format has its own unique set of configuration options, read
-  ## more about them here:
-  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md
-  data_format = "nowmetric"
-
-  ## Additional HTTP headers
-  [outputs.http.headers]
-    ## Should be set manually to "application/json" for json data_format
-    Content-Type = "application/json"
-    Accept = "application/json"
-```
-
-Starting with the London release, you also need to explicitly create an event
-rule to allow binding of metric events to host CIs:
-
-https://docs.servicenow.com/bundle/london-it-operations-management/page/product/event-management/task/event-rule-bind-metrics-to-host.html
-
-## Using with the File output plugin
-
-You can use the File output plugin to output the payload in a file.
-In this case, just add the following section to your Telegraf configuration file.
-
-```toml
-[[outputs.file]]
-  ## Files to write to, "stdout" is a specially handled file.
-  files = ["C:/Telegraf/metrics.out"]
-
-  ## Data format to output.
-  ## Each data format has its own unique set of configuration options, read
-  ## more about them here:
-  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md
-  data_format = "nowmetric"
-```
diff --git a/content/telegraf/v1/data_formats/output/splunkmetric.md b/content/telegraf/v1/data_formats/output/splunkmetric.md
deleted file mode 100644
index 71068ab76e..0000000000
--- a/content/telegraf/v1/data_formats/output/splunkmetric.md
+++ /dev/null
@@ -1,149 +0,0 @@
----
-title: Splunk metrics output data format
-list_title: Splunk metrics
-description: Use the `splunkmetric` metric output data format (serializer) to output Telegraf metrics in a format that can be consumed by a Splunk metrics index.
-menu:
-  telegraf_v1_ref:
-    name: Splunk metric
-    weight: 10
-    parent: Output data formats
----
-
-Use the `splunkmetric` output data format (serializer) to output Telegraf metrics in a format that can be consumed by a Splunk metrics index.
-
-The output data format can write to a file using the file output, or send metrics to a HEC using the standard Telegraf HTTP output.
-
-If you're using the HTTP output, this serializer knows how to batch the metrics so you don't end up with an HTTP POST per metric.
-
-The data is output in a format that conforms to the Splunk HEC JSON format; see
-[Send metrics in JSON format](http://dev.splunk.com/view/event-collector/SP-CAAAFDN).
-
-An example event looks like:
-
-```javascript
-{
-  "time": 1529708430,
-  "event": "metric",
-  "host": "patas-mbp",
-  "fields": {
-    "_value": 0.6,
-    "cpu": "cpu0",
-    "dc": "mobile",
-    "metric_name": "cpu.usage_user",
-    "user": "ronnocol"
-  }
-}
-```
-
-In the above snippet, the following keys are dimensions:
-* cpu
-* dc
-* user
-
-## Using with the HTTP output
-
-To send this data to a Splunk HEC, use the HTTP output. You need to add some
-custom headers to manage the HEC authorization; here's a sample config for an
-HTTP output:
-
-```toml
-[[outputs.http]]
-  ## URL is the address to send metrics to
-  url = "https://localhost:8088/services/collector"
-
-  ## Timeout for HTTP message
-  # timeout = "5s"
-
-  ## HTTP method, one of: "POST" or "PUT"
-  # method = "POST"
-
-  ## HTTP Basic Auth credentials
-  # username = "username"
-  # password = "pa$$word"
-
-  ## Optional TLS Config
-  # tls_ca = "/etc/telegraf/ca.pem"
-  # tls_cert = "/etc/telegraf/cert.pem"
-  # tls_key = "/etc/telegraf/key.pem"
-  ## Use TLS but skip chain & host verification
-  # insecure_skip_verify = false
-
-  ## Data format to output.
-  ## Each data format has its own unique set of configuration options, read
-  ## more about them here:
-  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md
-  data_format = "splunkmetric"
-  ## Provides time, index, source overrides for the HEC
-  splunkmetric_hec_routing = true
-
-  ## Additional HTTP headers
-  [outputs.http.headers]
-    ## Should be set manually to "application/json" for json data_format
-    Content-Type = "application/json"
-    Authorization = "Splunk xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
-    X-Splunk-Request-Channel = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
-```
-
-## Overrides
-
-You can override the default values for the HEC token you are using by adding additional tags to the config file.
-
-The following aspects of the token can be overridden with tags:
-* index
-* source
-
-You can either use `[global_tags]` or use a more advanced configuration as documented [here](https://github.com/influxdata/telegraf/blob/master/docs/CONFIGURATION.md).
-
-For example, the following configuration overrides the index just for the cpu metric:
-
-```toml
-[[inputs.cpu]]
-  percpu = false
-  totalcpu = true
-  [inputs.cpu.tags]
-    index = "cpu_metrics"
-```
-
-## Using with the File output
-
-You can use the file output when running Telegraf on a machine with a Splunk forwarder.
-
-A sample event when `hec_routing` is false (or unset) looks like:
-
-```javascript
-{
-  "_value": 0.6,
-  "cpu": "cpu0",
-  "dc": "mobile",
-  "metric_name": "cpu.usage_user",
-  "user": "ronnocol",
-  "time": 1529708430
-}
-```
-
-Data formatted in this manner can be ingested with a simple `props.conf` file that
-looks like this:
-
-```ini
-[telegraf]
-category = Metrics
-description = Telegraf Metrics
-pulldown_type = 1
-DATETIME_CONFIG =
-NO_BINARY_CHECK = true
-SHOULD_LINEMERGE = true
-disabled = false
-INDEXED_EXTRACTIONS = json
-KV_MODE = none
-TIMESTAMP_FIELDS = time
-TIME_FORMAT = %s.%3N
-```
-
-An example configuration of a file-based output is:
-
-```toml
-# Send telegraf metrics to file(s)
-[[outputs.file]]
-  ## Files to write to, "stdout" is a specially handled file.
-  files = ["/tmp/metrics.out"]
-
-  ## Data format to output.
- ## Each data format has its own unique set of configuration options, read - ## more about them here: - ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md - data_format = "splunkmetric" - hec_routing = false -``` diff --git a/content/telegraf/v1/input-data-formats/_index.md b/content/telegraf/v1/input-data-formats/_index.md new file mode 100644 index 0000000000..6cac4137be --- /dev/null +++ b/content/telegraf/v1/input-data-formats/_index.md @@ -0,0 +1,34 @@ +--- +title: Telegraf input data formats +list_title: Input data formats +description: Telegraf supports parsing input data formats into Telegraf metrics. +menu: + telegraf_v1_ref: + name: Input data formats + parent: data_formats_reference + identifier: input_data_formats_reference + weight: 20 +tags: [input-data-formats, input-serializers] +--- + +Telegraf supports the following input data formats for parsing data into [metrics](/telegraf/v1/metrics/). +Input plugins that support these formats include a `data_format` configuration option. + +For example, in the [Exec input plugin](/telegraf/v1/input-plugins/exec/): + +```toml +[[inputs.exec]] + ## Commands array + commands = ["/tmp/test.sh", "/usr/bin/mycollector --foo=bar"] + + ## measurement name suffix (for separating different commands) + name_suffix = "_mycollector" + + ## Data format to consume. + ## Each data format has its own unique set of configuration options, read + ## more about them here: + ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md + data_format = "json_v2" +``` + +{{< telegraf/data-formats type="input" >}} diff --git a/content/telegraf/v1/data_formats/output/_index.md b/content/telegraf/v1/output-data-formats/_index.md similarity index 56% rename from content/telegraf/v1/data_formats/output/_index.md rename to content/telegraf/v1/output-data-formats/_index.md index 9f62fc390f..b3e11ae84b 100644 --- a/content/telegraf/v1/data_formats/output/_index.md +++ b/content/telegraf/v1/output-data-formats/_index.md @@ -5,18 +5,15 @@ description: Telegraf serializes metrics into output data formats. menu: telegraf_v1_ref: name: Output data formats - weight: 1 - parent: Data formats ---- - -In addition to output-specific data formats, Telegraf supports the following set -of common data formats that may be selected when configuring many of the Telegraf -output plugins. + parent: data_formats_reference + identifier: output_data_formats_reference + weight: 20 +tags: [output-data-formats, output-serializers] -{{< children >}} +Telegraf supports the following output data formats for serializing metrics. +Output plugins that support these formats include a `data_format` configuration option. -You will be able to identify the plugins with support by the presence of a -`data_format` configuration option, for example, in the File (`file`) output plugin: +For example, in the [File output plugin](/telegraf/v1/output-plugins/file/): ```toml [[outputs.file]] @@ -29,3 +26,5 @@ You will be able to identify the plugins with support by the presence of a ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md data_format = "influx" ``` + +{{< telegraf/data-formats type="output" >}}