Add missing JMX metrics for ContainerInsights. (#898)
### Description of changes
* Added missing metrics from the ContainerInsights
[dashboard](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/ContainerInsights-Prometheus-metrics.html#ContainerInsights-Prometheus-metrics-jmx)
not included in the
[agent](https://github.com/aws-observability/aws-otel-java-instrumentation/blob/c39b76830931899a678f72d502dafe33329adde5/instrumentation/jmx-metrics/src/main/resources/jmx/rules/jvm.yaml)'s
solution for JMX.
* Compared the current spec files for JVM and Tomcat with the dashboard
to see what was missing.
* Set up an EC2 cluster, a Tomcat application, and JMXTerm to find the
bean and metric names for the missing metrics.
* Added these to the spec file based on appropriate types and the naming
convention from
[OTel](https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/instrumentation/jmx-metrics/javaagent/README.md).
* Built the ADOT Jar image and tested it with the
`amazon-cloudwatch-agent-operator` and `helm-charts` on an EKS cluster
running a custom Tomcat deployment, which produced the expected
metrics:
    * `./scripts/local_patch.sh && ./gradlew build`
    * `docker build --platform linux/amd64 -t adot-autoinstrumentation-java .`
    * `[add docker image to ECR]`
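
The bean and attribute lookup described above was done with JMXTerm against a running Tomcat. As a rough illustration only, the same kind of discovery can be sketched in-process with the standard `javax.management` API. The class name `BeanAttributeLister` and its helper are hypothetical and not part of this change, and a plain JVM exposes only the `java.lang:*` beans, not the Catalina ones.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.management.JMException;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical helper, not part of the agent: lists attribute names of a platform MBean.
public class BeanAttributeLister {

    /** Returns the sorted attribute names exposed by the given MBean in this JVM. */
    static List<String> attributeNames(String beanName) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        List<String> names = new ArrayList<>();
        try {
            MBeanAttributeInfo[] attrs = server.getMBeanInfo(new ObjectName(beanName)).getAttributes();
            for (MBeanAttributeInfo attr : attrs) {
                names.add(attr.getName());
            }
        } catch (JMException e) {
            // Covers malformed names and beans that are not registered.
            throw new IllegalStateException("Cannot inspect " + beanName, e);
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // The two beans touched by the jvm.yaml changes in this commit.
        String[] beans = {"java.lang:type=Threading", "java.lang:type=OperatingSystem"};
        for (String bean : beans) {
            System.out.println(bean + " -> " + attributeNames(bean));
        }
    }
}
```

The attribute names printed for each bean are the candidates for keys under `mapping:` in a rule file; which ones exist depends on the JVM vendor and operating system.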

### Manual testing
For testing, I used the built image with the CloudWatch Agent to check
that these metrics are emitted and visible in the CW console.

#### Configuration
```json
    {
      "metrics": {
        "namespace": "tomcat",
        "metrics_collected": {
          "jmx": {
            "jvm": {
              "measurement": [
                "jvm.system.swap.space.total",
                "jvm.system.swap.space.free",
                "jvm.system.physical.memory.total",
                "jvm.system.physical.memory.free",
                "jvm.system.available.processors",
                "jvm.system.cpu.utilization",
                "jvm.open_file_descriptor.count",
                "jvm.daemon_threads.count",
                "jvm.threads.count"
              ]
            },
            "tomcat": {
              "measurement": [
                "tomcat.rejected_sessions",
                "tomcat.sessions"
              ]
            }
          }
        }
      }
    }
```

#### CW Console
<img width="949" alt="Screenshot 2024-10-04 at 11 08 51 AM"
src="https://github.com/user-attachments/assets/e9f3869b-f17d-4d17-81b5-0c0b97272012">

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Mengyi Zhou (bjrara) <[email protected]>
musa-asad and bjrara authored Oct 11, 2024
1 parent 51098f9 commit e144053
Showing 3 changed files with 48 additions and 10 deletions.
5 changes: 4 additions & 1 deletion instrumentation/jmx-metrics/src/main/resources/README.md
@@ -10,4 +10,7 @@ OTEL_EXPERIMENTAL_METRICS_VIEW_CONFIG: classpath:/jmx/view.yaml

 ### rules/*.yaml
 The rules are a translation of the JMX Metric Gatherer's [target systems](https://github.com/open-telemetry/opentelemetry-java-contrib/tree/main/jmx-metrics/src/main/resources/target-systems)
-based on the [JMX metric rule YAML schema](https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/instrumentation/jmx-metrics/javaagent/README.md#basic-syntax).
+based on the [JMX metric rule YAML schema](https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/instrumentation/jmx-metrics/javaagent/README.md#basic-syntax).
+
+### SystemCpuLoad
+The `SystemCpuLoad` metric is deprecated and Java versions 14+ now use `CpuLoad`. However, to avoid emitting double metrics, we stick to using `SystemCpuLoad` as it still works on newer versions.
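
To see how the two attribute names behave on a given JVM, a small check like the following may help. It is a sketch under assumptions: the class `CpuLoadAttributeCheck` and its helper are hypothetical, and whether `CpuLoad` is present depends on the Java version and vendor in use.

```java
import java.lang.management.ManagementFactory;
import javax.management.AttributeNotFoundException;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical check, not part of the agent: compares SystemCpuLoad and CpuLoad availability.
public class CpuLoadAttributeCheck {

    /**
     * Reads a numeric attribute of java.lang:type=OperatingSystem.
     * Returns null when this JVM does not expose the attribute.
     */
    static Double readOsAttribute(String attribute) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName os = new ObjectName("java.lang:type=OperatingSystem");
            Object value = server.getAttribute(os, attribute);
            return value instanceof Number ? ((Number) value).doubleValue() : null;
        } catch (AttributeNotFoundException e) {
            return null; // attribute not available on this JVM
        } catch (JMException e) {
            throw new IllegalStateException("Cannot read " + attribute, e);
        }
    }

    public static void main(String[] args) {
        // On newer JVMs both names are expected to resolve; a rule that mapped
        // both would report the same signal twice.
        System.out.println("SystemCpuLoad = " + readOsAttribute("SystemCpuLoad"));
        System.out.println("CpuLoad       = " + readOsAttribute("CpuLoad"));
    }
}
```

A `null` for `CpuLoad` alongside a value for `SystemCpuLoad` would indicate an older runtime, which is the case that mapping only `SystemCpuLoad` keeps working.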
39 changes: 34 additions & 5 deletions instrumentation/jmx-metrics/src/main/resources/jmx/rules/jvm.yaml
@@ -75,21 +75,50 @@ rules:
         desc: The maximum amount of memory can be used for the memory pool
   - bean: java.lang:type=Threading
     unit: "1"
-    prefix: jvm.threads.
     type: gauge
     mapping:
       ThreadCount:
-        metric: count
+        metric: jvm.threads.count
         desc: Number of threads
+      DaemonThreadCount:
+        metric: jvm.daemon_threads.count
+        desc: Number of daemon threads
   - bean: java.lang:type=OperatingSystem
-    prefix: jvm.cpu.
     type: gauge
     mapping:
+      TotalSwapSpaceSize:
+        metric: jvm.system.swap.space.total
+        desc: The host swap memory size in bytes
+        unit: by
+      FreeSwapSpaceSize:
+        metric: jvm.system.swap.space.free
+        desc: The amount of available swap memory in bytes
+        unit: by
+      TotalPhysicalMemorySize:
+        metric: jvm.system.physical.memory.total
+        desc: The total physical memory size in host
+        unit: by
+      FreePhysicalMemorySize:
+        metric: jvm.system.physical.memory.free
+        desc: The amount of free physical memory in host
+        unit: by
+      AvailableProcessors:
+        metric: jvm.system.available.processors
+        desc: The number of available processors
+        unit: "1"
+      SystemCpuLoad:
+        metric: jvm.system.cpu.utilization
+        desc: The current load of CPU in host
+        unit: "1"
       ProcessCpuTime:
-        metric: time
+        metric: jvm.cpu.time
         unit: ns
         desc: CPU time used
       ProcessCpuLoad:
-        metric: recent_utilization
+        metric: jvm.cpu.recent_utilization
         unit: "1"
         desc: Recent CPU utilization for the process
+      OpenFileDescriptorCount:
+        metric: jvm.open_file_descriptor.count
+        desc: The number of opened file descriptors
+        unit: "1"
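
Each `mapping` entry above pairs an MBean attribute with a full metric name now that the `prefix` keys are gone. The sketch below illustrates that pairing for the Threading rule by reading the attributes directly from the platform MBean server. It is not how the agent evaluates rules; `JvmRuleSketch` and its methods are hypothetical names used only for illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical illustration of an attribute-to-metric mapping; not the agent's implementation.
public class JvmRuleSketch {

    /** Attribute name -> metric name, copied from the Threading rule in jvm.yaml. */
    static Map<String, String> threadingMapping() {
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("ThreadCount", "jvm.threads.count");
        mapping.put("DaemonThreadCount", "jvm.daemon_threads.count");
        return mapping;
    }

    /** Reads every mapped attribute once and returns metric name -> current value. */
    static Map<String, Number> sample(String beanName, Map<String, String> mapping) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        Map<String, Number> samples = new LinkedHashMap<>();
        try {
            ObjectName bean = new ObjectName(beanName);
            for (Map.Entry<String, String> entry : mapping.entrySet()) {
                Object value = server.getAttribute(bean, entry.getKey());
                if (value instanceof Number) {
                    samples.put(entry.getValue(), (Number) value);
                }
            }
        } catch (JMException e) {
            throw new IllegalStateException("Cannot sample " + beanName, e);
        }
        return samples;
    }

    public static void main(String[] args) {
        sample("java.lang:type=Threading", threadingMapping())
            .forEach((metric, value) -> System.out.println(metric + " = " + value));
    }
}
```

Both values are point-in-time readings, which is consistent with the `type: gauge` declared on the bean.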
@@ -3,12 +3,15 @@ rules:
   - bean: Catalina:type=Manager,host=localhost,context=*
     metricAttribute:
       context: param(context)
+    unit: sessions
+    type: gauge
     mapping:
       activeSessions:
         metric: tomcat.sessions
-        type: gauge
-        unit: sessions
         desc: The number of active sessions.
+      rejectedSessions:
+        metric: tomcat.rejected_sessions
+        desc: The number of rejected sessions.
   - bean: Catalina:type=GlobalRequestProcessor,name=*
     metricAttribute:
       name: param(name)
@@ -68,12 +71,15 @@ rules:
   - bean: Tomcat:type=Manager,host=localhost,context=*
     metricAttribute:
       context: param(context)
+    unit: sessions
+    type: gauge
     mapping:
       activeSessions:
         metric: tomcat.sessions
-        type: gauge
-        unit: sessions
         desc: The number of active sessions.
+      rejectedSessions:
+        metric: tomcat.rejected_sessions
+        desc: The number of rejected sessions.
   - bean: Tomcat:type=GlobalRequestProcessor,name=*
     metricAttribute:
       name: param(name)
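
The Manager rules select beans with a wildcard on the `context` key and expose that key as a metric attribute through `param(context)`. A minimal sketch of the matching, using only the standard `javax.management.ObjectName` API, is shown here. It assumes that `param(context)` corresponds to the bean name's `context` key property; `ManagerBeanPattern` is a hypothetical class, and the bean names in `main` are made-up examples.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical illustration of ObjectName pattern matching; not the agent's implementation.
public class ManagerBeanPattern {

    /**
     * Returns the value of the "context" key when the bean name matches the
     * Catalina Manager pattern from the rule, or null when it does not match.
     */
    static String contextOf(String beanName) {
        try {
            ObjectName pattern = new ObjectName("Catalina:type=Manager,host=localhost,context=*");
            ObjectName candidate = new ObjectName(beanName);
            return pattern.apply(candidate) ? candidate.getKeyProperty("context") : null;
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Invalid bean name: " + beanName, e);
        }
    }

    public static void main(String[] args) {
        // Example names only; a real Tomcat registers one Manager bean per deployed context.
        System.out.println(contextOf("Catalina:type=Manager,host=localhost,context=/shop"));
        System.out.println(contextOf("Catalina:type=GlobalRequestProcessor,name=\"http-nio-8080\""));
    }
}
```

The second rule in the diff uses the `Tomcat` domain instead of `Catalina`, so a name in that domain would not match this particular pattern and would need its own.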