diff --git a/docs/advanced/prometheus.md b/docs/advanced/prometheus.md
new file mode 100644
index 0000000000..b9a9d23d21
--- /dev/null
+++ b/docs/advanced/prometheus.md
@@ -0,0 +1,110 @@
+---
+title: Monitoring KubeEdge Edge Nodes with Prometheus
+sidebar_position: 6
+---
+# Monitoring KubeEdge Edge Nodes with Prometheus
+
+## Environment Information
+
+| Component    | Version                            |
+| ------------ | ---------------------------------- |
+| containerd   | 1.7.2                              |
+| Kubernetes   | 1.26.0                             |
+| KubeEdge     | 1.17.0                             |
+| Jetson model | NVIDIA Jetson Xavier NX (16 GB RAM) |
+
+> KubeEdge version note: this feature is recommended for v1.15.0 and above. Because v1.17.0 lets edge pods use InClusterConfig, the procedure differs before and after v1.17.0. This document uses v1.17.0 as the example; for earlier versions, please refer to the documentation for that version.
+
+
+## Deploying Prometheus
+
+You can install Prometheus quickly with the [Helm Charts](https://prometheus-community.github.io/helm-charts/) for [kube-prometheus](https://github.com/prometheus-operator/kube-prometheus), or install it manually from the manifests.
+
+Check that your Kubernetes version is compatible with the kube-prometheus release you install.
+
+```shell
+git clone https://github.com/prometheus-operator/kube-prometheus.git
+cd kube-prometheus
+kubectl apply --server-side -f manifests/setup
+kubectl wait \
+    --for condition=Established \
+    --all CustomResourceDefinition \
+    --namespace=monitoring
+kubectl apply -f manifests/
+```
+
+This creates a ClusterIP Service for each of grafana, alertmanager, and prometheus. To reach these services from outside the cluster, you can create the corresponding Ingress objects or use NodePort Services. For simplicity, this guide uses NodePort: edit the three Services `grafana`, `alertmanager-main`, and `prometheus-k8s` and change the service type to NodePort:
+
+![](../../static/img/advanced/prometheus-svc.png)
+
+```shell
+kubectl edit svc grafana -n monitoring
+kubectl edit svc alertmanager-main -n monitoring
+kubectl edit svc prometheus-k8s -n monitoring
+```
+
+Recent versions of kube-prometheus ship NetworkPolicy objects, so the services remain unreachable even after NodePort is configured. Modify the NetworkPolicy objects to allow access from IPs in the 10.x network segment.
+
+![](../../static/img/advanced/NetworkPolicy.png)
+
+```shell
+kubectl edit NetworkPolicy prometheus-k8s -n monitoring
+kubectl edit NetworkPolicy grafana -n monitoring
+kubectl edit NetworkPolicy alertmanager-main -n monitoring
+```
+
+You can now access the Prometheus and Grafana services through their NodePorts.
+
+![](../../static/img/advanced/prometheus-page.png)
+
+
+
+## Deploying KubeEdge
+
+### Enable the InClusterConfig feature
+
+KubeEdge v1.17.0 must be deployed so that edge Pods can use InClusterConfig to access Kube-APIServer. To enable this, set `cloudCore.featureGates.requireAuthorization=true` and `cloudCore.modules.dynamicController.enable=true`. For details, see the [KubeEdge public account article](https://mp.weixin.qq.com/s/Dw2IKRDvOWH52xTOStI7dg).
+
+```shell
+keadm init --advertise-address=10.108.96.24 --set cloudCore.featureGates.requireAuthorization=true,cloudCore.modules.dynamicController.enable=true --kubeedge-version=v1.17.0
+```
+
+- After EdgeCore has started, modify edgecore.yaml as follows and then restart EdgeCore.
+
+  Set **metaServer.enable = true** and add **featureGates: requireAuthorization: true**:
+
+```yaml
+apiVersion: edgecore.config.kubeedge.io/v1alpha2
+kind: EdgeCore
+featureGates:
+  requireAuthorization: true
+modules:
+  ...
+  metaManager:
+    metaServer:
+      enable: true
+```
+
+After making the change, restart EdgeCore:
+
+```shell
+systemctl daemon-reload
+systemctl restart edgecore
+```
+
+### Create a ClusterRoleBinding
+
+The node-exporter container reported the following error: `Unable to authenticate the request due to an error: tokenreviews.authentication.k8s.io is forbidden: User "system:serviceaccount:kubeedge:cloudcore" cannot create resource "tokenreviews" in API group "authentication.k8s.io" at the cluster scope.`
+
+This happens because cloudcore lacks the required permission, so create a ClusterRoleBinding:
+
+![](../../static/img/advanced/clusterrolebinding.png)
+
+```shell
+kubectl create clusterrolebinding cloudcore-promethus-binding --clusterrole=cluster-admin --serviceaccount=kubeedge:cloudcore
+```
+
+Once the ClusterRoleBinding is created, you can query monitoring data from the edge nodes.
+
+![](../../static/img/advanced/node-exporter.png)
+
diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/advanced/prometheus.md b/i18n/zh/docusaurus-plugin-content-docs/current/advanced/prometheus.md
new file mode 100644
index 0000000000..c4786dd25c
--- /dev/null
+++ b/i18n/zh/docusaurus-plugin-content-docs/current/advanced/prometheus.md
@@ -0,0 +1,115 @@
+---
+title: 使用 Prometheus 监控 KubeEdge 边缘节点
+sidebar_position: 6
+---
+
+# 使用 Prometheus 监控 KubeEdge 边缘节点
+
+## 环境信息
+
+| 组件        | 版本                               |
+| ----------- | ---------------------------------- |
+| containerd  | 1.7.2                              |
+| k8s         | 1.26.0                             |
+| KubeEdge    | 1.17.0                             |
+| Jetson 型号 | NVIDIA Jetson Xavier NX (16GB ram) |
+
+> 关于 KubeEdge 版本说明:由于 v1.17.0 支持使用 InClusterConfig 的边缘 pod,因此 v1.17.0 之前和之后的版本的方法是不同的。本文档将以 v1.17.0 为例来说明操作步骤,v1.17.0 之前版本请参考对应版本文档。
+
+
+## 部署 prometheus
+
+我们可以直接使用 [kube-prometheus](https://github.com/prometheus-operator/kube-prometheus) 的 [Helm Charts](https://prometheus-community.github.io/helm-charts/) 来进行快速安装,也可以直接手动安装。
+
+需要注意 Kubernetes 版本和 `kube-prometheus` 的兼容。
+
+```shell
+git clone https://github.com/prometheus-operator/kube-prometheus.git
+cd kube-prometheus
+kubectl apply --server-side -f manifests/setup
+kubectl wait \
+    --for condition=Established \
+    --all CustomResourceDefinition \
+    --namespace=monitoring
+kubectl apply -f manifests/
+```
+
+可以看到上面针对 grafana、alertmanager 和 prometheus 都创建了一个类型为 ClusterIP 的 Service,当然如果我们想要在外网访问这些服务的话可以通过创建对应的 Ingress 对象或者使用 NodePort 类型的 Service,我们这里为了简单,直接使用 NodePort 类型的服务即可,编辑 `grafana`、`alertmanager-main` 和 `prometheus-k8s` 这 3 个 Service,将服务类型更改为 NodePort:
+
+![](../../../../../static/img/advanced/prometheus-svc.png)
+
+```shell
+kubectl edit svc grafana -n monitoring
+kubectl edit svc alertmanager-main -n monitoring
+kubectl edit svc prometheus-k8s -n monitoring
+```
+
+由于最新版本的 kube-prometheus 设置了网络策略,即使配置了 NodePort 也无法访问。需要修改 NetworkPolicy,允许 10 网段的 IP 访问。
+
+![](../../../../../static/img/advanced/NetworkPolicy.png)
+
+
+
+```shell
+kubectl edit NetworkPolicy prometheus-k8s -n monitoring
+kubectl edit NetworkPolicy grafana -n monitoring
+kubectl edit NetworkPolicy alertmanager-main -n monitoring
+```
+
+这样就可以通过 NodePort 访问 prometheus 和 grafana 服务了。
+
+![](../../../../../static/img/advanced/prometheus-page.png)
+
+## 部署 KubeEdge
+
+### 开启 InClusterConfig 功能
+
+部署 1.17.0 版本时需要注意,要支持边缘 Pods 使用 InClusterConfig 访问 Kube-APIServer,因此需要配置 cloudCore.featureGates.requireAuthorization=true 以及 cloudCore.modules.dynamicController.enable=true。详情可以查看 [KubeEdge 公众号文章](https://mp.weixin.qq.com/s/Dw2IKRDvOWH52xTOStI7dg)。
+
+```shell
+keadm init --advertise-address=10.108.96.24 --set cloudCore.featureGates.requireAuthorization=true,cloudCore.modules.dynamicController.enable=true --kubeedge-version=v1.17.0
+```
+
+- 启动 EdgeCore 后,按如下修改 edgecore.yaml 后重启 EdgeCore。
+
+  修改 **metaServer.enable = true** 同时增加 **featureGates: requireAuthorization: true**
+
+```yaml
+apiVersion: edgecore.config.kubeedge.io/v1alpha2
+kind: EdgeCore
+featureGates:
+  requireAuthorization: true
+modules:
+  ...
+  metaManager:
+    metaServer:
+      enable: true
+```
+
+修改完重启 edgecore:
+
+```shell
+systemctl daemon-reload
+systemctl restart edgecore
+```
+
+### 创建 clusterrolebinding
+
+发现 node-exporter 里面的容器报错:`Unable to authenticate the request due to an error: tokenreviews.authentication.k8s.io is forbidden: User "system:serviceaccount:kubeedge:cloudcore" cannot create resource "tokenreviews" in API group "authentication.k8s.io" at the cluster scope`
+
+因为 cloudcore 没有权限,所以创建一个 clusterrolebinding:
+
+![](../../../../../static/img/advanced/clusterrolebinding.png)
+
+
+
+```shell
+kubectl create clusterrolebinding cloudcore-promethus-binding --clusterrole=cluster-admin --serviceaccount=kubeedge:cloudcore
+```
+
+创建完 clusterrolebinding 就可以查询到边缘节点的监控信息了。
+
+
+
+![](../../../../../static/img/advanced/node-exporter.png)
+
diff --git a/static/img/advanced/NetworkPolicy.png b/static/img/advanced/NetworkPolicy.png
new file mode 100644
index 0000000000..19ce00811f
Binary files /dev/null and b/static/img/advanced/NetworkPolicy.png differ
diff --git a/static/img/advanced/clusterrolebinding.png b/static/img/advanced/clusterrolebinding.png
new file mode 100644
index 0000000000..814435b72d
Binary files /dev/null and b/static/img/advanced/clusterrolebinding.png differ
diff --git a/static/img/advanced/node-exporter.png b/static/img/advanced/node-exporter.png
new file mode 100644
index 0000000000..bc3ba86c46
Binary files /dev/null and b/static/img/advanced/node-exporter.png differ
diff --git a/static/img/advanced/prometheus-page.png b/static/img/advanced/prometheus-page.png
new file mode 100644
index 0000000000..8bd02a91d3
Binary files /dev/null and b/static/img/advanced/prometheus-page.png differ
diff --git a/static/img/advanced/prometheus-svc.png b/static/img/advanced/prometheus-svc.png
new file mode 100644
index 0000000000..d1ccedeb83
Binary files /dev/null and b/static/img/advanced/prometheus-svc.png differ