AKS Version Update + Others (#357)
* Update kured, AKS version, resource params, and add eraser

* disable the new preview

* enable node restrictions

* define more defaults

* can't even include it as disabled without registering it.

* tenantId -> tenantID to remove warning
Add firewall allowance

* remove legacy warning for unsupported kubectl version

* add suffix

* Update to five days
ckittel authored Nov 16, 2022
1 parent b8a67fd commit 8ca6ef5
Showing 8 changed files with 105 additions and 32 deletions.
5 changes: 4 additions & 1 deletion 01-prerequisites.md
@@ -34,12 +34,15 @@ This is the starting point for the instructions on deploying the [AKS Baseline r

1. [Register the Workload Identity preview feature = `EnableWorkloadIdentityPreview`](https://learn.microsoft.com/azure/aks/workload-identity-deploy-cluster#register-the-enableworkloadidentitypreview-feature-flag)

1. [Register the ImageCleaner (Eraser) preview feature = `EnableImageCleanerPreview`](https://learn.microsoft.com/azure/aks/image-cleaner#prerequisites)

```bash
az feature register --namespace "Microsoft.ContainerService" -n "AKS-AzureDefender"
az feature register --namespace "Microsoft.ContainerService" -n "EnableWorkloadIdentityPreview"
az feature register --namespace "Microsoft.ContainerService" -n "EnableImageCleanerPreview"

# Keep running until all say "Registered." (This may take up to 20 minutes.)
az feature list -o table --query "[?name=='Microsoft.ContainerService/AKS-AzureDefender' || name=='Microsoft.ContainerService/EnableWorkloadIdentityPreview'].{Name:name,State:properties.state}"
az feature list -o table --query "[?name=='Microsoft.ContainerService/AKS-AzureDefender' || name=='Microsoft.ContainerService/EnableWorkloadIdentityPreview' || name=='Microsoft.ContainerService/EnableImageCleanerPreview'].{Name:name,State:properties.state}"

# When all say "Registered" then re-register the AKS resource provider
az provider register --namespace Microsoft.ContainerService
2 changes: 1 addition & 1 deletion 05-bootstrap-prep.md
@@ -55,7 +55,7 @@ We'll be bootstrapping this cluster with the Flux GitOps agent as installed as a
echo ACR_NAME_AKS_BASELINE: $ACR_NAME_AKS_BASELINE
# Import core image(s) hosted in public container registries to be used during bootstrapping
az acr import --source docker.io/weaveworks/kured:1.10.1 -n $ACR_NAME_AKS_BASELINE
az acr import --source ghcr.io/kubereboot/kured:1.11.0 -n $ACR_NAME_AKS_BASELINE
```

> In this walkthrough, only one image is included in the bootstrapping process. It's included as a reference for this process. Whether to use Kubernetes Reboot Daemon (Kured) or any other images, including Helm charts, as part of your bootstrapping is your choice to make.
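The import step above moves Kured from `ghcr.io` into the private registry, and the manifest comment in `kured.yaml` expects the imported image to keep its repository path and tag. A minimal sketch of that mapping, assuming `az acr import` is run without an `--image` override (as in the command above); the helper name `acr_image_ref` and the registry name `myacr` are hypothetical:

```bash
#!/bin/sh
# Hypothetical helper: derive the image reference to use in manifests after
# `az acr import`, assuming no --image override was passed, so only the
# registry host changes and the repository path and tag are kept.
acr_image_ref() {
  _source_ref="$1"   # e.g. ghcr.io/kubereboot/kured:1.11.0
  _acr_name="$2"     # e.g. the value of $ACR_NAME_AKS_BASELINE
  # Drop everything up to and including the first "/" (the source registry host).
  _repo_and_tag="${_source_ref#*/}"
  printf '%s.azurecr.io/%s\n' "$_acr_name" "$_repo_and_tag"
}

# Example with a hypothetical registry name.
acr_image_ref "ghcr.io/kubereboot/kured:1.11.0" "myacr"
# prints myacr.azurecr.io/kubereboot/kured:1.11.0
```

The result is the value you would place in the `image:` field of the Kured DaemonSet once the import has completed.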
2 changes: 1 addition & 1 deletion 09-secret-management-and-ingress-controller.md
@@ -48,7 +48,7 @@ Previously you have configured [workload prerequisites](./08-workload-prerequisi
objectName: traefik-ingress-internal-aks-ingress-tls
objectAlias: tls.key
objectType: secret
tenantId: $TENANTID_AZURERBAC_AKS_BASELINE
tenantID: $TENANTID_AZURERBAC_AKS_BASELINE
EOF
```
2 changes: 0 additions & 2 deletions 10-workload.md
@@ -54,8 +54,6 @@ The cluster now has an [Traefik configured with a TLS certificate](./08-secret-m
exit
```

> :beetle: If you are running a version of kubectl less than 1.23 you cannot perform this step. The Azure Policy for Kubernetes assignments that are deployed with this implementation requires both setting limits and also a read-only root filesystem. This isn't possible with kubectl 1.22 and lower. It's okay to skip this step if in that situation.

> From this container shell, you could also try to directly access the workload via `curl -I http://<aspnetapp-service-cluster-ip>`. Instead of getting back a `200 OK`, you'll receive a network timeout because of the [`allow-only-ingress-to-workload` network policy](./cluster-manifests/a0008/ingress-network-policy.yaml) that is in place.
### Next step
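The note in `10-workload.md` about `curl -I http://<aspnetapp-service-cluster-ip>` says the request should end in a network timeout rather than a `200 OK`. A minimal sketch of turning that into a scripted check from the container shell, assuming `curl` is available there; the function name `workload_is_blocked` and the 5-second limit are illustrative choices, not part of the walkthrough:

```bash
#!/bin/sh
# Hypothetical check: succeeds (returns 0) only when the request times out,
# which is the outcome the network policy is expected to produce.
workload_is_blocked() {
  _target="$1"   # service cluster IP, supplied by the caller
  _rc=0
  curl -sS -I --max-time 5 "http://${_target}" >/dev/null 2>&1 || _rc=$?
  # curl exits with code 28 when the operation timed out.
  [ "$_rc" -eq 28 ]
}

# Usage (replace the placeholder with the real service cluster IP):
#   workload_is_blocked "<aspnetapp-service-cluster-ip>" && echo "blocked as expected"
```

Any other exit code, including success, means the request did not time out and the policy may not be applied as intended.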
10 changes: 6 additions & 4 deletions README.md
@@ -24,7 +24,7 @@ Finally, this implementation uses the [ASP.NET Core Docker sample web app](https

#### Azure platform

- AKS v1.24
- AKS v1.25
- System and User [node pool separation](https://learn.microsoft.com/azure/aks/use-system-pools)
- [AKS-managed Azure AD](https://learn.microsoft.com/azure/aks/managed-aad)
- Azure AD-backed Kubernetes RBAC (_local user accounts disabled_)
@@ -38,11 +38,13 @@ Finally, this implementation uses the [ASP.NET Core Docker sample web app](https

#### In-cluster OSS components

- [Flux GitOps Operator](https://fluxcd.io) _[AKS-managed extension]_
- [Traefik Ingress Controller](https://doc.traefik.io/traefik/v2.5/routing/providers/kubernetes-ingress/)
- [Azure Workload Identity](https://learn.microsoft.com/azure/aks/workload-identity-overview) _[AKS-managed add-on]_
- [Flux GitOps Operator](https://fluxcd.io) _[AKS-managed extension]_
- [ImageCleaner (Eraser)](https://learn.microsoft.com/azure/aks/image-cleaner) _[AKS-managed add-on]_
- [Kubernetes Reboot Daemon](https://learn.microsoft.com/azure/aks/node-updates-kured)
- [Secrets Store CSI Driver for Kubernetes](https://learn.microsoft.com/azure/aks/csi-secrets-store-driver) _[AKS-managed add-on]_
- [Kured](https://learn.microsoft.com/azure/aks/node-updates-kured)
- [Traefik Ingress Controller](https://doc.traefik.io/traefik/v2.5/routing/providers/kubernetes-ingress/)


![Network diagram depicting a hub-spoke network with two peered VNets and main Azure resources used in the architecture.](https://learn.microsoft.com/azure/architecture/reference-architectures/containers/aks/images/secure-baseline-architecture.svg)

46 changes: 33 additions & 13 deletions cluster-manifests/cluster-baseline-settings/kured.yaml
@@ -1,4 +1,4 @@
# Source: https://github.com/weaveworks/kured/releases/download/1.10.1/kured-1.10.1-dockerhub.yaml
# Source: https://github.com/kubereboot/charts/tree/kured-4.1.0/charts/kured (1.11.0)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
@@ -16,6 +16,9 @@ rules:
- apiGroups: [""]
resources: ["pods"]
verbs: ["list","delete","get"]
- apiGroups: ["extensions"]
resources: ["daemonsets"]
verbs: ["get"]
- apiGroups: ["apps"]
resources: ["daemonsets"]
verbs: ["get"]
@@ -42,11 +45,15 @@ metadata:
namespace: cluster-baseline-settings
name: kured
rules:
# Allow kured to lock/unlock itself
- apiGroups: ["apps"]
resources: ["daemonsets"]
resourceNames: ["kured"]
verbs: ["update"]
# Allow kured to lock/unlock itself
- apiGroups: ["extensions"]
resources: ["daemonsets"]
resourceNames: ["n-kured"]
verbs: ["update", "patch"]
- apiGroups: ["apps"]
resources: ["daemonsets"]
resourceNames: ["kured"]
verbs: ["update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
@@ -76,26 +83,31 @@ metadata:
spec:
selector:
matchLabels:
name: kured
app.kubernetes.io/name: kured
updateStrategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
template:
metadata:
labels:
name: kured
app.kubernetes.io/name: kured
annotations:
prometheus.io/scrape: "true"
prometheus.io/path: "/metrics"
prometheus.io/port: "8080"
spec:
serviceAccountName: kured
tolerations:
- key: node-role.kubernetes.io/control-plane
effect: NoSchedule
- key: node-role.kubernetes.io/master
effect: NoSchedule
- effect: NoSchedule
key: CriticalAddonsOnly
- key: CriticalAddonsOnly
effect: NoSchedule
operator: Equal
value: "true"
hostNetwork: true
hostPID: true # Facilitate entering the host mount namespace via init
restartPolicy: Always
nodeSelector:
@@ -106,10 +118,10 @@ spec:
# PRODUCTION READINESS CHANGE REQUIRED
# This image should be sourced from a non-public container registry, such as the
# one deployed along side of this reference implementation.
# az acr import --source docker.io/weaveworks/kured:1.10.1 -n <your-acr-instance-name>
# az acr import --source ghcr.io/kubereboot/kured:1.11.0 -n <your-acr-instance-name>
# and then set this to
# image: <your-acr-instance-name>.azurecr.io/weaveworks/kured:1.10.1
image: docker.io/weaveworks/kured:1.10.1
# image: <your-acr-instance-name>.azurecr.io/kubereboot/kured:1.11.0
image: ghcr.io/kubereboot/kured:1.11.0
imagePullPolicy: IfNotPresent
securityContext:
privileged: true # Give permission to nsenter /proc/1/ns/mnt
@@ -120,14 +132,22 @@ spec:
requests:
cpu: 200m
memory: 16Mi
ports:
- containerPort: 8080
name: metrics
env:
# Pass in the name of the node on which this pod is scheduled
# for use with drain/uncordon operations and lock acquisition
- name: KURED_NODE_ID
valueFrom:
fieldRef:
fieldPath: spec.nodeName
command:
- /usr/bin/kured
args:
- --ds-namespace=cluster-baseline-settings
# - --ds-name=kured
# - --reboot-command=/bin/systemctl reboot
# - --force-reboot=false
# - --drain-grace-period=-1
# - --skip-wait-for-delete-timeout=0
64 changes: 56 additions & 8 deletions cluster-stamp.bicep
@@ -43,7 +43,7 @@ param clusterAuthorizedIPRanges array = []
'southeastasia'
])
param location string = 'eastus2'
param kubernetesVersion string = '1.24.6'
param kubernetesVersion string = '1.25.2'

@description('Domain name to use for App Gateway and AKS ingress.')
param domainName string = 'contoso.com'
@@ -1121,8 +1121,8 @@ resource paEnforceImageSource 'Microsoft.Authorization/policyAssignments@2021-06
policyDefinitionId: pdEnforceImageSourceId
parameters: {
allowedContainerImagesRegex: {
// If all images are pull into your ARC instance as described in these instructions you can remove the docker.io entries.
value: '${acr.name}\\.azurecr\\.io/.+$|mcr\\.microsoft\\.com/.+$|docker\\.io/weaveworks/kured.+$|docker\\.io/library/.+$'
// If all images are pulled into your ACR instance as described in these instructions, you can remove the docker.io & ghcr.io entries.
value: '${acr.name}\\.azurecr\\.io/.+$|mcr\\.microsoft\\.com/.+$|ghcr\\.io/kubereboot/kured.+$|docker\\.io/library/.+$'
}
excludedNamespaces: {
value: [
@@ -1651,7 +1651,7 @@ resource pdzAksIngress 'Microsoft.Network/privateDnsZones@2020-06-01' = {
}
}

resource mc 'Microsoft.ContainerService/managedClusters@2022-03-02-preview' = {
resource mc 'Microsoft.ContainerService/managedClusters@2022-09-02-preview' = {
name: clusterName
location: location
tags: {
@@ -1669,10 +1669,14 @@ resource mc 'Microsoft.ContainerService/managedClusters@2022-03-02-preview' = {
osDiskSizeGB: 80
osDiskType: 'Ephemeral'
osType: 'Linux'
osSKU: 'Ubuntu'
minCount: 3
maxCount: 4
vnetSubnetID: targetVirtualNetwork::snetClusterNodes.id
enableAutoScaling: true
enableCustomCATrust: false
enableFIPS: false
enableEncryptionAtHost: false
type: 'VirtualMachineScaleSets'
mode: 'System'
scaleSetPriority: 'Regular'
@@ -1699,10 +1703,14 @@ resource mc 'Microsoft.ContainerService/managedClusters@2022-03-02-preview' = {
osDiskSizeGB: 120
osDiskType: 'Ephemeral'
osType: 'Linux'
osSKU: 'Ubuntu'
minCount: 2
maxCount: 5
vnetSubnetID: targetVirtualNetwork::snetClusterNodes.id
enableAutoScaling: true
enableCustomCATrust: false
enableFIPS: false
enableEncryptionAtHost: false
type: 'VirtualMachineScaleSets'
mode: 'User'
scaleSetPriority: 'Regular'
@@ -1794,14 +1802,54 @@ resource mc 'Microsoft.ContainerService/managedClusters@2022-03-02-preview' = {
podIdentityProfile: {
enabled: false // Using federated workload identity for Azure AD Pod identities, not the deprecated AAD Pod Identity
}
autoUpgradeProfile: {
upgradeChannel: 'stable'
}
azureMonitorProfile: {
metrics: {
enabled: false // This is for the AKS-PrometheusAddonPreview, which is not enabled in this cluster as Container Insights is already collecting.
}
}
storageProfile: { // By default, do not support native state storage, enable as needed to support workloads that require state
blobCSIDriver: {
enabled: false // Azure Blobs
}
diskCSIDriver: {
enabled: false // Azure Disk
}
fileCSIDriver: {
enabled: false // Azure Files
}
snapshotController: {
enabled: false // CSI Snapshotter: https://github.com/kubernetes-csi/external-snapshotter
}
}
workloadAutoScalerProfile: {
keda: {
enabled: false // Enable if using KEDA to scale workloads
}
}
disableLocalAccounts: true
securityProfile: {
workloadIdentity: {
enabled: true
}
azureDefender: {
imageCleaner: {
enabled: true
intervalHours: 120 // 5 days
}
azureKeyVaultKms: {
enabled: false // Not enabled in this deployment, as it is not used. Enable as needed.
}
nodeRestriction: {
enabled: true // https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#noderestriction
}
customCATrustCertificates: [] // Empty
defender: {
logAnalyticsWorkspaceResourceId: la.id
securityMonitoring: {
enabled: true
}
}
}
oidcIssuerProfile: {
@@ -1871,10 +1919,10 @@ resource acrKubeletAcrPullRole_roleAssignment 'Microsoft.Authorization/roleAssig
}
}

// Grant the OMS Agent's Managed Identity the metrics publisher role to push alerts
resource mcOmsAgentMonitoringMetricsPublisherRole_roleAssignment 'Microsoft.Authorization/roleAssignments@2020-10-01-preview' = {
// Grant the Azure Monitor (formerly known as OMS) Agent's Managed Identity the metrics publisher role to push alerts
resource mcAmaAgentMonitoringMetricsPublisherRole_roleAssignment 'Microsoft.Authorization/roleAssignments@2020-10-01-preview' = {
scope: mc
name: guid(mc.id, 'omsagent', monitoringMetricsPublisherRole.id)
name: guid(mc.id, 'amagent', monitoringMetricsPublisherRole.id)
properties: {
roleDefinitionId: monitoringMetricsPublisherRole.id
principalId: mc.properties.addonProfiles.omsagent.identity.objectId
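The `allowedContainerImagesRegex` change in `cluster-stamp.bicep` swaps the `docker.io/weaveworks/kured` alternative for `ghcr.io/kubereboot/kured`. A minimal sketch for checking image references against that pattern locally, assuming `grep -E` is a close enough stand-in for the regex engine the policy uses; `myacr` is a hypothetical registry name standing in for `${acr.name}`, and `image_allowed` is an illustrative helper:

```bash
#!/bin/sh
# Hypothetical registry name; the Bicep template interpolates ${acr.name} here.
ACR_NAME="myacr"

# Same alternatives as the updated policy parameter, with the Bicep string
# escaping ('\\.') reduced to a plain regex escape ('\.').
ALLOWED_IMAGES_REGEX="${ACR_NAME}"'\.azurecr\.io/.+$|mcr\.microsoft\.com/.+$|ghcr\.io/kubereboot/kured.+$|docker\.io/library/.+$'

# Returns 0 when the image reference matches the allow-list pattern.
image_allowed() {
  printf '%s\n' "$1" | grep -Eq -- "$ALLOWED_IMAGES_REGEX"
}

for image in "ghcr.io/kubereboot/kured:1.11.0" "docker.io/weaveworks/kured:1.10.1"; do
  if image_allowed "$image"; then
    echo "allowed: $image"
  else
    echo "denied:  $image"
  fi
done
```

With the updated pattern, the new `ghcr.io` Kured image is accepted and the previous `docker.io/weaveworks` image is no longer matched.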
6 changes: 4 additions & 2 deletions networking/hub-regionA.bicep
@@ -651,8 +651,10 @@ resource fwPolicy 'Microsoft.Network/firewallPolicies@2021-05-01' = {
'${split(environment().authentication.loginEndpoint, '/')[2]}' // Prevent the linter from getting upset at login.microsoftonline.com
'*.blob.${environment().suffixes.storage}' // Required for the extension installer to download the Helm chart that installs Flux. This storage account name is not predictable, but looks like eusreplstore196, for example.
'azurearcfork8s.azurecr.io' // required for a few of the images installed by the extension.
'*.docker.io' // Only required if you use the default bootstrapping manifests included in this repo. Kured is sourced from here by default.
'*.docker.com' // Only required if you use the default bootstrapping manifests included in this repo. Kured is sourced from here by default.
'*.docker.io' // Only required if you use the default bootstrapping manifests included in this repo.
'*.docker.com' // Only required if you use the default bootstrapping manifests included in this repo.
'ghcr.io' // Only required if you use the default bootstrapping manifests included in this repo. Kured is sourced from here by default.
'pkg-containers.githubusercontent.com' // Only required if you use the default bootstrapping manifests included in this repo. Kured is sourced from here by default.
]
targetUrls: []
destinationAddresses: []
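The firewall change in `hub-regionA.bicep` adds `ghcr.io` and `pkg-containers.githubusercontent.com` next to the existing Docker wildcards. A rough sketch for sanity-checking a hostname against those four entries, assuming shell glob matching is an acceptable approximation of how the firewall evaluates wildcard FQDNs (the real matching rules may differ); `fqdn_allowed` and `is_bootstrap_fqdn_allowed` are hypothetical helpers, and the rule's other target FQDNs are left out:

```bash
#!/bin/sh
# Hypothetical helper: returns 0 when the host matches any of the given
# patterns. Patterns are treated as shell globs, which only approximates
# the firewall's wildcard handling.
fqdn_allowed() {
  _host="$1"
  shift
  for _pattern in "$@"; do
    case "$_host" in
      $_pattern) return 0 ;;
    esac
  done
  return 1
}

# Only the four entries touched by this commit; the rule contains others.
is_bootstrap_fqdn_allowed() {
  fqdn_allowed "$1" \
    '*.docker.io' \
    '*.docker.com' \
    'ghcr.io' \
    'pkg-containers.githubusercontent.com'
}

# Usage:
#   is_bootstrap_fqdn_allowed "ghcr.io" && echo "allowed"
```

Under this glob approximation, `*.docker.io` covers subdomains such as `registry-1.docker.io` but not the bare `docker.io` host.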
