Error getting container metrics: json: unsupported value: NaN, plus high CPU usage spikes with Coolify #4

Closed
MatteoGauthier opened this issue Jun 26, 2024 · 6 comments

Comments

@MatteoGauthier

On my Coolify instance, I've enabled Sentinel and I noticed high CPU usage spikes and lots of error logs from the sentinel container.

Screenshot 2024-06-27 at 01 03 59

@livghit

livghit commented Jul 5, 2024

Hey, I've inspected the code a bit and noticed this happens in the getOneContainerMetrics func:

func getOneContainerMetrics(containerID string, csv bool) (string, error) {
	ctx := context.Background()
	apiClient, err := client.NewClientWithOpts()
	if err != nil {
		return "", err
	}
	apiClient.NegotiateAPIVersion(ctx)
	defer apiClient.Close()
	metrics := ContainerMetrics{
		CPUUsagePercentage:    0,
		MemoryUsagePercentage: 0,
		MemoryUsed:            0,
		MemoryAvailable:       0,
		NetworkUsage:          NetworkDevice{},
	}
	container, err := apiClient.ContainerInspect(ctx, containerID)
	if err != nil {
		return "", err
	}
	stats, err := apiClient.ContainerStats(ctx, container.ID, false)
	if err != nil {
		return "", err
	}
	var v types.StatsJSON
	dec := json.NewDecoder(stats.Body)
	if err := dec.Decode(&v); err != nil {
		if err != io.EOF {
			fmt.Printf("Error decoding container stats: %v\n", err)
		}
	}
	defer stats.Body.Close()
	network_devices := v.Networks
	for _, device := range network_devices {
		metrics.NetworkUsage = NetworkDevice{
			Name:    device.InstanceID,
			RxBytes: device.RxBytes,
			TxBytes: device.TxBytes,
		}
	}

	metrics = ContainerMetrics{
		Time:                  getUnixTimeInMilliUTC(),
		CPUUsagePercentage:    calculateCPUPercent(v),
		MemoryUsagePercentage: calculateMemoryPercent(v),
		MemoryUsed:            calculateMemoryUsed(v),
		MemoryAvailable:       v.MemoryStats.Limit,
		NetworkUsage:          metrics.NetworkUsage,
	}
	jsonData, err := json.MarshalIndent(metrics, "", "    ")
	if err != nil {
		return "", err
	}
	if csv {
		return fmt.Sprintf("%s,%f,%d,%f\n", metrics.Time, metrics.CPUUsagePercentage, metrics.MemoryUsed, metrics.MemoryUsagePercentage), nil
	}
	return string(jsonData), nil
}

I think one of the calculations may be the reason this happens, but I am not sure.

	metrics = ContainerMetrics{
		Time:                  getUnixTimeInMilliUTC(),
		CPUUsagePercentage:    calculateCPUPercent(v),
		MemoryUsagePercentage: calculateMemoryPercent(v),
		MemoryUsed:            calculateMemoryUsed(v),
		MemoryAvailable:       v.MemoryStats.Limit,
		NetworkUsage:          metrics.NetworkUsage,
	}
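
To illustrate how a percentage calculation could lead to this exact error, here is a minimal sketch. It is not Sentinel's code: `sample`, `rawPercent`, `safePercent`, and `marshalErr` are hypothetical names, and the bodies of `calculateCPUPercent` and `calculateMemoryPercent` are not shown in this thread, so the unguarded division is only an assumption about what they might do. One way the inputs could all be zero: `getOneContainerMetrics` keeps going after a failed decode (it ignores `io.EOF` and only prints other errors), so `v` may still be a zero-valued struct when the calculations run.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// sample is a hypothetical stand-in for ContainerMetrics with one float field.
type sample struct {
	CPUUsagePercentage float64 `json:"cpu_usage_percentage"`
}

// rawPercent divides without a guard. With float64 operands, 0/0 is NaN
// at runtime rather than a panic.
func rawPercent(used, total float64) float64 {
	return used / total * 100
}

// safePercent returns 0 when the total is zero or the result is not finite,
// so the value can always be encoded as JSON.
func safePercent(used, total float64) float64 {
	if total == 0 {
		return 0
	}
	p := used / total * 100
	if math.IsNaN(p) || math.IsInf(p, 0) {
		return 0
	}
	return p
}

// marshalErr encodes a sample holding v and returns the error text,
// or an empty string when encoding succeeds.
func marshalErr(v float64) string {
	if _, err := json.Marshal(sample{CPUUsagePercentage: v}); err != nil {
		return err.Error()
	}
	return ""
}

func main() {
	// encoding/json refuses to encode NaN.
	fmt.Println(marshalErr(rawPercent(0, 0))) // json: unsupported value: NaN

	// The guarded version encodes without an error.
	fmt.Printf("%q\n", marshalErr(safePercent(0, 0))) // ""
}
```

If the real helpers do divide by a delta or limit that can be zero, a guard like `safePercent` (or returning early when decoding fails) would avoid the marshal error. Whether that is the actual cause here is unconfirmed.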

@livghit

livghit commented Jul 5, 2024

Tested the whole thing locally and inside Docker. I wasn't able to reproduce your error... 🥲

@mutonby

mutonby commented Jul 8, 2024

I have the same error, can it be deactivated or something?
Screenshot 2024-07-08 at 14 15 04

@Rhiz3K

Rhiz3K commented Jul 10, 2024

Same here today after the 307 update. Per a message on Discord, disabling metrics helped.
[attached image]

@andrasbacsai
Member

In Coolify v312, I disabled Sentinel on all servers until this bug (and a few others) are fixed.

@andrasbacsai
Member

I changed everything in the next version (rewritten from scratch), so this won't be an issue. Closing this as well.
