Update Console/Cortex requirements #602

juma-conduktor · 2025-01-23T13:40:57Z

This PR updates the Console documentation as follows:

Adds minimum and recommended requirements for Cortex based on real customer usage
Updates the documentation to show that Cortex is a mandatory part of the deployment
Re-emphasises that we recommend block storage for historical metrics.

…state that Cortex container is not optional.

docs/platform/get-started/installation/hardware.md

AurelieMarcuzzo · 2025-01-23T14:46:40Z

+**Minimum**
+
+- 4 CPU Cores
+- 6 GB of RAM
+- 10 GB of disk space
+
+**Recommended**
+
+- 8 CPU Cores
+- 8+ GB of RAM
+- [block storage](/platform/get-started/configuration/env-variables#monitoring-properties)


This is weird to me. First, it doesn't match our defaults in the Helm chart, so we're not consistent. Second, for the "recommended", I'd mention that it should depend on the amount of metrics we're getting. We wrote a KB article to explain how to properly size it.

Also, I believe the disk space should depend on whether or not we externalize the storage of the metrics. This leads to mentioning how long the metrics are persisted locally, how often they're pushed to the storage, ...

This is a platform consistency problem which we are not trying to solve in this PR. Our Helm Charts are also not our official documentation, and while I agree they should be consistent, I can only fix one issue at a time.

In addition, every customer I encounter has different settings for Cortex, which causes a lot of problems and support cases. I have based the above recommendations on the data I have gathered. Will these fit every use case? Certainly not. However, they will greatly reduce the number of issues our customers have hit over the past several months.

The KB article you mentioned is, unfortunately, too limited in scope when it comes to sizing. Simply knowing how many metrics you have and calculating the amount of RAM required does not solve the majority of metrics issues I've encountered. It's certainly a good starting point when troubleshooting, but in most cases it's not a solution in and of itself.

The ultimate solution is for engineering to perform benchmarking and provide a sizing guide either in chart form or using a formula, along with the settings to adjust. I don't believe that will happen in the short or even medium term, so the goal here is to stop the bleeding as much as possible today.

I also think we need to completely re-evaluate how we think of and provide metrics in general, but again, that is a much bigger discussion with many more stakeholders.

TL;DR - I hear you, however this PR is meant as a band-aid, as the other solutions will take more time and energy, with buy-in from other stakeholders.

"The ultimate solution is for engineering to perform benchmarking and provide a sizing guide either in chart form or using a formula": YES.

Completely agree with you. My point is just not to say "with this, it will work", but to be a bit more nuanced and say "these are the recommended resources, but based on your infra you might need more or less. If this is not enough because you have many topics, partitions, etc., please consider adding more resources", or something like that. wdyt?

So do you think we should reduce the minimum here and leave the recommended as is, or should we adjust both?

Let's reduce the minimum and leave the recommended, but add a sentence to explain that this might depend on their infra, and that the bigger the cluster, the more resources they should allocate.

I am happy to reduce the minimum and leave the recommended.

@AurelieMarcuzzo - good with the change?
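
For reference while finalising the numbers, here is a minimal sketch of how the two tiers in the diff could be turned into a quick host check. It uses the figures as first proposed in this PR (the thread agreed to lower the minimum, and the final values are not stated here), treats "8+ GB" as an 8 GB floor, and reuses the minimum disk figure for the recommended tier because the diff gives no disk number there. The names `Tier` and `classify` are hypothetical and only for illustration. As discussed above, real sizing depends on the amount of metrics, so this is a sanity check and not a sizing guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """One row of the Cortex requirements table (illustrative only)."""
    name: str
    cpu_cores: int
    ram_gb: int
    disk_gb: int


# Figures as first proposed in the diff; the thread agreed to lower the minimum.
MINIMUM = Tier("minimum", cpu_cores=4, ram_gb=6, disk_gb=10)
# Assumption: "8+ GB" is read as an 8 GB floor, and the minimum disk figure
# is reused because the recommended tier lists block storage instead of a size.
RECOMMENDED = Tier("recommended", cpu_cores=8, ram_gb=8, disk_gb=MINIMUM.disk_gb)


def meets(tier: Tier, cpu_cores: int, ram_gb: float, disk_gb: float) -> bool:
    """Return True when every resource is at or above the tier's floor."""
    return (
        cpu_cores >= tier.cpu_cores
        and ram_gb >= tier.ram_gb
        and disk_gb >= tier.disk_gb
    )


def classify(cpu_cores: int, ram_gb: float, disk_gb: float) -> str:
    """Name the highest tier the given resources satisfy."""
    if meets(RECOMMENDED, cpu_cores, ram_gb, disk_gb):
        return RECOMMENDED.name
    if meets(MINIMUM, cpu_cores, ram_gb, disk_gb):
        return MINIMUM.name
    return "below minimum"


if __name__ == "__main__":
    print(classify(cpu_cores=4, ram_gb=6, disk_gb=10))
```

A check like this only says whether a deployment clears the documented floors. Larger clusters with many topics and partitions may still need more than the recommended tier.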

docs/platform/get-started/installation/hardware.md

RG-conduktor · 2025-03-13T11:21:55Z

+
+The Console container provides the web interface while the Cortex container provides the metrics.
+
+**Note:**  It is not supported to run Console without the Cortex container.

 Jump to:


Can we remove this mini-doc entirely? The right-hand navigation is there to help you navigate what's on the page; this seems unnecessary. We've also been removing these from other pages (like SNI routing), so would be good to keep consistency.

RG-conduktor · 2025-03-31T08:06:38Z

@juma-conduktor hey, is this PR ready to be completed? :)

Stu-conduktor · 2025-04-01T19:15:37Z

@RG-conduktor - I suggest you add some of your minor text changes to resolve the comments.
I've asked Aurelie about the main change, which is the requirements discussed in the top thread.

Added requirements for Cortex container and updated documentation to …

08c72ed

…state that Cortex container is not optional.

juma-conduktor requested review from jeanlouisboudart, AurelieMarcuzzo and Stu-conduktor January 23, 2025 13:40

vercel bot deployed to Preview January 23, 2025 13:42

AurelieMarcuzzo reviewed Jan 23, 2025

Merge branch 'main' into cortex_requirements

f33439b

vercel bot deployed to Preview March 13, 2025 08:10