docjyJ commented Sep 21, 2024
- Add Unhealthy state
- Replace class by backed enum
Hey Jean-Yves, Simon asked me to have a look at your changes. I left some feedback regarding the code itself.
I really like that you refactored the container state logic from an interface to an enum. However, I think the two concerns could have been split into separate PRs: one to refactor the interface and another to implement the new health states. But the code is already there, so why not.
Looks good otherwise, but I did not test it.
|
Hi, thanks for the answer. As soon as I have time, I'll look into it. I will try to split this up into several PRs. Take care, |
|
I'll keep this PR open for the unhealthy state. |
1f3f231 to b9ca83a
|
Conflicts :/ |
|
Don't worry, I'll handle it. |
b9ca83a to 6a3c340
|
Solved and up to date and ready (to be tested anyway...) |
|
I have podman on my machine and I can't launch the container... I can't test it and I don't have time to debug... |
Why may I ask? Do you run into an issue here? |
st3iny left a comment
Awesome! Looks good to me now!
|
Health checks may need to be adjusted, probably reduce the interval during startup to speed up startup of all containers.
See: https://docs.docker.com/reference/dockerfile/#healthcheck |
|
This should allow for better dependency management. |
|
Thanks for the idea! However, our health checks are currently built in a way that they never fail after a specific time. See for example https://github.com/nextcloud/all-in-one/blob/main/Containers/apache/healthcheck.sh. For the rest, the defaults are good enough imho. |
The problem with doing this is that Docker then considers the container ready: the first successful check immediately takes the container out of the starting state.
|
This is the only reliable way to check if the container is really up and running. |
|
A fix would be to fail the container if nextcloud:9000 is not reachable and add a startup period (e.g. 10 minutes):
#!/bin/bash
- nc -z "$NEXTCLOUD_HOST" 9000 || exit 0
+ nc -z "$NEXTCLOUD_HOST" 9000 || exit 1
nc -z 127.0.0.1 8000 || exit 1
nc -z 127.0.0.1 "$APACHE_PORT" || exit 1
if ! nc -z "$NC_DOMAIN" 443; then
    echo "Could not reach $NC_DOMAIN on port 443."
    exit 1
fi

- HEALTHCHECK CMD /healthcheck.sh
+ HEALTHCHECK --start-period=10m CMD /healthcheck.sh |
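As a rough illustration of why the exit code matters here (this is a toy model, not Docker's actual implementation), the shell sketch below replays a sequence of probe exit codes following the behaviour described in the Dockerfile reference: a success marks the container healthy and ends the start period, while failures only count toward the retries threshold once the start period is over. The function name `simulate_health` and the simplification of expressing the start period as a number of probes are assumptions made for this example.

```shell
#!/bin/bash
# Toy model of Docker's health status transitions (an assumption-laden sketch).
# $1 = number of probes that fall inside the start period (simplification)
# $2 = retries threshold
# remaining arguments = one probe exit code each, in order
simulate_health() {
    local start_probes=$1 retries=$2
    shift 2
    local status=starting failures=0 probe=0 started=no code
    for code in "$@"; do
        probe=$((probe + 1))
        if [ "$code" -eq 0 ]; then
            # Any success marks the container healthy and ends the start period.
            status=healthy
            failures=0
            started=yes
        elif [ "$started" = yes ] || [ "$probe" -gt "$start_probes" ]; then
            # Failures only count once the start period is over.
            failures=$((failures + 1))
            if [ "$failures" -ge "$retries" ]; then
                status=unhealthy
            fi
        fi
    done
    echo "$status"
}
```

In this model, `simulate_health 5 3 0` reports `healthy` after a single `exit 0`, which is the concern raised above: a check that exits 0 while Nextcloud is unreachable leaves the starting state right away. With `simulate_health 5 3 1 1 1` the container stays in `starting`, because failures inside the start period are not counted.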
This is the problem: we cannot specify a time here, as it depends on the overall setup, such as installed apps and features, number of users, the given hardware and more, especially during upgrades.
Yes, I see... Would it be better to manage dependencies the way Docker Compose does?
|
|
I fear this is going to have the same problem afaics, no? |
|
@docjyJ LGTM now 😊 Can you please rebase the PR and squash the commits? |
730f4f4 to 9482e43
Signed-off-by: Jean-Yves <7360784+docjyJ@users.noreply.github.com>
9482e43 to 4798489
|
Done |
public function isStarting(): bool {
    return $this == self::Starting;
}

public function isRestarting(): bool {
    return $this == self::Restarting;
}

public function isHealthy(): bool {
    return $this == self::Healthy;
}

public function isUnhealthy(): bool {
    return $this == self::Unhealthy;
}

public function isRunning(): bool {
    return $this->isHealthy() || $this->isUnhealthy() || $this->isStarting() || $this->isRestarting();
}
The Running state corresponds to the old GetRunningContainerState state.
Maybe there is a clearer name for it. Healthy means that the container is running without any problem detected by AIO.
// Don't start if container is already running
// This is expected to happen if a container is defined in depends_on of multiple containers
- if ($container->GetRunningState() === ContainerState::Running) {
+ if ($container->GetContainerState()->isRunning()) {
container->GetRunningState does not include Starting and Restarting
To keep this easy to understand, I went through the isSomething() functions.
if ($responseBody['State']['Running'] === true) {
    return ContainerState::Running;
} else {
    return ContainerState::Stopped;
}
This is true regardless of whether the state is Healthy, Starting or Running.
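A minimal sketch of how the fields returned by `docker inspect` could be folded into a single state, written in shell rather than the PR's PHP so it can be tried from a terminal. `State.Running`, `State.Restarting` and `State.Health.Status` are real fields of the inspect output; the function names `resolve_state` and `is_running`, and the fallback name `running` for containers without a health check, are hypothetical and not taken from the PR.

```shell
#!/bin/bash
# Hypothetical helper: map three `docker inspect` fields to one state name.
# $1 = .State.Running ("true"/"false")
# $2 = .State.Restarting ("true"/"false")
# $3 = .State.Health.Status (empty when no health check is defined)
resolve_state() {
    local running=$1 restarting=$2 health=${3:-none}
    if [ "$restarting" = true ]; then
        echo restarting
        return
    fi
    if [ "$running" != true ]; then
        echo stopped
        return
    fi
    # State.Running is true for all of the cases below, so the health
    # status is what tells them apart.
    case "$health" in
        starting)  echo starting ;;
        healthy)   echo healthy ;;
        unhealthy) echo unhealthy ;;
        *)         echo running ;;  # assumption: no health check defined
    esac
}

# Mirrors the idea of isRunning(): any of these states counts as running.
is_running() {
    case "$1" in
        healthy|unhealthy|starting|restarting|running) return 0 ;;
        *) return 1 ;;
    esac
}
```

The three arguments could come from, for example, `docker inspect --format '{{.State.Running}} {{.State.Restarting}} {{if .State.Health}}{{.State.Health.Status}}{{end}}' <container>`. Checking `State.Running` alone cannot distinguish a starting container from a healthy one, which is the point made in the comment above.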
@docjyJ I discussed the latest changes with @st3iny and we agree that they do not make much sense, or at least are not easy enough to merge. I am very sorry for that.
I would like to ask you if you could restore the changes from b20e2a8 by running git checkout <hash> and then push the old changes to a new branch like so: git checkout -b ench/noid/heathcheck-restored. From that commit hash, I think only a check like https://github.com/nextcloud/all-in-one/pull/5307/files#diff-502ad3fbc5a2714763795f78cf314d4a76ada0cb3746530f2fb86fa471dd897bR91-R105 is missing (ideally added in a new commit).
Can you please do the above? I would do it myself, but unfortunately I no longer have your changes available locally.