
Alter the home of ceph conf to not interfere with other applications #23

Merged · 3 commits merged into main on Sep 4, 2024

Conversation

@addyess (Member) commented Sep 3, 2024

Overview

Addresses #22 (LP#2078646).
By changing the location of the charm's ceph configuration and credentials, the charm can integrate with ceph-mon even on hosts where ceph-osd may be present.

Details

  • Previously, ceph-csi wrote to /etc/ceph/ceph.conf and /etc/ceph/ceph.client.<app>.keyring.
  • Now it writes these files to a charm-specific directory, /var/lib/juju/agents/unit-<app>-<id>/ceph-conf.
  • The charm-local config directory is set to mode 0700 to prevent non-root users from accessing the keyring file (see the sketch below).
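A minimal sketch of how a charm might create such a per-unit directory; this is illustrative only, and the function and variable names are not taken from the charm's actual code:

```python
import os
from pathlib import Path


def ceph_conf_dir(unit_name: str) -> Path:
    """Return a charm-specific directory for ceph.conf and the keyring.

    Illustrative sketch: the real charm derives this from its own agent
    directory rather than rebuilding the path from the unit name.
    """
    # Unit "ceph-csi/0" lives under the agent directory "unit-ceph-csi-0".
    agent = "unit-" + unit_name.replace("/", "-")
    path = Path("/var/lib/juju/agents") / agent / "ceph-conf"
    path.mkdir(mode=0o700, parents=True, exist_ok=True)
    # Tighten permissions in case the directory already existed with a looser mode.
    path.chmod(0o700)
    return path


if __name__ == "__main__":
    conf_dir = ceph_conf_dir(os.environ.get("JUJU_UNIT_NAME", "ceph-csi/0"))
    print(conf_dir)
```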

@addyess force-pushed the issue/22/converged-with-ceph-osd branch from 49147e7 to 3aef14f on September 3, 2024 17:20
@addyess force-pushed the issue/22/converged-with-ceph-osd branch from 3aef14f to b764968 on September 3, 2024 18:02
@nobuto-m left a comment

Overall, it looks good to me, since my initial concern about the ceph.conf conflict should be resolved by this. We just need to do more testing of migration, etc. once this hits the edge channel.

@addyess (Member, Author) commented Sep 4, 2024

> Overall, it looks good to me, since my initial concern about the ceph.conf conflict should be resolved by this. We just need to do more testing of migration, etc. once this hits the edge channel.

I did all my testing with kubernetes-control-plane (through some hard labor to get them on the same machine). It wasn't hard to move these files, as this charm only needs to run some ceph commands occasionally, so it's just a client config.

@nobuto-m commented Sep 4, 2024

> Overall, it looks good to me, since my initial concern about the ceph.conf conflict should be resolved by this. We just need to do more testing of migration, etc. once this hits the edge channel.

> I did all my testing with kubernetes-control-plane (through some hard labor to get them on the same machine). It wasn't hard to move these files, as this charm only needs to run some ceph commands occasionally, so it's just a client config.

I just meant the migration testing from the previously broken state. ceph.conf for ceph-osd was overwritten by ceph-csi before, and we need to rewrite the expected ceph.conf for ceph-osd once again in the charm way. It's out of the ceph-csi charm's scope, but it's more general testing as a whole.

@addyess (Member, Author) commented Sep 4, 2024

> I just meant the migration testing from the previously broken state. ceph.conf for ceph-osd was overwritten by ceph-csi before, and we need to rewrite the expected ceph.conf for ceph-osd once again in the charm way. It's out of the ceph-csi charm's scope, but it's more general testing as a whole.

Would the ceph-osd charm not have a way to rewrite its own config? I understand the ceph-csi charm stomped on it, but it seems reasonable that the osd charm could stomp over it again, right? Would a config-changed hook of some sort rewrite ceph.conf on the osd charm?

I agree it's out of the scope of the ceph-csi charm to write to, or even modify, /etc/ceph/ at all, really.

@nobuto-m commented Sep 4, 2024

> Would the ceph-osd charm not have a way to rewrite its own config? I understand the ceph-csi charm stomped on it, but it seems reasonable that the osd charm could stomp over it again, right? Would a config-changed hook of some sort rewrite ceph.conf on the osd charm?

My gut feeling is that upgrading the ceph-csi charm would trigger some relation events to ceph-mon, and ceph-osd would run a template rendering to react to those events as a cascading effect. If that's the case, nothing is required other than refreshing the ceph-csi charm. Let's land this fix in the edge channel whenever it's ready, and I will run some tests to make sure our theories are correct.

@mateoflorido (Member) left a comment
LGTM! Just a nit about using the Path object as a str. Perhaps we should use the absolute().as_posix() API instead. This is just a non-blocking suggestion :)
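For illustration only (this snippet is not from the PR, and the path is made up), the nit amounts to being explicit when a Path is handed to something that expects a string:

```python
from pathlib import Path

conf = Path("/var/lib/juju/agents/unit-ceph-csi-0/ceph-conf/ceph.conf")

# Implicit: relies on str() of the Path, which keeps whatever form it was built with.
arg_implicit = str(conf)

# Suggested: explicitly produce an absolute, POSIX-style path.
arg_explicit = conf.absolute().as_posix()

print(arg_implicit)
print(arg_explicit)
```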

@addyess merged commit 64c1429 into main on Sep 4, 2024 · 7 checks passed
@addyess deleted the issue/22/converged-with-ceph-osd branch on September 4, 2024 17:24
@lathiat commented Nov 20, 2024

> Would the ceph-osd charm not have a way to rewrite its own config? I understand the ceph-csi charm stomped on it, but it seems reasonable that the osd charm could stomp over it again, right? Would a config-changed hook of some sort rewrite ceph.conf on the osd charm?

> My gut feeling is that upgrading the ceph-csi charm would trigger some relation events to ceph-mon, and ceph-osd would run a template rendering to react to those events as a cascading effect. If that's the case, nothing is required other than refreshing the ceph-csi charm. Let's land this fix in the edge channel whenever it's ready, and I will run some tests to make sure our theories are correct.

This does not appear to be true. Once you refresh the ceph-csi charm, ceph.conf stays as it was. You can trigger ceph-osd to write the correct one with:
juju exec --application ceph-osd ./hooks/config-changed

Neither charm seems to proactively rewrite ceph.conf, so once you run that command, the file will mostly stay put unless one of the juju relations changes or a config change is made. The notable exception is that ceph-csi rewrites ceph.conf when the leader changes, but only on the new leader. So if you reboot all nodes (e.g. a power outage) or the node that happens to be the ceph-csi leader, that new leader unit, and only that unit, will get the broken ceph.conf. Also remember that relation changes (on the ceph-client or kubernetes-info relation) or a juju config change may rewrite every ceph.conf back to the ceph-csi version at any random point, and it can stay that way for days or weeks, with the problem only noticed when you reboot (or intentionally restart an OSD, or it crashes).

If you hit this issue (OSDs won't start after reboot), the workaround is to:

  1. Force ceph-osd to rewrite ceph.conf with: juju exec --application ceph-osd ./hooks/config-changed
  2. Reset the failed status: systemctl reset-failed ceph-osd@*
  3. Start the OSDs: systemctl start ceph-osd.target (restart also works, but neither will succeed without running reset-failed first)

It appears that ceph-csi works fine with the ceph-osd ceph.conf (a 51-line file), but the reverse is not true: ceph-osd (and mon and mds) do not work with the ceph-csi version of ceph.conf (a 16-line file). The reason is that ceph-csi relies on the default global value of keyring, looking in /etc/ceph/$cluster.$name.keyring, whereas the OSD relies on a custom [osd] section setting keyring to /var/lib/ceph/osd/$cluster-$id/keyring ([mon] and [mds] sections likewise).
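For illustration only (these are not the actual files rendered by either charm, and the values are placeholders), the difference boils down to which keyring a daemon ends up reading:

```ini
# Minimal client-style config (the ceph-csi shape): no keyring setting,
# so clients fall back to the default /etc/ceph/$cluster.$name.keyring lookup.
[global]
fsid = 00000000-0000-0000-0000-000000000000
mon host = 10.0.0.1,10.0.0.2,10.0.0.3

# ceph-osd's config additionally points daemons at their own keyrings:
[osd]
keyring = /var/lib/ceph/osd/$cluster-$id/keyring
```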

HomayoonAlimohammadi pushed a commit that referenced this pull request Dec 13, 2024
…23)

* Alter the home of ceph.conf to not interfere with other applications

* Log to /var/log/ceph/<unit-name>.log
HomayoonAlimohammadi added a commit that referenced this pull request Dec 13, 2024
…ith other applications (#26)

* Alter the home of ceph.conf to not interfere with other applications

* Log to /var/log/ceph/<unit-name>.log

Co-authored-by: Adam Dyess <[email protected]>