netbox-ip-controller is not resilient to data loss/corruption in k8s #17

Open
d-honeybadger opened this issue Jul 19, 2022 · 0 comments

Comments

@d-honeybadger
Contributor

If something happens to the underlying storage in k8s (e.g. etcd data is corrupted and has to be restored from a backup), netbox-ip-controller cannot fully recover without manual intervention. The IPs created in NetBox for netboxip objects that disappeared in the data loss are never deleted, leading to duplicate IPs in NetBox.

A possible approach for solving this:
Tag all IPs created by the controller with a key that is specific to this particular controller, for example the k8s cluster name. This lets netbox-ip-controller list all of the IPs in NetBox that it created and should be managing. Then add a periodic sync loop that lists all current IPs in NetBox carrying that tag and makes sure each of them has a parent netboxip object. The sync doesn't need to run often, since it should only matter after etcd data loss. The tag that ties IPs to a given controller in NetBox needs to be separate from --pod-ip-tags/--service-ip-tags, because it has a special meaning and must not be used for any IPs other than the ones managed by that controller.
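
To make the check inside the sync loop concrete, here is a minimal, language-agnostic sketch in Python. It is not the controller's actual code: `NetBoxIP`, `owner_uid`, `find_orphaned_ips`, and the `cluster-east` tag are hypothetical names, and it assumes each NetBox IP records which netboxip object it was created for. It works on in-memory data only and does not call NetBox or the k8s API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetBoxIP:
    """Hypothetical in-memory view of an IP address record in NetBox."""
    address: str
    owner_uid: str  # assumption: identifier of the parent netboxip object
    tags: frozenset = frozenset()


def find_orphaned_ips(netbox_ips, controller_tag, live_netboxip_uids):
    """Return IPs owned by this controller that have no parent netboxip object.

    Only IPs carrying controller_tag are considered, so IPs managed by other
    controllers (or created by hand) are never reported as orphans.
    """
    orphans = []
    for ip in netbox_ips:
        if controller_tag not in ip.tags:
            continue  # not ours, leave it alone
        if ip.owner_uid not in live_netboxip_uids:
            orphans.append(ip)  # parent netboxip object no longer exists
    return orphans


# Example: after an etcd restore, only the netboxip object "uid-a" still exists.
ips = [
    NetBoxIP("10.0.0.1/32", "uid-a", frozenset({"cluster-east", "pod"})),
    NetBoxIP("10.0.0.2/32", "uid-b", frozenset({"cluster-east"})),
    NetBoxIP("10.0.0.3/32", "uid-c", frozenset({"cluster-west"})),
]
orphans = find_orphaned_ips(ips, "cluster-east", {"uid-a"})
```

In this example only `10.0.0.2/32` is reported: it carries this controller's tag but its parent is gone. `10.0.0.3/32` is skipped because it belongs to a different controller, which is why the ownership tag has to be distinct from the pod/service tags. What to do with the orphans (delete them or just log them) is left open here.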

ztariq21 pushed a commit that referenced this issue Jul 31, 2023