Downtimes: mark their histories as cancelled when removed from conf file #913

Draft
wants to merge 1 commit into
base: main

yhabteab
Member

This PR tries to work around the problem described in #910. When a Checkable object that has downtimes set, or even just the downtime configuration itself, is removed, whether manually from the conf files or through the /v1/config/packages API endpoint as the Icinga Director does, Icinga 2 cannot close the downtimes properly before their configuration is removed. In such situations, you end up with broken histories and possibly corrupted SLA results for the affected checkable.

Since Icinga DB learns with each initial config dump which downtime configurations have been deleted, this PR makes use of that fact and tries to fake the corresponding removed event for these downtimes. As of now, only the downtime_history table is updated, namely its cancel_time, has_been_cancelled and cancelled_by fields, but we also want to create some kind of event for the regular history and sla_downtime_history tables.
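
To illustrate the idea only (this is not the code in this PR), here is a minimal in-memory sketch in Go. It assumes that the set of downtime IDs present in the config dump can be compared against the history rows that are still open. The `DowntimeHistory` struct, its `Closed` flag, the `cancelRemoved` helper, the millisecond timestamp and the `placeholder-author` value are all hypothetical; only the three cancel field names come from the description above.

```go
package main

import (
	"fmt"
	"sort"
)

// DowntimeHistory is a hypothetical, trimmed-down stand-in for a downtime_history row.
// Only the three cancel fields are named in the PR description; Closed is an
// illustrative flag meaning "an end or cancel event was already recorded".
type DowntimeHistory struct {
	Closed           bool
	HasBeenCancelled bool
	CancelledBy      string
	CancelTime       int64 // Unix milliseconds (assumed unit)
}

// cancelRemoved marks every still-open history row whose downtime is missing
// from the config dump as cancelled and returns the affected IDs, sorted.
func cancelRemoved(history map[string]*DowntimeHistory, dumped map[string]bool, by string, nowMs int64) []string {
	var cancelled []string
	for id, row := range history {
		// Skip downtimes that still exist in the config or were closed regularly.
		if dumped[id] || row.Closed {
			continue
		}
		row.Closed = true
		row.HasBeenCancelled = true
		row.CancelledBy = by
		row.CancelTime = nowMs
		cancelled = append(cancelled, id)
	}
	sort.Strings(cancelled)
	return cancelled
}

func main() {
	history := map[string]*DowntimeHistory{
		"dt-kept":    {},             // still present in the config dump
		"dt-removed": {},             // open in history, gone from the config
		"dt-ended":   {Closed: true}, // ended normally before the dump
	}
	dumped := map[string]bool{"dt-kept": true}

	ids := cancelRemoved(history, dumped, "placeholder-author", 1700000000000)
	fmt.Println(ids) // prints [dt-removed]
}
```

The sketch covers only the downtime_history part; what to write into the regular history and sla_downtime_history tables is the open question below.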

TBD:

  • Should we just insert a fake downtime_end event into the aforementioned two history tables, or something else, to indicate that the downtime no longer exists? @julianbrost suggested generating a completely new event type like checkable_deleted instead of faking the end event ourselves, but since this problem also occurs without deleting the checkable itself, I'm not sure that event would reflect what's happening. However, in cases where a checkable has been deleted along with its downtimes, it could be an alternative and would effectively serve as a reset to factory settings.

fixes #910

Development

Successfully merging this pull request may close these issues.

Downtime on a removed object are never closed.