Merging samples for model prediction #502
Comments
I am happy to be assigned this. I guess this is a change after the fit, where the correct summing and propagation of uncertainties need to be done and all metadata updated accordingly. Is that correct?
Steps to achieve the implementation via the API:
Things to think about:
From what I've seen it is difficult to change the … I think the implementation should be done in a way that works pre-fit as well.
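A pre-fit variant could merge samples directly in the workspace specification, before the model is ever built. A minimal sketch, assuming plain dict manipulation of a pyhf-style workspace spec (sample names are hypothetical, and modifiers of the merged samples are simply dropped, which is only acceptable for samples that carry none):

```python
# Hedged sketch: merge samples at the workspace-specification level (pre-fit).
# The dict layout follows the pyhf JSON schema; names below are made up, and
# modifiers of merged samples are dropped (only valid for modifier-free samples).

def merge_samples_in_spec(spec, to_merge, merged_name):
    """Sum the yields of the samples in `to_merge` into one sample per channel."""
    for channel in spec["channels"]:
        kept, summed = [], None
        for sample in channel["samples"]:
            if sample["name"] in to_merge:
                if summed is None:
                    summed = {"name": merged_name, "data": list(sample["data"]), "modifiers": []}
                else:
                    summed["data"] = [a + b for a, b in zip(summed["data"], sample["data"])]
            else:
                kept.append(sample)
        if summed is not None:
            kept.append(summed)
        channel["samples"] = kept
    return spec

spec = {
    "channels": [
        {
            "name": "SR",
            "samples": [
                {"name": "signal", "data": [5.0, 3.0], "modifiers": []},
                {"name": "small_bkg_1", "data": [1.0, 2.0], "modifiers": []},
                {"name": "small_bkg_2", "data": [0.5, 0.5], "modifiers": []},
            ],
        }
    ]
}
merged = merge_samples_in_spec(spec, {"small_bkg_1", "small_bkg_2"}, "Other")
print([s["name"] for s in merged["channels"][0]["samples"]])  # ['signal', 'Other']
print(merged["channels"][0]["samples"][1]["data"])  # [1.5, 2.5]
```

Since the merged workspace is built before the fit, all downstream quantities (including per-sample uncertainties) would then be consistent by construction.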
Here is something I wrote recently to merge samples for plotting, but it does not get the per-sample uncertainties correct because to calculate those, the merge needs to happen before the …

```python
import numpy as np

# need to:
# - partially merge samples
# - change model to have different list of sample names

# WARNING: this does NOT update the uncertainty for the stack of the samples that get merged,
# evaluating that correctly is impossible at this stage and would require changes in the
# calculation for the ModelPrediction object; however, this is not needed when only plotting
# the uncertainty for the total model prediction, such as in a normal data/MC plot

# partially merge samples
new_model_yields = []
SAMPLES_TO_SUM = ["small_bkg_1", "small_bkg_2"]
sample_indices = [
    idx for idx, sample in enumerate(prediction.model.config.samples) if sample in SAMPLES_TO_SUM
]
WHERE_TO_INSERT = 1  # position of the merged sample in the stack, 0 is at the bottom
for i_ch, channel in enumerate(prediction.model.config.channels):
    # for each channel, sum together the desired samples
    summed_sample = np.sum(np.asarray(prediction.model_yields[i_ch])[sample_indices], axis=0)
    # remove the samples that were just summed, then insert the merged sample
    remaining_samples = np.delete(prediction.model_yields[i_ch], sample_indices, axis=0)
    model_yields = np.insert(remaining_samples, WHERE_TO_INSERT, summed_sample, axis=0)
    new_model_yields.append(model_yields)

# change model to have different list of sample names:
# we cannot easily change that list in the pyhf object, so we build a mock object
# that provides what we need
class MockConfig:
    def __init__(self, model, merged_sample):
        self.channel_nbins = model.config.channel_nbins
        self.channels = model.config.channels
        self.channel_slices = model.config.channel_slices
        # update list of samples: drop the merged ones, insert the new combined sample
        samples = np.delete(model.config.samples, sample_indices)
        samples = np.insert(samples, WHERE_TO_INSERT, merged_sample, axis=0)
        self.samples = samples.tolist()


class MockModel:
    def __init__(self, model, merged_sample):
        self.config = MockConfig(model, merged_sample)


MERGED_SAMPLE_NAME = "Other"
new_model = MockModel(prediction.model, MERGED_SAMPLE_NAME)

# build new model prediction object that behaves as intended
new_prediction = cabinetry.model_utils.ModelPrediction(
    new_model,
    new_model_yields,  # with samples merged
    prediction.total_stdev_model_bins,
    prediction.total_stdev_model_channels,
    prediction.label,
)
```

The sample summing could happen in …
Discussed in #501
Originally posted by meschrei February 12, 2025
See #501 for some more discussion. This would presumably require changing ModelPrediction.model and replacing it with just the relevant metadata pieces we need and can edit more easily (e.g. changing the list of samples).