# qpym2.io.hdf
Routines needed for creating fit input (spectral histograms and data) and other HDF5 files.
## Input HDF5 files
Used to make sure all the input data needed for an analysis is self-contained.
- input[_postfix].h5: data and MC event dataframes as groups.
  - bkg_model: bkg_model params
  - ndbd: a row similar to the bkg_model table
  - mchists: a group with np arrays representing the histograms as datasets. Name refers to index in table
  - data: data to fit to
  - mkchain: bkg_model markov chain (not necessary)
- fit[_postfix].h5: fit output and component histograms. (Everything needed to replicate a fit?)
- fit[_postfix].nc: arviz inference data or now are {u, v, Channel, Dataset}
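As a rough illustration of the input layout listed above, the sketch below builds a toy file with h5py and lists what it contains. It is not the qpym2 writer: the file name, the array values, and the helper names `make_example_input` and `list_layout` are made up for the example, and the dataframe-based entries (bkg_model, ndbd, mkchain) are left out because their on-disk form depends on how pandas writes them.

```python
import os
import tempfile

import h5py
import numpy as np


def make_example_input(path):
    """Create a toy file with an 'mchists' group and a 'data' dataset."""
    with h5py.File(path, "w") as f:
        # 'mchists' group: one dataset per histogram, named by table index.
        group = f.create_group("mchists")
        group.create_dataset("0", data=np.array([1.0, 2.0, 3.0]))
        group.create_dataset("1", data=np.array([0.5, 0.5, 0.0]))
        # 'data': the array to fit to (values are arbitrary here).
        f.create_dataset("data", data=np.array([2, 3, 3]))


def list_layout(path):
    """Return the sorted names of every group and dataset in the file."""
    names = []
    with h5py.File(path, "r") as f:
        f.visit(names.append)  # append returns None, so the walk continues
    return sorted(names)


with tempfile.TemporaryDirectory() as tmp:
    example_path = os.path.join(tmp, "input_example.h5")
    make_example_input(example_path)
    layout = list_layout(example_path)

print(layout)
```

Listing the object names this way is a quick check that a file carries the pieces a fit expects before any reader is called on it.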
## BM Markov chain HDF5 files
## Module Contents
### Functions
| Function | Description |
| --- | --- |
| _write_df | Write a dataframe to a file. If overwrite is set, overwrite data. |
| _write_hists | Write the list of mchists to a file. The hists are stored as is (numpy arrays) with names given. |
| _write_data | Write data (np.array) to h5 file. If overwrite is set, overwrite data. |
| write_input | Write the input (data, hists, bm table) needed for fitting to a file. |
| read_bkg_model | Read the background model table from input. |
| read_signal | Read the table row related to the signal component named name. |
| read_single_mchist | Read the list of mchists from a file. The hists are returned as a pd series indexed by the name in the dataset. |
| read_mchists | Read the list of mchists from a file. The hists are returned as a pd series indexed by the name in the dataset. |
### Data
| Name | Description |
| --- | --- |
| _MCHISTS_GNAME | Name of the group containing the mchists. |
| _SIGNAL_KEY_NAME | Name of the group containing the signal. TODO: Not implemented yet. |
| _DATA_NAME | Name of the group containing the data. |
| _BM_KEY_NAME | Name of the group containing the bkg_model. |
| _MKCHAIN_KEY_NAME | Name of the group containing the bkg_model markov chain. |
### API
- qpym2.io.hdf._BKG_MODEL_TABLE_COLS = ['mean', 'mode']
- qpym2.io.hdf._MCHISTS_GNAME = 'mchists'
Name of the group containing the mchists.
- qpym2.io.hdf._SIGNAL_KEY_NAME = 'signal'
Name of the group containing the signal. TODO: Not implemented yet.
- qpym2.io.hdf._DATA_NAME = 'data'
Name of the group containing the data.
- qpym2.io.hdf._BM_KEY_NAME = 'bkg_model'
Name of the group containing the bkg_model.
- qpym2.io.hdf._MKCHAIN_KEY_NAME = 'mkchain'
Name of the group containing the bkg_model markov chain.
- qpym2.io.hdf._write_df(df, outpath, key, overwrite=False, **kwargs)
Write a dataframe to a file. If overwrite is set, overwrite data.
- Args:
  - df (pd.DataFrame): dataframe to write.
  - outpath (str): path to the output file.
  - key (str): name of the group to write to.
  - overwrite (bool): overwrite the file if it exists. Default is False.
  - **kwargs: additional arguments to be passed to pandas DataFrame.to_hdf.
- TODO: implement mode='a' in kwargs if needed.
- qpym2.io.hdf._write_hists(hists, outpath, group_name, names=None, overwrite=False, append_hist=False)
Write the list of mchists to a file. The hists are stored as is (numpy arrays) under the given names.
- Args:
  - hists (list or pd.Series of np.array): list of histograms to write.
  - outpath (str): path to the output file.
  - group_name (str): name of the group to write to.
  - names (list of str): names of the histograms. If None, use the index of the list.
  - overwrite (bool): whether to overwrite all histograms. If False and group_name exists, throw an error.
  - append_hist (bool): whether to append the histograms to the existing group. If False and group_name exists, throw an error.
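The interplay of overwrite and append_hist described above can be sketched with h5py as follows. This is an illustrative stand-in, not the qpym2 implementation: the function name `write_hists_sketch`, the choice of `ValueError`, and the decision to delete the whole group on overwrite are assumptions made for the example.

```python
import os
import tempfile

import h5py
import numpy as np


def write_hists_sketch(hists, outpath, group_name, names=None,
                       overwrite=False, append_hist=False):
    """Hypothetical stand-in for _write_hists (assumed behaviour)."""
    if names is None:
        # Fall back to the position of each histogram in the list.
        names = [str(i) for i in range(len(hists))]
    with h5py.File(outpath, "a") as f:
        if group_name in f:
            if overwrite:
                del f[group_name]  # drop every existing histogram
            elif not append_hist:
                raise ValueError(f"group {group_name!r} already exists")
        group = f.require_group(group_name)
        for name, hist in zip(names, hists):
            group.create_dataset(name, data=np.asarray(hist))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "hists.h5")
    write_hists_sketch([np.ones(3)], path, "mchists", names=["a"])
    # Appending adds a new dataset next to the existing one.
    write_hists_sketch([np.zeros(3)], path, "mchists", names=["b"],
                       append_hist=True)
    # Neither flag set and the group exists: the call is rejected.
    try:
        write_hists_sketch([np.ones(3)], path, "mchists", names=["c"])
        raised = False
    except ValueError:
        raised = True
    with h5py.File(path, "r") as f:
        stored = sorted(f["mchists"].keys())

print(stored, raised)
```

In this sketch, appending a histogram under a name that already exists is left to fail inside h5py; how the real function handles that case is not stated in the documentation.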
- qpym2.io.hdf._write_data(data, outpath, dataname, overwrite=False)
Write data (np.array) to h5 file. If overwrite is set, overwrite data.
- Args:
  - data (np.array): data to write.
  - outpath (str): path to the output file.
  - dataname (str): name of the group to write to.
  - overwrite (bool): overwrite the file if it exists. Default is False.
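A minimal sketch of writing a single array with an overwrite guard is shown below. It assumes overwrite replaces only the named dataset rather than the whole file, which the description above leaves open; the name `write_data_sketch` and the use of `ValueError` are likewise assumptions, not the qpym2 code.

```python
import os
import tempfile

import h5py
import numpy as np


def write_data_sketch(data, outpath, dataname, overwrite=False):
    """Hypothetical stand-in for _write_data (dataset-level overwrite assumed)."""
    with h5py.File(outpath, "a") as f:
        if dataname in f:
            if not overwrite:
                raise ValueError(f"dataset {dataname!r} already exists")
            # HDF5 datasets have a fixed shape, so delete and recreate.
            del f[dataname]
        f.create_dataset(dataname, data=np.asarray(data))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.h5")
    write_data_sketch([1, 2, 3], path, "data")
    write_data_sketch([4, 5], path, "data", overwrite=True)
    with h5py.File(path, "r") as f:
        final = f["data"][()].tolist()

print(final)
```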
- qpym2.io.hdf.write_input(outpath, mctable, signal_df, data, signal_name=None, mkchain=None, overwrite=False, **kwargs)
Write the input (data, hists, bm table) needed for fitting to a file.
- Args:
  - outpath (str): path to the output file.
  - mctable (pd.DataFrame): bkg_model table.
  - signal_df (pd.DataFrame): a single-row table representing the signal.
  - data (np.array): data to fit to.
  - mkchain (pd.DataFrame): bkg_model markov chain. Default is None.
  - overwrite (bool): overwrite the file if it exists. Default is False.
  - **kwargs: additional arguments to be passed to pandas DataFrame.to_hdf.
- TODO: NOTE: separate functions to write the table, data, signal, etc. might be better if we are trying to improve performance using multiprocessing. By separating the write functions, we don't need to wait until all the reading is done.
- TODO: We probably want custom
- qpym2.io.hdf.read_mkchain(h5path)
- qpym2.io.hdf.read_bkg_model_fit(h5path)
- qpym2.io.hdf.read_bkg_model(h5path)
Read the background model table from input.
- Returns:
  - pd.DataFrame: columns = _BKG_MODEL_TABLE_COLS, 'mchist'
- qpym2.io.hdf.read_data(h5path, dataname='data')
- qpym2.io.hdf.read_signal(h5path, name)
Read the table row related to the signal component named name.
- qpym2.io.hdf.read_ndbd(h5path)
- qpym2.io.hdf.read_single_mchist(h5path, mcname, group_name=_MCHISTS_GNAME)
Read the mchist named mcname from the group group_name of a file.
- qpym2.io.hdf.read_mchists(h5path, group_name=_MCHISTS_GNAME)
Read the list of mchists from a file. The hists are returned as a pd.Series indexed by the name in the dataset.
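To make the return shape concrete, the sketch below reads every dataset in a group into a pandas Series indexed by dataset name. It is an assumed reading of the description, not the qpym2 code: `read_hists_sketch`, the toy file, and the sorted index order are choices made for the example.

```python
import os
import tempfile

import h5py
import numpy as np
import pandas as pd


def read_hists_sketch(h5path, group_name="mchists"):
    """Hypothetical stand-in for read_mchists: one array per Series entry."""
    with h5py.File(h5path, "r") as f:
        group = f[group_name]
        names = sorted(group.keys())
        # Object array so each element holds a whole histogram.
        values = np.empty(len(names), dtype=object)
        for i, name in enumerate(names):
            values[i] = group[name][()]  # load the dataset into memory
    return pd.Series(values, index=names)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "input_example.h5")
    # Build a small file to read back (values are arbitrary).
    with h5py.File(path, "w") as f:
        g = f.create_group("mchists")
        g.create_dataset("0", data=np.array([1.0, 2.0, 3.0]))
        g.create_dataset("1", data=np.array([0.5, 0.5, 0.0]))
    series = read_hists_sketch(path)

print(list(series.index))
```

A single histogram can then be picked out by name, for example `series["1"]`, which is presumably close to what read_single_mchist offers without loading the whole group.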