pricingengine.utilities package

Submodules

pricingengine.utilities.ddml_marginal_effects module

class pricingengine.utilities.ddml_marginal_effects.DDMLMarginalEffects(schema, treatment_name, model, competition_col, leads=None, filter_dic=None)

Bases: object

Class for computing and aggregating marginal effects from a dynamic_dml model

__init__(schema, treatment_name, model, competition_col, leads=None, filter_dic=None)

Create a MarginalEffects object

Parameters:
  • treatment_name – The treatment with respect to which the desired effects are computed
  • model – The dynamic dml model from which marginal effects are derived.
  • leads – list of leads to be included in the effect matrix
  • competition_col – the column (from the estimationDataset) along which units might have spillover effects
  • filter_dic – dictionary for filtering effect matrix (filtering before computing can be much faster)
WARNING:

This class currently only works for dynamic_dml models

competition_col
filter(filter_dic, impacted_leads=None, impacting_leads=None)

Return filtered marginal effects columns

Parameters:
  • filter_dic – dictionary mapping from column levels to acceptable values
  • impacted_leads – list of leads (columns) that will be preserved
  • impacting_leads – list of leads (rows) that will be preserved
get_own_effect(filter_dic=None, lead=None)

Get own treatment effect

Parameters:
  • filter_dic – dictionary mapping from column levels to acceptable values. Default is empty dictionary.
  • lead – lead used (for both row and column) to select marginal effects. Default is lead=1.
get_pull_forward_path(filter_dic, lead)

Get path of pull forward effects

Parameters:
  • filter_dic – dictionary mapping from column levels to acceptable values
  • lead – lead used to select marginal effects
mfx

Gets a dataframe of estimated marginal effects

mfx_ci_lower(p=0.025)

Get a dataframe of lower CI endpoints on marginal effects. p is the amount of probability left in the lower tail, so setting p=.025 gets the lower endpoint of a two-sided 95% CI.

mfx_ci_upper(p=0.025)

Get a dataframe of upper CI endpoints on marginal effects. p is the amount of probability left in the upper tail, so setting p=.025 gets the upper endpoint of a two-sided 95% CI.
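
The relationship between p and the interval endpoints can be sketched as follows. This is a minimal illustration, not the class's actual implementation: it assumes a normal approximation around each estimate, and the function name ci_endpoints and the example numbers are hypothetical.

```python
from statistics import NormalDist


def ci_endpoints(estimate, std_error, p=0.025):
    """Return (lower, upper), leaving probability p in each tail.

    Assumption: the estimate is approximately normal with the given
    standard error. The documented class may compute this differently.
    """
    # Critical value such that P(Z > z) = p for a standard normal Z.
    z = NormalDist().inv_cdf(1 - p)
    return estimate - z * std_error, estimate + z * std_error


# Hypothetical marginal effect and standard error.
lower, upper = ci_endpoints(estimate=-1.2, std_error=0.1, p=0.025)
```

With p=0.025 in each tail, the two endpoints together bound a two-sided 95% interval, matching the description of mfx_ci_lower and mfx_ci_upper.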

mfx_se

Gets a dataframe of standard errors on estimated marginal effects

total_leads

The leads of the MFX

pricingengine.utilities.predictions module

class pricingengine.utilities.predictions.Predictions(model, estimation_dataset, outcome_is_log=False, ret_pred=None)

Bases: object

Class used to generate and manipulate predictions from a pretrained model

ACTUAL_COL = 'actuals'
CI_LOWER_ENDPOINT = 'CI lower'
CI_UPPER_ENDPOINT = 'CI upper'
ERROR_GEQ = 'error_geq'
ERROR_STAT_LEVEL = 'error_view_stat'
ERROR_VIEW_LEVEL = 'error_view'
PERCENT_ERROR = 'percent_error'
RESIDUAL = 'error'
__init__(model, estimation_dataset, outcome_is_log=False, ret_pred=None)

Creates a new Predictions instance

Parameters:
  • model (DoubleMLLikeModel) – The model used to make predictions. This object must have a _predict method
  • estimation_dataset (EstimationDataSet) – The dataset on which predictions are made
  • outcome_is_log (bool) – Set this to True if the effect model was trained on a logged outcome; all outcomes and forecasts will then be transformed to the non-log scale
  • diffed_vars – List of variables that were first differenced before they were fit in first-stage
  • ret_pred – None or an empty DataFrame; if an empty DataFrame is passed, it will be populated with predictions
data

Return self._data

filter(filter_dic, first_date=None, last_date=None)

Return a subset of the Predictions object that corresponds to the filter dictionary passed in

Parameters:
  • filter_dic – dictionary mapping categorical columns to lists of acceptable values
  • first_date – omit any data before this date
  • last_date – omit any data from after this date
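
The filtering semantics described above can be sketched on plain Python records. This is an illustration under assumptions, not the library's code: the helper filter_rows, the column name "date", and the sample rows are hypothetical, and both date bounds are treated as inclusive.

```python
from datetime import date


def filter_rows(rows, filter_dic, first_date=None, last_date=None, date_col="date"):
    """Keep rows whose categorical values are all acceptable and whose
    date lies within [first_date, last_date] (bounds assumed inclusive)."""
    kept = []
    for row in rows:
        # Every column named in filter_dic must hold an acceptable value.
        if any(row[col] not in allowed for col, allowed in filter_dic.items()):
            continue
        if first_date is not None and row[date_col] < first_date:
            continue
        if last_date is not None and row[date_col] > last_date:
            continue
        kept.append(row)
    return kept


# Hypothetical prediction records.
rows = [
    {"store": "A", "date": date(2020, 1, 1), "prediction": 10.0},
    {"store": "A", "date": date(2020, 2, 1), "prediction": 11.0},
    {"store": "B", "date": date(2020, 2, 1), "prediction": 12.0},
]
subset = filter_rows(rows, {"store": ["A"]}, first_date=date(2020, 1, 15))
```

Here only the second record survives: the first is before first_date and the third has a store value outside the acceptable list.
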
get_mean_residual(names=None)

Computes the mean estimated residual for each unique combination of names. Positive numbers indicate that the model is underforecasting. If self.pred_is_logs=True, then the mean is computed on the underlying logs

Parameters:
  • names – A list of names (that must be present in the index of self._data). The mean residual will be computed for every unique combination of levels of these names
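
A grouped mean residual can be sketched as below. The sketch assumes the residual is actual minus prediction, which is consistent with positive values indicating underforecasting; the 'actuals' key mirrors ACTUAL_COL, while the 'prediction' key, the helper name mean_residual, and the sample rows are hypothetical.

```python
from collections import defaultdict


def mean_residual(rows, names):
    """Average (actual - prediction) for each unique combination of `names`."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        key = tuple(row[name] for name in names)
        # Positive residual: the model predicted less than what occurred.
        sums[key] += row["actuals"] - row["prediction"]
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical records with one grouping column.
rows = [
    {"store": "A", "actuals": 10.0, "prediction": 8.0},
    {"store": "A", "actuals": 12.0, "prediction": 11.0},
    {"store": "B", "actuals": 5.0, "prediction": 7.0},
]
result = mean_residual(rows, ["store"])
```

In this example store A has a positive mean residual (underforecast) and store B a negative one (overforecast).
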
get_prediction_intervals(coverage=0.9)

Returns prediction intervals for each observation with the corresponding level of coverage. Prediction intervals are based on the avg_error columns and a Gaussian approximation.

Parameters:
  • coverage – a float between 0 and 1 indicating the desired level of coverage for each forecast interval
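
How a coverage level maps to a Gaussian interval can be sketched as follows. How the avg_error columns are turned into an error scale is not documented here, so the sketch assumes a per-observation standard deviation is already available; error_sd and gaussian_interval are hypothetical names.

```python
from statistics import NormalDist


def gaussian_interval(prediction, error_sd, coverage=0.9):
    """Symmetric interval around `prediction` with the requested coverage,
    assuming normally distributed forecast errors with scale `error_sd`."""
    if not 0 < coverage < 1:
        raise ValueError("coverage must be strictly between 0 and 1")
    # Leave (1 - coverage) / 2 probability in each tail.
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return prediction - z * error_sd, prediction + z * error_sd


# Hypothetical forecast of 100 units with an error scale of 10.
lower, upper = gaussian_interval(prediction=100.0, error_sd=10.0, coverage=0.9)
```

Raising coverage widens the interval, since a larger critical value is needed to leave less probability in the tails.
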
get_projections_from_date(projection_date, coverage)

Organizes the Predictions object to return all projections made from a given date

Parameters:
  • projection_date – The date from which the requested predictions are made
  • coverage – a float between 0 and 1 indicating the desired level of coverage for each forecast interval
get_residual_sign_test_by_cols(names=None)

Computes a sign test on the distribution of residuals for each unique combination of names. Returns a dataframe of residuals, counts, the number of residuals greater than 0, and a sign test of the null hypothesis that the median of the residual distribution is zero.

Parameters:
  • names – A list of names (that must be present in the index of self._data). The sign test will be computed for every unique combination of levels of these names
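
For a single group, a sign test of this kind can be sketched as an exact two-sided binomial test on the count of positive residuals. This is one standard formulation and an assumption about the method; in particular, dropping residuals that are exactly zero is a common convention that the library may or may not follow, and sign_test is a hypothetical helper name.

```python
from math import comb


def sign_test(residuals):
    """Return (n, n_positive, p_value) for H0: the median residual is zero.

    Under H0 each nonzero residual is positive with probability 1/2, so
    the number of positives follows a Binomial(n, 1/2) distribution.
    """
    nonzero = [r for r in residuals if r != 0]  # ties dropped (assumption)
    n = len(nonzero)
    n_pos = sum(1 for r in nonzero if r > 0)
    # Two-sided p-value: double the smaller tail, capped at 1.
    k = min(n_pos, n - n_pos)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return n, n_pos, min(1.0, 2 * tail)


# Hypothetical residuals: seven positive, one negative.
n, n_pos, p_value = sign_test([1, 2, 3, 4, 5, 6, 7, -1])
```

A small p-value suggests the model is systematically under- or overforecasting for that group.
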
get_smape(names=None, first_stage=False)

Computes sMAPE (symmetric mean absolute percentage error) for each unique combination of names. Returns a dataframe of sMAPEs for each lookahead and each unique combination of the indices found in names.

Parameters:
  • names – A list of names (that must be present in the index of self._data). Average sMAPEs will be computed for every unique combination of levels of these names
  • first_stage – Get sMAPEs for first stage regressions as opposed to the final outcome (default is False)
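
Several sMAPE variants are in use, and the source does not say which one this method applies. The sketch below assumes the common form mean(2 * |prediction - actual| / (|actual| + |prediction|)); the helper name smape and the sample values are hypothetical.

```python
def smape(actuals, predictions):
    """Symmetric mean absolute percentage error, as a fraction in [0, 2].

    Assumed variant: mean of 2 * |F - A| / (|A| + |F|). Pairs where both
    values are zero contribute zero error instead of dividing by zero.
    """
    if len(actuals) != len(predictions):
        raise ValueError("actuals and predictions must have the same length")
    if not actuals:
        raise ValueError("at least one observation is required")
    total = 0.0
    for a, f in zip(actuals, predictions):
        denom = abs(a) + abs(f)
        if denom > 0:
            total += 2 * abs(f - a) / denom
    return total / len(actuals)


# Hypothetical actuals and forecasts for one group.
value = smape([100.0, 200.0], [110.0, 180.0])
```

Because the denominator uses both the actual and the forecast, the measure is bounded and treats over- and underforecasts of the same absolute size equally.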

Module contents