pricingengine.utilities package


pricingengine.utilities.ddml_marginal_effects module

class pricingengine.utilities.ddml_marginal_effects.DDMLMarginalEffects(schema, treatment_name, model, competition_col, leads=None, filter_dic=None)

Bases: object

Class for aggregating and computing marginal effects from a dynamic_DML model

__init__(schema, treatment_name, model, competition_col, leads=None, filter_dic=None)

Create a DDMLMarginalEffects object

  • treatment_name – The treatment with respect to which the desired effects are computed
  • model – The dynamic dml model from which marginal effects are derived.
  • leads – list of leads to be included in the effect matrix
  • competition_col – the column (from the estimationDataset) along which units might have spillover effects
  • filter_dic – dictionary for filtering effect matrix (filtering before computing can be much faster)

This class currently only works for dynamic_dml models

filter(filter_dic, impacted_leads=None, impacting_leads=None)

Return filtered marginal effects columns

  • filter_dic – dictionary mapping from column levels to acceptable values
  • impacted_leads – list of leads (columns) that will be preserved
  • impacting_leads – list of leads (rows) that will be preserved

get_own_effect(filter_dic=None, lead=None)

Get own treatment effect

  • filter_dic – dictionary mapping from column levels to acceptable values. If None (the default), an empty dictionary is used.
  • lead – lead used (for both row and column) to select marginal effects. If None (the default), lead=1 is used.

get_pull_forward_path(filter_dic, lead)

Get path of pull forward effects

  • filter_dic – dictionary mapping from column levels to acceptable values
  • lead – lead used to select marginal effects

Gets a dataframe of estimated marginal effects


Get a dataframe of lower CI end points on the marginal effects. p is the amount of probability left in the lower tail, so setting p=.025 gets the lower end point of a two-sided 95% CI.


Get a dataframe of upper CI end points on the marginal effects. p is the amount of probability left in the upper tail, so setting p=.025 gets the upper end point of a two-sided 95% CI.
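
The relationship between the tail probability p and the CI end points can be sketched as follows. This is an illustration only: it assumes a normal sampling distribution, which the text above does not confirm, and the function name ci_end_points and its arguments estimate and std_err are hypothetical, not part of pricingengine.

```python
from statistics import NormalDist


def ci_end_points(estimate, std_err, p=0.025):
    """Return (lower, upper) end points leaving probability p in each tail.

    Assumption: the estimator is approximately normal. This is a sketch,
    not the package's implementation.
    """
    # Quantile that leaves probability p in the upper tail.
    z = NormalDist().inv_cdf(1.0 - p)
    return estimate - z * std_err, estimate + z * std_err


# p=0.025 in each tail corresponds to a two-sided 95% interval.
lower, upper = ci_end_points(estimate=-1.2, std_err=0.3, p=0.025)
```

With p=0.025 in each tail, the two end points together bound a two-sided 95% interval centered on the estimate.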


Gets a dataframe of standard errors on estimated marginal effects


The leads of the marginal effects (MFX)

pricingengine.utilities.predictions module

class pricingengine.utilities.predictions.Predictions(model, estimation_dataset, outcome_is_log=False, ret_pred=None)

Bases: object

Class used to generate and manipulate predictions from a pretrained model

ACTUAL_COL = 'actuals'
ERROR_GEQ = 'error_geq'
ERROR_STAT_LEVEL = 'error_view_stat'
ERROR_VIEW_LEVEL = 'error_view'
PERCENT_ERROR = 'percent_error'
RESIDUAL = 'error'
__init__(model, estimation_dataset, outcome_is_log=False, ret_pred=None)

Creates a new Predictions instance

  • model (DoubleMLLikeModel) – The model used to make predictions. This object must have a _predict method
  • estimation_dataset (EstimationDataSet) – The dataset on which predictions are made
  • outcome_is_log (bool) – Set this to True if the effect model was trained on a logged outcome; all outcomes and forecasts will then be transformed to the non-log scale
  • diffed_vars – List of variables that were first differenced before they were fit in first-stage
  • ret_pred – None, or an empty DataFrame (an empty DataFrame will be populated with predictions)

Return self._date

filter(filter_dic, first_date=None, last_date=None)

Return a subset of the predictions object that corresponds to the filter dictionary passed in

  • filter_dic – dictionary mapping categorical columns to lists of acceptable values
  • first_date – omit any data before this date
  • last_date – omit any data from after this date
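
The filtering semantics described above can be sketched in plain Python. This is not the package's code: the function filter_rows, the row layout, and the column names store, product, and date are hypothetical, and the date bounds are assumed to be inclusive.

```python
from datetime import date


def filter_rows(rows, filter_dic, first_date=None, last_date=None):
    """Keep rows whose categorical values are acceptable and whose date is in range."""
    kept = []
    for row in rows:
        # Every filtered column must hold one of its acceptable values.
        if any(row[col] not in allowed for col, allowed in filter_dic.items()):
            continue
        # Omit data before first_date and after last_date (bounds assumed inclusive).
        if first_date is not None and row["date"] < first_date:
            continue
        if last_date is not None and row["date"] > last_date:
            continue
        kept.append(row)
    return kept


rows = [
    {"store": "A", "product": "x", "date": date(2020, 1, 1)},
    {"store": "A", "product": "y", "date": date(2020, 2, 1)},
    {"store": "B", "product": "x", "date": date(2020, 3, 1)},
]
subset = filter_rows(rows, {"store": ["A"]}, first_date=date(2020, 1, 15))
```

Here only the second row survives: the first is dropped by first_date and the third by the store filter.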

Computes the mean estimated residual for each unique combination of names. Positive numbers indicate that the model is underforecasting. If self.pred_is_logs=True, then the mean is computed on the underlying logs

Parameters: names – A list of names (that must be present in the index of self._data). Mean residuals will be computed for every unique combination of levels of these names
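
A minimal sketch of a grouped mean residual, assuming the residual is defined as actual minus prediction (consistent with positive values indicating underforecasting). The function mean_residual_by and the record fields are hypothetical and only illustrate the grouping by names.

```python
from collections import defaultdict


def mean_residual_by(records, names):
    """Mean of (actual - prediction) for each unique combination of names."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for rec in records:
        key = tuple(rec[n] for n in names)
        sums[key][0] += rec["actual"] - rec["prediction"]
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


records = [
    {"store": "A", "actual": 10.0, "prediction": 8.0},
    {"store": "A", "actual": 12.0, "prediction": 11.0},
    {"store": "B", "actual": 5.0, "prediction": 7.0},
]
result = mean_residual_by(records, ["store"])
```

Under this convention store A has a positive mean residual (the model underforecasts) and store B a negative one (the model overforecasts).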

Returns prediction intervals for each observation with the corresponding level of coverage. Prediction intervals are based on the avg_error columns and a Gaussian approximation.

Parameters: coverage – a float between 0 and 1 indicating the desired level of coverage for each forecast interval
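
How a coverage level translates into a Gaussian interval can be sketched as below. The function prediction_interval and the argument error_sd are hypothetical; in particular, treating the error measure as a standard deviation is an assumption, since the text does not say how the avg_error columns are scaled.

```python
from statistics import NormalDist


def prediction_interval(prediction, error_sd, coverage):
    """Symmetric Gaussian interval around a point prediction.

    Assumption: error_sd is a standard-deviation-like measure of forecast error.
    """
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be strictly between 0 and 1")
    # A central interval with the given coverage leaves (1 - coverage) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    return prediction - z * error_sd, prediction + z * error_sd


low, high = prediction_interval(prediction=100.0, error_sd=10.0, coverage=0.95)
```

A higher coverage value widens the interval, because a larger normal quantile multiplies the same error scale.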

get_projections_from_date(projection_date, coverage)

Organizes the predictions object to return all projections made from a given date

  • projection_date – The date from which the requested predictions are made
  • coverage – a float between 0 and 1 indicating the desired level of coverage for each forecast interval

Computes a sign test on the distribution of residuals for each unique combination of names. Returns a dataframe containing the residuals, their counts, the number of residuals greater than 0, and a sign test of the null hypothesis that the median of the residual distribution is zero.

Parameters: names – A list of names (that must be present in the index of self._data). The sign test will be computed for every unique combination of levels of these names
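
For reference, a two-sided exact sign test on a single group of residuals can be sketched as follows. This is an assumption about the form of the test: the package may use a different variant (for example a normal approximation), and the function sign_test is hypothetical. Zero residuals are dropped, as is conventional.

```python
from math import comb


def sign_test(residuals):
    """Return (n, n_positive, two-sided p-value) for H0: median residual is zero."""
    nonzero = [r for r in residuals if r != 0]
    n = len(nonzero)
    n_pos = sum(1 for r in nonzero if r > 0)
    # Under H0 the number of positive residuals is Binomial(n, 0.5).
    k = min(n_pos, n - n_pos)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return n, n_pos, min(1.0, 2.0 * tail)


n, n_pos, p_value = sign_test([0.5, 1.2, -0.3, 0.8, 2.0, 0.1, 0.9, 1.1])
```

A small p-value suggests the residuals are systematically of one sign, i.e. the model tends to over- or underforecast for that group.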

get_smape(names=None, first_stage=False)

Computes sMAPE (symmetric mean absolute percentage error) for each unique combination of names. Returns a dataframe of sMAPEs for each lookahead and each unique combination of the indices found in names.

  • names – A list of names (that must be present in the index of self._data). Average sMAPEs will be computed for every unique combination of levels of these names
  • first_stage – Get sMAPEs for first-stage regressions as opposed to the final outcome (default is False)
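
One common definition of sMAPE is sketched below. Several variants exist and the text does not say which one the package uses, so the exact denominator here is an assumption; the function smape is hypothetical and ignores the grouping by names and lookahead.

```python
def smape(actuals, forecasts):
    """Mean of |forecast - actual| / ((|actual| + |forecast|) / 2)."""
    if len(actuals) != len(forecasts) or not actuals:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, f in zip(actuals, forecasts):
        denom = (abs(a) + abs(f)) / 2.0
        # Treat the 0/0 case (actual and forecast both zero) as zero error.
        total += 0.0 if denom == 0 else abs(f - a) / denom
    return total / len(actuals)


value = smape([100.0, 200.0], [110.0, 180.0])
```

Under this definition the result lies between 0 and 2, and swapping actuals and forecasts leaves it unchanged.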

Module contents