pricingengine.utilities package¶
Submodules¶
pricingengine.utilities.ddml_marginal_effects module¶
-
class
pricingengine.utilities.ddml_marginal_effects.
DDMLMarginalEffects
(schema, treatment_name, model, competition_col, leads=None, filter_dic=None)¶ Bases:
object
Class for aggregating and computing marginal effects from a dynamic_DML model
-
__init__
(schema, treatment_name, model, competition_col, leads=None, filter_dic=None)¶ Create a MarginalEffects object
Parameters: - treatment_name – The treatment with respect to which the desired effects are computed
- model – The dynamic dml model from which marginal effects are derived.
- leads – list of leads to be included in the effect matrix
- competition_col – the column (from the estimationDataset) along which units might have spillover effects
- filter_dic – dictionary for filtering effect matrix (filtering before computing can be much faster)
WARNING: This class currently only works for dynamic_dml models
-
competition_col
¶
-
filter
(filter_dic, impacted_leads=None, impacting_leads=None)¶ Return filtered marginal effects columns
Parameters: - filter_dic – dictionary mapping from column levels to acceptable values
- impacted_leads – list of leads (columns) that will be preserved
- impacting_leads – list of leads (rows) that will be preserved
-
get_own_effect
(filter_dic=None, lead=None)¶ Get own treatment effect
Parameters: - filter_dic – dictionary mapping from column levels to acceptable values. Default is empty dictionary.
- lead – lead used (for both row and column) to select marginal effects. Default is lead=1.
-
get_pull_forward_path
(filter_dic, lead)¶ Get path of pull forward effects
Parameters: - filter_dic – dictionary mapping from column levels to acceptable values
- lead – lead used to select marginal effects
-
mfx
¶ Gets a dataframe of estimated marginal effects
-
mfx_ci_lower
(p=0.025)¶ Get a dataframe of lower CI endpoints on marginal effects. p is the amount of probability left in the lower tail, so setting p=0.025 gets the lower endpoint of a two-sided 95% CI.
-
mfx_ci_upper
(p=0.025)¶ Get a dataframe of upper CI endpoints on marginal effects. p is the amount of probability left in the upper tail, so setting p=0.025 gets the upper endpoint of a two-sided 95% CI.
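As a rough illustration of how the tail probability p maps to interval endpoints, the sketch below assumes a normal approximation around each estimate. That is an assumption: this page does not say how the package computes its intervals. The function name ci_endpoints and the example numbers are hypothetical and not part of pricingengine.

```python
from statistics import NormalDist


def ci_endpoints(estimate, std_error, p=0.025):
    """Return (lower, upper) endpoints leaving probability p in each tail.

    Assumes the estimator is approximately normal (an assumption made
    for this sketch only).
    """
    # Standard normal quantile that leaves probability p in the upper tail.
    z = NormalDist().inv_cdf(1.0 - p)
    return estimate - z * std_error, estimate + z * std_error


# Hypothetical marginal effect and standard error.
lower, upper = ci_endpoints(estimate=-1.2, std_error=0.3, p=0.025)
```

With p=0.025 in each tail, the two endpoints together bound a two-sided 95% interval.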
-
mfx_se
¶ Gets a dataframe of standard errors on estimated marginal effects
-
total_leads
¶ The leads of the MFX
-
pricingengine.utilities.predictions module¶
-
class
pricingengine.utilities.predictions.
Predictions
(model, estimation_dataset, outcome_is_log=False, ret_pred=None)¶ Bases:
object
Class used to generate and manipulate predictions from a pretrained model
-
ACTUAL_COL
= 'actuals'¶
-
CI_LOWER_ENDPOINT
= 'CI lower'¶
-
CI_UPPER_ENDPOINT
= 'CI upper'¶
-
ERROR_GEQ
= 'error_geq'¶
-
ERROR_STAT_LEVEL
= 'error_view_stat'¶
-
ERROR_VIEW_LEVEL
= 'error_view'¶
-
PERCENT_ERROR
= 'percent_error'¶
-
RESIDUAL
= 'error'¶
-
__init__
(model, estimation_dataset, outcome_is_log=False, ret_pred=None)¶ Creates a new Predictions instance
Parameters: - model (DoubleMLLikeModel) – The model used to make predictions. This object must have a _predict method
- estimation_dataset (EstimationDataSet) – The dataset on which predictions are made
- outcome_is_log (bool) – Set this to True if the effect model was trained on a logged outcome; all outcomes and forecasts will then be transformed to the non-log scale
- diffed_vars – List of variables that were first differenced before they were fit in first-stage
- ret_pred – None, or an empty DataFrame (an empty DataFrame will be populated with predictions)
-
data
¶ Return self._data
-
filter
(filter_dic, first_date=None, last_date=None)¶ Return a subset of the predictions object that corresponds to the filter dictionary passed in
Parameters: - filter_dic – dictionary mapping categorical columns to lists of acceptable values
- first_date – omit any data before this date
- last_date – omit any data from after this date
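The filtering semantics described above can be sketched on plain Python records. This is only an illustration of a dictionary filter combined with a date window, not the package's implementation; filter_rows, the "date" key and the sample rows are hypothetical, and treating both date bounds as inclusive is an assumption.

```python
from datetime import date


def filter_rows(rows, filter_dic, first_date=None, last_date=None, date_key="date"):
    """Keep rows whose categorical values are all acceptable and whose
    date lies inside [first_date, last_date] (bounds assumed inclusive)."""
    kept = []
    for row in rows:
        # Drop the row if any filtered column holds a value not in its list.
        if any(row[col] not in allowed for col, allowed in filter_dic.items()):
            continue
        if first_date is not None and row[date_key] < first_date:
            continue
        if last_date is not None and row[date_key] > last_date:
            continue
        kept.append(row)
    return kept


# Hypothetical prediction records.
rows = [
    {"store": "A", "date": date(2020, 1, 1), "prediction": 10.0},
    {"store": "B", "date": date(2020, 1, 8), "prediction": 12.0},
    {"store": "A", "date": date(2020, 1, 15), "prediction": 11.0},
]
subset = filter_rows(rows, {"store": ["A"]}, first_date=date(2020, 1, 10))
```

Here only the last record survives: it belongs to store "A" and is dated on or after the requested first date.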
-
get_mean_residual
(names=None)¶ Computes the mean estimated residual for each unique combination of names. Positive numbers indicate that the model is underforecasting. If self.pred_is_logs=True, then the mean is computed on the underlying logs
Parameters: names – A list of names (that must be present in the index of self._data). Mean residuals will be computed for every unique combination of levels of these names
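A minimal sketch of a grouped mean residual follows. It assumes the residual is defined as actual minus prediction, which is consistent with the statement that positive values indicate underforecasting but is not confirmed by this page. The function name and the record layout are hypothetical.

```python
from collections import defaultdict


def mean_residual_by_group(records, names):
    """Average (actual - prediction) for each unique combination of names."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        key = tuple(rec[name] for name in names)
        # Assumed sign convention: positive residual means the forecast was too low.
        sums[key] += rec["actual"] - rec["prediction"]
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical records.
records = [
    {"store": "A", "actual": 10.0, "prediction": 8.0},
    {"store": "A", "actual": 12.0, "prediction": 13.0},
    {"store": "B", "actual": 5.0, "prediction": 7.0},
]
means = mean_residual_by_group(records, ["store"])
```

In this toy data store "A" is underforecast on average and store "B" is overforecast.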
-
get_prediction_intervals
(coverage=0.9)¶ Returns prediction intervals for each observation with the corresponding level of coverage. Prediction intervals are based on the avg_error columns and a Gaussian approximation
Parameters: coverage – a float between 0 and 1 indicating the desired level of coverage for each forecast interval
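To make the Gaussian approximation concrete, the sketch below builds a symmetric interval around a point forecast. It assumes a single error scale per forecast that plays the role of a standard deviation; how the package derives that scale from its avg_error columns is not described here, so error_scale and the function name are hypothetical.

```python
from statistics import NormalDist


def gaussian_prediction_interval(prediction, error_scale, coverage=0.9):
    """Return (low, high) so that a normal distribution centred on the
    prediction with standard deviation error_scale puts `coverage`
    probability inside the interval."""
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be strictly between 0 and 1")
    # Split the remaining probability equally between the two tails.
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    return prediction - z * error_scale, prediction + z * error_scale


# Hypothetical forecast and error scale.
low, high = gaussian_prediction_interval(100.0, 10.0, coverage=0.9)
```

Raising coverage widens the interval, because a larger normal quantile multiplies the same error scale.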
-
get_projections_from_date
(projection_date, coverage)¶ Organizes the predictions object to return all projections made from a given date
Parameters: - projection_date – The date from which the requested predictions are made
- coverage – a float between 0 and 1 indicating the desired level of coverage for each forecast interval
-
get_residual_sign_test_by_cols
(names=None)¶ Computes a sign test on the distribution of residuals for each unique combination of names. Returns a dataframe of residuals, counts, the number of residuals greater than 0, and a sign test of the null hypothesis that the median of the residual distribution is zero.
Parameters: names – A list of names (that must be present in the index of self._data). The sign test will be computed for every unique combination of levels of these names
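For reference, a sign test of a zero median can be written with an exact binomial calculation. The sketch below is a generic two-sided version that drops zero residuals; whether the package uses an exact or an approximate test, or handles ties the same way, is not stated here. The function name and the sample residuals are hypothetical.

```python
from math import comb


def sign_test(residuals):
    """Two-sided exact sign test of H0: the median residual is zero.

    Returns (number of nonzero residuals, number greater than 0, p-value).
    """
    nonzero = [r for r in residuals if r != 0]  # ties carry no sign information
    n = len(nonzero)
    n_pos = sum(1 for r in nonzero if r > 0)
    # Under H0 the count of positive signs is Binomial(n, 0.5).
    k = min(n_pos, n - n_pos)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_value = min(1.0, 2.0 * tail)
    return n, n_pos, p_value


# Hypothetical residuals for one group.
n, n_pos, p_value = sign_test([0.5, 1.2, 0.3, -0.4, 0.9, 0.1, 0.7, 0.2])
```

A small p-value suggests the residuals are systematically of one sign, meaning the model tends to over- or underforecast for that group.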
-
get_smape
(names=None, first_stage=False)¶ Computes sMAPE (symmetric mean absolute percentage error) for each unique combination of names. Returns a dataframe of sMAPEs for each lookahead and each unique combination of the indices found in names.
Parameters: - names – A list of names (that must be present in the index of self._data). Average sMAPEs will be computed for every unique combination of levels of these names
- first_stage – Get sMAPEs for first stage regressions as opposed to the final outcome (default is False)
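Several sMAPE definitions are in use, and this page does not say which one the package applies. The sketch below uses one common form, the mean of |forecast - actual| / ((|actual| + |forecast|) / 2), grouped by the requested names. The function name, the record layout and the choice to score a 0/0 term as zero are all assumptions.

```python
from collections import defaultdict


def smape_by_group(records, names):
    """Mean of |f - a| / ((|a| + |f|) / 2) per unique combination of names."""
    terms = defaultdict(list)
    for rec in records:
        actual, forecast = rec["actual"], rec["prediction"]
        denom = (abs(actual) + abs(forecast)) / 2.0
        # Assumed convention: a perfect forecast of zero contributes no error.
        term = 0.0 if denom == 0 else abs(forecast - actual) / denom
        terms[tuple(rec[name] for name in names)].append(term)
    return {key: sum(vals) / len(vals) for key, vals in terms.items()}


# Hypothetical records.
records = [
    {"store": "A", "actual": 100.0, "prediction": 110.0},
    {"store": "A", "actual": 50.0, "prediction": 50.0},
    {"store": "B", "actual": 20.0, "prediction": 10.0},
]
smapes = smape_by_group(records, ["store"])
```

Under this definition each term lies between 0 and 2, so the grouped averages are comparable across series with very different scales.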
-