Plots

PREDICT.Plots.AUPRCPlot(log)

Plot the AUPRC of the model over time.

Parameters:

log (dict) – Log of model metrics over time and when the model was updated.

PREDICT.Plots.AUROCPlot(log)

Plot the AUROC of the model over time.

Parameters:

log (dict) – Log of model metrics over time and when the model was updated.
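
To illustrate the quantity this plot tracks, the sketch below computes AUROC for a single time window as the probability that a randomly chosen positive case receives a higher predicted risk than a randomly chosen negative case, with ties counted as half. The function name `auroc` and the sample data are hypothetical and not part of PREDICT, whose own metric implementation may differ.

```python
def auroc(outcomes, probs):
    """Pairwise (Mann-Whitney) AUROC; ties count as half a concordant pair."""
    pos = [p for y, p in zip(outcomes, probs) if y == 1]
    neg = [p for y, p in zip(outcomes, probs) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Hypothetical outcomes and predicted risks for one time window.
outcomes = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
auroc_value = auroc(outcomes, probs)  # 3 of the 4 positive/negative pairs are ordered correctly
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration but slow for large windows.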

PREDICT.Plots.AccuracyPlot(log, recalthreshold=None)

Plot the accuracy of the model over time.

Parameters:
  • log (dict) – Log of model metrics over time and when the model was updated.

  • recalthreshold (float, int, optional) – Threshold to trigger recalibration.
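
As a rough illustration of what a recalibration threshold means, the sketch below flags the time points at which accuracy falls below a chosen value. It uses a plain list of (date, accuracy) pairs rather than PREDICT's log format, and it assumes that recalibration is triggered when accuracy drops strictly below the threshold; both the helper `below_threshold` and the data are hypothetical.

```python
from datetime import date


def below_threshold(series, recalthreshold):
    """Return the dates whose accuracy is strictly below the threshold."""
    return [d for d, acc in series if acc < recalthreshold]


# Hypothetical monthly accuracy values.
accuracy_series = [
    (date(2020, 1, 1), 0.82),
    (date(2020, 2, 1), 0.79),
    (date(2020, 3, 1), 0.68),
    (date(2020, 4, 1), 0.81),
]
flagged = below_threshold(accuracy_series, recalthreshold=0.7)
```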

PREDICT.Plots.BayesianCoefsPlot(log, model_name=None, max_predictors_per_plot=10, fileloc='./')

Plots the mean coefficients (with standard deviation as the error bar) of the Bayesian model over time.

Note: this is only suitable for the BayesianModel and .addLogHook(TrackBayesianCoefs(model)) must be used.

Parameters:
  • log (dict, pd.DataFrame) – Log or dataframe of model metrics over time and when the model was updated.

  • model_name (str, optional) – Name of model or domain used in filename e.g. ‘COVID_data_simulation’.

  • max_predictors_per_plot (int) – Max number of predictors per plot to avoid clutter.

  • fileloc (str) – Location to save the plots.

PREDICT.Plots.CITLPlot(log)

Plot the Calibration-in-the-Large of the model over time.

Parameters:

log (dict) – Log of model metrics over time and when the model was updated.
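
For readers unfamiliar with the metric, calibration-in-the-large is commonly defined as the intercept a of the model logit(P(y = 1)) = a + logit(p), where p is the predicted risk and the slope on logit(p) is fixed at 1. The sketch below fits that intercept with Newton's method for one time window. It assumes this standard definition; the helper names and data are hypothetical, and PREDICT's own calculation may differ.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def calibration_in_the_large(outcomes, probs, iterations=50):
    """Fit the intercept a in logit(P(y=1)) = a + logit(p) by Newton's method.

    Predicted risks must lie strictly between 0 and 1.
    """
    offsets = [logit(p) for p in probs]
    a = 0.0
    for _ in range(iterations):
        fitted = [sigmoid(a + o) for o in offsets]
        gradient = sum(y - f for y, f in zip(outcomes, fitted))
        curvature = sum(f * (1.0 - f) for f in fitted)
        step = gradient / curvature
        a += step
        if abs(step) < 1e-12:
            break
    return a


# Hypothetical window: every patient predicted at 20% risk, but 40% had the event.
outcomes = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
probs = [0.2] * 10
citl = calibration_in_the_large(outcomes, probs)
```

A value of 0 indicates that the mean predicted risk matches the observed event rate; a positive value, as here, indicates that the model underestimates risk on average.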

PREDICT.Plots.CalibrationSlopePlot(log)

Plot the calibration slope of the model over time.

Parameters:

log (dict) – Log of model metrics over time and when the model was updated.

PREDICT.Plots.CoxSnellPlot(log)

Plot the Cox-Snell R^2 of the model over time.

Parameters:

log (dict) – Log of model metrics over time and when the model was updated.
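
As a reminder of the quantity being plotted, the Cox-Snell R^2 compares the log-likelihood of the model's predictions with that of a null model that predicts the overall prevalence for everyone: R^2 = 1 - exp(2 (LL_null - LL_model) / n). The sketch below applies this formula to one time window. The function names and data are hypothetical, and PREDICT's own implementation may differ.

```python
import math


def log_likelihood(outcomes, probs):
    """Bernoulli log-likelihood; probabilities must lie strictly between 0 and 1."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(outcomes, probs)
    )


def cox_snell_r2(outcomes, probs):
    """Cox-Snell R^2 relative to a null model that predicts the prevalence.

    The window must contain both events and non-events.
    """
    n = len(outcomes)
    prevalence = sum(outcomes) / n
    ll_null = log_likelihood(outcomes, [prevalence] * n)
    ll_model = log_likelihood(outcomes, probs)
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)


# Hypothetical outcomes and predicted risks.
outcomes = [0, 0, 1, 1]
probs = [0.2, 0.3, 0.7, 0.8]
r2 = cox_snell_r2(outcomes, probs)
```

On new data the value can be negative when the model's predictions fit worse than the prevalence-only null model.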

PREDICT.Plots.ErrorSPCPlot(log, model)

Plots the error over time as a statistical process control chart with upper control limits indicating warning and danger zones when model performance drops.

Parameters:
  • log (dict) – Log of model metrics over time and when the model was updated.

  • model (PREDICTModel) – The model to evaluate, must have a predict method.

PREDICT.Plots.F1ScorePlot(log)

Plot the F1 Score of the model over time.

Parameters:

log (dict) – Log of model metrics over time and when the model was updated.
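
For reference, the F1 score is the harmonic mean of precision and sensitivity (recall). The sketch below computes all three for one time window from outcomes and predicted risks. The 0.5 classification threshold, the function name and the data are assumptions made for illustration and are not taken from PREDICT.

```python
def precision_recall_f1(outcomes, probs, threshold=0.5):
    """Return (precision, recall, F1) after classifying at the given threshold."""
    predicted = [1 if p >= threshold else 0 for p in probs]
    pairs = list(zip(outcomes, predicted))
    tp = sum(1 for y, yhat in pairs if y == 1 and yhat == 1)
    fp = sum(1 for y, yhat in pairs if y == 0 and yhat == 1)
    fn = sum(1 for y, yhat in pairs if y == 1 and yhat == 0)
    # Fall back to 0.0 where a ratio is undefined (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical window: 2 true positives, 1 false positive, 1 false negative.
outcomes = [1, 1, 1, 0, 0, 0]
probs = [0.9, 0.6, 0.3, 0.7, 0.2, 0.1]
precision, recall, f1 = precision_recall_f1(outcomes, probs)
```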

PREDICT.Plots.MonitorChangeSPC(input_data, trackCol, timeframe, windowSize, largerSD=3, smallerSD=2)

Generate a statistical process control chart to observe data changes over time. The plot shows the prevalence or mean of a dataframe column over time, with control limits at ± smallerSD and ± largerSD standard deviations from the mean (defaulting to 2 and 3 respectively). This function is useful for tracking data changes that might contribute to increasing model error.

Parameters:
  • input_data (pd.DataFrame) – The input data to monitor for changes.

  • trackCol (str) – Column of input data to monitor.

  • timeframe (str) – How often to plot the data points of the tracked variable. Can be ‘Day’, ‘Week’, ‘Month’ or ‘Year’.

  • windowSize (int) – How many timeframes to use as the rolling control limit window size, e.g. if timeframe is ‘Week’ and windowSize = 4 then the window covers 4 weeks.

  • largerSD (float) – Red lines: the outermost upper and lower control limits. Defaults to 3.

  • smallerSD (float) – Yellow lines: the inner control limits. Defaults to 2.

Raises:

ValueError – If timeframe variable is not ‘Day’, ‘Week’, ‘Month’, or ‘Year’.
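
The idea of rolling control limits can be sketched as follows: for each period, take the mean and standard deviation of the preceding windowSize periods and draw limits at the mean ± smallerSD and ± largerSD standard deviations. This is a minimal sketch under stated assumptions. The helper `rolling_control_limits` is hypothetical, it uses the sample standard deviation of the preceding window only, and PREDICT may compute its window and limits differently.

```python
from statistics import mean, stdev


def rolling_control_limits(values, window_size, larger_sd=3, smaller_sd=2):
    """Control limits for each point, based on the preceding window of values.

    window_size must be at least 2 so that a sample SD can be computed.
    """
    limits = []
    for i in range(window_size, len(values)):
        window = values[i - window_size:i]
        centre = mean(window)
        spread = stdev(window)  # sample standard deviation of the window
        limits.append({
            "index": i,
            "value": values[i],
            "centre": centre,
            "inner": (centre - smaller_sd * spread, centre + smaller_sd * spread),
            "outer": (centre - larger_sd * spread, centre + larger_sd * spread),
        })
    return limits


# Hypothetical weekly prevalence of a tracked column; the last week jumps sharply.
weekly_prevalence = [0.10, 0.12, 0.11, 0.09, 0.20]
limits = rolling_control_limits(weekly_prevalence, window_size=4)
out_of_control = [
    row["index"] for row in limits
    if not row["outer"][0] <= row["value"] <= row["outer"][1]
]
```

Here the fifth week lies above the outer (largerSD) limit derived from the first four weeks, so it is flagged.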

PREDICT.Plots.NormalisedSumOfDiffPlot(log)

Plot the normalised sum of differences (the model error) over time.

Parameters:

log (dict) – Log of model metrics over time and when the model was updated.

PREDICT.Plots.OEPlot(log)

Plot the observed-to-expected (O/E) ratio of the model over time.

Parameters:

log (dict) – Log of model metrics over time and when the model was updated.
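
For reference, the O/E ratio divides the number of observed events by the number expected under the model, i.e. the sum of the predicted risks. The sketch below computes it for one time window; the function name and data are hypothetical.

```python
def oe_ratio(outcomes, probs):
    """Observed events divided by expected events (sum of predicted risks)."""
    expected = sum(probs)
    if expected == 0:
        raise ValueError("expected number of events is zero")
    return sum(outcomes) / expected


# Hypothetical window: 2 observed events against 1.5 expected.
outcomes = [1, 0, 0, 1, 0]
probs = [0.5, 0.2, 0.1, 0.6, 0.1]
ratio = oe_ratio(outcomes, probs)
```

A ratio above 1 means more events occurred than the model predicted (underprediction); a ratio below 1 means fewer (overprediction).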

PREDICT.Plots.PrecisionPlot(log)

Plot the Precision of the model over time.

Parameters:

log (dict) – Log of model metrics over time and when the model was updated.

PREDICT.Plots.PredictorBasedPlot(log, x_axis_min=None, x_axis_max=None, predictor=None, outcome='outcome', show_legend=True)

Plots the probability of an outcome given a specific predictor.

Note: this is only suitable for the BayesianModel and .addLogHook(TrackBayesianCoefs(model)) must be used.

Parameters:
  • log (dict) – Log of model metrics over time and when the model was updated.

  • x_axis_min (float, optional) – Minimum value for the x axis representing the predictor. Defaults to None.

  • x_axis_max (float, optional) – Maximum value for the x axis representing the predictor. Defaults to None.

  • predictor (str, optional) – Name of the predictor to assess. Defaults to None.

  • outcome (str, optional) – Name of the outcome being predicted. Defaults to “outcome”.

  • show_legend (bool, optional) – Whether to show the legend. Defaults to True.

Raises:
  • ValueError – Raises error if x_axis_min is not provided.

  • ValueError – Raises error if x_axis_max is not provided.

  • ValueError – Raises error if predictor is not provided.
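
To make the idea of "probability of an outcome given a predictor" concrete, the sketch below evaluates a logistic curve over the range from x_axis_min to x_axis_max for a single coefficient. It assumes a logistic model with one intercept and one coefficient; the function `outcome_probability_curve`, its `intercept`, `coef` and `points` arguments, and the values used are all hypothetical and do not reflect how PREDICT reads coefficients from its log.

```python
import math


def outcome_probability_curve(intercept, coef, x_axis_min, x_axis_max, points=5):
    """Return (x, probability) pairs on an evenly spaced grid of predictor values."""
    if x_axis_min is None or x_axis_max is None:
        raise ValueError("x_axis_min and x_axis_max must be provided")
    step = (x_axis_max - x_axis_min) / (points - 1)
    xs = [x_axis_min + i * step for i in range(points)]
    return [(x, 1.0 / (1.0 + math.exp(-(intercept + coef * x)))) for x in xs]


# Hypothetical coefficient for an age-like predictor plotted between 20 and 100.
curve = outcome_probability_curve(intercept=-3.0, coef=0.05, x_axis_min=20, x_axis_max=100)
```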

PREDICT.Plots.SensitivityPlot(log)

Plot the Sensitivity of the model over time.

Parameters:

log (dict) – Log of model metrics over time and when the model was updated.

PREDICT.Plots.SpecificityPlot(log)

Plot the Specificity of the model over time.

Parameters:

log (dict) – Log of model metrics over time and when the model was updated.

PREDICT.Plots.plot_calibration_yearly(model, method_list=['Baseline', 'Regular Testing', 'Static Threshold', 'SPC', 'Bayesian'], gender='')

Plots the calibration slope each year as one plot with each line as a different PREDICT method.

Parameters:
  • model (str) – Name of the model used e.g. ‘qrisk’.

  • method_list (list) – List of the methods to plot the yearly calibration slope of.

  • gender (str) – If using a model separated by gender include a string e.g. ‘female’. Defaults to ‘’.

PREDICT.Plots.plot_count_of_patients_over_threshold_risk(threshold=0.1, model_type='qrisk2', gender='')

Plot the number of people per month whose predicted risk of the outcome exceeds the threshold.

Parameters:
  • threshold (float) – Risk threshold value. Defaults to 0.1.

  • model_type (str) – String of model name e.g. ‘qrisk’. Defaults to ‘qrisk2’.

  • gender (str) – If using the qrisk model pick between the male and female model e.g. “female”. Defaults to ‘’.
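
The count being plotted can be sketched as a simple per-month tally of patients whose predicted risk exceeds the threshold. The record layout (visit date, predicted risk) and the helper name are hypothetical; the function documented above takes no data argument, and how it obtains its data is not shown here.

```python
from collections import Counter
from datetime import date


def count_over_threshold_per_month(records, threshold=0.1):
    """Count records per (year, month) whose risk is strictly above the threshold."""
    counts = Counter()
    for visit_date, risk in records:
        if risk > threshold:
            counts[(visit_date.year, visit_date.month)] += 1
    return dict(sorted(counts.items()))


# Hypothetical (visit date, predicted risk) records.
records = [
    (date(2021, 1, 5), 0.05),
    (date(2021, 1, 20), 0.15),
    (date(2021, 2, 3), 0.30),
    (date(2021, 2, 14), 0.12),
    (date(2021, 2, 28), 0.10),  # exactly at the threshold, so not counted
]
monthly_counts = count_over_threshold_per_month(records, threshold=0.1)
```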

PREDICT.Plots.plot_method_comparison_metrics(metrics_df, recalthreshold, model_updates, model_type, gender='')

Plot the metric comparison graphs with each line showing a different PREDICT method.

Parameters:
  • metrics_df (str) – Name of the CSV file where performance metrics for each method are saved.

  • recalthreshold (float) – AUROC threshold used by the static threshold method.

  • model_updates (str) – Name of the CSV file where the dates of model updates, with method names, are stored.

  • model_type (str) – Name of the model used e.g. QRISK2.

  • gender (str) – If using the QRISK model, whether to use the male or female model. Defaults to ‘’.

PREDICT.Plots.plot_patients_per_month(df, model_type: str, gender: str = '')

Plots the number of people per month.

Parameters:
  • df (pd.DataFrame) – DataFrame of patient data.

  • model_type (str) – String of model name e.g. ‘qrisk’.

  • gender (str) – If using the qrisk model pick between the male and female model e.g. “female”. Defaults to ‘’.

PREDICT.Plots.plot_predictor_distributions(df, predictors, plot_type, model_name)

Plots the distributions of the predictors as a violin plot, a stacked bar chart or a percentage stacked bar chart. One bar is plotted for each month.

Parameters:
  • df (pd.DataFrame) – Dataframe where predictors are columns and rows are individual visits.

  • predictors (list) – List of the predictors to plot.

  • plot_type (str) – What type of plot to draw, either ‘violin’, ‘stacked_bar’ or ‘stacked_perc’.

  • model_name (str) – Name of the model e.g. ‘qrisk2_female’, used to name the saved plots.
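
To show what a percentage stacked bar chart summarises, the sketch below computes, for each month, the percentage of visits falling into each category of one predictor. The (month, category) row layout and the helper `monthly_percentages` are hypothetical; this only illustrates the aggregation and is not PREDICT's plotting code.

```python
from collections import Counter, defaultdict


def monthly_percentages(rows):
    """rows: iterable of (month_label, category). Returns {month: {category: percent}}."""
    counts = defaultdict(Counter)
    for month, category in rows:
        counts[month][category] += 1
    result = {}
    for month, counter in counts.items():
        total = sum(counter.values())
        result[month] = {cat: 100.0 * n / total for cat, n in counter.items()}
    return result


# Hypothetical visits: one categorical predictor recorded per visit.
rows = [
    ("2020-01", "smoker"),
    ("2020-01", "non-smoker"),
    ("2020-01", "non-smoker"),
    ("2020-01", "non-smoker"),
    ("2020-02", "smoker"),
    ("2020-02", "non-smoker"),
]
percentages = monthly_percentages(rows)
```

Each month's percentages sum to 100, which is what makes every bar in a percentage stacked chart the same height.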