Open-SIR API

SIR Model

Most epidemic models share a common approach to modelling the spread of a disease. The susceptible-infectious-removed (SIR) model is a simple deterministic compartmental model for predicting disease spread. A target population is divided into three groups: the susceptible (\(S\)), the infected (\(I\)) and the recovered or removed (\(R\)). These quantities enter the model as fractions of the total population \(P\):

\[S = \frac{\text{Number of susceptible individuals}}{\text{Population size}},\]
\[I = \frac{\text{Number of infected individuals}}{\text{Population size}},\]
\[R = \frac{\text{Number of recovered or removed individuals}}{\text{Population size}},\]

Because a pandemic infects and kills much more quickly than natural human birth and death rates, the population size is assumed constant, with individuals who recover or die counted in the removed compartment. Hence, \(S+I+R=P/P=1\). The dynamics of the pandemic are modelled as a system of ordinary differential equations that governs the rates at which the fractions of susceptible, infected and recovered/removed individuals in the population evolve.

The number of possible transmissions is proportional to the number of interactions between the susceptible and infected compartments, \(S \times I\):

\[\frac{dS}{dt} = -\alpha SI,\]

where \(\alpha\), with units of \([\text{time}]^{-1}\), is the transmission rate of the process. It quantifies how many of the interactions between the susceptible and infected populations lead to new infections per day.

The population of infected individuals will increase with new infections and decrease with recovered or removed people.

\[\frac{dI}{dt} = \alpha S I - \beta I,\]
\[\frac{dR}{dt} = \beta I,\]

where \(\beta\) is the fraction of the infected population that is removed from the transmission process per day.

The infectious period, \(T_I\), with units of \([\text{time}]\), is defined as the reciprocal of the removal rate:

\[T_I=\frac{1}{\beta}.\]

In the early stages of the infection, the number of infected people is much lower than the susceptible population. Hence \(S \approx 1\), which makes \(dI/dt = (\alpha - \beta) I\) linear, and the system has the analytical solution \(I(t) = I_0 \exp[(\alpha - \beta)t]\).
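The equations above can be integrated numerically with a few lines of code. The following is a minimal sketch using a hand-written classical Runge-Kutta step in pure Python; it is not the solver used by Open-SIR, and the function names (`sir_rhs`, `rk4_step`, `integrate_sir`) and the parameter values are assumptions chosen for illustration. It checks that \(S+I+R\) stays equal to 1 and compares the result with the early-stage exponential approximation.

```python
import math


def sir_rhs(state, alpha, beta):
    """Right-hand side of the SIR equations for the fractions (S, I, R)."""
    s, i, _ = state
    new_infections = alpha * s * i
    removals = beta * i
    return (-new_infections, new_infections - removals, removals)


def rk4_step(state, dt, alpha, beta):
    """Advance the state by one classical Runge-Kutta step of size dt."""

    def shifted(base, slope, h):
        return tuple(b + h * k for b, k in zip(base, slope))

    k1 = sir_rhs(state, alpha, beta)
    k2 = sir_rhs(shifted(state, k1, dt / 2), alpha, beta)
    k3 = sir_rhs(shifted(state, k2, dt / 2), alpha, beta)
    k4 = sir_rhs(shifted(state, k3, dt), alpha, beta)
    return tuple(
        y + dt / 6 * (a + 2 * b + 2 * c + d)
        for y, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate_sir(s0, i0, r0, alpha, beta, t_end, dt=0.01):
    """Integrate from t = 0 to t_end (days) and return the final (S, I, R)."""
    state = (s0, i0, r0)
    for _ in range(round(t_end / dt)):
        state = rk4_step(state, dt, alpha, beta)
    return state


# Illustrative values only (assumed, not taken from any data set).
alpha, beta = 0.95, 0.38  # 1/day
i0 = 1e-6                 # initial infected fraction
t_end = 5.0               # days

s, i, r = integrate_sir(1 - i0, i0, 0.0, alpha, beta, t_end)
i_early = i0 * math.exp((alpha - beta) * t_end)  # early-stage approximation

print(f"S + I + R = {s + i + r:.12f}")
print(f"I (numerical)   = {i:.6e}")
print(f"I (exponential) = {i_early:.6e}")
```

While \(I\) remains small the two values agree closely; the numerical value is slightly lower because \(S\) is already marginally below 1.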

class opensir.models.SIR

SIR model definition

exception InconsistentDimensionsError

Raised when the length of the days array is not equal to the dimension of the observed cases, or when the length of fit_index differs from the length of the parameter array self.p

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

exception InitializationError

Raised when a function is executed in violation of the logical sequence of the Open-SIR pipeline

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

exception InvalidNumberOfParametersError

Raised when the number of initial parameters is not correct

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

exception InvalidParameterError

Raised when the value of an initial parameter is not correct

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

block_cv(lags=1, min_sample=3)

Calculates mean squared error of the predictions as a measure of model predictive performance using block cross validation.

The cross-validation mean squared error can be used to estimate a confidence interval of model predictions.

The model needs to be initialized and fitted prior to calling block_cv.

Parameters
  • lags (int) – Defines the number of days that will be forecasted to calculate the mean squared error. For example, for the prediction Xp(t) and the real value X(t), the mean squared error will be calculated as mse = 1/n_boots |Xp(t+lags)-X(t+lags)|. This provides an estimate of the mean deviation of the predictions after “lags” days.

  • min_sample (int) – Number of days that will be used in the train set to make the first prediction.

Returns

tuple containing:
  • mse_avg (float):

    Simple average of the mean squared error between the model prediction for “lags” days and the real observed value.

  • mse_list (numpy.array):

    List of the mean squared errors using (i) points to predict the X(i+lags) value, with i an iterator that goes from n_samples+1 to the end of t_obs index.

  • p_list (numpy.array):

    List of the parameters sampled on the bootstrapping as a function of time. A common use of this list is to plot the mean squared error against time, to identify time periods where the model produces the best and worst fit to the data.

Return type

tuple
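The idea behind block cross validation can be sketched independently of the model: train on the first i observations, forecast “lags” days ahead, compare with the observed value, then grow the training block by one day and repeat. The code below is a minimal illustration under stated assumptions, not the Open-SIR implementation: `block_cv_sketch` and `geometric_forecast` are hypothetical names, the forecaster is a simple geometric extrapolation standing in for a fitted model, and squared deviations are used for the error.

```python
def geometric_forecast(train, lags):
    """Hypothetical stand-in for a fitted model: extrapolate the last growth ratio."""
    ratio = train[-1] / train[-2]
    return train[-1] * ratio ** lags


def block_cv_sketch(series, forecast, lags=1, min_sample=3):
    """Rolling-origin evaluation: returns (average squared error, list of errors)."""
    errors = []
    # The training block grows by one observation per iteration.
    for n_train in range(min_sample, len(series) - lags + 1):
        train = series[:n_train]
        predicted = forecast(train, lags)
        actual = series[n_train - 1 + lags]  # value "lags" days after the block
        errors.append((predicted - actual) ** 2)
    if not errors:
        raise ValueError("series is too short for the given lags and min_sample")
    return sum(errors) / len(errors), errors


# A perfectly geometric series is forecast without error.
mse_geo, errors_geo = block_cv_sketch([1, 2, 4, 8, 16, 32], geometric_forecast)

# A linear series is over-predicted by the geometric forecaster.
mse_lin, errors_lin = block_cv_sketch([1, 2, 3, 4, 5], geometric_forecast)

print(mse_geo, errors_geo)
print(mse_lin, errors_lin)
```

Plotting the per-block errors against the size of the training block shows where a model forecasts well and where it does not.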

ci_bootstrap(alpha=0.95, n_iter=1000, r0_ci=True)

Calculates the confidence interval of the parameters using the random sample bootstrap method.

The model needs to be initialized and fitted prior to calling ci_bootstrap.

Parameters
  • alpha (float) – Percentile of the confidence interval required.

  • n_iter (int) – Number of random samples that will be taken to fit the model and perform the bootstrapping. Use n_iter >= 1000

  • r0_ci (boolean) – Set to True to also return the reproduction rate confidence interval.

Note

This traditional random sampling bootstrap is not a good way to bootstrap time-series data, because X(t+1) is correlated with X(t). In any case, it provides a reference case and can be a useful method for other types of models. When using this function, always compare the prediction error with the interval provided by the function ci_block_cv.

Returns

tuple containing:
  • ci (numpy.array):

    list of lists that contain the lower and upper confidence intervals of each parameter.

  • p_bt (numpy.array):

    list of the parameters sampled on the bootstrapping. The most common use of this list is to plot histograms to visualize and try to infer the probability density function of the parameters.

Return type

tuple
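For readers unfamiliar with the random sample bootstrap, the following is a minimal sketch of a percentile confidence interval in pure Python. It is an illustration under stated assumptions and not the Open-SIR code: the estimator here is the sample mean rather than a model fit, and `percentile_ci`, `bootstrap_mean_ci` and the data are hypothetical.

```python
import random


def percentile_ci(samples, alpha=0.95):
    """Lower and upper percentile bounds that enclose a fraction alpha of samples."""
    ordered = sorted(samples)
    tail = (1 - alpha) / 2
    lower = ordered[int(tail * (len(ordered) - 1))]
    upper = ordered[int((1 - tail) * (len(ordered) - 1))]
    return lower, upper


def bootstrap_mean_ci(data, alpha=0.95, n_iter=1000, seed=0):
    """Resample data with replacement n_iter times and re-estimate the mean."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_iter):
        resample = [rng.choice(data) for _ in data]
        estimates.append(sum(resample) / len(resample))
    return percentile_ci(estimates, alpha), estimates


# Hypothetical observations, used only to exercise the functions.
data = list(range(1, 21))
(lo, hi), estimates = bootstrap_mean_ci(data)
print(f"95% interval for the mean: [{lo:.2f}, {hi:.2f}]")
```

The list of resampled estimates plays the same role as p_bt above: a histogram of it gives a rough picture of the sampling distribution of the estimator.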

export(f, suppress_header=False, delimiter=',')

Export the output of the model in CSV format.

Note

Calling this before solve() raises an exception.

Parameters
  • f – file name or descriptor

  • suppress_header (boolean) – Set to true to suppress the CSV header

  • delimiter (str) – delimiter of the CSV file
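The effect of suppress_header and delimiter can be illustrated with the standard library csv module. This is a sketch only: `export_rows` is a hypothetical helper that accepts an already open file-like object, and the column names and values are assumptions, not the header that Open-SIR writes.

```python
import csv
import io


def export_rows(f, rows, header, suppress_header=False, delimiter=","):
    """Write rows to the open text file f, optionally preceded by a header."""
    writer = csv.writer(f, delimiter=delimiter, lineterminator="\n")
    if not suppress_header:
        writer.writerow(header)
    writer.writerows(rows)


# Hypothetical column names and a single row of model output.
buffer = io.StringIO()
export_rows(buffer, [[0, 0.999, 0.001, 0.0]], ["Days", "S", "I", "R"], delimiter=";")
print(buffer.getvalue())
```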

fetch()

Fetch the data from the model.

Returns

An array with the data. The first column is the time.

Return type

np.array

fit(t_obs, n_obs, fit_index=None)

Use the Levenberg-Marquardt algorithm to fit model parameters consistent with True entries in the fit_index list.

Parameters
  • t_obs (numpy.ndarray) – Vector of days corresponding to the observations of number of infected people. Must be a non-decreasing array.

  • n_obs (numpy.ndarray) – Vector which contains the observed epidemiological variable to fit the model against. It must be consistent with t_obs and with the initial conditions defined when building the model and using the set_parameters and set_initial_conds functions. The model fit_input attribute defines against which epidemiological variable the fitting will be performed.

  • fit_index (list of booleans, optional) – This list must have the same length as the number of parameters of the model. The parameter p[i] will be fitted if fit_index[i] = True. Otherwise, the parameter will be fixed. By default, fit will only fit the first parameter of p, p[0].

Returns

Reference to self

Return type

Model
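The role of fit_index can be pictured as a boolean mask that separates free parameters from fixed ones before an optimizer is called, and merges them back afterwards. The helpers below are a hypothetical sketch of that bookkeeping, not the Open-SIR internals; `split_parameters` and `merge_parameters` are assumed names, and a plain ValueError stands in for the library's own exception.

```python
def split_parameters(p, fit_index=None):
    """Return the free parameter values and the mask that selects them."""
    if fit_index is None:
        # Default described above: only the first parameter is fitted.
        fit_index = [True] + [False] * (len(p) - 1)
    if len(fit_index) != len(p):
        raise ValueError("fit_index must have the same length as p")
    free = [value for value, fit in zip(p, fit_index) if fit]
    return free, list(fit_index)


def merge_parameters(p, fit_index, free_values):
    """Rebuild the full parameter list from fixed values and new free values."""
    free_iter = iter(free_values)
    return [next(free_iter) if fit else value for value, fit in zip(p, fit_index)]


p = [0.95, 0.38]                        # [alpha, beta], illustrative values
free, mask = split_parameters(p)        # only alpha is free by default
updated = merge_parameters(p, mask, [1.2])
print(free, mask, updated)
```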

predict(n_days=7, n_I=None, n_R=None)

Predict Susceptible, Infected and Removed

Parameters
  • n_days (int) – number of days to predict

  • n_I (int) – number of infected at the last day of available data. If no number of infected is provided, the value is taken from the last element of the number of infected array on which the model was fitted.

  • n_R (int) – number of removed at the last day of available data. If no number of removed is provided, the value is set as the number of removed calculated by the SIR model as a consequence of the parameter fitting.

Returns

Array with:
  • T: days of the predictions, where T[0] represents the last day of the sample and T[1] onwards the predictions.

  • S: Predicted number of susceptible

  • I: Predicted number of infected

  • R: Predicted number of removed

Return type

np.array

property r0

Returns reproduction number

Returns

\[R_0 = \alpha/\beta\]

Return type

float
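As a worked example of the two derived quantities defined so far, the sketch below evaluates \(R_0 = \alpha/\beta\) and \(T_I = 1/\beta\). The function names and the numerical values are assumptions used for illustration only.

```python
def reproduction_number(alpha, beta):
    """R_0 = alpha / beta, with both rates in 1/day."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    return alpha / beta


def infectious_period(beta):
    """T_I = 1 / beta, in days when beta is in 1/day."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    return 1.0 / beta


# Illustrative rates: an epidemic grows while R_0 > 1, i.e. alpha > beta.
print(reproduction_number(0.95, 0.38))
print(infectious_period(0.25))
```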

set_initial_conds(array=None, n_S0=None, n_I0=None, n_R0=None)

Set SIR initial conditions

Parameters

array (list) –

List of initial conditions [n_S0, n_I0, n_R0]. If set, all other arguments are ignored.

  • n_S0: Total number of susceptible to the infection

  • n_I0: Total number of infected

  • n_R0: Total number of removed

Note: n_S0 + n_I0 + n_R0 = Population

Note

Internally, the model initial conditions are the ratios

  • S0 = n_S0/Population

  • I0 = n_I0/Population

  • R0 = n_R0/Population

which is consistent with the mathematical description of the SIR model.

Returns

Reference to self

Return type

SIR
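The conversion from head counts to the fractions used by the equations is a simple normalization. The following sketch shows the arithmetic; `normalize_initial_conds` is a hypothetical helper, not a function of the library.

```python
def normalize_initial_conds(n_s0, n_i0, n_r0):
    """Convert counts to the fractions [S0, I0, R0] and return the population."""
    population = n_s0 + n_i0 + n_r0
    if population <= 0:
        raise ValueError("the population must be positive")
    fractions = [n_s0 / population, n_i0 / population, n_r0 / population]
    return fractions, population


# One infected individual in a population of 1000 (illustrative values).
fractions, population = normalize_initial_conds(999, 1, 0)
print(fractions, population)
```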

set_parameters(array=None, alpha=None, beta=None)

Set SIR parameters

Parameters
  • array (list) – list of parameters of the model ([alpha, beta]). If set, all other arguments are ignored. All these values should be in 1/day units.

  • alpha (float) – Value of alpha in 1/day unit.

  • beta (float) – Value of beta in 1/day unit.

Returns

Reference to self

Return type

SIR

set_params(p, initial_conds)

Set model parameters.

Parameters
  • p (dict or array) – parameters of the model (alpha, beta). All these values should be in 1/day units. If a list is used, the order of parameters is [alpha, beta].

  • initial_conds (list) –

    Initial conditions (n_S0, n_I0, n_R0), where:

    • n_S0: Total number of susceptible to the infection

    • n_I0: Total number of infected

    • n_R0: Total number of removed

    Note n_S0 + n_I0 + n_R0 = Population

    Internally, the model initial conditions are the ratios

    • S0 = n_S0/Population

    • I0 = n_I0/Population

    • R0 = n_R0/Population

    which is consistent with the mathematical description of the SIR model.

    If a list is used, the order of initial conditions is [n_S0, n_I0, n_R0]

Deprecated:

This function is deprecated and will be removed soon. Please use set_parameters() and set_initial_conds()

Returns

Reference to self

Return type

SIR

solve(tf_days=7, numpoints=7)

Solve using the child class model.

Parameters
  • tf_days (int) – number of days to simulate

  • numpoints (int) – number of points for the simulation.

Returns

Reference to self

Return type

Model

SIR-X Model

The SIR-X model extends the SIR model by adding two parameters: the quarantine rate \(\kappa\) and the containment rate \(\kappa_0\). This extension allows the model to capture the “decrease” of the susceptible population owing to containment and quarantine measures.
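For reference, the SIR-X system is commonly written as below, with \(X\) the quarantined fraction. This form is stated here as an assumption about the model rather than taken from the text above; it is consistent with the effective infectious period \(T_{I,eff} = (\beta + \kappa + \kappa_0)^{-1}\) and with the constraint \(S+I+R+X=1\), since the four right-hand sides sum to zero.

\[\frac{dS}{dt} = -\alpha S I - \kappa_0 S,\]
\[\frac{dI}{dt} = \alpha S I - \beta I - \kappa_0 I - \kappa I,\]
\[\frac{dR}{dt} = \beta I + \kappa_0 S,\]
\[\frac{dX}{dt} = (\kappa + \kappa_0) I.\]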

class opensir.models.SIRX

SIRX model definition

exception InconsistentDimensionsError

Raised when the length of the days array is not equal to the dimension of the observed cases, or when the length of fit_index differs from the length of the parameter array self.p

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

exception InitializationError

Raised when a function is executed in violation of the logical sequence of the Open-SIR pipeline

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

exception InvalidNumberOfParametersError

Raised when the number of initial parameters is not correct

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

exception InvalidParameterError

Raised when the value of an initial parameter is not correct

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

block_cv(lags=1, min_sample=3)

Calculates mean squared error of the predictions as a measure of model predictive performance using block cross validation.

The cross-validation mean squared error can be used to estimate a confidence interval of model predictions.

The model needs to be initialized and fitted prior to calling block_cv.

Parameters
  • lags (int) – Defines the number of days that will be forecasted to calculate the mean squared error. For example, for the prediction Xp(t) and the real value X(t), the mean squared error will be calculated as mse = 1/n_boots |Xp(t+lags)-X(t+lags)|. This provides an estimate of the mean deviation of the predictions after “lags” days.

  • min_sample (int) – Number of days that will be used in the train set to make the first prediction.

Returns

tuple containing:
  • mse_avg (float):

    Simple average of the mean squared error between the model prediction for “lags” days and the real observed value.

  • mse_list (numpy.array):

    List of the mean squared errors using (i) points to predict the X(i+lags) value, with i an iterator that goes from n_samples+1 to the end of t_obs index.

  • p_list (numpy.array):

    List of the parameters sampled on the bootstrapping as a function of time. A common use of this list is to plot the mean squared error against time, to identify time periods where the model produces the best and worst fit to the data.

Return type

tuple

ci_bootstrap(alpha=0.95, n_iter=1000, r0_ci=True)

Calculates the confidence interval of the parameters using the random sample bootstrap method.

The model needs to be initialized and fitted prior to calling ci_bootstrap.

Parameters
  • alpha (float) – Percentile of the confidence interval required.

  • n_iter (int) – Number of random samples that will be taken to fit the model and perform the bootstrapping. Use n_iter >= 1000

  • r0_ci (boolean) – Set to True to also return the reproduction rate confidence interval.

Note

This traditional random sampling bootstrap is not a good way to bootstrap time-series data, because X(t+1) is correlated with X(t). In any case, it provides a reference case and can be a useful method for other types of models. When using this function, always compare the prediction error with the interval provided by the function ci_block_cv.

Returns

tuple containing:
  • ci (numpy.array):

    list of lists that contain the lower and upper confidence intervals of each parameter.

  • p_bt (numpy.array):

    list of the parameters sampled on the bootstrapping. The most common use of this list is to plot histograms to visualize and try to infer the probability density function of the parameters.

Return type

tuple

export(f, suppress_header=False, delimiter=',')

Export the output of the model in CSV format.

Note

Calling this before solve() raises an exception.

Parameters
  • f – file name or descriptor

  • suppress_header (boolean) – Set to true to suppress the CSV header

  • delimiter (str) – delimiter of the CSV file

fetch()

Fetch the data from the model.

Returns

An array with the data. The first column is the time.

Return type

np.array

fit(t_obs, n_obs, fit_index=None)

Use the Levenberg-Marquardt algorithm to fit model parameters consistent with True entries in the fit_index list.

Parameters
  • t_obs (numpy.ndarray) – Vector of days corresponding to the observations of number of infected people. Must be a non-decreasing array.

  • n_obs (numpy.ndarray) – Vector which contains the observed epidemiological variable to fit the model against. It must be consistent with t_obs and with the initial conditions defined when building the model and using the set_parameters and set_initial_conds functions. The model fit_input attribute defines against which epidemiological variable the fitting will be performed.

  • fit_index (list of booleans, optional) – This list must have the same length as the number of parameters of the model. The parameter p[i] will be fitted if fit_index[i] = True. Otherwise, the parameter will be fixed. By default, fit will only fit the first parameter of p, p[0].

Returns

Reference to self

Return type

Model

property pcl

Returns public containment leverage \(P\)

Returns

\[P = \frac{\kappa_0}{\kappa_0 + \kappa}\]

Return type

float

predict(n_days=7, n_X=None, n_R=None)

Predicts Susceptible, Infected, Removed and Quarantined in the next n_days from the last day of the sample used to train the model.

Parameters
  • n_days (int) – number of days to predict

  • n_X (int) – number of confirmed cases at the last day of available data. If no number of confirmed cases is provided, the value is taken from the last element of the number of confirmed cases array on which the model was fitted.

  • n_R (int) – number of removed at the last day of available data. If no number of removed is provided, the value is set as the number of removed calculated by the SIR-X model as a consequence of the parameter fitting.

Returns

Array with:
  • T: days of the predictions, where T[0] represents the last day of the sample and T[1] onwards the predictions.

  • S: Predicted number of susceptible

  • I: Predicted number of infected

  • R: Predicted number of removed

  • X: Predicted number of quarantined

Return type

np.array

property q_prob

Returns quarantine probability \(Q\)

Returns

\[Q = \frac{\kappa_0 + \kappa}{\beta + \kappa_0 + \kappa}\]

Return type

float

property r0

Returns reproduction number

Returns

\[R_0 = \alpha/\beta\]

Return type

float

property r0_eff

Returns effective reproduction rate \(R_{0,eff}\)

Returns

\[R_{0,eff} = \alpha T_{I,eff}\]

Return type

float

set_initial_conds(array=None, n_S0=None, n_I0=None, n_R0=None, n_X0=None)

Set SIR-X initial conditions

Parameters

array (list) –

List of initial conditions [n_S0, n_I0, n_R0, n_X0]. If set, all other arguments are ignored.

  • n_S0: Total number of susceptible to the infection

  • n_I0: Total number of infected

  • n_R0: Total number of removed

  • n_X0: Total number of quarantined

Note: n_S0 + n_I0 + n_R0 + n_X0 = Population

Note

Internally, the model initial conditions are the ratios

  • S0 = n_S0/Population

  • I0 = n_I0/Population

  • R0 = n_R0/Population

  • X0 = n_X0/Population

which is consistent with the mathematical description of the SIR-X model.

Returns

Reference to self

Return type

SIRX

set_parameters(array=None, alpha=None, beta=None, kappa_0=None, kappa=None, inf_over_test=None)

Set SIR-X parameters

Parameters
  • array (list) – list of parameters of the model ([alpha, beta, kappa_0, kappa, inf_over_test]). If set, all other arguments are ignored. All these values should be in 1/day units.

  • alpha (float) – Value of alpha in 1/day unit.

  • beta (float) – Value of beta in 1/day unit.

  • kappa_0 (float) – Value of kappa_0 in 1/day unit.

  • kappa (float) – Value of kappa in 1/day unit.

  • inf_over_test (float) – Value of infected/tested

Returns

Reference to self

Return type

SIRX

set_params(p, initial_conds)

Set model parameters.

Parameters
  • p (list) – parameters of the model (alpha, beta, kappa_0, kappa, inf_over_test). All these values should be in 1/day units. If a list is used, the order of parameters is [alpha, beta, kappa_0, kappa, inf_over_test]

  • initial_conds (list) –

    Initial conditions (n_S0, n_I0, n_R0, n_X0), where:

    • n_S0: Total number of susceptible to the infection

    • n_I0: Total number of infected

    • n_R0: Total number of removed

    • n_X0: Total number of quarantined

    Note: n_S0 + n_I0 + n_R0 + n_X0 = Population

    Internally, the model initial conditions are the ratios

    • S0 = n_S0/Population

    • I0 = n_I0/Population

    • R0 = n_R0/Population

    • X0 = n_X0/Population

    which is consistent with the mathematical description of the SIR-X model.

    If a list is used, the order of initial conditions is [n_S0, n_I0, n_R0, n_X0]

Deprecated:

This function is deprecated and will be removed soon. Please use set_parameters() and set_initial_conds()

Returns

Reference to self

Return type

SIRX

solve(tf_days=7, numpoints=7)

Solve using the child class model.

Parameters
  • tf_days (int) – number of days to simulate

  • numpoints (int) – number of points for the simulation.

Returns

Reference to self

Return type

Model

property t_inf_eff

Returns effective infectious period

Returns

\[T_{I,eff} = (\beta + \kappa + \kappa_0)^{-1}\]

Return type

float
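The derived SIR-X quantities documented above (r0, t_inf_eff, r0_eff, pcl and q_prob) are simple functions of the four rates. The sketch below evaluates the formulas as written in this section; `sirx_summary` is a hypothetical helper and the rates are illustrative values, not fitted parameters.

```python
def sirx_summary(alpha, beta, kappa_0, kappa):
    """Evaluate the SIR-X derived quantities from rates given in 1/day."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    if kappa_0 + kappa <= 0:
        raise ValueError("kappa_0 + kappa must be positive")
    total_removal = beta + kappa_0 + kappa
    t_inf_eff = 1.0 / total_removal
    return {
        "r0": alpha / beta,                         # R_0 = alpha / beta
        "t_inf_eff": t_inf_eff,                     # T_I,eff
        "r0_eff": alpha * t_inf_eff,                # R_0,eff = alpha * T_I,eff
        "pcl": kappa_0 / (kappa_0 + kappa),         # public containment leverage P
        "q_prob": (kappa_0 + kappa) / total_removal,  # quarantine probability Q
    }


summary = sirx_summary(alpha=0.5, beta=0.25, kappa_0=0.125, kappa=0.125)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With these values, containment and quarantine halve the infectious period relative to \(1/\beta\), which brings the effective reproduction number down from 2 to 1.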