Open-SIR API
SIR Model
Most epidemic models share a common approach to modelling the spread of a disease. The susceptible-infectious-removed (SIR) model is a simple deterministic compartmental model to predict disease spread. A target population is divided into three groups: the susceptible (\(S\)), the infected (\(I\)) and the recovered or removed (\(R\)). These quantities enter the model as fractions of the total population \(P\).
Because a pandemic infects and kills much more quickly than the natural human rates of birth and death, the total population is assumed constant, with individuals who recover or die counted in the removed group. Hence, \(S+I+R=P/P=1\). The dynamics of the pandemic are modelled as a system of ordinary differential equations that governs the rates of change of the susceptible, infected and recovered/removed fractions of the population.
The number of possible transmissions is proportional to the number of interactions between the susceptible and infected compartments, \(S \times I\):
\[\frac{dS}{dt} = -\alpha S I\]
where \(\alpha\) / \([\text{time}]^{-1}\) is the transmission rate of the process, which quantifies how many of the interactions between the susceptible and infected populations lead to new infections per day.
The population of infected individuals increases with new infections and decreases as people recover or are removed:
\[\frac{dI}{dt} = \alpha S I - \beta I\]
where \(\beta\) / \([\text{time}]^{-1}\) is the removal rate, that is, the fraction of the infected population that is removed from the transmission process per day.
The infectious period, \(T_I\) / \([\text{time}]\), is defined as the reciprocal of the removal rate:
\[T_I = \frac{1}{\beta}\]
In the early stages of an outbreak, the number of infected people is much lower than the susceptible population. Hence \(S \approx 1\), which makes \(dI/dt\) linear in \(I\), and the system has the analytical solution \(I(t) = I_0 \exp\left((\alpha - \beta)t\right)\).
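The early-phase approximation can be checked numerically. The following is a minimal sketch, not the Open-SIR implementation: it assumes the standard SIR form \(dS/dt = -\alpha S I\), \(dI/dt = \alpha S I - \beta I\), \(dR/dt = \beta I\), integrates it with a hand-written Runge-Kutta scheme, and compares the infected fraction with \(I_0 \exp((\alpha - \beta)t)\). The function names and parameter values are illustrative assumptions.

```python
import math


def sir_rhs(state, alpha, beta):
    """Right-hand side of the SIR equations for the fractions (S, I, R)."""
    s, i, _ = state
    new_infections = alpha * s * i
    removals = beta * i
    return (-new_infections, new_infections - removals, removals)


def rk4_step(state, dt, alpha, beta):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, factor):
        return tuple(b + factor * k for b, k in zip(base, slope))

    k1 = sir_rhs(state, alpha, beta)
    k2 = sir_rhs(shifted(state, k1, dt / 2), alpha, beta)
    k3 = sir_rhs(shifted(state, k2, dt / 2), alpha, beta)
    k4 = sir_rhs(shifted(state, k3, dt), alpha, beta)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_sir(alpha, beta, s0, i0, r0, days, steps_per_day=100):
    """Return the state at the start of each day, from day 0 to `days`."""
    dt = 1.0 / steps_per_day
    state = (s0, i0, r0)
    trajectory = [state]
    for _ in range(days):
        for _ in range(steps_per_day):
            state = rk4_step(state, dt, alpha, beta)
        trajectory.append(state)
    return trajectory


ALPHA, BETA = 0.95, 0.38  # illustrative rates in 1/day (assumed values)
I0 = 1e-6                 # initial infected fraction (assumed value)
trajectory = simulate_sir(ALPHA, BETA, 1.0 - I0, I0, 0.0, days=5)
for day, (s, i, r) in enumerate(trajectory):
    approx = I0 * math.exp((ALPHA - BETA) * day)
    print(f"day {day}: I = {i:.3e}, early-phase approximation = {approx:.3e}")
```

While \(S\) stays close to 1 the two columns agree closely; they drift apart once a noticeable share of the population has been infected.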
-
class
opensir.models.
SIR
¶ SIR model definition
-
exception
InconsistentDimensionsError
¶ Raised when the length of the days array is not equal to the dimension of the observed cases, or when the length of fit_index differs from the length of the parameter array self.p
-
args
¶
-
with_traceback
()¶ Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
-
-
exception
InitializationError
¶ Raised when a function is executed in violation of the logical sequence of the Open-SIR pipeline
-
args
¶
-
with_traceback
()¶ Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
-
-
exception
InvalidNumberOfParametersError
¶ Raised when the number of initial parameters is not correct
-
args
¶
-
with_traceback
()¶ Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
-
-
exception
InvalidParameterError
¶ Raised when the value of an initial parameter is not correct
-
args
¶
-
with_traceback
()¶ Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
-
-
block_cv
(lags=1, min_sample=3)¶ Calculates mean squared error of the predictions as a measure of model predictive performance using block cross validation.
The cross-validation mean squared error can be used to estimate a confidence interval of model predictions.
The model needs to be initialized and fitted prior to calling block_cv.
- Parameters
lags (int) – Defines the number of days that will be forecasted to calculate the mean squared error. For example, for the prediction Xp(t) and the real value X(t), the mean squared error will be calculated as mse = 1/n_boots |Xp(t+lags)-X(t+lags)|. This provides an estimate of the mean deviation of the predictions after “lags” days.
min_sample (int) – Number of days that will be used in the train set to make the first prediction.
- Returns
- tuple containing:
- mse_avg (float):
Simple average of the mean squared error between the model prediction for “lags” days and the real observed value.
- mse_list (numpy.array):
List of the mean squared errors using (i) points to predict the X(i+lags) value, with i an iterator that goes from n_samples+1 to the end of t_obs index.
- p_list (numpy.array):
List of the parameters sampled on the bootstrapping as a function of time. A common use of this list is to plot the mean squared error against time, to identify time periods where the model produces the best and worst fit to the data.
- Return type
tuple
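The idea behind block cross-validation can be illustrated on a toy model. The sketch below is an assumption-laden stand-in, not the Open-SIR code: it replaces the SIR fit with a log-linear exponential fit, trains on the first points of the series, predicts the value `lags` steps ahead, and records the squared error. The helper names `fit_exponential` and `rolling_origin_cv` are hypothetical.

```python
import math


def fit_exponential(t, x):
    """Least-squares fit of log(x) = log(x0) + rate * t; returns (x0, rate)."""
    n = len(t)
    logs = [math.log(v) for v in x]
    t_mean = sum(t) / n
    l_mean = sum(logs) / n
    cov = sum((ti - t_mean) * (li - l_mean) for ti, li in zip(t, logs))
    var = sum((ti - t_mean) ** 2 for ti in t)
    rate = cov / var
    return math.exp(l_mean - rate * t_mean), rate


def rolling_origin_cv(t, x, lags=1, min_sample=3):
    """Train on a growing block of points and score the forecast `lags` ahead."""
    errors, params = [], []
    for n_train in range(min_sample, len(t) - lags + 1):
        x0, rate = fit_exponential(t[:n_train], x[:n_train])
        target = n_train - 1 + lags  # index of the point `lags` steps ahead
        prediction = x0 * math.exp(rate * t[target])
        errors.append((prediction - x[target]) ** 2)
        params.append((x0, rate))
    return sum(errors) / len(errors), errors, params


# Noise-free synthetic data, so the forecast error should be close to zero.
t_obs = [float(d) for d in range(10)]
x_obs = [10.0 * math.exp(0.3 * d) for d in t_obs]
mse_avg, mse_list, p_list = rolling_origin_cv(t_obs, x_obs, lags=1, min_sample=3)
print(f"folds: {len(mse_list)}, average squared error: {mse_avg:.3e}")
```

Because each fold only uses data from before the forecast date, the temporal ordering of the series is respected, which is the property that makes this scheme suitable for time series.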
-
ci_bootstrap
(alpha=0.95, n_iter=1000, r0_ci=True)¶ Calculates the confidence interval of the parameters using the random sample bootstrap method.
The model needs to be initialized and fitted prior to calling ci_bootstrap.
- Parameters
alpha (float) – Percentile of the confidence interval required.
n_iter (int) – Number of random samples that will be taken to fit the model and perform the bootstrapping. Use n_iter >= 1000
r0_ci (boolean) – Set to True to also return the reproduction rate confidence interval.
Note
This traditional random-sampling bootstrap is not a good way to bootstrap time-series data, because X(t+1) is correlated with X(t). Nevertheless, it provides a reference case and can be a useful method for other types of models. When using this function, always compare the prediction error with the interval provided by the function ci_block_cv.
- Returns
- tuple containing:
- ci (numpy.array):
list of lists that contain the lower and upper confidence intervals of each parameter.
- p_bt (numpy.array):
list of the parameters sampled on the bootstrapping. The most common use of this list is to plot histograms to visualize and try to infer the probability density function of the parameters.
- Return type
tuple
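As a rough illustration of the random-sample bootstrap, the sketch below resamples observations with replacement, refits a parameter on each resample, and reads the confidence interval off the percentiles of the fitted values. It is a sketch under stated assumptions rather than the library's method: the fitted quantity is the growth rate of a toy exponential model, and `fit_rate`, `percentile_ci` and `bootstrap_rate` are hypothetical helper names.

```python
import math
import random


def fit_rate(pairs):
    """Slope of the least-squares line through (t, log x); None if degenerate."""
    n = len(pairs)
    t_mean = sum(t for t, _ in pairs) / n
    l_mean = sum(math.log(x) for _, x in pairs) / n
    var = sum((t - t_mean) ** 2 for t, _ in pairs)
    if var == 0.0:
        return None  # every resampled point shares the same t
    cov = sum((t - t_mean) * (math.log(x) - l_mean) for t, x in pairs)
    return cov / var


def percentile_ci(samples, alpha=0.95):
    """Lower and upper bounds of the central `alpha` share of the samples."""
    ordered = sorted(samples)
    tail = (1.0 - alpha) / 2.0
    lower = ordered[int(math.floor(tail * (len(ordered) - 1)))]
    upper = ordered[int(math.ceil((1.0 - tail) * (len(ordered) - 1)))]
    return lower, upper


def bootstrap_rate(pairs, n_iter=1000, alpha=0.95, seed=0):
    """Refit the growth rate on n_iter resamples drawn with replacement."""
    rng = random.Random(seed)
    rates = []
    while len(rates) < n_iter:
        resample = [rng.choice(pairs) for _ in pairs]
        rate = fit_rate(resample)
        if rate is not None:
            rates.append(rate)
    return percentile_ci(rates, alpha), rates


# Synthetic observations: exponential growth at 0.3/day with multiplicative noise.
noise_rng = random.Random(42)
pairs = [
    (float(t), 10.0 * math.exp(0.3 * t + noise_rng.gauss(0.0, 0.05)))
    for t in range(15)
]
(lower, upper), rates = bootstrap_rate(pairs, n_iter=1000, alpha=0.95, seed=0)
print(f"95% interval for the growth rate: [{lower:.3f}, {upper:.3f}]")
```

The list of resampled rates plays the same role as p_bt: plotting its histogram gives a picture of the sampling distribution of the parameter.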
-
export
(f, suppress_header=False, delimiter=',')¶ Export the output of the model in CSV format.
Note
Calling this before solve() raises an exception.
- Parameters
f – file name or descriptor
suppress_header (boolean) – Set to true to suppress the CSV header
delimiter (str) – delimiter of the CSV file
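The effect of suppress_header and delimiter can be pictured with the standard csv module. This is only a sketch of the behaviour described above, not the code behind export(): the helper `export_rows`, the column names and the sample rows are assumptions made for illustration.

```python
import csv
import io


def export_rows(f, rows, header, suppress_header=False, delimiter=","):
    """Write rows to an open text file object in CSV format."""
    writer = csv.writer(f, delimiter=delimiter, lineterminator="\n")
    if not suppress_header:
        writer.writerow(header)
    writer.writerows(rows)


# Hypothetical solver output: time followed by the S, I and R fractions.
rows = [(0, 0.99, 0.01, 0.0), (1, 0.98, 0.015, 0.005)]
buffer = io.StringIO()
export_rows(buffer, rows, header=["t", "S", "I", "R"], delimiter=";")
csv_text = buffer.getvalue()
print(csv_text)
```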
-
fetch
()¶ Fetch the data from the model.
- Returns
An array with the data. The first column is the time.
- Return type
np.array
-
fit
(t_obs, n_obs, fit_index=None)¶ Use the Levenberg-Marquardt algorithm to fit model parameters consistent with True entries in the fit_index list.
- Parameters
t_obs (numpy.ndarray) – Vector of days corresponding to the observations of number of infected people. Must be a non-decreasing array.
n_obs (numpy.ndarray) – Vector which contains the observed epidemiological variable to fit the model against. It must be consistent with t_obs and with the initial conditions defined when building the model and using the set_parameters and set_initial_conds functions. The model fit_input attribute defines against which epidemiological variable the fitting will be performed.
fit_index (list of booleans , optional) – this list must have one entry per parameter of the model. The parameter p[i] will be fitted if fit_index[i] = True. Otherwise, the parameter will be fixed. By default, fit will only fit the first parameter of p, p[0].
- Returns
Reference to self
- Return type
Model
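The role of fit_index can be sketched as a boolean mask that separates the parameters handed to the optimizer from those held fixed. The helpers below are hypothetical and only mirror the behaviour described above, including the default of fitting p[0] alone; they are not taken from the Open-SIR source.

```python
def split_parameters(p, fit_index=None):
    """Return the free parameter values and the mask that selected them."""
    if fit_index is None:
        # Documented default: only the first parameter is fitted.
        fit_index = [True] + [False] * (len(p) - 1)
    if len(fit_index) != len(p):
        raise ValueError("fit_index must have one entry per parameter")
    free = [value for value, flag in zip(p, fit_index) if flag]
    return free, list(fit_index)


def merge_parameters(p, fit_index, free_values):
    """Rebuild the full parameter list from fixed values and fitted values."""
    free_iter = iter(free_values)
    return [next(free_iter) if flag else value for value, flag in zip(p, fit_index)]


p = [0.95, 0.38]  # [alpha, beta], illustrative values in 1/day
free, mask = split_parameters(p)
updated = merge_parameters(p, mask, [1.2])  # pretend the optimizer returned 1.2
print(free, mask, updated)
```

An optimizer would only ever see the `free` list; the merge step restores the fixed entries before each model evaluation.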
-
predict
(n_days=7, n_I=None, n_R=None)¶ Predict Susceptible, Infected and Removed
- Parameters
n_days (int) – number of days to predict
n_I (int) – number of infected at the last day of available data. If no number of infected is provided, the value is taken from the last element of the number of infected array on which the model was fitted.
n_R (int) – number of removed at the last day of available data. If no number of removed is provided, the value is set as the number of removed calculated by the SIR model as a consequence of the parameter fitting.
- Returns
- Array with:
T: days of the predictions, where T[0] represents the last day of the sample and T[1] onwards the predictions.
S: Predicted number of susceptible
I: Predicted number of infected
R: Predicted number of removed
- Return type
np.array
-
property
r0
¶ Returns reproduction number
- Returns
- \[R_0 = \alpha/\beta\]
- Return type
float
-
set_initial_conds
(array=None, n_S0=None, n_I0=None, n_R0=None)¶ Set SIR initial conditions
- Parameters
array (list) –
List of initial conditions [n_S0, n_I0, n_R0]. If set, all other arguments are ignored.
n_S0: Total number of susceptible to the infection
n_I0: Total number of infected
n_R0: Total number of removed
Note: n_S0 + n_I0 + n_R0 = Population
Note
Internally, the model initial conditions are the ratios
S0 = n_S0/Population
I0 = n_I0/Population
R0 = n_R0/Population
which is consistent with the mathematical description of the SIR model.
- Returns
Reference to self
- Return type
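The conversion from head counts to the ratios used internally can be sketched as follows. The function name is a hypothetical illustration of the normalization described in the note, not part of the Open-SIR API.

```python
def normalize_initial_conds(n_s0, n_i0, n_r0):
    """Convert absolute counts into fractions of the total population."""
    population = n_s0 + n_i0 + n_r0
    if population <= 0:
        raise ValueError("the population must be positive")
    fractions = tuple(n / population for n in (n_s0, n_i0, n_r0))
    return fractions, population


# Illustrative counts: one million people, ten of them infected.
fractions, population = normalize_initial_conds(999_990, 10, 0)
s0, i0, r0 = fractions
print(f"Population = {population}, S0 = {s0}, I0 = {i0}, R0 = {r0}")
```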
-
set_parameters
(array=None, alpha=None, beta=None)¶ Set SIR parameters
- Parameters
array (list) – list of parameters of the model ([alpha, beta]) If set, all other arguments are ignored. All these values should be in 1/day units.
alpha (float) – Value of alpha in 1/day unit.
beta (float) – Value of beta in 1/day unit.
- Returns
Reference to self
- Return type
-
set_params
(p, initial_conds)¶ Set model parameters.
- Parameters
p (dict or array) – parameters of the model (alpha, beta). All these values should be in 1/day units. If a list is used, the order of parameters is [alpha, beta].
initial_conds (list) –
Initial conditions (n_S0, n_I0, n_R0), where:
n_S0: Total number of susceptible to the infection
n_I0: Total number of infected
n_R0: Total number of removed
Note n_S0 + n_I0 + n_R0 = Population
Internally, the model initial conditions are the ratios
S0 = n_S0/Population
I0 = n_I0/Population
R0 = n_R0/Population
which is consistent with the mathematical description of the SIR model.
If a list is used, the order of initial conditions is [n_S0, n_I0, n_R0]
- Deprecated:
This function is deprecated and will be removed soon. Please use set_parameters() and set_initial_conds()
- Returns
reference to self
- Return type
-
solve
(tf_days=7, numpoints=7)¶ Solve using children class model.
- Parameters
tf_days (int) – number of days to simulate
numpoints (int) – number of points for the simulation.
- Returns
Reference to self
- Return type
Model
SIR-X Model
The SIR-X model extends the SIR model by adding two parameters: the quarantine rate \(\kappa\) and the containment rate \(\kappa_0\). This extension allows the model to capture the “decrease” of the susceptible population owing to containment and quarantine measures.
-
class
opensir.models.
SIRX
¶ SIRX model definition
-
exception
InconsistentDimensionsError
¶ Raised when the length of the days array is not equal to the dimension of the observed cases, or when the length of fit_index differs from the length of the parameter array self.p
-
args
¶
-
with_traceback
()¶ Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
-
-
exception
InitializationError
¶ Raised when a function is executed in violation of the logical sequence of the Open-SIR pipeline
-
args
¶
-
with_traceback
()¶ Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
-
-
exception
InvalidNumberOfParametersError
¶ Raised when the number of initial parameters is not correct
-
args
¶
-
with_traceback
()¶ Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
-
-
exception
InvalidParameterError
¶ Raised when the value of an initial parameter is not correct
-
args
¶
-
with_traceback
()¶ Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
-
-
block_cv
(lags=1, min_sample=3)¶ Calculates mean squared error of the predictions as a measure of model predictive performance using block cross validation.
The cross-validation mean squared error can be used to estimate a confidence interval of model predictions.
The model needs to be initialized and fitted prior to calling block_cv.
- Parameters
lags (int) – Defines the number of days that will be forecasted to calculate the mean squared error. For example, for the prediction Xp(t) and the real value X(t), the mean squared error will be calculated as mse = 1/n_boots |Xp(t+lags)-X(t+lags)|. This provides an estimate of the mean deviation of the predictions after “lags” days.
min_sample (int) – Number of days that will be used in the train set to make the first prediction.
- Returns
- tuple containing:
- mse_avg (float):
Simple average of the mean squared error between the model prediction for “lags” days and the real observed value.
- mse_list (numpy.array):
List of the mean squared errors using (i) points to predict the X(i+lags) value, with i an iterator that goes from n_samples+1 to the end of t_obs index.
- p_list (numpy.array):
List of the parameters sampled on the bootstrapping as a function of time. A common use of this list is to plot the mean squared error against time, to identify time periods where the model produces the best and worst fit to the data.
- Return type
tuple
-
ci_bootstrap
(alpha=0.95, n_iter=1000, r0_ci=True)¶ Calculates the confidence interval of the parameters using the random sample bootstrap method.
The model needs to be initialized and fitted prior to calling ci_bootstrap.
- Parameters
alpha (float) – Percentile of the confidence interval required.
n_iter (int) – Number of random samples that will be taken to fit the model and perform the bootstrapping. Use n_iter >= 1000
r0_ci (boolean) – Set to True to also return the reproduction rate confidence interval.
Note
This traditional random-sampling bootstrap is not a good way to bootstrap time-series data, because X(t+1) is correlated with X(t). Nevertheless, it provides a reference case and can be a useful method for other types of models. When using this function, always compare the prediction error with the interval provided by the function ci_block_cv.
- Returns
- tuple containing:
- ci (numpy.array):
list of lists that contain the lower and upper confidence intervals of each parameter.
- p_bt (numpy.array):
list of the parameters sampled on the bootstrapping. The most common use of this list is to plot histograms to visualize and try to infer the probability density function of the parameters.
- Return type
tuple
-
export
(f, suppress_header=False, delimiter=',')¶ Export the output of the model in CSV format.
Note
Calling this before solve() raises an exception.
- Parameters
f – file name or descriptor
suppress_header (boolean) – Set to true to suppress the CSV header
delimiter (str) – delimiter of the CSV file
-
fetch
()¶ Fetch the data from the model.
- Returns
An array with the data. The first column is the time.
- Return type
np.array
-
fit
(t_obs, n_obs, fit_index=None)¶ Use the Levenberg-Marquardt algorithm to fit model parameters consistent with True entries in the fit_index list.
- Parameters
t_obs (numpy.ndarray) – Vector of days corresponding to the observations of number of infected people. Must be a non-decreasing array.
n_obs (numpy.ndarray) – Vector which contains the observed epidemiological variable to fit the model against. It must be consistent with t_obs and with the initial conditions defined when building the model and using the set_parameters and set_initial_conds functions. The model fit_input attribute defines against which epidemiological variable the fitting will be performed.
fit_index (list of booleans , optional) – this list must have one entry per parameter of the model. The parameter p[i] will be fitted if fit_index[i] = True. Otherwise, the parameter will be fixed. By default, fit will only fit the first parameter of p, p[0].
- Returns
Reference to self
- Return type
Model
-
property
pcl
¶ Returns public containment leverage \(P\)
- Returns
- \[P = \frac{\kappa_0}{\kappa_0 + \kappa}\]
- Return type
float
-
predict
(n_days=7, n_X=None, n_R=None)¶ Predicts Susceptible, Infected, Removed and Quarantined in the next n_days from the last day of the sample used to train the model.
- Parameters
n_days (int) – number of days to predict
n_X (int) – number of confirmed cases at the last day of available data. If no number of confirmed cases is provided, the value is taken from the last element of the number of confirmed cases array on which the model was fitted.
n_R (int) – number of removed at the last day of available data. If no number of removed is provided, the value is set as the number of removed calculated by the SIR-X model as a consequence of the parameter fitting.
- Returns
- Array with:
T: days of the predictions, where T[0] represents the last day of the sample and T[1] onwards the predictions.
S: Predicted number of susceptible
I: Predicted number of infected
R: Predicted number of removed
X: Predicted number of quarantined
- Return type
np.array
-
property
q_prob
¶ Returns quarantine probability \(Q\)
- Returns
- \[Q = \frac{\kappa_0 + \kappa}{\beta + \kappa_0 + \kappa}\]
- Return type
float
-
property
r0
¶ Returns reproduction number
- Returns
- \[R_0 = \alpha/\beta\]
- Return type
float
-
property
r0_eff
¶ Returns effective reproduction rate \(R_{0,eff}\)
- Returns
- \[R_{0,eff} = \alpha T_{I,eff}\]
- Return type
float
-
set_initial_conds
(array=None, n_S0=None, n_I0=None, n_R0=None, n_X0=None)¶ Set SIR-X initial conditions
- Parameters
array (list) –
List of initial conditions [n_S0, n_I0, n_R0, n_X0]. If set, all other arguments are ignored.
n_S0: Total number of susceptible to the infection
n_I0: Total number of infected
n_R0: Total number of removed
n_X0: Total number of quarantined
Note: n_S0 + n_I0 + n_R0 + n_X0 = Population
Note
Internally, the model initial conditions are the ratios
S0 = n_S0/Population
I0 = n_I0/Population
R0 = n_R0/Population
X0 = n_X0/Population
which is consistent with the mathematical description of the SIR-X model.
- Returns
Reference to self
- Return type
-
set_parameters
(array=None, alpha=None, beta=None, kappa_0=None, kappa=None, inf_over_test=None)¶ Set SIR-X parameters
- Parameters
array (list) – list of parameters of the model ([alpha, beta, kappa_0, kappa, inf_over_test]) If set, all other arguments are ignored. All these values should be in 1/day units.
alpha (float) – Value of alpha in 1/day unit.
beta (float) – Value of beta in 1/day unit.
kappa_0 (float) – Value of kappa_0 in 1/day unit.
kappa (float) – Value of kappa in 1/day unit.
inf_over_test (float) – Value of infected/tested
- Returns
Reference to self
- Return type
-
set_params
(p, initial_conds)¶ Set model parameters.
- Parameters
p (list) – parameters of the model (alpha, beta, kappa_0, kappa, inf_over_test). All these values should be in 1/day units. If a list is used, the order of parameters is [alpha, beta, kappa_0, kappa, inf_over_test]
initial_conds (list) –
Initial conditions (n_S0, n_I0, n_R0, n_X0), where:
n_S0: Total number of susceptible to the infection
n_I0: Total number of infected
n_R0: Total number of removed
n_X0: Total number of quarantined
Note: n_S0 + n_I0 + n_R0 + n_X0 = Population
Internally, the model initial conditions are the ratios
S0 = n_S0/Population
I0 = n_I0/Population
R0 = n_R0/Population
X0 = n_X0/Population
which is consistent with the mathematical description of the SIR-X model.
If a list is used, the order of initial conditions is [n_S0, n_I0, n_R0, n_X0]
- Deprecated:
This function is deprecated and will be removed soon. Please use set_parameters() and set_initial_conds()
- Returns
Reference to self
- Return type
-
solve
(tf_days=7, numpoints=7)¶ Solve using children class model.
- Parameters
tf_days (int) – number of days to simulate
numpoints (int) – number of points for the simulation.
- Returns
Reference to self
- Return type
Model
-
property
t_inf_eff
¶ Returns effective infectious period
- Returns
- \[T_{I,eff} = (\beta + \kappa + \kappa_0)^{-1}\]
- Return type
float
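The derived SIR-X quantities documented for the r0, r0_eff, pcl, q_prob and t_inf_eff properties follow directly from the four rate parameters. The sketch below simply evaluates those formulas in one place; the function name and the parameter values are illustrative assumptions, and the code does not call the Open-SIR library.

```python
def sirx_indicators(alpha, beta, kappa_0, kappa):
    """Evaluate the SIR-X summary quantities from the rates (all in 1/day)."""
    t_inf_eff = 1.0 / (beta + kappa + kappa_0)  # effective infectious period
    return {
        "r0": alpha / beta,                      # basic reproduction number
        "t_inf_eff": t_inf_eff,
        "r0_eff": alpha * t_inf_eff,             # effective reproduction number
        "pcl": kappa_0 / (kappa_0 + kappa),      # public containment leverage
        "q_prob": (kappa_0 + kappa) / (beta + kappa_0 + kappa),  # quarantine probability
    }


# Illustrative rates (assumed values, not fitted to any data set).
indicators = sirx_indicators(alpha=0.8, beta=0.4, kappa_0=0.1, kappa=0.3)
for name, value in indicators.items():
    print(f"{name} = {value:.3f}")
```

With any positive \(\kappa\) or \(\kappa_0\), the effective infectious period is shorter than \(1/\beta\), so \(R_{0,eff}\) is smaller than \(R_0\).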