gtime.forecasting.GARFF

class gtime.forecasting.GARFF(estimator)

Generalized Auto Regression model with feedforward training. This model is a wrapper of sklearn.multioutput.RegressorChain but returns a pd.DataFrame.

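Fits one model for each target column in the y matrix. Each model after the first also takes the outputs of the preceding steps as input: the true values during fitting, and the predictions of the previous models during prediction.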

Parameters:
estimator : estimator object, required

The model used to make the predictions step by step. It must be a regressor object, such as one derived from RegressorMixin.

Notes

The order, cv and random_state parameters of sklearn.multioutput.RegressorChain are set to None, because the order of the targets matters in a time-series forecasting context.
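
As a rough illustration of the wrapping described here, the sketch below uses sklearn.multioutput.RegressorChain directly, with order, cv and random_state left at None, and then labels the raw array of predictions as a pd.DataFrame. It is not the GARFF source code: the choice of LinearRegression and the way the index and columns of the result are labelled are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.multioutput import RegressorChain

rng = np.random.default_rng(0)
time_index = pd.date_range("2020-01-01", periods=30)
X = pd.DataFrame(rng.random((30, 5)), index=time_index)
y = pd.DataFrame(rng.random((30, 3)), index=time_index,
                 columns=["y_1", "y_2", "y_3"])

X_train, y_train = X.iloc[:20], y.iloc[:20]
X_test = X.iloc[20:]

# order=None keeps the targets in column order (y_1, then y_2, then y_3);
# cv=None trains each step on the true values of the earlier steps.
chain = RegressorChain(LinearRegression(), order=None, cv=None,
                       random_state=None)
chain.fit(X_train.to_numpy(), y_train.to_numpy())

# RegressorChain returns a plain array; wrapping it in a DataFrame is the
# extra step the wrapper is described as adding (labelling is assumed here).
raw = chain.predict(X_test.to_numpy())
predictions = pd.DataFrame(raw, index=X_test.index, columns=y.columns)
```

With order=None the chain follows the column order of y, so the first column is treated as the first forecast step.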

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from gtime.forecasting import GARFF
>>> from sklearn.ensemble import RandomForestRegressor
>>> time_index = pd.date_range("2020-01-01", "2020-01-30")
>>> X = pd.DataFrame(np.random.random((30, 5)), index=time_index)
>>> y_columns = ["y_1", "y_2", "y_3"]
>>> y = pd.DataFrame(np.random.random((30, 3)), index=time_index, columns=y_columns)
>>> X_train, y_train = X[:20], y[:20]
>>> X_test, y_test = X[20:], y[20:]
>>> random_forest = RandomForestRegressor()
>>> garff = GARFF(estimator=random_forest)
>>> garff.fit(X_train, y_train)
>>> predictions = garff.predict(X_test)
>>> predictions.shape
(10, 3)

Methods

fit(self, X, y) Fit the models, one for each time step.
get_params(self[, deep]) Get parameters for this estimator.
predict(self, X) For each row in X, make a prediction for each fitted model, from 1 to horizon.
score(self, X, y[, sample_weight]) Return the coefficient of determination R^2 of the prediction.
set_params(self, **params) Set the parameters of this estimator.
__init__(self, estimator)

Initialize self. See help(type(self)) for accurate signature.

fit(self, X: pandas.core.frame.DataFrame, y: pandas.core.frame.DataFrame)

Fit the models, one for each time step. Each model is trained on the initial set of features and on the true values of the previous steps.

Parameters:
X : pd.DataFrame, shape (n_samples, n_features), required

The data.

y : pd.DataFrame, shape (n_samples, horizon), required

The matrix containing the target variables.

Returns:
self : object

The fitted object.
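
The training scheme described for fit can be sketched with plain NumPy. This is a minimal illustration under stated assumptions, not the library's implementation: the helper names (fit_linear, predict_linear, chain_fit, chain_predict) are hypothetical, and an ordinary least-squares model stands in for the estimator. The prediction loop is included only to show the matching feed-forward step.

```python
import numpy as np

def fit_linear(features, target):
    # Least-squares fit with an intercept column appended (hypothetical helper).
    design = np.hstack([features, np.ones((features.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

def predict_linear(coef, features):
    design = np.hstack([features, np.ones((features.shape[0], 1))])
    return design @ coef

def chain_fit(X, Y):
    models = []
    for step in range(Y.shape[1]):
        # Features for this step: X plus the TRUE targets of all earlier steps.
        augmented = np.hstack([X, Y[:, :step]])
        models.append(fit_linear(augmented, Y[:, step]))
    return models

def chain_predict(models, X):
    preds = np.empty((X.shape[0], len(models)))
    for step, coef in enumerate(models):
        # At prediction time, earlier steps are only available as predictions.
        augmented = np.hstack([X, preds[:, :step]])
        preds[:, step] = predict_linear(coef, augmented)
    return preds

rng = np.random.default_rng(0)
X = rng.random((20, 5))   # shape (n_samples, n_features)
Y = rng.random((20, 3))   # shape (n_samples, horizon)

models = chain_fit(X, Y)
# Each model sees 5 features, plus one column per earlier step, plus intercept.
coef_sizes = [coef.shape[0] for coef in models]
Y_hat = chain_predict(models, X[:4])
```

The growing coefficient vectors show why target order matters: the model for step k depends on every step before it.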

get_params(self, deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : mapping of string to any

Parameter names mapped to their values.

predict(self, X: pandas.core.frame.DataFrame) → pandas.core.frame.DataFrame

For each row in X, make a prediction for each fitted model, from 1 to horizon.

Parameters:
X : pd.DataFrame, shape (n_samples, n_features)

The data.

Returns:
y_p_df : pd.DataFrame, shape (n_samples, horizon)

The predictions, one for each timestep in horizon.

score(self, X, y, sample_weight=None)

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and the score can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.

Parameters:
X : array-like of shape (n_samples, n_features)

Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

y : array-like of shape (n_samples,) or (n_samples, n_outputs)

True values for X.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns:
score : float

R^2 of self.predict(X) wrt. y.

Notes

The R^2 score used when calling score on a regressor uses multioutput='uniform_average' from scikit-learn version 0.23 onward, for consistency with r2_score. This influences the score method of all the multioutput regressors (except for MultiOutputRegressor). To specify the default value manually and avoid the warning, either call r2_score directly or make a custom scorer with make_scorer (the built-in scorer 'r2' uses multioutput='uniform_average').
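
The definition of R^2 and the uniform averaging over outputs can be checked by hand. The sketch below is a small NumPy illustration of the formula (1 - u/v) applied per output column and then averaged; the function names r2_per_output and r2_uniform_average are hypothetical, and sample weights are ignored for simplicity.

```python
import numpy as np

def r2_per_output(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # u: residual sum of squares, v: total sum of squares, one per column.
    u = ((y_true - y_pred) ** 2).sum(axis=0)
    v = ((y_true - y_true.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - u / v

def r2_uniform_average(y_true, y_pred):
    # Every output column contributes equally to the final score.
    return float(np.mean(r2_per_output(y_true, y_pred)))

y_true = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])

# Perfect predictions give the best possible score.
perfect = r2_uniform_average(y_true, y_true)

# Always predicting the column means gives u == v, hence a score of zero.
column_means = np.tile(y_true.mean(axis=0), (y_true.shape[0], 1))
constant = r2_uniform_average(y_true, column_means)

# Predictions in reversed order are worse than the constant model.
worse = r2_uniform_average(y_true, y_true[::-1])
```

Here perfect is 1.0, constant is 0.0, and worse is negative, matching the three cases named in the description of score.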

set_params(self, **params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : object

Estimator instance.
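
The <component>__<parameter> convention is easiest to see on a nested scikit-learn object. The sketch below uses a Pipeline rather than GARFF, so the step names "scale" and "ridge" are chosen for the example only; the parameter names that GARFF itself exposes are not shown here and should be read from its get_params() output.

```python
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipeline = Pipeline([("scale", StandardScaler()), ("ridge", Ridge(alpha=1.0))])

# Update a parameter of the nested "ridge" step; set_params returns the
# estimator itself, so calls can be chained.
returned = pipeline.set_params(ridge__alpha=0.5)

# deep=True also lists the parameters of contained estimators,
# deep=False lists only the constructor parameters of the outer object.
deep_params = pipeline.get_params(deep=True)
shallow_params = pipeline.get_params(deep=False)
```

After the call, deep_params["ridge__alpha"] is 0.5, while shallow_params contains no "ridge__alpha" entry.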