gtime.model_selection.FeatureSplitter

class gtime.model_selection.FeatureSplitter(drop_na_mode: str = 'any')

Splits the feature matrices X and y into X_train, y_train, X_test, y_test.

X and y are the feature matrices obtained from the FeatureCreation class.

Parameters
drop_na_mode : str, optional, default: 'any'

How to drop the NaN values contained in the X and y matrices. Only 'any' is currently supported.

Examples

>>> import pandas as pd
>>> import numpy as np
>>> from gtime.model_selection import FeatureSplitter
>>> X = pd.DataFrame.from_dict({"feature_0": [np.nan, 0, 1, 2, 3, 4, 5, 6, 7, 8],
...                             "feature_1": [np.nan, np.nan, 0.5, 1.5, 2.5, 3.5,
...                                            4.5, 5.5, 6.5, 7.5, ]
...                            })
>>> y = pd.DataFrame.from_dict({"y_0": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
...                             "y_1": [1, 2, 3, 4, 5, 6, 7, 8, 9, np.nan],
...                             "y_2": [2, 3, 4, 5, 6, 7, 8, 9, np.nan, np.nan]
...                            })
>>> feature_splitter = FeatureSplitter()
>>> X_train, y_train, X_test, y_test = feature_splitter.transform(X, y)
>>> X_train
   feature_0  feature_1
2        1.0        0.5
3        2.0        1.5
4        3.0        2.5
5        4.0        3.5
6        5.0        4.5
7        6.0        5.5
>>> y_train
   y_0  y_1  y_2
2    2  3.0  4.0
3    3  4.0  5.0
4    4  5.0  6.0
5    5  6.0  7.0
6    6  7.0  8.0
7    7  8.0  9.0
>>> X_test
   feature_0  feature_1
8        7.0        6.5
9        8.0        7.5
>>> y_test
   y_0  y_1  y_2
8    8  9.0  NaN
9    9  NaN  NaN
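
The output above suggests the following rule: rows of X containing any NaN are dropped, and the remaining rows go to the test set when the matching row of y has at least one NaN, otherwise to the train set. The snippet below is a minimal pandas-only sketch of that rule, inferred from the example rather than taken from the library source. The helper name `split_by_missing_values` is hypothetical and is not part of gtime.

```python
import numpy as np
import pandas as pd


def split_by_missing_values(X, y):
    """Hypothetical helper (not part of gtime) mimicking the split shown above."""
    # Assumption: drop every row of X that contains at least one NaN.
    X_complete = X.dropna(axis=0, how="any")
    # Keep only the rows of y that survive in X.
    y_aligned = y.loc[X_complete.index]
    # Assumption: a row belongs to the test set when any target is still unknown.
    has_missing_target = y_aligned.isna().any(axis=1)
    X_train = X_complete.loc[~has_missing_target]
    y_train = y_aligned.loc[~has_missing_target]
    X_test = X_complete.loc[has_missing_target]
    y_test = y_aligned.loc[has_missing_target]
    return X_train, y_train, X_test, y_test


# Same data as in the example above.
X = pd.DataFrame({
    "feature_0": [np.nan, 0, 1, 2, 3, 4, 5, 6, 7, 8],
    "feature_1": [np.nan, np.nan, 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5],
})
y = pd.DataFrame({
    "y_0": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    "y_1": [1, 2, 3, 4, 5, 6, 7, 8, 9, np.nan],
    "y_2": [2, 3, 4, 5, 6, 7, 8, 9, np.nan, np.nan],
})

X_train, y_train, X_test, y_test = split_by_missing_values(X, y)
print(list(X_train.index))  # [2, 3, 4, 5, 6, 7]
print(list(X_test.index))   # [8, 9]
```

With this data the sketch reproduces the row indices shown in the example. Whether FeatureSplitter follows the same steps internally is an assumption.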

Methods

transform(self, X, y)

Split the feature matrices X and y into X_train, y_train, X_test, y_test.

__init__(self, drop_na_mode: str = 'any')

Initialize self. See help(type(self)) for accurate signature.

transform(self, X: pd.DataFrame, y: pd.DataFrame) -> Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame, pd.DataFrame]

Split the feature matrices X and y into X_train, y_train, X_test, y_test.

X and y are the feature matrices obtained from the FeatureCreation class.

Parameters
X : pd.DataFrame, shape (n_samples, n_features), required

The feature matrix.

y : pd.DataFrame, shape (n_samples, horizon), required

The y matrix.

Returns
X_train, y_train, X_test, y_test : Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame, pd.DataFrame]

The X and y matrices, split between train and test.