gtime.model_selection.FeatureSplitter
class gtime.model_selection.FeatureSplitter(drop_na_mode: str = 'any')
Splits the feature matrices X and y into X_train, y_train, X_test, y_test.
X and y are the feature matrices obtained from the FeatureCreation class.
- Parameters
  - drop_na_mode : str, optional, default: 'any'
    How to drop the NaN values contained in the X and y matrices. Only 'any' is supported for the moment.
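The page does not define 'any' further. Assuming it carries the same meaning as the how="any" option of pandas DataFrame.dropna (an assumption, not something stated here), the following pandas-only sketch contrasts it with how="all" on a small, made-up frame:

```python
import numpy as np
import pandas as pd

# Illustrative data only; not taken from gtime.
frame = pd.DataFrame({"feature_0": [np.nan, 0.0, 1.0],
                      "feature_1": [np.nan, np.nan, 0.5]})

# how="any": a row is dropped as soon as one of its values is NaN.
rows_kept_any = frame.dropna(axis=0, how="any")

# how="all": a row is dropped only when every value in it is NaN.
rows_kept_all = frame.dropna(axis=0, how="all")

print(list(rows_kept_any.index))  # [2]
print(list(rows_kept_all.index))  # [1, 2]
```

Under that reading, 'any' is the stricter mode: a single missing value is enough to exclude a row.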
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from gtime.model_selection import FeatureSplitter
>>> X = pd.DataFrame.from_dict({"feature_0": [np.nan, 0, 1, 2, 3, 4, 5, 6, 7, 8],
...                             "feature_1": [np.nan, np.nan, 0.5, 1.5, 2.5, 3.5,
...                                           4.5, 5.5, 6.5, 7.5]
...                             })
>>> y = pd.DataFrame.from_dict({"y_0": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
...                             "y_1": [1, 2, 3, 4, 5, 6, 7, 8, 9, np.nan],
...                             "y_2": [2, 3, 4, 5, 6, 7, 8, 9, np.nan, np.nan]
...                             })
>>> feature_splitter = FeatureSplitter()
>>> X_train, y_train, X_test, y_test = feature_splitter.transform(X, y)
>>> X_train
   feature_0  feature_1
2        1.0        0.5
3        2.0        1.5
4        3.0        2.5
5        4.0        3.5
6        5.0        4.5
7        6.0        5.5
>>> y_train
   y_0  y_1  y_2
2    2  3.0  4.0
3    3  4.0  5.0
4    4  5.0  6.0
5    5  6.0  7.0
6    6  7.0  8.0
7    7  8.0  9.0
>>> X_test
   feature_0  feature_1
8        7.0        6.5
9        8.0        7.5
>>> y_test
   y_0  y_1  y_2
8    8  9.0  NaN
9    9  NaN  NaN
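The example output suggests a two-step rule: rows where X contains NaN are discarded, and among the remaining rows, those where y contains NaN form the test set. The sketch below reproduces that behaviour with pandas alone. It is inferred from the example and is not taken from the gtime source; the function name split_on_nan_sketch is hypothetical.

```python
import numpy as np
import pandas as pd


def split_on_nan_sketch(X: pd.DataFrame, y: pd.DataFrame):
    """Hypothetical helper mimicking the split shown in the example."""
    # Step 1: discard rows of X with at least one NaN, and keep y aligned.
    X_kept = X.dropna(axis=0, how="any")
    y_kept = y.loc[X_kept.index]

    # Step 2: rows whose targets are incomplete go to the test set.
    incomplete = y_kept.isna().any(axis=1)
    X_train, y_train = X_kept.loc[~incomplete], y_kept.loc[~incomplete]
    X_test, y_test = X_kept.loc[incomplete], y_kept.loc[incomplete]
    return X_train, y_train, X_test, y_test


# Same data as in the example above.
X = pd.DataFrame({"feature_0": [np.nan, 0, 1, 2, 3, 4, 5, 6, 7, 8],
                  "feature_1": [np.nan, np.nan, 0.5, 1.5, 2.5, 3.5,
                                4.5, 5.5, 6.5, 7.5]})
y = pd.DataFrame({"y_0": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
                  "y_1": [1, 2, 3, 4, 5, 6, 7, 8, 9, np.nan],
                  "y_2": [2, 3, 4, 5, 6, 7, 8, 9, np.nan, np.nan]})

X_train, y_train, X_test, y_test = split_on_nan_sketch(X, y)
print(list(X_train.index))  # [2, 3, 4, 5, 6, 7]
print(list(X_test.index))   # [8, 9]
```

With this reading, the test set holds the most recent rows, whose future target values are not yet known.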
Methods
transform(self, X, y): Split the feature matrices X and y into X_train, y_train, X_test, y_test.
__init__(self, drop_na_mode: str = 'any')
Initialize self. See help(type(self)) for accurate signature.
transform(self, X: pd.DataFrame, y: pd.DataFrame) -> (pd.DataFrame, pd.DataFrame, pd.DataFrame, pd.DataFrame)
Split the feature matrices X and y into X_train, y_train, X_test, y_test.
X and y are the feature matrices obtained from the FeatureCreation class.
- Parameters
  - X : pd.DataFrame, shape (n_samples, n_features), required
    The feature matrix.
  - y : pd.DataFrame, shape (n_samples, horizon), required
    The y (target) matrix.
- Returns
  - X_train, y_train, X_test, y_test : Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame, pd.DataFrame]
    X and y, split into train and test sets.