split_train_test
- protopipe.mva.split_train_test(survived_images, train_fraction, feature_name_list, target_name)
Split the data that survived the selection cuts into training and test samples.
If the estimator is a classifier, the data is split in a stratified fashion, using the target variable as the class labels.
- Parameters
- survived_images: `~pandas.DataFrame`
Images that survived the selection cuts.
- train_fraction: `float`
Fraction of data to be used for training.
- feature_name_list: `list`
List of variables to use for training the model.
- target_name: `str`
Variable against which to train.
- Returns
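- X_train: ~pandas.DataFrame
Feature columns of the training sample.
- X_test: ~pandas.DataFrame
Feature columns of the test sample.
- y_train: ~pandas.DataFrame
Target column of the training sample.
- y_test: ~pandas.DataFrame
Target column of the test sample.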
- data_train: ~pandas.DataFrame
Training data indexed by observation ID and event ID.
- data_test: ~pandas.DataFrame
Test data indexed by observation ID and event ID.
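The stratified split described above can be illustrated with a minimal pandas-only sketch. This is not protopipe's implementation: `stratified_split_sketch`, the feature names `width` and `length`, and the toy values are all hypothetical, and the real function may differ in shuffling, seeding, and return types. The sketch only shows how the six returned objects relate to one another when the target is a class label.

```python
import pandas as pd


def stratified_split_sketch(images, train_fraction, feature_name_list, target_name):
    """Hypothetical stand-in for illustration only, not protopipe's code."""
    # Draw the same fraction from every class so that class proportions
    # in the training sample match those of the full sample.
    data_train = images.groupby(target_name).sample(
        frac=train_fraction, random_state=0
    )
    # Every row not drawn for training goes to the test sample.
    data_test = images.loc[~images.index.isin(data_train.index)]
    # Separate feature columns (X) from the target column (y).
    X_train, y_train = data_train[feature_name_list], data_train[target_name]
    X_test, y_test = data_test[feature_name_list], data_test[target_name]
    return X_train, X_test, y_train, y_test, data_train, data_test


# Toy table indexed by observation ID and event ID (made-up values).
images = pd.DataFrame(
    {
        "obs_id": [1, 1, 1, 1, 2, 2, 2, 2],
        "event_id": [10, 11, 12, 13, 10, 11, 12, 13],
        "width": [0.10, 0.12, 0.30, 0.28, 0.11, 0.09, 0.33, 0.31],
        "length": [0.40, 0.42, 0.50, 0.55, 0.39, 0.41, 0.52, 0.58],
        "label": [1, 1, 0, 0, 1, 1, 0, 0],
    }
).set_index(["obs_id", "event_id"])

X_train, X_test, y_train, y_test, data_train, data_test = stratified_split_sketch(
    images, train_fraction=0.5, feature_name_list=["width", "length"], target_name="label"
)
```

With `train_fraction=0.5` and four rows per class, each class contributes two rows to the training sample and two to the test sample, so both samples keep the original class balance.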