TrainModel
- class protopipe.mva.TrainModel(case, feature_name_list, target_name=None)
Bases: object
Train a classification or regression model.
- Parameters
- case: str
Type of model to train: either `regressor` or `classifier`
- feature_name_list: list
List of features
- target_name: str, optional
Name of the regression target
Methods Summary
- get_optimal_model(init_model, ...[, refit, ...])
Get optimal hyperparameters for an estimator and return the best model.
- split_data(data_sig, train_fraction[, ...])
Load and split data to build train/test samples.
Methods Documentation
- get_optimal_model(init_model, tuned_parameters, scoring, cv, refit=True, verbose=2, njobs=1)
Get optimal hyperparameters for an estimator and return the best model.
The best parameters are obtained by performing an exhaustive search over specified parameter values.
- Parameters
- init_model: `sklearn.base.BaseEstimator`
Model to optimise
- tuned_parameters: dict
Parameter names mapped to the values to search over
- scoring: str
Scoring metric used to evaluate each parameter combination
- cv: int
Number of splits for cross-validation
- refit: bool, str, or callable, default=True
Refit the estimator using the best found parameters on the whole dataset.
- verbose: int
Controls the verbosity: the higher, the more messages.
- >1: the computation time for each fold and parameter candidate is displayed
- >2: the score is also displayed
- >3: the fold and candidate parameter indexes are also displayed, together with the starting time of the computation
- njobs: int
Number of jobs to run in parallel. -1 means using all processors.
- Returns
- best_estimator: `sklearn.base.BaseEstimator`
Best model
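To make "exhaustive search over specified parameter values" concrete, here is a minimal, library-free sketch of the idea. It is not the protopipe implementation: the helper `exhaustive_search`, the grid values, and `toy_score` are all hypothetical, and the toy score stands in for the cross-validated metric that `scoring` and `cv` would select.

```python
from itertools import product


def exhaustive_search(tuned_parameters, score_fn):
    """Score every combination in the grid and return the best one.

    Hypothetical helper for illustration only.
    """
    names = sorted(tuned_parameters)
    best_params, best_score = None, float("-inf")
    n_candidates = 0
    # The Cartesian product enumerates every parameter combination.
    for values in product(*(tuned_parameters[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        n_candidates += 1
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score, n_candidates


# Hypothetical grid: 3 x 2 = 6 candidate combinations.
tuned_parameters = {"n_estimators": [50, 100, 200], "max_depth": [3, 5]}


def toy_score(params):
    """Toy stand-in for a cross-validated score; higher is better."""
    return -abs(params["n_estimators"] - 100) - abs(params["max_depth"] - 5)


best_params, best_score, n_candidates = exhaustive_search(tuned_parameters, toy_score)
```

In a cross-validated search each candidate is fitted once per fold, so a grid like this one with `cv=5` would need 6 x 5 = 30 fits, plus one more if the best model is refitted on the whole dataset.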
- split_data(data_sig, train_fraction, data_bkg=None, force_same_nsig_nbkg=False)
Load and split data to build train/test samples.
- Parameters
- data_sig: `pandas.DataFrame`
Data frame of signal events
- train_fraction: float
Fraction of events to build the training sample
- data_bkg: `pandas.DataFrame`, optional
Data frame of background events
- force_same_nsig_nbkg: bool
If true, the same number of signal and background events is used to build the classifier
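The roles of `train_fraction` and `force_same_nsig_nbkg` can be sketched with plain Python lists. This is only an illustration under stated assumptions, not the protopipe code: the helpers `balance` and `split_rows` are hypothetical, rows are tuples rather than data frame rows, and the assumption that balancing keeps the first `min(nsig, nbkg)` events of each sample is mine.

```python
import random


def balance(sig_rows, bkg_rows):
    """Keep the same number of signal and background rows (assumed behaviour)."""
    n = min(len(sig_rows), len(bkg_rows))
    return sig_rows[:n], bkg_rows[:n]


def split_rows(rows, train_fraction, seed=0):
    """Shuffle rows and split them into train and test samples."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical samples: 100 signal and 60 background events.
sig = [("sig", i) for i in range(100)]
bkg = [("bkg", i) for i in range(60)]

# Equalise the sample sizes first, then split each sample 75% / 25%.
sig_bal, bkg_bal = balance(sig, bkg)
sig_train, sig_test = split_rows(sig_bal, 0.75)
bkg_train, bkg_test = split_rows(bkg_bal, 0.75)
```

With these inputs both samples are cut to 60 events, giving 45 training and 15 test events each.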