Home


Ilya Ploshchik edited this page Apr 14, 2022 · 4 revisions

Welcome to the 2dv50e wiki!

Data

Four datasets are provided:

Heart Disease
Breast Cancer Wisconsin (Diagnostic)
Pima Indian Diabetes
Vehicle Silhouettes

Each dataset includes the following files:

dataset.csv - original csv file with all respective features
target.csv - csv file with target class instances
topModels.csv - top 55 models (5 models per base learning algorithm)

Each instance (row) in topModels.csv represents one model, with its model_id, algorithm id, all calculated metrics, and overall performance. Overall performance is a single value, the average of all 8 metrics. The "params" column lists the hyperparameters used for that particular model.
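As a rough illustration of how the overall performance value could be reproduced from one row, here is a minimal sketch. It assumes an unweighted mean. This page does not list the 8 metric names, so `metric_1` to `metric_8` and the values in `example_row` are hypothetical placeholders, not the real column headers or data.

```python
from statistics import mean

# Hypothetical column names: the wiki states there are 8 metrics
# but does not name them, so replace these with the real headers.
METRIC_COLUMNS = [f"metric_{i}" for i in range(1, 9)]


def overall_performance(row, metric_columns=METRIC_COLUMNS):
    """Return the unweighted mean of the metric values in one model row."""
    return mean(float(row[name]) for name in metric_columns)


# Made-up row, shaped like a csv.DictReader record (all values are strings).
example_values = ["0.80", "0.82", "0.78", "0.90", "0.85", "0.75", "0.88", "0.82"]
example_row = {"model_id": "1", **dict(zip(METRIC_COLUMNS, example_values))}

score = overall_performance(example_row)
```

If the metrics in the real file are weighted or scaled differently, the result will not match the stored overall performance column.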

topModelsProbabilities.csv - csv file with predicted class probabilities for all 55 best models

Each row represents the class probabilities per instance of the target variable for every model.
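The four files of one dataset could be read with a short helper like the sketch below. The file names are the ones listed on this page. The folder layout (one folder per dataset) and the helper name `load_dataset_files` are assumptions made for illustration. Rows are returned as plain lists of strings because this page does not say whether the files have header rows.

```python
import csv
from pathlib import Path

# File names as listed on this page.
DATASET_FILES = (
    "dataset.csv",
    "target.csv",
    "topModels.csv",
    "topModelsProbabilities.csv",
)


def load_dataset_files(folder):
    """Read each listed CSV in `folder` into a list of rows (lists of strings)."""
    tables = {}
    for name in DATASET_FILES:
        # newline="" is the setting the csv module recommends for file objects.
        with open(Path(folder) / name, newline="") as handle:
            tables[name] = list(csv.reader(handle))
    return tables
```

Calling `load_dataset_files(path_to_one_dataset_folder)` returns a dictionary keyed by file name, so `tables["topModels.csv"]` holds the model rows described above.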
