Home
Ilya Ploshchik edited this page Apr 14, 2022 · 4 revisions
Welcome to the 2dv50e wiki!
Data

Four datasets are provided:
- Heart Disease
- Breast Cancer Wisconsin (Diagnostic)
- Pima Indian Diabetes
- Vehicle Silhouettes
Each dataset includes the following files:
- dataset.csv - original CSV file with all respective features
- target.csv - CSV file with the target class of each instance
- topModels.csv - top 55 models (5 models per base learning algorithm, i.e. 11 algorithms)
Each row represents one model, with its model_id, algorithm id, all calculated metrics, and overall performance. Overall performance is calculated as a single average of all 8 metrics. The "params" column lists the hyperparameters used for that particular model.
- topModelsProbabilities.csv - CSV file with class predictions for all 55 best models
Each row contains the class probabilities for one instance of the target variable, for every model.
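
The description of topModels.csv can be checked with a short script. The following is a minimal sketch, not the project's own code: the column names `algorithm_id` and `metric_1` to `metric_8` are placeholders (the wiki does not list the real headers), and the sample rows are made up purely for illustration. It recomputes overall performance as the plain average of the 8 metric columns and counts models per algorithm.

```python
import csv
import io
from collections import Counter
from statistics import mean

# Hypothetical column names: the real headers in topModels.csv may differ.
METRIC_COLUMNS = [f"metric_{i}" for i in range(1, 9)]  # placeholders for the 8 metrics


def overall_performance(row, metric_columns=METRIC_COLUMNS):
    """Return the plain average of the metric columns in one model row."""
    return mean(float(row[name]) for name in metric_columns)


def models_per_algorithm(rows, algorithm_column="algorithm_id"):
    """Count how many model rows belong to each algorithm id."""
    return Counter(row[algorithm_column] for row in rows)


# Tiny made-up sample standing in for topModels.csv (values are illustrative only).
sample = io.StringIO(
    "model_id,algorithm_id," + ",".join(METRIC_COLUMNS) + "\n"
    "1,1,0.5,0.5,0.5,0.5,1.0,1.0,1.0,1.0\n"
    "2,1,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0\n"
    "3,2,0.0,0.0,0.0,0.0,0.5,0.5,0.5,0.5\n"
)
rows = list(csv.DictReader(sample))

# Recompute the overall performance of every model and count models per algorithm.
scores = {row["model_id"]: overall_performance(row) for row in rows}
counts = models_per_algorithm(rows)
```

To run this against a real file, replace the in-memory sample with `open("topModels.csv", newline="")` and substitute the actual column headers. If the file matches the description above, `counts` should hold 5 models for every algorithm id and `len(rows)` should be 55.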