Ilya Ploshchik edited this page Apr 14, 2022 · 4 revisions

Welcome to the 2dv50e wiki!

*** Data

4 datasets are provided:

  1. Heart Disease
  2. Breast Cancer Wisconsin (Diagnostic)
  3. Pima Indian Diabetes
  4. Vehicle Silhouettes

Each dataset includes the following files:

  • dataset.csv - original CSV file with all respective features
  • target.csv - CSV file with the target class of each instance
  • topModels.csv - top 55 models (5 models per base learning algorithm)

Each row represents one model, with its model_id, algorithm id, all calculated metrics, and overall performance. Overall performance is a single value: the average of all 8 metrics. The "params" column lists the hyperparameters used for that particular model.

  • topModelsProbabilities.csv - CSV file with the predicted class probabilities for all 55 top models

Each row represents the class probabilities per instance of the target variable for every model.
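
As a rough illustration of how the overall performance column relates to the metric columns in topModels.csv, the sketch below recomputes it as a plain average. The wiki does not list the actual column headers, so every column name here (algorithm_id, accuracy, f1, overall_performance) is an assumption, and the two-row sample is made up for illustration with only 2 metrics instead of 8.

```python
import csv
import io
from statistics import mean

# Hypothetical column names: the wiki does not list the real headers of
# topModels.csv, so adjust this set to match the actual file.
NON_METRIC_COLUMNS = {"model_id", "algorithm_id", "params", "overall_performance"}


def read_models(handle):
    """Parse a topModels-style CSV into a list of dicts, one per model."""
    return list(csv.DictReader(handle))


def overall_performance(row):
    """Return the unweighted mean of every metric column in one model row."""
    metrics = [float(v) for k, v in row.items() if k not in NON_METRIC_COLUMNS]
    return mean(metrics)


# Made-up sample for illustration only; the real file has 8 metric columns.
sample = io.StringIO(
    "model_id,algorithm_id,accuracy,f1,overall_performance,params\n"
    "1,1,0.80,0.70,0.75,\"{'n_neighbors': 5}\"\n"
    "2,1,0.90,0.80,0.85,\"{'n_neighbors': 7}\"\n"
)
models = read_models(sample)
recomputed = [overall_performance(m) for m in models]
```

To try this on a real dataset, replace the in-memory sample with `open("topModels.csv", newline="")` and update NON_METRIC_COLUMNS so that only the 8 metric columns are averaged.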
