@robertlcx robertlcx commented Jul 9, 2024

Description

Usage

The workflow needs a dataset config secret. The make run target already adds one:

{
    "url": "https://gist.githubusercontent.com/netj/8836201/raw/6f9306ad21398ea43cba4f7d537619d0e07d5ae3/iris.csv",
    "name": "iris.data",
    "feature_columns": ["sepal.length", "sepal.width", "petal.length", "petal.width"],
    "target_column": "variety"
}

The config can technically point to any dataset, as long as it is used to train and deploy classifiers.
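To make the shape of the config concrete, here is a minimal sketch of how a task might validate it and split a CSV into features and target. The helper names (parse_dataset_config, split_features_and_target) and the two-row sample are hypothetical and are not taken from this PR; the sketch also assumes every feature column is numeric and skips the download step.

```python
import csv
import io
import json

# The four keys used by the config shown above.
REQUIRED_KEYS = ("url", "name", "feature_columns", "target_column")


def parse_dataset_config(raw: str) -> dict:
    """Parse the secret's JSON and check that the expected keys are present."""
    config = json.loads(raw)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"dataset config is missing keys: {missing}")
    if not config["feature_columns"]:
        raise ValueError("feature_columns must list at least one column")
    return config


def split_features_and_target(csv_text: str, config: dict):
    """Return (X, y): numeric feature rows and the matching target labels."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    X = [[float(row[col]) for col in config["feature_columns"]] for row in rows]
    y = [row[config["target_column"]] for row in rows]
    return X, y


# Hypothetical two-row sample with the same header as the iris CSV.
SAMPLE_CSV = (
    '"sepal.length","sepal.width","petal.length","petal.width","variety"\n'
    '5.1,3.5,1.4,.2,"Setosa"\n'
    '7,3.2,4.7,1.4,"Versicolor"\n'
)

RAW_CONFIG = """
{
    "url": "https://gist.githubusercontent.com/netj/8836201/raw/6f9306ad21398ea43cba4f7d537619d0e07d5ae3/iris.csv",
    "name": "iris.data",
    "feature_columns": ["sepal.length", "sepal.width", "petal.length", "petal.width"],
    "target_column": "variety"
}
"""

config = parse_dataset_config(RAW_CONFIG)
X, y = split_features_and_target(SAMPLE_CSV, config)
```

Swapping in another dataset would then only mean changing the four config values, provided the target column holds class labels.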

Considerations

  1. The DAG uses Airflow's local executor for ease of use. A follow-up, more complex DAG that uses only SageMaker will be added.
  2. The workflow can accept any dataset that has a target column suitable for classification.
  3. The workflow fans out across 3 algorithms: LogisticRegression, KNeighborsClassifier, DecisionTreeClassifier. The model with the best accuracy wins.
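The fan-out and selection in point 3 can be sketched outside Airflow as follows. This is an illustration under stated assumptions, not the PR's DAG code: it trains the three scikit-learn classifiers one after another in a single process rather than as separate tasks, uses scikit-learn's bundled iris data instead of the configured URL, and the 75/25 split, the seed, and max_iter are assumed values. The helper names are hypothetical.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def train_and_score(model, X_train, X_test, y_train, y_test) -> float:
    """Fit one candidate and return its accuracy on the held-out split."""
    model.fit(X_train, y_train)
    return float(accuracy_score(y_test, model.predict(X_test)))


def pick_best(X, y, seed: int = 0):
    """Train each candidate on the same split and return (best_name, scores)."""
    # Assumed split: 75% train, 25% test, stratified by class.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    # The three algorithms named in the PR; hyperparameters are assumptions.
    candidates = {
        "LogisticRegression": LogisticRegression(max_iter=1000),
        "KNeighborsClassifier": KNeighborsClassifier(),
        "DecisionTreeClassifier": DecisionTreeClassifier(random_state=seed),
    }
    scores = {
        name: train_and_score(model, X_train, X_test, y_train, y_test)
        for name, model in candidates.items()
    }
    # On a tie, max() keeps the first candidate in insertion order.
    best_name = max(scores, key=scores.get)
    return best_name, scores


X, y = load_iris(return_X_y=True)
best_name, scores = pick_best(X, y)
print(best_name, scores)
```

Scoring every candidate on the same held-out split keeps the accuracy comparison fair; in a DAG, each entry of candidates would map to its own task, with a final task doing the max() step.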

Running it

make start
make run
make stop

@robertlcx robertlcx added the enhancement (New feature or request) label Jul 9, 2024
@robertlcx robertlcx self-assigned this Jul 9, 2024
@robertlcx robertlcx marked this pull request as ready for review July 11, 2024 18:25