kedro-starters-sklearn

This repository provides the following preserved starter templates, updated for kedro==1.3.1:

`sklearn-iris` trains a logistic regression model using scikit-learn.
`sklearn-mlflow-iris` adds experiment tracking using MLflow.

Pipeline visualized by Kedro-viz

`sklearn-iris` template

Iris dataset

The Iris dataset is included and used by default.

Modification: setosa is encoded as 0, versicolor is encoded as 1, and virginica samples are removed.
Split: for each species, the first 25 samples are included in train.csv, and the last 25 samples are included in test.csv.
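
The modification and split described above can be sketched in plain Python. This is an illustration only, not the script used to produce the bundled files: `encode_and_split`, `SPECIES_TO_LABEL`, and the stand-in rows are hypothetical, and the ordering of rows inside train.csv and test.csv is an assumption.

```python
from collections import defaultdict

# Hypothetical label encoding taken from the description above;
# virginica has no entry because its samples are dropped.
SPECIES_TO_LABEL = {"setosa": 0, "versicolor": 1}


def encode_and_split(rows, n_train=25):
    """Return (train, test) lists from rows shaped like (features, species).

    For each kept species, the first ``n_train`` rows go to train and the
    last ``n_train`` rows go to test, preserving the input order.
    """
    by_species = defaultdict(list)
    for features, species in rows:
        if species in SPECIES_TO_LABEL:  # drops virginica and anything unknown
            by_species[species].append((features, SPECIES_TO_LABEL[species]))

    train, test = [], []
    for species in SPECIES_TO_LABEL:  # fixed order: setosa, then versicolor
        samples = by_species[species]
        train.extend(samples[:n_train])
        test.extend(samples[-n_train:])
    return train, test


# Stand-in data: 50 rows per species, with the row index as a dummy feature.
rows = [((i,), name) for name in ("setosa", "versicolor", "virginica") for i in range(50)]
train, test = encode_and_split(rows)
print(len(train), len(test))  # 50 50
```

With 50 samples per species, each file ends up with 50 rows: 25 labelled 0 and 25 labelled 1.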

How to use

Install dependencies.

```
pip install "kedro==1.3.1" pandas scikit-learn
```

Generate your Kedro starter project from the sklearn-iris directory.
```
kedro new --starter https://github.com/Minyus/kedro-starters-sklearn.git --directory sklearn-iris
```
As explained in the Kedro documentation, enter project_name, repo_name, and python_package.

Note: for your Python package name, choose a unique name and avoid names already used by another package, such as `test` or `sklearn`. You can list the importable packages by running `python -c "help('modules')"`.
Change the current directory to the generated project directory.
```
cd /path/to/project/directory
```
Install project dependencies and run the project.
```
pip install -r requirements.txt
kedro run
```
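
As a quicker alternative to scanning the full `help('modules')` listing mentioned in the note on package names, a candidate name can be checked directly. The sketch below uses only the standard library. `is_safe_package_name` and the candidate names are hypothetical, and the check only covers the Python environment it runs in.

```python
import importlib.util
import keyword


def is_safe_package_name(name):
    """Return True if ``name`` is a valid identifier that no importable module uses."""
    if not name.isidentifier() or keyword.iskeyword(name):
        return False
    # find_spec returns None when no top-level module of that name is importable.
    return importlib.util.find_spec(name) is None


for candidate in ("test", "my_iris_project_pkg"):
    print(candidate, is_safe_package_name(candidate))
```

Run it in the same environment where the project dependencies are installed, so that names such as `sklearn` are detected as taken.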

Option to use Kaggle Titanic dataset

Download the Kaggle Titanic dataset.
Replace train.csv and test.csv in the /path/to/project/directory/data/01_raw directory with the downloaded files.
Modify /path/to/project/directory/conf/base/parameters.yml to set the parameters appropriate for the dataset (they are commented out by default).

`sklearn-mlflow-iris` template

This template integrates MLflow into Kedro using PipelineX. Even without writing MLflow code, you can:

configure MLflow Tracking
log inputs and outputs of Python functions set up as Kedro nodes as parameters (for example, features used to train the model) and metrics (for example, F1 score)
log execution time for each Kedro node and dataset loading/saving as metrics
log artifacts such as models, execution time Gantt charts visualized by Plotly, and parameters.yml

In this template, MLflow logging is configured in Python code at src/<python_package>/hooks.py.

See here for details.
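
To illustrate the idea of recording a node's execution time as a metric without changing the node's own code, here is a standard-library sketch. It is not PipelineX's implementation and it does not call MLflow: `timed`, `metrics`, and `train_model` are hypothetical names, and a real hook would send the value to a tracking backend instead of a dictionary.

```python
import functools
import time

# Hypothetical in-memory store standing in for a tracking backend.
metrics = {}


def timed(func):
    """Wrap ``func`` so each call records its wall-clock duration in ``metrics``."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Recorded even if the node raises, so failed runs are still timed.
            metrics[f"time.{func.__name__}"] = time.perf_counter() - start

    return wrapper


@timed
def train_model(features):
    """Placeholder node: the body knows nothing about logging."""
    return sum(features) / len(features)


result = train_model([1.0, 2.0, 3.0])
print(result, metrics)
```

Keeping the logging in a wrapper is what lets the node functions stay free of tracking code, which is the role the hooks in hooks.py play in this template.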

How to use

Install dependencies.

```
pip install "kedro==1.3.1" pandas scikit-learn mlflow "pipelinex==0.8.0" plotly
```

Generate your Kedro starter project from the sklearn-mlflow-iris directory.

```
kedro new --starter https://github.com/Minyus/kedro-starters-sklearn.git --directory sklearn-mlflow-iris
```

Follow the same steps as the sklearn-iris template.

Access MLflow web UI

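To access the MLflow web UI, launch the MLflow server from the project directory with the following command, then open http://127.0.0.1:8080 in a browser.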

```
mlflow server --host 127.0.0.1 --port 8080 --backend-store-uri sqlite:///mlruns/sqlite.db --default-artifact-root ./mlruns
```

Logged metrics shown in MLflow's UI

Gantt chart for execution time, generated using Plotly, shown in MLflow's UI

Notes

Both starters preserve the repo's original Iris-focused examples.
The MLflow starter keeps the PipelineX-based hook integration, pinned to pipelinex==0.8.0.
