auto-ml Titanic example with datmo (CLI)

CLI

1. Kaggle Titanic Competition

a. Introduction

This tutorial uses Kaggle's "Getting Started" Titanic competition to showcase how Auto-ML can be used along with datmo to track our work and make machine learning experiments reproducible and reusable. Part of the data analysis is inspired by this kernel.

b. Installation

To use datmo, install it with pip install datmo. If you have trouble installing, check the prerequisites listed here.

To run the experimentation.ipynb file, use the $ datmo notebook command. Datmo will automatically recreate the environment the experiment was run in.

home:~/datmo-tutorials/auto-ml$ datmo notebook

c. Solution

After the installation, we run the experimentation.ipynb notebook and perform the following steps:

Exploratory Data Analysis (EDA)
Data Cleaning
Using Auto-ML to figure out the best algorithm and hyperparameters
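
As a rough illustration of the data cleaning step, the sketch below derives some of the engineered features that appear in the snapshot configs later in this README (FamilySize, IsAlone, FarePerPerson, Title). The formulas are common conventions for the Titanic dataset and are an assumption here, not necessarily what experimentation.ipynb does. The helper names extract_title and add_features are hypothetical, and the sample row is only an illustrative passenger record.

```python
import re


def extract_title(name):
    """Return the honorific between the comma and the first period, e.g. 'Mr'."""
    match = re.search(r",\s*([^.]+)\.", name)
    return match.group(1).strip() if match else "Unknown"


def add_features(passenger):
    """Return a copy of a passenger record with engineered features added.

    Assumed conventions (not confirmed by the notebook):
      FamilySize    = SibSp + Parch + 1 (the passenger counts as one)
      IsAlone       = 1 when FamilySize is 1, else 0
      FarePerPerson = Fare / FamilySize
    """
    row = dict(passenger)
    row["FamilySize"] = row["SibSp"] + row["Parch"] + 1
    row["IsAlone"] = 1 if row["FamilySize"] == 1 else 0
    row["FarePerPerson"] = row["Fare"] / row["FamilySize"]
    row["Title"] = extract_title(row["Name"])
    return row


# Illustrative record using the Kaggle column names Name, SibSp, Parch, Fare.
sample = {"Name": "Braund, Mr. Owen Harris", "SibSp": 1, "Parch": 0, "Fare": 7.25}
features = add_features(sample)
for key in ("FamilySize", "IsAlone", "FarePerPerson", "Title"):
    print(key, features[key])
```

In the notebook these transformations would more likely be applied to a whole DataFrame; plain dictionaries are used here only to keep the sketch dependency-free.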

For auto-ml, we use TPOT, which uses genetic programming to search over candidate machine learning pipelines and their hyperparameters.
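
To give a feel for the kind of search an evolutionary Auto-ML tool performs, here is a deliberately tiny toy example. It is not TPOT and does not use TPOT's API: TPOT evolves whole pipelines, while this sketch only evolves a pair of hyperparameters. The fitness function is a hypothetical stand-in for cross-validated accuracy, and all names (fitness, mutate, evolve) are made up for illustration.

```python
import random


def fitness(config):
    """Hypothetical score that peaks at depth=5, trees=100 (stand-in for CV accuracy)."""
    depth, trees = config
    return 1.0 - 0.01 * abs(depth - 5) - 0.001 * abs(trees - 100)


def mutate(config, rng):
    """Randomly nudge each hyperparameter, keeping it within a valid range."""
    depth, trees = config
    depth = max(1, depth + rng.choice([-1, 0, 1]))
    trees = max(10, trees + rng.choice([-10, 0, 10]))
    return (depth, trees)


def evolve(generations=20, population_size=10, seed=0):
    """Keep the better half of each generation and refill it with mutated copies."""
    rng = random.Random(seed)
    population = [
        (rng.randint(1, 10), rng.randrange(10, 210, 10))
        for _ in range(population_size)
    ]
    history = []  # best score seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        survivors = population[: population_size // 2]
        children = [
            mutate(rng.choice(survivors), rng)
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + children
    best = max(population, key=fitness)
    return best, history


best, history = evolve()
print("best config:", best, "score:", round(fitness(best), 3))
```

Because the best candidates always survive into the next generation, the best score never decreases from one generation to the next. A real Auto-ML run replaces the toy fitness function with model training and validation, which is why it takes much longer.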

d. Creating versions (snapshots)

During EDA, data cleaning, and algorithm selection, we use datmo to version our work by creating datmo snapshots. As shown below, we created three snapshots by the end of the notebook tutorial: one after EDA, and two more from different versions of our model. More information about the flow can be found in the notebook file.

home:~/datmo-tutorials/auto-ml$ datmo snapshot create -m "auto-ml-2"
Creating a new snapshot
Created snapshot with id: adf76fa7d0800cc6eec033d4b00f97536bcb0c20
home:~/datmo-tutorials/auto-ml$ datmo snapshot ls
+---------+-------------+-------------------------------------------+-----------------+---------------+-------+
|    id   |  created at |                 config                    |   stats         |      message  | label |
+---------+-------------+-------------------------------------------+-----------------+---------------+-------+
| adf76fa7|  2018-05-16 | {u'selected features': [u'Sex', u'Pclass',|{u'accuracy':    |   auto-ml-2   |  None |
|         |   03:04:06  |  u'Age', u'Fare', u'Embarked', u'Title',  |   0.8295964}    |               |       |
|         |             |   u'FarePerPerson', u'FamilySize']}       |                 |               |       |
| 30f8366b|  2018-05-16 | {u'selected features': [u'Sex', u'Pclass',|{u'accuracy':    |   auto-ml-1   |  None |
|         |   01:24:53  |  u'Age', u'Fare', u'Embarked',            |   0.8206278}    |               |       |
|         |             |   u'Fare', u'IsAlone', u'Title']}         |                 |               |       |
| 30803662|  2018-05-15 | {u'features analyzed': [u'Sex',           |    {}           |     EDA       |  None |
|         |   23:15:44  |  u'Pclass', u'FamilySize', u'IsAlone',    |                 |               |       |
|         |             |  u'Embarked', u'Fare', u'Age', u'Title']} |                 |               |       |
+---------+-------------+-------------------------------------------+-----------------+---------------+-------+
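
Once several snapshots carry a stats entry, a small script can help decide which one to return to. The sketch below copies the ids, messages, and accuracy values by hand from the listing above and picks the snapshot with the highest accuracy. It does not call datmo; it only builds the checkout command as a list of arguments, and the helper name best_snapshot is hypothetical.

```python
# Records transcribed by hand from the `datmo snapshot ls` output above.
snapshots = [
    {"id": "adf76fa7", "message": "auto-ml-2", "stats": {"accuracy": 0.8295964}},
    {"id": "30f8366b", "message": "auto-ml-1", "stats": {"accuracy": 0.8206278}},
    {"id": "30803662", "message": "EDA", "stats": {}},
]


def best_snapshot(records, metric="accuracy"):
    """Return the record with the highest value for `metric`.

    Snapshots that do not report the metric (such as the EDA one) are skipped.
    """
    scored = [r for r in records if metric in r["stats"]]
    if not scored:
        raise ValueError(f"no snapshot reports {metric!r}")
    return max(scored, key=lambda r: r["stats"][metric])


best = best_snapshot(snapshots)
# Argument list for the checkout command; it is printed here, not executed.
checkout_cmd = ["datmo", "snapshot", "checkout", best["id"]]
print(" ".join(checkout_cmd))
```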

After creating snapshots, we can check out any of our previous states with the following command:

home:~/datmo-tutorials/auto-ml$ datmo snapshot checkout adf76fa7

This restores all files present at the time of snapshot creation, so we can edit them, run new tasks, or re-run code as we please.
