outerbounds/hamilton-metaflow

Installation

git clone https://github.com/outerbounds/hamilton-metaflow.git
cd ./hamilton-metaflow/absenteeism

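This flow depends on Metaflow's integration with Anaconda (conda). If you do not have conda installed, you can install it with the Miniconda installer. For example, after downloading the installer for macOS on x86_64: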

bash Miniconda3-latest-MacOSX-x86_64.sh

Note about visualizing Hamilton flow

The featurize_and_split step in flow.py is configured to visualize the Hamilton DAG. By default, it assumes that graphviz is installed at the system level. If you do not have graphviz, or do not want to use it, set the graphviz_flag parameter defined in flow.py to false. To visualize the Hamilton DAGs in Metaflow cards, install graphviz at the system level:

  • on macOS: brew install graphviz
  • on Debian/Ubuntu: sudo apt-get install graphviz
  • More alternatives are discussed here.
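Before running the flow, a quick check for the system graphviz binary can help you decide whether to leave graphviz_flag enabled. The following is a minimal sketch and not part of the repository: it assumes a system-level graphviz install exposes the `dot` executable on your PATH, and the helper name `has_system_graphviz` is hypothetical.

```python
import shutil


def has_system_graphviz(executable: str = "dot") -> bool:
    """Return True if the given graphviz executable is found on PATH."""
    # shutil.which returns the full path to the executable, or None if absent.
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if has_system_graphviz():
        print("graphviz found on PATH.")
    else:
        print("graphviz not found; install it or set graphviz_flag to false.")
```

This only checks that the binary is discoverable; it does not verify that the Python graphviz bindings inside the conda environment can use it.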

Run the flow

Run the flow:

python ./flow.py --environment=conda run

Note that this creates a single conda environment in which every step of the flow runs. The first run may take several minutes to fetch all dependencies from conda; later runs bootstrap much faster as long as the conda dependencies are unchanged.

After configuring your AWS credentials, you can run any step on AWS Batch using Metaflow decorators.

This is one benefit of using Metaflow's @conda or @conda_base decorators: Metaflow automatically packages the conda environment and uses it to bootstrap the compute environment on remote machines.
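One way to send every step to AWS Batch without editing flow.py is Metaflow's `--with` option on the `run` command, which attaches a decorator to all steps at launch time. The sketch below only builds and prints such a command. The helper name `build_run_command` is hypothetical, and actually running the printed command assumes that AWS credentials and Metaflow's Batch configuration are already in place.

```python
from typing import List


def build_run_command(on_batch: bool = False, flow_file: str = "./flow.py") -> List[str]:
    """Build the argv for running the flow, optionally on AWS Batch."""
    cmd = ["python", flow_file, "--environment=conda", "run"]
    if on_batch:
        # "--with batch" applies the batch decorator to every step of the run.
        cmd += ["--with", "batch"]
    return cmd


if __name__ == "__main__":
    # Print the command rather than executing it, so nothing is launched by accident.
    print(" ".join(build_run_command(on_batch=True)))
```

To run only selected steps remotely, decorating those steps in flow.py is the alternative the text above refers to.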

Inspecting Results

This flow creates several Metaflow cards, each associated with a step of the flow. You can view them locally in the browser by running:

python ./flow.py --environment=conda card view <step name>

For example, the start step displays class label distribution plots:

python ./flow.py --environment=conda card view start

You can see the Hamilton visualizations with:

python ./flow.py --environment=conda card view featurize_and_split
python ./flow.py --environment=conda card view feature_importance_merge
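To open several cards in one go, the commands above can be scripted. This is a minimal sketch that assumes it is run from the directory containing flow.py after a run has completed. The helper names `card_view_command` and `view_all_cards` are hypothetical, and the step names are the ones mentioned in this README.

```python
import subprocess
from typing import List

# Steps whose cards are mentioned in this README.
STEPS_WITH_CARDS = ["start", "featurize_and_split", "feature_importance_merge"]


def card_view_command(step_name: str, flow_file: str = "./flow.py") -> List[str]:
    """Build the argv that opens the card for a single step."""
    return ["python", flow_file, "--environment=conda", "card", "view", step_name]


def view_all_cards() -> None:
    """Open each card in turn; raises CalledProcessError if a command fails."""
    for step_name in STEPS_WITH_CARDS:
        subprocess.run(card_view_command(step_name), check=True)
```

Calling `view_all_cards()` runs the same three commands shown above, one after another.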

In a script or notebook, you can also inspect the scores of each modeling step as a pandas DataFrame:

from metaflow import Flow
run = Flow('FeatureSelectionAndClassification').latest_run
run.data.results

Reference

About

using hamilton + metaflow together