New workflow: MapReduce #14
Closed
Labels
help wanted
Description
MapReduce Workflow
Goal: Create a simple map-reduce workflow for WEBS.
This will require finding a suitable map-reduce dataset, but we are open to pretty much any example. For instance, counting word occurrences in a text dataset would be sufficient.
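To make the word-count suggestion concrete, here is a minimal sketch of the map and reduce steps in plain Python. It is not tied to the WEBS workflow structure described in the guide. The function names (`map_words`, `reduce_counts`, `word_count`) and the in-memory sample chunks are hypothetical, chosen only for illustration.

```python
from collections import Counter
from functools import reduce


def map_words(chunk: str) -> Counter:
    """Map step: count the words in one chunk of text."""
    return Counter(chunk.lower().split())


def reduce_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: merge two partial counts into one."""
    return left + right


def word_count(chunks: list[str]) -> Counter:
    """Run the map step on every chunk, then fold the partial counts."""
    partial_counts = map(map_words, chunks)
    return reduce(reduce_counts, partial_counts, Counter())


if __name__ == "__main__":
    # Hypothetical in-memory stand-in for a real text dataset.
    chunks = ["the quick brown fox", "the lazy dog", "The end"]
    print(word_count(chunks).most_common(1))  # [('the', 3)]
```

In an actual workflow the map calls would presumably be submitted as independent tasks and the partial counts merged afterwards. The sequential `map` here only shows the shape of the computation.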
Steps
- Read the Creating Workflows in WEBS Guide.
- Rewrite the reference information using the structure described in the guide. At the end, you should be able to run `python -m webs.run mapreduce {args}` to run the workflow.
- Write up instructions for running the workflow as a module docstring in `webs/wf/mapreduce/__init__.py`. This should include:
  - Installation instructions if there are external dependencies.
  - Data download instructions.
  - Discussion of important parameters or results.
Tips
- Please reply to the issue with any questions.
- The WEBS CI requires 100% code coverage. It is okay to exclude the workflow from coverage by adding the following to `pyproject.toml`:

      [tool.coverage.run]
      ...
      omit = [
          "examples",
          "webs/wf/{workflow-dir}/",
      ]

- Python dependencies needed by the workflow can be included as an "extras" option in the `pyproject.toml`:

      [project.optional-dependencies]
      mapreduce = ["numpy", "pandas"]

  Non-Python dependencies will require adding documentation instructions in `webs/wf/{workflow-name}/__init__.py`.
- If the reference implementation includes a license, we will need to include that in `third_party_licenses/{workflow-name}-{licence-type}.md`.