
Technique example: data loader, Python to parquet #1422

Merged: allisonhorst merged 14 commits into main from allison/py-parquet-loader on Jul 15, 2024

Conversation

@allisonhorst (Contributor) requested review from Fil and mbostock on June 3, 2024, 13:48
@Fil (Contributor) commented Jun 6, 2024

can we link to https://arrow.apache.org/docs/python/generated/pyarrow.parquet.write_table.html explicitly mentioning that there are many options (and recommend compression); and maybe show compression in action?

Comment thread examples/loader-python-to-parquet/README.md Outdated
Comment thread examples/loader-python-to-parquet/src/index.md Outdated
Comment thread examples/loader-python-to-parquet/src/index.md Outdated
@allisonhorst (Contributor, Author)

@Fil I added the compression codec explicitly in the loader (compression="snappy") and included a sentence pointing to the write_table docs and the different compression algorithms. Look okay?

@mbostock (Member) left a comment

Mind copying the new virtual environment pattern from #1468?

<div class="note">

To run this data loader, you’ll need python3 and the geopandas, matplotlib, io, and sys modules installed and available on your `$PATH`.

</div>

<div class="tip">

We recommend using a [Python virtual environment](https://observablehq.com/framework/loaders#venv), such as with venv or uv, and managing required packages via `requirements.txt` rather than installing them globally.

</div>

@jaanli commented Jun 15, 2024

Quick question - would dbt work here?

Comment on lines +66 to +69
Plot.barX(dams,
Plot.groupY({x: "count"}, {y: "Primary Purpose", fill: "Hazard Potential Classification", sort: {y: "x", reverse: true}
})
)
Member:

Please prettier this. 🙏

Also, you can use sort: {y: "-x"} to shorten.

Contributor, Author:

Yep will do!

@allisonhorst merged commit 7dcae8b into main on Jul 15, 2024
@allisonhorst deleted the allison/py-parquet-loader branch on July 15, 2024, 14:45

```js echo
-Inputs.table(dams)
+Inputs.table(dams);
```
Member:

This prettier edit will prevent the table from displaying.

@allisonhorst (Contributor, Author), Jul 15, 2024:

I think this was correctly updated in a subsequent commit (I turned prettier off after formatting, to keep the semicolon after FileAttachment but remove it from the Inputs.table and Plot code):

https://observablehq.observablehq.cloud/framework-example-loader-python-to-parquet/

Member:

Right you are, thanks!
