
Allow to stream compressed data into BigQuery #2811

@xgalen

Description


Hi all,

To save on data transfer (egress) costs, we would like to stream the data to BigQuery compressed with gzip.

I have experimented with this and it works when requesting the API directly. Example (values omitted for simplicity):

#!/bin/bash

...

# insertAll expects valid JSON, so the payload uses (escaped) double quotes.
OBJECT="{\"kind\": \"bigquery#tableDataInsertAllRequest\", \"skipInvalidRows\": true, \"ignoreUnknownValues\": true, \"rows\": $ROWS}"

echo "$OBJECT" | gzip -cf > compressed.gz

curl -v -H "Authorization: Bearer $ACCESS_TOKEN" \
     -H "Content-Type: application/json" \
     -H "Content-Encoding: gzip" \
     --data-binary @compressed.gz \
"https://www.googleapis.com/bigquery/v2/projects/$GOOGLE_CLOUD_PROJECT/datasets/$DATASET_ID/tables/$TABLE_ID/insertAll"

But I couldn't find where to set the header to change the content encoding to gzip. I know it's an option for responses and for storing files ( https://github.com/googleapis/nodejs-bigquery/blob/fbe4f93f9167e904ee9d0beaf27f0d0b9da6d7bb/src/table.js#L728 ), but not for requests.

Would it be possible to add a new setting that enables compression? In that case, decompressing would be the server's responsibility, not this module's. I think it would be worthwhile :)

Of course, I could help if needed.

Thanks!

Alfredo
