This repository was archived by the owner on Feb 7, 2026. It is now read-only.

feat: add table.createInsertStream for native streaming inserts #997

Merged: steffnay merged 21 commits into googleapis:main from steffnay:createInsertStream on Jan 17, 2022

@steffnay commented on Aug 27, 2021

This feature adds the ability to use a native Duplex stream for inserting rows into BigQuery via the /insertAll endpoint and reading the API response. Implements batching of rows via RowBatch and RowQueue classes.

Adds:

  • Table.createInsertStream()
  • RowQueue
  • RowBatch

Checklist:

  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #506 🦕
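
For readers unfamiliar with the batching idea described above, here is a minimal, self-contained sketch of how rows might be grouped into batches before being sent. It is not the code from this PR: `SketchRowBatch`, `SketchRowQueue`, `maxRows`, `maxBytes`, and the `send` callback are hypothetical names chosen for illustration, and row size is approximated by the length of the JSON encoding rather than a true byte count.

```typescript
// Hypothetical sketch only; not the RowBatch/RowQueue classes added by this PR.
type Row = Record<string, unknown>;

interface BatchLimits {
  maxRows: number; // assumed option name
  maxBytes: number; // assumed option name
}

class SketchRowBatch {
  readonly rows: Row[] = [];
  private bytes = 0;

  constructor(private readonly limits: BatchLimits) {}

  // Approximates a row's size by the length of its JSON encoding.
  static sizeOf(row: Row): number {
    return JSON.stringify(row).length;
  }

  // True if adding the row keeps the batch within both limits.
  canFit(row: Row): boolean {
    return (
      this.rows.length < this.limits.maxRows &&
      this.bytes + SketchRowBatch.sizeOf(row) <= this.limits.maxBytes
    );
  }

  add(row: Row): void {
    this.rows.push(row);
    this.bytes += SketchRowBatch.sizeOf(row);
  }
}

class SketchRowQueue {
  private batch: SketchRowBatch;

  constructor(
    private readonly limits: BatchLimits,
    private readonly send: (rows: Row[]) => void
  ) {
    this.batch = new SketchRowBatch(limits);
  }

  // Flushes the current batch first if the new row would not fit.
  // A single oversized row is still accepted into an empty batch.
  add(row: Row): void {
    if (!this.batch.canFit(row)) {
      this.flush();
    }
    this.batch.add(row);
  }

  // Hands the pending rows to `send` and starts a fresh batch.
  flush(): void {
    if (this.batch.rows.length === 0) return;
    const rows = this.batch.rows;
    this.batch = new SketchRowBatch(this.limits);
    this.send(rows);
  }
}

// Usage: with maxRows = 2, three rows are sent as two batches.
const sent: Row[][] = [];
const queue = new SketchRowQueue({maxRows: 2, maxBytes: 1024}, rows => {
  sent.push(rows);
});
queue.add({id: 1});
queue.add({id: 2});
queue.add({id: 3});
queue.flush();
```

In a real stream, `send` would presumably issue the insert request and surface the API response to the readable side; here it only records the batches so the grouping is easy to inspect.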

@product-auto-label (bot) added the "api: bigquery" label (Issues related to the googleapis/nodejs-bigquery API.) on Aug 27, 2021
@google-cla (bot) added the "cla: yes" label (This human has signed the Contributor License Agreement.) on Aug 27, 2021
@steffnay requested review from feywind and tswast on August 27, 2021 21:36
@steffnay marked this pull request as ready for review on August 30, 2021 17:22
@steffnay requested review from a team on August 30, 2021 17:22
@steffnay added the "owlbot:run" label (Add this label to trigger the Owlbot post processor.) on Oct 12, 2021
@gcf-owl-bot (bot) removed the "owlbot:run" label on Oct 12, 2021
Comment thread src/rowQueue.ts
maxOutstandingBytes: 1 * 1024 * 1024,

// The maximum time we'll wait to send batched rows, in milliseconds.
maxDelayMillis: 10000,

Contributor

How was 10 seconds chosen?

Contributor Author

Just based off similar batching of messages in nodejs-pubsub, do you have an idea of a timeframe that is better aligned for BigQuery inserts?

Contributor

Not really. I think it has more to do with the customer's requirements than BigQuery limitations. Since it's configurable, aligning the default with pub/sub makes sense to me.
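
Since the thread settles on "it's configurable", a short sketch of what overriding the defaults might look like may help. The option names `maxOutstandingBytes` and `maxDelayMillis` and their values are taken from the snippet quoted in this thread (they may differ from the merged code); `SketchBatchOptions`, `DEFAULT_BATCH_OPTIONS`, and `resolveBatchOptions` are hypothetical names used only for this illustration.

```typescript
// Hypothetical sketch of merging caller overrides with batching defaults.
interface SketchBatchOptions {
  // Maximum number of bytes to buffer before sending.
  maxOutstandingBytes: number;
  // Maximum time to wait before sending batched rows, in milliseconds.
  maxDelayMillis: number;
}

// Values as shown in the review snippet above.
const DEFAULT_BATCH_OPTIONS: SketchBatchOptions = {
  maxOutstandingBytes: 1 * 1024 * 1024,
  maxDelayMillis: 10000,
};

// Any field the caller leaves out (or sets to undefined) falls back to its default.
function resolveBatchOptions(
  overrides: Partial<SketchBatchOptions> = {}
): SketchBatchOptions {
  return {
    maxOutstandingBytes:
      overrides.maxOutstandingBytes ?? DEFAULT_BATCH_OPTIONS.maxOutstandingBytes,
    maxDelayMillis:
      overrides.maxDelayMillis ?? DEFAULT_BATCH_OPTIONS.maxDelayMillis,
  };
}

// A latency-sensitive caller shortens the delay and keeps the byte limit.
const lowLatency = resolveBatchOptions({maxDelayMillis: 500});
```

Using `??` rather than object spread means an explicit `undefined` from the caller does not clobber a default.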

Contributor

@feywind left a comment

Sorry for the flood of comments ^^; I'm happy to talk over any of them and/or help fix some of the `any` comments if you like.

@steffnay requested a review from tswast on January 2, 2022 03:03
@steffnay requested a review from feywind on January 2, 2022 03:03
@steffnay added the "owlbot:run" label on Jan 3, 2022
@gcf-owl-bot (bot) removed the "owlbot:run" label on Jan 3, 2022

Contributor

@feywind left a comment

I can't remember what all I commented on now, but overall it looks good :)

Comment thread src/rowQueue.ts
const opts = typeof options === 'object' ? options : {};

if (opts.insertRowsOptions) {
this.insertRowsOptions = opts.insertRowsOptions;

Comment thread src/rowQueue.ts
};
}),
// eslint-disable-next-line @typescript-eslint/no-explicit-any
row: rows[(insertError as any).index],

Contributor

Sounds good :)
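
The snippet in this thread reads `index` through an `any` cast. As a sketch of one way to avoid the cast, the error entries can be given a small interface before they are paired with the rows that failed. The field names `index` and `errors` follow the documented `insertErrors` shape of the insertAll response; `InsertErrorEntry`, `RowFailure`, and `pairErrorsWithRows` are hypothetical names for this example, not identifiers from the PR.

```typescript
// Hypothetical sketch: pairing insert errors with the rows they refer to.
interface InsertErrorEntry {
  // Position of the failed row within the submitted batch.
  index: number;
  errors: Array<{reason?: string; message?: string}>;
}

interface RowFailure<T> {
  row: T;
  errors: Array<{reason?: string; message?: string}>;
}

function pairErrorsWithRows<T>(
  rows: T[],
  insertErrors: InsertErrorEntry[]
): RowFailure<T>[] {
  const failures: RowFailure<T>[] = [];
  for (const entry of insertErrors) {
    const row = rows[entry.index];
    // Skip entries whose index does not point at a submitted row.
    if (row !== undefined) {
      failures.push({row, errors: entry.errors});
    }
  }
  return failures;
}

// Usage: only the second row failed.
const submitted = [{name: 'a'}, {name: 'b'}, {name: 'c'}];
const failures = pairErrorsWithRows(submitted, [
  {index: 1, errors: [{reason: 'invalid', message: 'no such field'}]},
]);
```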

@steffnay added the "owlbot:run" label on Jan 17, 2022
@gcf-owl-bot (bot) removed the "owlbot:run" label on Jan 17, 2022
@steffnay merged commit 0ffe544 into googleapis:main on Jan 17, 2022

Development

Successfully merging this pull request may close these issues.

Implement writable streams for natively streaming inserts
