Hi there,
This is just an issue that I had with the docs which confused me a bit.
I think that I have figured out the best route going forward for myself, but the docs were a bit of a red herring for me and made me take longer to realise the next step than I should have.
My use case is a decent volume of data rows coming in (1000 / hour) during some processing jobs that I'd like to push to a BigQuery table one at a time, as they happen (not batched).
In the docs for table.insert():
insert(rows, options, callback) returns Promise
Stream data into BigQuery one record at a time without running a load job.
There are more strict quota limits using this method so it is highly recommended that you load data into BigQuery using Table#load instead.
Here it says that there are "more strict quota limits" on using this method and that we should use table.load() instead.
So I went to look at https://cloud.google.com/bigquery/quotas#load_jobs, which tells me that load jobs are limited to 1,000 per day. That's far below my use case, unless I build a whole process to pre-store the data somewhere and then batch it into BigQuery at a later stage (undesired).
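For context, the batching workaround I was hoping to avoid would look roughly like this (a sketch only; the dataset/table names and buffer path are hypothetical, not from the docs):

```js
const fs = require('fs');
const {BigQuery} = require('@google-cloud/bigquery');

// Hypothetical dataset/table names.
const table = new BigQuery().dataset('my_dataset').table('events');

// Append each incoming row to a newline-delimited JSON buffer as it arrives.
function bufferRow(row) {
  fs.appendFileSync('/tmp/events.ndjson', JSON.stringify(row) + '\n');
}

// Periodically flush the buffer with a load job. Every call here counts
// against the load-job quota, which is what makes this route a poor fit
// for per-event inserts.
async function flushBuffer() {
  await table.load('/tmp/events.ndjson', {
    sourceFormat: 'NEWLINE_DELIMITED_JSON',
  });
  fs.unlinkSync('/tmp/events.ndjson');
}
```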
What's also confusing on that quotas page is that it says:
The limits also apply to load jobs submitted programmatically by using the load-type jobs.insert API method.
This made me think that even table.insert() is subject to these limits.
On further investigation I found that different limits apply to streaming inserts (https://cloud.google.com/bigquery/quotas#streaming_inserts), which I assume are the limits that apply to this library's table.insert() too?
So, where I've landed is that I should be using the original method, table.insert(), which is governed by those streaming limits instead.
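For what it's worth, here's a minimal sketch of that one-row-at-a-time usage (again, the dataset/table names and row shape are placeholders for my actual setup):

```js
const {BigQuery} = require('@google-cloud/bigquery');

// Hypothetical dataset/table names.
const table = new BigQuery().dataset('my_dataset').table('events');

// Table#insert streams rows via the tabledata.insertAll API, so it
// falls under the streaming quotas rather than the load-job quotas.
async function pushRow(row) {
  await table.insert(row);
}

pushRow({jobId: 'abc123', processedAt: new Date().toISOString()})
  .catch(console.error);
```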
I think it would be very helpful for future users if the docs were a little clearer about which limits apply in which situations, since depending on the situation those limits can become irrelevant (streaming allows far more insert calls but fewer rows per request; load jobs handle huge row counts but allow far fewer jobs). The way it's currently worded leans only towards the latter.
It took me much longer than I care to admit (thinking of ways to divert and batch these events with table.load()) to realise that I had landed on the correct method the first time.