This repository was archived by the owner on Mar 6, 2026. It is now read-only.
Environment details
Using the google-cloud-bigquery client with version 1.23.1
Python 3.7 (on linux and macos)
Steps to reproduce
Calling client.list_rows with both max_results and start_index pulls the wrong data when
the client needs to fetch more than one page.
The client then issues a second call with both 'nextPageToken' and 'startIndex', which seem to be incompatible.
Code example
import logging

# get_client() and BATCH_SIZE_ROWS are defined elsewhere in the reporter's code.
def table_to_df_iterator(project_id, dataset_id, table_id) -> iter:
    table_full_id = project_id + "." + dataset_id + "." + table_id
    client = get_client()
    index = 0
    while True:
        offset = BATCH_SIZE_ROWS * index
        df = client.list_rows(table_full_id, max_results=BATCH_SIZE_ROWS,
                              start_index=offset).to_dataframe()
        if df.empty:
            break
        logging.info(f"Offset is at {offset} got a dataframe of size {len(df.index)}")
        yield df
        index += 1
Trace
Idea to fix
Make the second call use an updated startIndex instead of 'nextPageToken'
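To illustrate the idea, here is a minimal sketch of paging by start index only. It does not touch the real client: `iter_rows_by_start_index`, `fake_fetch_page`, `TABLE` and the `page_size` parameter are all hypothetical names invented for this example, and the in-memory list merely stands in for the rows endpoint. The point is only that each follow-up request carries an updated start index and no page token.

```python
from typing import Callable, Dict, Iterator, List

Row = Dict[str, int]
# Hypothetical callable: (start_index, max_results) -> list of rows.
FetchPage = Callable[[int, int], List[Row]]


def iter_rows_by_start_index(
    fetch_page: FetchPage, start_index: int, max_results: int, page_size: int
) -> Iterator[Row]:
    """Yield up to max_results rows, advancing the start index on every request."""
    remaining = max_results
    next_index = start_index
    while remaining > 0:
        # Ask for one page, never more than what is still needed.
        page = fetch_page(next_index, min(page_size, remaining))
        if not page:
            break  # past the end of the table
        yield from page
        # The next request starts where this page ended; no page token involved.
        next_index += len(page)
        remaining -= len(page)


# Hypothetical in-memory stand-in for the table, 25 rows long.
TABLE: List[Row] = [{"n": i} for i in range(25)]


def fake_fetch_page(start_index: int, max_results: int) -> List[Row]:
    return TABLE[start_index:start_index + max_results]


rows = list(
    iter_rows_by_start_index(fake_fetch_page, start_index=10, max_results=10, page_size=4)
)
tail = list(
    iter_rows_by_start_index(fake_fetch_page, start_index=20, max_results=10, page_size=4)
)
```

With a page size of 4, the first call spans three requests (start indexes 10, 14 and 18) and still returns exactly rows 10 through 19; the second call stops early once the table runs out.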
Thanks!