Allow easy paging for list operations #895

@jgeewax

Taking the comment on https://github.com/GoogleCloudPlatform/gcloud-python/pull/889/files#r31042158 over here.

Requiring explicit paging for topics is really ugly. There must be a better way than:

all_topics = []

topics, token = client.list_topics()
all_topics.extend(topics)
while token is not None:
  topics, token = client.list_topics(page_token=token)
  all_topics.extend(topics)

If our concern is "people could make a boatload of requests without realizing it", we could always limit this, like we do with recursive deletion in storage:

topics = client.list_topics(limit=100)

We can do a bunch of this with the page_size parameter, but that might mean that I have to wait for everything to come back before starting any work, which seems kind of ridiculous.

It'd be really nice if both limit and page_size were available, so it's easy to do things like "I want the first 5000 topics, and I want to pull them from the server in chunks of 50":

for topic in client.list_topics(page_size=50, limit=5000):
  push_work_to_other_system(topic)
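
One way the page_size/limit combination could work under the hood is a generator that wraps the existing "page, token" call. This is only a sketch: `iterate_pages`, `fetch_page`, and `make_fake_fetch` are hypothetical names, and the fake backend (with offsets as tokens) stands in for the real API.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# A page is (items, next_page_token); a token of None means "no more pages".
Page = Tuple[List[str], Optional[str]]


def iterate_pages(
    fetch_page: Callable[[Optional[str], Optional[int]], Page],
    page_size: Optional[int] = None,
    limit: Optional[int] = None,
    page_token: Optional[str] = None,
) -> Iterator[str]:
    """Yield items one at a time, fetching the next page only when needed."""
    yielded = 0
    while limit is None or yielded < limit:
        items, page_token = fetch_page(page_token, page_size)
        for item in items:
            if limit is not None and yielded >= limit:
                return  # limit reached mid-page; make no further requests
            yield item
            yielded += 1
        if page_token is None:
            return  # the backend has no more pages


def make_fake_fetch(names: List[str], calls: List[Optional[str]]):
    """Hypothetical paged backend; tokens are stringified list offsets."""
    def fetch_page(token: Optional[str], page_size: Optional[int]) -> Page:
        calls.append(token)  # record each request so the laziness is visible
        start = int(token) if token is not None else 0
        end = start + (page_size or 3)
        next_token = str(end) if end < len(names) else None
        return names[start:end], next_token
    return fetch_page
```

Because the generator is lazy, the caller can start working on the first item as soon as the first page arrives, and no request is made beyond the page that satisfies the limit.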

To add a bit more context, I'd like to toss out: what if we made all of our list operations return iterators?

The use cases I see here are...

  1. I want everything, give it all to me (for topic in list_topics())
  2. I want up to N things, stop giving them to me then (for topic in list_topics(limit=100))
  3. I don't know how many I want, I'll know when I want to stop though... (for topic in list_topics(): if topic.name == 'foo': break)
  4. Combination of the previous two (I don't know when I want to stop, but don't let me go on forever, kill it at say... 1000)
  5. I want to pick up where I left off, I saved a token somewhere (sort of like offset)! (for topic in list_topics(page_token=token, limit=100))
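
To illustrate why iterators make cases 2 through 4 cheap: once a list call is a plain Python iterator, the standard library already covers them. In this sketch `list_topics` is a hypothetical stand-in that yields names from memory, and `find_topic` is an illustrative helper, not part of any existing API.

```python
import itertools
from typing import Iterable, Iterator, Optional


def list_topics() -> Iterator[str]:
    """Hypothetical stand-in: lazily yields topic names."""
    for i in range(10_000):
        yield f"topic-{i}"


def find_topic(topics: Iterable[str], wanted: str) -> Optional[str]:
    """Case 3: stop as soon as the caller's condition is met."""
    for topic in topics:
        if topic == wanted:
            return topic
    return None


# Case 2: up to N things, then stop.
first_100 = list(itertools.islice(list_topics(), 100))

# Case 4: search until found, but never look at more than 1000 items.
found = find_topic(itertools.islice(list_topics(), 1000), "topic-42")
missing = find_topic(itertools.islice(list_topics(), 1000), "topic-5000")
```

A built-in limit argument would be a convenience on top of this, since `itertools.islice` gives the same cap for any iterator.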

The "let's just always return page, page_token" thing doesn't really make all of those use-cases all that fun... But if we always return iterators, they are all easy.

Further, let's say I have a weird case where I just want one page worth of stuff... list_topics().get_current_page() could return what you want, no?
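
A rough sketch of what an iterator with page-level access might look like, assuming a `fetch_page(token)` callable that returns `(items, next_token)`. The names `PagedIterator`, `next_page_token`, and `fake_fetch` are made up for illustration; only `get_current_page()` comes from the suggestion above.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Page = Tuple[List[str], Optional[str]]


class PagedIterator:
    """Iterates over items across pages, but still exposes the current page."""

    def __init__(self, fetch_page: Callable[[Optional[str]], Page],
                 page_token: Optional[str] = None) -> None:
        self._fetch_page = fetch_page
        self.next_page_token = page_token  # a saved token lets a caller resume
        self._current_page: Optional[List[str]] = None
        self._exhausted = False

    def _load_next_page(self) -> bool:
        """Fetch one more page; return False if there was nothing left."""
        if self._exhausted:
            return False
        self._current_page, self.next_page_token = self._fetch_page(
            self.next_page_token)
        self._exhausted = self.next_page_token is None
        return True

    def get_current_page(self) -> List[str]:
        """Return one page worth of items, fetching the first page if needed."""
        if self._current_page is None:
            self._load_next_page()
        return list(self._current_page or [])

    def __iter__(self) -> Iterator[str]:
        # Simplification: iteration starts at the beginning of the current page.
        if self._current_page is None and not self._load_next_page():
            return
        while True:
            yield from self._current_page
            if not self._load_next_page():
                return


def fake_fetch(token: Optional[str]) -> Page:
    """Hypothetical backend: 7 topics, 3 per page, offsets as tokens."""
    names = [f"topic-{i}" for i in range(7)]
    start = int(token) if token is not None else 0
    end = start + 3
    return names[start:end], (str(end) if end < len(names) else None)
```

Usage would look like `PagedIterator(fake_fetch).get_current_page()` for the one-page case, or `PagedIterator(fake_fetch, page_token=saved)` to pick up where a previous run left off.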

Labels

api: datastore - Issues related to the Datastore API.
api: pubsub - Issues related to the Pub/Sub API.
api: storage - Issues related to the Cloud Storage API.
type: feature request - ‘Nice-to-have’ improvement, new feature or different behavior or design.
