This repository was archived by the owner on Mar 20, 2018. It is now read-only.

Support sized pages in Page Streaming #86

@tbetbetbe

Early feedback suggests that, in addition to simple iteration, page-streamed calls should also support a batch mode in which the user iterates over results in batches.

For example, page streaming currently works like this:

>>> from google.pubsub.v1.publisher_api import PublisherApi
>>> api = PublisherApi()
>>> topic = api.topic_path('google.com:pubsub-demo', 'my-topic')
>>> all_subs = list(api.list_topic_subscriptions(topic))
>>> all_subs
['sub_1', 'sub_2', 'sub_3', 'sub_2']

There should be a way to support batching:

>>> from google.pubsub.v1.publisher_api import PublisherApi
>>> api = PublisherApi()
>>> topic = api.topic_path('google.com:pubsub-demo', 'my-topic')
>>> all_subs = list(??? with batchsize=2)
>>> all_subs
[('sub_1', 'sub_2'), ('sub_3', 'sub_2')]

Proposals

  1. Add a helper to google.gax to support batch iteration. Users import it and use it when they need it.
  2. Add a helper to google.gax as in 1. Also add a batch_size field to CallOptions, which users set to have results returned in batches.

Add a helper to google.gax to support batch iteration

>>> from google.pubsub.v1.publisher_api import PublisherApi
>>> from google.gax import batch_iter
>>> api = PublisherApi()
>>> topic, batch_size = api.topic_path('google.com:pubsub-demo', 'my-topic'), 2
>>> all_subs = list(batch_iter(api.list_topic_subscriptions(topic), batch_size))
>>> all_subs
[('sub_1', 'sub_2'), ('sub_3', 'sub_2')]

Pros

  • the generated code stays simple; no changes to the generator are required
  • the new batch_iter helper is not difficult to implement and is reminiscent of the helper functions in the itertools library

Cons

  • users have more of the gax-python surface to learn; batch_iter is a new function they must discover and import.
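
For discussion, here is a minimal sketch of what such a helper could look like. It is not an existing google.gax function; the name batch_iter and its signature come from the proposal above, and the choices to yield tuples and to allow a shorter final batch are assumptions.

```python
from itertools import islice


def batch_iter(iterable, batch_size):
    """Yield tuples of up to batch_size consecutive items from iterable.

    Hypothetical helper sketched from the proposal; not part of google.gax.
    The final tuple may be shorter when the items do not divide evenly.
    """
    if batch_size < 1:
        raise ValueError('batch_size must be at least 1')
    iterator = iter(iterable)
    while True:
        # islice consumes at most batch_size items from the shared iterator.
        batch = tuple(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# A plain list stands in for the page-streamed iterable returned by the API.
subs = ['sub_1', 'sub_2', 'sub_3', 'sub_2']
batches = list(batch_iter(subs, 2))
```

Because the helper only relies on the iterator protocol, it would work unchanged on any page-streamed call and pulls items lazily rather than fetching every page up front.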

Add a helper to google.gax, also add batch_size to CallOptions

>>> from google.pubsub.v1.publisher_api import PublisherApi
>>> from google.gax import CallOptions
>>> api = PublisherApi()
>>> topic, batch_size = api.topic_path('google.com:pubsub-demo', 'my-topic'), 2
>>> all_subs = list(api.list_topic_subscriptions(topic,
...    options=CallOptions(batch_size=batch_size)))
>>> all_subs
[('sub_1', 'sub_2'), ('sub_3', 'sub_2')]

Pros

  • keeps the change to the surface the user needs to know about to a minimum; i.e., to use this feature, the user only needs to know about one additional field, which we will document in CallOptions

Cons

  • the generated code for methods that support page-streaming is slightly more complex
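
To make the added complexity concrete, the extra step in a page-streaming method might look roughly like the sketch below. Everything here is hypothetical: SketchCallOptions is a stand-in that models only the proposed batch_size field, and apply_batching is an illustrative name, not real generated code.

```python
from itertools import islice


class SketchCallOptions(object):
    """Hypothetical stand-in for CallOptions; models only batch_size."""

    def __init__(self, batch_size=None):
        self.batch_size = batch_size


def apply_batching(resources, options=None):
    """Return resources unchanged, or grouped into tuples of batch_size.

    Illustrates the extra branch a page-streaming method would need
    under this proposal; not actual generated code.
    """
    batch_size = getattr(options, 'batch_size', None)
    if batch_size is None:
        # No batching requested: keep today's flat iteration.
        return iter(resources)
    if batch_size < 1:
        raise ValueError('batch_size must be at least 1')

    def batches():
        iterator = iter(resources)
        while True:
            batch = tuple(islice(iterator, batch_size))
            if not batch:
                return
            yield batch

    return batches()


subs = ['sub_1', 'sub_2', 'sub_3', 'sub_2']
flat = list(apply_batching(subs))
batched = list(apply_batching(subs, SketchCallOptions(batch_size=2)))
```

The single conditional is the "slightly more complex" part: every page-streaming method would carry it, while callers only set one field.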

@jmuk, @geigerj, @bjwatson, @anthmgoogle PTAL and discuss

Decision

  • the surface will look like this:
>>> from google.gax import CallOptions
>>> from google.pubsub.v1.publisher_api import PublisherApi
>>> api = PublisherApi()
>>> topic = api.topic_path('google.com:pubsub-demo', 'my-topic')
>>> all_subs = list(api.list_topic_subscriptions(topic,
...    options=CallOptions(is_page_streaming=False)))
>>> all_subs  # one tuple per page, whatever the server page size is
[('sub_1', 'sub_2'), ('sub_3',), ('sub_4',)]
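
The difference between the two modes can be sketched against a simulated server. This is an illustration under assumptions, not the gax-python implementation: fake_fetch_page, list_pages, list_resources, and the page tokens are all invented for the example.

```python
# Simulated server: maps a page token to (items, next_page_token).
# The tokens and page contents are made up for this sketch.
_PAGES = {
    None: (['sub_1', 'sub_2'], 'p2'),
    'p2': (['sub_3'], 'p3'),
    'p3': (['sub_4'], ''),
}


def fake_fetch_page(token):
    """Hypothetical stand-in for one list RPC to the server."""
    return _PAGES[token]


def list_pages(fetch_page):
    """Yield one tuple per server page, following next-page tokens."""
    token = None
    while True:
        items, token = fetch_page(token)
        yield tuple(items)
        if not token:
            # An empty token marks the last page.
            return


def list_resources(fetch_page, is_page_streaming=True):
    """Yield individual items, or whole pages when streaming is off."""
    pages = list_pages(fetch_page)
    if not is_page_streaming:
        return pages
    return (item for page in pages for item in page)


streamed = list(list_resources(fake_fetch_page))
paged = list(list_resources(fake_fetch_page, is_page_streaming=False))
```

Here streamed is the flat list of four subscriptions, while paged groups them as the simulated server returned them, so batch sizes follow the server's page size rather than a caller-chosen value.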

There's an open question that is OK to resolve after implementation begins
