Description
Environment details
- OS type and version: Ubuntu 22.04
- Python version: 3.10.9
- pip version: (not provided)
- google-cloud-pubsub version: 2.21.1
Steps to reproduce
Run google-cloud-pubsub and suffer a metadata outage like https://status.cloud.google.com/incidents/u6rQ2nNVbhAFqGCcTm58.
Note that this can happen even during a brief GCE metadata outage: once this exception is raised a single time, the commit thread is dead forever. In our case, there was a short outage on the metadata server, but the retries all happened so quickly that the exception was raised before the service recovered.
2024-04-26T07:30:45.783 Compute Engine Metadata server unavailable on attempt 1 of 5. Reason: HTTPConnectionPool(host='metadata.google.internal', port=80): Max retries exceeded with url: /computeMetadata/v1/universe/universe_domain (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7c1ca813c1c0>: Failed to establish a new connection: [Errno 111] Connection refused'))
2024-04-26T07:30:45.788 Compute Engine Metadata server unavailable on attempt 2 of 5. Reason: HTTPConnectionPool(host='metadata.google.internal', port=80): Max retries exceeded with url: /computeMetadata/v1/universe/universe_domain (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7c1ca8290730>: Failed to establish a new connection: [Errno 111] Connection refused'))
2024-04-26T07:30:45.794 Compute Engine Metadata server unavailable on attempt 3 of 5. Reason: HTTPConnectionPool(host='metadata.google.internal', port=80): Max retries exceeded with url: /computeMetadata/v1/universe/universe_domain (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7c1ca82918a0>: Failed to establish a new connection: [Errno 111] Connection refused'))
2024-04-26T07:30:45.801 [...]
2024-04-26T07:30:45.806 [...]
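For scale, the five log lines above run from 07:30:45.783 to 07:30:45.806, so all attempts fit in roughly 23 ms. As a point of comparison only, here is a minimal sketch of retrying with exponential backoff, which spreads attempts over a longer window. This is not google-auth's retry code; `call_with_backoff`, `base_delay`, `max_delay`, and the choice of `ConnectionError` are all hypothetical names and values picked for illustration.

```python
import time


def call_with_backoff(fn, attempts=5, base_delay=0.1, max_delay=5.0, sleep=time.sleep):
    """Call fn(), retrying on ConnectionError with exponentially growing delays.

    Hypothetical helper for illustration; not part of google-auth.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts:
                # Out of attempts: let the last error propagate to the caller.
                raise
            # Delay doubles each time: base, 2*base, 4*base, ... capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))
```

With the illustrative defaults, the four waits between five attempts would add up to 0.1 + 0.2 + 0.4 + 0.8 = 1.5 s instead of a few milliseconds. Whether that would have outlasted this particular outage is not something the logs can tell us.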
Code example
# example
Stack trace
Traceback (most recent call last):
File "/app/device/trimark/proxy/proxy.runfiles/python3_10_x86_64-unknown-linux-gnu/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/app/device/trimark/proxy/proxy.runfiles/python3_10_x86_64-unknown-linux-gnu/lib/python3.10/threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "/app/device/trimark/proxy/proxy.runfiles/common_deps_google_cloud_pubsub/site-packages/google/cloud/pubsub_v1/publisher/_batch/thread.py", line 274, in _commit
response = self._client._gapic_publish(
File "/app/device/trimark/proxy/proxy.runfiles/common_deps_google_cloud_pubsub/site-packages/google/cloud/pubsub_v1/publisher/client.py", line 267, in _gapic_publish
return super().publish(*args, **kwargs)
File "/app/device/trimark/proxy/proxy.runfiles/common_deps_google_cloud_pubsub/site-packages/google/pubsub_v1/services/publisher/client.py", line 1058, in publish
self._validate_universe_domain()
File "/app/device/trimark/proxy/proxy.runfiles/common_deps_google_cloud_pubsub/site-packages/google/pubsub_v1/services/publisher/client.py", line 554, in _validate_universe_domain
or PublisherClient._compare_universes(
File "/app/device/trimark/proxy/proxy.runfiles/common_deps_google_cloud_pubsub/site-packages/google/pubsub_v1/services/publisher/client.py", line 531, in _compare_universes
credentials_universe = getattr(credentials, "universe_domain", default_universe)
File "/app/device/trimark/proxy/proxy.runfiles/common_deps_google_auth/site-packages/google/auth/compute_engine/credentials.py", line 154, in universe_domain
self._universe_domain = _metadata.get_universe_domain(
File "/app/device/trimark/proxy/proxy.runfiles/common_deps_google_auth/site-packages/google/auth/compute_engine/_metadata.py", line 284, in get_universe_domain
universe_domain = get(
File "/app/device/trimark/proxy/proxy.runfiles/common_deps_google_auth/site-packages/google/auth/compute_engine/_metadata.py", line 217, in get
raise exceptions.TransportError(
google.auth.exceptions.TransportError: Failed to retrieve http://metadata.google.internal/computeMetadata/v1/universe/universe_domain from the Google Compute Engine metadata service. Compute Engine Metadata server unavailable
Speculative analysis
It looks like the google-auth library raises a TransportError that the batch commit thread in this library does not catch, so the exception escapes and kills the thread. Potential fixes include catching it in Batch._commit, or catching it further down in google-cloud-pubsub and wrapping it in a GoogleAPIError.
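To make the proposed fix concrete, here is a rough, self-contained sketch of the "catch and wrap" idea. It does not reproduce the library's actual `Batch` class: `SketchBatch` and `publish_fn` are hypothetical, and the two exception classes are local stand-ins for `google.auth.exceptions.TransportError` and `google.api_core.exceptions.GoogleAPIError` so that the snippet runs without either package installed.

```python
import threading
from concurrent.futures import Future


class TransportError(Exception):
    """Stand-in for google.auth.exceptions.TransportError."""


class GoogleAPIError(Exception):
    """Stand-in for google.api_core.exceptions.GoogleAPIError."""


class SketchBatch:
    """Hypothetical, heavily simplified batch; not the library's Batch class."""

    def __init__(self, publish_fn, messages):
        self._publish_fn = publish_fn  # assumed to return one message ID per message
        self._messages = list(messages)
        self.futures = [Future() for _ in self._messages]

    def _commit(self):
        try:
            message_ids = self._publish_fn(self._messages)
        except TransportError as exc:
            # Wrap the auth-layer failure and hand it to every pending future,
            # so the thread exits cleanly instead of dying on an uncaught error.
            error = GoogleAPIError(f"publish failed: {exc}")
            error.__cause__ = exc
            for future in self.futures:
                future.set_exception(error)
            return
        for future, message_id in zip(self.futures, message_ids):
            future.set_result(message_id)

    def commit(self):
        # Run the commit on a background thread, as a batching publisher would.
        thread = threading.Thread(target=self._commit, daemon=True)
        thread.start()
        return thread
```

In this sketch, a caller waiting on `future.result()` sees a `GoogleAPIError` whose `__cause__` is the original `TransportError`, rather than waiting on a future that never completes.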