-
Notifications
You must be signed in to change notification settings - Fork 852
Module initialization order seems to be wrong for querier #3550
Description
Describe the bug
We say queries failed with query-frontend returning message {"status":"error","errorType":"internal","error":"BlocksStoreQueryable is not running: Starting"}
This happens around the time we deploy querier. We are using Cortex at this commit: beddbff
Investigation
Looking at the code, it seems like querier is taking work from query-frontend before its self is fully initialized.
In querier's log, we see the following initialization log entry:
level=debug ts=2020-12-01T07:47:53.895516151Z caller=module_service.go:48 msg="module waiting for initialization" module=distributor-service waiting_for=ring
level=info ts=2020-12-01T07:47:53.895542059Z caller=module_service.go:58 msg=initialising module=runtime-config
level=info ts=2020-12-01T07:47:53.895605222Z caller=module_service.go:58 msg=initialising module=memberlist-kv
level=info ts=2020-12-01T07:47:53.89569197Z caller=module_service.go:58 msg=initialising module=server
level=debug ts=2020-12-01T07:47:53.895713041Z caller=module_service.go:48 msg="module waiting for initialization" module=store-queryable waiting_for=memberlist-kv
level=info ts=2020-12-01T07:47:53.895744095Z caller=module_service.go:58 msg=initialising module=store-queryable
level=debug ts=2020-12-01T07:47:53.895760711Z caller=module_service.go:48 msg="module waiting for initialization" module=ring waiting_for=runtime-config
level=debug ts=2020-12-01T07:47:53.895773926Z caller=module_service.go:48 msg="module waiting for initialization" module=ring waiting_for=memberlist-kv
level=info ts=2020-12-01T07:47:53.895784002Z caller=module_service.go:58 msg=initialising module=ring
level=info ts=2020-12-01T07:47:53.895778955Z caller=module_service.go:58 msg=initialising module=querier
level=info ts=2020-12-01T07:47:53.895947084Z caller=module_service.go:58 msg=initialising module=distributor-service
ts=2020-12-01T07:47:53.902402602Z caller=memberlist_logger.go:74 level=debug msg="Initiating push/pull sync with: 10.0.53.204:7946"
Compare to earlier version of Cortex (revision bbf01e830) the initialization log looks like
level=debug ts=2020-10-06T19:36:09.645276963Z caller=module_service.go:48 msg=module waiting for initialization module=querier waiting_for=memberlist-kv
level=info ts=2020-10-06T19:36:09.645303929Z caller=module_service.go:58 msg=initialising module=querier
level=info ts=2020-10-06T19:36:09.64380678Z caller=module_service.go:58 msg=initialising module=distributor-service
level=debug ts=2020-10-06T19:36:09.643885063Z caller=module_service.go:48 msg=module waiting for initialization module=querier waiting_for=ring
level=debug ts=2020-10-06T19:36:09.643910866Z caller=module_service.go:48 msg=module waiting for initialization module=querier waiting_for=store-queryable
level=info ts=2020-10-06T19:36:09.643412491Z caller=module_service.go:58 msg=initialising module=server
level=info ts=2020-10-06T19:36:09.643473123Z caller=module_service.go:58 msg=initialising module=runtime-config
level=debug ts=2020-10-06T19:36:09.643520867Z caller=module_service.go:48 msg=module waiting for initialization module=distributor-service waiting_for=ring
level=debug ts=2020-10-06T19:36:09.643542333Z caller=module_service.go:48 msg=module waiting for initialization module=store-queryable waiting_for=memberlist-kv
level=debug ts=2020-10-06T19:36:09.643560481Z caller=module_service.go:48 msg=module waiting for initialization module=querier waiting_for=distributor-service
level=info ts=2020-10-06T19:36:09.643576371Z caller=module_service.go:58 msg=initialising module=memberlist-kv
level=info ts=2020-10-06T19:36:09.643638605Z caller=module_service.go:58 msg=initialising module=store-queryable
level=debug ts=2020-10-06T19:36:09.643625475Z caller=module_service.go:48 msg=module waiting for initialization module=ring waiting_for=runtime-config
level=debug ts=2020-10-06T19:36:09.643676698Z caller=module_service.go:48 msg=module waiting for initialization module=ring waiting_for=memberlist-kv
level=info ts=2020-10-06T19:36:09.64368781Z caller=module_service.go:58 msg=initialising module=ring
Notice the log of earlier version of Cortex contains msg=module waiting for initialization module=querier waiting_for=store-queryable, but newer version of cortex does not. It looks like querier started to perform work without waiting for store-queryable to finish initialization?
Any idea what is going on?