Skip to content

Conversation

@bbernays
Copy link
Collaborator

@bbernays bbernays commented Apr 18, 2023

Summary

closes #10151

In our small org this change brought the initialization time down from ~5 seconds to ~1 second

@cq-bot cq-bot added the aws label Apr 18, 2023
kodiakhq bot pushed a commit that referenced this pull request Apr 18, 2023

#### Summary

This is a preliminary PR to make it easier to follow and understand the changes coming for #10177
kodiakhq bot pushed a commit that referenced this pull request Apr 18, 2023

#### Summary

More naming refactor to reduce complexity in #10177
kodiakhq bot pushed a commit that referenced this pull request Apr 18, 2023

#### Summary

refactor to make #10177 easier to read
@cq-bot cq-bot added the website label Apr 18, 2023

In AWS organization mode, CloudQuery will source all accounts underneath automatically

- `initialization_concurrency` (int) (default: 4)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why such a low default if we have 50000 as the default for the syncing goroutines?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

50,000 is a number that is designed to be spread out over every single table being resolved... This is only for setting up the SDK (assuming a role and getting the enabled/disabled regions)... If a user needs a higher value they can always increase it

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we get anything by setting this to a super high number that we cannot validate as being appropriate. If we get feedback from users that using it at much high values increases the performance without any errors I am fully onboard raising this value

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non blocking for this PR, but we could test by modifying the code to run on more accounts than we have in our playground.
In Azure I tested by creating 1800 duplicated subscriptions, see https://github.com/cloudquery/cloudquery/compare/main...erezrokah:test/azure_memory?expand=1#diff-c5caf1d71566c18f3cf9dfd107693cc1fca4ea980176d33db01d94b013685afeR226

The only catch is the need to keep client IDs unique

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So tested it with 7000 faked accounts... Only ran into rate limit issues when an account was duplicated... which should ever happen

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Does this mean we can use a higher default concurrency? If we're not sure we can keep it lower and. change the default later based on users reports

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather keep it low until we hear from users

Copy link
Member

@hermanschaaf hermanschaaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@bbernays bbernays added the automerge Automatically merge once required checks pass label Apr 19, 2023
@kodiakhq kodiakhq bot merged commit 3838e80 into cloudquery:main Apr 19, 2023
@bbernays bbernays deleted the parallelization branch April 19, 2023 20:50
kodiakhq bot pushed a commit that referenced this pull request Apr 20, 2023
🤖 I have created a release *beep* *boop*
---


## [16.3.0](plugins-source-aws-v16.2.0...plugins-source-aws-v16.3.0) (2023-04-20)


### Features

* **aws-services:** Support newly added regions ([#10152](#10152)) ([d395a94](d395a94))
* **aws:** Add Support for Config Delivery Channels ([#10150](#10150)) ([361df3f](361df3f))
* **aws:** Parallelize Initialization of Accounts after discovery ([#10177](#10177)) ([3838e80](3838e80))
* **aws:** Upgrade to `github.com/cloudquery/plugin-sdk/v2` ([#9938](#9938)) ([a3fb436](a3fb436)), closes [#9937](#9937)


### Bug Fixes

* **aws:** Update EBS Snapshot Permissions Check Query ([#10149](#10149)) ([f65d9da](f65d9da)), closes [#10140](#10140)
* **deps:** Update module github.com/aws/aws-sdk-go-v2/config to v1.18.21 ([#10127](#10127)) ([3bcde69](3bcde69))
* **deps:** Update module github.com/aws/aws-sdk-go-v2/feature/s3/manager to v1.11.62 ([#10129](#10129)) ([13f8670](13f8670))
* **deps:** Update module github.com/aws/aws-sdk-go-v2/service/accessanalyzer to v1.19.10 ([#10131](#10131)) ([eefbad5](eefbad5))
* **deps:** Update module github.com/aws/aws-sdk-go-v2/service/acm to v1.17.9 ([#10132](#10132)) ([7f6d235](7f6d235))
* **deps:** Update module github.com/cloudquery/plugin-sdk/v2 to v2.2.0 ([#10135](#10135)) ([cf33b89](cf33b89))
* **deps:** Update module github.com/cloudquery/plugin-sdk/v2 to v2.2.2 ([#10143](#10143)) ([8f887e0](8f887e0))
* **deps:** Update module github.com/cloudquery/plugin-sdk/v2 to v2.3.0 ([#10163](#10163)) ([9a7f214](9a7f214))
* **deps:** Update module github.com/cloudquery/plugin-sdk/v2 to v2.3.1 ([#10175](#10175)) ([5b53423](5b53423))
* **deps:** Update module github.com/cloudquery/plugin-sdk/v2 to v2.3.3 ([#10187](#10187)) ([b185248](b185248))
* **deps:** Update module github.com/cloudquery/plugin-sdk/v2 to v2.3.4 ([#10196](#10196)) ([c6d2f59](c6d2f59))
* **deps:** Update module github.com/cloudquery/plugin-sdk/v2 to v2.3.5 ([#10200](#10200)) ([5a33693](5a33693))
* **deps:** Update module github.com/cloudquery/plugin-sdk/v2 to v2.3.6 ([#10208](#10208)) ([91c80a7](91c80a7))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

automerge Automatically merge once required checks pass

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: Improve Initialization Performance for AWS Plugin

6 participants