Member

@xiaoyongzhu xiaoyongzhu commented Jun 15, 2022

This PR addresses a few comments/issues:

  1. Previously, to get all the features for a given project, we used the search_entities method, which may not scale (or may be slow) when there are many entities.
  2. We used the attribute field to record the relations between entities (between a project and an anchor, between derived features, etc.), which makes it hard to merge two projects.

This PR solves those two issues by doing the following:

  1. Using the lineage API to get all the entities belonging to a given project at once, so that instead of searching we rely on the pre-computed Purview lineage.
  2. When calling get_features_from_registry(), we no longer use the attributes, since there might be concurrency issues; instead, we use AtlasProcess to avoid them.
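
As a rough illustration of the lineage-based approach, here is a minimal sketch that works on an already-fetched lineage response. It assumes the response follows the Apache Atlas lineage shape (a `guidEntityMap` of entities plus a `relations` list with `fromEntityId`/`toEntityId`). The helper names, the sample guids, and the `feathr_*` type names are hypothetical placeholders and are not taken from this PR's code.

```python
from collections import defaultdict


def entities_by_type(lineage, skip_types=("Process",)):
    """Group the entities of an Atlas-style lineage response by typeName.

    Process entities only encode relations, so they are skipped by default.
    """
    grouped = defaultdict(list)
    for entity in lineage.get("guidEntityMap", {}).values():
        type_name = entity.get("typeName", "")
        if type_name in skip_types:
            continue
        grouped[type_name].append(entity["attributes"]["qualifiedName"])
    return {t: sorted(names) for t, names in grouped.items()}


def downstream_of(lineage, guid):
    """Return every guid reachable from `guid` by following relation edges."""
    edges = defaultdict(list)
    for rel in lineage.get("relations", []):
        edges[rel["fromEntityId"]].append(rel["toEntityId"])
    seen, stack = set(), [guid]
    while stack:
        current = stack.pop()
        for nxt in edges[current]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


# Hypothetical response: project -> process -> anchor -> process -> feature.
sample_lineage = {
    "baseEntityGuid": "p1",
    "guidEntityMap": {
        "p1": {"typeName": "feathr_workspace_v1",
               "attributes": {"qualifiedName": "demo_project"}},
        "pr1": {"typeName": "Process",
                "attributes": {"qualifiedName": "demo_project__to__anchor"}},
        "a1": {"typeName": "feathr_anchor_v1",
               "attributes": {"qualifiedName": "demo_project__anchor"}},
        "pr2": {"typeName": "Process",
                "attributes": {"qualifiedName": "anchor__to__feature"}},
        "f1": {"typeName": "feathr_anchor_feature_v1",
               "attributes": {"qualifiedName": "demo_project__anchor__f_trip"}},
    },
    "relations": [
        {"fromEntityId": "p1", "toEntityId": "pr1"},
        {"fromEntityId": "pr1", "toEntityId": "a1"},
        {"fromEntityId": "a1", "toEntityId": "pr2"},
        {"fromEntityId": "pr2", "toEntityId": "f1"},
    ],
}

project_entities = entities_by_type(sample_lineage)
reachable = downstream_of(sample_lineage, "p1")
```

A single lineage response carries every entity connected to the project, so no per-type search is needed. Because each relation is its own process entity rather than a list stored on the parent, adding a relation does not require rewriting an attribute that another writer may be updating at the same time.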

Also fixed an issue introduced by #380 where an additional schema is added.

@xiaoyongzhu xiaoyongzhu changed the title from "Xiaoyzhu/purview fix3" to "Use elegant way to get all features for a project and fix potential concurrency issues" Jun 15, 2022
@xiaoyongzhu
Member Author

#466: the issues should already be fixed by this PR, hence closing.

@xiaoyongzhu xiaoyongzhu deleted the xiaoyzhu/purview_fix3 branch September 23, 2022 02:59
4 participants