Add parallel fetching for registered model ID

At the moment, we're fetching registered model IDs sequentially in the `MainThread`:

```
crawl_permissions.log.2023-12-06_02-40:02:47:20 DEBUG [databricks.sdk] {MainThread} GET /api/2.0/mlflow/registered-models/list?page_token=...
crawl_permissions.log.2023-12-06_02-40:02:47:20 DEBUG [databricks.sdk] {MainThread} GET /api/2.0/mlflow/databricks/registered-models/get?name=...
crawl_permissions.log.2023-12-06_02-40:02:47:20 DEBUG [databricks.sdk] {MainThread} GET /api/2.0/mlflow/databricks/registered-models/get?name=....
```

This results in overly long runtimes for assessment tasks, going beyond 18 hours. I strongly believe this can be parallelised: add a `Threads.parallel` call to speed this up.

Potential fix in https://github.com/databrickslabs/ucx/blob/main/src/databricks/labs/ucx/workspace_access/generic.py#L327-L333:

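```
from functools import partial

def models_listing(ws: WorkspaceClient):
    def get_model(name: str) -> ml.ModelDatabricks:
        return ws.model_registry.get_model(name).registered_model_databricks
    def inner() -> Iterator[ml.ModelDatabricks]:
        # Pass callables (not results) so each lookup runs on a worker thread.
        tasks = [partial(get_model, model.name) for model in ws.model_registry.list_models()]
        return Threads.parallel('fetching models with ID', tasks)
    return inner
```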

This still needs to be tested.
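
To try the idea without a workspace, here is a minimal standard-library sketch. It is not the ucx implementation. `StubModel` and `StubRegistry` are hypothetical stand-ins for the SDK types, `ThreadPoolExecutor` is swapped in for the `Threads` helper, and the stub `get_model` returns the model directly instead of a response with a `registered_model_databricks` field.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass
class StubModel:
    """Hypothetical stand-in for a registered model record."""
    name: str
    id: str = ""


class StubRegistry:
    """Hypothetical stand-in for ws.model_registry; makes no network calls."""

    def list_models(self) -> Iterable[StubModel]:
        # The listing only carries names, mirroring the list endpoint in the log.
        return [StubModel(name=f"model-{i}") for i in range(5)]

    def get_model(self, name: str) -> StubModel:
        # The per-model lookup fills in the ID, mirroring the get endpoint.
        return StubModel(name=name, id=f"id-{name}")


def models_listing(registry, max_workers: int = 8) -> Callable[[], Iterator[StubModel]]:
    def inner() -> Iterator[StubModel]:
        # The listing itself stays sequential; only the per-model lookups fan out.
        names = [model.name for model in registry.list_models()]
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            # pool.map runs get_model on worker threads and yields results in input order.
            yield from pool.map(registry.get_model, names)

    return inner


if __name__ == "__main__":
    for model in models_listing(StubRegistry())():
        print(model.name, model.id)
```

With a stub like this, a unit test can check that every listed model comes back with its ID before the change is timed against a real workspace.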

Add parallel fetching for registered model ID #688
