
Make ucx pylsp plugin configurable#2280

Merged
nfx merged 6 commits intomainfrom
feature/lsp-plugin-configure
Aug 2, 2024

Conversation

@vsevolodstep-db
Contributor

Changes

Make the LSP linter plugin configurable with cluster information. The config can be provided either in a file or by a client; its provisioning is handled by the pylsp infrastructure.
The Spark Connect linter is now applied only to UC Shared clusters, since Single-User clusters run in Spark Classic mode.
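For context, pylsp passes client- or file-provided settings down to plugin hooks; the following is a minimal, hypothetical sketch of how a lint hook could consume provisioned cluster information. The plugin name, setting keys, and simplified hook signature are assumptions for illustration, not the actual ucx schema.

```python
# Hypothetical sketch: a pylsp-style lint hook reading cluster settings that a
# client (or a config file) provisions under its plugin namespace. The "ucx"
# key, "dataSecurityMode" setting, and diagnostic code are illustrative.

def pylsp_lint(plugin_settings: dict, document_source: str) -> list[dict]:
    """Return diagnostics only when the plugin has been configured."""
    cluster = plugin_settings.get("ucx", {})
    mode = cluster.get("dataSecurityMode")  # e.g. "USER_ISOLATION"
    if mode is None:
        # No cluster info provisioned: emit no diagnostics.
        return []
    diagnostics = []
    if mode == "USER_ISOLATION" and "sc.emptyRDD" in document_source:
        diagnostics.append({
            "code": "legacy-rdd-api",
            "message": "RDD APIs are not supported on Spark Connect clusters",
        })
    return diagnostics

# Settings as a client might provide them (e.g. via workspace configuration):
settings = {"ucx": {"dataSecurityMode": "USER_ISOLATION"}}
print(pylsp_lint(settings, "df = sc.emptyRDD()"))
```

The key point is the early return: without provisioned cluster info, the plugin stays silent rather than guessing a cluster type.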

Tests

  • manually tested
  • added unit tests
  • added integration tests
  • verified on staging environment (screenshot attached)


@JCZuurmond JCZuurmond left a comment


Have some questions


class SparkConnectLinter(PythonLinter):
    def __init__(self, session_state: CurrentSessionState):
        if session_state.data_security_mode != DataSecurityMode.USER_ISOLATION:
Contributor


Why do we want to run this only for USER_ISOLATION security mode?

Contributor Author


UC Shared Clusters use Spark Connect; this linter detects uses of legacy APIs that are not supported in Spark Connect mode.

Collaborator


The `databricks labs ucx lint-local-code` command sets no data security mode, so these linters would be omitted.

Suggested change:
- if session_state.data_security_mode != DataSecurityMode.USER_ISOLATION:
+ if session_state.data_security_mode is not None and session_state.data_security_mode != DataSecurityMode.USER_ISOLATION:

Contributor Author


What's the expectation for `databricks labs ucx lint-local-code`? The errors emitted by SparkConnectLinter are relevant for UC Shared Clusters, but they have no use for Single-User ones. While it's still better to migrate off deprecated APIs, that's not achievable for all workloads (e.g. using RDD APIs on a UC Single-User cluster). I wonder if we need to pass data_security_mode to lint-local-code as an optional flag.

@github-actions

github-actions bot commented Jul 30, 2024

❌ 140/142 passed, 5 flaky, 2 failed, 14 skipped, 1h52m30s total

❌ test_creating_lakeview_dashboard_permissions: databricks.sdk.errors.platform.InvalidParameterValue: validation failed: [resource name must be unique; found duplicates: [dashboards/01ef50c6a5671e6392529473bed14902/pages/01ef50c6a5671fa999e4e92b3826fcdb/widgets/01ef50c6a568102e992c6dc5df9c86da]] (157ms)
databricks.sdk.errors.platform.InvalidParameterValue: validation failed: [resource name must be unique; found duplicates: [dashboards/01ef50c6a5671e6392529473bed14902/pages/01ef50c6a5671fa999e4e92b3826fcdb/widgets/01ef50c6a568102e992c6dc5df9c86da]]
[gw5] linux -- Python 3.10.14 /home/runner/work/ucx/ucx/.venv/bin/python
11:59 DEBUG [databricks.labs.ucx.mixins.fixtures] clearing 0 workspace group fixtures
11:59 DEBUG [databricks.labs.ucx.mixins.fixtures] clearing 0 lakeview_dashboard permissions fixtures
11:59 DEBUG [databricks.labs.ucx.mixins.fixtures] clearing 0 dashboard fixtures
[gw5] linux -- Python 3.10.14 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_running_real_assessment_job_ext_hms: AssertionError: assert False (9m46.332s)
AssertionError: assert False
 +  where False = <bound method DeployedWorkflows.validate_step of <databricks.labs.ucx.installer.workflows.DeployedWorkflows object at 0x7f3931e709d0>>('assessment')
 +    where <bound method DeployedWorkflows.validate_step of <databricks.labs.ucx.installer.workflows.DeployedWorkflows object at 0x7f3931e709d0>> = <databricks.labs.ucx.installer.workflows.DeployedWorkflows object at 0x7f3931e709d0>.validate_step
 +      where <databricks.labs.ucx.installer.workflows.DeployedWorkflows object at 0x7f3931e709d0> = <tests.integration.conftest.MockInstallationContext object at 0x7f3931e116c0>.deployed_workflows
[gw0] linux -- Python 3.10.14 /home/runner/work/ucx/ucx/.venv/bin/python
11:59 DEBUG [databricks.labs.ucx.mixins.fixtures] added workspace user fixture: User(active=True, display_name='[email protected]', emails=[ComplexValue(display=None, primary=True, ref=None, type='work', value='[email protected]')], entitlements=[], external_id=None, groups=[], id='1889434851838399', name=Name(family_name=None, given_name='[email protected]'), roles=[], schemas=[<UserSchema.URN_IETF_PARAMS_SCIM_SCHEMAS_CORE_2_0_USER: 'urn:ietf:params:scim:schemas:core:2.0:User'>, <UserSchema.URN_IETF_PARAMS_SCIM_SCHEMAS_EXTENSION_WORKSPACE_2_0_USER: 'urn:ietf:params:scim:schemas:extension:workspace:2.0:User'>], user_name='[email protected]')
11:59 INFO [databricks.labs.ucx.mixins.fixtures] Workspace group ucx-wOcQ-ra78a50355: https://DATABRICKS_HOST#setting/accounts/groups/672690495722281
11:59 DEBUG [databricks.labs.ucx.mixins.fixtures] added workspace group fixture: Group(display_name='ucx-wOcQ-ra78a50355', entitlements=[ComplexValue(display=None, primary=None, ref=None, type=None, value='allow-cluster-create')], external_id=None, groups=[], id='672690495722281', members=[ComplexValue(display='[email protected]', primary=None, ref='Users/1889434851838399', type=None, value='1889434851838399')], meta=ResourceMeta(resource_type='WorkspaceGroup'), roles=[], schemas=[<GroupSchema.URN_IETF_PARAMS_SCIM_SCHEMAS_CORE_2_0_GROUP: 'urn:ietf:params:scim:schemas:core:2.0:Group'>])
11:59 INFO [databricks.labs.ucx.mixins.fixtures] Account group ucx-wOcQ-ra78a50355: https://accounts.CLOUD_ENVdatabricks.net/users/groups/516910880884609/members
11:59 DEBUG [databricks.labs.ucx.mixins.fixtures] added account group fixture: Group(display_name='ucx-wOcQ-ra78a50355', entitlements=[], external_id=None, groups=[], id='516910880884609', members=[ComplexValue(display='[email protected]', primary=None, ref='Users/1889434851838399', type=None, value='1889434851838399')], meta=None, roles=[], schemas=[<GroupSchema.URN_IETF_PARAMS_SCIM_SCHEMAS_CORE_2_0_GROUP: 'urn:ietf:params:scim:schemas:core:2.0:Group'>])
11:59 INFO [databricks.labs.ucx.mixins.fixtures] Cluster policy: https://DATABRICKS_HOST#setting/clusters/cluster-policies/view/000CCF2B2469E0F8
11:59 DEBUG [databricks.labs.ucx.mixins.fixtures] added cluster policy fixture: CreatePolicyResponse(policy_id='000CCF2B2469E0F8')
11:59 DEBUG [databricks.labs.ucx.mixins.fixtures] added cluster_policy permissions fixture: 000CCF2B2469E0F8 [group_name admins CAN_USE] -> [group_name ucx-wOcQ-ra78a50355 CAN_USE]
11:59 INFO [databricks.labs.ucx.mixins.fixtures] Schema hive_metastore.ucx_scs2l: https://DATABRICKS_HOST/explore/data/hive_metastore/ucx_scs2l
11:59 DEBUG [databricks.labs.ucx.mixins.fixtures] added schema fixture: SchemaInfo(browse_only=None, catalog_name='hive_metastore', catalog_type=None, comment=None, created_at=None, created_by=None, effective_predictive_optimization_flag=None, enable_predictive_optimization=None, full_name='hive_metastore.ucx_scs2l', metastore_id=None, name='ucx_scs2l', owner=None, properties=None, schema_id=None, storage_location=None, storage_root=None, updated_at=None, updated_by=None)
11:59 DEBUG [databricks.labs.ucx.install] Cannot find previous installation: Path (/Users/0a330eb5-dd51-4d97-b6e4-c474356b1d5d/.TQcP/config.yml) doesn't exist.
11:59 INFO [databricks.labs.ucx.install] Please answer a couple of questions to configure Unity Catalog migration
11:59 INFO [databricks.labs.ucx.installer.hms_lineage] HMS Lineage feature creates one system table named system.hms_to_uc_migration.table_access and helps in your migration process from HMS to UC by allowing you to programmatically query HMS lineage data.
11:59 INFO [databricks.labs.ucx.install] Fetching installations...
11:59 INFO [databricks.labs.ucx.installer.policy] Setting up an external metastore
11:59 INFO [databricks.labs.ucx.installer.policy] Creating UCX cluster policy.
11:59 DEBUG [tests.integration.conftest] Waiting for clusters to start...
11:59 DEBUG [tests.integration.conftest] Waiting for clusters to start...
11:59 INFO [databricks.labs.ucx.install] Installing UCX v0.31.1+2120240802115923
11:59 INFO [databricks.labs.ucx.install] Creating ucx schemas...
11:59 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=failing
11:59 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=validate-groups-permissions
11:59 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-tables
11:59 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-groups-experimental
11:59 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=experimental-workflow-linter
11:59 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-groups
11:59 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=scan-tables-in-mounts-experimental
11:59 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-tables-in-mounts-experimental
11:59 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-external-hiveserde-tables-in-place-experimental
11:59 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=assessment
11:59 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-external-tables-ctas
11:59 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=remove-workspace-local-backup-groups
11:59 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-data-reconciliation
12:00 INFO [databricks.labs.ucx.install] Creating dashboards...
12:00 DEBUG [databricks.labs.ucx.install] Reading step folder /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/views...
12:00 DEBUG [databricks.labs.ucx.install] Reading step folder /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/migration...
12:00 DEBUG [databricks.labs.ucx.install] Reading step folder /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/assessment...
12:00 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/migration/groups...
12:00 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/migration/main...
12:00 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/assessment/main...
12:00 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/assessment/estimates...
12:00 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/assessment/CLOUD_ENV...
12:00 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/assessment/interactive...
12:00 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
12:00 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
12:00 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
12:00 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
12:00 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
12:00 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
12:00 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/migration/main...
12:00 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
12:00 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/assessment/estimates...
12:00 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
12:00 INFO [databricks.labs.ucx.install] Installation completed successfully! Please refer to the https://DATABRICKS_HOST/#workspace/Users/0a330eb5-dd51-4d97-b6e4-c474356b1d5d/.TQcP/README for the next steps.
12:00 DEBUG [databricks.labs.ucx.installer.workflows] starting assessment job: https://DATABRICKS_HOST#job/722464289587957
12:08 DEBUG [databricks.labs.ucx.installer.workflows] Validating assessment workflow: https://DATABRICKS_HOST#job/722464289587957
12:08 DEBUG [databricks.labs.ucx.mixins.fixtures] clearing 1 cluster_policy permissions fixtures
12:08 DEBUG [databricks.labs.ucx.mixins.fixtures] removing cluster_policy permissions fixture: 000CCF2B2469E0F8 [group_name admins CAN_USE] -> [group_name ucx-wOcQ-ra78a50355 CAN_USE]
12:08 DEBUG [databricks.labs.ucx.mixins.fixtures] clearing 1 cluster policy fixtures
12:08 DEBUG [databricks.labs.ucx.mixins.fixtures] removing cluster policy fixture: CreatePolicyResponse(policy_id='000CCF2B2469E0F8')
12:08 DEBUG [databricks.labs.ucx.mixins.fixtures] ignoring error while cluster policy CreatePolicyResponse(policy_id='000CCF2B2469E0F8') teardown: Can't find a cluster policy with id: 000CCF2B2469E0F8.
12:08 INFO [databricks.labs.ucx.install] Deleting UCX v0.31.1+2120240802115923 from https://DATABRICKS_HOST
12:08 ERROR [databricks.labs.ucx.install] Check if /Users/0a330eb5-dd51-4d97-b6e4-c474356b1d5d/.TQcP is present
12:08 DEBUG [databricks.labs.ucx.mixins.fixtures] clearing 1 workspace user fixtures
12:08 DEBUG [databricks.labs.ucx.mixins.fixtures] removing workspace user fixture: User(active=True, display_name='[email protected]', emails=[ComplexValue(display=None, primary=True, ref=None, type='work', value='[email protected]')], entitlements=[], external_id=None, groups=[], id='1889434851838399', name=Name(family_name=None, given_name='[email protected]'), roles=[], schemas=[<UserSchema.URN_IETF_PARAMS_SCIM_SCHEMAS_CORE_2_0_USER: 'urn:ietf:params:scim:schemas:core:2.0:User'>, <UserSchema.URN_IETF_PARAMS_SCIM_SCHEMAS_EXTENSION_WORKSPACE_2_0_USER: 'urn:ietf:params:scim:schemas:extension:workspace:2.0:User'>], user_name='[email protected]')
12:08 DEBUG [databricks.labs.ucx.mixins.fixtures] clearing 1 account group fixtures
12:08 DEBUG [databricks.labs.ucx.mixins.fixtures] removing account group fixture: Group(display_name='ucx-wOcQ-ra78a50355', entitlements=[], external_id=None, groups=[], id='516910880884609', members=[ComplexValue(display='[email protected]', primary=None, ref='Users/1889434851838399', type=None, value='1889434851838399')], meta=None, roles=[], schemas=[<GroupSchema.URN_IETF_PARAMS_SCIM_SCHEMAS_CORE_2_0_GROUP: 'urn:ietf:params:scim:schemas:core:2.0:Group'>])
12:08 DEBUG [databricks.labs.ucx.mixins.fixtures] clearing 1 workspace group fixtures
12:08 DEBUG [databricks.labs.ucx.mixins.fixtures] removing workspace group fixture: Group(display_name='ucx-wOcQ-ra78a50355', entitlements=[ComplexValue(display=None, primary=None, ref=None, type=None, value='allow-cluster-create')], external_id=None, groups=[], id='672690495722281', members=[ComplexValue(display='[email protected]', primary=None, ref='Users/1889434851838399', type=None, value='1889434851838399')], meta=ResourceMeta(resource_type='WorkspaceGroup'), roles=[], schemas=[<GroupSchema.URN_IETF_PARAMS_SCIM_SCHEMAS_CORE_2_0_GROUP: 'urn:ietf:params:scim:schemas:core:2.0:Group'>])
12:08 DEBUG [databricks.labs.ucx.mixins.fixtures] clearing 0 table fixtures
12:08 DEBUG [databricks.labs.ucx.mixins.fixtures] clearing 0 table fixtures
12:08 DEBUG [databricks.labs.ucx.mixins.fixtures] clearing 1 schema fixtures
12:08 DEBUG [databricks.labs.ucx.mixins.fixtures] removing schema fixture: SchemaInfo(browse_only=None, catalog_name='hive_metastore', catalog_type=None, comment=None, created_at=None, created_by=None, effective_predictive_optimization_flag=None, enable_predictive_optimization=None, full_name='hive_metastore.ucx_scs2l', metastore_id=None, name='ucx_scs2l', owner=None, properties=None, schema_id=None, storage_location=None, storage_root=None, updated_at=None, updated_by=None)
[gw0] linux -- Python 3.10.14 /home/runner/work/ucx/ucx/.venv/bin/python

Flaky tests:

  • 🤪 test_migrate_view (1m10.72s)
  • 🤪 test_installation_when_dashboard_id_is_invalid[] (44.985s)
  • 🤪 test_job_failure_propagates_correct_error_message_and_logs (1m37.389s)
  • 🤪 test_hiveserde_table_ctas_migration_job[hiveserde] (2m5.524s)
  • 🤪 test_table_migration_job_refreshes_migration_status[hiveserde-migrate-external-tables-ctas] (2m5.36s)

Running from acceptance #5039


@classmethod
def from_json(cls, json: dict) -> CurrentSessionState:
    mode_str = json.get('data_security_mode', None)
Collaborator


nit: it's better to extract this into a private method




@JCZuurmond JCZuurmond left a comment


To the best of my judgement, it looks good. Added some comments. @nfx : Could you give the final approval?

code = 'sc.emptyRDD()'
_, doc = temp_document(code, workspace)
diagnostics = sorted(lsp_plugin.pylsp_lint(config, doc), key=lambda d: d['code'])
assert diagnostics == []
Contributor


Independent of the code, should this always return no diagnostics if the config is not set?

Collaborator


i think so
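A minimal sketch of that agreed behavior, assuming a hypothetical plugin shape: an empty config short-circuits to no diagnostics, regardless of what the document contains. Function, key, and diagnostic names are illustrative.

```python
# Sketch: an unconfigured plugin emits no diagnostics for any source code.
def pylsp_lint(config: dict, source: str) -> list[dict]:
    if not config:  # plugin not configured: stay silent
        return []
    findings = []
    if "sc.emptyRDD" in source:
        findings.append({
            "code": "rdd-in-shared-clusters",
            "message": "RDD APIs are not available on Spark Connect clusters",
        })
    # Mirror the test above, which sorts diagnostics by code.
    return sorted(findings, key=lambda d: d["code"])

print(pylsp_lint({}, "sc.emptyRDD()"))  # unconfigured: empty list
```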


@nfx nfx left a comment


few last nits


@nfx nfx merged commit d3d42c0 into main Aug 2, 2024
@nfx nfx deleted the feature/lsp-plugin-configure branch August 2, 2024 12:18
nfx added a commit that referenced this pull request Aug 2, 2024
* Added troubleshooting guide for self-signed SSL cert related error ([#2346](#2346)). In this release, we have added a troubleshooting guide to the README file to address a specific error that may occur when connecting from a local machine to a Databricks Account and Workspace using a web proxy and self-signed SSL certificate. This error, SSLCertVerificationError, can prevent UCX from connecting to the Account and Workspace. To resolve this issue, users can now set the `REQUESTS_CA_BUNDLE` and `CURL_CA_BUNDLE` environment variables to force the requests library to set `verify=False`, and set the `SSL_CERT_DIR` env var pointing to the proxy CA cert for the urllib3 library. This guide will help users understand and resolve this error, making it easier to connect to Databricks Accounts and Workspaces using a web proxy and self-signed SSL certificate.
* Code Compatibility Dashboard: Fix broken links ([#2347](#2347)). In this release, we have addressed and resolved two issues in the Code Compatibility Dashboard of the UCX Migration (Main) project, enhancing its overall usability. Previously, the Markdown panel contained a broken link to the workflow due to an incorrect anchor, and the links in the table widget to the workflow and task definitions did not render correctly. These problems have been rectified, and the dashboard has been manually tested and verified in a staging environment. Additionally, we have updated the `invisibleColumns` section in the SQL file by changing the `fieldName` attribute to 'name', which will now display the `workflow_id` as a link. Before and after screenshots have been provided for visual reference. The corresponding workflow is now referred to as "Jobs Static Code Analysis Workflow".
* Filter out missing import problems for imports within a try-except clause with ImportError ([#2332](#2332)). This release introduces changes to handle missing import problems within a try-except clause that catches ImportError. A new method, `_filter_import_problem_in_try_except`, has been added to filter out import-not-found issues when they occur in such a clause, preventing unnecessary build failures. The `_register_import` method now returns an Iterable[DependencyProblem] instead of yielding problems directly. Supporting classes and methods, including Dependency, DependencyGraph, and DependencyProblem from the databricks.labs.ucx.source_code.graph module, as well as FileLoader and PythonCodeAnalyzer from the databricks.labs.ucx.source_code.notebooks.cells module, have been added. The ImportSource.extract_from_tree method has been updated to accept a DependencyProblem object as an argument. Additionally, a new test case has been included for the scenario where a failing import in a try-except clause goes unreported. Issue [#1705](#1705) has been resolved, and unit tests have been added to ensure proper functionality.
* Fixed `report-account-compatibility` cli command docstring ([#2340](#2340)). In this release, we have updated the `report-account-compatibility` CLI command's docstring to accurately reflect its functionality, addressing a previous issue where it inadvertently duplicated the `sync-workspace-info` command's description. This command now provides a clear and concise explanation of its purpose: "Report compatibility of all workspaces available in the account." Upon execution, it generates a readiness report for the account, specifically focusing on workspaces where ucx is installed. This enhancement improves the clarity of the CLI's functionality for software engineers, enabling them to understand and effectively utilize the `report-account-compatibility` command.
* Fixed broken table migration workflow links in README ([#2286](#2286)). In this release, we have made significant improvements to the README file of our open-source library, including fixing broken links and adding a mermaid flowchart to illustrate the table migration workflows. The table migration workflow has been renamed to the table migration process, which includes migrating Delta tables, non-Delta tables, external tables, and views, and the related commands have been updated to match. Two optional workflows have been added for migrating HiveSerDe tables in place and for migrating external tables using CTAS. These changes are aimed at providing a more comprehensive understanding of the table migration process and enhancing the overall user experience.
* Fixed dashboard queries fail when default catalog is not `hive_metastore` ([#2278](#2278)). In this release, we have addressed an issue where dashboard queries failed when the default catalog was not set to `hive_metastore`. This was fixed by modifying the existing `databricks labs ucx install` command to always include the `hive_metastore` namespace in dashboard queries. Additionally, the code has been updated to add the `hive_metastore` namespace to the `DashboardMetadata` object used when creating a dashboard from SQL queries in a folder, ensuring queries are executed against the correct database. The commit also includes modifications to the `test_install.py` unit test file to ensure the installation process correctly handles configurations related to the `ucx` namespace. The changes have been manually tested and verified on a staging environment.
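A minimal sketch of the qualification idea, using a naive regular expression rather than the real installer's query templating (the schema and catalog names below are illustrative):

```python
import re

def qualify(query: str, schema: str = "ucx", catalog: str = "hive_metastore") -> str:
    """Prefix bare `schema.` references with the catalog, so the query no
    longer depends on the workspace's default catalog setting."""
    # The lookbehind skips references that are already three-part qualified.
    return re.sub(
        rf"(?<![\w.]){re.escape(schema)}\.", f"{catalog}.{schema}.", query
    )

print(qualify("SELECT * FROM ucx.tables"))
# SELECT * FROM hive_metastore.ucx.tables
```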
* Improve group migration error reporting ([#2344](#2344)). This PR introduces enhancements to the group migration dashboard, focusing on improved error reporting and a more informative user experience. The documentation widgets have been fine-tuned, and the failed-migration widget now provides formatted failure information with a link to the failed job run. The dashboard will display only failures from the latest workflow run, complete with logs. A new link to the job list has been added in the [workflows](/jobs) section of the documentation to assist users in identifying and troubleshooting issues. Additionally, the SQL query for retrieving group migration failure information has been refactored, improving readability and extracting relevant data using regular expressions. The changes have been tested and verified on the staging environment, providing clearer and more actionable insights during group migrations. The PR is related to previous work in [#2333](#2333) and [#1914](#1914), with updates to the UCX Migration (Groups) dashboard, but no new methods have been added.
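As an illustration of regex-based failure extraction, the log-line shape below is hypothetical and the actual workflow log layout may differ:

```python
import re
from typing import Optional

# Hypothetical log-line shape: "HH:MM:SS ERROR [group-name] message"
FAILURE = re.compile(
    r"^\d{2}:\d{2}:\d{2}\s+ERROR\s+\[(?P<group>[^\]]+)\]\s+(?P<message>.+)$"
)

def parse_failure(line: str) -> Optional[dict]:
    """Extract the failing group and message from a matching log line."""
    match = FAILURE.match(line)
    return match.groupdict() if match else None

print(parse_failure("12:04:31 ERROR [data-engineers] group not found in account"))
# {'group': 'data-engineers', 'message': 'group not found in account'}
```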
* Improve type checking in cli command ([#2335](#2335)). This release introduces enhanced type checking in the command line interface (CLI) of our open-source library, specifically in the `lint_local_code` function of the `cli.py` file. By utilizing a newly developed local code linter object, the function now performs more rigorous and accurate type checking for potential issues in the local code. While the functionality remains consistent, this improvement is expected to prevent recurrences of issues like [#2221](#2221), ensuring more robust and reliable code.
* Lint dependencies in context ([#2236](#2236)). The `InheritedContext` class has been introduced to gather code fragments from parent files or notebooks during linting of child files or notebooks, addressing issues [#2155](#2155), [#2156](#2156), and [#2221](#2221). This new feature includes the addition of the `InheritedContext` class, with methods for building instances from a route of dependencies, appending other `InheritedContext` instances, and finalizing them for use with linting. The `DependencyGraph` class has been updated to support the new functionality, and various classes, methods, and functions for handling the linter context have been added or updated. Unit, functional, and integration tests have been added to ensure the correct functioning of the changes, which improve the linting functionality by allowing it to consider the broader context of the codebase.
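The mechanics can be pictured with a small sketch; the method names mirror the entry above, but the real class tracks richer linting state:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class InheritedContext:
    """Code fragments collected from parent files/notebooks while walking
    a route of dependencies down to the child being linted (sketch)."""
    fragments: List[str] = field(default_factory=list)
    finalized: bool = False

    def append(self, other: "InheritedContext") -> "InheritedContext":
        assert not self.finalized, "cannot append to a finalized context"
        self.fragments.extend(other.fragments)
        return self

    def finalize(self) -> str:
        # The combined fragments are handed to the linter so names defined
        # in parents resolve while linting the child.
        self.finalized = True
        return "\n".join(self.fragments)

parent = InheritedContext(["db_name = 'hive_metastore'"])
child = InheritedContext(["df = spark.table(f'{db_name}.tables')"])
print(parent.append(child).finalize())
```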
* Make ucx pylsp plugin configurable ([#2280](#2280)). This commit introduces the ability to configure the ucx pylsp plugin with cluster information, which can be provided either in a file or by a client and is managed by the pylsp infrastructure. The Spark Connect linter is now only applied to UC Shared clusters, as Single-User clusters run in Spark Classic mode. A new entry point `pylsp_ucx` has been added to the pylsp configuration file. The changes affect the pylsp plugin configuration and the application of the Spark Connect linter. Unit tests and manual testing have been conducted, but integration tests and verification on a staging environment are not included in this release.
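How such a plugin might merge its configuration can be sketched in plain Python; the setting names and defaults here are illustrative, not the exact ucx schema (pylsp itself merges file- and client-provided settings before handing them to the plugin):

```python
# Illustrative default: no cluster information configured.
DEFAULTS = {"dataSecurityMode": None}

def plugin_settings(lsp_settings: dict) -> dict:
    """Read this plugin's section out of the merged pylsp settings,
    falling back to defaults for anything the client did not provide."""
    configured = lsp_settings.get("plugins", {}).get("pylsp_ucx", {})
    return {**DEFAULTS, **configured}

settings = plugin_settings(
    {"plugins": {"pylsp_ucx": {"dataSecurityMode": "USER_ISOLATION"}}}
)
# Only UC Shared clusters (USER_ISOLATION) run Spark Connect, so the
# Spark Connect linter is enabled solely for that mode.
spark_connect_enabled = settings["dataSecurityMode"] == "USER_ISOLATION"
print(spark_connect_enabled)  # True
```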
* New dashboard: group migration, showing groups that failed to migrate ([#2333](#2333)). In this release, we have developed a new dashboard for monitoring group migration in the UCX Migration (Groups) workspace. This dashboard includes a widget displaying messages related to groups that failed to migrate during the `migrate-groups-experimental` workflow, aiding users in identifying and addressing migration issues. The group migration process consists of several steps, including renaming workspace groups, provisioning account-level groups, and replicating permissions. The release features new methods for displaying and monitoring migration-related messages, as well as links to documentation and workflows for assessing, validating, and removing workspace-level groups post-migration. The new dashboard is currently not connected to the existing system, but it has undergone manual testing and verification on the staging environment. The changes include the addition of a new SQL query file to implement the logic for fetching group migration failures and a new Markdown file displaying the Group Migration Failures section.
* Support spaces in run cmd args ([#2330](#2330)). This commit resolves an issue where subprocess command-line arguments containing spaces were mishandled. The previous implementation only accepted a full command line, which it split on spaces, breaking arguments that themselves contained spaces. The new implementation supports argument lists, which are passed as-is to `Popen`, allowing command lines with spaces to be handled correctly. This change is incorporated in the `run_command` function of the `utils.py` file and the `_install_pip` method of the `PythonLibraryResolver` class. The `shlex.join()` function has been replaced with direct string formatting for increased flexibility. The feature is intended for use with the `PythonLibraryResolver` class and is co-authored by Eric Vergnaud and Andrew Snare. Integration tests have been enabled to ensure the proper functioning of the updated code.
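The difference can be demonstrated with a short sketch: passing an argument list to `subprocess` preserves embedded spaces, whereas naive splitting of a command string does not:

```python
import subprocess
import sys

# An argument that contains a space survives intact when passed as a list.
args = [sys.executable, "-c", "import sys; print(sys.argv[1])", "path with spaces"]
result = subprocess.run(args, capture_output=True, text=True, check=True)
print(result.stdout.strip())  # path with spaces

# Splitting the equivalent command line on whitespace would mangle it:
mangled = "prog --path path with spaces".split(" ")
print(mangled)  # ['prog', '--path', 'path', 'with', 'spaces']
```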
* Updated error messages for SparkConnect linter ([#2348](#2348)). The SparkConnect linter's error messages have been updated to improve clarity and precision. The term `UC Shared clusters` has been replaced with `Unity Catalog clusters in Shared access mode` throughout the codebase, affecting messages related to various unsupported functionalities or practices on these clusters. These changes include warnings about direct Spark log level setting, accessing the Spark Driver JVM or its logger, using `sc`, and employing RDD APIs. This revision enhances user experience by providing more accurate and descriptive error messages, enabling them to better understand and address the issues in their code. The functionality of the linter remains unchanged.
* Updated sqlglot requirement from <25.8,>=25.5.0 to >=25.5.0,<25.9 ([#2279](#2279)). In this update, we have changed the required version range of the `sqlglot` dependency in the `pyproject.toml` file from `sqlglot>=25.5.0,<25.8` to `sqlglot>=25.5.0,<25.9`. This change allows the project to use any version of `sqlglot` that is greater than or equal to 25.5.0 and less than 25.9, including the latest release. The update also includes a changelog for the new version range, sourced from sqlglot's official changelog, covering various bug fixes and new features for dialects such as BigQuery, DuckDB, and tSQL, as well as refactors and improvements to the parser. The commits section lists the individual commits included in this update.
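The effect of the widened constraint can be checked with a small sketch (a hand-rolled comparison for plain `X.Y.Z` versions; real tooling would use `packaging.specifiers`):

```python
def in_range(version: str, low: str = "25.5.0", high: str = "25.9") -> bool:
    """True when low <= version < high for dotted numeric versions."""
    def key(v: str):
        return tuple(int(part) for part in v.split("."))
    return key(low) <= key(version) < key(high)

# 25.8.x releases are newly allowed; 25.9.0 remains excluded.
print(in_range("25.8.1"), in_range("25.9.0"))  # True False
```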

Dependency updates:

 * Updated sqlglot requirement from <25.8,>=25.5.0 to >=25.5.0,<25.9 ([#2279](#2279)).