-
Notifications
You must be signed in to change notification settings - Fork 4.1k
sql: improve validation when dropping temp schemas #151553
Description
We've seen cases where temporary objects aren't properly cleaned up (see support issue #3391). The root cause is still unknown. One challenge is that the initial issue often happens long before any visible failure, making debugging difficult.
This ticket proposes improving descriptor validation during temp object cleanup, to catch issues closer to when they happen.
Ideas to explore:
-
Detect missing temp schemas during descriptor validation
Right now, we intentionally skip validating temp schemas:
cockroach/pkg/sql/catalog/tabledesc/validate.go
Lines 162 to 177 in f09c7b8
// Check that parent schema exists. // TODO(richardjcai): Remove logic for keys.PublicSchemaID in 22.2. if desc.GetParentSchemaID() != keys.PublicSchemaID && !desc.IsTemporary() { schemaDesc, err := vdg.GetSchemaDescriptor(desc.GetParentSchemaID()) if err != nil { vea.Report(err) } if schemaDesc != nil && dbDesc != nil && schemaDesc.GetParentID() != dbDesc.GetID() { vea.Report(errors.AssertionFailedf("parent schema %d is in different database %d", desc.GetParentSchemaID(), schemaDesc.GetParentID())) } if schemaDesc != nil && schemaDesc.Dropped() { vea.Report(errors.AssertionFailedf("parent schema %q (%d) is dropped", schemaDesc.GetName(), schemaDesc.GetID())) } } Adding this check might be tricky, since temp schema info lives in session-specific structures and descriptor validation is session-agnostic.
-
Validate that all objects under a temp schema are removed during cleanup
We could do this here:
cockroach/pkg/sql/temporary_schema.go
Lines 216 to 227 in 7e50c9f
func cleanupTempSchemaObjects( ctx context.Context, txn isql.Txn, descsCol *descs.Collection, codec keys.SQLCodec, db catalog.DatabaseDescriptor, sc catalog.SchemaDescriptor, ) error { objects, err := descsCol.GetAllObjectsInSchema(ctx, txn.KV(), db, sc) if err != nil { return err } At this point we have the temp schema ID, so once cleanup finishes, we could verify that all objects within the schema are gone. One edge case: temp sequences owned by permanent tables are skipped during deletion and would need special handling.
Jira issue: CRDB-53336
Epic CRDB-17128