-
Notifications
You must be signed in to change notification settings - Fork 53
Description
🎯 Objective
Fix SQL syntax error in extract-sample-data.sql Phase 2 that prevents ANY view extraction from occurring.
📋 Background
PR #8348 extraction log shows that 0 views were extracted despite 28 materialized views being refreshed successfully. The root cause is a SQL syntax error at line ~1487 in the extraction command generation SQL.
Error from log (line 2243-2246):
psql:../extract-sample-data.sql:1618: ERROR: 42601: syntax error at or near "viewname"
LINE 301: file_prefix
^
This error occurs during Phase 2 ("Generating extraction commands") which builds the dynamic SQL commands to extract each view. Because this phase fails, Phase 3 executes zero extraction commands.
📊 Current State (Measured from Log)
- Phase 0: ✅ 28 materialized views refreshed successfully (all populated)
- Phase 1: ✅ 55+ distinct value CSV files extracted
- Phase 2: ❌ SQL SYNTAX ERROR in extraction command generation
- Phase 3:
⚠️ 0 view extraction commands executed (empty due to Phase 2 failure) - Phase 4: ✅ 94 tables extracted (93 with data, 1 empty)
- Phase 5-6: ✅ Statistics and distributions generated
- Overall Result: 0 of 109 views extracted, but all other phases successful
Misleading File: The problematic_views.csv file showing "materialized view not populated" errors is from a PREVIOUS validation run, not from this extraction attempt. This caused initial misanalysis.
✅ Acceptance Criteria
- Fix SQL syntax error around line 1487 in extract-sample-data.sql
- Verify SQL compiles without errors:
psql -f extract-sample-data.sql --dry-run - Re-run full extraction and verify all 109 views are processed
- Confirm extraction summary shows: "VIEWS: Total processed: 109"
- All view sample CSV files created in sample-data/ directory
- Zero SQL syntax errors in extraction log
- Update extraction documentation if needed
🛠️ Implementation Guidance
Files to Modify:
service.data.impl/src/main/resources/extract-sample-data.sql- Fix syntax error around line 1487
Root Cause:
Line 1487 in the WEEKLY views section is missing the E'\n' string continuation:
-- INCORRECT (line 1487):
'INSERT INTO cia_extraction_tracking (object_type, object_name, status, row_count) VALUES (''view'', ''' || '%s' || ''', CASE WHEN %s = 0 THEN ''empty'' ELSE ''success'' END, %s);' || E''
-- CORRECT (should be):
'INSERT INTO cia_extraction_tracking (object_type, object_name, status, row_count) VALUES (''view'', ''' || '%s' || ''', CASE WHEN %s = 0 THEN ''empty'' ELSE ''success'' END, %s);' || E'\n' ||Fix:
-- Around line 1487, in the WEEKLY views section
-- Find this line:
'INSERT INTO cia_extraction_tracking (object_type, object_name, status, row_count) VALUES (''view'', ''' || '%s' || ''', CASE WHEN %s = 0 THEN ''empty'' ELSE ''success'' END, %s);' || E''
-- Replace with:
'INSERT INTO cia_extraction_tracking (object_type, object_name, status, row_count) VALUES (''view'', ''' || '%s' || ''', CASE WHEN %s = 0 THEN ''empty'' ELSE ''success'' END, %s);' || E'\n' ||Testing Strategy:
# 1. Syntax validation
cd service.data.impl/src/main/resources
psql -U postgres -d cia_dev -f extract-sample-data.sql 2>&1 | tee test_run.log
# 2. Check for errors
grep -i "ERROR" test_run.log
# 3. Verify view count
grep "VIEWS: Total processed:" test_run.log
# 4. Check sample files created
ls -1 service.data.impl/sample-data/view_*_sample.csv | wc -l # Should be ~109🔍 Verification Steps
- Before Fix: Run extraction, see "ERROR: syntax error at or near viewname"
- After Fix: Run extraction, see "VIEWS: Total processed: 109"
- Confirm: 109 view_*_sample.csv files exist in sample-data/
- Validate: extraction_summary_report.csv shows 109 view entries
🤖 Recommended Agent
Agent: @stack-specialist
Rationale: This issue requires PostgreSQL SQL expertise, understanding of dynamic SQL generation with format() functions, and knowledge of the CIA database schema. The Stack Specialist has the SQL and database skills to fix this syntax error correctly.
📚 Related Resources
- PR sample data update and log #8348 - Contains extract-sample-data.log showing the error
service.data.impl/sample-data/extract-sample-data.log- Full extraction log (2843 lines)- Line 2243-2246 of log shows the syntax error
- Line ~1487 of extract-sample-data.sql contains the bug
📊 Impact
Critical - Without this fix, zero views are extracted, making sample data incomplete for:
- View debugging and validation
- Query performance testing
- Intelligence product development
- Temporal analysis and trends
📊 Metadata
Priority: Critical | Effort: S (15-30 min) | Impact: Enables extraction of all 109 views
Labels: type: bug, priority: critical, domain: database, size: small