-
Notifications
You must be signed in to change notification settings - Fork 715
fix: dynamic stream name should work if it starts with { and ends with } #8882
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Explore these optional code suggestions:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
This PR adds comprehensive test coverage for the pipeline dynamic template substitution feature that was fixed in PR #8874. The test file validates that pipeline destination stream names with template variables like {kubernetes.namespace_name} are correctly substituted with actual field values at runtime.
Key Changes
- Added helper functions (
safe_sql_identifier,get_time_window,create_validation_query,validate_data_flow,enable_pipeline) to reduce code duplication and improve maintainability - Introduced constants for wait times and configuration values for better maintainability
- Refactored existing
test_pipeline_dynamic_template_substitutiontest to use new helper functions - Added extensive test coverage for edge cases: complex multi-field templates, special characters, long names, numeric values, case sensitivity, unicode characters, nested templates, repeated fields, and stress testing
Issues Found
- Critical SQL Injection Vulnerabilities: Multiple instances where user-controlled values are directly interpolated into SQL queries instead of using parameterized queries (lines 55-57, 598, 724, 861). This violates the custom instruction about SQL injection prevention.
- Logic Error: Line 938 incorrectly calls
safe_sql_identifier()inside an f-string within JSON data, which will result in a malformed template string rather than proper sanitization.
Test Coverage
The tests validate that:
- Simple template substitution works correctly
- Multi-field templates are substituted properly
- Special characters and edge cases are handled
- The system can handle rapid pipeline creation
- Data flows to the correct dynamically-named destination streams
Confidence Score: 2/5
- This PR has critical SQL injection vulnerabilities that must be fixed before merging
- While this PR adds valuable test coverage for the dynamic template feature, it introduces multiple SQL injection vulnerabilities by directly interpolating values into SQL query strings instead of using parameterized queries. These are security-critical issues that violate established custom instructions and could lead to database compromise if exploited in a testing environment with malicious input.
- tests/api-testing/tests/test_pipeline_dynamic.py requires immediate attention to fix SQL injection vulnerabilities in query construction (lines 55-57, 598, 724, 861) and the logic error at line 938
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| tests/api-testing/tests/test_pipeline_dynamic.py | 2/5 | Added comprehensive test coverage for pipeline dynamic template substitution. Contains critical SQL injection vulnerabilities in query construction that directly interpolate user-controlled values. |
Sequence Diagram
sequenceDiagram
participant Test as Test Suite
participant API as OpenObserve API
participant Pipeline as Pipeline Engine
participant Stream as Stream Storage
Note over Test: Test Setup Phase
Test->>Test: Load logs_data.json
Test->>Test: Generate unique pipeline name & node IDs
Note over Test,API: Pipeline Creation Phase
Test->>API: POST /api/{org_id}/pipelines
Note right of API: Pipeline with dynamic<br/>template destination<br/>(e.g., "logs_{kubernetes.namespace_name}_test")
API-->>Test: 200 OK (pipeline created)
Test->>API: GET /api/{org_id}/pipelines
API-->>Test: Pipeline list with pipeline_id
Test->>API: PUT /api/{org_id}/pipelines/{id}/enable?value=true
API-->>Test: 200 OK (pipeline enabled)
Note over Test,Stream: Data Ingestion Phase
Test->>API: POST /api/{org_id}/{source_stream}/_json
Note right of API: Ingest test data with<br/>kubernetes.namespace_name field
API->>Stream: Store data in source stream
Stream-->>API: Data stored
API-->>Test: 200 OK
Note over Test,Pipeline: Pipeline Processing Phase
Test->>Test: Wait 15 seconds for processing
Pipeline->>Stream: Read from source stream
Stream-->>Pipeline: Log data with kubernetes fields
Pipeline->>Pipeline: Evaluate conditions
Pipeline->>Pipeline: Substitute template variables<br/>{kubernetes.namespace_name} → "monitoring"
Pipeline->>Stream: Write to dynamic destination<br/>(e.g., "logs_monitoring_test")
Note over Test,Stream: Validation Phase
Test->>API: GET /api/{org_id}/pipelines
API-->>Test: Verify pipeline enabled
Test->>API: POST /api/{org_id}/_search?type=logs
Note right of API: Query expected destination<br/>with template substituted
API->>Stream: Search in destination stream
Stream-->>API: Record count
API-->>Test: Search results
alt Data found in expected destination
Test->>Test: ✅ Test PASSED
else Data not in expected destination
Test->>API: POST /api/{org_id}/_search?type=logs
Note right of API: Check if data went to<br/>literal template name
API-->>Test: Search results
alt Data found in literal template
Test->>Test: ❌ Test FAILED (bug detected)
else Data not found anywhere
Test->>Test: ❌ Test FAILED (complete failure)
end
end
1 file reviewed, 5 comments
| base_sql = f'SELECT COUNT(*) as count FROM "{safe_stream}"' | ||
| if filter_condition: | ||
| base_sql += f" WHERE {filter_condition}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Directly interpolating safe_stream and filter_condition into the SQL query creates a potential SQL injection vulnerability, even though safe_stream is validated. Use parameterized queries instead.
| base_sql = f'SELECT COUNT(*) as count FROM "{safe_stream}"' | |
| if filter_condition: | |
| base_sql += f" WHERE {filter_condition}" | |
| base_sql = 'SELECT COUNT(*) as count FROM ?' | |
| params = [safe_stream] | |
| if filter_condition: | |
| base_sql += ' WHERE ?' | |
| params.append(filter_condition) |
Context Used: Context from dashboard - Avoid directly interpolating values into SQL queries; instead, generate parameter placeholders and b... (source)
Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/api-testing/tests/test_pipeline_dynamic.py
Line: 55:57
Comment:
**logic:** Directly interpolating `safe_stream` and `filter_condition` into the SQL query creates a potential SQL injection vulnerability, even though `safe_stream` is validated. Use parameterized queries instead.
```suggestion
base_sql = 'SELECT COUNT(*) as count FROM ?'
params = [safe_stream]
if filter_condition:
base_sql += ' WHERE ?'
params.append(filter_condition)
```
**Context Used:** Context from `dashboard` - Avoid directly interpolating values into SQL queries; instead, generate parameter placeholders and b... ([source](https://app.greptile.com/review/custom-context?memory=5078ddce-9cb0-4860-a7aa-e4a273809929))
How can I resolve this? If you propose a fix, please make it concise.| safe_destination = safe_sql_identifier(expected_destination) | ||
| validation_payload = { | ||
| "query": { | ||
| "sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE special_test_marker LIKE \'SPECIAL_CHARS_{source_stream}\'', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: SQL injection: directly interpolating safe_destination and source_stream in the SQL query string.
| "sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE special_test_marker LIKE \'SPECIAL_CHARS_{source_stream}\'', | |
| "sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE special_test_marker = ?', |
Context Used: Context from dashboard - Avoid directly interpolating values into SQL queries; instead, generate parameter placeholders and b... (source)
Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/api-testing/tests/test_pipeline_dynamic.py
Line: 598:598
Comment:
**logic:** SQL injection: directly interpolating `safe_destination` and `source_stream` in the SQL query string.
```suggestion
"sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE special_test_marker = ?',
```
**Context Used:** Context from `dashboard` - Avoid directly interpolating values into SQL queries; instead, generate parameter placeholders and b... ([source](https://app.greptile.com/review/custom-context?memory=5078ddce-9cb0-4860-a7aa-e4a273809929))
How can I resolve this? If you propose a fix, please make it concise.| safe_destination = safe_sql_identifier(expected_destination) | ||
| validation_payload = { | ||
| "query": { | ||
| "sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE log LIKE \'%{test_case}%\'', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: SQL injection: directly interpolating safe_destination and test_case in the SQL query string.
| "sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE log LIKE \'%{test_case}%\'', | |
| "sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE log LIKE ?', |
Context Used: Context from dashboard - Avoid directly interpolating values into SQL queries; instead, generate parameter placeholders and b... (source)
Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/api-testing/tests/test_pipeline_dynamic.py
Line: 724:724
Comment:
**logic:** SQL injection: directly interpolating `safe_destination` and `test_case` in the SQL query string.
```suggestion
"sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE log LIKE ?',
```
**Context Used:** Context from `dashboard` - Avoid directly interpolating values into SQL queries; instead, generate parameter placeholders and b... ([source](https://app.greptile.com/review/custom-context?memory=5078ddce-9cb0-4860-a7aa-e4a273809929))
How can I resolve this? If you propose a fix, please make it concise.| safe_destination = safe_sql_identifier(expected_destination) | ||
| validation_payload = { | ||
| "query": { | ||
| "sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE log LIKE \'%{expected_behavior}%\'', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: SQL injection: directly interpolating safe_destination and expected_behavior in the SQL query string.
| "sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE log LIKE \'%{expected_behavior}%\'', | |
| "sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE log LIKE ?', |
Context Used: Context from dashboard - Avoid directly interpolating values into SQL queries; instead, generate parameter placeholders and b... (source)
Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/api-testing/tests/test_pipeline_dynamic.py
Line: 861:861
Comment:
**logic:** SQL injection: directly interpolating `safe_destination` and `expected_behavior` in the SQL query string.
```suggestion
"sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE log LIKE ?',
```
**Context Used:** Context from `dashboard` - Avoid directly interpolating values into SQL queries; instead, generate parameter placeholders and b... ([source](https://app.greptile.com/review/custom-context?memory=5078ddce-9cb0-4860-a7aa-e4a273809929))
How can I resolve this? If you propose a fix, please make it concise.| "nodes": [ | ||
| {"id": "node_1", "type": "stream", "data": {"stream": source_stream}}, | ||
| {"id": "node_2", "type": "condition", "data": {"condition": f'kubernetes.namespace_name = "stress-test-{i}"'}}, | ||
| {"id": "node_3", "type": "stream", "data": {"stream": f"stress_output_{{{{{safe_sql_identifier('kubernetes.namespace_name')}}}}}_stream"}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: The safe_sql_identifier() function call in an f-string inside JSON will not work as intended - it becomes a literal string in the JSON payload, not a properly sanitized identifier.
| {"id": "node_3", "type": "stream", "data": {"stream": f"stress_output_{{{{{safe_sql_identifier('kubernetes.namespace_name')}}}}}_stream"}} | |
| {"id": "node_3", "type": "stream", "data": {"stream": f"stress_output_{{kubernetes.namespace_name}}_stream"}} |
Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/api-testing/tests/test_pipeline_dynamic.py
Line: 938:938
Comment:
**logic:** The `safe_sql_identifier()` function call in an f-string inside JSON will not work as intended - it becomes a literal string in the JSON payload, not a properly sanitized identifier.
```suggestion
{"id": "node_3", "type": "stream", "data": {"stream": f"stress_output_{{kubernetes.namespace_name}}_stream"}}
```
How can I resolve this? If you propose a fix, please make it concise.
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 365 | 340 | 0 | 19 | 6 | 93% | 4m 42s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 365 | 343 | 0 | 19 | 3 | 94% | 7m 6s |
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 365 | 342 | 0 | 19 | 4 | 94% | 4m 42s |
…/openobserve into pipeline_dynamic_names_tests
PR Type
Tests, Enhancement
Description
Add comprehensive dynamic template tests
Introduce reusable validation utilities
Parameterize edge and nested cases
Standardize waits and identifiers
Diagram Walkthrough
File Walkthrough
test_pipeline_dynamic.py
Expand pipeline dynamic template tests and utilitiestests/api-testing/tests/test_pipeline_dynamic.py