Conversation

@neha00290 neha00290 commented Oct 24, 2025

PR Type

Tests, Enhancement


Description

  • Add comprehensive dynamic template tests

  • Introduce reusable validation utilities

  • Parameterize edge and nested cases

  • Standardize waits and identifiers


Diagram Walkthrough

flowchart LR
  Consts["Add constants (waits, limits)"]
  Utils["Add utils: time window, query builder, validator, enabler"]
  DynamicTest["Enhance dynamic template test"]
  ComplexTests["Add complex, special-char, edge, nested tests"]
  Perf["Add performance and cleanup tests"]

  Consts -- "used by" --> Utils
  Utils -- "reused in" --> DynamicTest
  Utils -- "reused in" --> ComplexTests
  Utils -- "reused in" --> Perf

File Walkthrough

Relevant files

Tests
tests/api-testing/tests/test_pipeline_dynamic.py — Expand pipeline dynamic template tests and utilities (+774/-49)

  • Add constants and safety helpers.
  • Add reusable validation and enable helpers.
  • Expand tests: complex, special chars, edge, nested.
  • Add performance and cleanup tests.

@github-actions

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Possible Issue

The expected destination for the complex template appears inconsistent: the template has two placeholders, but the expected value appends an extra "_v1" segment after the container value, which may cause false failures.

    ("e2e_automate9", "data_{kubernetes.namespace_name}_v1_{kubernetes.container_name}", "kubernetes.namespace_name", "zinc-cp1", "data_zinc-cp1_v1_prometheus_v1", 200),
]
Fragile Timing

Tests rely on fixed sleep durations; consider polling with timeouts to reduce flakiness across environments.

# Wait for pipeline to process the newly ingested data
logger.info(f"Waiting {PIPELINE_PROCESSING_WAIT} seconds for pipeline to process data...")
time.sleep(PIPELINE_PROCESSING_WAIT)
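The fixed sleep above could be replaced by a small polling helper along these lines (a minimal sketch; the commented usage assumes a hypothetical `count_records` wrapper around the test's validation query):

```python
import time

def wait_until(predicate, timeout=60.0, interval=2.0):
    """Poll `predicate` until it returns a truthy value or `timeout` elapses.

    Returns the last predicate result, so callers can assert on it directly.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result or time.monotonic() >= deadline:
            return result
        time.sleep(interval)

# Usage sketch: poll the destination stream instead of sleeping a fixed
# PIPELINE_PROCESSING_WAIT; `count_records` is a hypothetical stand-in
# for the test's search call.
# assert wait_until(lambda: count_records(expected_destination) > 0,
#                   timeout=PIPELINE_PROCESSING_WAIT * 2)
```

This keeps the worst-case wait bounded while letting fast environments pass as soon as the data lands.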
Inconsistent Template Rules

Template substitution sometimes replaces hyphens with underscores and sometimes not; ensure consistent normalization to match backend behavior.

expected_destination = template.replace("{kubernetes.namespace_name}", condition_value.replace("-", "_"))

logger.info(f"🔍 Validating edge case {test_case}: {template} → {expected_destination}")

# Validate data reached correct destination
now = datetime.now(timezone.utc)
end_time = int(now.timestamp() * 1000000)
start_time = int((now - timedelta(minutes=5)).timestamp() * 1000000)

safe_destination = safe_sql_identifier(expected_destination)
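One shared normalization helper, kept in sync with whatever sanitization the backend actually applies, would remove this ambiguity. A sketch (the exact character set the backend rewrites — here `-` and `.` — is an assumption):

```python
def normalize_stream_value(value: str) -> str:
    """Normalize a field value the way the pipeline is assumed to
    sanitize it before substitution: '-' and '.' become '_'."""
    return value.replace("-", "_").replace(".", "_")

def expected_stream(template: str, field: str, value: str) -> str:
    # Substitute the placeholder with the normalized value so the
    # expected name matches what the backend writes.
    return template.replace("{" + field + "}", normalize_stream_value(value))
```

For example, `expected_stream("logs_{kubernetes.namespace_name}_test", "kubernetes.namespace_name", "zinc-cp1")` yields `"logs_zinc_cp1_test"`.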

@github-actions

PR Code Suggestions ✨

Explore these optional code suggestions:

Category / Suggestion / Impact
Possible issue
Remove escaping of template placeholder

Embedding safe_sql_identifier around a field placeholder turns it into a literal and
breaks template substitution. Use the bare placeholder braces expected by the
pipeline engine.

tests/api-testing/tests/test_pipeline_dynamic.py [938]

-"stream": f"stress_output_{{{{{safe_sql_identifier('kubernetes.namespace_name')}}}}}_stream"
+"stream": "stress_output_{kubernetes.namespace_name}_stream"
Suggestion importance[1-10]: 8


Why: Wrapping the placeholder with safe_sql_identifier will indeed literalize it and break substitution; using the bare {kubernetes.namespace_name} is a high-impact fix for correctness in stress test pipeline creation.

Medium
Normalize substituted template values

The test substitutes values that don't match the ingested data transformation rules
(hyphens vs underscores), leading to false negatives. Normalize the substitution
values consistently with how the pipeline sanitizes identifiers (replace '-' and '.'
with '_').

tests/api-testing/tests/test_pipeline_dynamic.py [581-586]

+norm_ns = "special_test".replace("-", "_").replace(".", "_")
+norm_container = "container-with-hyphens".replace("-", "_").replace(".", "_")
 if "kubernetes.namespace_name" in template_with_special_chars:
-    expected_destination = template_with_special_chars.replace("{kubernetes.namespace_name}", "special_test")
+    expected_destination = template_with_special_chars.replace("{kubernetes.namespace_name}", norm_ns)
 elif "kubernetes.container_name" in template_with_special_chars:
-    expected_destination = template_with_special_chars.replace("{kubernetes.container_name}", "container_with_hyphens")
+    expected_destination = template_with_special_chars.replace("{kubernetes.container_name}", norm_container)
 else:
-    expected_destination = template_with_special_chars  # No substitution expected
+    expected_destination = template_with_special_chars
Suggestion importance[1-10]: 7


Why: The suggestion correctly identifies potential mismatch due to hyphen/dot normalization when forming expected_destination and proposes consistent normalization, reducing false negatives; it's relevant and accurate to the new code.

Medium
Broaden identifier normalization

Only replacing hyphens misses dots and other characters, making expected stream
names misaligned with actual sanitized outputs. Apply the same normalization across
characters (e.g., '-' and '.') to avoid validation failures.

tests/api-testing/tests/test_pipeline_dynamic.py [713]

-expected_destination = template.replace("{kubernetes.namespace_name}", condition_value.replace("-", "_"))
+normalized_value = condition_value.replace("-", "_").replace(".", "_")
+expected_destination = template.replace("{kubernetes.namespace_name}", normalized_value)
Suggestion importance[1-10]: 6


Why: Expanding normalization to include dots makes the expected destination align better with sanitized outputs; it's a minor but useful correction based on nearby logic.

Low


@greptile-apps greptile-apps bot left a comment


Greptile Overview

Greptile Summary

This PR adds comprehensive test coverage for the pipeline dynamic template substitution feature that was fixed in PR #8874. The test file validates that pipeline destination stream names with template variables like {kubernetes.namespace_name} are correctly substituted with actual field values at runtime.

Key Changes

  • Added helper functions (safe_sql_identifier, get_time_window, create_validation_query, validate_data_flow, enable_pipeline) to reduce code duplication and improve maintainability
  • Introduced constants for wait times and configuration values for better maintainability
  • Refactored existing test_pipeline_dynamic_template_substitution test to use new helper functions
  • Added extensive test coverage for edge cases: complex multi-field templates, special characters, long names, numeric values, case sensitivity, unicode characters, nested templates, repeated fields, and stress testing
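For reference, an allowlist-based identifier check along the lines of the `safe_sql_identifier` helper might look like this (a sketch under the assumption that stream names are restricted to word characters; the PR's actual implementation may differ):

```python
import re

# Strict allowlist: leading letter or underscore, then word characters only.
_IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def safe_sql_identifier(name: str) -> str:
    """Return `name` unchanged if it matches the identifier allowlist;
    raise otherwise so an unsafe value can never reach a SQL string."""
    if not _IDENTIFIER_RE.fullmatch(name):
        raise ValueError(f"unsafe SQL identifier: {name!r}")
    return name
```

Rejecting (rather than rewriting) unexpected input keeps the helper usable as a guard in front of f-string query construction.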

Issues Found

  • Critical SQL Injection Vulnerabilities: Multiple instances where user-controlled values are directly interpolated into SQL queries instead of using parameterized queries (lines 55-57, 598, 724, 861). This violates the custom instruction about SQL injection prevention.
  • Logic Error: Line 938 incorrectly calls safe_sql_identifier() inside an f-string within JSON data, which will result in a malformed template string rather than proper sanitization.

Test Coverage

The tests validate that:

  1. Simple template substitution works correctly
  2. Multi-field templates are substituted properly
  3. Special characters and edge cases are handled
  4. The system can handle rapid pipeline creation
  5. Data flows to the correct dynamically-named destination streams

Confidence Score: 2/5

  • This PR has critical SQL injection vulnerabilities that must be fixed before merging
  • While this PR adds valuable test coverage for the dynamic template feature, it introduces multiple SQL injection vulnerabilities by directly interpolating values into SQL query strings instead of using parameterized queries. These are security-critical issues that violate established custom instructions and could lead to database compromise if exploited in a testing environment with malicious input.
  • tests/api-testing/tests/test_pipeline_dynamic.py requires immediate attention to fix SQL injection vulnerabilities in query construction (lines 55-57, 598, 724, 861) and the logic error at line 938

Important Files Changed

File Analysis

Filename Score Overview
tests/api-testing/tests/test_pipeline_dynamic.py 2/5 Added comprehensive test coverage for pipeline dynamic template substitution. Contains critical SQL injection vulnerabilities in query construction that directly interpolate user-controlled values.

Sequence Diagram

sequenceDiagram
    participant Test as Test Suite
    participant API as OpenObserve API
    participant Pipeline as Pipeline Engine
    participant Stream as Stream Storage
    
    Note over Test: Test Setup Phase
    Test->>Test: Load logs_data.json
    Test->>Test: Generate unique pipeline name & node IDs
    
    Note over Test,API: Pipeline Creation Phase
    Test->>API: POST /api/{org_id}/pipelines
    Note right of API: Pipeline with dynamic<br/>template destination<br/>(e.g., "logs_{kubernetes.namespace_name}_test")
    API-->>Test: 200 OK (pipeline created)
    
    Test->>API: GET /api/{org_id}/pipelines
    API-->>Test: Pipeline list with pipeline_id
    
    Test->>API: PUT /api/{org_id}/pipelines/{id}/enable?value=true
    API-->>Test: 200 OK (pipeline enabled)
    
    Note over Test,Stream: Data Ingestion Phase
    Test->>API: POST /api/{org_id}/{source_stream}/_json
    Note right of API: Ingest test data with<br/>kubernetes.namespace_name field
    API->>Stream: Store data in source stream
    Stream-->>API: Data stored
    API-->>Test: 200 OK
    
    Note over Test,Pipeline: Pipeline Processing Phase
    Test->>Test: Wait 15 seconds for processing
    Pipeline->>Stream: Read from source stream
    Stream-->>Pipeline: Log data with kubernetes fields
    Pipeline->>Pipeline: Evaluate conditions
    Pipeline->>Pipeline: Substitute template variables<br/>{kubernetes.namespace_name} → "monitoring"
    Pipeline->>Stream: Write to dynamic destination<br/>(e.g., "logs_monitoring_test")
    
    Note over Test,Stream: Validation Phase
    Test->>API: GET /api/{org_id}/pipelines
    API-->>Test: Verify pipeline enabled
    
    Test->>API: POST /api/{org_id}/_search?type=logs
    Note right of API: Query expected destination<br/>with template substituted
    API->>Stream: Search in destination stream
    Stream-->>API: Record count
    API-->>Test: Search results
    
    alt Data found in expected destination
        Test->>Test: ✅ Test PASSED
    else Data not in expected destination
        Test->>API: POST /api/{org_id}/_search?type=logs
        Note right of API: Check if data went to<br/>literal template name
        API-->>Test: Search results
        alt Data found in literal template
            Test->>Test: ❌ Test FAILED (bug detected)
        else Data not found anywhere
            Test->>Test: ❌ Test FAILED (complete failure)
        end
    end

1 file reviewed, 5 comments


Comment on lines +55 to +57
base_sql = f'SELECT COUNT(*) as count FROM "{safe_stream}"'
if filter_condition:
    base_sql += f" WHERE {filter_condition}"


logic: Directly interpolating safe_stream and filter_condition into the SQL query creates a potential SQL injection vulnerability, even though safe_stream is validated. Use parameterized queries instead.

Suggested change

-base_sql = f'SELECT COUNT(*) as count FROM "{safe_stream}"'
-if filter_condition:
-    base_sql += f" WHERE {filter_condition}"
+base_sql = 'SELECT COUNT(*) as count FROM ?'
+params = [safe_stream]
+if filter_condition:
+    base_sql += ' WHERE ?'
+    params.append(filter_condition)

Context Used: Context from dashboard - Avoid directly interpolating values into SQL queries; instead, generate parameter placeholders and b... (source)


safe_destination = safe_sql_identifier(expected_destination)
validation_payload = {
    "query": {
        "sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE special_test_marker LIKE \'SPECIAL_CHARS_{source_stream}\'',


logic: SQL injection: directly interpolating safe_destination and source_stream in the SQL query string.

Suggested change

-"sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE special_test_marker LIKE \'SPECIAL_CHARS_{source_stream}\'',
+"sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE special_test_marker = ?',

Context Used: Context from dashboard - Avoid directly interpolating values into SQL queries; instead, generate parameter placeholders and b... (source)


safe_destination = safe_sql_identifier(expected_destination)
validation_payload = {
    "query": {
        "sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE log LIKE \'%{test_case}%\'',


logic: SQL injection: directly interpolating safe_destination and test_case in the SQL query string.

Suggested change

-"sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE log LIKE \'%{test_case}%\'',
+"sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE log LIKE ?',

Context Used: Context from dashboard - Avoid directly interpolating values into SQL queries; instead, generate parameter placeholders and b... (source)


safe_destination = safe_sql_identifier(expected_destination)
validation_payload = {
    "query": {
        "sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE log LIKE \'%{expected_behavior}%\'',


logic: SQL injection: directly interpolating safe_destination and expected_behavior in the SQL query string.

Suggested change

-"sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE log LIKE \'%{expected_behavior}%\'',
+"sql": f'SELECT COUNT(*) as count FROM "{safe_destination}" WHERE log LIKE ?',

Context Used: Context from dashboard - Avoid directly interpolating values into SQL queries; instead, generate parameter placeholders and b... (source)


"nodes": [
    {"id": "node_1", "type": "stream", "data": {"stream": source_stream}},
    {"id": "node_2", "type": "condition", "data": {"condition": f'kubernetes.namespace_name = "stress-test-{i}"'}},
    {"id": "node_3", "type": "stream", "data": {"stream": f"stress_output_{{{{{safe_sql_identifier('kubernetes.namespace_name')}}}}}_stream"}}


logic: The safe_sql_identifier() function call in an f-string inside JSON will not work as intended - it becomes a literal string in the JSON payload, not a properly sanitized identifier.

Suggested change

-{"id": "node_3", "type": "stream", "data": {"stream": f"stress_output_{{{{{safe_sql_identifier('kubernetes.namespace_name')}}}}}_stream"}}
+{"id": "node_3", "type": "stream", "data": {"stream": f"stress_output_{{kubernetes.namespace_name}}_stream"}}
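The literalization is easy to verify in isolation (a standalone demo; the trivial `safe_sql_identifier` here is a stand-in for the real helper, assumed to return an already-safe name unchanged):

```python
def safe_sql_identifier(name: str) -> str:
    # Stand-in for the real helper; assume it returns a safe name unchanged.
    return name

# Five braces on each side: the two outer {{ pairs are escapes, the inner
# pair wraps the expression, so the call result ends up surrounded by
# DOUBLED literal braces rather than the single braces the engine expects.
literal = f"stress_output_{{{{{safe_sql_identifier('kubernetes.namespace_name')}}}}}_stream"
print(literal)  # stress_output_{{kubernetes.namespace_name}}_stream

# The bare template the pipeline engine actually expects:
bare = "stress_output_{kubernetes.namespace_name}_stream"
print(bare)  # stress_output_{kubernetes.namespace_name}_stream
```

So the sanitizer call buys nothing here and changes the placeholder's brace count, which is why the bare template is the correct fix.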

@testdino-playwright-reporter

⚠️ Test Run Unstable


Author: neha00290 | Branch: pipeline_dynamic_names_tests | Commit: 8052a74

Testdino Test Results

| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
| --- | --- | --- | --- | --- | --- | --- | --- |
| All tests passed | 365 | 340 | 0 | 19 | 6 | 93% | 4m 42s |


@testdino-playwright-reporter

⚠️ Test Run Unstable


Author: neha00290 | Branch: pipeline_dynamic_names_tests | Commit: 070e507

Testdino Test Results

| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
| --- | --- | --- | --- | --- | --- | --- | --- |
| All tests passed | 365 | 343 | 0 | 19 | 3 | 94% | 7m 6s |


@testdino-playwright-reporter

⚠️ Test Run Unstable


Author: neha00290 | Branch: pipeline_dynamic_names_tests | Commit: 070e507

Testdino Test Results

| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
| --- | --- | --- | --- | --- | --- | --- | --- |
| All tests passed | 365 | 342 | 0 | 19 | 4 | 94% | 4m 42s |


@Subhra264 Subhra264 changed the title from "test: pytests-pipeline-dynamic-names" to "fix: dynamic stream name should work if it starts with { and ends with }" Oct 27, 2025
@github-actions github-actions bot added the "☢️ Bug" (Something isn't working) label Oct 27, 2025
@Subhra264 Subhra264 merged commit 56024fe into main Oct 27, 2025
29 of 31 checks passed
@Subhra264 Subhra264 deleted the pipeline_dynamic_names_tests branch October 27, 2025 12:57
