Skip to content

Ingest flow cleanup: Fix ingest duplication#1086

Closed
SB2318 wants to merge 17 commits intoTracer-Cloud:mainfrom
SB2318:collapse-inject-flow
Closed

Ingest flow cleanup: Fix ingest duplication#1086
SB2318 wants to merge 17 commits intoTracer-Cloud:mainfrom
SB2318:collapse-inject-flow

Conversation

@SB2318
Copy link
Copy Markdown
Contributor

@SB2318 SB2318 commented Apr 29, 2026

Fixes #867

Describe the changes you have made in this PR -

This PR refactors the ingest logic in generate_report() by extracting the two-step flow (create investigation → attach URL) into a dedicated helper.
Previously, generate_report() directly called send_ingest() twice, which made the write path harder to follow and test. This change introduces a single helper that encapsulates that flow without modifying the existing API contract

Key Changes

✔ generate_report() no longer has duplicate ingest logic
✔ Helper encapsulates the two-step flow clearly
✔ No change to send_ingest() contract
✔ Failure behavior unchanged

Demo/Screenshot for feature changes and bug fixes -


Code Understanding and AI Usage

Did you use AI assistance (ChatGPT, Claude, Copilot, etc.) to write any part of this code?

  • No, I wrote all the code myself
  • Yes, I used AI assistance (continue below)

If you used AI assistance:

  • I have reviewed every single line of the AI-generated code
  • I can explain the purpose and logic of each function/component I added
  • I have tested edge cases and understand how the code handles them
  • I have modified the AI output to follow this project's coding standards and conventions

Explain your implementation approach:

What this solves:

  • Eliminates duplicated two-step ingest logic
  • Improves readability and debuggability

Implementation approach:

Ingest Flow Refactor

  • Extracted helper: create_investigation_and_attach_url()
  • Encapsulates:
    • create investigation (send_ingest)
    • attach investigation_url (conditional second call)
  • Replaced inline logic in generate_report()

Key design choices:

  • Kept send_ingest() unchanged (no API contract changes)

Behavior guarantees:

  • Same ingest behavior and failure handling (logged, non-blocking)

Test coverage:

  • Ingest helper:
    • success path
    • second call failure
    • missing investigation ID

Checklist before requesting a review

  • I have added proper PR title and linked to the issue
  • I have performed a self-review of my code
  • I can explain the purpose of every function, class, and logic block I added
  • I understand why my changes work and have tested them thoroughly
  • I have considered potential edge cases and how my code handles them
  • If it is a core feature, I have added thorough tests
  • My code follows the project's style guidelines and conventions

Note: Please check Allow edits from maintainers if you would like us to assist in the PR.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 29, 2026

Greptile Summary

This PR extracts the two-step ingest flow (create investigation → attach URL) from generate_report() into a dedicated helper create_investigation_and_attach_url() in app/utils/ingest_delivery.py. The refactor preserves all existing behavior and error-handling semantics without modifying send_ingest()'s contract.

Confidence Score: 5/5

Safe to merge — pure refactoring with no behavioral changes; all findings are P2 style/cleanup items.

All issues found are P2: a cross-layer import that doesn't cause a runtime problem, missing tests that were claimed but not blocking the refactored logic (existing behavior is unchanged), and a stale comment. No functional regressions introduced.

app/utils/ingest_delivery.py — cross-layer import from app/nodes/ and missing tests for the new helper.

Important Files Changed

Filename Overview
app/utils/ingest_delivery.py New helper create_investigation_and_attach_url correctly encapsulates the two-step ingest flow; introduces a cross-layer import from app/nodes/ into app/utils/, and promised tests are absent from the test file.
app/nodes/publish_findings/node.py Call site cleanly delegates to the new helper; stale comment and redundant variable declarations remain as minor cleanup items.

Sequence Diagram

sequenceDiagram
    participant N as node.py generate_report()
    participant H as ingest_delivery.py create_investigation_and_attach_url()
    participant S as send_ingest()
    participant API as Ingest API

    N->>H: create_investigation_and_attach_url(state, slack_message, summary)
    H->>S: send_ingest(state_with_report)
    S->>API: POST /api/investigations/ingest
    API-->>S: { data: { investigation_id } }
    S-->>H: investigation_id
    H->>H: get_investigation_url(org_slug, investigation_id)
    alt investigation_id present
        H->>S: send_ingest(state_with_url)
        S->>API: POST /api/investigations/ingest (with investigation_url)
        API-->>S: 200 OK
        S-->>H: ignored
    end
    H-->>N: (investigation_id, investigation_url)
Loading

Reviews (1): Last reviewed commit: "Merge branch 'Tracer-Cloud:main' into co..." | Re-trigger Greptile

Comment thread app/utils/ingest_delivery.py Outdated
Comment thread app/utils/ingest_delivery.py
Comment thread app/nodes/publish_findings/node.py Outdated
@Ghraven
Copy link
Copy Markdown

Ghraven commented Apr 29, 2026

Hey @SB2318, this is a clean refactor — moving get_investigation_url to live alongside send_ingest in ingest_delivery.py makes sense since they share the same app.config dependency, and removing it from report.py keeps the formatter layer focused on presentation.

One small thing I noticed in the new create_investigation_and_attach_url function — looking at the test:

investigation_id, investigation_url = (
    ingest_delivery.create_investigation_and_attach_url(sample_state, "slack message")
)

It returns a tuple of (str | None, str | None), which is fine, but the function signature in the diff seems to be cut off so I couldn't verify the return type annotation. If it's not there yet, a -> tuple[str | None, str | None] annotation would make the contract clear for callers.

Also just noticed the cast import was removed from node.py — was that intentional (i.e. there are no more cast() calls after the refactor)? If so, the unused import removal is a nice cleanup bonus.

Good work on this one — the test coverage for the new function looks solid too! 🙏

@SB2318
Copy link
Copy Markdown
Contributor Author

SB2318 commented Apr 30, 2026

Hey @SB2318, this is a clean refactor — moving get_investigation_url to live alongside send_ingest in ingest_delivery.py makes sense since they share the same app.config dependency, and removing it from report.py keeps the formatter layer focused on presentation.

One small thing I noticed in the new create_investigation_and_attach_url function — looking at the test:

investigation_id, investigation_url = (
    ingest_delivery.create_investigation_and_attach_url(sample_state, "slack message")
)

It returns a tuple of (str | None, str | None), which is fine, but the function signature in the diff seems to be cut off so I couldn't verify the return type annotation. If it's not there yet, a -> tuple[str | None, str | None] annotation would make the contract clear for callers.

Also just noticed the cast import was removed from node.py — was that intentional (i.e. there are no more cast() calls after the refactor)? If so, the unused import removal is a nice cleanup bonus.

Good work on this one — the test coverage for the new function looks solid too! 🙏

Thanks, I have some pending review tasks on these two PR.
Bot suggested me also to write unit test for the create_investigation_and_data_attach_url function.

But due natural calamity, I didnot fixed it yesterday.
please give me 1-2 hour, And please let me know any more suggestions from your side. 😇

@SB2318
Copy link
Copy Markdown
Contributor Author

SB2318 commented Apr 30, 2026

@Ghraven, please review this PR, and let me know if it would be better to squash those commits, or you need any further changes.

I added corresponding test cases for the new methods and updated all usages to align with the project’s new send_ingest workflow changes.

Copy link
Copy Markdown
Collaborator

@yashksaini-coder yashksaini-coder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ran locally: 65/65 pass, lint clean. Refactor logic is correct and the move of get_investigation_url into ingest_delivery.py makes sense — it shares the app.config dependency with send_ingest and has no business being in the formatters layer.

One functional regression though:

investigation_url fallback is lost

Old node.py always computed a URL even when ingest failed:

investigation_url = get_investigation_url(state.get("organization_slug"), investigation_id)
# investigation_id=None → returns "{base}/investigations" (the list page)

The new helper only sets investigation_url when investigation_id is not None, so on first-ingest failure it returns (None, None). The call site then does:

build_action_blocks(investigation_url or "", investigation_id)

Slack's Block Kit rejects an empty string on the button's url field — the "View Details in Tracer" button will break whenever ingest is down or unreachable. The old behavior (fallback to the investigations list URL) should be preserved. Simplest fix is to always call get_investigation_url in the helper regardless of whether investigation_id is set:

investigation_url = get_investigation_url(state.get("organization_slug"), investigation_id)

and remove the investigation_url or "" guard at the call site.


Minor — test_node.py patches internals instead of the helper

test_masking_unmask.py patches create_investigation_and_attach_url wholesale, which is the right approach for an integration-level test. test_node.py still patches send_ingest and get_investigation_url at the utility module level — it works but locks the test to the helper's internal structure. Worth aligning to the same strategy.


The refactor itself is clean and the three new tests in test_ingest_delivery.py cover the right paths. Fix the URL fallback and this is good to merge.

@SB2318
Copy link
Copy Markdown
Contributor Author

SB2318 commented Apr 30, 2026 via email

@Ghraven
Copy link
Copy Markdown

Ghraven commented Apr 30, 2026

Thanks for the updates @SB2318 — both points I raised look good now:

  1. Return type annotation-> tuple[str | None, str | None] is there on create_investigation_and_attach_url, type checkers can now infer the unpacked values correctly.

  2. cast removal confirmed — the unused import is gone from node.py.

I also noticed get_investigation_url has been moved into ingest_delivery.py with the investigation_id=None → returns list page fallback preserved, which addresses the regression @yashksaini-coder flagged. That's the right call.

On squashing — I'd lean toward keeping the commits separate for now since the history is readable and the maintainers may want to see the progression. But that's ultimately muddlebee's call.

Looks solid to me! 🙏

@SB2318
Copy link
Copy Markdown
Contributor Author

SB2318 commented Apr 30, 2026

Ran locally: 65/65 pass, lint clean. Refactor logic is correct and the move of get_investigation_url into ingest_delivery.py makes sense — it shares the app.config dependency with send_ingest and has no business being in the formatters layer.

One functional regression though:

investigation_url fallback is lost

Old node.py always computed a URL even when ingest failed:

investigation_url = get_investigation_url(state.get("organization_slug"), investigation_id)
# investigation_id=None → returns "{base}/investigations" (the list page)

The new helper only sets investigation_url when investigation_id is not None, so on first-ingest failure it returns (None, None). The call site then does:

build_action_blocks(investigation_url or "", investigation_id)

Slack's Block Kit rejects an empty string on the button's url field — the "View Details in Tracer" button will break whenever ingest is down or unreachable. The old behavior (fallback to the investigations list URL) should be preserved. Simplest fix is to always call get_investigation_url in the helper regardless of whether investigation_id is set:

investigation_url = get_investigation_url(state.get("organization_slug"), investigation_id)

and remove the investigation_url or "" guard at the call site.

Minor — test_node.py patches internals instead of the helper

test_masking_unmask.py patches create_investigation_and_attach_url wholesale, which is the right approach for an integration-level test. test_node.py still patches send_ingest and get_investigation_url at the utility module level — it works but locks the test to the helper's internal structure. Worth aligning to the same strategy.

The refactor itself is clean and the three new tests in test_ingest_delivery.py cover the right paths. Fix the URL fallback and this is good to merge.

With empty string removal, your test is falling.

Run python -m mypy app/
app/nodes/publish_findings/node.py:42: error: Argument 1 to "build_action_blocks" has incompatible type "str | None"; expected "str"  [arg-type]
Found 1 error in 1 file (checked 424 source files)
Error: Process completed with exit code 1.

```

@SB2318
Copy link
Copy Markdown
Contributor Author

SB2318 commented Apr 30, 2026

@yashksaini-coder Please check my comment on these two issues.
#317 and #481

@muddlebee
Copy link
Copy Markdown
Collaborator

@SB2318 pls fix CI

@SB2318
Copy link
Copy Markdown
Contributor Author

SB2318 commented May 1, 2026

@SB2318 pls fix CI

okay

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Collapse the current two-step ingest flow into one explicit helper

4 participants