Add search Engine _ GreyNoise #697

Merged
dogancanbakir merged 12 commits into projectdiscovery:dev from deehyeon:feat/696-greynoise-provider
Nov 25, 2025

Conversation

@deehyeon
Contributor

@deehyeon deehyeon commented Sep 15, 2025

Summary

Added GreyNoise provider to uncover, enabling users to run queries against the GreyNoise API.

Changes

New agent: greynoise.go, request.go, response.go
Integrated GREYNOISE_API_KEY into key management
Added CLI option -e greynoise
Added integration tests for GreyNoise

Notes

GreyNoise GNQL (/v3/gnql) requires Enterprise/Research API keys.
Free/community keys will not return results (logged with a clear message).

Summary by CodeRabbit

  • New Features

    • Added GreyNoise as a new data source with GNQL query support; streams IP, host, raw data when available and respects query limits.
    • New CLI flag --greynoise (-gn) and automatic inclusion when no engine/queries specified.
    • GreyNoise registered among supported agents and subject to per-engine rate limiting.
    • Provider/config and GREYNOISE_API_KEY env var support.
  • Tests

    • Integration test added for GreyNoise; runs when API key present and skips gracefully otherwise.

@coderabbitai
Contributor

coderabbitai bot commented Sep 15, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Adds GreyNoise GNQL integration: new agent and GNQL models, CLI option and runner support, provider key handling and env loading, session rate-limit entry, agent registration, and integration-test addition. No exported signatures were removed.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Integration tests: integration-tests/integration-test.go, integration-tests/source-test.go | Adds greynoise test entry and a new non-exported greynoiseTestcases that requires GREYNOISE_API_KEY, writes a provider config entry, runs a GNQL query, and skips on missing key/errors/zero results. Minor import formatting change. |
| Runner options & CLI: runner/options.go | Adds Options.GreyNoise goflags.StringSlice, --greynoise / -gn flag, integrates GreyNoise into engine auto-detection/validation and query assembly. |
| GreyNoise agent implementation: sources/agent/greynoise/greynoise.go, sources/agent/greynoise/request.go, sources/agent/greynoise/response.go | New exported Agent with Name() and Query(...) that validates key, pages GNQL via scroll, streams results, respects query limits, builds HTTP requests with auth, decodes typed GNQL response models, and maps items to sources.Result. Adds Request and rich Response models. |
| Provider keys and env loading: sources/keys.go, sources/provider.go | Adds Keys.GreyNoiseKey and Provider.GreyNoise []string (yaml:"greynoise"); loads GREYNOISE_API_KEY from env, selects random provider key into Keys, and updates emptiness/has-keys logic. |
| Session rate limiting: sources/session.go | Adds default rate-limit entry for greynoise (Key="greynoise", MaxCount=1, Duration=1s). |
| Agent registration & exports: uncover.go | Imports greynoise agent, registers greynoise.Agent{} in agent switch, and includes "greynoise" in AllAgents(). |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant CLI as CLI (flags)
  participant Runner as Runner / Options
  participant Session as Session / Provider
  participant Uncover as Agent Registry
  participant GN as GreyNoise Agent
  participant API as GreyNoise GNQL API

  CLI->>Runner: Parse --greynoise / -gn
  Runner->>Session: Build engines, include greynoise
  Session->>Session: Load provider keys / env (GREYNOISE_API_KEY)
  Session-->>Runner: Provide Keys (GreyNoiseKey)
  Runner->>Uncover: Instantiate agents (includes greynoise)
  Runner->>GN: Query(session, query)
  GN->>GN: Validate key, set size/limit, start scroll loop
  loop fetch pages
    GN->>API: GET /gnql?query=...&size=...&scroll=... (key header)
    API-->>GN: 200 JSON {request_metadata, data[]}
    GN->>Runner: Stream Result {IP, Host, Source="greynoise", Raw?}
  end
  GN-->>Runner: Close results channel
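
The scroll loop in the diagram can be sketched in isolation. This is a minimal illustration, not the agent's actual code: the `page` type, the `fetchFunc` signature, and the `collect` helper are hypothetical, and results are gathered into a slice here rather than streamed over a channel. It only shows how a page size can be capped so the overall limit is never exceeded.

```go
package main

import (
	"fmt"
	"strconv"
)

// page is a hypothetical, simplified stand-in for one GNQL response page.
type page struct {
	IPs    []string
	Scroll string // empty when no further pages exist
}

// fetchFunc is a hypothetical fetcher: given a scroll token and a page size,
// it returns one page of results.
type fetchFunc func(scroll string, size int) (page, error)

// collect follows scroll tokens until limit items are gathered or the
// fetcher reports no further pages. Each request asks for at most maxPage
// items and never for more than the remaining budget.
func collect(fetch fetchFunc, limit, maxPage int) ([]string, error) {
	var out []string
	scroll := ""
	for len(out) < limit {
		size := limit - len(out)
		if size > maxPage {
			size = maxPage
		}
		p, err := fetch(scroll, size)
		if err != nil {
			return out, err
		}
		for _, ip := range p.IPs {
			if len(out) == limit {
				break
			}
			out = append(out, ip)
		}
		if p.Scroll == "" || len(p.IPs) == 0 {
			break
		}
		scroll = p.Scroll
	}
	return out, nil
}

func main() {
	data := []string{"192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4", "192.0.2.5"}
	// Fake fetcher: the scroll token is simply the next offset as a string.
	fetch := func(scroll string, size int) (page, error) {
		start := 0
		if scroll != "" {
			n, err := strconv.Atoi(scroll)
			if err != nil {
				return page{}, err
			}
			start = n
		}
		end := start + size
		if end > len(data) {
			end = len(data)
		}
		p := page{IPs: data[start:end]}
		if end < len(data) {
			p.Scroll = strconv.Itoa(end)
		}
		return p, nil
	}
	ips, err := collect(fetch, 3, 2)
	fmt.Println(ips, err) // prints "[192.0.2.1 192.0.2.2 192.0.2.3] <nil>"
}
```

With a limit of 3 and a page cap of 2, the sketch issues two requests (sizes 2 and 1) and then stops without asking for a third page.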

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • release v1.1.0 #680 — Adds another search-agent integration with analogous changes (agent registration, options, provider/keys, session rate-limits, integration tests).
  • Add search Engine _ Onyphe #672 — Similar new engine integration with parallel edits across agents, provider keys, runner options, and tests.
  • Add search Engine _ Driftnet #678 — Parallel integration pattern (new search-agent) with comparable file-level modifications.

Suggested reviewers

  • dogancanbakir
  • ehsandeep

Poem

In burrows deep I sniff the GNQL breeze,
Keys in paw, I scroll with gentle ease.
Results hop out, one by one and true—
A rabbit coder bringing grey noise through. 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title "Add search Engine _ GreyNoise" correctly identifies the primary change (adding GreyNoise as a search engine/provider and CLI option) and therefore matches the PR's main intent and changeset. It is short and focused enough for a reviewer to understand the primary intent. The stray underscore and inconsistent capitalization reduce polish but do not make the title misleading. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (4)
sources/agent/greynoise/request.go (1)

3-9: Add JSON tags (+omitempty) to ensure correct GNQL serialization

Without tags, marshaling will emit Go field names. Recommend explicit JSON keys and omitting empty fields.

Apply:

 type Request struct {
-	Query      string // GNQL query string (required)
-	Size       int    // Number of results per page (1-10000, defaults to 10000)
-	Scroll     string // Scroll token for pagination
-	Quick      bool   // Quick=true returns only IP and classification/trust level
-	ExcludeRaw bool   // Optional: request without heavy raw_data
+	Query      string `json:"query"`                   // GNQL query string (required)
+	Size       int    `json:"size,omitempty"`          // Number of results per page (1-10000, defaults to 10000)
+	Scroll     string `json:"scroll,omitempty"`        // Scroll token for pagination
+	Quick      bool   `json:"quick,omitempty"`         // Quick=true returns only IP and classification/trust level
+	ExcludeRaw bool   `json:"exclude_raw,omitempty"`   // Optional: request without heavy raw_data
 }

If GNQL expects different keys (e.g., exclude vs exclude_raw), please adjust accordingly.
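
The effect of the tags can be checked with a small standalone program. This is a sketch with hypothetical `untagged` and `tagged` types that mirror only three of the fields above, and the query value is a placeholder rather than real GNQL:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// untagged mirrors the struct before the suggestion: no JSON tags.
type untagged struct {
	Query  string
	Size   int
	Scroll string
}

// tagged mirrors the struct after the suggestion: explicit keys, omitempty.
type tagged struct {
	Query  string `json:"query"`
	Size   int    `json:"size,omitempty"`
	Scroll string `json:"scroll,omitempty"`
}

// mustJSON marshals v and panics on error; acceptable for a demo only.
func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(mustJSON(untagged{Query: "example"})) // {"Query":"example","Size":0,"Scroll":""}
	fmt.Println(mustJSON(tagged{Query: "example"}))   // {"query":"example"}
}
```

Without tags, encoding/json uses the Go field names as keys and emits zero values; with tags and omitempty, only the populated field appears under its lowercase key.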

sources/agent/greynoise/response.go (1)

24-25: Make heavy/optional substructures pointers with omitempty

Reduces allocations and output churn when ExcludeRaw is used or when business context is absent.

 type GNQLItem struct {
 	IP                          string                      `json:"ip"`
-	InternetScannerIntelligence InternetScannerIntelligence `json:"internet_scanner_intelligence"`
-	BusinessServiceIntelligence BusinessServiceIntelligence `json:"business_service_intelligence"`
+	InternetScannerIntelligence InternetScannerIntelligence  `json:"internet_scanner_intelligence"`
+	BusinessServiceIntelligence *BusinessServiceIntelligence `json:"business_service_intelligence,omitempty"`
 }
@@
-	Tags     Tags     `json:"tags"`
-	RawData  RawData  `json:"raw_data"`
+	Tags     Tags     `json:"tags"`
+	RawData  *RawData `json:"raw_data,omitempty"`

Also applies to: 44-47
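
One reason the pointer matters here: in encoding/json, omitempty never omits a struct value, only a nil pointer (or other empty kinds such as empty strings and slices). A minimal sketch with hypothetical, cut-down types (`raw`, `byValue`, `byPointer` are illustrative, not the PR's models):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// raw is a hypothetical stand-in for a heavy nested structure.
type raw struct {
	Note string `json:"note"`
}

// byValue embeds raw by value; omitempty has no effect on struct values.
type byValue struct {
	IP  string `json:"ip"`
	Raw raw    `json:"raw_data,omitempty"`
}

// byPointer holds raw by pointer; a nil pointer is omitted.
type byPointer struct {
	IP  string `json:"ip"`
	Raw *raw   `json:"raw_data,omitempty"`
}

// encode marshals v and panics on error; acceptable for a demo only.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(encode(byValue{IP: "192.0.2.1"}))   // {"ip":"192.0.2.1","raw_data":{"note":""}}
	fmt.Println(encode(byPointer{IP: "192.0.2.1"})) // {"ip":"192.0.2.1"}
}
```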

sources/agent/greynoise/greynoise.go (2)

136-154: Non-2xx handling here is unreachable due to Session.Do

Session.Do already errors on any non-200, so this block never runs; wrap the Session.Do error to add GNQL guidance and drop the redundant check.

Apply this diff:

 resp, err := session.Do(req, agent.Name())
-if err != nil {
-    return nil, err
-}
+if err != nil {
+    // Provide clearer hint for auth/plan errors while preserving rate limiter usage
+    if resp != nil && (resp.StatusCode == http.StatusUnauthorized || resp.StatusCode == http.StatusForbidden || resp.StatusCode == http.StatusNotFound) {
+        b, _ := io.ReadAll(resp.Body)
+        msg := strings.TrimSpace(string(b))
+        return nil, fmt.Errorf("GreyNoise GNQL request failed: status=%d. Your API key may not include GNQL access (Enterprise key required). body=%s", resp.StatusCode, msg)
+    }
+    return nil, err
+}
 defer resp.Body.Close()
-
-if resp.StatusCode < 200 || resp.StatusCode > 299 {
-    b, _ := io.ReadAll(resp.Body)
-    msg := strings.TrimSpace(string(b))
-
-    if resp.StatusCode == http.StatusUnauthorized || resp.StatusCode == http.StatusForbidden || resp.StatusCode == http.StatusNotFound {
-        return nil, fmt.Errorf(
-            "GreyNoise GNQL request failed: status=%d. Your API key may not include GNQL access (Enterprise key required). body=%s",
-            resp.StatusCode, msg,
-        )
-    }
-
-    return nil, fmt.Errorf("greynoise GNQL request failed: status=%d body=%s", resp.StatusCode, msg)
-}

162-169: Use project logger for debug output

Prefer gologger.Debug() over fmt.Fprintf(os.Stderr, ...) for consistent observability and easy silencing in non-debug runs.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6a159bf and c6ffccd.

⛔ Files ignored due to path filters (1)
  • sources/agent/greynoise/example.json is excluded by !**/*.json
📒 Files selected for processing (10)
  • integration-tests/integration-test.go (1 hunks)
  • integration-tests/source-test.go (1 hunks)
  • runner/options.go (7 hunks)
  • sources/agent/greynoise/greynoise.go (1 hunks)
  • sources/agent/greynoise/request.go (1 hunks)
  • sources/agent/greynoise/response.go (1 hunks)
  • sources/keys.go (2 hunks)
  • sources/provider.go (4 hunks)
  • sources/session.go (1 hunks)
  • uncover.go (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
sources/agent/greynoise/request.go (1)
sources/agent.go (1)
  • Query (3-6)
uncover.go (2)
sources/agent/greynoise/greynoise.go (1)
  • Agent (21-21)
sources/agent.go (1)
  • Agent (8-11)
integration-tests/source-test.go (2)
uncover.go (1)
  • New (57-113)
testutils/integration.go (1)
  • RunUncoverAndGetResults (10-38)
sources/agent/greynoise/greynoise.go (7)
sources/agent.go (1)
  • Query (3-6)
sources/session.go (1)
  • Session (39-44)
sources/keys.go (1)
  • Keys (3-23)
uncover.go (1)
  • New (57-113)
sources/agent/greynoise/request.go (1)
  • Request (3-9)
sources/agent/greynoise/response.go (4)
  • InternetScannerIntelligence (28-47)
  • Metadata (50-75)
  • RequestMetadata (10-18)
  • Response (4-7)
sources/util.go (1)
  • NewHTTPRequest (10-17)
🔇 Additional comments (18)
runner/options.go (7)

65-65: Add GreyNoise flag field — LGTM

Field naming is consistent with existing engines.


77-77: Engine help updated to include greynoise

Looks good. Please ensure README/usage docs mention -e greynoise.


98-99: Register --greynoise (-gn) flag — LGTM

Flag wiring matches other engines.


174-176: Default-engine fallback includes greynoise — LGTM

Coverage for the new slice is correct.


247-249: Validation: “no query provided” includes greynoise — LGTM


276-277: Validation: “no engine specified” includes greynoise — LGTM


320-321: Agent name matches CLI/engine key — no action needed
Agent.Name() returns exactly "greynoise".

uncover.go (3)

16-16: Import greynoise agent — LGTM


93-95: Register greynoise agent in factory — LGTM

Switch-case wiring consistent with other agents.


199-200: Expose greynoise via AllAgents — LGTM; default rate-limit entry present.
DefaultRateLimits already includes "greynoise" (MaxCount: 1, Duration: time.Second); no further changes required.

sources/keys.go (2)

22-22: Add GreyNoiseKey — LGTM

Naming aligns with existing Key/Token conventions.


43-45: LGTM — GreyNoiseKey included and provider wiring verified
Provider loads GREYNOISE_API_KEY into provider.GreyNoise (line 176) and GetKeys assigns it to keys.GreyNoiseKey (lines 123–126). No changes required.

sources/agent/greynoise/response.go (1)

110-121: Confirmed HTTP field keys — no change required

Verified keys: raw_data.http.useragent, raw_data.http.request_header, raw_data.http.request_cookies, raw_data.http.cookie_keys — the struct tags already match GreyNoise schema.

integration-tests/integration-test.go (1)

35-35: Add greynoise suite to default tests — looks good

Test wiring is consistent with other engines.

sources/provider.go (2)

39-40: Provider config: new greynoise key list — OK

YAML key matches the written config in tests.


198-199: HasKeys includes greynoise — OK

Keeps feature parity with other providers.

integration-tests/source-test.go (1)

268-294: GreyNoise integration test behavior — OK

Graceful handling for Community keys and GNQL access is sensible. Test file write/cleanup matches existing patterns.

sources/session.go (1)

35-35: GreyNoise GNQL rate limit — 1 req/s is a conservative default; add 429 handling

GreyNoise docs do not publish a fixed per‑second GNQL limit; usage is counted as "Searches" and exceeding plan quotas returns 429s. 1 req/s is a reasonable conservative default but not guaranteed for all plans/queries.

  • Action: keep the greynoise entry (MaxCount: 1, Duration: time.Second) as a conservative default, implement retries with exponential backoff on 429 responses, and record/honor rate/usage headers; confirm against your GreyNoise plan if you require higher throughput.

Location: sources/session.go:35 (greynoise entry).

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (3)
sources/agent/greynoise/request.go (1)

3-8: Add GoDoc and basic validation for Request.

  • Exported type lacks a GoDoc comment.
  • Enforce size bounds (1–10000) and normalize defaults to avoid server-side 400s.

Apply:

 package greynoise

+// Request contains GNQL request parameters.
 type Request struct {
 	Query      string // GNQL query string (required)
 	Size       int    // Number of results per page (1-10000, defaults to 10000)
 	Scroll     string // Scroll token for pagination
 	ExcludeRaw bool   // Optional: request without heavy raw_data
 }
+
+// Normalize clamps fields to API-accepted ranges and fills defaults.
+func (r *Request) Normalize() {
+	if r.Size <= 0 || r.Size > 10000 {
+		r.Size = 10000
+	}
+}
sources/agent/greynoise/response.go (2)

27-35: Optional: parse timestamps into time.Time with custom layout.

GN timestamps are string-encoded; typed parsing helps downstream filtering/sorting.

Example helper:

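import (
	"encoding/json"
	"time"
)

// GNTime wraps time.Time so string-encoded timestamps decode directly.
type GNTime struct{ time.Time }

// gnLayout is the layout used to parse the timestamp strings.
const gnLayout = "2006-01-02 15:04:05"

// UnmarshalJSON parses a JSON string in gnLayout as a UTC time.
func (t *GNTime) UnmarshalJSON(b []byte) error {
	var s string
	if err := json.Unmarshal(b, &s); err != nil {
		return err
	}
	tt, err := time.ParseInLocation(gnLayout, s, time.UTC)
	if err != nil {
		return err
	}
	t.Time = tt
	return nil
}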

Then use GNTime for FirstSeen, LastSeen, LastSeenTS, LastUpdated.

Also applies to: 137-147


27-47: Guard against nulls in fields that are sometimes unknown.

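If the API returns null for fields like Actor, VPNService, Classification, Datacenter, etc., unmarshalling into non-pointer primitives does not fail: encoding/json leaves the field at its zero value, so null becomes indistinguishable from an empty string or false. Consider pointers for nullable fields where that distinction matters.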

Would you like a quick pass converting known-nullable fields to pointers based on sample responses?

Also applies to: 49-75
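
For reference, encoding/json treats a JSON null as a no-op for non-pointer fields and sets pointer fields to nil, without returning an error in either case. The sketch below uses hypothetical `valueItem` and `pointerItem` types with a single `actor` field to show what each form can and cannot distinguish:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// valueItem decodes actor into a plain string.
type valueItem struct {
	Actor string `json:"actor"`
}

// pointerItem decodes actor into a *string, so null stays observable.
type pointerItem struct {
	Actor *string `json:"actor"`
}

// describe decodes doc into both types and summarizes the result.
func describe(doc string) (string, error) {
	var v valueItem
	var p pointerItem
	if err := json.Unmarshal([]byte(doc), &v); err != nil {
		return "", err
	}
	if err := json.Unmarshal([]byte(doc), &p); err != nil {
		return "", err
	}
	return fmt.Sprintf("value=%q pointerIsNil=%t", v.Actor, p.Actor == nil), nil
}

func main() {
	for _, doc := range []string{`{"actor":null}`, `{"actor":""}`} {
		s, err := describe(doc)
		if err != nil {
			panic(err)
		}
		fmt.Println(doc, "=>", s)
	}
	// prints:
	// {"actor":null} => value="" pointerIsNil=true
	// {"actor":""} => value="" pointerIsNil=false
}
```

The plain string ends up empty in both cases; only the pointer form tells null apart from an empty value.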

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c6ffccd and f0178b2.

📒 Files selected for processing (4)
  • sources/agent/greynoise/greynoise.go (1 hunks)
  • sources/agent/greynoise/request.go (1 hunks)
  • sources/agent/greynoise/response.go (1 hunks)
  • sources/provider.go (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • sources/agent/greynoise/greynoise.go
  • sources/provider.go
🧰 Additional context used
🧬 Code graph analysis (1)
sources/agent/greynoise/request.go (1)
sources/agent.go (1)
  • Query (3-6)
🔇 Additional comments (3)
sources/agent/greynoise/response.go (2)

3-7: Top-level shape LGTM.

Response, RequestMetadata, GNQLItem are modeled cleanly and match expectations.

If you have a sample fixture, I can generate a table-test to unmarshal and assert counts/scroll.


9-18: Change RequestMetadata.RestrictedFields to []string (not [][]string).
GreyNoise GNQL v3 shows request_metadata.restricted_fields is a JSON array of strings; update the struct in sources/agent/greynoise/response.go (RequestMetadata.RestrictedFields).

Likely an incorrect or invalid review comment.

sources/agent/greynoise/request.go (1)

3-8: Limit → Request.Size mapping verified; Normalize() not present.

  • GreyNoise maps sources.Query.Limit → pageSize and sets Request.Size (sources/agent/greynoise/greynoise.go:39–49) and adjusts pageSize during pagination to enforce Limit.
  • No Normalize() function or call found in the repo; if upstream requires normalizing the query, call query.Normalize() (or add equivalent) at the start of greynoise.Agent.Query (sources/agent/greynoise/greynoise.go).

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
sources/agent/greynoise/request.go (2)

3-8: Add url tags for query encoding (or confirm your encoder).

Prevents accidental Query=...&Size=... casing if a url-tag encoder is used.

Apply one of the following:

Option A — add url tags (keep JSON tags):

 type Request struct {
-    Query      string `json:"query"`
-    Size       int    `json:"size,omitempty"`
-    Scroll     string `json:"scroll,omitempty"`
-    ExcludeRaw bool   `json:"exclude_raw,omitempty"`
+    Query      string `json:"query" url:"query"`
+    Size       int    `json:"size,omitempty" url:"size,omitempty"`
+    Scroll     string `json:"scroll,omitempty" url:"scroll,omitempty"`
+    ExcludeRaw bool   `json:"exclude_raw,omitempty" url:"exclude_raw,omitempty"`
 }

Option B — if ExcludeRaw only toggles the endpoint (/v3/gnql/metadata) and should not be sent on the wire, don’t tag it:

 type Request struct {
-    Query      string `json:"query"`
-    Size       int    `json:"size,omitempty"`
-    Scroll     string `json:"scroll,omitempty"`
-    ExcludeRaw bool   `json:"exclude_raw,omitempty"`
+    Query      string `json:"query" url:"query"`
+    Size       int    `json:"size,omitempty" url:"size,omitempty"`
+    Scroll     string `json:"scroll,omitempty" url:"scroll,omitempty"`
+    ExcludeRaw bool
 }
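
If no tag-driven encoder is involved, the query string can also be built explicitly with net/url, which sidesteps the casing question entirely. A minimal sketch: `buildGNQLURL` is a hypothetical helper, the host is a placeholder (only the /v3/gnql path comes from this PR), and the parameter names `query`, `size`, and `scroll` follow the tags proposed above.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// Request is a cut-down copy of the struct under review.
type Request struct {
	Query  string
	Size   int
	Scroll string
}

// buildGNQLURL appends the request fields to base as lowercase query
// parameters, skipping size and scroll when they are unset.
func buildGNQLURL(base string, r Request) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	q := url.Values{}
	q.Set("query", r.Query)
	if r.Size > 0 {
		q.Set("size", strconv.Itoa(r.Size))
	}
	if r.Scroll != "" {
		q.Set("scroll", r.Scroll)
	}
	u.RawQuery = q.Encode() // Encode sorts keys and escapes values
	return u.String(), nil
}

func main() {
	// api.example.com is a placeholder host, not the real endpoint.
	s, err := buildGNQLURL("https://api.example.com/v3/gnql", Request{Query: "example query", Size: 100})
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // prints "https://api.example.com/v3/gnql?query=example+query&size=100"
}
```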

3-3: Add a Go doc comment for the exported type.

Tiny lint/readability win.

-package greynoise
+package greynoise
+
+// Request holds GNQL parameters. ExcludeRaw toggles /v3/gnql/metadata (no raw data).
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f0178b2 and e3da069.

📒 Files selected for processing (2)
  • sources/agent/greynoise/request.go (1 hunks)
  • sources/agent/greynoise/response.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • sources/agent/greynoise/response.go
🧰 Additional context used
🧬 Code graph analysis (1)
sources/agent/greynoise/request.go (1)
sources/agent.go (1)
  • Query (3-6)
🔇 Additional comments (1)
sources/agent/greynoise/request.go (1)

3-8: Struct and snake_case mapping look good.

Fields and JSON tags align with GNQL names.

If this struct is used for query-string encoding (GET), please confirm the encoder respects json tags; many use url tags instead. See next comment for a safe tweak.

@dogancanbakir dogancanbakir self-requested a review September 16, 2025 07:16
@ehsandeep ehsandeep changed the base branch from main to dev September 21, 2025 16:24
@dogancanbakir dogancanbakir requested review from Mzack9999 and dwisiswant0 and removed request for dogancanbakir September 26, 2025 10:01
@dogancanbakir dogancanbakir merged commit 15e626f into projectdiscovery:dev Nov 25, 2025
9 checks passed
@coderabbitai coderabbitai bot mentioned this pull request Nov 25, 2025
4 participants