Your application attack surface is growing faster than your security team can keep up with. AI coding agents are shipping new APIs and microservices daily. Developers are deploying multiple times a day. And your security team still can’t answer a basic question: what applications and APIs actually exist, and which ones have been tested?
This is the problem StackHawk was built to solve. StackHawk is an AppSec Intelligence Platform that combines three capabilities: automatic attack surface discovery from source code, runtime security testing in CI/CD pipelines, and centralized oversight of application risk posture. Together, these three pillars give security teams complete visibility into what exists, what’s vulnerable, and what’s been fixed.
This post walks through how each piece works, from discovering your APIs to testing them for exploitable vulnerabilities to tracking risk across your entire application portfolio.
The Three Pillars of the StackHawk Platform
StackHawk is organized around three core functions that work together as a continuous loop: Discover, Test, and Oversee.
Attack Surface Discovery answers the question “what do we have?” It connects to your source code repositories and uses AI to map every API, microservice, and web application in your organization.
Runtime Testing answers the question “what’s actually exploitable?” It runs dynamic security tests against your applications in CI/CD pipelines, staging environments, and even local developer workstations.
Oversight & Intelligence answers the question “are we actually reducing risk?” It tracks testing coverage, vulnerability remediation, and overall risk posture across your application portfolio.
Most security programs struggle because these three functions live in separate tools with no connection between them. StackHawk brings them into a single platform so that a discovered API flows directly into a testing pipeline, and test results feed directly into risk tracking.

Attack Surface Discovery: Know What You Have
You can’t secure what you can’t see. For most organizations, the first problem isn’t that they have vulnerabilities. It’s that they don’t know how many applications and APIs they even have.
Traditional discovery tools wait for network traffic to reveal APIs in production. By that point, the API has already been exposed to the internet. StackHawk takes a different approach with Attack Surface Discovery. It connects directly to your source code repositories in GitHub, GitLab, Bitbucket, or Azure Repos and analyzes your code to identify every testable application before it reaches production.
What gets discovered:
- API endpoints: REST, GraphQL, gRPC, JSON-RPC, and WebSockets identified directly from the source code
- Sensitive data exposure: APIs handling PII, PCI, or HIPAA data are flagged at the code level, so your team knows which applications carry the highest risk
- Languages and frameworks: StackHawk detects the technology stack (Spring Boot, Rails, Django, Express, and others) to understand what type of application each repository contains
- Change velocity: Repositories with high commit activity are surfaced so your team can prioritize fast-moving, high-risk codebases

This discovery is powered by StackHawk’s AI engine. The engine analyzes code structures and metadata across your repositories to distinguish testable applications and automatically maps repositories to applications, handling common patterns such as monorepos with multiple services, single-repo applications, and microservice architectures where multiple repos feed into a single logical application.
OpenAPI spec generation is another capability available through the discovery process. StackHawk can auto-generate OpenAPI specifications directly from your source code, eliminating the manual work of writing and maintaining specs. They also feed directly into StackHawk’s testing engine, bridging the gap between discovering an API and actually scanning it for vulnerabilities.
For small AppSec teams (which is most AppSec teams), this means you can focus your limited resources on the 15 applications that matter, rather than manually auditing the 100 repositories where those applications’ code components live.
Runtime Testing: Find What’s Actually Exploitable
Discovery tells you what exists. Runtime testing tells you what’s actually vulnerable.
StackHawk’s scanner, HawkScan, is a dynamic application security testing (DAST) tool designed to run inside your development workflow, not after deployment. HawkScan runs as a hosted scanner (on StackHawk’s infrastructure), Docker container, or via the Hawk CLI, and it tests your applications in their running state by simulating real attack patterns against each discovered endpoint.
How HawkScan Works
HawkScan operates through a straightforward three-step process: discover endpoints, test them, and report findings.

Step 1: Scan Discovery. Before testing for vulnerabilities, HawkScan needs to know what endpoints exist. It supports multiple discovery methods depending on your application type:
- Web applications: A base spider crawls your application, starting from a provided URL, recursively discovering HTML pages and routes
- REST APIs: HawkScan consumes your OpenAPI/Swagger specification to map every route and understand expected inputs
- GraphQL APIs: An introspection query retrieves all available queries, mutations, and input types from your GraphQL server
- gRPC services: Schema reflection and file descriptor sets provide the service structure for testing
- JSON-RPC: Endpoint mapping and method discovery for JSON-RPC protocol APIs
- SOAP services: WSDL parsing maps the operations and expected message formats
- Custom sources: Postman collections, Burp Suite site maps, and HAR files can all feed discovery
Step 2: Active Testing. As endpoints are discovered, HawkScan runs a set of security test cases against each one. These tests simulate real attacker behavior. HawkScan injects SQL into input fields, tests for cross-site scripting, probes for authentication bypass, checks for insecure direct object references, and exercises vulnerabilities from the OWASP Top 10 and beyond.
The key difference from static analysis (SAST) is that HawkScan validates whether a vulnerability is actually exploitable at runtime, not just theoretically present in code. A SAST tool might flag a code pattern as risky. HawkScan and its DAST capabilities prove whether that pattern can actually be exploited through a running application.
Step 3: Findings and Remediation. Results flow into the StackHawk Platform, where your team can review, triage, and manage findings. Each finding includes the HTTP request and response that triggered it, a severity classification (High, Medium, or Low), links to relevant OWASP guidance, and an auto-generated cURL command so developers can reproduce the vulnerability locally and step through the request in their debugger. StackHawk also provides remediation guidance written to match the target language where the fix would be applied, not generic security jargon.
Findings follow a structured triage workflow. Every new finding starts as New (unprocessed). Your team can then mark it as Assigned with a link to a Jira ticket, Risk Accepted if the business decides not to fix it, or False Positive to reduce noise in future scans. Comments added during triage create an audit trail that integrates directly with your issue tracking tools.

Configuration
All of this is driven by a single configuration file: stackhawk.yml. This contains all of the details HawkScan requires to test your application. From the base configuration created when you add your application to StackHawk, you can extend it to handle authenticated scanning (OAuth 2.0, API keys, session tokens, and custom multi-step auth flows), tune spider behavior, set scan duration limits, and enable specialized testing for specific API types. Configuration can also be managed from the StackHawk Platform directly through hosted configuration, so teams don’t have to manage local YAML files if they prefer a centralized approach.
Two features make scan configuration especially efficient. Scan policies let you define reusable sets of security tests tailored to your application types. You can create organization-level policies that standardize which vulnerability checks run across all your applications, then customize individual policies by including or excluding specific test plugins. Technology Flags let you tell HawkScan which databases, languages, frameworks, and web servers your application actually uses. By disabling tests for technologies that don’t apply, you get faster scans and fewer false positives without sacrificing coverage of real vulnerabilities.
For sensitive credentials like API tokens and passwords that scans need at runtime, StackHawk’s Secrets Manager stores them securely so they’re never exposed as plain-text environment variables on the scanner host. Secrets are automatically applied during scans and redacted in all logs.
Where Testing Happens
One of the most significant changes in modern DAST is that testing no longer needs to wait for a staging or production deployment. StackHawk is designed to run across four stages of the development lifecycle:

- CI/CD pipelines: Scans run automatically on every pull request or merge, giving developers fast feedback while they’re still in context
- Staging environments: Full-depth testing for authorization bypasses, business logic flaws, and privilege escalation that you wouldn’t run against production
- Production: Non-invasive, prod-safe policies that won’t disrupt live users
- Local development: Developers can run HawkScan locally against their application during development for immediate security feedback
For teams that need to scan applications without local setup or pipeline integration, StackHawk also offers cloud-deployed scanning directly from StackHawk’s infrastructure. This is useful for legacy applications without modern pipelines, inherited systems that need immediate security validation, or compliance-driven production scans where teams want results before completing a full CI/CD integration.
StackHawk scans complete in minutes. This is what makes CI/CD integration practical. A scan that takes hours blocks the pipeline. A scan that finishes in minutes fits naturally into a pull request workflow.
Advanced Testing Capabilities
Beyond standard DAST scanning, StackHawk includes specialized testing for the types of vulnerabilities that traditional scanners miss.
Business Logic Testing
Traditional DAST tools scan with a single user session. They can’t answer the question “Can User A access User B’s data?” StackHawk’s Business Logic Testing automates multi-user authorization testing to catch BOLA (Broken Object-Level Authorization) and BFLA (Broken Function-Level Authorization) vulnerabilities.
You configure multiple user profiles with different roles and access levels. StackHawk analyzes your OpenAPI spec, understands how your API endpoints relate to each other, and orchestrates tests that realistically simulate different users interacting with the same resources. When an authorization flaw is found, the report includes complete evidence: the request and response from both the privileged and unprivileged user, so your team can see exactly what happened.

This type of testing is typically reserved for manual penetration tests, which are expensive and don’t scale. Automating it means you can run authorization checks continuously across your entire portfolio.
LLM Application Security Testing
As teams integrate large language models into their applications, new attack vectors appear that static analysis simply cannot detect. You can’t find prompt injection vulnerabilities by reading source code. You find them by testing how the LLM actually behaves when it receives adversarial inputs.
StackHawk’s LLM security testing covers five categories from the OWASP LLM Top 10:
- Prompt injection: Testing whether attackers can manipulate prompts to override system instructions, bypass safety controls, or extract unauthorized information
- Sensitive data disclosure: Probing whether the LLM leaks customer PII, API keys, internal system details, or proprietary business logic in its responses
- Improper output handling: Checking whether unvalidated LLM outputs get used in SQL queries, system commands, or downstream API calls, turning the model into an injection vector
- System prompt leakage: Detecting whether attackers can extract hidden system instructions, configuration details, or safety bypass mechanisms
- Unbounded consumption: Testing for missing rate limits or resource controls that could allow attackers to exhaust API costs or launch denial-of-service attacks
These tests run as part of your existing runtime testing workflow in CI/CD, not as a separate tool or manual exercise. When a developer ships a new LLM-powered feature, it gets tested for AI-specific vulnerabilities automatically alongside the standard DAST scan.
Oversight & Intelligence: Track Your Risk Posture
Testing finds vulnerabilities. Oversight tells you whether your security program is actually working.
The Oversight & Intelligence layer provides centralized visibility into three things:
Testing coverage. Which applications have been tested? Which haven’t? Where are the gaps? This is the fundamental question that most AppSec teams struggle to answer when discovery, testing, and reporting live in separate tools.
Risk tracking. Every vulnerability is tracked from initial discovery through validated remediation. Your team can see the current risk distribution across your application portfolio at any point, not just after a quarterly report.
Program effectiveness. Board-level metrics show whether your AppSec program is reducing risk over time or just generating scan activity. This is the difference between reporting “we ran 500 scans” and reporting “we reduced critical vulnerabilities by 40% across our API portfolio.”

The platform backs this up with two levels of reporting. Scan reports provide detailed findings for a specific application and environment, including every vulnerability with its severity, category, triage status, and remediation guidance. Summary reports give an executive-level view of the most recent scan results across your entire application portfolio, showing vulnerabilities by application and environment in a single document. Both can be exported as PDF or JSON for compliance workflows and stakeholder reviews.
For AppSec leaders who need to report upward, this layer turns raw scan data into evidence that your security investment is working.
How It All Connects
The real power of the StackHawk platform comes from how these three pillars feed each other in a continuous loop.
Discovery feeds testing. When StackHawk identifies a new API in your source code, it doesn’t just add it to an inventory list. It auto-generates an OpenAPI spec and connects it to your scanning configuration so the new API can be tested automatically.
Testing feeds oversight. Every scan result flows into the oversight layer, updating your real-time risk posture. A newly discovered vulnerability in a critical API immediately surfaces in your risk dashboard.
Oversight feeds prioritization. Coverage gaps identified in the oversight layer tell your team exactly which applications need discovery and testing attention. You stop guessing where to focus and start working from data.
This loop matters because application security is a moving target. New APIs ship daily, code changes introduce new risks, and developers fix some vulnerabilities while introducing others.
A point-in-time audit gives you a snapshot. StackHawk gives you a continuous picture.
Security Testing in AI Coding Environments
StackHawk also extends runtime testing into AI-powered development environments through Model Context Protocol (MCP) integration. Developers using Cursor, Claude Code,Github Copilot, or Windsurf can run security tests, review findings, and generate fixes without leaving their editor. The MCP integration translates natural language commands into StackHawk API calls, so a developer can type “scan my app for vulnerabilities” and get actionable results in context. This brings DAST testing even further left, into the moment when code is being written, not just when it’s committed.
Integrations and Workflow
StackHawk connects to the tools your team already uses:
- Source control: GitHub, GitLab, Bitbucket, and Azure Repos for attack surface discovery
- CI/CD: GitHub Actions, Jenkins, CircleCI, Bitbucket Pipelines, AWS CodePipeline, Buildkite, and other major platforms for automated scanning
- Issue tracking: Jira for sending findings directly to development teams
- Communication: Slack for scan notifications and finding alerts
- Compliance and observability: Vanta for compliance workflows, Datadog for monitoring, and more coming soon for cloud security posture
- Complementary security tools: Semgrep, Snyk, and Endor Labs for teams running multiple security tools in parallel
- AI coding environments: Model Context Protocol (MCP) integration with Cursor, Claude Code, and Windsurf for conversational security testing directly inside AI-powered editors
Findings can be triaged directly in the StackHawk Platform and classified as Risk Accepted, False Positive, or sent to Jira for developer assignment. The platform supports team-based access control with configurable roles and SAML SSO for enterprise authentication.
Getting Started With StackHawk
StackHawk is designed to get you from sign-up to your first scan quickly. The minimum configuration is a three-line YAML file pointing at your running application, but the onboarding wizard will quickly build you out a complete and ready-to-go config to add to your project in a few clicks. From there, you can progressively add authenticated scanning, API-specific configurations, CI/CD integration, and custom test policies as your needs grow.
To see how StackHawk maps to your application portfolio and security workflow, schedule a demo or explore the documentation to get started on your own.