FastAPI Services Documentation

Defects4C: Benchmarking Large Language Model Repair Capability with C/C++ Bugs 👋

⚠️⚠️⚠️ We are updating the online platform. You do not need to install and deploy docker-container locally—just call the API(see http_tutorial.py) to test your LLM results. So far, it supports pass@1 for each defect. If you prefer pass@k>1, you will have to deploy locally as our computation budget is limited. ⚠️⚠️⚠️

Most existing Automated Program Repair (APR) research focuses on Java programs, primarily through Defects4J. Despite the significant prevalence of C/C++ vulnerabilities, extensive research on the automated repair of such vulnerabilities is lacking.

To fill this critical gap, we introduce Defects4C, a high-quality executable benchmark for C/C++ defects. It consists of 248 buggy functions and 102 vulnerable functions, paired with test cases for reproduction.

Scenario

To assess the effectiveness of existing state-of-the-art APR techniques in repairing C/C++ faults, we conduct a comprehensive empirical study using 24 state-of-the-art LLMs with Defects4C in two different scenarios:

Single-round repair
Conversation-based repair for evaluation.

FastAPI Services Documentation

API Overview

Bug Helper Service API (`http://localhost:8000`)

assume the the host of API is localhost as paper undering Anonymous Peer Review, we will public real API host soon when review done.

Method	Endpoint	Description
`GET`	`/projects`	Retrieve all available projects
`POST`	`/reproduce`	Initiate bug reproduction
`POST`	`/fix_with_patch(deperated /fix)`	Apply patch and test fix
`GET`	`/status/{handle}`	Get task status and results
`GET`	`/cache/status`	Check Redis cache status
`DELETE`	`/cache/{redis_key}`	Clear specific cache entry
`GET`	`/all_tasks`	Retrieve all active tasks

Unified Patch Service API (`http://localhost:8000`)

Method	Endpoint	Description
`GET`	`/health`	Service health check
`POST`	`/extract_anchor_patch(deperated /build_patch)`	Generate and apply patches

📋 Overview

This document covers two complementary FastAPI services designed for comprehensive software bug workflows:

Bug Helper Service - Bug reproduction and fixing service with Redis caching
Unified Patch Service - Advanced patch generation and application service

Both services provide REST APIs for managing complete software bug workflows, from initial reproduction through patch generation and application.

🔧 Bug Helper Service API

The Bug Helper Service is a FastAPI-based service for reproducing and fixing software bugs with Redis-based caching support for improved performance and reliability.

Base URL: http://localhost:8000
Version: 1.0.0

Core Endpoints

GET /projects

def get_projects():
    """
    Retrieve all available projects for bug reproduction.
    
    Returns:
        dict: Dictionary containing list of available project names
    
    Raises:
        HTTPException: 500 if projects cannot be loaded
    """

Description: Retrieves a list of all available projects that can be used for bug reproduction and fixing. This endpoint requires no authentication and provides the foundation for other operations by listing valid project names.

Input Parameters:

None required

Output Format:

Type: dict
Structure: {"projects": List[str]}

Example:

# Request
GET /projects

{
  "projects": [
    "libxml2",
    "openssl", 
    "curl",
    "nginx",
    "apache"
  ]
}

POST /reproduce

def reproduce_bug(bug_id: str, is_force_cleanup: bool = True):
    """
    Initiate bug reproduction for a specific bug ID.
    
    Args:
        bug_id (str): Bug identifier in format "project@commit_sha"
        is_force_cleanup (bool, optional): Force cleanup before reproduction. Defaults to True.
    
    Returns:
        dict: Dictionary containing task handle for status tracking
        
    Raises:
        HTTPException: 400 if bug_id format is invalid
        HTTPException: 404 if project not found
        HTTPException: 500 if reproduction fails to start
    """

Description: Initiates bug reproduction for a specific bug ID. This endpoint queues a background task to reproduce the bug environment and run tests. The bug reproduction process includes environment setup, dependency installation, and test execution.

Input Parameters:

bug_id (str, required): Bug identifier following "project@commit_sha" format
is_force_cleanup (bool, optional): Whether to force cleanup before reproduction (default: true)

Output Format:

Type: dict
Structure: {"handle": str}

Example:

# Request
POST /reproduce
Content-Type: application/json

{
  "bug_id": "libxml2@a1b2c3d4e5f6789012345678901234567890abcd",
  "is_force_cleanup": true
}

{
  "handle": "abc123def456789012345678901234567890uvwx"
}

POST /fix_with_patch or /fix

def fix_bug(bug_id: str, patch_path: str):
    """
    Apply a patch to fix a bug and test the fix.
    
    Args:
        bug_id (str): Bug identifier in format "project@commit_sha"
        patch_path (str): File system path to the patch file
    
    Returns:
        dict: Dictionary containing task handle and Redis key for status tracking
        
    Raises:
        HTTPException: 400 if bug_id format is invalid or patch_path doesn't exist
        HTTPException: 500 if patch application fails
    """

Description: Applies a patch to fix a bug and tests the fix. This endpoint uses Redis caching to avoid redundant processing of the same patch. Results are cached based on the combination of bug_id and patch_path, making subsequent requests with identical parameters return immediately.

Input Parameters:

bug_id (str, required): Bug identifier in "project@commit_sha" format
patch_path (str, required): File system path to the patch file

Output Format:

Type: dict
Structure: {"handle": str, "redis_key": str}

Example:

# Request
POST /fix
Content-Type: application/json

{
  "bug_id": "openssl@f1e2d3c4b5a6789012345678901234567890cdef",
  "patch_path": "/patches/openssl_security_fix_20241215.patch"
}

{
  "handle": "cGF0Y2hfZjFlMmQzYzRiNWE2XzEyMzQ1Njc4LmxvZw==",
  "redis_key": "patch_f1e2d3c4b5a6_12345678.log"
}

GET /status/{handle}

def get_task_status(handle: str):
    """
    Retrieve current status and results of a task.
    
    Args:
        handle (str): Task handle from /reproduce or /fix response
    
    Returns:
        dict: Task status information varying by operation type
        
    Raises:
        HTTPException: 404 if handle not found
        HTTPException: 500 if status retrieval fails
    """

Description: Retrieves the current status and results of a task identified by its handle. Works for both reproduce and fix operations. The response format varies depending on the operation type and current status.

Input Parameters:

handle (str, path parameter, required): Task handle from /reproduce or /fix response

Output Format:

Type: dict
Structure varies by operation type:
- Reproduce operations: {"bug_id": str, "sha": str, "status": str, "log_file": str, "result": dict}
- Fix operations: {"bug_id": str, "status": str, "return_code": int, "fix_log": str, "fix_msg": str, "fix_status": str, "cached": bool}

Example:

# Request
GET /status/cGF0Y2hfZjFlMmQzYzRiNWE2XzEyMzQ1Njc4LmxvZw==

{
  "bug_id": "openssl@f1e2d3c4b5a6789012345678901234567890cdef",
  "sha": "f1e2d3c4b5a6789012345678901234567890cdef",
  "status": "completed",
  "return_code": 0,
  "fix_log": "Building project...\nApplying patch...\nRunning tests...\nAll tests passed.",
  "fix_msg": "Patch applied successfully",
  "fix_status": "All tests passed",
  "cached": false,
  "patch": "/patches/openssl_security_fix_20241215.patch",
  "redis_key": "patch_f1e2d3c4b5a6_12345678.log"
}

Management Endpoints

GET /cache/status

def get_cache_status():
    """
    Get Redis cache connection status and information.
    
    Returns:
        dict: Cache status information including connection state and Redis info
        
    Raises:
        HTTPException: 500 if cache status check fails
    """

Description: Provides information about the Redis cache connection and status. Useful for monitoring and debugging cache-related issues. Returns connection state and Redis server information when available.

Input Parameters:

None required

Output Format:

Type: dict
Structure: {"redis_connected": bool, "redis_info": dict|null}

Example:

# Request
GET /cache/status

{
  "redis_connected": true,
  "redis_info": {
    "redis_version": "6.2.6",
    "used_memory": "2097152",
    "connected_clients": "3",
    "uptime_in_seconds": "7200",
    "total_commands_processed": "156"
  }
}

DELETE /cache/{redis_key}

def clear_cache_entry(redis_key: str):
    """
    Remove a specific cache entry from Redis.
    
    Args:
        redis_key (str): Redis key to delete (from /fix response)
    
    Returns:
        dict: Deletion status and key information
        
    Raises:
        HTTPException: 404 if Redis key not found
        HTTPException: 500 if Redis not connected or deletion fails
    """

Description: Removes a specific cache entry from Redis. This forces the next request with the same parameters to recalculate the result. Use with caution as this will trigger full recomputation on subsequent requests.

Input Parameters:

redis_key (str, path parameter, required): Redis key to delete (obtained from /fix response)

Output Format:

Type: dict
Structure: {"deleted": bool, "key": str}

Example:

# Request
DELETE /cache/patch_f1e2d3c4b5a6_12345678.log

{
  "deleted": true,
  "key": "patch_f1e2d3c4b5a6_12345678.log"
}

GET /all_tasks

def get_all_tasks():
    """
    Retrieve all active tasks from memory and Redis.
    
    Returns:
        dict: All active tasks keyed by handle
        
    Raises:
        HTTPException: 500 if task retrieval fails
    """

Description: Retrieves all active tasks from both in-memory storage (reproduce operations) and Redis (fix operations). Useful for monitoring and debugging. Each task object contains the same fields as returned by /status/{handle}.

Input Parameters:

None required

Output Format:

Type: dict
Structure: {handle: task_object, ...}

Example:

# Request
GET /all_tasks

{
  "abc123def456": {
    "bug_id": "libxml2@a1b2c3d4e5f6789012345678901234567890abcd",
    "sha": "a1b2c3d4e5f6789012345678901234567890abcd",
    "status": "completed",
    "log_file": "/logs/a1b2c3d4e5f6_reproduce_abc123def456.log",
    "result": {
      "log_file": "/logs/a1b2c3d4e5f6_reproduce_abc123def456.log",
      "return_code": 0
    }
  },
  "xyz789uvw012": {
    "bug_id": "openssl@f1e2d3c4b5a6789012345678901234567890cdef",
    "status": "running",
    "patch": "/patches/openssl_security_fix_20241215.patch",
    "cached": false
  }
}

🛠️ Unified Patch Service API

The Unified Patch Service provides advanced patch generation and application capabilities with support for multiple patch formats, strategies, and intelligent code extraction from LLM responses.

Base URL: http://localhost:8000
Version: 1.0.0

Core Endpoints

GET /health

def health_check():
    """
    Health check endpoint for service monitoring.
    
    Returns:
        dict: Service health status and version information
    """

Description: Health check endpoint to verify service availability and provide basic service information. Returns service status for monitoring and load balancing purposes.

Input Parameters:

None required

Output Format:

Type: dict
Structure: {"status": str, "service": str, "version": str}

Example:

# Request
GET /health

{
  "status": "healthy",
  "service": "Patch Service",
  "version": "1.0.0"
}

POST /extract_anchor_patch

def write_patch(bug_id: str, llm_response: str, method: str = "prefix", generate_diff: bool = True, persist_flag: bool = False):
    """
    Generate and apply patches using various strategies.
    
    Args:
        bug_id (str): Bug identifier in format "project@commit_sha"
        llm_response (str): LLM-generated patch content (inline code or unified diff)
        method (str, optional): Patch application method. Defaults to "prefix".
        generate_diff (bool, optional): Whether to generate diff files. Defaults to True.
        persist_flag (bool, optional): Whether to save files persistently. Defaults to False.
    
    Returns:
        WritePatchResponse: Comprehensive patch generation results
        
    Raises:
        HTTPException: 400 for various validation and processing errors
    """

Description: Core endpoint for processing LLM responses and generating patches. Supports multiple patch application methods including direct replacement, unified diff application, and prefix-based patching. The service automatically detects the input format and selects appropriate processing strategies. This endpoint extracts code from markdown responses, applies the patch using the specified method, and generates both the patched file and a git diff.

Input Parameters:

Field	Type	Required	Default	Description
`bug_id`	`string`	✅ Yes	-	Bug identifier in format "project@sha"
`llm_response`	`string`	✅ Yes	-	LLM response containing code or diff
`method`	`string`	❌ No	`"prefix"`	Patch application method
`generate_diff`	`boolean`	❌ No	`true`	Whether to generate git diff patch file
`persist_flag`	`boolean`	❌ No	`false`	Save files persistently vs temporary files

Patch Methods:

diff: Apply unified diff format patches
inline: Extract code from markdown and apply directly
inline+meta: Apply unified diff with metadata context
direct: Direct replacement of function body
prefix: Prefix-based replacement with context

Output Format:

Type: WritePatchResponse
Structure: Complex object with patch results, file paths, and metadata

Response Fields:

Field	Type	Description
`success`	`boolean`	Whether patch operation succeeded
`md5_hash`	`string`	MD5 hash of patch content
`patch_content`	`string`	Generated git diff content
`bug_id`	`string`	Original bug identifier
`sha`	`string`	Git commit SHA
`fix_p`	`string`	Path to generated fix file
`fix_p_diff`	`string`	Path to generated patch file
`func_start_byte`	`integer`	Start byte position of modified function
`func_end_byte`	`integer`	End byte position of modified function
`content`	`string`	Raw patch content applied
`error`	`string`	Error message (if success=false)
`error_code`	`string`	Structured error code (if success=false)

Example:

# Request
POST /extract_anchor_patch
Content-Type: application/json

{
  "bug_id": "libxml2@a1b2c3d4e5f6789012345678901234567890abcd",
  "llm_response": "```cpp\nint validateInput(const char* input) {\n    if (!input || strlen(input) == 0) {\n        return 0;\n    }\n    \n    // Additional validation logic\n    for (size_t i = 0; i < strlen(input); i++) {\n        if (!isalnum(input[i]) && input[i] != '_') {\n            return 0;\n        }\n    }\n    \n    return 1;\n}\n```",
  "method": "direct",
  "generate_diff": true,
  "persist_flag": false
}

Success Response:

{
  "success": true,
  "md5_hash": "5d41402abc4b2a76b9719d911017c592",
  "patch_content": "diff --git a/src/validation.cpp b/src/validation.cpp\nindex 1234567..abcdefg 100644\n--- a/src/validation.cpp\n+++ b/src/validation.cpp\n@@ -15,8 +15,18 @@ int validateInput(const char* input) {\n-    return input != NULL;\n+    if (!input || strlen(input) == 0) {\n+        return 0;\n+    }\n+    \n+    // Additional validation logic\n+    for (size_t i = 0; i < strlen(input); i++) {\n+        if (!isalnum(input[i]) && input[i] != '_') {\n+            return 0;\n+        }\n+    }\n+    \n+    return 1;\n }",
  "bug_id": "libxml2@a1b2c3d4e5f6789012345678901234567890abcd",
  "sha": "a1b2c3d4e5f6789012345678901234567890abcd",
  "fix_p": "/tmp/patches/tmp_xyz123.cpp",
  "fix_p_diff": "/tmp/patches/tmp_xyz123.patch",
  "func_start_byte": 450,
  "func_end_byte": 485,
  "content": "int validateInput(const char* input) {\n    if (!input || strlen(input) == 0) {\n        return 0;\n    }\n    \n    // Additional validation logic\n    for (size_t i = 0; i < strlen(input); i++) {\n        if (!isalnum(input[i]) && input[i] != '_') {\n            return 0;\n        }\n    }\n    \n    return 1;\n}\n"
}

Error Response:

{
  "success": false,
  "error": "Bug ID libxml2@invalid not found in guidance data",
  "error_code": "err_bug_id_not_in_guidance",
  "md5_hash": null,
  "patch_content": null,
  "bug_id": null,
  "sha": null,
  "fix_p": null,
  "fix_p_diff": null,
  "func_start_byte": null,
  "func_end_byte": null,
  "content": null
}

⚠️ Error Handling & Codes

Both services implement comprehensive error handling with structured error responses for better debugging and integration.

Bug Helper Service Error Codes

400: Invalid input parameters
- Invalid bug_id format
- Missing or invalid patch_path
404: Resource not found
- Handle not found
- Redis key not found
- Project not found
500: Internal server errors
- Redis connection issues
- Process execution failures
- Task management errors

Unified Patch Service Error Codes

Error Code	Description
`err_extract_code_fail`	Failed to extract code from markdown
`err_invalid_bug_id_format`	Bug ID format is invalid (should be "project@sha")
`err_guidance_not_loaded`	Guidance data not loaded at startup
`err_bug_id_not_in_guidance`	Bug ID not found in guidance database
`err_record_not_found`	Metadata record not found
`err_src_content_not_cached`	Source file content not available in cache
`err_context_mismatch_byte_range`	Patch context doesn't match source file
`err_no_patch_content_identified`	Unable to identify patch content
`err_patch_file_creation_failed`	Failed to create patch file

Error Response Format

Bug Helper Service:

{
  "detail": "Error message describing the issue"
}

Unified Patch Service:

{
  "detail": {
    "error_code": "err_invalid_bug_id_format",
    "message": "bug_id must be 'project@sha', got: invalid_format"
  }
}

⚙️ Configuration & Environment

Bug Helper Service Configuration

Environment variables and settings:

Redis connection parameters
Task timeout settings
Logging configuration
Project directory paths

Unified Patch Service Configuration

Variable	Default	Description
`SRC_DIR`	`/src`	Source directory for input data
`SRC_OUT`	`/out`	Output directory for results
`META_DIR`	`/src/projects`	Metadata directory
`PATCH_OUTPUT_DIR`	`/patches/`	Persistent patch output directory
`PATCH_OUTPUT_BEFORE_DIR`	`/tmp/patches_before`	Temporary patch directory

Data Loading (Patch Service)

The service loads several data sources at startup:

Metadata: Bug information from JSON files in /src/projects/**
Guidance: CSV file with function locations and metadata
Source Content: Source file contents from JSONL format
Prompt Content: LLM prompt templates with infill markers

Details

Name		Name	Last commit message	Last commit date
Latest commit History 64 Commits
.github/workflows		.github/workflows
assets/videos		assets/videos
defectsc_tpl		defectsc_tpl
docker_dirs		docker_dirs
evaluate_LLM		evaluate_LLM
out_tmp_dirs		out_tmp_dirs
patche_dirs		patche_dirs
.gitignore		.gitignore
Defects4C_vul.md		Defects4C_vul.md
Defects4c_paper.pdf		Defects4c_paper.pdf
LICENSE		LICENSE
README.md		README.md
api_docs.html		api_docs.html
http_tutorial.py		http_tutorial.py
minor_category.md		minor_category.md
pipeline.svg		pipeline.svg
project_manner.svg		project_manner.svg
readme_cli.md		readme_cli.md

License

defects4c/defects4c

Folders and files

Latest commit

History

Repository files navigation

Defects4C: Benchmarking Large Language Model Repair Capability with C/C++ Bugs 👋

Scenario

FastAPI Services Documentation

API Overview

Bug Helper Service API (http://localhost:8000)

Unified Patch Service API (http://localhost:8000)

📋 Overview

🔧 Bug Helper Service API

Core Endpoints

GET /projects

POST /reproduce

POST /fix_with_patch or /fix

GET /status/{handle}

Management Endpoints

GET /cache/status

DELETE /cache/{redis_key}

GET /all_tasks

🛠️ Unified Patch Service API

Core Endpoints

GET /health

POST /extract_anchor_patch

⚠️ Error Handling & Codes

Bug Helper Service Error Codes

Unified Patch Service Error Codes

Error Response Format

⚙️ Configuration & Environment

Bug Helper Service Configuration

Unified Patch Service Configuration

Data Loading (Patch Service)

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 2

Uh oh!

Languages

Bug Helper Service API (`http://localhost:8000`)

Unified Patch Service API (`http://localhost:8000`)

Packages