Improve user story description parsing for multiple LLM formats #200

Merged
subsy merged 4 commits into main from claude/fix-prd-user-story-format-wMTfi
Jan 23, 2026

subsy commented Jan 22, 2026

Summary

Enhanced the PRD parser to correctly handle user story descriptions in multiple LLM output formats, and updated the PRD generation guidance to specify plain text descriptions without bold prefixes.

Changes

PRD Generation Guidance (src/chat/engine.ts)

  • Updated PRD_COMPATIBILITY_GUIDANCE to clarify that user story descriptions should be plain text on the next line following the format: "As a user, I want to ... so that ..."
  • Removed the **Description:** bold prefix from the example format
  • Added explicit note: "IMPORTANT: User story descriptions must be plain text (no Description: prefix)"
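
For illustration, a story written in the preferred format might look like the string below. This is a minimal sketch: the heading style, the story ID, and the `hasPlainTextDescription` helper are assumptions made for the example and are not taken from `src/chat/engine.ts`.

```typescript
// Illustrative only: heading style and story ID are assumptions, not taken from the PR.
const preferredStory = [
  '### US-001: Export reports',
  'As a user, I want to export reports so that I can share them with my team.',
  '',
  '**Acceptance Criteria:**',
  '- [ ] Reports can be exported as CSV',
].join('\n');

// Hypothetical helper: true when the line after the heading is plain text,
// i.e. it is non-empty and does not open with a bold label such as **Description:**.
function hasPlainTextDescription(storyMarkdown: string): boolean {
  const [, firstBodyLine = ''] = storyMarkdown.split('\n');
  const trimmed = firstBodyLine.trim();
  return trimmed.length > 0 && !trimmed.startsWith('**');
}
```

A story whose first body line starts with `**Description:**` would fail this check, which is the shape the updated guidance asks the LLM to avoid.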

Description Extraction Logic (src/prd/parser.ts)

  • Added DESCRIPTION_STOP_PATTERN regex to identify known metadata fields (**Acceptance Criteria:**, **Priority:**, **Depends on:**, **Labels:**, **Notes:**) that should terminate description extraction
  • Added DESCRIPTION_LABEL_PATTERN regex to detect and strip the **Description:** label prefix that some LLMs generate
  • Enhanced extractStoryDescription() function to:
    • Replace the generic line.startsWith('**') check with the more precise DESCRIPTION_STOP_PATTERN match to avoid false positives
    • Strip the **Description:** label prefix from lines before processing
    • Remove any remaining bold emphasis markers (e.g., **As a** → As a) from the final description text
  • Updated JSDoc to document the three LLM output formats now supported:
    • Plain text: "As a user, I want..."
    • Bold-label: "**Description:** As a user, I want..."
    • Bold-keyword: "**As a** user **I want** to... **So that**..."
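
The difference between the old and new stop checks can be sketched as follows. The PR names the two patterns but this page does not show their source, so the regex bodies here are a hypothetical reconstruction from the field names listed above; `stopsOld` and `stopsNew` are illustrative helpers, not functions from `src/prd/parser.ts`.

```typescript
// Hypothetical reconstruction: the PR names these patterns but does not show their source.
const DESCRIPTION_STOP_PATTERN =
  /^\*\*(Acceptance Criteria|Priority|Depends on|Labels|Notes):?\*\*/i;
const DESCRIPTION_LABEL_PATTERN = /^\*\*Description:?\*\*:?\s*/i;

// Old check: any line starting with bold markup ended the description.
function stopsOld(line: string): boolean {
  return line.startsWith('**');
}

// New check: only known metadata fields end the description.
function stopsNew(line: string): boolean {
  return DESCRIPTION_STOP_PATTERN.test(line);
}

const boldKeywordLine = '**As a** user **I want** to export reports';
const labelledLine = '**Description:** As a user, I want to export reports';
const metadataLine = '**Priority:** P1';

// Stripping the label leaves only the description text.
const unlabelled = labelledLine.replace(DESCRIPTION_LABEL_PATTERN, '');
```

Under these assumptions, `stopsOld` treats both `boldKeywordLine` and `labelledLine` as terminators (the false positives the PR describes), while `stopsNew` stops only at `metadataLine`.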

Implementation Details

The parser now gracefully handles inconsistent LLM formatting by:

  1. Stopping only at known metadata fields, rather than at any line that starts with **, to prevent premature termination
  2. Removing the **Description:** label if present
  3. Stripping any remaining bold formatting used for emphasis within the description text
  4. Joining lines and trimming whitespace to produce clean, plain text descriptions

This makes the parser more robust to variations in LLM output while encouraging the generation of properly formatted descriptions going forward.
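
The four steps above could be put together roughly like this. It is a sketch under stated assumptions, not the code merged in `src/prd/parser.ts`: the regex bodies, the `lines: string[]` signature, and the handling of blank lines are all guesses, and the heading and separator stop conditions are inferred from the test commit message further down this page.

```typescript
// Sketch only: names mirror the PR description, bodies and signature are assumed.
const DESCRIPTION_STOP_PATTERN =
  /^\*\*(Acceptance Criteria|Priority|Depends on|Labels|Notes):?\*\*/i;
const DESCRIPTION_LABEL_PATTERN = /^\*\*Description:?\*\*:?\s*/i;

/** Collects description text from the lines that follow a story heading. */
function extractStoryDescription(lines: string[]): string {
  const parts: string[] = [];
  for (const raw of lines) {
    const line = raw.trim();
    // Assumed extra stop conditions: the next heading or a horizontal rule.
    if (line.startsWith('#') || /^-{3,}$/.test(line)) break;
    // Step 1: stop only at known metadata fields, not at every bold line.
    if (DESCRIPTION_STOP_PATTERN.test(line)) break;
    // Step 2: drop a leading **Description:** label if present.
    const withoutLabel = line.replace(DESCRIPTION_LABEL_PATTERN, '');
    // Blank lines are skipped rather than treated as terminators (an assumption).
    if (withoutLabel.length > 0) parts.push(withoutLabel);
  }
  // Steps 3 and 4: strip remaining bold markers, join lines, trim whitespace.
  return parts.join(' ').replace(/\*\*/g, '').trim();
}

const expected = 'As a user, I want to export reports so that I can share them.';

const plain = extractStoryDescription([expected, '', '**Acceptance Criteria:**', '- [ ] CSV export']);
const boldLabel = extractStoryDescription([`**Description:** ${expected}`, '**Priority:** P1']);
const boldKeyword = extractStoryDescription([
  '**As a** user, **I want** to export reports',
  '**so that** I can share them.',
  '**Acceptance Criteria:**',
]);
```

In this sketch all three inputs normalise to the same plain-text string, which is the behaviour the PR describes for the three supported formats.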

Summary by CodeRabbit

Release Notes

  • Improvements

    • Updated user story description formatting to require plain text without markdown prefixes for better consistency and clarity.
    • Enhanced description parsing to handle multiple formatting variations and metadata boundaries more robustly.
  • Tests

    • Added comprehensive test coverage for PRD description parsing and formatting guidance.

The parser's extractStoryDescription() previously stopped at any line
starting with '**', which broke when LLMs generated descriptions with
**Description:** prefixes or **As a**/**I want**/**So that** bold
keywords. Now the parser only stops at known metadata fields (Priority,
Depends on, etc.) and strips bold formatting from description content.

Also updates the LLM prompt guidance to prefer plain-text descriptions,
reducing the likelihood of format mismatches in newly generated PRDs.

vercel bot commented Jan 22, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Review Updated (UTC)
ralph-tui Ignored Ignored Preview Jan 23, 2026 0:06am


coderabbitai bot commented Jan 22, 2026


Walkthrough

The changes update PRD parsing logic to enforce plain text descriptions without "Description:" prefixes, introducing new validation patterns and stopping conditions in the parser whilst updating guidance documentation and adding comprehensive test coverage.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Core Parser Logic (src/prd/parser.ts) | Introduces DESCRIPTION_STOP_PATTERN and DESCRIPTION_LABEL_PATTERN to guard description extraction. Modified extractStoryDescription to stop at metadata boundaries, strip bold prefixes, and remove emphasis markers from final descriptions. |
| Parser Tests (src/prd/parser.test.ts) | Comprehensive test coverage for parsePrdMarkdown function including plain text extraction, bold prefix handling, stop conditions (Acceptance Criteria, Priority, Depends on), edge cases, and multi-story scenarios. |
| Engine Guidance & Tests (src/chat/engine.ts, src/chat/engine.test.ts) | Updated PRD_COMPATIBILITY_GUIDANCE to specify plain text descriptions without bold prefixes. New test suite validating buildPrdSystemPromptFromSkillSource generates correct guidance text, including format requirements and header instructions. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 No more bold prefixes cluttering the way,
Plain text descriptions now hold their sway,
Patterns protect each parsing flow,
Cleaner user stories for our code to grow! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately captures the main objective of the PR, which is to enhance description parsing to handle multiple LLM output formats for user stories. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


codecov bot commented Jan 22, 2026

Codecov Report

❌ Patch coverage is 61.53846% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 44.23%. Comparing base (708d7c6) to head (91f6618).
⚠️ Report is 5 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/prd/parser.ts | 61.53% | 5 Missing ⚠️ |
Additional details and impacted files

@@            Coverage Diff             @@
##             main     #200      +/-   ##
==========================================
+ Coverage   43.87%   44.23%   +0.36%     
==========================================
  Files          83       84       +1     
  Lines       23963    24229     +266     
==========================================
+ Hits        10514    10718     +204     
- Misses      13449    13511      +62     
| Files with missing lines | Coverage Δ |
|---|---|
| src/chat/engine.ts | 10.43% <ø> (+2.35%) ⬆️ |
| src/prd/parser.ts | 60.13% <61.53%> (+56.63%) ⬆️ |

... and 1 file with indirect coverage changes
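
The percentages in the report can be reproduced from the Hits and Lines rows. The sketch below assumes the percentages are truncated rather than rounded to two decimals, which is consistent with the figures shown but is not confirmed by the report itself. The 8-of-13 patch figure is also an inference from "5 lines missing" at 61.53846%, and `coverageBasisPoints` is a hypothetical helper.

```typescript
// Truncate (not round) a coverage ratio to basis points, i.e. hundredths of a percent.
function coverageBasisPoints(hits: number, lines: number): number {
  return Math.floor((hits / lines) * 10_000);
}

const base = coverageBasisPoints(10514, 23963); // main: 4387 basis points = 43.87%
const head = coverageBasisPoints(10718, 24229); // PR:   4423 basis points = 44.23%
const delta = head - base;                      // 36 basis points = +0.36%

// Assumption: 5 missing lines at 61.53846% implies 13 changed lines, 8 of them covered.
const patch = coverageBasisPoints(8, 13);       // 6153 basis points = 61.53%
```

Working in integer basis points avoids floating-point noise when subtracting the two percentages.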


…ance

Covers extractStoryDescription with multiple LLM output formats:
- Plain text descriptions
- **Description:** prefixed descriptions
- **As a**/**I want**/**So that** bold-keyword format
- Stop conditions (metadata fields, headings, separators)
- Edge cases (empty descriptions, no acceptance criteria)

Also tests buildPrdSystemPromptFromSkillSource to verify the prompt
instructs plain-text format and discourages the **Description:** prefix.

These test directories were missing from CI, so Codecov had no coverage
data for parser.ts and engine.ts changes, resulting in 16% diff coverage.
subsy merged commit 94c8ea3 into main on Jan 23, 2026
9 checks passed
subsy deleted the claude/fix-prd-user-story-format-wMTfi branch on January 23, 2026 00:08
sakaman pushed a commit to sakaman/ralph-tui that referenced this pull request Feb 15, 2026
…mat-wMTfi

Improve user story description parsing for multiple LLM formats