fix: prevent CherryAI provider from using native PDF input in middleware #13777
Merged
EurFelux approved these changes on Mar 25, 2026
DeJeune approved these changes on Mar 25, 2026
kangfenmao pushed a commit that referenced this pull request on Mar 26, 2026
…rt list (#13809)

### What this PR does

Before this PR:

- `PDF_NATIVE_PROVIDER_TYPES` included `'openai'`, `'new-api'`, and `'gateway'` types, assuming all OpenAI-compatible providers support native PDF file input via the `file` part type.
- Sending a PDF to providers like Moonshot/Kimi (which have `type: 'openai'`) resulted in a 400 error: `"invalid part type: file"`.
- A special-case `isCherryAI` check was needed because CherryAI also has `type: 'openai'` but doesn't support native PDF.

After this PR:

- Only first-party provider protocols (`openai-response`, `anthropic`, `gemini`, `azure-openai`, `vertexai`, `aws-bedrock`, `vertex-anthropic`) are in `PDF_NATIVE_PROVIDER_TYPES`.
- All `type: 'openai'` providers (Moonshot, DeepSeek, Groq, CherryAI, cherryin, etc.) correctly have PDFs converted to text before sending.
- The CherryAI special-case check is removed as it's no longer needed.

### Why we need it and why it was done in this way

The following tradeoffs were made:

- Removed `'openai'` entirely from the native set rather than adding per-provider ID exceptions, because the actual OpenAI provider uses `type: 'openai-response'` (not `'openai'`), and the vast majority of `type: 'openai'` providers are third-party APIs that don't support `file` parts.
- Also removed `'new-api'` and `'gateway'` aggregator types, since these route to various backends; it's safer to convert PDFs to text and let specific backends handle text rather than risk `file` part errors.

The following alternatives were considered:

- Adding individual provider IDs (like `moonshot`) to a blocklist: rejected as it's whack-a-mole; new OpenAI-compatible providers would keep hitting the same bug.
- Keeping `'openai'` in the set and adding more ID-based exceptions: rejected for the same reason.

Links to places where the discussion took place:

- PR #13641 introduced the `pdfCompatibilityPlugin` with the overly broad provider type set
- PR #13777 added the CherryAI special case as a point fix

### Breaking changes

None. Providers whose PDF requests previously failed with 400 errors will now correctly receive extracted text content instead.

### Special notes for your reviewer

- The actual OpenAI provider uses `type: 'openai-response'`, which remains in the native set, so real OpenAI API users are unaffected.
- All existing tests updated to match new behavior. Test suite passes fully (3811 tests).
- The `isCherryAI` special case from PR #13777 is removed since CherryAI (`type: 'openai'`) is now naturally handled by the conversion path.

### Checklist

- [x] PR: The PR description is expressive enough and will help future contributors
- [x] Code: [Write code that humans can understand](https://en.wikiquote.org/wiki/Martin_Fowler#code-for-humans) and [Keep it simple](https://en.wikipedia.org/wiki/KISS_principle)
- [x] Refactor: You have [left the code cleaner than you found it (Boy Scout Rule)](https://learning.oreilly.com/library/view/97-things-every/9780596809515/ch08.html)
- [x] Upgrade: Impact of this change on upgrade flows was considered and addressed if required
- [ ] Documentation: A [user-guide update](https://docs.cherry-ai.com) was considered and is present (link) or not required. Check this only when the PR introduces or changes a user-facing feature or behavior.
- [x] Self-review: I have reviewed my own code (e.g., via [`/gh-pr-review`](/.claude/skills/gh-pr-review/SKILL.md), `gh pr diff`, or GitHub UI) before requesting review from others

### Release note

```release-note
Fixed PDF file upload failing with "invalid part type: file" error for OpenAI-compatible providers (Moonshot, DeepSeek, Groq, etc.). PDFs are now correctly converted to text for these providers.
```

---------

Signed-off-by: suyao <[email protected]>
Co-authored-by: Claude Opus 4.6 <[email protected]>
Co-authored-by: Phantom <[email protected]>
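The allow-list approach described in this commit message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the `Provider` and `Part` shapes, the helper names `supportsNativePdf` and `preparePdfParts`, and the injected `extractText` callback are all hypothetical. Only the constant name `PDF_NATIVE_PROVIDER_TYPES` and the provider type strings come from the text above.

```typescript
// Hypothetical shapes for illustration; the real project types may differ.
type Provider = { id: string; type: string }
type Part =
  | { type: 'text'; text: string }
  | { type: 'file'; mediaType: string; data: Uint8Array }

// Provider types the commit message lists as keeping native PDF input.
const PDF_NATIVE_PROVIDER_TYPES = new Set<string>([
  'openai-response',
  'anthropic',
  'gemini',
  'azure-openai',
  'vertexai',
  'aws-bedrock',
  'vertex-anthropic'
])

// Assumed helper name: true when the provider type is on the allow-list.
function supportsNativePdf(provider: Provider): boolean {
  return PDF_NATIVE_PROVIDER_TYPES.has(provider.type)
}

// Assumed helper: for every other provider, swap PDF `file` parts for text parts.
// `extractText` stands in for whatever PDF text extraction the app uses.
function preparePdfParts(
  provider: Provider,
  parts: Part[],
  extractText: (data: Uint8Array) => string
): Part[] {
  if (supportsNativePdf(provider)) return parts
  return parts.map((part): Part =>
    part.type === 'file' && part.mediaType === 'application/pdf'
      ? { type: 'text', text: extractText(part.data) }
      : part
  )
}
```

With this shape, any provider with `type: 'openai'` (Moonshot, CherryAI, and so on) falls through to the conversion path without needing an ID-based exception.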
MyPrototypeWhat pushed a commit that referenced this pull request on Mar 30, 2026
…are (#13777)

### What this PR does

Before this PR:

1. The CherryAI provider was incorrectly treated as supporting native PDF input because its provider type matches `PDF_NATIVE_PROVIDER_TYPES`, so PDF files were not converted to text.
2. PDF text extraction failed in production Electron builds with the error `Cannot find module '.../pdf.worker.mjs'` because `pdf-parse` was in devDependencies instead of dependencies.

After this PR:

1. Added an explicit check to exclude the CherryAI provider from native PDF handling.
2. Moved `pdf-parse` from devDependencies to dependencies so it's bundled correctly in Electron production builds.

Fixes # N/A

### Why we need it and why it was done in this way

**Issue 1 - CherryAI provider:** CherryAI is a proxy provider that doesn't actually support native PDF input like the underlying providers (Anthropic, Google) do. The middleware needs to convert PDFs to text before sending to CherryAI.

**Issue 2 - pdf-parse not found:** In Electron apps, only `dependencies` are bundled into the production build (`app.asar`), while `devDependencies` are excluded. Since `pdf-parse` is required at runtime for PDF text extraction, it must be in `dependencies`.

The following tradeoffs were made:

- Used a provider ID check (`provider.id === 'cherryai'`) rather than modifying the provider type set, as this is more explicit and maintainable.

The following alternatives were considered:

- Removing CherryAI's provider type from `PDF_NATIVE_PROVIDER_TYPES`: rejected because CherryAI can proxy to multiple provider types.

### Breaking changes

None

### Special notes for your reviewer

This is a bug fix that doesn't change Redux data models or IndexedDB schemas.

### Checklist

- [x] PR: The PR description is expressive enough and will help future contributors
- [x] Code: Write code that humans can understand and Keep it simple
- [x] Refactor: You have left the code cleaner than you found it (Boy Scout Rule)
- [x] Upgrade: Impact of this change on upgrade flows was considered and addressed if required
- [x] Documentation: A user-guide update was considered and is not required
- [x] Self-review: I have reviewed my own code before requesting review from others

### Release note

```release-note
fix: CherryAI provider now correctly converts PDF files to text instead of attempting native PDF input
fix: PDF text extraction now works in production builds (moved pdf-parse to dependencies)
```
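The point fix for Issue 1 could look something like the sketch below. It is a guess at the shape of the check, not the merged diff: the `Provider` type and the function name `usesNativePdf` are hypothetical, and the set contents are an assumption limited to the OpenAI-compatible types that the commit messages on this page say were in the set at the time (other entries are omitted).

```typescript
// Hypothetical provider shape for illustration.
type Provider = { id: string; type: string }

// Assumed state of the set at the time of this PR; only the entries named
// in the commit messages are shown, others are omitted.
const PDF_NATIVE_PROVIDER_TYPES = new Set<string>(['openai', 'new-api', 'gateway'])

// ID-based special case, matching the `provider.id === 'cherryai'` check
// described in the tradeoffs.
function isCherryAI(provider: Provider): boolean {
  return provider.id === 'cherryai'
}

// Assumed helper name: CherryAI is excluded before the type lookup runs.
function usesNativePdf(provider: Provider): boolean {
  if (isCherryAI(provider)) return false
  return PDF_NATIVE_PROVIDER_TYPES.has(provider.type)
}
```

Note that under these assumptions a different provider with `type: 'openai'` would still be routed to native PDF input, which is the limitation of an ID-based exception.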
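### What this PR does

Before this PR:

1. The CherryAI provider was incorrectly treated as supporting native PDF input because its provider type matches `PDF_NATIVE_PROVIDER_TYPES`, so PDF files were not converted to text.
2. PDF text extraction failed in production Electron builds with the error `Cannot find module '.../pdf.worker.mjs'` because `pdf-parse` was in devDependencies instead of dependencies.

After this PR:

1. Added an explicit check to exclude the CherryAI provider from native PDF handling.
2. Moved `pdf-parse` from devDependencies to dependencies so it's bundled correctly in Electron production builds.

Fixes # N/A

### Why we need it and why it was done in this way

**Issue 1 - CherryAI provider:**
CherryAI is a proxy provider that doesn't actually support native PDF input like the underlying providers (Anthropic, Google) do. The middleware needs to convert PDFs to text before sending to CherryAI.

**Issue 2 - pdf-parse not found:**
In Electron apps, only `dependencies` are bundled into the production build (`app.asar`), while `devDependencies` are excluded. Since `pdf-parse` is required at runtime for PDF text extraction, it must be in `dependencies`.

The following tradeoffs were made:

- Used a provider ID check (`provider.id === 'cherryai'`) rather than modifying the provider type set, as this is more explicit and maintainable.

The following alternatives were considered:

- Removing CherryAI's provider type from `PDF_NATIVE_PROVIDER_TYPES`: rejected because CherryAI can proxy to multiple provider types.

### Breaking changes

None

### Special notes for your reviewer

This is a bug fix that doesn't change Redux data models or IndexedDB schemas.

### Checklist

- [x] PR: The PR description is expressive enough and will help future contributors
- [x] Code: Write code that humans can understand and Keep it simple
- [x] Refactor: You have left the code cleaner than you found it (Boy Scout Rule)
- [x] Upgrade: Impact of this change on upgrade flows was considered and addressed if required
- [x] Documentation: A user-guide update was considered and is not required
- [x] Self-review: I have reviewed my own code before requesting review from others

### Release note

```release-note
fix: CherryAI provider now correctly converts PDF files to text instead of attempting native PDF input
fix: PDF text extraction now works in production builds (moved pdf-parse to dependencies)
```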
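For the `pdf-parse` packaging issue, one way to catch this class of mistake early is a small check over the parsed `package.json`. The sketch below is purely illustrative and is not part of this PR: the `PackageJson` type, the function name `findMisplacedRuntimeDeps`, and the idea of maintaining a list of runtime-required packages are all assumptions.

```typescript
// Minimal view of package.json; only the two fields relevant here.
type PackageJson = {
  dependencies?: Record<string, string>
  devDependencies?: Record<string, string>
}

// Hypothetical helper: returns the runtime-required packages that are not
// declared under `dependencies` (and so, per the PR description, would be
// missing from the production build).
function findMisplacedRuntimeDeps(pkg: PackageJson, runtimeRequired: string[]): string[] {
  const deps = pkg.dependencies ?? {}
  return runtimeRequired.filter(
    (name) => !Object.prototype.hasOwnProperty.call(deps, name)
  )
}
```

Called with a manifest that lists `pdf-parse` only under `devDependencies` and a runtime list of `['pdf-parse']`, the helper reports `pdf-parse` as misplaced; after the move to `dependencies` it reports nothing.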