Run type check as first step #862

ftnext · 2025-12-30T14:55:53Z

Inspired by #859

TypeError: cannot inherit from both a TypedDict type and a non-TypedDict base class

Copilot

Pull request overview

This PR introduces mypy type checking as a CI/CD step and addresses various type-related issues throughout the codebase. The changes refactor duplicated type definitions into a shared generic base type, add type ignore comments where dynamic attribute assignment occurs, and rename exception variables for clarity.

Key changes:

Adds mypy type checking workflow and Makefile target
Introduces generic TranscribeOutputBase[SegmentT] to eliminate duplicate type definitions
Adds type ignore comments for dynamic attribute assignments and runtime monkey-patching

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
`.github/workflows/typecheck.yml`	New GitHub Actions workflow to run type checking on push/PR to master
`Makefile`	Adds typecheck target with mypy configuration
`setup.cfg`	Adds mypy and types-requests to dev dependencies
`speech_recognition/recognizers/whisper_local/base.py`	Introduces generic `TranscribeOutputBase` and updates Protocol with generics
`speech_recognition/recognizers/whisper_local/whisper.py`	Removes duplicate `TranscribeOutput` definition, uses shared `TranscribeOutputBase`
`speech_recognition/recognizers/whisper_local/faster_whisper.py`	Removes duplicate `TranscribeOutput` definition, uses shared `TranscribeOutputBase`
`speech_recognition/recognizers/google_cloud.py`	Changes encoding type from enum to int for type compatibility
`speech_recognition/__init__.py`	Adds type ignore comments for dynamic attributes and renames exception variables

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2025-12-30T15:23:34Z

speech_recognition/recognizers/whisper_local/base.py

+class TranscribeOutputBase(TypedDict, Generic[SegmentT]):
+    text: str
+    segments: list[SegmentT]
+    language: str


TypedDict does not support inheritance from Generic in Python. While this syntax may work in some type checkers, it's not officially supported and could lead to runtime errors or inconsistent behavior across different type checkers. Consider using a Protocol with properties instead, or use a regular class with dataclass for better type safety.

Copilot · 2025-12-30T15:23:35Z

speech_recognition/recognizers/google_cloud.py

    """

-    encoding: Required[RecognitionConfig.AudioEncoding]
+    encoding: Required[int]


Changing the type from RecognitionConfig.AudioEncoding to int reduces type safety. While RecognitionConfig.AudioEncoding values are integers, using the specific enum type provides better type checking and makes the API clearer. If this change is necessary for type checking to pass, consider using RecognitionConfig.AudioEncoding | int or investigate why mypy is having issues with the original type.

Suggested change

encoding: Required[int]

encoding: Required["RecognitionConfig.AudioEncoding | int"]

ftnext · 2025-12-30T15:26:24Z

Code Review by Claude Sonnet 4.5 (`/review` in local)

Overview

This PR introduces type checking to the project using mypy and adds a GitHub Actions workflow to run type checks on every push and pull request to master. The changes include:

New type check CI workflow
Type checking configuration in Makefile
Type annotations and type: ignore comments throughout the codebase
Refactored type definitions for better type safety (Generic TypedDict for Whisper)

Stats: +83 additions, -49 deletions

Positive Aspects

✅ Strong foundation for type safety

Adding mypy to CI is excellent for catching type errors early
The Generic TypedDict approach in whisper_local/base.py is well-designed

✅ Improved code clarity

Renaming exc → failed/not_ready in speech_recognition/init.py:1040-1050 improves readability

✅ Proper infrastructure setup

CI workflow properly installs all dependencies including audio libraries
Makefile target is simple and clear

Issues & Suggestions

🔴 Critical Issues

Excessive use of type: ignore comments

Files affected: speech_recognition/init.py (lines 376, 379, 386-387, 1289-1298, 1326)
Issue: While some type: ignore comments are necessary initially, this PR adds many without addressing the root cause
Suggestion:
- For AudioSource attributes (CHUNK, SAMPLE_RATE, etc.), consider adding a Protocol or base class that defines these attributes
- For dynamic method assignment (recognize_google, etc.), consider using proper type stubs or Protocol definitions

# Example: Define a proper Protocol instead of ignoring
from typing import Protocol

class AudioSourceProtocol(Protocol):
    CHUNK: int
    SAMPLE_RATE: int
    SAMPLE_WIDTH: int
    stream: Any  # or more specific type

Type regression in google_cloud.py:52

encoding: Required[RecognitionConfig.AudioEncoding]

encoding: Required[int]

Issue: This loses type information. RecognitionConfig.AudioEncoding is likely an Enum that provides better type safety than raw int
Suggestion: Keep the original type and investigate why mypy complained. The issue might be with how Required is used with Enums in TypedDict

⚠️ Medium Priority

Missing type annotations in Makefile .PHONY
.PHONY: lint rstcheck distribute publish

Issue: The new typecheck target is missing from .PHONY
Fix: Should be .PHONY: lint rstcheck typecheck distribute publish

CI workflow could be optimized

Currently installs ALL optional dependencies for type checking
Suggestion: Consider whether all these are needed just for type checking, or if type stubs would suffice

Inconsistent ignore comments

Some ignore comments specify the error code ([attr-defined]), others don't
Best practice: Always specify error codes for better maintainability
Example: # type: ignore → # type: ignore[attr-defined]

💡 Suggestions for Future Improvements

No tests for type checking

Consider adding test cases that verify types work correctly with actual usage
This ensures the type hints match runtime behavior

Documentation

Add a comment explaining why --ignore-missing-imports is used
Document in README or CONTRIBUTING.md that type checking is now required

Generic TypedDict is excellent but could be more complete

The TranscribeOutputBase design is good, but consider adding more strict types for segment fields if possible

Security & Performance

✅ No security concerns identified
✅ No performance implications - type checking only runs in CI

Code Quality Score: 7/10

Strengths:

Solid infrastructure addition
Good refactoring of Whisper type definitions
Proper CI integration

Areas for Improvement:

Too many type: ignore comments mask underlying type issues
Type regression in google_cloud.py
Missing typecheck from .PHONY
Could benefit from more thorough type coverage

Recommendation

Approve with changes requested ✏️

This PR adds valuable type checking infrastructure, but should address:

Fix the .PHONY declaration
Reconsider the google_cloud.py type change
Add issue/TODO comments for the ignored type errors to track future improvements

The foundation is solid - once these issues are addressed, this will significantly improve code quality.

ftnext added 7 commits December 30, 2025 23:36

refactor: Introduce TypeVar

7cae32a

bugfix: AudioEncoding is int

f684e23

style: Pass type check with ignore

fcc0e7a

chore: type check deps

a800a43

chore: Start make typecheck

783ad35

bugfix: Fix TypedDict TypeError

f5b98bb

TypeError: cannot inherit from both a TypedDict type and a non-TypedDict base class

chore: Run type check on GitHub Actions

ca74e36

ftnext requested a review from Copilot December 30, 2025 15:20

Copilot started reviewing on behalf of ftnext December 30, 2025 15:20 View session

Copilot AI reviewed Dec 30, 2025

View reviewed changes

ftnext added 3 commits December 31, 2025 00:27

bugfix: missing PHONY

f58a650

bugfix: Revert to concise AudioEncoding type

b13b1d5

bugfix: Fix NameError in tests

d63e4d2

ftnext merged commit 2baf672 into Uberi:master Dec 30, 2025
9 checks passed

ftnext deleted the run-type-check branch December 30, 2025 16:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Run type check as first step #862

Run type check as first step #862

Uh oh!

ftnext commented Dec 30, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Dec 30, 2025

Uh oh!

Copilot AI Dec 30, 2025

Uh oh!

ftnext commented Dec 30, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	encoding: Required[int]
	encoding: Required["RecognitionConfig.AudioEncoding \| int"]

Run type check as first step #862

Run type check as first step #862

Uh oh!

Conversation

ftnext commented Dec 30, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Dec 30, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 30, 2025

Choose a reason for hiding this comment

Uh oh!

ftnext commented Dec 30, 2025

Code Review by Claude Sonnet 4.5 (/review in local)

Overview

Positive Aspects

Issues & Suggestions

Security & Performance

Recommendation

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Code Review by Claude Sonnet 4.5 (`/review` in local)