Releases: GRAAL-Research/deepparse
0.10.0
- Breaking change: Drop Python 3.8 and 3.9 support. Minimum required version is now Python 3.10.
- Add Python 3.12 support.
- Add compatibility with Intel GPUs and other Torch acceleration devices.
- Remove `numpy<2.0.0` version cap to support NumPy 2.x.
- Bump pinned dependency versions for Python 3.12+ compatibility (unpin `uvicorn`, `python-decouple`, `pylint-django`, `pre-commit`, `pycountry`).
- Update CI/CD workflows, documentation, and package setup for Python 3.12.
- Fix outdated model download URLs from `graal.ift.ulaval.ca` to HuggingFace Hub.
- Remove unused `BASE_URL` constant.
- Add missing parameterized type hints across the codebase (`List[str]`, `Dict[str, int]`, `List[Callable]`, etc.).
- Add missing return type annotations to multiple functions and methods.
- Add Python 3.13 support.
- Make `fasttext-wheel` an optional dependency. On Python 3.13+ (or when `fasttext` is not installed), Deepparse automatically falls back to Gensim's `load_facebook_vectors` to load FastText embeddings. Install with `pip install deepparse[fasttext]` for native FastText support on Python 3.10–3.12.
- Fix `FastTextEmbeddingsModel.dim` property to use `vector_size` when using the Gensim fallback.
- Fix Python version in `deploy.yml` and `python-publish.yml` workflows (was set to non-existent 3.14).
- Update GitHub Actions to latest major versions (`checkout@v4`, `setup-python@v5`, `codeql-action@v3`, `gh-pages@v4`).
- Fix Dependabot target branch from non-existent `develop` to `dev`.
- Remove dead code: unused `attention_output` tensor allocation in `Seq2SeqModel._decoder_step`.
- Fix `download_fasttext_magnitude_embeddings` always re-downloading even when cached.
- Replace `assert` statements with `HTTPException` in the FastAPI app for proper error handling.
- Change default logging level from `DEBUG` to `WARNING` in the app module.
- Remove unnecessary `pybind11` from build-system requirements.
- Remove obsolete "temporary fix for torch 1.6" comments in the encoder, decoder, and embedding network.
- Remove deprecated `version` key from `docker-compose.yml`.
- Fix mixed f-string/%-formatting in the download progress bar.
- Update Dockerfile base images to PyTorch 2.5.1 / CUDA 12.4 and Python 3.13.
- Migrate the PyPI publish workflow from deprecated `setup.py sdist bdist_wheel` to `python -m build`.
- Add `MANIFEST.in` to ensure `version.txt` and `README.md` are included in source distributions.
- Restrict the docs workflow to only build on the `main`, `dev`, and `stable` branches.
- Bump `actions/setup-python` from v5 to v6.
- Bump `docker/metadata-action` from 4.3.0 to 5.10.0.
- Bump `docker/build-push-action` from 4.0.0 to 6.19.2.
- Bump `github/codeql-action` from v3 to v4.
- Bump `actions/first-interaction` from v1 to v3.
- Bump `fastapi[all]` from 0.109.1 to 0.134.0.
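The optional `fasttext` dependency above works through a guarded-import fallback. A minimal sketch of that pattern, assuming only that native `fasttext` is preferred when importable and Gensim is used otherwise (the helper name and return values here are illustrative, not Deepparse's actual code):

```python
import importlib.util

def pick_fasttext_loader() -> str:
    """Hypothetical sketch of an optional-dependency fallback:
    prefer the native fasttext package when it is installed
    (e.g. via `pip install deepparse[fasttext]` on Python 3.10-3.12),
    otherwise fall back to Gensim's load_facebook_vectors."""
    if importlib.util.find_spec("fasttext") is not None:
        return "fasttext"  # native FastText loader is available
    return "gensim"  # Gensim fallback, e.g. on Python 3.13+

print(pick_fasttext_loader())
```

Checking `find_spec` instead of wrapping the import in `try`/`except` avoids actually importing the heavy package just to detect it.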
0.9.14
0.9.13
0.9.12
- Bug-fix: call the `BPEmbBaseURLWrapperBugFix` class instead of `BPEmb` to fix the download URL in `download_models`.
0.9.11
- Fix Sentry version error in Docker Image.
0.9.10
- Fix and improve documentation.
- Remove pinned dependency versions.
- Fix app errors.
- Add data validation for 1) multiple consecutive whitespaces and 2) newlines.
- Fix some errors in tests.
- Add an argument to the `DatasetContainer` interface to apply a pre-processing data cleaning function before validation.
- Hot-fix the BPEmb base URL download problem (see issue 221).
- Fix the NumPy version due to a major release with breaking changes.
- Fix the SciPy version due to a breaking change with Gensim.
- Fix circular import in the API app.
- Fix deprecated `max_request_body_size` in Sentry.
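The `DatasetContainer` pre-processing hook above accepts a data cleaning callable applied before validation. A minimal sketch of such a cleaner, targeting the two validation cases this release adds (consecutive whitespace and newlines); the function name and exact behaviour are illustrative, not Deepparse's built-in cleaner:

```python
import re

def clean_address(address: str) -> str:
    """Hypothetical pre-processing cleaner of the kind the
    DatasetContainer argument accepts: collapse runs of
    whitespace (including newlines) into single spaces."""
    return re.sub(r"\s+", " ", address).strip()

print(clean_address("350  rue des Lilas\nOuest"))  # → 350 rue des Lilas Ouest
```

A callable like this would be passed to the container so addresses are normalized before the whitespace/newline validation runs.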
0.9.9
- Add version to Seq2Seq and AddressParser.
- Add Deepparse as an API using FastAPI.
- Add a Dockerfile and a `docker-compose.yml` to build a Docker container for the API.
- Bug-fix the default pre-processors: all of them are now applied, not only the last one.
0.9.8 and weights release
- Hot-Fix wheel install (See issue 196).
0.9.7
- New models release with more meta-data.
- Add a feature to use an AddressParser from a URI.
- Add a feature to upload the trained model to a URI.
- Add an example of how to use URI for parsing from and uploading to.
- Improve error handling of `path_to_retrain_model`.
- Bug-fix pre-processor error.
- Add verbose override and improve verbosity handling in retrain.
- Bug-fix the broken FastText installation by using `fasttext-wheel` instead of `fasttext`.
0.9.6
- Add Python 3.11 support.
- Add a pre-processor step when parsing addresses.
- Add `pin_memory=True` when using a CUDA device to increase performance, as suggested by the Torch documentation.
- Add a `torch.no_grad()` context manager in `__call__()` to increase performance.
- Reduce memory swap between CPU and GPU by instantiating tensors directly on the GPU device.
- Improve the clarity of some warnings (i.e. category and message).
- Bug-fix macOS multiprocessing. It was impossible to use in multiprocess mode since we were not testing whether torch multiprocessing was set properly. Now, we set it properly and raise a warning instead of an error.
- Drop Python 3.7 support since newer Python versions are faster and Torch 2.0 does not support Python 3.7.
- Improve error handling of wrong checkpoint loading in AddressParser `retrain_path` use.
- Add `torch.compile` integration with `mode="reduce-overhead"`, as suggested in the documentation, to improve performance (Torch 1.x is still supported). It increases performance by about 1%.
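The `torch.no_grad()` change above can be sketched as follows; this is a toy linear model standing in for Deepparse's seq2seq network, not the library's actual inference code:

```python
import torch

def infer(model: torch.nn.Module, batch: torch.Tensor) -> torch.Tensor:
    # Disabling autograd bookkeeping during inference saves memory
    # and time, which is the optimization this release applies.
    with torch.no_grad():
        return model(batch)

model = torch.nn.Linear(4, 2)  # stand-in for the address parsing model
out = infer(model, torch.zeros(1, 4))
print(out.requires_grad)  # False: no gradient graph was built
```

The same wrapper is a natural place for the other optimizations listed (device-local tensor creation, `pin_memory=True` on CUDA), since all of them only affect the inference path.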