Releases: GRAAL-Research/deepparse
0.10.0
- Breaking change: Drop Python 3.8 and 3.9 support. Minimum required version is now Python 3.10.
- Add Python 3.12 support.
- Add compatibility with Intel GPUs and other Torch acceleration devices.
- Remove `numpy<2.0.0` version cap to support NumPy 2.x.
- Bump pinned dependency versions for Python 3.12+ compatibility (unpin `uvicorn`, `python-decouple`, `pylint-django`, `pre-commit`, `pycountry`).
- Update CI/CD workflows, documentation, and package setup for Python 3.12.
- Fix outdated model download URLs from `graal.ift.ulaval.ca` to HuggingFace Hub.
- Remove unused `BASE_URL` constant.
- Add missing parameterized type hints across the codebase (`List[str]`, `Dict[str, int]`, `List[Callable]`, etc.).
- Add missing return type annotations to multiple functions and methods.
- Add Python 3.13 support.
- Make `fasttext-wheel` an optional dependency. On Python 3.13+ (or when `fasttext` is not installed), Deepparse automatically falls back to Gensim's `load_facebook_vectors` to load FastText embeddings. Install with `pip install deepparse[fasttext]` for native FastText support on Python 3.10–3.12.
- Fix `FastTextEmbeddingsModel.dim` property to use `vector_size` when using the Gensim fallback.
- Fix Python version in `deploy.yml` and `python-publish.yml` workflows (was set to non-existent 3.14).
- Update GitHub Actions to latest major versions (`checkout@v4`, `setup-python@v5`, `codeql-action@v3`, `gh-pages@v4`).
- Fix Dependabot target branch from non-existent `develop` to `dev`.
- Remove dead code: unused `attention_output` tensor allocation in `Seq2SeqModel._decoder_step`.
- Fix `download_fasttext_magnitude_embeddings` always re-downloading even when cached.
- Replace `assert` statements with `HTTPException` in the FastAPI app for proper error handling.
- Change default logging level from `DEBUG` to `WARNING` in the app module.
- Remove unnecessary `pybind11` from build-system requirements.
- Remove obsolete "temporary fix for torch 1.6" comments in the encoder, decoder, and embedding network.
- Remove deprecated `version` key from `docker-compose.yml`.
- Fix mixed f-string/%-formatting in the download progress bar.
- Update Dockerfile base images to PyTorch 2.5.1 / CUDA 12.4 and Python 3.13.
- Migrate the PyPI publish workflow from deprecated `setup.py sdist bdist_wheel` to `python -m build`.
- Add `MANIFEST.in` to ensure `version.txt` and `README.md` are included in source distributions.
- Restrict the docs workflow to only build on the `main`, `dev`, and `stable` branches.
- Bump `actions/setup-python` from v5 to v6.
- Bump `docker/metadata-action` from 4.3.0 to 5.10.0.
- Bump `docker/build-push-action` from 4.0.0 to 6.19.2.
- Bump `github/codeql-action` from v3 to v4.
- Bump `actions/first-interaction` from v1 to v3.
- Bump `fastapi[all]` from 0.109.1 to 0.134.0.
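The optional `fasttext` dependency above works through a guarded-import fallback. A minimal sketch of that pattern, assuming only that native `fasttext` is preferred when importable and Gensim is used otherwise (the helper name and return values here are illustrative, not Deepparse's actual code):

```python
import importlib.util

def pick_fasttext_loader() -> str:
    """Hypothetical sketch of an optional-dependency fallback:
    prefer the native fasttext package when it is installed
    (e.g. via `pip install deepparse[fasttext]` on Python 3.10-3.12),
    otherwise fall back to Gensim's load_facebook_vectors."""
    if importlib.util.find_spec("fasttext") is not None:
        return "fasttext"  # native FastText loader is available
    return "gensim"  # Gensim fallback, e.g. on Python 3.13+

print(pick_fasttext_loader())
```

Checking `find_spec` instead of wrapping the import in `try`/`except` avoids actually importing the heavy package just to detect it.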
0.9.14
0.9.13
0.9.12
- Bug-fix: call the `BPEmbBaseURLWrapperBugFix` class instead of `BPEmb` to fix the download URL in `download_models`.
0.9.11
- Fix Sentry version error in Docker Image.
0.9.10
- Fix and improve documentation.
- Remove pinned dependency versions.
- Fix app errors.
- Add data validation for 1) multiple consecutive whitespaces and 2) newlines.
- Fix some errors in tests.
- Add an argument to the `DatasetContainer` interface to apply a pre-processing data cleaning function before validation.
- Hot-fix the BPEmb base URL download problem (see issue 221).
- Fix the NumPy version due to a major release with breaking changes.
- Fix the SciPy version due to a breaking change with Gensim.
- Fix circular import in the API app.
- Fix deprecated `max_request_body_size` in Sentry.
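The `DatasetContainer` pre-processing hook above accepts a data cleaning callable applied before validation. A minimal sketch of such a cleaner, targeting the two validation cases this release adds (consecutive whitespace and newlines); the function name and exact behaviour are illustrative, not Deepparse's built-in cleaner:

```python
import re

def clean_address(address: str) -> str:
    """Hypothetical pre-processing cleaner of the kind the
    DatasetContainer argument accepts: collapse runs of
    whitespace (including newlines) into single spaces."""
    return re.sub(r"\s+", " ", address).strip()

print(clean_address("350  rue des Lilas\nOuest"))  # → 350 rue des Lilas Ouest
```

A callable like this would be passed to the container so addresses are normalized before the whitespace/newline validation runs.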
0.9.9
- Add version to Seq2Seq and AddressParser.
- Add Deepparse as an API using FastAPI.
- Add a Dockerfile and a `docker-compose.yml` to build a Docker container for the API.
- Bug-fix the default pre-processors: all of them are now applied, not only the last one.
0.9.8 and weights release
- Hot-Fix wheel install (See issue 196).
0.9.7
- New models release with more meta-data.
- Add a feature to use an AddressParser from a URI.
- Add a feature to upload the trained model to a URI.
- Add an example of how to use URI for parsing from and uploading to.
- Improve error handling of `path_to_retrain_model`.
- Bug-fix pre-processor error.
- Add verbose override and improve verbosity handling in retrain.
- Bug-fix the broken FastText installation by using `fasttext-wheel` instead of `fasttext`.
0.9.6
- Add Python 3.11 support.
- Add a pre-processor step when parsing addresses.
- Add `pin_memory=True` when using a CUDA device to increase performance, as suggested by the Torch documentation.
- Add a `torch.no_grad()` context manager in `__call__()` to increase performance.
- Reduce memory swap between CPU and GPU by instantiating tensors directly on the GPU device.
- Improve the clarity of some warnings (i.e. category and message).
- Bug-fix macOS multiprocessing. It was impossible to use in multiprocess mode since we were not testing whether torch multiprocessing was set properly. Now, we set it properly and raise a warning instead of an error.
- Drop Python 3.7 support since newer Python versions are faster and Torch 2.0 does not support Python 3.7.
- Improve error handling of wrong checkpoint loading in AddressParser `retrain_path` use.
- Add `torch.compile` integration with `mode="reduce-overhead"`, as suggested in the documentation, to improve performance (Torch 1.x is still supported). It increases performance by about 1%.
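The `torch.no_grad()` change above can be sketched as follows; this is a toy linear model standing in for Deepparse's seq2seq network, not the library's actual inference code:

```python
import torch

def infer(model: torch.nn.Module, batch: torch.Tensor) -> torch.Tensor:
    # Disabling autograd bookkeeping during inference saves memory
    # and time, which is the optimization this release applies.
    with torch.no_grad():
        return model(batch)

model = torch.nn.Linear(4, 2)  # stand-in for the address parsing model
out = infer(model, torch.zeros(1, 4))
print(out.requires_grad)  # False: no gradient graph was built
```

The same wrapper is a natural place for the other optimizations listed (device-local tensor creation, `pin_memory=True` on CUDA), since all of them only affect the inference path.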