Releases: GRAAL-Research/deepparse

0.10.0

28 Feb 01:52
5b17c0b

  • Breaking change: Drop Python 3.8 and 3.9 support. Minimum required version is now Python 3.10.
  • Add Python 3.12 support.
  • Add compatibility with Intel GPUs and other Torch acceleration devices.
  • Remove numpy<2.0.0 version cap to support NumPy 2.x.
  • Bump pinned dependency versions for Python 3.12+ compatibility (unpin uvicorn, python-decouple, pylint-django, pre-commit, pycountry).
  • Update CI/CD workflows, documentation, and package setup for Python 3.12.
  • Update outdated model download URLs from graal.ift.ulaval.ca to the Hugging Face Hub.
  • Remove unused BASE_URL constant.
  • Add missing parameterized type hints across the codebase (List[str], Dict[str, int], List[Callable], etc.).
  • Add missing return type annotations to multiple functions and methods.
  • Add Python 3.13 support.
  • Make fasttext-wheel an optional dependency. On Python 3.13+ (or when fasttext is not installed), Deepparse
    automatically falls back to Gensim's load_facebook_vectors to load FastText embeddings. Install with
    pip install deepparse[fasttext] for native FastText support on Python 3.10–3.12.
  • Fix FastTextEmbeddingsModel.dim property to use vector_size when using the Gensim fallback.
  • Fix Python version in deploy.yml and python-publish.yml workflows (was set to non-existent 3.14).
  • Update GitHub Actions to latest major versions (checkout@v4, setup-python@v5, codeql-action@v3, gh-pages@v4).
  • Fix Dependabot target branch from non-existent develop to dev.
  • Remove dead code: unused attention_output tensor allocation in Seq2SeqModel._decoder_step.
  • Fix download_fasttext_magnitude_embeddings always re-downloading even when cached.
  • Replace assert statements with HTTPException in FastAPI app for proper error handling.
  • Change default logging level from DEBUG to WARNING in app module.
  • Remove unnecessary pybind11 from build-system requirements.
  • Remove obsolete "temporary fix for torch 1.6" comments in encoder, decoder and embedding network.
  • Remove deprecated version key from docker-compose.yml.
  • Fix mixed f-string/%-formatting in download progress bar.
  • Update Dockerfile base images to PyTorch 2.5.1 / CUDA 12.4 and Python 3.13.
  • Migrate PyPI publish workflow from deprecated setup.py sdist bdist_wheel to python -m build.
  • Add MANIFEST.in to ensure version.txt and README.md are included in source distributions.
  • Restrict docs workflow to only build on main, dev, and stable branches.
  • Bump actions/setup-python from v5 to v6.
  • Bump docker/metadata-action from 4.3.0 to 5.10.0.
  • Bump docker/build-push-action from 4.0.0 to 6.19.2.
  • Bump github/codeql-action from v3 to v4.
  • Bump actions/first-interaction from v1 to v3.
  • Bump fastapi[all] from 0.109.1 to 0.134.0.

0.9.14

24 Jul 16:39

  • Switch model weights hosting to the Hugging Face Hub.

0.9.13

12 Sep 22:12
559bad2

Fix dependency issues with Gensim and Scipy versions.

0.9.12

09 Jul 13:04

  • Bug-fix download_models to call the BPEmbBaseURLWrapperBugFix class instead of BPEmb, fixing the download URL.

0.9.11

09 Jul 02:47

  • Fix Sentry version error in Docker Image.

0.9.10

23 Jun 22:40
2ecb38d

  • Fix and improve documentation.
  • Remove fixed dependency versions.
  • Fix app errors.
  • Add data validation for multiple consecutive whitespace characters and newlines.
  • Fix some errors in tests.
  • Add an argument to the DatasetContainer interface to use a pre-processing data cleaning function before validation.
  • Hot-fix the BPEmb base URL download problem (see issue 221).
  • Pin the NumPy version due to a major release with breaking changes.
  • Pin the SciPy version due to a breaking change with Gensim.
  • Fix circular import in the API app.
  • Fix deprecated max_request_body_size in Sentry.
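A data-cleaning pre-processing function of the kind mentioned above can be as simple as normalizing whitespace so the data passes the new validation rules; a sketch under that assumption (the function name and the exact DatasetContainer argument name are illustrative):

```python
import re


def clean_address(address: str) -> str:
    # Collapse runs of whitespace (including newlines) into single
    # spaces and strip the ends, so the address satisfies the new
    # validation: no consecutive whitespace, no newline characters.
    return re.sub(r"\s+", " ", address).strip()
```

Such a function would then be passed to a DatasetContainer through its pre-processing argument so that cleaning runs before validation.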

0.9.9

11 Aug 18:28

  • Add version to Seq2Seq and AddressParser.
  • Expose Deepparse as an API using FastAPI.
  • Add a Dockerfile and a docker-compose.yml to build a Docker container for the API.
  • Bug-fix the default pre-processors: not all of them were applied, only the last one.

0.9.8 and weights release

25 May 15:48
2752cfa

0.9.7

22 May 13:10
dfbab27

  • Release new models with more metadata.
  • Add a feature to use an AddressParser from a URI.
  • Add a feature to upload the trained model to a URI.
  • Add an example of how to parse from and upload to a URI.
  • Improve error handling of path_to_retrain_model.
  • Bug-fix pre-processor error.
  • Add verbose override and improve verbosity handling in retrain.
  • Bug-fix the broken FastText installation by using fasttext-wheel instead of fasttext.

0.9.6

31 Mar 10:49
fdf1a8e


  • Add Python 3.11 support.
  • Add pre-processors when parsing addresses.
  • Add pin_memory=True when using a CUDA device to increase performance, as suggested by the Torch documentation.
  • Add a torch.no_grad() context manager in __call__() to increase performance.
  • Reduce memory swapping between CPU and GPU by instantiating tensors directly on the GPU device.
  • Improve the clarity of some warnings (i.e. category and message).
  • Bug-fix macOS multiprocessing. It was impossible to use multiprocessing since we were not testing whether Torch multiprocessing was set up properly. Now, we set it properly and raise a warning instead of an error.
  • Drop Python 3.7 support since newer Python versions are faster and Torch 2.0 does not support Python 3.7.
  • Improve error handling of wrong checkpoint loading when using AddressParser's retrain_path.
  • Add torch.compile integration with mode="reduce-overhead", as suggested in the documentation, to improve performance (Torch 1.x is still supported). It increases performance by about 1%.
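Pre-processors like those introduced above are just callables applied to the raw address before parsing. A minimal sketch of how a chain of them composes (the function names are illustrative, not Deepparse's built-ins):

```python
from typing import Callable, List


def lower_case(address: str) -> str:
    # Normalize casing before parsing.
    return address.lower()


def remove_commas(address: str) -> str:
    # Strip punctuation that carries no address information.
    return address.replace(",", "")


def apply_pre_processors(
    address: str, pre_processors: List[Callable[[str], str]]
) -> str:
    # Apply each pre-processor in order, feeding the output of one
    # into the next, so every pre-processor takes effect.
    for pre_processor in pre_processors:
        address = pre_processor(address)
    return address
```

Chaining them sequentially (rather than applying each to the raw input) is what ensures all pre-processors take effect, not just the last one.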