This repository was archived by the owner on Apr 8, 2025. It is now read-only.

Download models from (private) S3 #500

Merged
tholor merged 5 commits into master from s3_downloads
Aug 25, 2020

@tholor tholor commented Aug 25, 2020

Adds a simple utility function for downloading a model from an S3 bucket (e.g. private AWS or an on-prem deployment).
Files that have already been downloaded are skipped.

Usage:

from farm.modeling.tokenization import Tokenizer
from farm.modeling.language_model import LanguageModel
from farm.file_utils import download_from_s3

# Download the model. If no custom cache_dir is supplied, the FARM default cache dir is used (~/.cache/torch/farm on Linux).
remote_model_path = "s3://your_bucket/bert-base-german-cased/"
local_model_path = download_from_s3(s3_url=remote_model_path, cache_dir=None)

# load components as usual
tokenizer = Tokenizer.load(local_model_path)
language_model = LanguageModel.load(local_model_path)
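
To illustrate the "skip files that are already there" behaviour, here is a minimal sketch. It is not the code from this PR: `sync_missing_files`, the `fetch` callback and `fake_fetch` are hypothetical names, and the file names are made up for the demo. A real version would presumably pass a small wrapper around an S3 client as `fetch`.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterable, List


def sync_missing_files(
    remote_keys: Iterable[str],
    local_dir: Path,
    fetch: Callable[[str, Path], None],
) -> List[str]:
    """Fetch only those remote files that are not yet present in local_dir.

    `fetch(key, target)` is a hypothetical callback that writes the remote
    object `key` to the local path `target`. Returns the keys that were
    actually fetched in this call.
    """
    local_dir.mkdir(parents=True, exist_ok=True)
    fetched = []
    for key in remote_keys:
        target = local_dir / Path(key).name
        if target.exists():
            # Already downloaded in an earlier run: skip it.
            continue
        fetch(key, target)
        fetched.append(key)
    return fetched


def fake_fetch(key: str, target: Path) -> None:
    # Stand-in for a real download (assumption): writes the key into the file.
    target.write_text(key)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Illustrative file names only, not taken from the PR.
        keys = ["some-model/config.json", "some-model/vocab.txt"]
        print(sync_missing_files(keys, Path(tmp), fake_fetch))  # first run fetches both
        print(sync_missing_files(keys, Path(tmp), fake_fetch))  # second run fetches nothing
```

Note that an existence check like this cannot tell whether a remote file has changed since the first download.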

Potential improvements in the future:

  • add a hash file so we can detect when remote files have been updated and need to be downloaded again
  • add a progress bar for the download
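
The hash idea could look roughly like the sketch below. This is only one possible design, not something implemented in this PR: it assumes a manifest mapping file names to SHA-256 hex digests is available from the remote side, and `sha256_of_file` and `files_needing_download` are hypothetical helper names.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_needing_download(remote_hashes: Dict[str, str], local_dir: Path) -> List[str]:
    """Return file names that are missing locally or whose hash differs.

    `remote_hashes` is an assumed manifest: file name -> SHA-256 hex digest.
    """
    stale = []
    for name, remote_hash in remote_hashes.items():
        local_file = local_dir / name
        if not local_file.exists() or sha256_of_file(local_file) != remote_hash:
            stale.append(name)
    return stale
```

Files returned by `files_needing_download` would then be fetched again; everything else could still be skipped as before.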

@tholor tholor requested a review from tanaysoni August 25, 2020 11:59
@tholor tholor changed the title WIP Download models from (private) S3 Download models from (private) S3 Aug 25, 2020
@tholor tholor merged commit 761028f into master Aug 25, 2020