Merged
* Group NER formatted preds for multi sample * Fix test * Fix another test error
* Adapted DataSilo for only loading test data + added example for evaluation * Catch ML Logger exceptions + provide data for evaluation example * Add Checksum for germeval 17 + initialize tensor_names variable in constructor of DataSilo
* Added registering and param for eval report * Fix registered ph output types bug * rename to register_report(). change order in checks. add docstring and error message * fix typo Co-authored-by: Malte Pietsch <[email protected]>
* Init * Add loading from remote sources * Add wordembedding conversion * Add own Embeddingtokenizer * Add embedding extraction script * Adjust vocab loading, reformat code * Add fasttext as LM * Add docstrings, change fasttext converter
* update data silo for own QA cross valid * provide x-validation for QA + example script * Include 'evaluator_test' parameter in Trainer to prevent evaluating twice * Do more balanced split of documents * Provide top-n-recall metric for QA * remove whitespace * Add assertions on tensor names * Add topn recall to examples Co-authored-by: Timo Moeller <[email protected]>
* more flexible tqdm disabling in inferencer. move fasttext import * add logging of worker num
* Provide possibility to implicitly connect heads with processor when loading AdaptiveModel * Fix minor bug in eval script * Assert that prediction heads are connected to Processor
* add english glove models * fix config path * fix lowercasing flag
…class can be pickled (#345) Co-authored-by: skirdey <[email protected]>
* Update tutorial 1 * Update tutorial 2
* requirements: bump Transformers to latest 2.8 version * tokenization: add support for ELECTRA * modeling: add support for new ELECTRA model * experiments: add configuration for English ELECTRA evaluation
* init * Add qa accuracy benchmarks * Rename and move script
…al implementation (#364)
* proper attribute assignment with yes / no answer * simplify add_cls() Co-authored-by: Malte Pietsch <[email protected]>
* Add model optimization to inference loading * Add check for onnx model * make optimize model function public
* unify squad and nq baskets * attempt at simplifying ids * clean id handling * Better handling of different input dicts * apply_tokenization merged * clean apply_tokenization * Rename samples to passages * Clean id handling * rename is_impossible to no_answer * Fix ID error
* Implement test for AdaptiveModel loading * Also save model_weight numpy array as list * Refactor test naming and documentation * Add check for class_weights dimensions - throw ValueError - add test * Remove unused imports
… in scheduler_opts (#437)
* unify squad and nq baskets * Clean id handling * Add QAInference type hints * add input_features test
* unify squad and nq baskets * attempt at simplifying ids * clean id handling * Better handling of different input dicts * apply_tokenization merged * clean apply_tokenization * Rename samples to passages * Clean id handling * rename is_impossible to no_answer * Rename preds_p to preds * Add QAInference type hints * Adjust examples to new changes * Fix type hint error * Check that label character index matches label str * Minor improvements * Enforce single label doc cls in preprocessing * Refactor span_to_string, clean predictions objects * Remove unnecessary iteration * WIP clean and document predictions.py * Add documentation of Pred objects * Fix list index bug * Fix index in test sample * Refactor data check * Fix docstring * Context window adjusts to long answers * Auto generate attributes of QACandidate * Refactor for readability * Remove unused fn * Add formatting space * Fix tok instead of char issue * Fix issue populating QAPred * Refactor sample check * Fix bugs
* Start implementation of Classes * Working first implementation * qa tests refactored * Add todos * input objects are implemented * add more granular tests * Make methods private * improve id extraction * add nq tests * fix bug * Address reviewer comments * Fix test bug * comment out test assert * simplify fixture * revert tests * Remove call to fixture * Adjust multiprocessing of pytest fixture * Fix typo Co-authored-by: Timo Moeller <[email protected]>
* upgrade transformers version * fix encoder attribute. fix truncation warning * fix add_prefix_space for RobertaTokenizer * change way to set add_prefix_space as question answering accuracy benchmark was decreasing by ~ 1% * change log level back
* fix end ch calculation * Clean up fn return
* Fix index checks * Better error message
* Set QAInferencer.task_type = "question_answering" * ensure QAInferencer attribute is set * fix type
No description provided.