Updating Fork by lingsond · Pull Request #1 · lingsond/FARM

lingsond · 2020-07-15T18:37:44Z

No description provided.

* Adapted DataSilo for only loading test data + added example for evaluation * Catch ML Logger exceptions + provide data for evaluation example * Add Checksum for germeval 17 + initialize tensor_names variable in constructor of DataSilo

* Added registering and param for eval report * Fix registered ph output types bug * rename to register_report(). change order in checks. add docstring and error message * fix typo

Co-authored-by: Malte Pietsch

* Init * Add loading from remote sources * Add wordembedding conversion * Add own Embeddingtokenizer * Add embedding extraction script * Adjust vocab loading, reformat code * Add fasttext as LM * Add docstrings, change fasttext converter

* update data silo for own QA cross valid * provide x-validation for QA + example script * Include 'evaluator_test' parameter in Trainer to prevent evaluating twice * Do more balanced split of documents * Provide top-n-recall metric for QA * remove whitespace * Add assertions on tensor names * Add topn recall to examples

Co-authored-by: Timo Moeller

* unify squad and nq baskets * attempt at simplifying ids * clean id handling * Better handling of different input dicts * apply_tokenization merged * clean apply_tokenization * Rename samples to passages * Clean id handling * rename is_impossible to no_answer * Fix ID error

* unify squad and nq baskets * attempt at simplifying ids * clean id handling * Better handling of different input dicts * apply_tokenization merged * clean apply_tokenization * Rename samples to passages * Clean id handling * rename is_impossible to no_answer * Rename preds_p to preds * Add QAInference type hints * Adjust examples to new changes * Fix type hint error * Check that label character index matches label str * Minor improvements * Enforce single label doc cls in preprocessing * Refactor span_to_string, clean predictions objects * Remove unneccessary iteration * WIP clean and document predictions.py * Add documentation of Pred objects * Fix list index bug * Fix index in test sample * Refactor data check * Fix docstring * Context window adjusts to long answers * Auto generate attributes of QACandidate * Refactor for readability * Remove unused fn * Add formatting space * Fix tok instead of char issue * Fix issue populating QAPred * Refactor sample check * Fix bugs

* Start implementation of Classes * Working first implementation * qa tests refactored * Add todos * input objects are implemented * add more granular tests * Make methods private * improve id extraction * add nq tests * fix bug * Address reviewer comments * Fix test bug * comment out test assert * simplify fixture * revert tests * Remove call to fixture * Adjust multiprocessing of pytest fixture * Fix typo

Co-authored-by: Timo Moeller

* upgrade transformers version * fix encoder attribute. fix truncation warning * fix add_prefix_space for RobertaTokenizer * change way to set add_prefix_space as question answering accuracy benchmark was decreasing by ~ 1% * change log level back

brandenchan and others added 30 commits April 15, 2020 13:41

Fixes issue with NER data format (#322)

a7d01c2

Update config when saving model to include changes of parameters (#323)

11aa9fd

Stop Exception being thrown when it didn't need to be (#324)

0455439

Add Dockerfile for GPU

1b84b18

Group NER preds by sample (#327)

bfce440

* Group NER formatted preds for multi sample * Fix test * Fix another test error

Make multiprocessing attributes an instance property of Inferencer (#329)

ecee40e

pin pytorch version

e3eeb87

Add S3E pooling of embeddings (#286)

f8d0744

Co-authored-by: Timo Moeller

Add tqdm progress bar in inferencer (#338)

5de49dc

* more flexible tqdm disabling in inferencer. move fasttext import * add logging of worker num

Implicitly connect heads with processor + check for connection (#337)

3d39512

* Provide possibility to implicitly connect heads with processor when loading AdaptiveModel * Fix minor bug in eval script * Assert that prediction heads are connected to Processor

Add english glove models (#339)

cc3a5ff

* add english glove models * fix config path * fix lowercasing flag

Upgrade version

6c2b00b

Upgrade version

e127d42

Adaptive Model - change usage of lambda to a regular function so the class can be pickled (#345)

f129385

Co-authored-by: skirdey
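For context on the commit above: Python's `pickle` serializes functions by reference to their qualified name, so a lambda stored on an instance typically prevents that instance from being pickled, while a module-level function does not. The sketch below illustrates the general behaviour only; `WithLambda`, `WithFunction`, `double`, and `can_pickle` are hypothetical names and are not taken from FARM's code.

```python
import pickle


def double(x):
    """Module-level function: pickle stores it by its importable name."""
    return 2 * x


class WithLambda:
    def __init__(self):
        # A lambda has no importable name, so pickle cannot reference it.
        self.fn = lambda x: 2 * x


class WithFunction:
    def __init__(self):
        # Same behaviour as the lambda, but reachable by name at module level.
        self.fn = double


def can_pickle(obj):
    """Return True if pickle.dumps succeeds on obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    print(can_pickle(WithLambda()))    # False
    print(can_pickle(WithFunction()))  # True
```

This matters mostly when objects are sent to worker processes, since `multiprocessing` relies on pickling to transfer them.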

Make ONNXRuntime dependency optional (#347)

2f28310

Update PyTorch to 1.5 (#351)

ab9ad89

Fix batchnorm error for batchsize 1 in wordembedding LM

dbcf91d

Update tutorials (#348)

ec66342

* Update tutorial 1 * Update tutorial 2

Add support for ELECTRA (#349)

2a33382

* requirements: bump Transformers to latest 2.8 version * tokenization: add support for ELECTRA * modeling: add support for new ELECTRA model * experiments: add configuration for English ELECTRA evaluation

Add target device optimisations for ONNX export (#354)

a2e3f37

Add inference speed benchmarking pipeline (#321)

397bbaa

Remove hack for additional tokens (#360)

37591d1

Make onnx import optional (#363)

997e360

Add __init__.py files for farm.conversion (#365)

af36b76

Question answering accuracy test (#357)

fc8c750

* init * Add qa accuracy benchmarks * Rename and move script

modeling: use gelu for pooled output of ELECTRA model to match original implementation (#364)

b8c5299

brandenchan and others added 29 commits June 19, 2020 12:02

Proper attribute assignment in QA prediction with yes / no answer (#414)

677b9d6

* proper attribute assignment with yes / no answer * simplify add_cls()

Co-authored-by: Malte Pietsch

Fix problematic loop in QA predictions (#416)

115e5d6

Fix bug in QACandidate.add_cls (#417)

9e8fa00

Add model optimization to inference loading (#415)

56b6095

* Add model optimization to inference loading * Add check for onnx model * make optimize model function public

Remove pydantic and reformat prediction objects

7a2c67d

Change assertions to logger errors

ceb9954

Upgrade version

3887ed5

Close Inferencer's multiprocessing pool in tests (#403)

451bc0b

Add doc link as badge (#432)

fbe8a1e

Fix loading of saved models with class weights (#431)

a5bda34

* Implement test for AdaptiveModel loading * Also save model_weight numpy array as list * Refactor test naming and documentation * Add check for class_weights dimensions - throw ValueError - add test * Remove unused imports

Make sure automatic num_training_steps doesn't overwrite user setting in scheduler_opts (#437)

14c857e
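The behaviour described by this commit title can be pictured as "compute a default, but only apply it when the user has not set the key". The following is a minimal sketch of that pattern, not the code from the commit: `resolve_scheduler_opts`, its parameters, and the `"LinearWarmup"` value are all hypothetical and chosen for illustration.

```python
def resolve_scheduler_opts(scheduler_opts, n_batches, n_epochs, grad_acc_steps=1):
    """Fill in num_training_steps only if the user did not provide it.

    Hypothetical helper: the name and signature are assumptions for this
    sketch and do not mirror FARM's internal API.
    """
    opts = dict(scheduler_opts)  # copy so the caller's dict is not mutated
    computed = (n_batches // grad_acc_steps) * n_epochs
    # setdefault keeps an existing user value and only fills a missing key
    opts.setdefault("num_training_steps", computed)
    return opts


if __name__ == "__main__":
    # No user value: the computed default (100 batches * 2 epochs) is used.
    print(resolve_scheduler_opts({"name": "LinearWarmup"}, 100, 2))
    # User value present: it is left untouched.
    print(resolve_scheduler_opts({"name": "LinearWarmup", "num_training_steps": 50}, 100, 2))
```

Copying the dict first is a deliberate choice here so that the user's original options can be logged or reused unchanged.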

Remove randomization in individual files when splitting a train file (#427)

0098819

Remove hardcoded seeds from trainer (#424)

daa6291

resolves #423

Add ELECTRA to readme

68cbe01

Question Answering improvements - NQ3 (#419)

f04c230

* unify squad and nq baskets * Clean id handling * Add QAInference type hints * add input_features test

Update readme.rst

ff8ce49

Fix ONNX conversion in benchmark fixture (#430)

16628c0

Upgrade pytorch and python versions (#447)

1228e60

Fix End Ch Calculation (#449)

86af052

* fix end ch calculation * Clean up fn return

Fix start and end offset checks in QA (#450)

fd3e0eb

* Fix index checks * Better error message

Remove exception in processor for individual samples (#451)

b8b59c4

Add meta attribute (#455)

a170498

Increase version to 0.4.6. relax pytorch requirement

856e276

fix bugs with regression label standardization (#456)

b98884d

Ensure QAInferencer always has task_type "question_answering" (#460)

1616f29

* Set QAInferencer.task_type = "question_answering" * ensure QAInferencer attribute is set * fix type

lingsond merged commit df22009 into lingsond:master Jul 15, 2020

lingsond merged 83 commits into lingsond:master from deepset-ai:master

13 participants
