Su Zhu
Find solid baselines in https://github.com/sz128/slot_filling_and_intent_detection_of_SLU.
The Chinese results are problematic: they are incomplete.
@lale314 You can filter out the incomplete samples with code like the following.

```python
import sys
import json

with open(sys.argv[1]) as fin:
    for line in fin:
        line = line.strip()
        sample = json.loads(line)
        output = sample['output'].strip(" \n\"”")
        # keep only samples whose output ends with terminal punctuation
        if output[-1] in set("?!.。?!})]`》)"):  # the original condition continued with "or ..." (truncated)
            print(line)  # assumed: emit the lines judged complete
```
> Isn't this already supported with #31629? It seems that both implementations are similar.

But we need to consider the situation where position ids are not reset between different...
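When position ids are not reset between packed samples, the sample boundaries must be carried some other way, for example with per-token segment ids that restrict attention to a block-diagonal causal pattern. A minimal sketch of that idea, with purely illustrative names (not the PR's actual API):

```python
# Sketch: build a block-diagonal causal attention mask from per-token
# segment ids, so tokens cannot attend across packed-sample boundaries.
def packed_causal_mask(segment_ids):
    n = len(segment_ids)
    # position j is visible from position i only if both tokens belong to
    # the same packed sample (equal segment id) and j is not in the future
    return [[segment_ids[i] == segment_ids[j] and j <= i for j in range(n)]
            for i in range(n)]

# two samples of lengths 2 and 3 packed into one row
mask = packed_causal_mask([0, 0, 1, 1, 1])
# token 2 (start of sample 1) sees only itself, not tokens of sample 0
print(mask[2])  # → [False, False, True, False, False]
```

This works even when position ids run continuously across the whole packed row, which is exactly the case where boundaries cannot be inferred from positions alone.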
> Hey! Don't you think that ragging the tensor would be more efficient?

Yes. I didn't describe it well. I updated the description of this PR. The implementations of this...
> > > Hey! Don't you think that ragging the tensor would be more efficient?
> >
> > Yes. I didn't describe it well. I updated the...
> > we need to consider the scenario where position IDs are not reset between different short samples, especially for LLM pre-training
>
> does this imply us properly computing...
> > > > we need to consider the scenario where position IDs are not reset between different short samples, especially for LLM pre-training
> > >
> > > ...
> Why wouldn't we use `position_ids` to encode all information (packed, not packed, padded, not padded) in a slightly more elegant way without touching `attention_mask`?
>
> For example let's...
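If position ids do reset to zero at the start of each packed sequence, the boundaries can indeed be recovered from `position_ids` alone. A minimal sketch under that assumption (the helper name and the FlashAttention-style `cu_seqlens` output are hypothetical, not this PR's code):

```python
# Sketch: recover packed-sequence boundaries from position_ids alone,
# assuming each packed sequence restarts its positions at 0.
def cu_seqlens_from_position_ids(position_ids):
    """Return cumulative sequence lengths (FlashAttention-varlen style)."""
    # a new sequence starts wherever the position id fails to increase by 1
    starts = [i for i, p in enumerate(position_ids)
              if i == 0 or p <= position_ids[i - 1]]
    return starts + [len(position_ids)]

# three sequences of lengths 3, 2, and 4 packed into one row
packed = [0, 1, 2, 0, 1, 0, 1, 2, 3]
print(cu_seqlens_from_position_ids(packed))  # → [0, 3, 5, 9]
```

This is why encoding everything in `position_ids` is attractive: no extra mask tensor is needed, as long as every packed sample resets its positions.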
> Yep, agree with that definitely!

My proposal was to leave this choice to users to set in the data collator. If they wish to treat such concatenated sequences as a...
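The collator-level choice described above can be sketched as follows; the function and flag names are illustrative, not an actual Transformers API:

```python
# Sketch: a packing collator that lets the user choose whether packed
# samples keep independent positions or form one continuous stream.
def pack(samples, reset_position_ids=True):
    input_ids, position_ids = [], []
    for s in samples:
        input_ids.extend(s)
        if reset_position_ids:
            # treat each sample as independent: positions restart at 0
            position_ids.extend(range(len(s)))
        else:
            # treat the concatenation as one long sequence
            start = len(position_ids)
            position_ids.extend(range(start, start + len(s)))
    return input_ids, position_ids

ids, pos = pack([[7, 8, 9], [4, 5]], reset_position_ids=True)
print(pos)  # → [0, 1, 2, 0, 1]
ids, pos = pack([[7, 8, 9], [4, 5]], reset_position_ids=False)
print(pos)  # → [0, 1, 2, 3, 4]
```

With `reset_position_ids=True` a boundary-aware attention path can recover the sample splits; with `False` the model sees one continuous document, which is the other behavior users might want.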