Description
Hi, I'm trying to reproduce your experiment.
I trained on the 61 examples using the BERT classifier as a baseline and got 72.49% accuracy, and EDA reaches 74.57%, but BERT-finetune and GPT2-finetune only reach about 64% and 66% accuracy.
I also have some questions about fine-tuning BERT with MLM and GPT-2 with CLM using these two scripts: https://github.com/huggingface/transformers/blob/master/examples/language-modeling/run_mlm.py
https://github.com/huggingface/transformers/blob/master/examples/language-modeling/run_clm.py
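For context, here is roughly what my BERT fine-tuning reduces to (a minimal sketch of the run_mlm.py pipeline; the file names, output directory, and hyperparameters are placeholders for my setup, not values from your paper):

```python
# A minimal sketch of what run_mlm.py boils down to in my setup; file names,
# output directory, and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# One example per line, as with the --line_by_line flag of run_mlm.py.
raw = load_dataset("text", data_files={"train": "train.txt", "validation": "dev.txt"})
tokenized = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

# Random 15% token masking, the run_mlm.py default.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-mlm-finetuned", num_train_epochs=10),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=collator,
)
trainer.train()
```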
- How do you select the best model once fine-tuning completes? Is it just a matter of setting the `--load_best_model_at_end` flag? (My current guess is the first sketch after this list.)
- How do you mask tokens when augmenting with the fine-tuned BERT? Is it done with `DataCollector.py`? Do you mask whole words or single sub-tokens? (See the second sketch after this list.)
- Can you share more details of the GPT-2 fine-tuning? I get a minimum perplexity of 47 with epochs=10, and I am confused about how to pick the best fine-tuned GPT-2 model for augmentation. (The third sketch after this list shows my setup.)
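On the first question: this is my current guess, using the standard `TrainingArguments` options that `--load_best_model_at_end` maps to; whether this matches your selection procedure is exactly what I'm asking (output directory and epoch count are placeholders):

```python
from transformers import TrainingArguments

# My current guess at best-model selection: evaluate and checkpoint every
# epoch, then reload the checkpoint with the lowest validation loss once
# training ends. Output directory and epoch count are placeholders.
args = TrainingArguments(
    output_dir="bert-mlm-finetuned",
    num_train_epochs=10,
    evaluation_strategy="epoch",        # run validation after every epoch
    save_strategy="epoch",              # must match the evaluation strategy
    load_best_model_at_end=True,        # reload the best checkpoint at the end
    metric_for_best_model="eval_loss",  # "best" = lowest validation loss
)
```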
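On the second question: for reference, this is how I am generating substitutes with the fine-tuned BERT at the moment, masking one word at a time with a fill-mask pipeline. The `augment` helper and the model path are my own placeholders, and I don't know whether your implementation instead masks through a data collator, or masks whole words rather than sub-tokens:

```python
import random

from transformers import pipeline

# Placeholder path: wherever the fine-tuned model and tokenizer were saved.
fill_mask = pipeline("fill-mask", model="bert-mlm-finetuned")

def augment(sentence: str, n_candidates: int = 3) -> list[str]:
    # Replace one randomly chosen word with [MASK] and let the fine-tuned
    # BERT propose replacements; each prediction yields one new sentence.
    words = sentence.split()
    words[random.randrange(len(words))] = fill_mask.tokenizer.mask_token
    preds = fill_mask(" ".join(words), top_k=n_candidates)
    return [pred["sequence"] for pred in preds]

print(augment("the movie was surprisingly good"))
```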
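On the third question: this is the sketch of my GPT-2 fine-tuning, with perplexity computed as exp of the validation cross-entropy loss, which is also how run_clm.py reports it at the end; that is where my value of 47 comes from. Again, file names and hyperparameters are placeholders:

```python
import math

from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

raw = load_dataset("text", data_files={"train": "train.txt", "validation": "dev.txt"})
tokenized = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-clm-finetuned", num_train_epochs=10),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    # mlm=False makes the collator emit causal-LM labels (a copy of the inputs).
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()

# Perplexity = exp(mean cross-entropy loss) on the validation set; this is
# where the value of 47 mentioned above comes from.
print("validation perplexity:", math.exp(trainer.evaluate()["eval_loss"]))
```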
Thank you!!!