Description
Hi, I'm trying to reproduce your experiment.
I trained on the 61 examples using the BERT classifier as a baseline and got 72.49% accuracy, and EDA reaches 74.57%, but BERT-finetune and GPT2-finetune only reach about 64% and 66% accuracy.
I also have some questions about fine-tuning BERT with MLM and GPT-2 with CLM using these two scripts: https://github.com/huggingface/transformers/blob/master/examples/language-modeling/run_mlm.py
https://github.com/huggingface/transformers/blob/master/examples/language-modeling/run_clm.py
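For context, here is roughly what my BERT fine-tuning reduces to (a minimal sketch of the run_mlm.py pipeline; the file names, output directory, and hyperparameters are placeholders for my setup, not values from your paper):

```python
# A minimal sketch of what run_mlm.py boils down to in my setup; file names,
# output directory, and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# One example per line, as with the --line_by_line flag of run_mlm.py.
raw = load_dataset("text", data_files={"train": "train.txt", "validation": "dev.txt"})
tokenized = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

# Random 15% token masking, the run_mlm.py default.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-mlm-finetuned", num_train_epochs=10),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=collator,
)
trainer.train()
```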
- How do you select the best model once fine-tuning completes? Is it just a matter of setting the `--load_best_model_at_end` flag? (My current guess is the first sketch after this list.)
- How do you mask tokens when augmenting with the fine-tuned BERT? Is it done with `DataCollector.py`? Do you mask whole words or single sub-tokens? (See the second sketch after this list.)
- Can you share more details of the GPT-2 fine-tuning? I get a minimum perplexity of 47 with epochs=10, and I am confused about how to pick the best fine-tuned GPT-2 model for augmentation. (The third sketch after this list shows my setup.)
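On the first question: this is my current guess, using the standard `TrainingArguments` options that `--load_best_model_at_end` maps to; whether this matches your selection procedure is exactly what I'm asking (output directory and epoch count are placeholders):

```python
from transformers import TrainingArguments

# My current guess at best-model selection: evaluate and checkpoint every
# epoch, then reload the checkpoint with the lowest validation loss once
# training ends. Output directory and epoch count are placeholders.
args = TrainingArguments(
    output_dir="bert-mlm-finetuned",
    num_train_epochs=10,
    evaluation_strategy="epoch",        # run validation after every epoch
    save_strategy="epoch",              # must match the evaluation strategy
    load_best_model_at_end=True,        # reload the best checkpoint at the end
    metric_for_best_model="eval_loss",  # "best" = lowest validation loss
)
```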
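On the second question: for reference, this is how I am generating substitutes with the fine-tuned BERT at the moment, masking one word at a time with a fill-mask pipeline. The `augment` helper and the model path are my own placeholders, and I don't know whether your implementation instead masks through a data collator, or masks whole words rather than sub-tokens:

```python
import random

from transformers import pipeline

# Placeholder path: wherever the fine-tuned model and tokenizer were saved.
fill_mask = pipeline("fill-mask", model="bert-mlm-finetuned")

def augment(sentence: str, n_candidates: int = 3) -> list[str]:
    # Replace one randomly chosen word with [MASK] and let the fine-tuned
    # BERT propose replacements; each prediction yields one new sentence.
    words = sentence.split()
    words[random.randrange(len(words))] = fill_mask.tokenizer.mask_token
    preds = fill_mask(" ".join(words), top_k=n_candidates)
    return [pred["sequence"] for pred in preds]

print(augment("the movie was surprisingly good"))
```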
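On the third question: this is the sketch of my GPT-2 fine-tuning, with perplexity computed as exp of the validation cross-entropy loss, which is also how run_clm.py reports it at the end; that is where my value of 47 comes from. Again, file names and hyperparameters are placeholders:

```python
import math

from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

raw = load_dataset("text", data_files={"train": "train.txt", "validation": "dev.txt"})
tokenized = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-clm-finetuned", num_train_epochs=10),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    # mlm=False makes the collator emit causal-LM labels (a copy of the inputs).
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()

# Perplexity = exp(mean cross-entropy loss) on the validation set; this is
# where the value of 47 mentioned above comes from.
print("validation perplexity:", math.exp(trainer.evaluate()["eval_loss"]))
```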
Thank you!!!