How you set the labels of Commonsense_Reasonsing

Hi, I notice there are two ways of setting labels in the commonsense reasoning dataset.

1) If we set train_on_inputs=True, the prompt has the form [Instruction, Input, Output] in the Commonsense Reasoning datasets. One example:
"""
Below is an instruction that describes a task. Write a response that appropriately completes the request.  
Instruction: Please answer the following question with true or false, question: do iran and afghanistan speak the same language?
Answer format: true/false
Response: the correct answer is true.
"""
The tokenizer tokenizes the whole prompt, and the labels are simply a copy of the input tokens, so the loss is next-token prediction over the entire sequence. That includes predicting the tokens of the instruction itself, which seems odd for this task.

2) If we set train_on_inputs=False, the labels of the prompt tokens are masked as [-100, -100, ..., -100] up to the start of the actual Response, so the loss is computed only on the response tokens. This is closer to commonsense prediction, where the focus is on the "Response" (the "Output").
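
To make the difference between the two settings concrete, here is a minimal sketch of how the labels could be built in each case. It is not the repository's actual code: `build_labels` and `toy_tokenize` are hypothetical names, and the whitespace tokenizer is only a stand-in for the real one. The only thing taken from the post is the -100 mask value, which is also the default `ignore_index` of PyTorch's cross-entropy loss.

```python
IGNORE_INDEX = -100  # positions with this label are skipped by the loss


def toy_tokenize(text, vocab):
    """Hypothetical whitespace tokenizer standing in for the real tokenizer."""
    return [vocab.setdefault(tok, len(vocab)) for tok in text.split()]


def build_labels(prompt, response, train_on_inputs, vocab):
    """Return (input_ids, labels) for one example (illustrative helper)."""
    prompt_ids = toy_tokenize(prompt, vocab)
    response_ids = toy_tokenize(response, vocab)
    input_ids = prompt_ids + response_ids
    if train_on_inputs:
        # Setting 1: every token is a prediction target.
        labels = list(input_ids)
    else:
        # Setting 2: prompt tokens are masked, only the response is a target.
        labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids
    return input_ids, labels


vocab = {}
prompt = "Instruction: do iran and afghanistan speak the same language? Response:"
response = "the correct answer is true."

ids_full, labels_full = build_labels(prompt, response, True, vocab)
ids_masked, labels_masked = build_labels(prompt, response, False, vocab)

print(labels_full)    # identical to ids_full
print(labels_masked)  # -100 for each prompt token, then the response ids
```

With a real subword tokenizer, tokenizing the prompt on its own may not give exactly the same boundary as tokenizing the full text, so the mask length in an actual implementation would need to be checked against the tokenizer in use.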

I notice the default is the first setting. Could the authors kindly explain why the second setting is not used instead?

How you set the labels of Commonsense_Reasonsing #32
