Safetensors · English · t5
wencke-lm committed · Commit 674139b · verified · 1 Parent(s): ae2ade0

Update README.md

Files changed (1): README.md (+9 -1)
README.md CHANGED
@@ -13,7 +13,7 @@ base_model:
 #### Description
 The provided model was trained to respond to incorrect student answers in an interactive reading comprehension exercise setting. Incorrect student answers can become valuable learning opportunities, provided that the student understands where they went wrong and why. To this end, rather than being given the correct answer, students should receive elaborated feedback on how to correct a mistake on their own. Highlighting the complex demands that the generation of such feedback places on a model's input utilization abilities, we proposed two extensions to the training pipeline. Firstly, we employed a KL regularization term between a standard and an enriched input format to achieve more targeted input representations. Secondly, we added a preference optimization step to encourage student-answer-adaptive feedback generation.
 
-#### Evaluation Results
+#### Automatic Evaluation Results
 The final model was trained and evaluated on all feedback turns from the DIRECT and DIRECT-Feedback datasets, partially available at https://github.com/DIRECTDataset/DIRECTFeedback/blob/main/data/feedback_data_partial.csv
 
 | BLEU | METEOR | ROUGE | BERTScore |
@@ -23,6 +23,14 @@ The final model was trained and evaluated on all feedback turns from the DIRECT
 
 For additional details we refer the reader to our paper.
 
+#### Manual Evaluation Results
+We sampled 250 items from the joint DIRECT+DIRECT-F feedback set and had one of the authors of this paper manually evaluate the generated feedback.
+
+| appropriate (verification, explanation, and hint feedback) | direct (correct feedback) | irrelevant or ambiguous | unfaithful (contradicting the passage or alluding to an incorrect answer) |
+| :---: | :---: | :---: | :---: |
+| 43.6% | 23.6% | 22% | 10.8% |
+
 #### Execution
 Code and instructions on how to perform inference on the model are provided at https://github.com/DIRECTDataset/DIRECTFeedback
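
The description above mentions a KL regularization term between a standard and an enriched input format. The exact formulation is given in the paper; purely as an illustration of the general idea, a regularizer that pulls the output distribution under the standard input toward the distribution under the enriched input (teacher-forced on the same target feedback) could look roughly like the sketch below. The function name, batch layout, direction of the KL, and the weighting factor are all assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the authors' training code): a KL term that regularizes
# the model's output distribution under the standard input toward its distribution
# under the enriched input, teacher-forced on the same target feedback tokens.
import torch
import torch.nn.functional as F

def kl_regularized_loss(model, standard_batch, enriched_batch, labels, kl_weight=0.1):
    """standard_batch / enriched_batch: dicts with input_ids and attention_mask.
    labels: the gold feedback token ids, shared by both forward passes so the
    decoder logits are aligned. kl_weight is an assumed hyperparameter."""
    std_out = model(**standard_batch, labels=labels)      # standard input format
    with torch.no_grad():                                  # assumed: enriched pass used as a fixed target
        enr_out = model(**enriched_batch, labels=labels)   # enriched input format

    # KL(enriched || standard) over the target vocabulary distribution.
    kl = F.kl_div(
        F.log_softmax(std_out.logits, dim=-1),
        F.softmax(enr_out.logits, dim=-1),
        reduction="batchmean",
    )
    return std_out.loss + kl_weight * kl
```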
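
For reference, the four automatic metrics reported above (BLEU, METEOR, ROUGE, BERTScore) can be computed with the Hugging Face `evaluate` library. The snippet below is a minimal sketch with invented example strings; it does not reproduce the paper's exact evaluation protocol, and the ROUGE variant and BERTScore settings are assumptions.

```python
# Sketch: computing BLEU, METEOR, ROUGE, and BERTScore with the `evaluate` library.
# The prediction/reference strings are invented examples, not DIRECT data.
import evaluate

predictions = ["Look at the second paragraph again; it gives a different reason."]
references = ["Re-read paragraph two; the reason stated there is not the one you gave."]

bleu = evaluate.load("bleu").compute(predictions=predictions, references=references)
meteor = evaluate.load("meteor").compute(predictions=predictions, references=references)
rouge = evaluate.load("rouge").compute(predictions=predictions, references=references)
bertscore = evaluate.load("bertscore").compute(
    predictions=predictions, references=references, lang="en"
)

print(f"BLEU      {bleu['bleu']:.3f}")
print(f"METEOR    {meteor['meteor']:.3f}")
print(f"ROUGE-L   {rouge['rougeL']:.3f}")
print(f"BERTScore {sum(bertscore['f1']) / len(bertscore['f1']):.3f}")
```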
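
The linked DIRECTFeedback repository is the authoritative source for inference code. As a rough orientation only, a Safetensors T5-style seq2seq checkpoint such as this one is typically loaded and queried with `transformers` as sketched below; the repo id, prompt template, and generation settings are placeholders rather than the authors' configuration.

```python
# Minimal seq2seq inference sketch with Hugging Face transformers.
# "username/feedback-model" and the prompt template are placeholders; consult the
# DIRECTFeedback repository for the actual checkpoint id and input format.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "username/feedback-model"  # placeholder, not the real repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical input: the exercise passage, the question, and the incorrect student answer.
prompt = "passage: <passage text> question: <question text> student answer: <incorrect answer>"
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=128, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```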