Training with T5_large returns outputs.loss as NaN
#19, opened by Swithun
Why is outputs.loss NaN when I load T5_large for training, e.g. outputs = model(input_ids, labels=target_ids)? On the same dataset, loading the T5_small model gives a correct outputs.loss.
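To pin down the step at which the loss first becomes NaN, a small check along these lines might help. This is only a sketch: `assert_finite_loss` is a hypothetical helper name, not part of any library, and the commented usage assumes a PyTorch training loop around the call shown above.

```python
import math


def assert_finite_loss(loss_value: float, step: int) -> float:
    """Hypothetical helper: raise as soon as the loss is NaN or +/-inf."""
    if not math.isfinite(loss_value):
        raise ValueError(f"non-finite loss {loss_value!r} at step {step}")
    return loss_value


# Intended use inside a training loop (assumes PyTorch and a loaded T5 model):
#   outputs = model(input_ids, labels=target_ids)
#   assert_finite_loss(outputs.loss.item(), step)
#   outputs.loss.backward()
```

Stopping at the first bad step makes it easier to inspect the exact batch and model settings that were in use when the NaN appeared.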
