optimum-neuron-cache / inference-cache-config
33.9 kB
dacorvo HF Staff
use longer sequence length for llama3 on trn2
f8538f0 verified