Checkpoint file corrupt?

#1, opened by nshmyrevgmail

Trying to load a checkpoint, I get this error:

(tone) shmyrev@local:~/tink/T-one$ python3 /home/shmyrev/tink/t-one-repo/tone/scripts/export.py --config config.json --checkpoint model.safetensors --output_path out.onnx
Traceback (most recent call last):
  File "/home/shmyrev/tink/t-one-repo/tone/scripts/export.py", line 261, in <module>
    model = ModelToExport(config, args.checkpoint, args.chunk_duration_ms)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shmyrev/tink/t-one-repo/tone/scripts/export.py", line 91, in __init__
    checkpoint = torch.load(checkpoint_path, map_location=torch.device("cpu"), weights_only=False)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shmyrev/tink/tone/lib/python3.11/site-packages/torch/serialization.py", line 1554, in load
    return _legacy_load(
           ^^^^^^^^^^^^^
  File "/home/shmyrev/tink/tone/lib/python3.11/site-packages/torch/serialization.py", line 1802, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_pickle.UnpicklingError: unpickling stack underflow

Usually this error means the checkpoint file is corrupt.

md5sum model.safetensors 
a111792853b72b5bad4502e31753c126  model.safetensors
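
For reference, one quick way to distinguish a truncated or corrupt download from a format mismatch is to try listing the tensor names with the safetensors library itself. This is just a minimal diagnostic sketch (it assumes the safetensors package is installed):

from safetensors import safe_open  # pip install safetensors

# Lazily open the file and list its tensor names; a corrupt or
# truncated file would fail here with a header/parsing error.
with safe_open("model.safetensors", framework="pt", device="cpu") as f:
    keys = list(f.keys())
    print(f"{len(keys)} tensors, e.g. {keys[:3]}")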

Hi, thanks for reaching out!

The issue you're encountering is due to a change in the checkpoint format expected by the export script. Older versions of the script expected model.ckpt files saved with torch.save(). The current version on the main branch is designed to work with model.safetensors checkpoints, which are the standard output of the Hugging Face Trainer; these require a different loading method and are not read with torch.load().
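
For illustration, safetensors checkpoints are read roughly like this (a minimal sketch using the safetensors package, not the exact code from the export script):

from safetensors.torch import load_file

# load_file returns a plain dict of parameter name -> tensor, which can
# then be passed to model.load_state_dict(); no pickle is involved, so
# torch.load() cannot parse this format.
state_dict = load_file("model.safetensors", device="cpu")
print(f"loaded {len(state_dict)} tensors")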

To resolve this, please update your local code and run the updated export script:

python3 tone/scripts/export.py --path-to-pretrained /path/to/your/model_directory --output_path model.onnx

The /path/to/your/model_directory should contain files like config.json, model.safetensors, etc.
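
For example, a typical Trainer output directory looks like this (file names other than config.json and model.safetensors are illustrative):

/path/to/your/model_directory
├── config.json
├── model.safetensors
└── ...  (tokenizer files, training state, and anything else the Trainer saved)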

This new script handles the safetensors format correctly. If you're interested, you can see the updated loading logic in the export script's source code.

Please let us know if that solves the problem for you!
