triton_kernels and multiprocessing
Has anyone been able to share the quantized model across multiple processes in Python (i.e. using multiprocessing or torch.multiprocessing)? I'm getting pickling/unpickling errors related to triton_kernels:
ModuleNotFoundError: No module named 'triton_kernels_b8dc79e809df14c1'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/conda/lib/python3.12/multiprocessing/spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.12/multiprocessing/spawn.py", line 132, in _main
    self = reduction.pickle.load(from_parent)
Note that I can use the model fine from the main process, and I can share other models on CUDA with child processes (via spawn), but not this one. Has anyone seen this (and, better yet, found a solution)?
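For readers unfamiliar with this failure mode, here is a minimal sketch of one plausible mechanism. It assumes the kernels are registered at runtime under a generated, hash-suffixed module name that a freshly spawned interpreter cannot import. The module name and the Layer class below are hypothetical stand-ins and are not part of transformers, kernels, or triton_kernels. Pickle stores classes by reference (module name plus qualified name), so unpickling fails as soon as that module is missing on the receiving side.

```python
import pickle
import sys
import types

# Hypothetical module name, mimicking a hash-suffixed name registered at runtime.
MODULE_NAME = "dynamic_kernels_0123abcd"


def register_dynamic_module(name):
    """Create a module at runtime and register it in sys.modules only."""
    module = types.ModuleType(name)

    class Layer:
        def __init__(self, scale):
            self.scale = scale

    # Pickle records classes by reference: (module name, qualified name).
    Layer.__module__ = name
    Layer.__qualname__ = "Layer"
    module.Layer = Layer
    sys.modules[name] = module
    return module


def simulate_spawned_child(payload, name):
    """Unpickle the way a fresh interpreter would: without the runtime-registered module."""
    saved = sys.modules.pop(name)
    try:
        return pickle.loads(payload)
    except ModuleNotFoundError as exc:
        return exc
    finally:
        sys.modules[name] = saved


module = register_dynamic_module(MODULE_NAME)
payload = pickle.dumps(module.Layer(scale=2.0))

# Parent process: the module is registered, so the round trip works.
parent_result = pickle.loads(payload)

# Simulated child process: the module cannot be imported, so unpickling fails.
child_result = simulate_spawned_child(payload, MODULE_NAME)
print(type(child_result).__name__, child_result)
```

If this is indeed what happens here, it would also explain why the model works in the main process: the module exists there, and only the spawned child has to resolve it by name.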
We have packaged triton_kernels better; if you use the main branch, it should be a much better experience.
https://github.com/huggingface/transformers/pull/39926
Thank you for looking into it. I installed the latest versions of transformers and kernels from GitHub but am still seeing the issue. Here is a minimal example to test:
from transformers import AutoModelForCausalLM
from torch.multiprocessing import get_context
import torch
import os

MODEL_NAME = "openai/gpt-oss-20b"
DEVICE = "cuda"
spawn_context = get_context("spawn")


def use_model(model):
    print(os.getpid(), model.device)
    print(
        model.generate(
            input_ids=torch.tensor([[0, 1, 2]], device=model.device), max_new_tokens=1))


if __name__ == "__main__":
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME,
                                                 torch_dtype="auto",
                                                 device_map=DEVICE)
    use_model(model)
    spawn_context.Process(target=use_model, args=(model, )).start()
Here are my versions (I removed triton_kernels just in case it was interfering in some way):
kernels 0.9.0.dev0
transformers 4.56.0.dev0
triton 3.4.0
# at these commits (main branch head as of Sat Aug 16):
- Installing transformers (4.56.0.dev0 cd22550)
- Installing kernels (0.9.0.dev0 1caa4c1)
Are you able to reproduce?