onnx-runtime-web

#3
by pdufour - opened

I’m curious whether anyone has tried running this in the browser yet, maybe via onnx-runtime-web? I assume it will run out of memory and throw an error, but has anyone actually tried?

We are working on WebGPU support (qmoe is still missing); it should be coming soon.
I'm pretty optimistic that native WebGPU (i.e. Node.js) will work as long as the GPU has enough VRAM; 16GB should work.
For running it on ort-web / WebGPU, there is technically no reason why it should not work.
But it's a large model to download. Will definitely try.
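
For anyone who wants to experiment in the meantime, here is a rough sketch of how a page might decide between the WebGPU and WASM execution providers before creating a session. The helper names (`providersFor`, `pickExecutionProviders`) and `MODEL_URL` are made up for illustration and are not part of ort-web; only the `"webgpu"` and `"wasm"` provider names and `ort.InferenceSession.create` come from onnxruntime-web. Note that finding an adapter says nothing about whether the GPU has enough memory for this model.

```typescript
// Minimal shape of what we need from `navigator.gpu`, declared locally so the
// sketch does not depend on WebGPU type definitions.
interface GpuLike {
  requestAdapter(): Promise<unknown>;
}

// Pure decision: prefer WebGPU when an adapter exists, and always keep WASM
// as a fallback at the end of the list.
function providersFor(hasAdapter: boolean): string[] {
  return hasAdapter ? ["webgpu", "wasm"] : ["wasm"];
}

// Hypothetical helper (not an ort-web API): probe for a WebGPU adapter and
// return the execution provider list to pass to the session options.
async function pickExecutionProviders(gpu?: GpuLike): Promise<string[]> {
  if (!gpu) return providersFor(false); // browser has no WebGPU at all
  try {
    const adapter = await gpu.requestAdapter();
    return providersFor(adapter != null); // null means no usable adapter
  } catch {
    return providersFor(false); // treat any probing error as "no WebGPU"
  }
}

// Usage in a browser, assuming `ort` is imported from onnxruntime-web and
// MODEL_URL is a placeholder for wherever the model file is hosted:
//   const executionProviders = await pickExecutionProviders((navigator as any).gpu);
//   const session = await ort.InferenceSession.create(MODEL_URL, { executionProviders });
```

Keeping `"wasm"` last in the list means session creation can still fall back to the CPU path if WebGPU is unavailable, although for a model this size that is unlikely to be practical.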
