This example demonstrates how to run Segment-Anything in your browser using onnxruntime-web and WebGPU.
Segment-Anything is an encoder/decoder model. The encoder creates image embeddings, and the decoder uses these embeddings, together with a user prompt, to create the segmentation mask.
The decoder can be run in onnxruntime-web using WebAssembly with a latency of roughly 200 ms.
The encoder is much more compute-intensive and takes ~45 seconds with WebAssembly, which is not practical. Using WebGPU we can speed up the encoder by roughly 50x, which makes it feasible to run inside the browser, even on an integrated GPU.
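The sketch below shows how the two sessions might be created with onnxruntime-web: the compute-heavy encoder on the webgpu execution provider and the lightweight decoder on the default WebAssembly (wasm) provider. The model file names match the export step further down; the rest (function name, options) is illustrative, and depending on your onnxruntime-web version the WebGPU build may need to be imported from 'onnxruntime-web/webgpu' instead.

```js
// Minimal sketch: one session per model, each on a different execution provider.
import * as ort from "onnxruntime-web";

async function createSessions() {
  // Heavy image encoder runs on WebGPU.
  const encoder = await ort.InferenceSession.create(
    "models/sam_vit_b_01ec64.encoder.onnx",
    { executionProviders: ["webgpu"] }
  );
  // Lightweight prompt decoder is fast enough on WebAssembly (~200 ms).
  const decoder = await ort.InferenceSession.create(
    "models/sam_vit_b_01ec64.decoder.onnx",
    { executionProviders: ["wasm"] }
  );
  return { encoder, decoder };
}
```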
First, install the required dependencies by running the following command in your terminal:
npm install
Next, bundle the code using webpack by running:
npm run build
This generates the bundle file ./dist/bundle.min.js.
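For reference, a minimal webpack configuration that produces such a bundle could look like the sketch below; the entry file name index.js is an assumption, and the actual configuration used by this example may contain additional plugins (for example to copy the onnxruntime-web .wasm files into dist).

```js
// webpack.config.js - minimal sketch, not necessarily the exact config of this example.
const path = require("path");

module.exports = {
  entry: "./index.js",           // assumed entry point
  output: {
    filename: "bundle.min.js",   // matches ./dist/bundle.min.js mentioned above
    path: path.resolve(__dirname, "dist"),
  },
  mode: "production",            // enables minification
};
```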
We use samexporter to export the encoder and the decoder to ONNX. Install samexporter:
pip install git+https://github.com/vietanhdev/samexporter
Download the PyTorch model from Segment-Anything. We use the smallest flavor (vit_b).
curl -o models/sam_vit_b_01ec64.pth https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
Export both the encoder and the decoder to ONNX:
python -m samexporter.export_encoder --checkpoint models/sam_vit_b_01ec64.pth \
--output models/sam_vit_b_01ec64.encoder.onnx \
--model-type vit_b
python -m samexporter.export_decoder --checkpoint models/sam_vit_b_01ec64.pth \
--output models/sam_vit_b_01ec64.decoder.onnx \
--model-type vit_b \
--return-single-mask
Use the npm package light-server to serve the current folder at http://localhost:8888/.
To start the server, run:
npx light-server -s . -p 8888
Once the web server is running, open your browser and navigate to http://localhost:8888/. You should now be able to run Segment-Anything in your browser.
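Inside the page, a click can be turned into a point prompt for the decoder. The sketch below assumes the input and output names used by the standard Segment-Anything ONNX decoder export that samexporter follows (image_embeddings, point_coords, point_labels, mask_input, has_mask_input, orig_im_size, masks); verify them against your exported model (for example with Netron) before relying on them. The point coordinates are expected in the encoder's input space, i.e. after the same resize-longest-side-to-1024 transform that is applied to the image.

```js
// Sketch: run the decoder for a single foreground click at (x, y).
// (x, y) must already be transformed to the encoder's 1024-pixel input space.
async function runDecoder(decoder, imageEmbeddings, x, y, origHeight, origWidth) {
  const feeds = {
    image_embeddings: imageEmbeddings,  // output tensor of the encoder session
    // One user click plus one padding point with label -1 (no box prompt).
    point_coords: new ort.Tensor("float32", new Float32Array([x, y, 0, 0]), [1, 2, 2]),
    point_labels: new ort.Tensor("float32", new Float32Array([1, -1]), [1, 2]),
    // No previous low-res mask is fed back in.
    mask_input: new ort.Tensor("float32", new Float32Array(256 * 256), [1, 1, 256, 256]),
    has_mask_input: new ort.Tensor("float32", new Float32Array([0]), [1]),
    orig_im_size: new ort.Tensor("float32", new Float32Array([origHeight, origWidth]), [2]),
  };
  const results = await decoder.run(feeds);
  return results.masks;  // mask logits at the original image size
}
```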
TODO:
- add support for fp16
- add support for MobileSam