HuggingSnap is an iOS app that lets users quickly learn more about the places and objects around them. Just point your camera to translate or summarize text, identify plants and animals, and more.
HuggingSnap runs SmolVLM2, a compact open multimodal model that accepts arbitrary sequences of image, video, and text inputs to produce text outputs.
Designed for efficiency, SmolVLM can answer questions about images, describe visual content, create stories grounded in multiple images, or function as a pure language model without visual inputs. Its lightweight architecture makes it suitable for on-device applications while maintaining strong performance on multimodal tasks.
The repository makes use of a modified version of mlx-swift-examples for VLM support.
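To give a sense of how the app talks to the model, here is a minimal sketch of loading a SmolVLM2 checkpoint and asking a question about an image through the VLM APIs in mlx-swift-examples. The type and function names (`VLMModelFactory`, `UserInput`, `generate`) follow the upstream library and may differ in the modified fork used here; the model id below is an assumption, not necessarily the checkpoint the app ships with.

```swift
import CoreImage
import MLXLMCommon
import MLXVLM

// Illustrative sketch only: names follow the upstream mlx-swift-examples VLM API
// and may not match the modified fork bundled with HuggingSnap.
func describeImage(_ image: CIImage, prompt: String) async throws -> String {
    // Download (or load from cache) a SmolVLM2 checkpoint converted for MLX.
    // The model id is an assumption; substitute the one the app actually uses.
    let configuration = ModelConfiguration(id: "HuggingFaceTB/SmolVLM2-500M-Video-Instruct-mlx")
    let container = try await VLMModelFactory.shared.loadContainer(configuration: configuration)

    // Run the multimodal prompt inside the model container's context.
    return try await container.perform { context in
        let userInput = UserInput(prompt: prompt, images: [.ciImage(image)])
        let lmInput = try await context.processor.prepare(input: userInput)

        // Generate with default sampling parameters and return the full text output.
        let result = try MLXLMCommon.generate(
            input: lmInput,
            parameters: GenerateParameters(),
            context: context
        ) { _ in .more }
        return result.output
    }
}
```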
- Install from the App Store. You need an iPhone running iOS 18.
Or, to build the app yourself:
- Clone the repository
- Open `HuggingSnap.xcodeproj` in Xcode
- Run the app on a physical device
You'll need to change the bundle identifier and developer team to run the app on your device.