Using Google Gemini AI multi modal API to read and interpret images

kipyegonline/gemini-ai-api

GENERATIVE UI

An AI web app powered by the Google Gemini API that reads images and interprets them as text using multimodal models.

The app can also compare up to two images and return the comparison as text.

The app is built with React and TypeScript and hosted on Firebase.

Usage

Upload an image from the file system or take one with the device camera, then enter a prompt. Gemini may take a moment to process the image together with your prompt. You can copy the response text from the UI using the copy icon.
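The upload step can be illustrated with a minimal TypeScript sketch of how a prompt and one or two images might be packaged before being sent to Gemini. This is not taken from the project's source: the helper names `dataUrlToPart` and `buildRequestParts` and the constant `MAX_IMAGES` are hypothetical, and the sketch assumes images arrive as base64 data URLs (for example from `FileReader.readAsDataURL`). The `text` and `inlineData` part shapes follow the Gemini content format as I understand it; check the official API reference before relying on them.

```typescript
// Inline image part, assumed to match the Gemini content "Part" shape.
interface InlineImagePart {
  inlineData: { mimeType: string; data: string };
}

interface TextPart {
  text: string;
}

// Hypothetical limit: the README says the app compares up to two images.
const MAX_IMAGES = 2;

// Split a base64 image data URL into its MIME type and raw base64 payload.
function dataUrlToPart(dataUrl: string): InlineImagePart {
  const match = /^data:(image\/[a-zA-Z0-9.+-]+);base64,(.+)$/.exec(dataUrl);
  if (!match) {
    throw new Error("Expected a base64-encoded image data URL");
  }
  return { inlineData: { mimeType: match[1], data: match[2] } };
}

// Build the list of parts: the prompt text first, then one or two images.
function buildRequestParts(
  prompt: string,
  dataUrls: string[]
): Array<TextPart | InlineImagePart> {
  const trimmed = prompt.trim();
  if (trimmed.length === 0) {
    throw new Error("Prompt must not be empty");
  }
  if (dataUrls.length < 1 || dataUrls.length > MAX_IMAGES) {
    throw new Error(`Provide between 1 and ${MAX_IMAGES} images`);
  }
  return [{ text: trimmed }, ...dataUrls.map(dataUrlToPart)];
}
```

With the `@google/generative-ai` SDK, the resulting array could then be passed to something like `model.generateContent(parts)`. That call is deliberately left out of the sketch so the block has no external dependencies.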

Installation

The app can be run locally by cloning the project here

Run npm install, then run npm run dev to serve the app locally.

Deployment

The app is deployed and hosted on Firebase using the Firebase CLI.

Issues and PR

You can fork the repository and open a PR against the main branch.

Issues can be opened on this repository as well.
