Welcome to the buycycle Image Recognition Model project! This repository contains the code and resources for a serverless image recognition model that recognizes and classifies images of bicycles. The project uses AWS Lambda and the Google Vision AI API to detect web entities, then filters the results against a predefined score threshold. The ultimate goal is to match user-uploaded bicycle images to a predefined set of template IDs using a word matching algorithm.
The primary objective of this project is to develop a robust image recognition model that can accurately identify and classify bicycle images uploaded by users. The model will utilize Google Vision AI's Web Detection feature to extract relevant web entities and match them to a predefined set of template IDs.
- Image Upload: Users can upload images of bicycles to an S3 bucket.
- AWS Lambda Integration: An S3 event triggers a Lambda function to process the uploaded image.
- Google Vision AI Integration: The Lambda function sends the image to Google Vision AI for web entity detection.
- Score Filtering: The detected web entities are filtered based on a predefined score threshold to ensure accuracy.
- Word Matching Algorithm: A custom word matching algorithm is used to match the filtered web entities to a predefined set of template IDs.
- Result Return: The top n most similar matches are returned by the Lambda function and published via SNS.
- An AWS account with access to S3 and Lambda.
- A Google Cloud account with access to the Vision AI API.
- AWS CLI and CDK installed on your local machine.
- Python 3.11 or higher.
- Clone the Repository
    git clone https://github.com/yourusername/bicycle-image-recognition-lambda.git
    cd bicycle-image-recognition-lambda
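- Create a virtual environment and install the CDK dependencies (the activate script shown is for the fish shell; use .venv/bin/activate in bash or zsh)
    python -m venv .venv
    source .venv/bin/activate.fish
    pip install -r cdk/requirements_cdk.txt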
- Install the Lambda dependencies and zip the package for CDK
    pip install -r requirements.txt -t cdk/lambda/lib/
    find cdk/lambda -maxdepth 1 -type f -exec zip -r9 lambda_function.zip {} +
    (cd cdk/lambda && zip -r ../../lambda_function.zip lib)
- Deploy the CDK Stack
    cdk deploy --app cdk/bin/app.py
The image recognition lambda function is triggered by an upload to the S3 bucket. The results are published with SNS.
The Lambda function is triggered by an S3 event whenever a new image is uploaded to the designated S3 bucket. The function reads the image from S3 and sends it to Google Vision AI's Web Detection API.
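As a minimal sketch of the trigger step, the snippet below pulls the bucket name and object key out of an S3 event notification. The names `extract_s3_objects` and `handler` are hypothetical and not taken from this repository; only the event layout (`Records[].s3.bucket.name` and `Records[].s3.object.key`) follows the standard S3 notification format.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs for every S3 record in the event."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record.get("s3")
        if s3_info is None:
            continue  # skip records that did not come from S3
        bucket = s3_info["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications (spaces become '+').
        key = unquote_plus(s3_info["object"]["key"])
        objects.append((bucket, key))
    return objects


def handler(event, context):
    # Hypothetical entry point: a real handler would download each object
    # (for example with boto3's s3.get_object) before calling Vision AI.
    return {"objects": extract_s3_objects(event)}
```

Decoding the key matters because an upload named `my bike.jpg` is delivered as `my+bike.jpg`, and a lookup with the encoded key would fail.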
The Lambda function sends the uploaded image to Google Vision AI's Web Detection API. The API returns web entities, full matching images, partial matching images, pages with matching images, visually similar images, and best guess labels. The filtering and matching steps operate on the web entities.
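One way to picture this call is through the Vision REST endpoint `images:annotate` with a `WEB_DETECTION` feature. The sketch below only builds the request body and reads the web entities from a response; it makes no network call. The helper names and the `max_results` default are assumptions, and the project itself may use the official client library instead.

```python
import base64

VISION_ANNOTATE_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_web_detection_request(image_bytes, max_results=10):
    """Build the JSON body for a single-image WEB_DETECTION request."""
    content = base64.b64encode(image_bytes).decode("ascii")
    return {
        "requests": [
            {
                "image": {"content": content},
                "features": [
                    {"type": "WEB_DETECTION", "maxResults": max_results}
                ],
            }
        ]
    }


def extract_web_entities(response_json):
    """Return the webEntities list from an images:annotate response."""
    responses = response_json.get("responses", [])
    if not responses:
        return []
    return responses[0].get("webDetection", {}).get("webEntities", [])
```

The body would be sent as JSON in a POST to `VISION_ANNOTATE_URL` with valid Google Cloud credentials.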
The returned web entities are filtered based on a predefined score threshold to ensure that only the most relevant entities are considered.
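The filtering step could look roughly like the following. The threshold value `0.5` and the function name `filter_web_entities` are placeholders chosen for illustration; the actual threshold used by the project is not stated here.

```python
def filter_web_entities(entities, min_score=0.5):
    """Keep entities that have a description and a score >= min_score.

    `entities` is a list of dicts shaped like Vision AI web entities,
    e.g. {"description": "Canyon Aeroad", "score": 0.83}.
    The result is sorted from highest to lowest score.
    """
    kept = [
        entity
        for entity in entities
        if entity.get("description") and entity.get("score", 0.0) >= min_score
    ]
    return sorted(kept, key=lambda e: e.get("score", 0.0), reverse=True)
```

Entities without a description are dropped as well, since they give the word matching step nothing to compare against.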
A custom word matching algorithm is used to match the filtered web entities to a predefined set of template IDs. The algorithm calculates the similarity between the web entity descriptions and the template IDs to determine the most likely match.
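The project's own matching algorithm is not reproduced here. As a stand-in, the sketch below scores each template by Jaccard similarity between the words in the entity descriptions and the words in a template name, then returns the top n. The `templates` mapping (template ID to name) and all function names are assumptions made for the example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def top_matches(entity_descriptions, templates, n=3):
    """Return up to n (template_id, score) pairs, best match first.

    `templates` maps a template ID to a descriptive name (hypothetical).
    Templates that share no word with the entities are left out.
    """
    entity_tokens = set()
    for description in entity_descriptions:
        entity_tokens |= tokenize(description)
    scored = [
        (template_id, jaccard(entity_tokens, tokenize(name)))
        for template_id, name in templates.items()
    ]
    scored = [(tid, score) for tid, score in scored if score > 0]
    # Sort by descending score, then by template ID for stable ties.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:n]


templates = {101: "Canyon Aeroad CF SLX", 102: "Trek Madone SLR", 103: "Canyon Ultimate CF"}
matches = top_matches(["Canyon Aeroad", "Canyon", "Road bicycle"], templates, n=2)
```

In this example template 101 ranks first because it shares two words with the entities, while template 103 shares only one and template 102 shares none.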
The most likely matches are published over SNS.
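Publishing could be sketched as below. The message layout (`image_key`, `matches`), the subject line, and the function name are assumptions; `publish` with `TopicArn`, `Subject`, and `Message` is the standard boto3 SNS client call. The client is passed in as a parameter so the function can be exercised without AWS access.

```python
import json


def publish_matches(sns_client, topic_arn, image_key, matches):
    """Publish the match list for one image to an SNS topic.

    `sns_client` is expected to behave like boto3.client("sns").
    `matches` is a list of (template_id, score) pairs.
    """
    message = json.dumps(
        {
            "image_key": image_key,
            "matches": [
                {"template_id": template_id, "score": score}
                for template_id, score in matches
            ],
        }
    )
    return sns_client.publish(
        TopicArn=topic_arn,
        Subject="Image recognition result",  # hypothetical subject line
        Message=message,
    )
```

Inside the Lambda function this would typically be called with `boto3.client("sns")` and the ARN of the topic created by the CDK stack.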
We welcome contributions to this project! If you have any ideas, suggestions, or bug reports, please open an issue or submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.
- Google Cloud Vision API for providing the web detection capabilities.
- AWS Lambda for serverless computing.
- AWS S3 for storage.