💡 Inspired by Weaviate's open-source Healthsearch Demo
Welcome to News Search, a web application that lets users search and filter news articles from BBC's RSS feed using a combination of keyword and vector search. This project uses Weaviate, a vector search database, along with a TypeScript React frontend and a Node.js & Express backend.
News Search relies on Weaviate's hybrid search, which uses the alpha property in the query to move between pure keyword search, hybrid search, and pure vector (semantic) search. Combined with Weaviate's OpenAI module integration and the JS/TS v3 client library, this produces a semantic query whose results include a generative summary.
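As a minimal sketch of how the alpha property maps to a search mode: in Weaviate hybrid search, an alpha of 0 is pure keyword (BM25) search and an alpha of 1 is pure vector search. The names `SearchMode`, `clampAlpha`, and `describeAlpha` are hypothetical helpers for illustration, not part of this project or of the Weaviate client.

```typescript
// Illustrative names only; none of these come from the project's source.
type SearchMode = "keyword" | "hybrid" | "vector";

// Clamp any slider value into the [0, 1] range that alpha accepts.
function clampAlpha(value: number): number {
  if (Number.isNaN(value)) return 0.5; // assumed fallback: a balanced hybrid search
  return Math.min(1, Math.max(0, value));
}

// alpha = 0 is pure keyword search, alpha = 1 is pure vector search,
// and anything in between blends the two result sets.
function describeAlpha(value: number): SearchMode {
  const alpha = clampAlpha(value);
  if (alpha === 0) return "keyword";
  if (alpha === 1) return "vector";
  return "hybrid";
}
```

A slider in the frontend could call `describeAlpha` to label its current position before the value is sent along with the query.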
- Natural Language Query: Enable search in natural language to query for news articles.
- Hybrid Search Slider: Adjust the alpha weight between keyword, hybrid, and vector search.
- Date Range Filter: Select a specific week to filter news articles.
- Generative Summaries: Get a summary of the top 5 articles picked from the search results. Summaries are generated by the Weaviate OpenAI module, taking into account each article's influence on public opinion and its relevance to major current events.
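The date range filter can be pictured as turning one selected day into a week-long window. The sketch below assumes weeks run Monday to Sunday in UTC and that articles are filtered by publication date; both are assumptions, and `WeekRange`, `weekRangeFor`, and `isInWeek` are hypothetical names.

```typescript
interface WeekRange {
  start: Date; // inclusive: Monday 00:00:00.000 UTC
  end: Date;   // exclusive: the following Monday 00:00:00.000 UTC
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Find the Monday-to-Monday window (UTC) that contains the given day.
function weekRangeFor(day: Date): WeekRange {
  // getUTCDay() returns 0 for Sunday; shift so that Monday counts as day 0.
  const daysSinceMonday = (day.getUTCDay() + 6) % 7;
  const midnight = Date.UTC(day.getUTCFullYear(), day.getUTCMonth(), day.getUTCDate());
  const start = new Date(midnight - daysSinceMonday * MS_PER_DAY);
  const end = new Date(start.getTime() + 7 * MS_PER_DAY);
  return { start, end };
}

// True when the publication time falls inside the half-open week window.
function isInWeek(published: Date, week: WeekRange): boolean {
  const t = published.getTime();
  return t >= week.start.getTime() && t < week.end.getTime();
}
```

Using a half-open window (start inclusive, end exclusive) avoids counting an article published exactly at midnight on Monday in two adjacent weeks.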
The dataset used for News Search is the BBC News RSS Feed dataset on Kaggle.
- The dataset comes in CSV format. You can download the latest data and use the `convertCsvToJson.ts` script in the `backend` folder to transform it and seed your Weaviate database.
- Alternatively, you can write a script that runs as a daily cron job to download the dataset, transform it, and update the database, which would make this a live, up-to-date news app!
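To illustrate the CSV-to-JSON step, here is a minimal, dependency-free sketch. It is not the project's `convertCsvToJson.ts`; the functions `parseCsv` and `csvToJson` are hypothetical, and the column names depend on whatever header row the downloaded file has.

```typescript
type Row = Record<string, string>;

// Split CSV text into rows of fields, honoring double-quoted fields that
// may contain commas, newlines, and escaped quotes ("").
function parseCsv(text: string): string[][] {
  const rows: string[][] = [];
  let row: string[] = [];
  let field = "";
  let inQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (inQuotes) {
      if (ch === '"' && text.charAt(i + 1) === '"') {
        field += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field);
      field = "";
    } else if (ch === "\n") {
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
    } else if (ch !== "\r") {
      field += ch;
    }
  }
  // Flush the last record when the text does not end with a newline.
  if (field !== "" || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

// Use the first row as the header and turn each later row into an object.
function csvToJson(text: string): Row[] {
  const [header, ...records] = parseCsv(text);
  if (!header) return [];
  return records
    .filter((fields) => fields.some((f) => f.trim() !== "")) // skip blank lines
    .map((fields) => {
      const obj: Row = {};
      header.forEach((name, i) => {
        obj[name.trim()] = fields[i] ?? "";
      });
      return obj;
    });
}
```

For large files a streaming CSV library would likely be a better fit; this version reads the whole file into memory to keep the idea visible.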
Ensure you have the following installed on your local machine:
This project uses a Weaviate Cloud Sandbox cluster (14-day free trial) for quick and easy setup:
- Weaviate Quickstart Guide
- Weaviate Cloud (WCD)
- Be sure to take note of your Weaviate instance's URL and API key once the sandbox cluster is spun up.
The generative search uses the OpenAI module, which requires an OpenAI API key:
- Sign up for the OpenAI API: $18 free credit on first sign-up
- Create an OpenAI API key and take note of it
You can use Docker to set up the demo with a single command! If you're not familiar with Docker, you can read more about it here (https://docker-curriculum.com/).
- Set environment variables:
- The following environment variables need to be set:
WEAVIATE_ENDPOINT=your-weaviate-url
WEAVIATE_API_KEY=your-weaviate-api-key
OPENAI_API_KEY=your-openai-api-key
Use the `example.env` file inside the backend folder: make a copy of it as `.env` and set your variables. Note that if you're using the GPT-3.5 model (the default with the JS/TS client v3), ensure your OpenAI key has access to it. You can change the `model_name` variable to `gpt-4` inside the `createSchema.ts` script; additional OpenAI costs would apply.
- Use Docker Compose:
docker-compose up --build
- Access the frontend on:
localhost:3000
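Because the setup depends on three environment variables, a backend could fail fast when one is missing. The following is a minimal sketch of that check, not the project's actual startup code; `AppConfig`, `REQUIRED_VARS`, and `loadConfig` are hypothetical names.

```typescript
interface AppConfig {
  weaviateEndpoint: string;
  weaviateApiKey: string;
  openaiApiKey: string;
}

// The three variables listed in the setup steps above.
const REQUIRED_VARS = ["WEAVIATE_ENDPOINT", "WEAVIATE_API_KEY", "OPENAI_API_KEY"] as const;

// Read the required variables from an env-like object and report all
// missing or blank ones in a single error.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const read = (name: string): string => (env[name] ?? "").trim();
  const missing = REQUIRED_VARS.filter((name) => read(name) === "");
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    weaviateEndpoint: read("WEAVIATE_ENDPOINT"),
    weaviateApiKey: read("WEAVIATE_API_KEY"),
    openaiApiKey: read("OPENAI_API_KEY"),
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)` once the `.env` file has been loaded; taking the environment as a parameter keeps the function easy to test.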
To get started with News Search, please refer to the READMEs in the Frontend and Backend folders.
Follow these steps to use the News Search App:
- Set up the Weaviate database, Node Express backend, and the React frontend by following the instructions in their respective READMEs.
- Launch the database, backend server, and the frontend application.
- Use the interactive frontend to enter a natural language query and search for news articles on current events that interest you.
- The frontend sends the query to the backend along with your chosen filters that dictate whether it's closer to a keyword, hybrid, or vector search.
- The backend sends the parameterized query to the Weaviate database with a generative prompt to fetch relevant articles based on the user query.
- The frontend displays the articles from the search results and also provides the top 5 articles picked by OpenAI, along with their summaries.
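The request flow in the steps above can be sketched as a single backend helper that turns the frontend's request into query parameters plus a generative prompt. Everything here is an assumption made for illustration: the `SearchRequest` and `HybridQueryParams` shapes, the `buildQueryParams` function, the default limit of 20, and the prompt wording are not taken from the project.

```typescript
// Hypothetical request body the frontend might send to the backend.
interface SearchRequest {
  query: string;      // natural language query typed by the user
  alpha: number;      // slider value: 0 = keyword, 1 = vector
  weekStart?: string; // optional ISO date for the selected week
}

// Hypothetical parameters the backend would hand to its Weaviate query.
interface HybridQueryParams {
  query: string;
  alpha: number;
  limit: number;
  weekStart?: string;
  groupedTask: string; // prompt for the generative summary
}

function buildQueryParams(req: SearchRequest, limit = 20): HybridQueryParams {
  const query = req.query.trim();
  if (query === "") {
    throw new Error("The search query must not be empty");
  }
  // Keep alpha inside the [0, 1] range expected by hybrid search.
  const alpha = Math.min(1, Math.max(0, req.alpha));
  // Assumed prompt wording; the project's real prompt may differ.
  const groupedTask =
    `From the articles provided, pick the 5 most relevant to "${query}" ` +
    "and write a short summary of each.";
  const params: HybridQueryParams = { query, alpha, limit, groupedTask };
  if (req.weekStart !== undefined) params.weekStart = req.weekStart;
  return params;
}
```

With the JS/TS v3 client, values like these would typically be passed to a hybrid query with a generative prompt on the articles collection; the exact call this project makes is not shown here.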