SectionSeeker AI is a search tool that pinpoints relevant sections within uploaded DOCX files in response to user-provided keywords or full sentences. Unlike traditional search functionality, it uses AI models to interpret the query instead of relying on exact text matches.
- DOCX File Support: Instantly render your `DOCX` files searchable.
- Dual Search Mode: Whether it's a specific keyword or a full-fledged sentence, we've got you covered.
- Multiple Document Selection: Why limit to one? Search across several uploaded documents simultaneously.
Demo video: SectionSeeker_AI_Chat_Demo.mov
SectionSeeker AI uses an embedding search technique based on OpenAI's `text-embedding-ada-002` model. It also uses the `gpt-3.5-turbo-16k` model for AI-driven filtering while it learns from uploaded documents. This approach lets users search in natural language rather than being limited to exact-match tools such as the "find" feature in standard browsers.
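As a rough illustration of the ranking step in an embedding search, the sketch below scores sections by cosine similarity against a query vector. It is not the project's actual code: the function names `cosine_similarity` and `rank_sections` are hypothetical, and the toy 3-dimensional vectors stand in for the 1536-dimensional vectors that `text-embedding-ada-002` returns.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # avoid division by zero for empty/zero vectors
    return dot / (norm_a * norm_b)


def rank_sections(query_vec, sections, top_k=5):
    """Return the top_k (title, score) pairs, best match first.

    `sections` maps a section title to its embedding vector.
    """
    scored = [
        (title, cosine_similarity(query_vec, vec))
        for title, vec in sections.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors (assumed data) standing in for real embeddings.
sections = {
    "Annual Leave": [0.9, 0.1, 0.0],
    "Dress Code": [0.0, 0.2, 0.9],
    "Sick Leave": [0.7, 0.6, 0.1],
}
query = [1.0, 0.2, 0.0]  # pretend embedding of a question about leave
ranking = rank_sections(query, sections, top_k=2)
```

In a real pipeline the vectors would come from the embedding model rather than being typed in by hand; the ranking logic stays the same.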
- Contents Page
  - Organize and manage documents with folders.
  - Delete folders and remove uploaded documents.
  - Uploaded documents persist in the database until deleted.
  - Upload documents for the search bot to learn from.
  - View how the search bot interprets the uploaded documents.
- Chat Page
  - Converse directly with the search bot after uploading documents.
To test and try SectionSeeker AI:
- Hosted Site: https://lucasmys.pythonanywhere.com/login
- Login Details:
  - Email: [email protected]
  - Password: 123456
- Sample Content:
  - The platform comes preloaded with 4 sample policies/handbooks. Each of these samples contains approximately 4,000 words, all generated by ChatGPT.
  - 📄 Sample Docx
- Section Identification: SectionSeeker AI primarily identifies sections through heading styles, but it also uses GPT to categorize larger sections into sub-sections.
- Optimal Heading Styles: For the best results, apply only essential heading styles. However, enriching the document with additional styles improves the AI's processing speed.
- Performance Insight: As a reference, a `.docx` file of about 4,000 words, formatted correctly, is processed in under 15 seconds.
- Recognizable Styles: The AI identifies headings based on Microsoft Word styles with terms like "Headings" or "SubTitle". See the example below for clarification:
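Independently of that example, heading-based sectioning can be sketched in a few lines. This is an illustrative sketch, not the project's implementation: the helper names `is_heading_style` and `split_into_sections` are hypothetical, and the matching rule (a style name containing "heading" or "subtitle", case-insensitive) is an assumption based on the description above. The function works on plain `(style_name, text)` pairs so it runs without extra dependencies.

```python
def is_heading_style(style_name):
    """Assumed rule: a style counts as a heading if its name mentions
    'heading' or 'subtitle' (case-insensitive)."""
    name = style_name.lower()
    return "heading" in name or "subtitle" in name


def split_into_sections(paragraphs):
    """Group (style_name, text) pairs into sections, one per heading.

    Text that appears before the first heading goes into a section
    with an empty title. Blank body paragraphs are skipped.
    """
    sections = []
    current = {"title": "", "body": []}
    for style_name, text in paragraphs:
        if is_heading_style(style_name):
            # Close the previous section before starting a new one.
            if current["title"] or current["body"]:
                sections.append(current)
            current = {"title": text, "body": []}
        elif text.strip():
            current["body"].append(text)
    if current["title"] or current["body"]:
        sections.append(current)
    return sections


# With python-docx installed, the pairs could be read from a file like this
# ("handbook.docx" is a placeholder name):
#   from docx import Document
#   paragraphs = [(p.style.name, p.text) for p in Document("handbook.docx").paragraphs]
paragraphs = [
    ("Heading 1", "Leave Policy"),
    ("Normal", "Employees accrue leave monthly."),
    ("Normal", ""),
    ("Subtitle", "Carry-over"),
    ("Normal", "Unused leave carries over for one year."),
]
sections = split_into_sections(paragraphs)
```

Here `sections` holds two entries, "Leave Policy" and "Carry-over", each with its own body text.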
- Start by logging in with the credentials given in the Quick Start section.
- Access the content page by selecting "Contents" from the main navigation bar.
- Add a new folder.
- Choose a folder and upload your document.
- Review the processed documents by clicking on "Documents" within the Contents page.
- Access the chat interface via the "Chat" option in the navigation bar.
- Choose your desired document(s) to inquire about.
- Hit "Ask" or simply press "Enter" to submit your question.
- The AI will present the top 5 relevant sections, each with:
  - Document source and section title.
  - A relevance score (as a percentage).
  - The extracted text recognized by SectionSeeker AI.
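The shape of one search result, as listed above, can be sketched as a small data class. Everything here is hypothetical and for illustration only: the `SearchResult` fields, the `format_result` and `top_results` helpers, the sample data, and the assumption that the relevance score is a similarity in the range 0 to 1 that is multiplied by 100 for display. How the tool actually derives its percentage is not documented here.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    document: str       # source document name
    section_title: str  # heading of the matched section
    score: float        # similarity in [0, 1] (assumed scale)
    text: str           # extracted section text


def format_result(result, preview_chars=80):
    """Render one result as a display line with a percentage score."""
    percent = round(result.score * 100)
    preview = result.text[:preview_chars]
    return f"{result.document} / {result.section_title} ({percent}%): {preview}"


def top_results(results, limit=5):
    """Keep the `limit` highest-scoring results, best first."""
    return sorted(results, key=lambda r: r.score, reverse=True)[:limit]


# Made-up sample data for the sketch.
results = [
    SearchResult("Handbook.docx", "Dress Code", 0.41, "Business casual is expected."),
    SearchResult("Handbook.docx", "Annual Leave", 0.87, "Employees receive 14 days of leave."),
]
lines = [format_result(r) for r in top_results(results)]
```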
Please note the following constraints when using SectionSeeker AI:
- Development Stage: As SectionSeeker AI is still in its early development stages, it currently supports only `.docx` file types.
- Document Complexity: The tool may not function optimally with `.docx` files containing images or intricate structures. It is best suited for plain text documents, which makes it a good fit for business-related documents like company handbooks or business agreements that usually have a straightforward format.
- Styles Dependency: SectionSeeker AI primarily uses heading styles to demarcate sections. If a `.docx` file has pre-existing styles that aren't correctly assigned, the tool might still process the document, but sections may be split in unintended ways. For instance, what should be a single section might be split into two. As a result, search results for such sections might not be entirely accurate.
To achieve the best results, ensure your document adheres to the guidelines and is compatible with the tool's current capabilities.
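One way to catch mis-assigned heading styles before uploading is a quick pre-check such as the sketch below. It is not part of SectionSeeker AI: the helper `find_suspect_headings` and its heuristic (real headings are short, so a heading-styled paragraph with many words is probably body text) are assumptions for illustration, and the 12-word threshold is arbitrary.

```python
def find_suspect_headings(paragraphs, max_heading_words=12):
    """Return (index, text) for heading-styled paragraphs that look like body text.

    Heuristic (an assumption, not the tool's rule): a heading-styled
    paragraph with more than `max_heading_words` words is flagged.
    """
    suspects = []
    for index, (style_name, text) in enumerate(paragraphs):
        is_heading = "heading" in style_name.lower()
        if is_heading and len(text.split()) > max_heading_words:
            suspects.append((index, text))
    return suspects


# Made-up (style_name, text) pairs; the third one is body text that was
# given a heading style by mistake.
paragraphs = [
    ("Heading 1", "Code of Conduct"),
    ("Normal", "All staff are expected to act with integrity."),
    ("Heading 2", "Staff must report any conflict of interest to their line manager "
                  "within five working days of becoming aware of it."),
]
suspects = find_suspect_headings(paragraphs)
```

Any flagged paragraph can then be restyled as normal text in Word before the document is uploaded.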
- Main Directory: `SectionSeeker_AI`
  - Essential files: `requirements.txt`, `wsgi.py`, `config.py`.
- Subdirectory: `flask_APP` (within `SectionSeeker_AI`)
  - Contains files and folders for styling, web interface templates, user registration, search bot web endpoints, and more.
To expand upon or set up SectionSeeker AI on your local machine, check out the detailed installation guide on GitHub: