SearchCap - An Internal RAG AI Search Engine
Authors: Troy Galicia, Arnab Roy, Donald Duhaney, Athena Deng
Abstract: This project demonstrates a Retrieval-Augmented Generation (RAG) AI-powered search engine. The Shiny application developed in this project lets users upload documents of several types (PDF, DOCX, PPTX), which are then analyzed and indexed by an AI model. The app provides an intuitive search interface that uses RAG to deliver precise, contextually relevant results, so users can find the information they need quickly and efficiently.
- Upload the documents you would like to query to the document repository.
- Set the relevant search and filter settings based on your preference.
- Search by keyword or ask a question to receive a summarized response, by document, for the top relevant matches.
Full Description:
Key Features:
- Display the top document matches from the repository, across different document types, based on your query and search settings. (Note: Try out different query combinations. To get more relevant results, make the query specific to what you are looking for.)
- Get summarized AI-generated responses based on your query for the most relevant document matches.
- Get the page numbers of the documents (if applicable) that are most likely to contain relevant information about your query.
- Display an AI-generated similarity score between 0 and 1 (where 1 denotes the highest match) between the query and the document.
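To make the 0-to-1 similarity score more concrete, here is a minimal sketch of one common way such a score can be computed: cosine similarity between a query embedding and a document embedding, clamped to the range [0, 1]. This is an assumption for illustration; the post does not state the exact metric the app uses. The function names and the toy vectors are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as "no match"
    return dot / (norm_a * norm_b)

def similarity_score(query_vec, doc_vec):
    """Clamp cosine similarity into [0, 1], where 1 is the best match."""
    return max(0.0, min(1.0, cosine_similarity(query_vec, doc_vec)))

# Toy 3-dimensional "embeddings" (hypothetical values, for illustration only).
query = [1.0, 0.0, 1.0]
doc_close = [2.0, 0.0, 2.0]  # points in the same direction as the query
doc_far = [0.0, 1.0, 0.0]    # orthogonal to the query

print(similarity_score(query, doc_close))  # close to 1
print(similarity_score(query, doc_far))    # 0.0
```

Real embedding vectors have hundreds or thousands of dimensions, but the calculation is the same.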
Key Technologies Used:
- LangChain
- OpenAI (for embeddings and document summarization; for this project, we used GPT-3.5 Turbo to generate document summaries based on the user query)
- Unstructured (for document preprocessing)
- Qdrant (vector store for document chunk embeddings)
- Shiny (user interface)
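The sketch below illustrates how these pieces might fit together in the retrieval step of a RAG search: documents are split into page-level chunks, each chunk is embedded, and a query is matched against the stored vectors to return the top documents with their best-matching page. It is a simplified stand-in, not the project's code: a toy bag-of-words vector replaces OpenAI embeddings, an in-memory list replaces Qdrant, and all function names and sample documents are hypothetical. See the repository for the actual implementation.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    return dot / norm if norm else 0.0

def build_index(documents):
    """documents: {doc_name: [page_text, ...]} -> list of indexed page chunks."""
    index = []
    for name, pages in documents.items():
        for page_number, text in enumerate(pages, start=1):
            index.append({"doc": name, "page": page_number, "vector": embed(text)})
    return index

def search(index, query, top_k=3):
    """Return up to top_k documents ranked by their best-scoring page."""
    query_vec = embed(query)
    best = {}
    for chunk in index:
        score = cosine(query_vec, chunk["vector"])
        current = best.get(chunk["doc"])
        if current is None or score > current["score"]:
            best[chunk["doc"]] = {
                "doc": chunk["doc"],
                "page": chunk["page"],
                "score": score,
            }
    return sorted(best.values(), key=lambda r: r["score"], reverse=True)[:top_k]

# Hypothetical sample repository: one list entry per page.
documents = {
    "handbook.pdf": [
        "Vacation policy and leave requests",
        "Expense reports and travel reimbursement",
    ],
    "roadmap.pptx": ["Quarterly product roadmap and milestones"],
}

index = build_index(documents)
results = search(index, "travel expense reimbursement")
for r in results:
    print(f'{r["doc"]} (page {r["page"]}): score {r["score"]:.2f}')
```

In a full RAG pipeline, the text of the top-ranked chunks would then be passed to a language model together with the query to produce the per-document summary.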
For more details on the code, setup, and usage, visit our GitHub repository: GitHub - Todorotsky/RAG-project-deployment
Sample documents used for testing can be accessed using this link: Shiny Competition - Google Drive
Shiny app: https://rag-ai-search-engine.shinyapps.io/searchcap/
Repo: GitHub - Todorotsky/RAG-project-deployment