# rag_api

A Python-based Retrieval-Augmented Generation (RAG) API for querying personal documents using a vector database (ChromaDB) and integrating with a UI such as LibreChat.
## Table of Contents

- [Overview](#overview)
- [Features](#features)
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Usage](#usage)
- [Configuration](#configuration)
- [Contributing](#contributing)
- [License](#license)
- [Code of Conduct](#code-of-conduct)
- [Support](#support)
## Overview

The `rag_api` project enables users to ingest personal documents into a vector database (ChromaDB) and query them using a natural language interface. It integrates with tools like LibreChat for a user-friendly experience and leverages models from `ollama` for processing queries.
## Features

- Ingest and index documents into a ChromaDB vector database.
- Query documents using natural language via a Python script or the LibreChat UI.
- Support for multiple document formats (PDF, text, etc.) via `unstructured`.
- Easy integration with `ollama` for language model inference.
- Dockerized deployment for the API endpoint.
## Prerequisites

- **Docker**: Required for running the API endpoint.
- **Python 3.8+**: Required for running the Python scripts.
- **Ollama**: Required for language model inference.
- A directory containing documents to be indexed.
## Installation

1. **Clone the Repository**

   ```bash
   git clone https://github.com/FlorentB974/rag_api.git
   cd rag_api
   ```

2. **Set Up a Python Virtual Environment**

   ```bash
   python -m venv rag_env
   source rag_env/bin/activate  # On Windows: rag_env\Scripts\activate
   ```

3. **Install Dependencies**

   ```bash
   pip install langchain langchain_community langchain_huggingface langchain_chroma langchain_ollama unstructured huggingface_hub chromadb sentence-transformers llama-cpp-python pypdf
   ```
4. **Configure Environment Variables**

   The API and scripts use environment variables for configuration. These are loaded automatically from a `.env` file using `python-dotenv` (a short sketch of how this works follows these steps).

   - Create a file named `.env` in the project root (same directory as `query.py` and `vector_db.py`).
   - Add the following variables (example values):

     ```env
     VECTOR_DB_PATH=./vector_db
     EMBED_MODEL_NAME=sentence-transformers/all-MiniLM-L6-v2
     COLLECTION_NAME=my_documents
     OLLAMA_MODEL=mistral
     ```

   - Adjust the values as needed for your setup (e.g., change model names or paths).

   **Note**: Never commit sensitive information (API keys, passwords) to your `.env` file if sharing your code.
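For illustration, this is roughly how `python-dotenv` exposes those values to the scripts. A minimal sketch (assuming `python-dotenv` is available in the environment); the exact variable handling inside `query.py` and `vector_db.py` may differ:

```python
import os

from dotenv import load_dotenv

# Read key=value pairs from the .env file in the project root into the environment.
load_dotenv()

# Fall back to the example values from this README when a variable is unset.
vector_db_path = os.getenv("VECTOR_DB_PATH", "./vector_db")
embed_model_name = os.getenv("EMBED_MODEL_NAME", "sentence-transformers/all-MiniLM-L6-v2")
collection_name = os.getenv("COLLECTION_NAME", "my_documents")
ollama_model = os.getenv("OLLAMA_MODEL", "mistral")
```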
## Usage

To create a new vector database and index documents:

```bash
python vector_db.py --source /path/to/docs --db /path/to/vector_db --init
```
To add additional documents to an existing database:

```bash
python vector_db.py --source /path/to/newfile --db /path/to/vector_db
```
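If you are curious what this indexing step amounts to, the sketch below shows a minimal version of the pipeline using the libraries installed above. It is illustrative only; the paths, chunk sizes, and model names are assumptions taken from the example `.env`, not the actual contents of `vector_db.py`:

```python
from langchain_chroma import Chroma
from langchain_community.document_loaders import DirectoryLoader
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load every file in the source directory (DirectoryLoader parses files via `unstructured`).
docs = DirectoryLoader("/path/to/docs").load()

# Split documents into overlapping chunks sized for the embedding model.
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

# Embed each chunk and persist the vectors in a local ChromaDB collection.
Chroma.from_documents(
    chunks,
    embedding=HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2"),
    collection_name="my_documents",
    persist_directory="/path/to/vector_db",
)
```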
Run the query script to test the setup:

```bash
python query.py
```
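Under the hood, a query amounts to retrieving the chunks most similar to the question and handing them to the `ollama` model. A minimal sketch, again illustrative rather than the actual `query.py` (the question text and the `k` value are placeholders):

```python
from langchain_chroma import Chroma
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_ollama import OllamaLLM

# Open the existing collection with the same embedding model used at indexing time.
db = Chroma(
    collection_name="my_documents",
    embedding_function=HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2"),
    persist_directory="./vector_db",
)

question = "What do my documents say about project deadlines?"

# Retrieve the four chunks most similar to the question.
context = "\n\n".join(doc.page_content for doc in db.similarity_search(question, k=4))

# Ask the local ollama model to answer from the retrieved context only.
llm = OllamaLLM(model="mistral")
print(llm.invoke(f"Answer the question using only this context:\n\n{context}\n\nQuestion: {question}"))
```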
Start the API endpoint using Docker Compose:

```bash
docker compose up -d --build
```
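Once the container is running, you can sanity-check it directly from Python. The snippet below assumes the API exposes an OpenAI-compatible `/v1/chat/completions` route, which is how LibreChat connects to it in the configuration that follows; treat it as a sketch, not documented API surface:

```python
import requests

# Assumes an OpenAI-compatible chat completions route on the default port 5500.
resp = requests.post(
    "http://localhost:5500/v1/chat/completions",
    headers={"Authorization": "Bearer ollama"},
    json={
        "model": "mistral",
        "messages": [{"role": "user", "content": "What do my documents say about X?"}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```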
Add the following configuration to your `librechat.yml` file:

```yaml
endpoints:
  - name: "Personal Docs"
    apiKey: "ollama"
    baseURL: "http://host.docker.internal:5500/v1" # Use endpoint_ip if not using Docker
    models:
      default:
        - "mistral"
      fetch: false
    titleConvo: true
    titleModel: "current_model"
    summarize: true
    summaryModel: "current_model"
    forcePrompt: false
```
After updating the configuration, restart LibreChat to apply the changes.
## Configuration

- **Vector DB Path**: Specify the path for the ChromaDB database using the `--db` flag in `vector_db.py`.
- **Document Source**: Provide the path to your documents using the `--source` flag.
- **API Endpoint**: The default port is `5500`. Update the `baseURL` in `librechat.yml` if you change the port in the Docker configuration.
## Contributing

Contributions are welcome! Please read our Contributing Guidelines for details on how to submit pull requests, report issues, or suggest improvements.

## License

This project is licensed under the MIT License. See the LICENSE file for details.

## Code of Conduct

We are committed to fostering an open and inclusive community. Please read our Code of Conduct before contributing.

## Support

If you encounter issues or have questions, please file an issue on the GitHub Issues page.