Important
There is a proof-of-concept demo live at https://pqstem.org. The demo will be available until March 11th, 2025.
PhiQwenSTEM is an assistant aimed at helping you solve complex STEM questions through reasoning. It is based on Phi-3.5 (by Microsoft) and QwQ-32B-preview (by the Qwen team), both served through the HuggingFace Inference API, and has a vast knowledge base (more than 15,000 STEM questions) managed via Qdrant.
PhiQwenSTEM operates through three main components:
- Front-end: Utilizes Vite to render a landing page and a ChatGPT-like chat interface.
- Back-end: Employs a Python-based websocket to process messages from the front-end and send responses.
- Database: Uses a vector database built on Qdrant to store data for retrieval-augmented generation and semantic caching.
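The back-end's role can be sketched as a small message handler: it receives a JSON message from the front-end over the websocket, produces an answer, and sends a JSON reply back. This is a minimal stdlib-only sketch; the function names (`handle`, `answer_question`) are illustrative stand-ins, not the project's actual API.

```python
import asyncio
import json


async def answer_question(question: str) -> str:
    # Placeholder for the real pipeline (semantic cache, hybrid
    # retrieval from Qdrant, and the QwQ-32B-preview answer step).
    return f"Answer to: {question}"


async def handle(message: str) -> str:
    """Parse an incoming front-end message and build a JSON reply."""
    payload = json.loads(message)
    answer = await answer_question(payload["question"])
    return json.dumps({"answer": answer})


# Example round-trip, as the websocket server would perform per message:
reply = asyncio.run(handle(json.dumps({"question": "What is entropy?"})))
```

In the real application this handler would be registered with a websocket server and the reply streamed back to the chat UI.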
Once you launch the application, the vector database will ingest more than 15,000 STEM-related questions. Each question is associated with:
- The question itself
- QwQ-32B-preview's reasoning about the question
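Each ingested record therefore pairs a question with its model-generated reasoning. The shape of one record might look like the sketch below; the exact payload field names are an assumption, not necessarily those used by PhiQwenSTEM.

```python
def make_point(question: str, reasoning: str) -> dict:
    """Bundle a STEM question with its QwQ-32B-preview reasoning.

    Hypothetical payload shape: real ingestion would also attach the
    dense and sparse embeddings before upserting into Qdrant.
    """
    return {
        "payload": {
            "question": question,
            "reasoning": reasoning,
        }
    }


point = make_point(
    "What is the pH of a 0.01 M HCl solution?",
    "HCl dissociates completely, so [H+] = 0.01 M and pH = -log10(0.01) = 2.",
)
```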
The questions span the following domains of science:
- Chemistry (General, Organic, and Biochemistry)
- Physics
- Physical Chemistry
- Quantum Mechanics
- Differential Equations
- Linear Algebra
- Electromagnetism
- Mathematics
- Engineering
- Classical Mechanics
The data comes from the HuggingFace dataset EricLu/SCP-116K, made up of more than 116,000 STEM-related questions, each accompanied by the ground-truth answer, the QwQ-32B-preview reasoning and solution, and the o1 reasoning and solution. We selected questions (from the most represented domains in the dataset) for which QwQ-32B-preview's reasoning produced the correct answer.
Dense embeddings are obtained using the static text encoder tomaarsen/static-retrieval-mrl-en-v1 (embedding size truncated to 384), while sparse embeddings are generated with Qdrant/bm25. To speed up retrieval, the vector database leverages binary quantization.
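Binary quantization collapses each float dimension of an embedding to a single bit, so a 384-dimension float32 vector (1,536 bytes) shrinks to 48 bytes and can be compared with cheap bit operations. Qdrant handles this internally; the sketch below only illustrates the idea, not Qdrant's implementation.

```python
def binary_quantize(vec):
    """Map each float dimension to one bit: 1 if positive, else 0."""
    return [1 if x > 0 else 0 for x in vec]


def hamming_similarity(a, b):
    """Score two binary vectors by the fraction of matching bits."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


dense = [0.12, -0.54, 0.33, -0.01]   # toy 4-dim example
quantized = binary_quantize(dense)    # -> [1, 0, 1, 0]
```

The precision lost to quantization is why candidates are typically re-scored with the original vectors after the fast binary pass.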
When a user asks a STEM question:
- The backend first checks for similar questions in the semantic cache using modernbert-embed-base. If a match is found, the corresponding answer is returned.
- If no significant match is found, it prompts Phi-3.5-mini-instruct (served on the HuggingFace Inference API) to produce an optimized question for searching the vector database.
- The optimized question prompts a hybrid search within the vector database. The top two matches for both sparse and dense vectors are retrieved and re-scored by modernbert-embed-base.
- The top-ranking match (after re-scoring) is retained, and the QwQ-32B-preview-generated reasoning (from the match payload) is passed on as "reasoning" context. QwQ-32B-preview is then prompted to produce an answer based on that reasoning.
Note
QwQ-32B-preview is instructed to assess whether the reasoning and the answer provided are valid, relevant to the user's question, and correct. It is also instructed to output an "I don't know" answer when the question is ambiguous and the solution is not completely clear.
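The first step of the pipeline above, the semantic-cache check, can be sketched as a nearest-neighbor lookup with a similarity threshold. This is a pure-Python illustration: the threshold value and function names are assumptions, and in PhiQwenSTEM the embeddings would come from modernbert-embed-base rather than the toy vectors used here.

```python
import math

SIMILARITY_THRESHOLD = 0.9  # assumption: the real cutoff is project-specific


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def check_cache(query_vec, cache):
    """Return a cached answer if a stored question is similar enough.

    `cache` is a list of (embedding, answer) pairs.
    """
    best = max(cache, key=lambda item: cosine(query_vec, item[0]), default=None)
    if best is not None and cosine(query_vec, best[0]) >= SIMILARITY_THRESHOLD:
        return best[1]
    return None  # cache miss: fall through to hybrid search + QwQ reasoning


cache = [([1.0, 0.0], "pH = 2"), ([0.0, 1.0], "E = mc^2 explained")]
hit = check_cache([0.95, 0.05], cache)   # close to the first entry -> "pH = 2"
miss = check_cache([0.7, 0.7], cache)    # no entry above threshold -> None
```

On a miss, the real backend continues with the Phi-3.5 query rewrite and the hybrid Qdrant search described above, and caches the new answer for future lookups.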
Required: Docker and docker compose
- Clone this repository
git clone https://github.com/AstraBert/PhiQwenSTEM.git
cd PhiQwenSTEM/docker-workflow/
- Add the hf_token secret to the .env.example file and rename the file to .env. You can get your HuggingFace token by registering on HuggingFace and creating a fine-grained token with access to the Inference API.
# modify your access token, e.g. hf_token="hf_abcdefg1234567"
mv .env.example .env
- Launch the docker application:
# If you are on Linux/macOS
bash start_services.sh
# If you are on Windows
.\start_services.ps1
You will see the application running on http://localhost:8501, but you will be able to use it only once the backend is fully set up (you can monitor this from the logs). Depending on your connection and your hardware, this might take some time (up to 30 minutes).
Required: Docker, docker compose and conda
- Clone this repository
git clone https://github.com/AstraBert/PhiQwenSTEM.git
cd PhiQwenSTEM/local
- If you are on macOS/Linux, you can run:
bash local_setup.sh
- If you are on Windows, running all the commands separately might be optimal:
# Launch Qdrant
docker compose up -d
# Create conda environment for the backend
conda env create -f ./backend/environment.yml
conda activate backend
# Ingest data
python3 data/toDatabase.py
# Create a semantic cache
python3 data/createCache.py
conda deactivate
# Install necessary dependencies for the UI
cd chatbot-ui/
npm install
# Back to the local folder
cd ..
- Once you are done with the set-up, launch the UI:
cd chatbot-ui/
npm run dev
- And, on a separate terminal window, launch the backend:
conda activate backend
cd backend/
python3 backend.py
Head over to http://localhost:8501 and you should see PhiQwenSTEM up and running in less than one minute!
Contributions are more than welcome! See contribution guidelines for more information :)
If you found this project useful, please consider funding it and helping it grow: let's support open source together! 😊
This software is provided under the MIT license and is free to use.