@@ -13,6 +13,358 @@ Propulse is a multi-agent system that leverages AI to generate high-quality prop
1313 - Verifier Agent: Ensures factual accuracy and compliance
1414- **Modern Tech Stack**: Built with FastAPI, Streamlit, and Google Cloud Platform
1515
16+ ## 🚀 Quick Start
17+
18+ 1. **Clone the Repository**
19+ ``` bash
20+ git clone https://github.com/nerdy1texan/propulse.git
21+ cd propulse
22+ ```
23+
24+ 2. **Set Up Environment**
25+
26+ For Windows Git Bash:
27+ ``` bash
28+ # Initialize conda in Git Bash (do this once)
29+ source ~/anaconda3/etc/profile.d/conda.sh
30+
31+ # Create and activate conda environment
32+ conda env create -f environment.yml
33+ conda activate propulse
34+ ```
35+
36+ For other terminals:
37+ ``` bash
38+ # Create and activate conda environment
39+ conda env create -f environment.yml
40+ conda activate propulse
41+ ```
42+
43+ 3. **Configure Environment Variables**
44+ ``` bash
45+ cp .env.example .env
46+ # Edit .env with your configuration
47+ ```
48+
49+ 4. **Start Services**
50+ ``` bash
51+ # Start backend
52+ cd backend
53+ uvicorn main:app --reload
54+
55+ # In another terminal, start frontend
56+ cd frontend
57+ streamlit run main.py
58+ ```
59+
60+ ## 🏗️ Architecture
61+
62+ ### System Architecture
63+ ``` mermaid
64+ graph TD
65+ subgraph "Frontend Layer"
66+ UI[Streamlit UI]
67+ Upload[Document Upload]
68+ Preview[Proposal Preview]
69+ end
70+
71+ subgraph "Backend Layer"
72+ API[FastAPI Service]
73+ Auth[Authentication]
74+ Cache[Redis Cache]
75+ end
76+
77+ subgraph "Agent Pipeline"
78+ R[Retriever Agent]
79+ W[Writer Agent]
80+ V[Verifier Agent]
81+ end
82+
83+ subgraph "Storage Layer"
84+ VDB1[Vector DB - RFPs]
85+ VDB2[Vector DB - Proposals]
86+ DB[(PostgreSQL)]
87+ GCS[Cloud Storage]
88+ end
89+
90+ UI --> API
91+ Upload --> API
92+ API --> Auth
93+ API --> Cache
94+ API --> R
95+ R --> VDB1
96+ R --> VDB2
97+ R --> W
98+ W --> V
99+ V --> API
100+ API --> Preview
101+ API --> DB
102+ API --> GCS
103+ ```
104+
105+ ### Workflow Diagram
106+ ``` mermaid
107+ sequenceDiagram
108+ actor User
109+ participant UI as Frontend
110+ participant API as Backend
111+ participant R as Retriever
112+ participant W as Writer
113+ participant V as Verifier
114+ participant DB as Databases
115+
116+ User->>UI: Upload RFP/Enter Prompt
117+ UI->>API: Submit Request
118+ API->>R: Get Relevant Context
119+ R->>DB: Query Vector DBs
120+ DB-->>R: Return Matches
121+ R->>W: Context + Prompt
122+ W->>V: Generated Proposal
123+ V->>API: Verified Content
124+ API->>UI: Return Proposal
125+ UI->>User: Display Result
126+ ```
127+
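The sequence above can be sketched as three stages chained in order. This is an illustration only: the functions `retrieve`, `write`, `verify`, and `run_pipeline`, the `Draft` dataclass, and the keyword-overlap matching are hypothetical stand-ins, not the project's actual agents.

```python
from dataclasses import dataclass, field


@dataclass
class Draft:
    """Hypothetical container passed between the three stages."""
    prompt: str
    context: list[str] = field(default_factory=list)
    text: str = ""
    verified: bool = False


def retrieve(prompt: str, corpus: list[str]) -> Draft:
    # Stand-in for the Retriever: keep passages sharing a word with the prompt.
    words = set(prompt.lower().split())
    matches = [p for p in corpus if words & set(p.lower().split())]
    return Draft(prompt=prompt, context=matches)


def write(draft: Draft) -> Draft:
    # Stand-in for the Writer: stitch the retrieved context into a proposal body.
    draft.text = f"Proposal for: {draft.prompt}\n" + "\n".join(draft.context)
    return draft


def verify(draft: Draft) -> Draft:
    # Stand-in for the Verifier: every context passage must appear in the text.
    draft.verified = all(passage in draft.text for passage in draft.context)
    return draft


def run_pipeline(prompt: str, corpus: list[str]) -> Draft:
    # Retriever -> Writer -> Verifier, mirroring the sequence diagram.
    return verify(write(retrieve(prompt, corpus)))
```

Keeping each stage a plain function of its input makes the order of the diagram explicit and lets each stage be tested on its own.
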
128+ ## 📁 Detailed Project Structure
129+
130+ ```
131+ Propulse/
132+ ├── backend/                 # FastAPI backend service
133+ │   ├── agents/              # Agent implementations
134+ │   │   ├── retriever/       # Retriever agent logic
135+ │   │   │   ├── __init__.py
136+ │   │   │   ├── agent.py
137+ │   │   │   └── utils.py
138+ │   │   ├── writer/          # Writer agent logic
139+ │   │   │   ├── __init__.py
140+ │   │   │   ├── agent.py
141+ │   │   │   └── templates.py
142+ │   │   └── verifier/        # Verifier agent logic
143+ │   │       ├── __init__.py
144+ │   │       ├── agent.py
145+ │   │       └── rules.py
146+ │   ├── api/                 # API endpoints
147+ │   │   ├── v1/
148+ │   │   │   ├── __init__.py
149+ │   │   │   ├── auth.py
150+ │   │   │   ├── proposals.py
151+ │   │   │   └── users.py
152+ │   │   └── middleware/
153+ │   ├── core/                # Core business logic
154+ │   │   ├── config/
155+ │   │   ├── models/
156+ │   │   └── services/
157+ │   ├── logs/                # Log files
158+ │   └── main.py
159+ ├── frontend/                # Streamlit frontend
160+ │   ├── assets/              # Static assets
161+ │   │   ├── css/
162+ │   │   └── img/
163+ │   ├── components/          # Reusable components
164+ │   │   ├── upload/
165+ │   │   ├── prompt/
166+ │   │   └── preview/
167+ │   ├── pages/               # Application pages
168+ │   │   ├── home.py
169+ │   │   ├── generate.py
170+ │   │   └── history.py
171+ │   └── main.py
172+ ├── shared/                  # Shared resources
173+ │   ├── mcp_schemas/         # MCP protocol schemas
174+ │   │   ├── input/
175+ │   │   └── output/
176+ │   ├── sample_rfps/         # Sample RFP documents
177+ │   └── templates/           # Proposal templates
178+ ├── infra/                   # Infrastructure code
179+ │   ├── gcp/                 # GCP configurations
180+ │   │   ├── backend/
181+ │   │   └── frontend/
182+ │   └── terraform/           # Terraform configurations
183+ ├── scripts/                 # Utility scripts
184+ │   ├── setup.sh
185+ │   └── cleanup.sh
186+ ├── .github/                 # GitHub configurations
187+ │   └── workflows/           # CI/CD workflows
188+ ├── tests/                   # Test suite
189+ │   ├── unit/
190+ │   └── integration/
191+ ├── .env.example             # Environment variables template
192+ ├── environment.yml          # Conda environment file
193+ ├── .gitignore               # Git ignore rules
194+ └── README.md                # Project documentation
195+ ```
196+
197+ ## 🔑 Key Features
198+
199+ ### Implemented Components ✅
200+
201+ #### **Retriever Agent (Prompt 2)**
202+ - **Dual Vector Search**: Simultaneously queries RFP and proposal vector databases
203+ - **Multi-Format Support**: Processes PDF, DOCX, and TXT documents
204+ - **Smart Text Chunking**: Intelligent document segmentation with overlapping windows
205+ - **MCP Compliance**: Follows Model Context Protocol for standardized I/O
206+ - **Real-time Logging**: Comprehensive JSONL logs with retrieval metadata
207+ - **Error Resilience**: Graceful handling of missing files or processing errors
208+ - **Flexible Querying**: Supports text-only, document-only, or combined queries
209+ - **Embedding Models**: Uses Sentence Transformers for semantic similarity
210+ - **FAISS Integration**: High-performance vector similarity search
211+ - **GPU Acceleration**: Optional GPU support for faster processing
212+
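To make "segmentation with overlapping windows" from the list above concrete, here is a minimal sketch of word-based chunking. It assumes windows are measured in whitespace-separated words; the function name `chunk_text` and its default sizes are hypothetical, and the retriever's real chunker may count characters or tokens instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size words that share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a window boundary still appears whole in at least one chunk, at the cost of some duplicated text in the index.
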
213+ #### **Text Processing Pipeline**
214+ - **PDF Extraction**: Advanced PDF text extraction with page preservation
215+ - **DOCX Processing**: Complete DOCX parsing including tables and paragraphs
216+ - **Text Normalization**: Intelligent cleaning and formatting
217+ - **Metadata Preservation**: Maintains source file information and processing timestamps
218+
219+ #### **Vector Database Management**
220+ - **Automated Building**: Scripts to build vector databases from document collections
221+ - **Index Management**: FAISS index creation and optimization
222+ - **Metadata Storage**: JSON-based chunk and database metadata
223+ - **Version Control**: Timestamped database builds with provenance tracking
224+
225+ ### Upcoming Components 🚧
226+ - Writer Agent: Context-aware proposal generation
227+ - Verifier Agent: Hallucination detection and fact-checking
228+ - API Integration: RESTful endpoints for agent coordination
229+ - Frontend Interface: Streamlit-based user interface
230+ - Cloud Deployment: GCP Cloud Run deployment pipeline
231+
232+ ## 💻 Usage Commands
233+
234+ ### Environment Setup
235+ ``` bash
236+ # Initialize conda in Git Bash (Windows)
237+ source ~/anaconda3/etc/profile.d/conda.sh
238+
239+ # Create and activate environment
240+ conda env create -f environment.yml
241+ conda activate propulse
242+
243+ # Copy environment variables template
244+ cp .env.example .env
245+ # Edit .env with your configuration
246+ ```
247+
248+ ### Vector Database Operations
249+ ``` bash
250+ # Build vector databases from sample documents
251+ python scripts/build_vector_db.py
252+
253+ # Build with custom paths
254+ python scripts/build_vector_db.py \
255+ --rfp-dir shared/sample_rfps \
256+ --proposal-dir shared/templates \
257+ --output-dir data/vector_dbs
258+
259+ # Build with GPU acceleration
260+ python scripts/build_vector_db.py --gpu
261+
262+ # Use different embedding model
263+ python scripts/build_vector_db.py --model all-mpnet-base-v2
264+ ```
265+
266+ ### Retriever Agent Usage
267+ ``` python
268+ # Basic retrieval example
269+ from backend.agents.retriever_agent import RetrieverAgent, QueryInput
270+
271+ # Initialize agent
272+ agent = RetrieverAgent(
273+     rfp_db_path="data/vector_dbs/rfp_db",
274+     proposal_db_path="data/vector_dbs/proposal_db"
275+ )
276+
277+ # Text-only query
278+ query = QueryInput(
279+     text="Need web application development with user authentication",
280+     top_k=5,
281+     similarity_threshold=0.2
282+ )
283+ result = agent.retrieve(query)
284+
285+ # Query with document upload
286+ query_with_doc = QueryInput(
287+     text="Software development project",
288+     document_path="path/to/rfp.pdf",
289+     top_k=10
290+ )
291+ result = agent.retrieve(query_with_doc)
292+
293+ # Save results
294+ agent.save_result(result)
295+ ```
296+
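One plausible reading of `top_k` and `similarity_threshold` in the queries above is: score every candidate, drop those below the threshold, and keep the best `top_k`. The sketch below assumes cosine similarity, which this README does not confirm, and uses plain Python lists in place of a FAISS index; `cosine` and `top_k_matches` are hypothetical helpers.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_matches(query, candidates, top_k=5, similarity_threshold=0.2):
    """Return up to top_k (doc_id, score) pairs scoring at or above the threshold."""
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in candidates.items()]
    kept = [pair for pair in scored if pair[1] >= similarity_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)  # best match first
    return kept[:top_k]
```

Under this reading, a low threshold such as 0.2 favors recall: weak matches are still returned as long as they rank within the first `top_k`.
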
297+ ### Testing
298+ ``` bash
299+ # Run all tests
300+ pytest
301+
302+ # Run specific test file
303+ pytest tests/test_retriever.py -v
304+
305+ # Run with coverage
306+ pytest --cov=backend tests/
307+
308+ # Run only unit tests (skip integration)
309+ pytest -m "not integration"
310+ ```
311+
312+ ### Development Tools
313+ ``` bash
314+ # Code formatting
315+ black .
316+ isort .
317+
318+ # Linting
319+ flake8
320+
321+ # Type checking
322+ mypy backend/
323+
324+ # Pre-commit hooks
325+ pre-commit install
326+ pre-commit run --all-files
327+ ```
328+
329+ ### Service Management
330+ ``` bash
331+ # Start backend service
332+ cd backend
333+ uvicorn main:app --reload --port 8000
334+
335+ # Start frontend (in separate terminal)
336+ cd frontend
337+ streamlit run main.py
338+
339+ # View API documentation
340+ # http://localhost:8000/docs
341+ ```
342+
343+ ### Logging and Monitoring
344+ ``` bash
345+ # View retriever logs
346+ tail -f logs/retriever_log.jsonl
347+
348+ # Monitor vector database build
349+ tail -f logs/vector_db_build.log
350+
351+ # Clean up logs and artifacts
352+ bash scripts/cleanup.sh
353+ ```
354+
355+ ### Infrastructure Management
356+ ``` bash
357+ # Deploy to GCP (when implemented)
358+ cd infra/terraform/prod
359+ terraform init
360+ terraform plan
361+ terraform apply
362+
363+ # View cloud resources
364+ gcloud run services list
365+ gcloud storage ls
366+ ```
367+
16368## 🏗️ Architecture
17369
18370### System Architecture
@@ -246,5 +598,5 @@ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file
246598
247599## 👥 Team
248600
249- - Project Lead: [Your Name](https://github.com/yourusername)
250- - Contributors: [See all contributors](https://github.com/yourusername/propulse/graphs/contributors) # Propulse
601+ - Project Lead: [Maulin Raval](https://github.com/nerdy1texan)
602+ - Contributors: [See all contributors](https://github.com/yourusername/propulse/graphs/contributors)