An AI-powered FastAPI application that automatically categorizes news articles using Google's Gemini AI model. It classifies science and technology news into predefined categories and can assign more than one category to an article.
- AI-Powered Classification: Automatic categorization using Google Gemini 2.5 Flash
- Multi-Label Support: Articles can belong to multiple categories (max 3)
- Fallback System: Keyword-based classification if AI fails
- RESTful API: Complete CRUD operations for news management
- PostgreSQL Database: Persistent storage with SQLAlchemy ORM
- CORS Enabled: Ready for frontend integration
The system classifies news into these categories:
- AI - Artificial Intelligence, Machine Learning, Neural Networks
- Robotics - Robots, Automation, Autonomous Systems
- Space - Astronomy, Space Exploration, Satellites
- Aeronautics - Aircraft, Aviation, Flight Technology
- Physics - Quantum Mechanics, Particle Physics
- Engineering - Civil, Mechanical, Electrical Systems
- Biology - Genetics, Ecology, Evolution
- Medical Science - Healthcare, Treatments, Pharmaceuticals
- Environment - Climate, Conservation, Sustainability
- Other - Miscellaneous topics
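As a rough illustration of how these labels might be represented and checked in code, here is a minimal sketch. The names `CATEGORIES` and `parse_categories` are hypothetical, and the assumption that the classifier returns a comma-separated string is made only for the example; the project's actual parsing lives in ai.py and may differ.

```python
from typing import List

# The ten category labels listed above.
CATEGORIES = (
    "AI", "Robotics", "Space", "Aeronautics", "Physics",
    "Engineering", "Biology", "Medical Science", "Environment", "Other",
)

# Hypothetical lookup table: lowercase text -> canonical label.
_CANONICAL = {name.lower(): name for name in CATEGORIES}


def parse_categories(raw: str) -> List[str]:
    """Turn a comma-separated string into a list of known labels.

    Unknown labels are dropped, duplicates are removed, and an empty
    result falls back to "Other".
    """
    found: List[str] = []
    for part in raw.split(","):
        label = _CANONICAL.get(part.strip().lower())
        if label is not None and label not in found:
            found.append(label)
    return found or ["Other"]
```

Matching case-insensitively against a fixed list keeps unexpected labels out of the stored categories.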
- Python 3.8+
- PostgreSQL database
- Google Gemini API key
- Clone the repository
git clone https://github.com/Asifmahmud436/news_analyzer
cd news_analyzer
- Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies
pip install fastapi uvicorn sqlalchemy psycopg2-binary python-dotenv google-generativeai typing-extensions
- Set up environment variables
Create a .env file in the project root:
DATABASE_URL=postgresql://username:password@localhost:5432/news_db
GEMINI_API_KEY=your_gemini_api_key_here
- Run the application
uvicorn main:app --reload
The API will be available at http://127.0.0.1:8000
- Swagger UI: http://127.0.0.1:8000/docs
- ReDoc: http://127.0.0.1:8000/redoc
POST /news
Request Body:
{
"headline": "SpaceX Successfully Lands Starship",
"body": "The massive rocket completed its flight test and touched down vertically on the pad."
}
Response:
{
"id": 1,
"headline": "SpaceX Successfully Lands Starship",
"body": "The massive rocket completed its flight test...",
"categories": ["Space", "Engineering"],
"created_at": "2025-12-18T01:00:00",
"updated_at": "2025-12-18T01:00:00"
}
GET /news
GET /news/{news_id}
PUT /news/{news_id}
Request Body:
{
"headline": "Updated Headline",
"body": "Updated content"
}
DELETE /news?id={news_id}
Create a medical news article:
curl -X 'POST' \
'http://127.0.0.1:8000/news' \
-H 'Content-Type: application/json' \
-d '{
"headline": "Breakthrough in Cancer Treatment",
"body": "Scientists discover a new immunotherapy drug that targets cancer cells with 90% success rate."
}'
Create an AI + Medical Science article:
curl -X 'POST' \
'http://127.0.0.1:8000/news' \
-H 'Content-Type: application/json' \
-d '{
"headline": "AI Revolutionizes Medical Diagnosis",
"body": "A new artificial intelligence system can detect diseases from X-rays faster than human doctors."
}'
Get all news:
curl -X 'GET' 'http://127.0.0.1:8000/news'
news-classification-api/
├── main.py # FastAPI application & routes
├── ai.py # Gemini AI classification logic
├── models.py # SQLAlchemy database models
├── schema.py # Pydantic schemas
├── database.py # Database configuration
├── .env # Environment variables (not in git)
├── requirements.txt # Python dependencies
└── README.md # This file
- Create PostgreSQL database:
CREATE DATABASE news_db;
- Update DATABASE_URL in .env file
- Get API key from Google AI Studio
- Add to .env file as GEMINI_API_KEY
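To make the two required settings concrete, here is a minimal sketch of reading and validating them at startup. The `Settings` class and `load_settings` function are hypothetical names introduced for illustration, not taken from the project. The sketch reads from the process environment only; in the project, python-dotenv's `load_dotenv()` would typically be called first so that values from .env are present.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the two values stored in .env."""
    database_url: str
    gemini_api_key: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read both variables and fail early if either is missing or empty."""
    env = os.environ if env is None else env
    required = ("DATABASE_URL", "GEMINI_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        database_url=env["DATABASE_URL"],
        gemini_api_key=env["GEMINI_API_KEY"],
    )
```

Failing at startup with the names of the missing variables tends to be easier to debug than a connection or authentication error on the first request.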
Update origins list in main.py to include your frontend URLs:
origins = [
"http://localhost:3000", # React
"http://localhost:5173", # Vite
"https://your-app.vercel.app"
]
- Primary Method: Gemini AI analyzes article text with detailed prompt engineering
- Fallback Method: If AI fails, keyword-based classification kicks in
- Post-Processing: Removes redundant "Other" category if specific categories exist
- Combines headline + body for better context
- Maximum 3 categories per article
- Prioritizes specific categories over "Other"
- Special handling for medical content
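The fallback and post-processing steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in ai.py: the keyword lists are invented for the example and cover only four categories, and `keyword_classify`, `post_process`, and `MAX_CATEGORIES` are hypothetical names.

```python
from typing import Dict, List, Tuple

MAX_CATEGORIES = 3  # the README's limit of three categories per article

# Illustrative keyword lists only; the project's real lists are not shown here.
KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "AI": ("artificial intelligence", "machine learning", "neural network"),
    "Space": ("rocket", "satellite", "astronomy", "spacex"),
    "Medical Science": ("cancer", "drug", "disease", "treatment"),
    "Robotics": ("robot", "automation"),
}


def keyword_classify(headline: str, body: str) -> List[str]:
    """Score each category by keyword hits in headline + body, best first."""
    text = f"{headline} {body}".lower()
    scores = {
        category: sum(text.count(word) for word in words)
        for category, words in KEYWORDS.items()
    }
    ranked = sorted(
        (category for category, score in scores.items() if score > 0),
        key=lambda category: -scores[category],
    )
    return ranked or ["Other"]


def post_process(categories: List[str]) -> List[str]:
    """Drop duplicates, drop "Other" when a specific label exists, cap at three."""
    unique = list(dict.fromkeys(categories))  # keeps first-seen order
    specific = [c for c in unique if c != "Other"]
    return (specific or ["Other"])[:MAX_CATEGORIES]
```

For example, `post_process(["Space", "Other", "Engineering"])` keeps only the two specific labels, while a list containing nothing but "Other" is returned unchanged.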
CREATE TABLE news (
id SERIAL PRIMARY KEY,
headline TEXT NOT NULL,
body TEXT NOT NULL,
categories JSON,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
fastapi>=0.104.1
uvicorn[standard]>=0.24.0
sqlalchemy>=2.0.23
psycopg2-binary>=2.9.9
python-dotenv>=1.0.0
google-generativeai>=0.3.1
typing-extensions>=4.8.0
The API includes comprehensive error handling:
- AI Failures: Automatic fallback to keyword-based classification
- Database Errors: Proper rollback and error messages
- 404 Errors: Clear messages for missing resources
- Validation: Pydantic schema validation
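The AI-failure case in the list above amounts to wrapping the primary classifier so that any exception, or an empty result, hands control to the fallback. A minimal sketch, assuming both classifiers take a headline and a body and return a list of labels; `classify_with_fallback` and the `Classifier` alias are hypothetical names, and treating an empty result as a failure is an assumption of this example.

```python
import logging
from typing import Callable, List

logger = logging.getLogger(__name__)

# Hypothetical signature shared by both classifiers: (headline, body) -> labels.
Classifier = Callable[[str, str], List[str]]


def classify_with_fallback(
    headline: str,
    body: str,
    primary: Classifier,
    fallback: Classifier,
) -> List[str]:
    """Try the primary classifier; use the fallback if it raises or returns nothing."""
    try:
        categories = primary(headline, body)
        if categories:
            return categories
        logger.warning("Primary classifier returned no categories; using fallback")
    except Exception:
        # Log the traceback so AI failures remain visible even though the request succeeds.
        logger.exception("Primary classifier failed; using fallback")
    return fallback(headline, body)
```

Passing the classifiers in as arguments keeps the wrapper easy to test without calling the Gemini API.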
- API keys stored in environment variables
- CORS configured for specific origins
- SQL injection protected by SQLAlchemy ORM
- Input validation via Pydantic schemas
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Open a Pull Request
This project is licensed under the MIT License.
- Google Gemini AI for classification
- FastAPI for the web framework
- SQLAlchemy for database ORM
For issues or questions:
- Open an issue on GitHub
- Contact: safaandsafa4@example.com
Built with ❤️ using FastAPI and Google Gemini AI