This project analyzes Walmart sales data to derive business insights using Python for data processing and SQL Server for advanced analytics.

- Analyze payment methods and sales patterns
- Identify highest-rated product categories
- Determine busiest days and peak hours
- Calculate profitability by category
- Python 3.13.7
- Pandas - Data manipulation and analysis
- SQL Server - Database management and queries
- Jupyter Notebook - Interactive development
- pyodbc - Database connectivity
- Python 3.13
- MS SQL Server 2022
- Kaggle API key
This project provides comprehensive analysis of Walmart sales data to support data-driven business decisions.
- Original Dataset: 10,051 records, 11 columns
- After Cleaning: 9,969 records, 12 columns
- Data Quality: Removed 51 duplicate rows and 31 rows with missing values (82 rows in total)
- Data loading and initial exploration
- Data cleaning (handling missing values, duplicates)
- Data transformation (unit_price conversion)
- SQL Server database creation
- Data export to SQL tables
- Connection validation
- 9 key business questions answered
- Advanced SQL queries (CTEs, Window Functions, Joins)
- Performance optimization
# Key operations performed:
- pd.read_csv() with error handling
- df.drop_duplicates() and df.dropna()
- String manipulation for unit_price
- SQLAlchemy for database connection
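
The cleaning operations listed above could look roughly like the sketch below. It is a minimal illustration rather than the project's actual notebook code: the sample rows, the column names, the `$` prefix on `unit_price`, and the derived `total` column are all assumptions, and `load_and_clean` is a hypothetical helper name.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the Walmart CSV. The column names and
# the "$" prefix on unit_price are assumptions made for illustration.
raw_csv = io.StringIO(
    "invoice_id,category,unit_price,quantity\n"
    "1,Health and beauty,$74.69,7\n"
    "2,Electronic accessories,$15.28,5\n"
    "2,Electronic accessories,$15.28,5\n"  # exact duplicate of the row above
    "3,Home and lifestyle,,8\n"            # missing unit_price
    "4,Fashion accessories,$46.33,\n"      # missing quantity
)


def load_and_clean(source):
    """Load a CSV, drop duplicate and incomplete rows, and fix unit_price."""
    try:
        df = pd.read_csv(source)
    except (FileNotFoundError, pd.errors.ParserError) as exc:
        raise RuntimeError(f"Could not load dataset: {exc}") from exc

    df = df.drop_duplicates()  # remove exact duplicate rows
    df = df.dropna()           # remove rows with any missing value

    # Strip the currency symbol and convert the text column to float.
    df["unit_price"] = (
        df["unit_price"].str.replace("$", "", regex=False).astype(float)
    )
    # Assumed derived column: revenue per transaction line.
    df["total"] = df["unit_price"] * df["quantity"]
    return df.reset_index(drop=True)


cleaned = load_and_clean(raw_csv)
print(cleaned)
```

Dropping duplicates before dropping missing values keeps the two row counts separate, which makes it easier to report them individually.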
# Advanced techniques used:
- Common Table Expressions (CTEs)
- Window Functions (RANK(), OVER())
- Date/Time functions
- Conditional aggregation

total_transactions = 9969
total_revenue = "$1.2M+"
average_rating = 5.83
top_category = "Fashion accessories & Home and Lifestyle"
most_popular_payment = "Credit Card"
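
To illustrate how a CTE, a window function, and conditional aggregation can combine in one query, here is a small self-contained sketch. It uses Python's built-in `sqlite3` module as a stand-in for SQL Server so that it runs without a database server. The table name `walmart`, its columns, and the sample rows are assumptions for illustration, not the project's actual schema or queries.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE walmart "
    "(branch TEXT, category TEXT, payment_method TEXT, rating REAL)"
)

# Hypothetical sample rows.
rows = [
    ("A", "Fashion accessories", "Credit card", 8.0),
    ("A", "Fashion accessories", "Cash", 7.0),
    ("A", "Health and beauty", "Ewallet", 5.0),
    ("B", "Home and lifestyle", "Credit card", 9.0),
    ("B", "Health and beauty", "Cash", 6.0),
    ("B", "Health and beauty", "Ewallet", 4.0),
]
conn.executemany("INSERT INTO walmart VALUES (?, ?, ?, ?)", rows)

query = """
WITH category_stats AS (
    -- Conditional aggregation: count only credit card transactions.
    SELECT branch,
           category,
           AVG(rating) AS avg_rating,
           SUM(CASE WHEN payment_method = 'Credit card' THEN 1 ELSE 0 END)
               AS card_txns
    FROM walmart
    GROUP BY branch, category
),
ranked AS (
    -- Window function: rank categories by average rating within each branch.
    SELECT *,
           RANK() OVER (PARTITION BY branch ORDER BY avg_rating DESC)
               AS rating_rank
    FROM category_stats
)
SELECT branch, category, avg_rating, card_txns
FROM ranked
WHERE rating_rank = 1
ORDER BY branch
"""

top_rated = conn.execute(query).fetchall()
for row in top_rated:
    print(row)
```

The query sticks to standard SQL constructs (`WITH`, `CASE`, `RANK() OVER`), so it should carry over to SQL Server with little or no change, though that has not been verified against the project's database.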
📱 PAYMENT METHOD BREAKDOWN
• Credit Card: 42% (Most popular)
• E-wallet: 38%
• Cash: 20% (Least popular)
💰 TRANSACTION VOLUME BY PAYMENT
• E-wallet: 3,881 transactions
• Credit Card: 4,256 transactions
• Cash: 1,832 transactions
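
The three counts above add up to 9,969, the size of the cleaned dataset. As a quick arithmetic check, the sketch below derives each method's share of transactions from those counts. The shares come out close to, but not identical to, the rounded breakdown listed earlier, which may have been computed on a different basis.

```python
# Transaction counts per payment method, taken from the list above.
payment_counts = {"Credit Card": 4256, "E-wallet": 3881, "Cash": 1832}

total_transactions = sum(payment_counts.values())

# Share of all transactions per method, as a percentage with one decimal.
payment_shares = {
    method: round(100 * count / total_transactions, 1)
    for method, count in payment_counts.items()
}

print(total_transactions)
for method, share in payment_shares.items():
    print(f"{method}: {share}%")
```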
🏆 TOP CATEGORIES BY RATING:
- Fashion accessories: ★★★★☆ (7.1/10)
- Home and Lifestyle: ★★★★☆ (7.1/10)
🌅 SALES BY TIME SHIFTS
• Morning (6AM-12PM): 15% of sales
• Afternoon (12PM-5PM): 55% of sales (Peak)
• Evening (5PM-12AM): 25% of sales
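
One way to assign each transaction to a shift is to bucket its timestamp by hour, using the boundaries given above. The sketch below is illustrative only: `shift_for` is a hypothetical helper, the sample times are made up, and the "Off-hours" label for times before 6AM is an assumption, since the shifts above do not cover that window.

```python
from collections import Counter
from datetime import time


def shift_for(t: time) -> str:
    """Map a transaction time to a shift using the boundaries listed above."""
    if 6 <= t.hour < 12:
        return "Morning"
    if 12 <= t.hour < 17:
        return "Afternoon"
    if t.hour >= 17:
        return "Evening"
    return "Off-hours"  # assumption: 12AM-6AM is not covered by the shifts


# Hypothetical transaction times.
sample_times = [
    time(10, 29),
    time(13, 8),
    time(13, 23),
    time(16, 59),
    time(20, 33),
]

shift_counts = Counter(shift_for(t) for t in sample_times)

# Percentage of sample transactions falling in each shift.
shift_shares = {
    shift: round(100 * count / len(sample_times), 1)
    for shift, count in shift_counts.items()
}

print(shift_shares)
```

Using half-open ranges (start inclusive, end exclusive) means a sale at exactly 12PM or 5PM lands in only one shift.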
⭐ CUSTOMER SATISFACTION
• Average Rating: 5.83/10
• Fashion accessories & Home and lifestyle receive highest satisfaction
- Promote Credit Card payments - Already most popular, consider loyalty rewards
- Stock optimization - Increase Fashion accessories inventory
- Staff scheduling - Peak hours (2PM-5PM) need maximum staff
- Home and lifestyle expansion - Highest rated and most profitable category
- Afternoon marketing - Target campaigns during peak sales hours (12PM-5PM)
This analysis demonstrates that Walmart's sales are strongest in:
- Fashion accessories & Home and lifestyle (highest profit and ratings)
- Afternoon shifts (peak transaction hours)
- Credit Card payments (most preferred method)
This project is licensed under the MIT License. You are free to use, modify, and share this project with proper attribution.
Hi there! I'm Shoaib Dyre, an aspiring Data Analyst passionate about transforming raw data into meaningful insights. This GitHub repository showcases my journey in learning data analysis, including projects, SQL queries, and analytical reports.
🔍 Curious about data – I love exploring datasets, finding patterns, and telling stories through numbers.
📊 Skills in development: SQL, Python (Pandas, NumPy), Excel, Power BI and data cleaning.
🎯 Goal: Land my first Data Analyst role and contribute to data-driven decision-making.