For Caltech's CS 121 Final Project, we plan to develop an e-commerce application for cosmetics products and brands.
Here's the link to our Final Project Proposal: Final Project Proposal
For our CS 121 Final Project, we have built a cosmetics e-commerce application where users can browse and purchase items.
- Running our database
In mysql, run the following:
CREATE DATABASE cosmeticsdb;
USE cosmeticsdb;
SOURCE setup.sql;
SOURCE load-data.sql;
SOURCE setup-passwords.sql;
SOURCE setup-routines.sql;
SOURCE grant-permissions.sql;
SOURCE queries.sql;
This should be done prior to running any of the applications.
- Running our client-side application
Our client application is implemented using Python's Flask framework. To run the app, you must first install the required libraries. Run the following commands (assuming you are at the root directory, CS121FinalProject):
cd webClientInterface
brew install pkg-config
If you have Python 3:
pip3 install -r requirements.txt
flask --app app run
If you have Python 2:
pip install -r requirements.txt
flask --app app run
If you have issues running the app, consider running it in a virtual environment:
source env/bin/activate
Flask may take a while to load (up to several minutes) the first time it runs. If you get a 404 error, wait a few minutes and retry. We have attached a demo video of our program for those who want to see the project without downloading or running Flask.
Also, when testing the website, you MUST manually log out before you close it. We haven't had a chance to implement flushing sessions when the website is closed, so you will remain logged in until you sign out.
Our client application is fairly intuitive, since it has a front-end interface. Simply navigate around it as you would on any website!
To use all of the user functionality, you must create an account and log in.
- Running our admin-side application
Run the following in your command line:
python3 app-admin.py
You can start by using one of the following options:
- (a) check/update inventory
- (b) check/update product
- (c) check/update brand
- (d) check/update user
- (e) view statistics
The view statistics option lets you choose between viewing inventory statistics and sales statistics.
We obtained our data for the project from Kaggle's Cosmetics dataset, which contains a single file, cosmetics.csv, with the following 11 columns: Label (Product type, e.g. Moisturizer), Brand, Name (of the product), Price, Rank (Rating), Ingredients, Combination (boolean value), Dry (boolean), Normal (boolean), Oily (boolean), and Sensitive (boolean).
Part of our project involved designing an efficient DDL and loading our pre-existing data into our database tables. To do so, we rigorously cleaned and pre-processed our data, using either Excel/Google Sheets or a Python program built on Pandas/NumPy.
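As a rough illustration of where this pre-processing starts, the raw file can be read into typed records. This is a minimal sketch using only Python's standard library, not our actual cleaning script. It assumes the header names match the 11 columns listed above and that the five skin-type columns are encoded as 0/1. The function names and the sample row are made up for the example.

```python
import csv
import io

SKIN_TYPE_COLUMNS = ["Combination", "Dry", "Normal", "Oily", "Sensitive"]


def parse_row(row):
    """Convert one raw CSV row (a dict of strings) into typed values."""
    parsed = {
        "label": row["Label"].strip(),
        "brand": row["Brand"].strip(),
        "name": row["Name"].strip(),
        "price": float(row["Price"]),
        "rank": float(row["Rank"]),
        "ingredients": row["Ingredients"].strip(),
    }
    for column in SKIN_TYPE_COLUMNS:
        # Assumption: the boolean columns are stored as "0" or "1".
        parsed[column.lower()] = row[column].strip() == "1"
    return parsed


def read_cosmetics(file_obj):
    """Read every row of a cosmetics.csv-style file into a list of dicts."""
    return [parse_row(row) for row in csv.DictReader(file_obj)]


# Made-up sample data in the same column layout, so the sketch runs on its own.
SAMPLE = (
    "Label,Brand,Name,Price,Rank,Ingredients,Combination,Dry,Normal,Oily,Sensitive\n"
    'Moisturizer,ExampleBrand,Example Cream,25,4.5,"Water, Glycerin",1,0,1,1,0\n'
)

rows = read_cosmetics(io.StringIO(SAMPLE))
print(rows[0]["brand"], rows[0]["price"], rows[0]["dry"])
```

To run it against the real file, replace the `io.StringIO(SAMPLE)` argument with an opened `cosmetics.csv` (for example `open("cosmetics.csv", newline="", encoding="utf-8")`).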
brands.csv, products.csv, stores.csv
- Using Excel, we generated our primary datasets: brands, products, and stores. Each contains a unique id for every record, along with the attributes associated with it.
- These datasets were relatively simple to generate with Excel tools.
ingredients.csv
- ingredients.csv was generated mainly with a Python script (Pandas/NumPy), followed by manual processing.
- The script's output had many anomalous entries and hard-to-detect duplicates (e.g. duplicates in different languages, or the same ingredient written in a different string format), which had to be removed manually.
- The initial raw dataset generated by the Pandas script contained about 8500 rows; through manual cleaning, we narrowed it down to 5800 rows.
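Some of the "same ingredient, different string format" duplicates can be caught automatically before manual review. The sketch below shows one possible approach and is not the script we used: it compares ingredient names by a normalized key. The function names and the specific normalization rules (lowercasing, unifying spaces and hyphens, dropping trailing punctuation) are assumptions made for illustration.

```python
import re


def normalize_ingredient(name):
    """Return a canonical key used only for comparing ingredient names."""
    key = name.strip().lower()
    # Treat runs of whitespace, hyphens, and underscores as a single space.
    key = re.sub(r"[\s\-_]+", " ", key)
    # Drop stray leading/trailing spaces, periods, and asterisks.
    return key.strip(" .*")


def dedupe_ingredients(names):
    """Keep the first spelling seen for each normalized key, in input order."""
    seen = set()
    unique = []
    for name in names:
        key = normalize_ingredient(name)
        if key and key not in seen:
            seen.add(key)
            unique.append(name.strip())
    return unique


raw = ["Glycerin", " glycerin ", "GLYCERIN.", "Butylene Glycol", "butylene-glycol", "Water"]
print(dedupe_ingredients(raw))
```

A key-based pass like this cannot detect duplicates written in different languages, which is why that part of the cleaning still had to be done by hand.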
product_ingredients.csv
- product_ingredients.csv was also generated using a Python script.
- The script goes through each of the original ingredients texts one by one and loops through all the possible ingredients in our ingredients dataset to find every ingredient present in the text.
- It then writes all the (product_id, ingredient_id) pairs to the product_ingredients.csv file, line by line.
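The matching loop described above can be sketched roughly as follows. This is an illustrative version rather than our exact script: the function names and sample ids are hypothetical, and it assumes the ingredients text is comma-separated and that an ingredient counts as present only when it matches a whole comma-separated entry (so that "Water" does not match "Rose Water").

```python
import csv
import io


def find_product_ingredient_pairs(products, ingredients):
    """Return (product_id, ingredient_id) pairs.

    products: list of (product_id, ingredients_text) tuples
    ingredients: list of (ingredient_id, name) tuples
    """
    pairs = []
    for product_id, text in products:
        # Assumption: the raw text lists ingredients separated by commas.
        entries = {entry.strip().lower() for entry in text.split(",")}
        for ingredient_id, name in ingredients:
            if name.strip().lower() in entries:
                pairs.append((product_id, ingredient_id))
    return pairs


def write_pairs(pairs, file_obj):
    """Write one (product_id, ingredient_id) pair per line in CSV form."""
    writer = csv.writer(file_obj)
    for pair in pairs:
        writer.writerow(pair)


# Made-up sample data with hypothetical ids.
products = [(1, "Water, Glycerin, Rose Water"), (2, "Glycerin, Squalane")]
ingredients = [(10, "Water"), (11, "Glycerin"), (12, "Squalane")]

pairs = find_product_ingredient_pairs(products, ingredients)
output = io.StringIO()
write_pairs(pairs, output)
print(output.getvalue())
```

Matching whole entries is stricter than a plain substring search; a substring search would find more matches but also more false positives, so the choice depends on how clean the ingredients dataset is.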