UBC-MDS · dd5124 · Jan 31, 2025
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -47,10 +47,10 @@ If you spot small typos or grammatical errors in documentation, you can fix them
 
 ## Get Started!
 
-Ready to contribute? Here's how to set up `salesanalyzer` for local development.
+Ready to contribute? Here's how to set up `salesanalyzer_mds` for local development.
 
-1. Download a copy of `salesanalyzer` locally.
-2. Install `salesanalyzer` using `poetry`:
+1. Download a copy of `salesanalyzer_mds` locally.
+2. Install `salesanalyzer_mds` using `poetry`:
 
     ```console
     $ poetry install
@@ -85,5 +85,5 @@ Before you submit a pull request, check that it meets these guidelines:
 
 ## Code of Conduct
 
-Please note that the `salesanalyzer` project is released with a
+Please note that the `salesanalyzer_mds` project is released with a
 Code of Conduct. By contributing to this project you agree to abide by its terms.
diff --git a/README.md b/README.md
@@ -1,32 +1,34 @@
-# salesanalyzer
+# salesanalyzer_mds
 
 [![Documentation Status](https://readthedocs.org/projects/salesanalyzer/badge/?version=latest)](https://salesanalyzer.readthedocs.io/en/latest/?badge=latest)
+[![ci-cd](https://github.com/UBC-MDS/salesanalyzer/actions/workflows/ci-cd.yml/badge.svg)](https://github.com/UBC-MDS/salesanalyzer/actions/workflows/ci-cd.yml)
 
 A python package that helps with the analysis on a sales data. The packagage will contain functions to be used as tools for identifying market segment, predicting future sales and analyzing seasonal revenue trends. <br>
 
-The sales_analyzer package will be an addition to the Python ecosystem as a specialized tool for analyzing retail sales data, targeting small to medium-sized businesses that may not have the resources for an in-house data analytics team and who could benefit from ready-to-use functions for common sales-related tasks. While existing packages such as `Pandas` and `Scikit-learn` provide general tools for data manipulation and machine learning predictions, `salesanalyzer` aims to streamline the process by offering a suite of pre-built, retail-specific analytical functions.
+The `salesanalyzer_mds` package will be an addition to the Python ecosystem as a specialized tool for analyzing retail sales data, targeting small to medium-sized businesses that may not have the resources for an in-house data analytics team and that could benefit from ready-to-use functions for common sales-related tasks. While existing packages such as `Pandas` and `Scikit-learn` provide general tools for data manipulation and machine learning predictions, `salesanalyzer_mds` aims to streamline the process by offering a suite of pre-built, retail-specific analytical functions.
 
 ## Installation
 
 ```bash
-$ pip install salesanalyzer
+$ pip install salesanalyzer_mds
 ```
 
 ## Functions
+
 - `segment_revenue_share`: Segments products into three categories: cheap, medium, expensive, based on price, and calculates their respective share in total revenue. 
 - `predictSales`: Predicts future sales based on the provided historical data and the target.
 - `sales_summary_statistics`: Calculates a variety of summary statistics that provide insights into overall sales performance,
     customer behavior, and product performance.
 
 ## Usage
 
-`salesanalyzer` can be used to extract sales data insights from available data.
+`salesanalyzer_mds` can be used to extract sales data insights from available data.
 1. Set up imports
 
 ```
-from salesanalyzer.sales_summary_statistics import sales_summary_statistics
-from salesanalyzer.segment_revenue_share import segment_revenue_share
-from salesanalyzer.predict_sales import predict_sales
+from salesanalyzer_mds.sales_summary_statistics import sales_summary_statistics
+from salesanalyzer_mds.segment_revenue_share import segment_revenue_share
+from salesanalyzer_mds.predict_sales import predict_sales
 import pandas as pd     # additional import to handle your sales data
 ```
 
@@ -35,10 +37,13 @@ import pandas as pd     # additional import to handle your sales data
 3. Retrieve the insights:
 
 **Summary statistics**
+
 ```
 sales_summary_statistics(your_sales_data)
 ```
-The `sales_summary_statistics` returns a pandas DataFrame with:
+
+The `sales_summary_statistics()` function returns a pandas DataFrame with:
+
 - 'total_revenue': The total revenue generated by all sales.
 - 'unique_customers': The number of unique customers.
 - 'average_order_value': The average value of an order (sum of revenue per invoice).
@@ -47,15 +52,22 @@ The `sales_summary_statistics` returns a pandas DataFrame with:
 - 'average_revenue_per_customer': The average revenue generated by each customer.
 
 **Segment revenue share**
+
 ```
 segment_revenue_share(your_sales_data, 
                       price_col='UnitPrice', 
-                      quantity_col='Quantity')      # replace column names with your data column names
+                      quantity_col='Quantity',
+                      price_thresholds=None)      # replace column names with your data column names
 ```
-The `segment_revenue_share` returns a pandas DataFrame showing the total revenue share for each price segment:
-'cheap', 'medium', 'expensive'.
+
+The `segment_revenue_share()` function returns a pandas DataFrame showing the total revenue share for each price segment:
+'cheap', 'medium', 'expensive'. Price thresholds can be supplied by the user or determined automatically:
+
+- Custom price thresholds can be set using the `price_thresholds` parameter.
+- If not specified, thresholds are automatically determined based on the data.
 
 **Predict sales**
+
 ```
 predict_sales(your_sales_data, 
               new_data,     # new sales data to base the predictions on
@@ -64,19 +76,21 @@ predict_sales(your_sales_data,
               target = 'Quantity', 
               date_feature = 'InvoiceDate')
 ```
-The `predict_sales` returns a DataFrame with prediction values, and a printed out MSE score.
+
+The `predict_sales()` function returns a DataFrame of predicted values and prints the MSE score.
 
 ## Developer notes:
 ### Running The Tests
 
 Run the following command in the terminal from the project's root directory to execute the tests:
+
 ```bash
 pytest tests/
 ```
 
 To assess the branch coverage for this package:
 ```bash
-pytest --cov=salesanalyzer --cov-branch
+pytest --cov=salesanalyzer_mds --cov-branch
 ```
 
 ## Dependencies
@@ -103,8 +117,8 @@ Interested in contributing? Check out the contributing guidelines. Please note t
 
 ## License
 
-`salesanalyzer` was created by Yeji Sohn, Daria Khon, Franklin Aryee. It is licensed under the terms of the MIT license.
+`salesanalyzer_mds` was created by Yeji Sohn, Daria Khon, Franklin Aryee. It is licensed under the terms of the MIT license.
 
 ## Credits
 
-`salesanalyzer` was created with [`cookiecutter`](https://cookiecutter.readthedocs.io/en/latest/) and the `py-pkgs-cookiecutter` [template](https://github.com/py-pkgs/py-pkgs-cookiecutter).
+`salesanalyzer_mds` was created with [`cookiecutter`](https://cookiecutter.readthedocs.io/en/latest/) and the `py-pkgs-cookiecutter` [template](https://github.com/py-pkgs/py-pkgs-cookiecutter).
diff --git a/docs/example.ipynb b/docs/example.ipynb
@@ -4,11 +4,11 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Example usage of salesanalyzer package\n",
+    "# Example Usage\n",
     "\n",
-    "Welcome to the `sales_analyzer` package! This package is designed to help small-sized businesses analyze their retail sales data efficiently, without needing extensive data analytics expertise. If you've ever felt overwhelmed by tools like Pandas or Scikit-learn, or wished for more retail-specific functions, you're in the right place.\n",
+    "Welcome to the `salesanalyzer_mds` package! This package is designed to help small businesses analyze their retail sales data efficiently, without needing extensive data analytics expertise. If you've ever felt overwhelmed by tools like Pandas or Scikit-learn, or wished for more retail-specific functions, you're in the right place.\n",
     "\n",
-    "In this notebook, we'll walk through how to use the `salesanalyzer` package to extract valuable insights from your sales data. We’ll demonstrate key functionalities using real-world examples, so you can start improving your business decisions right away!"
+    "In this notebook, we'll walk through how to use the `salesanalyzer_mds` package to extract valuable insights from your sales data. We’ll demonstrate key functionalities using real-world examples, so you can start improving your business decisions right away!"
    ]
   },
   {
@@ -17,7 +17,7 @@
    "source": [
     "## Imports\n",
     "\n",
-    "Let us begin by setting up all our imports for this demonstration, which includes all 3 `salesanalyzer` functions:\n",
+    "Let us begin by setting up all our imports for this demonstration, which includes all 3 `salesanalyzer_mds` functions:\n",
     "- `sales_summary_statistics`: Calculates a variety of summary statistics that provide insights into overall sales performance, customer behavior, and product performance.\n",
     "- `segment_revenue_share`: Segments products into three categories: cheap, medium, expensive, based on price, and calculates their respective share in total revenue.\n",
     "- `predict_sales`: Predicts future sales based on the provided historical data and the target.\n",
@@ -45,7 +45,7 @@
     "\n",
     "Next, let us create a sample data to work with. \n",
     "> Note:\n",
-    "> `salesanalyzer` package is not limited to the sample data columns and can be customized to suit your specific requirements."
+    "> The `salesanalyzer_mds` package is not limited to the sample data columns and can be customized to suit your specific requirements."
    ]
   },
   {
@@ -179,7 +179,7 @@
    "source": [
     "## Get Summary Statistics\n",
     "\n",
-    "One of the key features of `salesanalyzer` is its ability to quickly generate sales summary. Use the `analyze_sales_trends()` function to generate insights like total revenue, average order value, and top selling products.\n",
+    "One of the key features of `salesanalyzer_mds` is its ability to quickly generate a sales summary. Use the `sales_summary_statistics()` function to generate insights like total revenue, average order value, and top-selling products.\n",
     "> Use help(sales_summary_statistics) for more information about the function"
    ]
   },
@@ -266,7 +266,7 @@
    "source": [
     "## Get Revenue Share for each Product Category\n",
     "\n",
-    "Another feature of `saleanalyzer`, the `segment_revenue_share()` function, segments products into three categories (cheap < medium < expensive) — based on their price, and calculates the respective share of total revenue contributed by each segment. This function is particularly useful for analyzing product sales data and understanding revenue distribution across different pricing tiers.\n",
+    "Another feature of `salesanalyzer_mds`, the `segment_revenue_share()` function, segments products into three categories (cheap < medium < expensive) based on their price and calculates the respective share of total revenue contributed by each segment. By default, the price thresholds are set automatically, but users can define custom thresholds to categorize products according to their specific business needs. This function is particularly useful for analyzing product sales data and understanding revenue distribution across different pricing tiers.\n",
     "> Use help(sales_summary_statistics) for more information about the function"
    ]
   },
@@ -337,10 +337,83 @@
     }
    ],
    "source": [
+    "# Using default price thresholds\n",
     "revenue_share = segment_revenue_share(sample_data, price_col='UnitPrice', quantity_col='Quantity')\n",
     "revenue_share"
    ]
   },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>PriceSegment</th>\n",
+       "      <th>TotalRevenue</th>\n",
+       "      <th>RevenueShare (%)</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>cheap</td>\n",
+       "      <td>1150</td>\n",
+       "      <td>9.24</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>medium</td>\n",
+       "      <td>3600</td>\n",
+       "      <td>28.92</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>expensive</td>\n",
+       "      <td>7700</td>\n",
+       "      <td>61.85</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "  PriceSegment  TotalRevenue  RevenueShare (%)\n",
+       "0        cheap          1150              9.24\n",
+       "1       medium          3600             28.92\n",
+       "2    expensive          7700             61.85"
+      ]
+     },
+     "execution_count": 11,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Using user-defined price thresholds\n",
+    "revenue_share = segment_revenue_share(sample_data, price_col='UnitPrice', quantity_col='Quantity', price_thresholds=(300, 500))\n",
+    "revenue_share"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -356,7 +429,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 6,
    "metadata": {},
    "outputs": [
     {
@@ -409,7 +482,7 @@
        "1              1.33"
       ]
      },
-     "execution_count": 5,
+     "execution_count": 6,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -441,7 +514,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 7,
    "metadata": {},
    "outputs": [
     {
@@ -494,7 +567,7 @@
        "1              1.88"
       ]
      },
-     "execution_count": 6,
+     "execution_count": 7,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -507,13 +580,13 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### This is the end of the tutorial, where you have seen how to get sales data insights using our package."
+    "This is the end of the tutorial, where you have seen how to get sales data insights using our package."
    ]
   }
  ],
  "metadata": {
   "kernelspec": {
-   "display_name": "salesanalyzser",
+   "display_name": "salesanalyzer",
    "language": "python",
    "name": "python3"
   },
@@ -527,7 +600,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.12.2"
+   "version": "3.11.9"
   }
  },
  "nbformat": 4,

diff --git a/pyproject.toml b/pyproject.toml
@@ -1,7 +1,7 @@
 [tool.poetry]
 name = "salesanalyzer_mds"
 version = "2.0.1"
-description = "A package for doing great things!"
+description = "A Python package for sales forecasting, statistical analysis, and data-driven insights"
 authors = ["Daria Khon, Franklin Aryee, Yeji Sohn"]
 license = "MIT"
 readme = "README.md"

diff --git a/src/salesanalyzer_mds/__pycache__/__init__.cpython-311.pyc b/src/salesanalyzer_mds/__pycache__/__init__.cpython-311.pyc
diff --git a/src/salesanalyzer_mds/__pycache__/predict_sales.cpython-311.pyc b/src/salesanalyzer_mds/__pycache__/predict_sales.cpython-311.pyc
diff --git a/src/salesanalyzer_mds/__pycache__/sales_summary_statistics.cpython-311.pyc b/src/salesanalyzer_mds/__pycache__/sales_summary_statistics.cpython-311.pyc
diff --git a/src/salesanalyzer_mds/__pycache__/segment_revenue_share.cpython-311.pyc b/src/salesanalyzer_mds/__pycache__/segment_revenue_share.cpython-311.pyc