This repo highlights applied projects in text analytics, forecasting, and regression.
All datasets here are synthetic or public; none contain sensitive data.
- 🔗 Project page: milkshake-forecasting/README.md
- 📓 Notebook: milkshake-forecasting/MilkshakeModeling.ipynb
- 🗂️ Data: ItemSales_2023_2025.csv, Weather Data.csv
- ✨ Outcome: Best overall was OLS (lags + weather), with the strongest point accuracy and interval calibration on the synthetic set.
Compare sentiment and themes between two brands using NLP (sentiment, n-grams, light topic modeling).
- 🔗 Project page: competitive-analysis/README.md
- 📓 Notebook: competitive-analysis/Competitive Analysis.ipynb
- 🗂️ Data: suck_it_up_reviews.csv, brushy_mountain_reviews.csv
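
As a rough illustration of the n-gram step, the sketch below counts the most frequent bigrams in a handful of review strings using only the standard library. The function name `top_ngrams`, the tokenizer regex, and the sample reviews are hypothetical and are not taken from the notebook, which may tokenize and count differently.

```python
import re
from collections import Counter


def top_ngrams(texts, n=2, k=3):
    """Return the k most common n-grams (as tuples of tokens) across texts."""
    counts = Counter()
    for text in texts:
        # Simple lowercase word tokenizer (an assumption, not the notebook's).
        tokens = re.findall(r"[a-z']+", text.lower())
        # Zip n staggered copies of the token list to form sliding n-grams.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts.most_common(k)


# Hypothetical reviews, for illustration only.
sample_reviews = [
    "Great flavor and great price",
    "great flavor, fast shipping",
]
top_bigram = top_ngrams(sample_reviews, n=2, k=1)
print(top_bigram)
```

Running the same function on each brand's reviews and comparing the top lists side by side is one simple way to surface themes that differ between the two brands.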
- 🔗 Project page: store-sales-forecasting/README.md
- 📓 Notebook: store-sales-forecasting/StoreSalesForecasting.ipynb
- 🗂️ Data: StoreSales_Summary_23-24_synthetic.csv, StoreSales_Summary_24-25_synthetic.csv, Weather_Data_SYNTH.csv
Headline results (test set; RMSE, MAE, and average PI range are in dollars; PI = prediction interval):
| Model | RMSE | MAE | % within $20 | PI coverage | Avg PI range |
|---|---|---|---|---|---|
| Calendar only | 89.98 | 59.49 | 50.70% | 88.73% | $790.84 |
| Calendar + weather | 91.46 | 59.96 | 53.52% | 91.55% | $734.92 |
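
For readers unfamiliar with the columns above, here is a minimal sketch of how such metrics could be computed from actual values, point forecasts, and interval bounds. The function name `forecast_metrics`, the inclusive treatment of the $20 tolerance and of the interval bounds, and the toy numbers are all assumptions for illustration; the notebook's own definitions may differ.

```python
import math


def forecast_metrics(actual, predicted, lower, upper, tolerance=20.0):
    """Summarize point accuracy and prediction-interval quality."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    return {
        # Root mean squared error and mean absolute error, in dollars.
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "mae": sum(abs(e) for e in errors) / n,
        # Share of forecasts within +/- tolerance of the actual value
        # (inclusive bound is an assumption).
        "pct_within": 100.0 * sum(abs(e) <= tolerance for e in errors) / n,
        # Share of actual values that fall inside their prediction interval.
        "pi_coverage": 100.0
        * sum(lo <= a <= hi for a, lo, hi in zip(actual, lower, upper))
        / n,
        # Average width of the prediction interval, in dollars.
        "avg_pi_range": sum(hi - lo for lo, hi in zip(lower, upper)) / n,
    }


# Toy numbers, not drawn from the project data.
metrics = forecast_metrics(
    actual=[100, 150, 200, 250],
    predicted=[110, 140, 230, 250],
    lower=[50, 100, 210, 200],
    upper=[150, 200, 260, 300],
)
print(metrics)
```

Reading coverage and average range together matters: a model can reach high coverage simply by issuing very wide intervals, so a narrower average range at similar coverage is generally preferable.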
Python (pandas, NumPy, scikit-learn, statsmodels, matplotlib, seaborn)
- This repository contains my capstone project for the M.S. at Appalachian State University.
- All code was written by me. I drew on class materials and public documentation for reference, and I generated/used synthetic datasets that mirror the original private data.
- I received feedback on modeling choices and presentation from Prof. Jeff Kaleta.
- I also used AI assistance (ChatGPT) for drafting/refactoring text, improving documentation, and suggesting code organization. I reviewed, tested, and am responsible for all final code and results.