- Statistics graduate & IT Cloud Ops Analyst based in Singapore.
- Interested in data, cloud, analytics, machine learning, and related fields.
- Growing skills in AWS, GCP, Python, SQL, and machine learning.
- Experienced in data analytics, building data pipelines, automating processes, and creating dashboards.
- Published research in statistical quality control (acceptance sampling) and COVID-19 survival regression analysis.
- How to reach me:
Here are selected projects completed using Python, Power BI, Tableau, SQL, Looker Studio, and Java:
Note: The dates indicate the month and year when each project was completed.
- Parcel Delivery Time Prediction (Regression Modeling) | Oct 2025
- Tools: Jupyter Notebook, Python (Pandas, NumPy, Seaborn, Matplotlib, Scikit-learn, XGBoost, TensorFlow, PyTorch, Joblib)
- Description: Built regression models to predict parcel delivery times using historical Amazon order data.
- Techniques: EDA, feature engineering (date-time transformations, distance calculations), outlier detection, multicollinearity check, mixed scaling (StandardScaler + RobustScaler), feature encoding, and model comparison (Random Forest, XGBoost, TensorFlow DNN, PyTorch DNN).
- Result: Selected TensorFlow DNN as the final model due to its stability and generalization, achieving competitive MAE, RMSE, and R². Deployed a pipeline with preprocessing and model serialization for inference on new parcel orders.
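
The mixed scaling step in the parcel delivery project pairs a z-score scaler with an outlier-resistant one. The project itself used Scikit-learn's `StandardScaler` and `RobustScaler`; the sketch below only re-creates the idea in plain Python for illustration. The column names and values are hypothetical, and which features received which scaler is an assumption, since the description does not say.

```python
from statistics import mean, median, pstdev, quantiles


def standard_scale(values):
    """Z-score scaling: subtract the mean, divide by the population std."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def robust_scale(values):
    """Median/IQR scaling, which is less sensitive to extreme values."""
    med = median(values)
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        return [0.0 for _ in values]
    return [(v - med) / iqr for v in values]


# Hypothetical features: one well-behaved, one with an extreme outlier.
distance_km = [2.0, 4.0, 6.0, 8.0]
prep_minutes = [10.0, 12.0, 14.0, 16.0, 200.0]

scaled_distance = standard_scale(distance_km)  # mean 0, unit variance
scaled_prep = robust_scale(prep_minutes)       # median maps to 0
```

With median/IQR scaling the outlier stays large after scaling, but it does not distort the scale applied to the typical values, which is the usual reason for mixing the two scalers.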
- SQL Driven Business Analytics Framework | Aug 2025
- Tools: SQL Server, Power BI (DAX)
- Designed queries for customer, sales, and profitability insights.
- Built dashboards with KPIs (revenue, profit margin, repeat purchase rate), maps, bar charts, and trend visualizations.
- Created calculated metrics in Power BI using DAX measures (e.g., Revenue per Order, Profit per Order, Margin per Order, Repeat Purchase Rate).
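
The per-order metrics listed for the SQL and Power BI project can be illustrated with a small calculation. This is a sketch in Python rather than DAX, with hypothetical order records; in particular, defining repeat purchase rate as the share of customers with more than one order is an assumption, as the project description does not give its formula.

```python
from collections import Counter

# Hypothetical order records: (order_id, customer_id, revenue, cost).
orders = [
    (1, "C1", 100.0, 60.0),
    (2, "C1", 50.0, 30.0),
    (3, "C2", 200.0, 150.0),
    (4, "C3", 50.0, 20.0),
]

total_revenue = sum(revenue for _, _, revenue, _ in orders)
total_profit = sum(revenue - cost for _, _, revenue, cost in orders)

revenue_per_order = total_revenue / len(orders)
profit_per_order = total_profit / len(orders)
profit_margin = total_profit / total_revenue

# Assumed definition: customers with more than one order / all customers.
orders_per_customer = Counter(customer for _, customer, _, _ in orders)
repeat_customers = sum(1 for n in orders_per_customer.values() if n > 1)
repeat_purchase_rate = repeat_customers / len(orders_per_customer)
```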
- Identification of Disaster-Related Tweets (NLP Classification) | May 2023
- Tools: Jupyter Notebook, Python (Pandas, NumPy, Seaborn, Matplotlib, SciPy, Plotly, NLTK, re, collections, wordcloud, TensorFlow, Scikit-learn)
- Description: Developed an NLP classification model to predict disaster-related tweets.
- Techniques: EDA, Text Preprocessing, Classification Model Comparison (Linear SVC, Multinomial NB, Neural Network).
- Result: Achieved AUC 0.86 with Linear SVC, showing strong separation between disaster and non-disaster tweets.
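
The AUC reported for the tweet classifier measures how often a randomly chosen positive example is scored above a randomly chosen negative one. The sketch below computes that pairwise definition directly in plain Python; the labels and scores are hypothetical, and the project itself would more likely have used a library routine.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = disaster tweet) and classifier scores.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(y_true, scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.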
- Churn Prediction (IBM Telco Dataset) | Apr 2023
- Tools: Jupyter Notebook, Python (Pandas, NumPy, Seaborn, Matplotlib, Plotly, H3, Folium, TensorFlow, imblearn, Scikit-learn, XGBoost)
- Description: Built ML models to predict customer churn.
- Techniques: EDA, Visualization, Classification Model Comparison (Random Forest, Logistic Regression, AdaBoost, XGBoost).
- Result: Achieved AUC 0.86 with XGBoost.
- Web Scraping Booking.com | Apr 2023
- Tools: Python (Pandas, Requests, BeautifulSoup, RegEx)
- Description: Scraped hotel data (name, rating, reviews, distance from city center, prices) and processed structured datasets for further analytics.
- Titanic Survival Prediction (Kaggle Competition) | Mar 2023
- Tools: Jupyter Notebook, Python (Pandas, NumPy, Seaborn, Matplotlib, Scikit-learn, TensorFlow)
- Description: Developed ML models to predict survival outcomes.
- Techniques: EDA, Feature Engineering, Visualization, Classification Model Comparison (Random Forest, Logistic Regression, Complement Naive Bayes).
- Result: Achieved stratified k-fold CV score of 0.85 with Random Forest.
- Feature Engineering - Convert UTC to Local Time | Mar 2023
- Tools: Python (Pandas, DateTime, Dateutil, pytz)
- Description: Converted UTC time to Malaysia Standard Time for analytics workflows.
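
The UTC-to-local conversion can be sketched with the standard library alone. The project lists Pandas, Dateutil, and pytz; the version below is not that implementation. It assumes Malaysia time is a fixed UTC+8 offset with no daylight saving, and the input format and the helper name `utc_to_myt` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Malaysia time is a fixed UTC+8 offset with no daylight saving.
MYT = timezone(timedelta(hours=8), name="MYT")


def utc_to_myt(utc_text):
    """Parse a naive 'YYYY-MM-DD HH:MM:SS' UTC string and convert to UTC+8."""
    naive = datetime.strptime(utc_text, "%Y-%m-%d %H:%M:%S")
    aware_utc = naive.replace(tzinfo=timezone.utc)  # mark the value as UTC
    return aware_utc.astimezone(MYT)                # shift to local time


local = utc_to_myt("2023-03-01 18:30:00")
print(local.isoformat())  # 2023-03-02T02:30:00+08:00
```

Attaching the UTC zone before converting matters: a late-evening UTC timestamp rolls over to the next calendar day locally, which affects any date-based feature derived afterwards.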
- Worldwide Movie Series Visualization | Jan 2023
- Tools: Python (Pandas, NumPy, Seaborn, Matplotlib, wordcloud)
- Description: Created visualizations highlighting patterns and trends in movie series data.
- Techniques: EDA, Feature Engineering, Visualization.
- Online Payment Fraud Detection | Dec 2022
- Tools: Jupyter Notebook, Python (Pandas, NumPy, Seaborn, Matplotlib, Tabulate, Scikit-learn)
- Description: Trained ML models to classify fraudulent vs. non-fraudulent transactions.
- Techniques: EDA, Visualization, Classification Model Comparison (Random Forest, Logistic Regression).
- Result: Achieved stratified k-fold CV F1 score of 0.985 using a Random Forest model.
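
The F1 score used for the fraud model combines precision and recall, which is informative when one class is rare. A minimal sketch of the calculation in plain Python follows; the labels are hypothetical and the project's own evaluation used stratified k-fold cross-validation, which this snippet does not reproduce.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = fraudulent, 0 = legitimate.
y_true = [1, 0, 0, 1, 1, 0, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
score = f1_score(y_true, y_pred)
```

Here two frauds are caught, one is missed, and one legitimate transaction is flagged, so precision and recall are both 2/3 and the F1 score is 2/3 as well.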
- Cookies Sales Dashboard | May 2023
- Tools: Power BI (Power Query, DAX).
- Description: Built a dashboard to analyze sales, cost, profit, lead time, and customer trends.
- Flight Ticket Sales Analysis Dashboard | Jan 2023
- Tools: PostgreSQL, Tableau
- Description: Queried airline ticket data and built dashboards for sales, booking periods, and fare conditions.
- KPMG Data Analytics Consulting Virtual Internship | Nov 2022
- Tools: Python (Jupyter Notebook), Tableau.
- Conducted data quality assessment and insights analysis.
- Built Tableau dashboards for customer segmentation and insights presentation.
- Non-parametric Test for Patient Health Status | Mar 2022
- Tools: SAS Studio
- Description: Applied Shapiro-Wilk, Wilcoxon, Kolmogorov-Smirnov, Kruskal-Wallis, and Spearman's correlation to patient health data.
- E-commerce Dashboard | Jul 2021
- Tools: Looker Studio (Google Data Studio)
- Description: Built dashboards displaying sessions, transactions, revenue, checkout behavior, AOV, and conversion rate.
- Java Application - Simple Student Information System | Nov 2019
- Tools: Java (NetBeans)
- Built a Java application to represent a simple student information system.