
This project uses machine learning models like Logistic Regression, Random Forest, and XGBoost to detect fraudulent credit card transactions. It handles class imbalance using SMOTE and visualizes key fraud patterns through an interactive Power BI dashboard.


kamali1331/Credit-Card-Fraud-Detection-using-ML-with-Power-BI-Dashboard


Credit-Card-Fraud-Detection-using-ML-Power-BI-Dashboard

1. Introduction

With the rise of digital payments, credit card fraud has become a significant challenge for financial institutions. Traditional rule-based systems fail to detect subtle patterns used by fraudsters. This project aims to develop a machine learning-based fraud detection system that can intelligently flag suspicious transactions and help reduce financial loss. We also integrate Power BI to visualize trends and improve decision-making for fraud analysts.

📌 2. Project Overview

This project uses a real-world credit card transactions dataset that includes over 280,000 records with anonymized features. The goal is to build a classification model that can distinguish between genuine and fraudulent transactions with high accuracy and recall, especially since fraudulent cases represent a very small percentage of the data (imbalanced dataset problem).

The ML output is then visualized using Power BI, allowing analysts to interact with fraud trends, risk indicators, and transaction behavior.

💻 3. Programming Language & Tools Used

Language: Python

Libraries:

Pandas, NumPy – for data handling

Scikit-learn – for ML models and evaluation

XGBoost – for the gradient-boosted model (Logistic Regression and Random Forest are provided by Scikit-learn)

Matplotlib, Plotly – for visualization

imbalanced-learn (imported as imblearn) – for handling class imbalance (SMOTE)

Power BI:

Used to create dashboards with fraud trends, KPIs, and filters

Integrated using CSV exports from the Python model

credit-card-fraud-detection

Project structure

│
├── data/
│   ├── raw_data.csv                  # Original dataset
│   ├── cleaned_data.csv              # Cleaned and preprocessed data
│   └── prediction_results.csv        # Model predictions for Power BI
│
├── notebooks/
│   ├── eda_visualization.ipynb       # EDA using Plotly/Matplotlib
│   └── fraud_detection_model.ipynb   # Model training and evaluation
│
├── powerbi_dashboard/
│   └── fraud_dashboard.pbix          # Power BI report file
│
├── images/
│   └── fraud_dashboard.png           # Screenshot of dashboard
│
├── models/
│   └── xgboost_model.pkl             # Trained model (pickle format)
│
├── README.md                         # Project overview and instructions
└── requirements.txt                  # Python libraries required

🤖 4. About the Model

We experimented with multiple models:

Logistic Regression – for baseline performance

Random Forest – for robust, ensemble-based prediction

XGBoost – for high accuracy with optimized boosting

🧪 5. Training the Model (Workflow)

Data Preprocessing:

Remove duplicates

Scale Amount and Time using StandardScaler

Drop irrelevant columns (if any)
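The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes the raw data has numeric `Time` and `Amount` columns plus a binary `Class` label, the `preprocess` helper is hypothetical, and the small in-memory frame stands in for `data/raw_data.csv`.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicate rows and standardize the Time and Amount columns."""
    cleaned = df.drop_duplicates().reset_index(drop=True)
    scaler = StandardScaler()
    # Assumed column names; the anonymized features are left untouched.
    cleaned[["Time", "Amount"]] = scaler.fit_transform(cleaned[["Time", "Amount"]])
    return cleaned


# Toy stand-in for the raw dataset (hypothetical values, one duplicated row).
raw = pd.DataFrame(
    {
        "Time": [0.0, 10.0, 10.0, 25.0, 40.0],
        "Amount": [12.5, 99.0, 99.0, 3.2, 250.0],
        "V1": [0.1, -1.3, -1.3, 0.7, 2.2],
        "Class": [0, 0, 0, 0, 1],
    }
)
cleaned = preprocess(raw)
print(cleaned.shape)
```

Fitting the scaler on the full dataset keeps the sketch short. A stricter setup would fit it on the training split only and reuse it to transform the test split.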

Handle Imbalance:

Apply SMOTE to balance fraud/genuine classes in training data

Model Training:

Split the data with train_test_split (80:20) and train the models on the training split

Evaluate using metrics: Recall, Precision, F1-score, ROC-AUC
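The split-train-evaluate loop described above might look like the following sketch. It uses synthetic data and only the two Scikit-learn models to stay self-contained; the hyperparameters and seeds are assumptions, not the project's settings. `xgboost.XGBClassifier` exposes the same `fit` / `predict_proba` interface and could be added to the `models` dictionary in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data (illustrative only).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)

# 80:20 split, stratified so both parts keep the same fraud ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)               # hard 0/1 labels
    y_score = model.predict_proba(X_test)[:, 1]  # probability of the fraud class
    results[name] = {
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
        "roc_auc": roc_auc_score(y_test, y_score),
    }

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

ROC-AUC is computed from the predicted probabilities rather than the hard labels, since it measures how well the scores rank fraud above genuine transactions.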

Prediction & Output:

Predict fraud probability on test set

Save results with transaction IDs, amounts, and fraud scores

Power BI Integration:

Export cleaned + prediction data as .csv

Load into Power BI for dashboard creation
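A minimal sketch of the export step is shown below. The column names, the hard-coded scores, and the 0.5 threshold are assumptions for illustration; in practice the scores would come from the trained model's `predict_proba` output on the test set.

```python
import numpy as np
import pandas as pd

# Hypothetical test-set outputs; real scores would come from model.predict_proba.
transaction_ids = np.arange(1001, 1006)
amounts = np.array([12.50, 980.00, 45.10, 3.99, 1500.00])
fraud_scores = np.array([0.02, 0.91, 0.10, 0.01, 0.67])

THRESHOLD = 0.5  # assumed cut-off for flagging a transaction

predictions = pd.DataFrame(
    {
        "transaction_id": transaction_ids,
        "amount": amounts,
        "fraud_score": fraud_scores,
        "predicted_fraud": (fraud_scores >= THRESHOLD).astype(int),
    }
)

# Write a flat CSV that Power BI can load through its Text/CSV connector.
predictions.to_csv("prediction_results.csv", index=False)
```

Keeping the raw score alongside the 0/1 flag lets dashboard users filter by risk level instead of relying on a single fixed threshold.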

✅ 6. Merits

🎯 High Precision & Recall: ML models outperform manual rules

📊 Power BI Dashboards: Easy monitoring of fraud trends

🔁 Retrainable: Model can improve with more labeled data

⚡ Real-time capable: Can be deployed as an API for streaming detection

⚠️ 7. Demerits / Limitations

❗ False Positives: May block genuine users occasionally

πŸ” Data Privacy: Needs secure handling of sensitive data

πŸ’Ύ Model Drift: Patterns change over time, requiring regular retraining

βš–οΈ Explainability: Advanced models (like XGBoost) are harder to interpret without SHAP

🛑 8. Cautions

Use balanced evaluation metrics (don't trust accuracy alone)

Keep fraud experts in the loop for false positive/negative feedback

Ensure GDPR/PCI compliance while handling personal data

Don't deploy models trained only on historical data without validating on live data
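To make the first caution concrete, here is a small worked example in plain Python. The counts (10 fraud cases in 1,000 transactions) and the `evaluate` helper are hypothetical. It scores a "model" that labels every transaction as genuine.

```python
def evaluate(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = fraud)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical data: 10 fraud cases among 1,000 transactions.
y_true = [1] * 10 + [0] * 990
always_genuine = [0] * 1000  # a "model" that never flags anything

scores = evaluate(y_true, always_genuine)
print(scores)  # accuracy is 0.99 while recall is 0.0
```

The classifier reaches 99% accuracy without catching a single fraud case, which is why recall, precision, and F1 are better guides on imbalanced data.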

Screenshots

[Image: Credit card Fraud Detection]

[Image: Fraud detection 2]

🏁 9. Conclusion

This project demonstrates how Machine Learning can revolutionize the fight against credit card fraud when combined with Power BI dashboards. With intelligent prediction models and dynamic visualization tools, businesses can quickly detect suspicious activities, minimize financial loss, and stay ahead of evolving fraud tactics. Future extensions could include deploying the model as an API and adding new features.

Contributors Welcome

How to Contribute:

Fork the repository

Create your feature branch: git checkout -b my-feature

Commit your changes: git commit -am 'Add new feature'

Push to the branch: git push origin my-feature

Create a new Pull Request
