This is a student project about predicting the chance of admission. For this project we are using the Graduate Admissions dataset from Kaggle. We use machine learning to analyze the data, select a model that predicts a student's chance of admission, and then visualize the results.
The project uses the following key libraries:
- Dash - Web application framework
- Numpy - Numerical computing
- Pandas - Data manipulation and analysis
- scikit-learn - Machine learning
- Joblib - Model persistence
- Plotly - Interactive visualizations
- Dash-DAQ - Advanced Dash components
All dependencies are managed through requirements.txt with specific versions for compatibility:
- Core Dependencies: dash, pandas, numpy, scikit-learn, joblib, plotly
- UI Components: dash-daq, dash-table
- Deployment: gunicorn (production WSGI server)
git clone <repository-url>
cd StdAdmitPred
# Create virtual environment
python -m venv venv
# Activate virtual environment
# For Windows:
venv\Scripts\activate
# For Linux/macOS:
source venv/bin/activate
# Install all required packages
pip install -r requirements.txt
If you encounter model compatibility issues, retrain the model:
cd ML
python retrain_model_modular.py
cd WebApp
python app.py
Open your web browser and go to:
http://127.0.0.1:8050/
StdAdmitPred/
├── ML/ # Machine Learning & Data Processing
│ ├── config.py # ML configuration settings
│ ├── data_processor.py # Data loading and preprocessing
│ ├── model_trainer.py # Model training and evaluation
│ ├── retrain_model_modular.py # Model retraining script
│ ├── model_RandF.sav # Trained model file
│ └── Prediction.ipynb # Jupyter notebook for model development
├── WebApp/ # Dash Web Application
│ ├── config/
│ │ ├── __init__.py
│ │ └── settings.py # Application configuration
│ ├── data/
│ │ ├── __init__.py
│ │ └── data_loader.py # Data loading and preprocessing
│ ├── models/
│ │ ├── __init__.py
│ │ └── model_manager.py # Model loading and predictions
│ ├── visualizations/
│ │ ├── __init__.py
│ │ └── charts.py # Chart creation functions
│ ├── components/
│ │ ├── __init__.py
│ │ ├── header.py # Header component
│ │ ├── home_tab.py # Home tab component
│ │ ├── dataset_tab.py # Dataset tab component
│ │ ├── dashboard_tab.py # Dashboard tab component
│ │ ├── ml_tab.py # Machine Learning tab component
│ │ └── prediction_tab.py # Prediction tab component
│ ├── callbacks/
│ │ ├── __init__.py
│ │ └── prediction_callback.py # Prediction callback logic
│ ├── assets/ # Static assets (images, CSS)
│ ├── app.py # Main application file
│ └── Visualization.ipynb # Visualization development notebook
├── Dataset/ # Data Files
│ ├── admission_predict_V1.2.csv
│ └── Admission_Predict_Ver1.1.csv
├── requirements.txt # Python dependencies
├── .gitignore # Git ignore file
└── README.md # This file
- Interactive Predictions: Real-time admission probability calculation
- Data Visualizations: Comprehensive charts and graphs
- Model Performance: Feature importance and evaluation metrics
- Responsive Design: Modern, user-friendly interface
- Modular Architecture: Clean, maintainable, and scalable codebase
- Error Handling: Comprehensive error handling and logging throughout
- Type Safety: Full type hints and documentation for better code clarity
https://www.kaggle.com/mohansacharya/graduate-admissions?select=Admission_Predict_Ver1.1.csv
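A minimal sketch of reading the CSV with the standard library, in case it helps when exploring the data outside the notebooks. The column names are an assumption about the Kaggle file's layout, and some headers in that file are commonly reported to carry trailing spaces, so the sketch strips them defensively. The sample rows are invented for illustration and are not taken from the dataset; `load_rows` is a hypothetical helper, not part of this repository.

```python
import csv
import io

# Assumed column names (after stripping whitespace); verify against the real file.
FEATURES = ["GRE Score", "TOEFL Score", "University Rating", "SOP", "LOR", "CGPA", "Research"]
TARGET = "Chance of Admit"


def load_rows(handle):
    """Return a list of (features, target) pairs read from an open CSV handle."""
    rows = []
    for raw in csv.DictReader(handle):
        # Header names such as "LOR " may carry stray spaces, so normalize the keys.
        clean = {key.strip(): value.strip() for key, value in raw.items()}
        features = [float(clean[name]) for name in FEATURES]
        rows.append((features, float(clean[TARGET])))
    return rows


# Invented rows in the assumed layout, used only to exercise the function.
sample = io.StringIO(
    "Serial No.,GRE Score,TOEFL Score,University Rating,SOP,LOR ,CGPA,Research,Chance of Admit \n"
    "1,320,110,3,3.5,4.0,8.50,1,0.75\n"
    "2,300,100,2,2.5,3.0,7.80,0,0.55\n"
)
rows = load_rows(sample)
```

To read the real file, pass an open handle instead, for example `open("Dataset/Admission_Predict_Ver1.1.csv", newline="")`.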
- Regression Models:
- DecisionTree
- Linear Regression
- RandomForest (Selected)
- KNeighbors
- SVM
- AdaBoostClassifier
- GradientBoostingClassifier
- Ridge
- BayesianRidge
- ElasticNet
- HuberRegressor
- DASH/Plotly - Web framework and visualizations
- scikit-learn - Machine learning algorithms
- Pandas/Numpy - Data processing
- Machine Learning Model: RandomForestRegressor (78.53% accuracy)
- ML Module: Python with scikit-learn for model training and data processing
- WebApp Module: Dash/Plotly web application with modular design
- Model Persistence: Joblib for model saving/loading
- Code Organization: Modular architecture with separation of concerns
- Error Handling: Comprehensive logging and error management
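A minimal sketch of the train, evaluate, and persist cycle that the list above describes, assuming a RandomForestRegressor scored with R² on a held-out split. The data is synthetic, the file name `model_example.sav` is hypothetical, and the hyperparameters are illustrative. The project's actual training code lives in ML/model_trainer.py and may differ, and the source does not state which metric the reported 78.53% refers to.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 7 feature columns and a target clipped to [0, 1].
rng = np.random.default_rng(0)
X = rng.random((200, 7))
noise = 0.05 * rng.standard_normal(200)
y = np.clip(0.1 + 0.6 * X[:, 5] + 0.2 * X[:, 0] + noise, 0.0, 1.0)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the regressor and score it on the held-out split.
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
score = r2_score(y_test, model.predict(X_test))

# Round-trip the fitted model through joblib, as a .sav file would be.
with tempfile.TemporaryDirectory() as tmp:
    path = str(Path(tmp) / "model_example.sav")
    joblib.dump(model, path)
    restored = joblib.load(path)

# The restored model should reproduce the original predictions.
same_predictions = np.allclose(model.predict(X_test), restored.predict(X_test))
```

Files written by joblib are generally tied to the scikit-learn version that produced them, which is why retraining is the suggested remedy when a saved model fails to load.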
- CGPA: 75.97% (Most important factor)
- GRE Score: 10.27%
- TOEFL Score: 4.48%
- Statement of Purpose: 3.19%
- Letter of Recommendation: 2.92%
- University Rating: 1.67%
- Research Experience: 1.51%
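These percentages can be read as shares of a whole: scikit-learn's tree ensembles expose `feature_importances_` normalized to sum to 1, so the listed values should add up to roughly 100%. A small sketch that checks this and computes the cumulative share, using only the numbers listed above (the variable names are illustrative):

```python
# Feature importances as listed above, in percent.
importances = {
    "CGPA": 75.97,
    "GRE Score": 10.27,
    "TOEFL Score": 4.48,
    "Statement of Purpose": 3.19,
    "Letter of Recommendation": 2.92,
    "University Rating": 1.67,
    "Research Experience": 1.51,
}

# Rank from most to least important.
ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)

# The total differs from 100 only by rounding in the listed values.
total = sum(importances.values())

# Cumulative share after each feature, in ranked order.
cumulative = []
running = 0.0
for name, share in ranked:
    running += share
    cumulative.append((name, round(running, 2)))
```

CGPA and GRE Score together account for about 86% of the total importance, so the remaining five inputs contribute comparatively little to the model's splits.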
- RandomForest Regressor: 78.53%
- Test Set Performance: Consistent and reliable predictions
- Model Loading Error:
  - Run python ML/retrain_model_modular.py to create a compatible model
- Import Errors:
  - Ensure you're using the virtual environment
  - Install dependencies with pip install -r requirements.txt
- Port Already in Use:
  - Change the port in app.py or kill the existing process
- Dash-DAQ Import Issues:
  - Ensure dash-daq is installed: pip install dash-daq
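For the port conflict case, a quick way to confirm whether 8050 is actually taken is to try binding to it. The sketch below uses only the standard library. `is_port_free` and `find_free_port` are hypothetical helpers, not functions from this repository.

```python
import socket


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use": another process holds the port.
            return False
    return True


def find_free_port(start: int = 8050, attempts: int = 20, host: str = "127.0.0.1") -> int:
    """Return the first free port in [start, start + attempts)."""
    for port in range(start, start + attempts):
        if is_port_free(port, host):
            return port
    raise RuntimeError(f"no free port found in {start}..{start + attempts - 1}")
```

If `is_port_free(8050)` returns False, either stop the process holding the port or start the app on the value returned by `find_free_port()`. The check is only a snapshot, so another process could still claim the port before the app starts.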
cd WebApp
python app.py
- Create Procfile with: web: gunicorn app:server
- Ensure requirements.txt is up to date
- Deploy using Heroku CLI or GitHub integration
- GitHub Repository: https://github.com/LameesKadhim/SAP-project
- Live Demo: https://predict-student-admission.herokuapp.com/
- Video Trailer: https://youtu.be/rXDHiqIxYuQ
- Maintainability: Each component is isolated and easy to modify
- Testability: Individual modules can be tested independently
- Scalability: Easy to add new features or components
- Reusability: Components can be reused across different parts
- Error Handling: Comprehensive error handling and logging
- Documentation: Full type hints and docstrings
- Separation of Concerns: Each module has a single responsibility
- Factory Pattern: Clean initialization patterns for components
- Configuration Management: Centralized settings management
- Error Logging: Comprehensive logging for debugging
- Type Safety: Full type annotations for better code clarity
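As one way to picture the centralized settings idea, here is a minimal sketch of a typed settings object. It is not the contents of WebApp/config/settings.py. The class, the field names, and the environment variable names `SAP_HOST` and `SAP_PORT` are all hypothetical. Only the default address, the port, and the file paths come from this README.

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Settings:
    """Hypothetical central settings object; names are illustrative."""

    project_root: Path
    host: str = "127.0.0.1"
    port: int = 8050

    @property
    def model_path(self) -> Path:
        # Location of the trained model as shown in the project tree.
        return self.project_root / "ML" / "model_RandF.sav"

    @property
    def dataset_path(self) -> Path:
        # Location of the dataset as shown in the project tree.
        return self.project_root / "Dataset" / "Admission_Predict_Ver1.1.csv"


def load_settings(project_root: Path, env=None) -> Settings:
    """Build Settings, letting (hypothetical) environment variables override defaults."""
    env = os.environ if env is None else env
    return Settings(
        project_root=project_root,
        host=env.get("SAP_HOST", "127.0.0.1"),
        port=int(env.get("SAP_PORT", "8050")),
    )
```

Keeping paths and ports in one immutable object means a change such as moving to a different port touches a single place rather than every module that needs the value.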
This project is part of the Learning Analysis course (WS20/21) by the Datology Group.
Note: This project targets current versions of Python and Dash and uses a modular architecture.