Student Admission Prediction

Project Idea

This is a student project about predicting a student's chance of admission to graduate school. The project uses the Graduate Admissions dataset from Kaggle. We use machine learning to analyze the data, find a model that predicts the chance of admission, and then visualize the results.

Requirements and Preparation

Required Libraries

The project uses the following key libraries:

Dash - Web application framework
Numpy - Numerical computing
Pandas - Data manipulation and analysis
scikit-learn - Machine learning
Joblib - Model persistence
Plotly - Interactive visualizations
Dash-DAQ - Advanced Dash components

Dependencies Management

All dependencies are managed through requirements.txt with specific versions for compatibility:

Core Dependencies: dash, pandas, numpy, scikit-learn, joblib, plotly
UI Components: dash-daq, dash-table
Development: gunicorn (for deployment)

Quick Start Guide

1. Clone the Repository

git clone <repository-url>
cd StdAdmitPred

2. Set Up Virtual Environment

# Create virtual environment
python -m venv venv

# Activate virtual environment
# For Windows:
venv\Scripts\activate
# For Linux/macOS:
source venv/bin/activate

3. Install Dependencies

# Install all required packages
pip install -r requirements.txt

4. Retrain Model (if needed)

If you encounter model compatibility issues, retrain the model:

cd ML
python retrain_model_modular.py
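
Retraining helps because a serialized scikit-learn model is tied to the library version that produced it, so a model saved under one version may fail to load under another. The following is a minimal sketch of a train, save, and reload cycle, not the contents of retrain_model_modular.py. The synthetic data, the hyperparameters, and the temporary output path are assumptions made for illustration; only RandomForestRegressor, Joblib, and the file name model_RandF.sav come from this README.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the admissions data: 7 features, target in [0, 1].
# (Assumption: the real script reads a CSV from Dataset/ instead.)
rng = np.random.default_rng(0)
X = rng.random((200, 7))
noise = 0.05 * rng.standard_normal(200)
y = np.clip(0.1 + 0.6 * X[:, 5] + 0.2 * X[:, 0] + noise, 0.0, 1.0)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hyperparameters are illustrative, not the project's settings.
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Save with the currently installed scikit-learn, then reload.
# The file name mirrors ML/model_RandF.sav; the directory is a temp dir.
model_path = os.path.join(tempfile.mkdtemp(), "model_RandF.sav")
joblib.dump(model, model_path)
reloaded = joblib.load(model_path)

# The reloaded model should reproduce the original predictions.
same_predictions = np.allclose(model.predict(X_test), reloaded.predict(X_test))
print("Reloaded model matches original:", same_predictions)
```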

5. Start the Application

cd WebApp
python app.py

6. Access the Application

Open your web browser and go to:

http://127.0.0.1:8050/

Project Structure

StdAdmitPred/
├── ML/                    # Machine Learning & Data Processing
│   ├── config.py          # ML configuration settings
│   ├── data_processor.py  # Data loading and preprocessing
│   ├── model_trainer.py   # Model training and evaluation
│   ├── retrain_model_modular.py  # Model retraining script
│   ├── model_RandF.sav    # Trained model file
│   └── Prediction.ipynb   # Jupyter notebook for model development
├── WebApp/                # Dash Web Application
│   ├── config/
│   │   ├── __init__.py
│   │   └── settings.py    # Application configuration
│   ├── data/
│   │   ├── __init__.py
│   │   └── data_loader.py # Data loading and preprocessing
│   ├── models/
│   │   ├── __init__.py
│   │   └── model_manager.py # Model loading and predictions
│   ├── visualizations/
│   │   ├── __init__.py
│   │   └── charts.py      # Chart creation functions
│   ├── components/
│   │   ├── __init__.py
│   │   ├── header.py      # Header component
│   │   ├── home_tab.py    # Home tab component
│   │   ├── dataset_tab.py # Dataset tab component
│   │   ├── dashboard_tab.py # Dashboard tab component
│   │   ├── ml_tab.py      # Machine Learning tab component
│   │   └── prediction_tab.py # Prediction tab component
│   ├── callbacks/
│   │   ├── __init__.py
│   │   └── prediction_callback.py # Prediction callback logic
│   ├── assets/            # Static assets (images, CSS)
│   ├── app.py             # Main application file
│   └── Visualization.ipynb # Visualization development notebook
├── Dataset/               # Data Files
│   ├── admission_predict_V1.2.csv
│   └── Admission_Predict_Ver1.1.csv
├── requirements.txt       # Python dependencies
├── .gitignore            # Git ignore file
└── README.md             # This file

Features

Interactive Predictions: Real-time admission probability calculation
Data Visualizations: Comprehensive charts and graphs
Model Performance: Feature importance and evaluation metrics
Responsive Design: Modern, user-friendly interface
Modular Architecture: Clean, maintainable, and scalable codebase
Error Handling: Comprehensive error handling and logging throughout
Type Safety: Full type hints and documentation for better code clarity
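
Before a prediction can be made, the form inputs have to be turned into a feature row in the order the model expects. The sketch below shows one way this could be done with type hints and explicit validation. The function name build_feature_row, the column order, and the lower bounds are assumptions; the upper bounds follow the Kaggle dataset description (GRE out of 340, TOEFL out of 120, ratings out of 5, CGPA out of 10, research 0 or 1).

```python
from typing import Dict, List, Tuple

# Assumed column order, following the Kaggle CSV header.
FEATURE_ORDER: List[str] = [
    "GRE Score", "TOEFL Score", "University Rating",
    "SOP", "LOR", "CGPA", "Research",
]

# (low, high) bounds per feature; lower bounds are illustrative assumptions.
VALID_RANGES: Dict[str, Tuple[float, float]] = {
    "GRE Score": (0, 340),
    "TOEFL Score": (0, 120),
    "University Rating": (1, 5),
    "SOP": (1, 5),
    "LOR": (1, 5),
    "CGPA": (0, 10),
    "Research": (0, 1),
}


def build_feature_row(applicant: Dict[str, float]) -> List[float]:
    """Validate the applicant's values and return them in model order."""
    row: List[float] = []
    for name in FEATURE_ORDER:
        if name not in applicant:
            raise ValueError(f"missing field: {name}")
        value = float(applicant[name])
        low, high = VALID_RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
        row.append(value)
    return row


applicant = {
    "GRE Score": 320, "TOEFL Score": 110, "University Rating": 4,
    "SOP": 4.5, "LOR": 4, "CGPA": 9.1, "Research": 1,
}
row = build_feature_row(applicant)
print(row)
```

A row built this way could then be passed to the loaded model as model.predict([row]).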

Implementation Approach

Dataset:

https://www.kaggle.com/mohansacharya/graduate-admissions?select=Admission_Predict_Ver1.1.csv

Algorithms:

Regression Models:
- DecisionTree
- Linear Regression
- RandomForest (Selected)
- KNeighbors
- SVM
- AdaBoost
- GradientBoosting
- Ridge
- BayesianRidge
- ElasticNet
- HuberRegressor
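
A comparison across candidate models like the ones listed above could be run with cross-validation. The sketch below is an illustration on synthetic data, not the project's evaluation code: the data, the subset of four models, the hyperparameters, and the choice of R² as the scoring metric are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data: 7 features, one continuous target.
rng = np.random.default_rng(42)
X = rng.random((150, 7))
y = 0.7 * X[:, 5] + 0.2 * X[:, 0] + 0.05 * rng.standard_normal(150)

# A subset of the listed algorithms, with illustrative settings.
candidates = {
    "DecisionTree": DecisionTreeRegressor(random_state=0),
    "LinearRegression": LinearRegression(),
    "RandomForest": RandomForestRegressor(n_estimators=50, random_state=0),
    "Ridge": Ridge(alpha=1.0),
}

# Mean 5-fold cross-validated R^2 for each candidate.
scores = {
    name: cross_val_score(estimator, X, y, cv=5, scoring="r2").mean()
    for name, estimator in candidates.items()
}

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>16}: {score:.3f}")
```

The ranking on synthetic data says nothing about the ranking on the real dataset; the point is only the shape of the comparison loop.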

Tools:

DASH/Plotly - Web framework and visualizations
scikit-learn - Machine learning algorithms
Pandas/Numpy - Data processing

Project Architecture:

Machine Learning Model: RandomForestRegressor (78.53% accuracy)
ML Module: Python with scikit-learn for model training and data processing
WebApp Module: Dash/Plotly web application with modular design
Model Persistence: Joblib for model saving/loading
Code Organization: Modular architecture with separation of concerns
Error Handling: Comprehensive logging and error management

Visualization Features

The web application is organized into five tabs: Home, Dataset, Dashboard, Machine Learning, and Prediction.

Model Performance

Feature Importance (Latest Model):

CGPA: 75.97% (Most important factor)
GRE Score: 10.27%
TOEFL Score: 4.48%
Statement of Purpose: 3.19%
Letter of Recommendation: 2.92%
University Rating: 1.67%
Research Experience: 1.51%
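
Percentages like these can be read from a fitted random forest through its feature_importances_ attribute, which scikit-learn normalizes to sum to 1 (the values above add up to roughly 100%). The sketch below uses synthetic data in which the CGPA column is deliberately made dominant, so the printed numbers are not the project's results. The feature names and their order are assumptions based on the dataset's columns.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Assumed feature names and order, following the dataset's columns.
feature_names = [
    "GRE Score", "TOEFL Score", "University Rating",
    "SOP", "LOR", "CGPA", "Research",
]

# Synthetic data where column 5 (CGPA) drives most of the target.
rng = np.random.default_rng(1)
X = rng.random((200, len(feature_names)))
y = 0.8 * X[:, 5] + 0.1 * X[:, 0] + 0.02 * rng.standard_normal(200)

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Impurity-based importances, one value per feature, summing to 1.
importances = dict(zip(feature_names, model.feature_importances_))

for name, value in sorted(importances.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.2%}")
```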

Model Accuracy:

RandomForest Regressor: 78.53%
Test Set Performance: Consistent and reliable predictions
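
This README does not say which metric the 78.53% refers to. For a regressor in scikit-learn, the score method returns the coefficient of determination (R²), so the sketch below assumes that is what is being reported. The sample values are made up for illustration and are not taken from the project.

```python
from typing import Sequence


def r2_score_manual(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up admission chances and predictions, for illustration only.
y_true = [0.92, 0.76, 0.72, 0.80, 0.65]
y_pred = [0.90, 0.78, 0.70, 0.75, 0.70]

score = r2_score_manual(y_true, y_pred)
print(f"R^2 = {score:.4f}")
```

Under this reading, a score of 0.7853 would mean the model explains about 78.5% of the variance in the admission chance on the evaluation data, which is different from classification accuracy.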

Troubleshooting

Common Issues and Solutions:

Model Loading Error:
- Run python ML/retrain_model_modular.py to create a compatible model
Import Errors:
- Ensure you're using the virtual environment
- Install dependencies with pip install -r requirements.txt
Port Already in Use:
- Change the port in app.py or kill the existing process
Dash-DAQ Import Issues:
- Ensure dash-daq is installed: pip install dash-daq
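
For the "Port Already in Use" case, a small standard-library check can tell whether the default port 8050 is taken and suggest a free one to configure in app.py. This is a sketch under assumptions: the helper names port_is_free and first_free_port are hypothetical and are not part of the project.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def first_free_port(start: int = 8050, attempts: int = 20,
                    host: str = "127.0.0.1") -> int:
    """Scan upward from `start` and return the first port that is free."""
    for port in range(start, start + attempts):
        if port_is_free(port, host):
            return port
    raise RuntimeError(f"no free port in {start}..{start + attempts - 1}")


if port_is_free(8050):
    print("Port 8050 is available")
else:
    print("Port 8050 is busy; try", first_free_port(8051))
```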

Deployment

Local Development:

cd WebApp
python app.py

Production Deployment (Heroku):

Create Procfile with: web: gunicorn app:server
Ensure requirements.txt is up to date
Deploy using Heroku CLI or GitHub integration

Project Links

GitHub Repository: https://github.com/LameesKadhim/SAP-project
Live Demo: https://predict-student-admission.herokuapp.com/
Video Trailer: https://youtu.be/rXDHiqIxYuQ

Code Quality and Standards

✅ Modular Design Benefits

Maintainability: Each component is isolated and easy to modify
Testability: Individual modules can be tested independently
Scalability: Easy to add new features or components
Reusability: Components can be reused across different parts
Error Handling: Comprehensive error handling and logging
Documentation: Full type hints and docstrings

✅ Development Best Practices

Separation of Concerns: Each module has a single responsibility
Factory Pattern: Clean initialization patterns for components
Configuration Management: Centralized settings management
Error Logging: Comprehensive logging for debugging
Type Safety: Full type annotations for better code clarity
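
Centralized configuration is often implemented as a single typed settings object that the rest of the application imports. The sketch below shows that idea with a frozen dataclass and optional environment overrides. It is not the contents of WebApp/config/settings.py: the class name AppSettings, the function load_settings, and the SAP_* variable names are hypothetical. The default host and port match the URL in the Quick Start Guide, and the model path matches the Project Structure listing.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AppSettings:
    """Hypothetical application settings, immutable once created."""
    host: str = "127.0.0.1"
    port: int = 8050
    debug: bool = False
    model_path: str = os.path.join("ML", "model_RandF.sav")


def load_settings(env: Optional[Mapping[str, str]] = None) -> AppSettings:
    """Build settings from defaults, overridden by SAP_* variables if set."""
    env = os.environ if env is None else env
    return AppSettings(
        host=env.get("SAP_HOST", AppSettings.host),
        port=int(env.get("SAP_PORT", AppSettings.port)),
        debug=env.get("SAP_DEBUG", "0") == "1",
        model_path=env.get("SAP_MODEL_PATH", AppSettings.model_path),
    )


# Example: override only the port, keep every other default.
settings = load_settings({"SAP_PORT": "8051"})
print(settings)
```

Freezing the dataclass means a module cannot change a setting by accident at runtime; any change has to go through load_settings.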

License

This project is part of the Learning Analysis course (WS20/21) by the Datology Group.

Note: This project is compatible with modern Python and Dash versions and uses a modular architecture that follows industry best practices.
