- Demo
- Overview
- Motivation
- Problem Solving Steps
- Source of Dataset
- Data Cleaning Techniques
- Exploratory Data Analysis
- Model Building
- Model Performance
- Deployment
- Future scope of project
campus_placement.mp4
This project builds a model that predicts whether a student will get a job placement after graduating, based on his/her academic performance, work experience, projects, etc.
Student placement is one of the most important objectives of an educational institution. An institution’s reputation and yearly admissions invariably depend on the placements it provides its students. That is why institutions strive to strengthen their placement departments and so improve the institution as a whole. Any assistance in this area will have a positive impact on an institution’s ability to place its students, which benefits both the students and the institution.
- Load the data into a dataframe
- Perform data preprocessing such as handling missing values, feature creation, etc.
- Perform Exploratory Data Analysis and get valuable insights from the data
- Perform feature selection and select the algorithm that best fits the data
- Save the model in a pickle file and integrate the model with the UI, which is built using Flask
- Deploy the model web app on a cloud platform
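The step of saving the model and wiring it into Flask can be sketched roughly as follows. This is a minimal illustration, not the project's actual app: the stand-in model, the `placement_model.pkl` file name, the `/predict` route and the `degree_p` field are all assumptions, and a JSON endpoint is used here in place of the real UI.

```python
import pickle
import tempfile
from pathlib import Path

from flask import Flask, jsonify, request
from sklearn.linear_model import LogisticRegression

# Stand-in model trained on one hypothetical feature (degree percentage).
model = LogisticRegression().fit([[40.0], [50.0], [70.0], [80.0]], [0, 0, 1, 1])

# Save the fitted model to a pickle file (hypothetical file name).
model_path = Path(tempfile.gettempdir()) / "placement_model.pkl"
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# When the web app starts, load the model back from disk.
with open(model_path, "rb") as f:
    loaded_model = pickle.load(f)

app = Flask(__name__)


@app.route("/predict", methods=["POST"])
def predict():
    # Expects JSON such as {"degree_p": 72.5}; the field name is an assumption.
    payload = request.get_json()
    features = [[float(payload["degree_p"])]]
    prediction = int(loaded_model.predict(features)[0])
    return jsonify({"placed": bool(prediction)})
```

Calling `app.run()` would start a local development server. Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.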
This data set consists of placement data for students at XYZ campus. It includes secondary and higher secondary school percentages and specializations. It also includes degree specialization and type, work experience, and the salary offered to placed students.
I would like to thank Dr. Dhimant Ganatara, Professor at Jain University, for helping students by providing this data.
Data set can be found here
The salary column was the only column with missing values, which were filled with the mean salary. A visual comparison of the mean, mode and median salary showed that the mean represented the salary distribution best.
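A small sketch of this kind of mean imputation with pandas is shown here. The toy rows and the `status` and `salary` column names are assumptions for illustration, not the real data set.

```python
import pandas as pd

# Toy stand-in for the placement data; the column names are assumptions.
df = pd.DataFrame({
    "status": ["Placed", "Placed", "Not Placed", "Placed"],
    "salary": [250000.0, 300000.0, None, 350000.0],
})

# Candidate fill values that could be compared before choosing one.
candidates = {
    "mean": df["salary"].mean(),
    "median": df["salary"].median(),
    "mode": df["salary"].mode().iloc[0],
}

# Fill the missing salaries with the mean salary.
df["salary"] = df["salary"].fillna(candidates["mean"])
```

On the real data, the three candidate values could be drawn on top of a histogram of `salary` to make the comparison visually.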
Extensive data analysis was carried out during this project to help understand the data and answer important questions such as:
- Top 5 earners from each degree course
- Maximum and minimum salary
- Maximum number of students placed from each department
- Which department gets the most placements
- Percentage of Male and Female Placement
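Questions like the ones above can be answered with a few pandas group-by operations. The sketch below uses made-up rows, and the column names (`gender`, `degree_t`, `status`, `salary`) are assumptions rather than a description of the project's notebook.

```python
import pandas as pd

# Made-up rows; column names are assumptions for illustration.
df = pd.DataFrame({
    "gender": ["M", "F", "M", "F", "M", "F"],
    "degree_t": ["Sci&Tech", "Sci&Tech", "Comm&Mgmt",
                 "Comm&Mgmt", "Comm&Mgmt", "Sci&Tech"],
    "status": ["Placed", "Placed", "Not Placed",
               "Placed", "Placed", "Not Placed"],
    "salary": [300000.0, 280000.0, None, 250000.0, 400000.0, None],
})

placed = df[df["status"] == "Placed"]

# Top 5 earners within each degree course.
top_earners = (
    placed.sort_values("salary", ascending=False)
          .groupby("degree_t")
          .head(5)
)

# Maximum and minimum salary among placed students.
max_salary = placed["salary"].max()
min_salary = placed["salary"].min()

# Number of placed students per department.
placed_per_dept = placed["degree_t"].value_counts()

# Placement rate (in percent) within each gender, one possible reading
# of "percentage of male and female placement".
placement_rate = (
    df.assign(is_placed=df["status"].eq("Placed"))
      .groupby("gender")["is_placed"]
      .mean()
      .mul(100)
)
```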
Feature selection was carried out using ExtraTreesClassifier feature importance and mutual_info_classif. The aim is to find the features that contribute the most to predicting the target.
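As a rough illustration, both scores can be computed with scikit-learn and compared side by side. The synthetic data, the feature names and the hyperparameters below are assumptions chosen for the example, not the project's settings.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import mutual_info_classif

# Synthetic stand-in data; the feature names are assumptions.
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "ssc_p": rng.uniform(40, 90, n),
    "degree_p": rng.uniform(50, 90, n),
    "noise": rng.normal(size=n),
})
# The target depends only on the two percentage columns.
y = ((X["ssc_p"] + X["degree_p"]) > 135).astype(int)

# Impurity-based importances from an ExtraTreesClassifier.
model = ExtraTreesClassifier(n_estimators=200, random_state=0).fit(X, y)
tree_importance = pd.Series(model.feature_importances_, index=X.columns)

# Mutual information between each feature and the target.
mi_scores = pd.Series(
    mutual_info_classif(X, y, random_state=0), index=X.columns
)

# Side-by-side ranking, highest tree importance first.
ranking = pd.DataFrame(
    {"extra_trees": tree_importance, "mutual_info": mi_scores}
).sort_values("extra_trees", ascending=False)
```

Features that score low on both measures are natural candidates to drop before model building.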
Four models were built:
- RandomForestClassifier
- Logistic Regression
- DecisionTreeClassifier
- SVM
RandomForestClassifier was the best performing model, with an accuracy of over 88%.
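A comparison of this kind might look like the sketch below, which trains the four model types on a held-out split and records their accuracy. It uses synthetic data from `make_classification` and default hyperparameters, both of which are assumptions, so the scores it produces say nothing about the results reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the prepared placement features and target.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scaling is added for the two models that are sensitive to feature scale.
models = {
    "RandomForestClassifier": RandomForestClassifier(random_state=42),
    "LogisticRegression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=42),
    "SVM": make_pipeline(StandardScaler(), SVC()),
}

# Fit each model and score it on the held-out test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

best_name = max(scores, key=scores.get)
```

With a small data set, cross-validation (for example `cross_val_score`) would give a more stable comparison than a single split.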
The model was deployed on the Render cloud platform.
Get more data to improve the accuracy and performance of the model.