This project implements a spam detection model using a two-layer deep neural network. The model classifies emails as either spam or not spam using a dataset of email messages.
The dataset used in this project is spam_or_not_spam.csv, which contains email texts and their corresponding labels:
- 0: Not spam
- 1: Spam
To run this project, install the required dependencies using:
pip install numpy pandas scikit-learn matplotlib
- Load the dataset and handle missing values.
- Balance the dataset to avoid class imbalance issues.
- Convert email texts into numerical representations using CountVectorizer.
- Apply word embeddings and flatten the data for neural network input.
- Split the data into training and testing sets.
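The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names `email` and `label`, the `preprocess` helper, the downsampling strategy, and the 80/20 split are all assumptions, and a small in-memory frame stands in for the CSV so the snippet runs on its own. The word-embedding step is left out because its details are not specified here.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split


def preprocess(df, text_col="email", label_col="label", seed=0):
    # Column names are assumptions; adjust them to match the CSV header.
    df = df.dropna(subset=[text_col, label_col])

    # Balance by downsampling every class to the size of the smallest one
    # (one possible strategy; the notebook may balance differently).
    n_min = df[label_col].value_counts().min()
    balanced = pd.concat(
        [group.sample(n=n_min, random_state=seed) for _, group in df.groupby(label_col)]
    )

    # Bag-of-words counts: one column per vocabulary term.
    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(balanced[text_col]).toarray()
    y = balanced[label_col].to_numpy()

    # Stratified 80/20 split keeps both classes in each partition.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y
    )
    return X_train, X_test, y_train, y_test, vectorizer


# Toy stand-in for pd.read_csv("spam_or_not_spam.csv").
toy = pd.DataFrame({
    "email": [
        "win money now", "free prize win", None, "cheap pills free",
        "claim free money", "urgent prize claim", "win cash today",
        "free offer now", "meeting at noon", "lunch tomorrow",
        "project update", "call me later", "see you soon",
    ],
    "label": [1] * 8 + [0] * 5,
})
X_train, X_test, y_train, y_test, vectorizer = preprocess(toy)
```

With the real dataset, replace the toy frame with the loaded CSV and pass the matching column names.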
- The model is a two-layer neural network.
- It takes in the preprocessed feature vectors and outputs binary classifications (spam or not spam).
- The training set is used to optimize model parameters.
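As a rough illustration of what a two-layer network for binary classification can look like, here is a small NumPy sketch. Everything in it is an assumption rather than the notebook's implementation: the hidden size, the ReLU and sigmoid activations, the binary cross-entropy loss, plain full-batch gradient descent, and the helper names `train_two_layer` and `predict`.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_two_layer(X, y, hidden=16, lr=0.5, epochs=500, seed=0):
    # Hypothetical hyperparameters; tune them for the real data.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, 1))
    b2 = np.zeros(1)
    y = y.reshape(-1, 1).astype(float)

    for _ in range(epochs):
        # Forward pass: ReLU hidden layer, sigmoid output.
        h = np.maximum(0.0, X @ W1 + b1)
        p = sigmoid(h @ W2 + b2)

        # Backward pass: for mean binary cross-entropy with a sigmoid
        # output, the gradient w.r.t. the output logits is (p - y) / n.
        dz2 = (p - y) / n
        dW2 = h.T @ dz2
        db2 = dz2.sum(axis=0)
        dz1 = (dz2 @ W2.T) * (h > 0)
        dW1 = X.T @ dz1
        db1 = dz1.sum(axis=0)

        # Gradient descent update.
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2

    return W1, b1, W2, b2


def predict(params, X):
    W1, b1, W2, b2 = params
    h = np.maximum(0.0, X @ W1 + b1)
    # Threshold the spam probability at 0.5: 1 = spam, 0 = not spam.
    return (sigmoid(h @ W2 + b2) >= 0.5).astype(int).ravel()


# Tiny separable example standing in for the vectorized emails.
X_demo = np.array([[2.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 3.0]])
y_demo = np.array([1, 1, 0, 0])
params = train_two_layer(X_demo, y_demo)
preds = predict(params, X_demo)
```

On the real data, `X_demo` and `y_demo` would be replaced by the training features and labels produced during preprocessing.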
The model is evaluated using:
- Accuracy
- Precision
- Recall
- F1-score
- Confusion Matrix
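These metrics are all available in scikit-learn. The sketch below computes them on made-up labels and predictions (the `y_true` and `y_pred` values are illustrative only, not results from this project), treating 1 (spam) as the positive class.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Illustrative values only; use the test labels and model predictions instead.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]

metrics = {
    "accuracy": accuracy_score(y_true, y_pred),
    "precision": precision_score(y_true, y_pred),  # of predicted spam, how much is spam
    "recall": recall_score(y_true, y_pred),        # of actual spam, how much was caught
    "f1": f1_score(y_true, y_pred),
}

# Rows are true classes, columns are predicted classes, in label order [0, 1].
cm = confusion_matrix(y_true, y_pred)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
print(cm)
```

For spam filtering, precision and recall are usually worth reading separately: low precision means legitimate mail is being flagged, while low recall means spam is getting through.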
Run the Jupyter Notebook to execute the steps:
jupyter notebook spam_detection_nn.ipynb
The model's performance is analyzed using metrics and visualizations to assess its effectiveness in detecting spam emails.