
The Spam SMS/Email Classifier is an NLP-based machine learning model designed to automatically detect whether a given SMS or Email is spam (unsolicited message) or ham (legitimate message). It processes textual content using NLP techniques to clean and extract meaningful features, then applies machine learning algorithms to classify the message.


DevWaqarAhmad/Spam-SMS-Email-Classifier


📧 Spam SMS/Email Classifier (with Streamlit App)

A machine learning classifier that detects whether a given SMS or email is spam. It uses Logistic Regression on TF-IDF features and is built with Python, scikit-learn, pandas, and Streamlit.



📂 Repository Structure

Spam-SMS-Email-Classifier/
│
├── data/
│   └── spam.csv                  # Raw dataset (SMS labeled as ham/spam)
│
├── pkl/
│   ├── model.pkl                 # Trained ML model (Logistic Regression)
│   └── vectorizer.pkl            # TF-IDF vectorizer
│
├── app.py                        # Streamlit app for message prediction
├── spam_classifier_notebook.ipynb  # Jupyter notebook with full model pipeline
├── README.md                     # Project documentation
├── requirements.txt              # Python dependencies
└── .gitignore

📊 Features

  • Detects whether an input message is Spam or Not Spam
  • Uses Logistic Regression (can be swapped for another classifier)
  • Data preprocessing & text cleaning
  • Spam class balancing using upsampling
  • TF-IDF feature extraction (top 5000 features)
  • User-friendly Streamlit Web App
  • Ready for deployment (Render/HuggingFace)
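The README does not spell out which cleaning steps are applied, so the following is only a minimal sketch of what a text-cleaning pass before TF-IDF could look like. The function name `clean_text` and the specific rules (lowercasing, dropping URLs, digits, and punctuation) are assumptions for illustration, not the notebook's actual code.

```python
import re
import string


def clean_text(message: str) -> str:
    """Lowercase a message, drop URLs, digits and punctuation, collapse whitespace.

    Hypothetical helper: the real notebook may clean text differently.
    """
    text = message.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove URLs
    text = re.sub(r"\d+", " ", text)                     # remove digit runs
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()             # collapse whitespace


print(clean_text("Claim your FREE $5000 loan now!!! Visit http://example.com"))
# prints: claim your free loan now visit
```

Whether to drop digits is a judgment call: amounts and phone numbers can themselves be spam signals, so it may be worth comparing accuracy with and without that step.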

📁 Dataset

  • Dataset: spam.csv
  • Classes:
    • ham: Legitimate messages
    • spam: Unsolicited messages such as fraud, loan, and prize offers

🚀 How to Run Locally

1. Clone the repository

git clone https://github.com/DevWaqarAhmad/Spam-SMS-Email-Classifier.git
cd Spam-SMS-Email-Classifier

2. Create a virtual environment

python -m venv env

3. Activate the environment

  • On Windows:
env\Scripts\activate
  • On macOS/Linux:
source env/bin/activate

4. Install dependencies

pip install -r requirements.txt

5. Run Streamlit app

streamlit run app.py
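For orientation, here is a rough sketch of the prediction logic that `app.py` presumably wraps: load the two pickles from `pkl/` and turn the model output into a label. The helper names `load_pickle` and `predict_label`, and the assumption that the model encodes spam as `1`, are illustrative guesses rather than the repository's actual code.

```python
import pickle
from pathlib import Path

# Paths taken from the repository structure in this README.
MODEL_PATH = Path("pkl/model.pkl")
VECTORIZER_PATH = Path("pkl/vectorizer.pkl")


def load_pickle(path):
    """Load a pickled object (model or vectorizer) from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


def predict_label(message, model, vectorizer):
    """Return 'Spam' or 'Not Spam' for a single message.

    Assumes the classifier was trained with spam encoded as 1 (an assumption;
    check the notebook for the real label encoding).
    """
    features = vectorizer.transform([message])  # 1 x n_features matrix
    prediction = model.predict(features)[0]
    return "Spam" if prediction == 1 else "Not Spam"
```

In a Streamlit script, the message would typically come from `st.text_area`, and the result of `predict_label(message, load_pickle(MODEL_PATH), load_pickle(VECTORIZER_PATH))` would be displayed after an `st.button` click. Only unpickle files you trust, since loading a pickle can execute arbitrary code.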

🧠 Model Details

  • Vectorizer: TfidfVectorizer with max_features=5000
  • Classifier: LogisticRegression with max_iter=1000
  • Class balancing: spam class upsampled with sklearn.utils.resample
  • Reported accuracy: ~97-98%
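The settings above can be sketched end to end as follows. This is a minimal illustration, not the notebook's code: the tiny inline DataFrame stands in for `data/spam.csv`, and the column names `label` and `text` as well as `random_state=42` are assumptions.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.utils import resample

# Tiny stand-in for data/spam.csv; column names are assumptions.
df = pd.DataFrame({
    "label": ["ham"] * 6 + ["spam"] * 2,
    "text": [
        "Hey, meeting is at 3 PM today",
        "Can you call me when you get home?",
        "See you at lunch tomorrow",
        "Thanks for sending the report",
        "Happy birthday, have a great day",
        "I'll be late, start without me",
        "Claim your free loan now, no documents required",
        "You won a free prize, claim it today",
    ],
})

ham = df[df["label"] == "ham"]
spam = df[df["label"] == "spam"]

# Upsample the minority (spam) class with replacement to match the ham count.
spam_upsampled = resample(spam, replace=True, n_samples=len(ham), random_state=42)
balanced = pd.concat([ham, spam_upsampled])

# TF-IDF features capped at 5000 terms, as listed in the model details.
vectorizer = TfidfVectorizer(max_features=5000)
X = vectorizer.fit_transform(balanced["text"])
y = (balanced["label"] == "spam").astype(int)  # 1 = spam, 0 = ham

model = LogisticRegression(max_iter=1000)
model.fit(X, y)

prediction = model.predict(vectorizer.transform(["Free prize! Claim your loan now"]))
print(prediction)
```

When measuring accuracy on real data, it is safer to split off the test set before upsampling; otherwise duplicated spam rows can land in both the training and test sets and inflate the score.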

🧪 Sample Predictions

| Message | Prediction |
|---|---|
| "Claim your free $5000 loan now. No documents required!" | Spam |
| "Hey, meeting is at 3 PM today. Don't be late." | Not Spam |

🖼️ Screenshots

| ✅ Not Spam Example | 🚫 Spam Example |
|---|---|
| [Screenshot: Not Spam] | [Screenshot: Spam] |

These screenshots show how the Streamlit app classifies input text messages.

🌍 Live Deployment (Coming Soon)

The deployed app link will be added here once the app is hosted on Render/HuggingFace.


👨‍💻 Author

Waqar Ahmad
📧 Email: devwaqarahmad@gmail.com
🌐 GitHub: DevWaqarAhmad


📃 License

This project is licensed under the MIT License.


🤝 Contributing

Pull requests are welcome! For major changes, please open an issue first to discuss what you would like to change.

