A machine learning pipeline to classify IMDB reviews into positive or negative sentiment using TF-IDF and Logistic Regression.

ahsankhizar5/text-sentiment-analysis

🎭 Text Sentiment Analysis

A machine learning pipeline to classify IMDB movie reviews as positive or negative using NLP preprocessing, TF-IDF vectorization, and a Logistic Regression model.


🧠 Overview

This project builds a text sentiment analysis model using the IMDB Reviews Dataset. The pipeline involves preprocessing, vectorization, training, evaluation, and live prediction.
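To make the vectorization step concrete, here is a minimal standard-library sketch of how TF-IDF weights can be computed. It uses the smoothed IDF and L2 normalization that scikit-learn applies by default; the function name `tfidf` and the whitespace tokenizer are assumptions made for illustration, not code from this repository.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {term: weight} dict per document (illustrative helper)."""
    # Naive whitespace tokenization; the real pipeline cleans text first.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    idf = {
        term: math.log((1 + n_docs) / (1 + df)) + 1
        for term, df in doc_freq.items()
    }

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights = {term: count * idf[term] for term, count in counts.items()}
        # L2-normalize so every non-empty document vector has length 1.
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({term: w / norm for term, w in weights.items()})
    return vectors
```

A term that appears in every review (such as "movie") receives a lower weight than a term that appears in only a few, which is what lets the classifier focus on words that discriminate between sentiments.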


📁 Dataset

This project uses the IMDB Reviews Dataset:

➡️ Download from Kaggle

Steps:

  1. Download and extract the dataset.
  2. Make sure the file is named IMDB_Dataset.csv (rename it if necessary).
  3. Place it in the project root directory.

⚠️ The dataset is not included in the repo due to GitHub file size limits.
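Before running the script, it can help to confirm that the file is in place and looks as expected. The sketch below is a hypothetical helper, not part of the repository; it assumes the CSV has `review` and `sentiment` columns, as the Kaggle IMDB 50K dataset does, so adjust `EXPECTED_COLUMNS` if your copy differs.

```python
import csv
from pathlib import Path

# Assumption: column names used by the Kaggle IMDB 50K reviews CSV.
EXPECTED_COLUMNS = {"review", "sentiment"}


def check_dataset(path="IMDB_Dataset.csv"):
    """Return the CSV header, or raise if the file is missing or malformed."""
    csv_path = Path(path)
    if not csv_path.is_file():
        raise FileNotFoundError(f"{csv_path} not found in the project root")

    # Read only the first row; the full file is large.
    with csv_path.open(newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), [])

    missing = EXPECTED_COLUMNS - set(header)
    if missing:
        raise ValueError(f"missing expected columns: {sorted(missing)}")
    return header
```

Calling `check_dataset()` from the project root either returns the header row or fails with a message that says what to fix.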


🛠️ Tech Stack

  • Python 3
  • Pandas
  • Scikit-learn
  • NLTK
  • TF-IDF Vectorizer
  • Logistic Regression

🚀 Features

  • Clean and normalize text using NLP techniques
  • Convert reviews into numerical features using TF-IDF
  • Train and evaluate a logistic regression model
  • Save trained model and vectorizer for reuse
  • Predict sentiment of custom reviews in real-time
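The features above can be sketched end to end with scikit-learn. This is a minimal illustration under stated assumptions, not the contents of `sentiment_analysis.py`: the function names, the regex-based cleaning (the project itself uses NLTK), and the toy reviews are all hypothetical. Only the artifact file names come from the project structure.

```python
import pickle
import re
from pathlib import Path

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy data for illustration only; the real script trains on IMDB_Dataset.csv.
TOY_REVIEWS = [
    "I loved this movie, it was great",
    "A wonderful, great film",
    "Great acting and a wonderful story",
    "I hated this movie, it was terrible",
    "A boring, terrible film",
    "Terrible acting and a boring story",
]
TOY_LABELS = ["positive"] * 3 + ["negative"] * 3


def clean_text(text):
    """Lowercase, drop HTML tags such as <br />, keep letters and spaces."""
    text = re.sub(r"<[^>]+>", " ", text.lower())
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def train(reviews, labels):
    """Fit a TF-IDF vectorizer and a logistic regression classifier."""
    vectorizer = TfidfVectorizer()
    features = vectorizer.fit_transform([clean_text(r) for r in reviews])
    model = LogisticRegression(max_iter=1000)
    model.fit(features, labels)
    return vectorizer, model


def predict(vectorizer, model, review):
    """Return the predicted label for a single raw review string."""
    features = vectorizer.transform([clean_text(review)])
    return str(model.predict(features)[0])


def save_artifacts(vectorizer, model, directory="."):
    """Pickle both objects under the file names shown in the project tree."""
    out = Path(directory)
    (out / "tfidf_vectorizer.pkl").write_bytes(pickle.dumps(vectorizer))
    (out / "sentiment_model.pkl").write_bytes(pickle.dumps(model))
```

The vectorizer is saved alongside the model because a new review must be transformed with the same vocabulary and IDF weights that were learned during training.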

📊 Results

| Metric   | Score  |
| -------- | ------ |
| Accuracy | 85.13% |
| F1-Score | 85%    |
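For reference, this is how accuracy and F1 are defined for a binary classifier. The helper below is an illustrative standard-library sketch, not the project's evaluation code. It computes F1 for a single positive class; the README does not say which averaging produced the reported 85%, so treat that choice as an assumption.

```python
def accuracy_and_f1(y_true, y_pred, positive="positive"):
    """Return (accuracy, F1 for the `positive` class) for two label lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or seen.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    return accuracy, 2 * precision * recall / (precision + recall)
```

Accuracy counts every correct prediction equally, while F1 balances precision and recall, so the two can diverge when the classes are imbalanced.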

📦 Project Structure


text-sentiment-analysis/
├── IMDB_Dataset.csv
├── sentiment_analysis.py
├── sentiment_model.pkl
├── tfidf_vectorizer.pkl
└── README.md


▶️ Getting Started

  1. Clone the repo

    git clone https://github.com/ahsankhizar5/text-sentiment-analysis.git
    cd text-sentiment-analysis
  2. Install dependencies

    pip install -r requirements.txt
  3. Run the script

    python sentiment_analysis.py
  4. Enter your own review for live prediction!


📌 Example

📝 Try your own review:
Enter a movie review: this seems to be bad one
Predicted Sentiment: Negative 😞

📑 License

MIT License


🤝 Contact

For queries or collaboration, feel free to reach out to Ahsan Khizar via GitHub or LinkedIn.

“Code is not just about solving problems. It's about building trust, clarity, and real-world impact, one line at a time.” - Ahsan Khizar
