Emotion Detection: A Comparative Analysis

A comprehensive analysis of Machine Learning and Transformer models for multi-label emotion detection on the GoEmotions dataset.

This repository contains the code and resources for the project "Emotion Detection: A Comparative Analysis," which evaluates classic machine learning models against modern Transformer architectures. The key result is a fine-tuned RoBERTa model that reaches a weighted-average recall of 0.66, making it highly sensitive to the emotions present in a text.

🚀 Features

Comprehensive Analysis: Compares 4 different models (Logistic Regression, Random Forest, DistilBERT, and RoBERTa).
Multi-Label Classification: Tackles the complex task of predicting multiple emotions for a single piece of text.
In-Depth EDA: Includes detailed Exploratory Data Analysis on the GoEmotions dataset, highlighting the severe class imbalance.
High-Recall Model: The fine-tuned RoBERTa model achieves a weighted average recall of 0.66, demonstrating high sensitivity.
Code: All code is provided in easy-to-follow Jupyter/Colab notebooks.

📖 Methodology

The project workflow is divided into two parallel approaches:

Classic Machine Learning Baseline:
- Text is vectorized using TF-IDF.
- Logistic Regression and Random Forest models are trained using a MultiOutputClassifier.
Transformer Fine-Tuning:
- Text is tokenized using specific tokenizers for DistilBERT and RoBERTa.
- The pre-trained models are fine-tuned on the GoEmotions dataset using PyTorch and Hugging Face.
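
As a rough illustration of the classic baseline described above, the sketch below feeds TF-IDF features into a MultiOutputClassifier that wraps Logistic Regression, using scikit-learn. The toy texts, the three label names, and `max_iter=1000` are assumptions made for this example and are not taken from the notebooks.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import make_pipeline

# Toy stand-in for GoEmotions. Each row of y is a multi-hot vector over
# three hypothetical labels: [joy, anger, sadness].
texts = [
    "I love this so much",
    "this makes me furious",
    "I am so sad today",
    "love it but also sad it ended",
    "furious and sad about the news",
    "what a lovely happy day",
]
y = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
    [0, 1, 1],
    [1, 0, 0],
])

# MultiOutputClassifier fits one independent binary classifier per label.
baseline = make_pipeline(
    TfidfVectorizer(),
    MultiOutputClassifier(LogisticRegression(max_iter=1000)),
)
baseline.fit(texts, y)

# The prediction is again a multi-hot row, one 0/1 entry per label.
pred = baseline.predict(["so sad and furious"])
print(pred.shape)
```

Replacing `LogisticRegression` with `RandomForestClassifier` from `sklearn.ensemble` would give the second baseline under the same wrapper.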

📊 Results

The Transformer models outperformed the classic machine learning baselines. The fine-tuned RoBERTa model performed best, most notably in recall (weighted average of 0.66), which indicates that it identifies most of the emotions present in a text.
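
For readers unfamiliar with the metric, here is a minimal sketch of how a weighted-average recall can be computed for multi-label predictions. The function name and the toy labels are hypothetical; the project's notebooks may rely on a library routine instead.

```python
def weighted_recall(y_true, y_pred):
    """Support-weighted mean of per-label recall for multi-hot rows."""
    n_labels = len(y_true[0])
    weighted_sum = 0.0
    total_support = 0
    for j in range(n_labels):
        # Support = number of examples that truly carry label j.
        support = sum(row[j] for row in y_true)
        if support == 0:
            continue  # a label with no positives contributes no weight
        true_pos = sum(
            1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1
        )
        weighted_sum += support * (true_pos / support)
        total_support += support
    return weighted_sum / total_support if total_support else 0.0


# Toy data: label 0 is frequent, labels 1 and 2 are rare.
y_true = [[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 0, 0]]
y_pred = [[1, 0, 0], [1, 0, 0], [1, 0, 0], [0, 0, 0]]
score = weighted_recall(y_true, y_pred)
print(score)
```

Because each label's recall is weighted by its support, the score reduces to total true positives over total positives, so frequent labels dominate it. In the toy data the two rare labels are never predicted, yet the score is still 0.5.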

A key finding was the precision-recall trade-off, a direct consequence of the dataset's class imbalance. While our model excels at finding emotions (high recall), it sometimes over-predicts, leading to lower precision.
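
One place this trade-off commonly shows up is the decision threshold applied to each label's predicted probability. The sketch below uses made-up probabilities to show how a lower threshold raises recall while admitting more false positives. Whether the notebooks tune a threshold is not stated here, so treat this as an illustration of the effect and not as the project's procedure.

```python
def precision_recall(y_true, probs, threshold):
    """Micro-averaged precision and recall at a given probability threshold."""
    tp = fp = fn = 0
    for true_row, prob_row in zip(y_true, probs):
        for t, p in zip(true_row, prob_row):
            predicted = p >= threshold
            if predicted and t:
                tp += 1
            elif predicted and not t:
                fp += 1
            elif t:
                fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical per-label probabilities for three texts and three labels.
y_true = [[1, 0, 0], [0, 1, 0], [1, 0, 1]]
probs = [[0.9, 0.4, 0.2], [0.3, 0.6, 0.35], [0.7, 0.1, 0.45]]

strict = precision_recall(y_true, probs, 0.5)  # fewer, safer predictions
loose = precision_recall(y_true, probs, 0.3)   # more predictions per text
print(strict, loose)
```

At the stricter threshold every predicted label is correct but one true label is missed; at the looser threshold every true label is found at the cost of several false positives.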

🛠️ How to Use

To run this project, follow these steps:

Clone the repository (Cyber-Mood/Emotion-Detection-Project on GitHub).
Install the dependencies:
```
pip install -r requirements.txt
```
Download the dataset: The GoEmotions.csv dataset is available from Kaggle or from Google's GitHub repository.
Run the notebooks: Open the files in the notebooks/ directory using Jupyter Notebook, JupyterLab, or Google Colab.
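
Before opening the notebooks, a quick look at the label distribution can confirm the class imbalance mentioned above. The sketch below counts positives per label with the standard library only. The column names (`text`, `joy`, `anger`, `grief`) and the one-0/1-column-per-emotion layout are assumptions for illustration; check the header of your downloaded file before reusing it.

```python
import csv
import io
from collections import Counter


def label_counts(csv_file, label_columns):
    """Count how many rows are marked 1 for each label column."""
    counts = Counter({label: 0 for label in label_columns})
    for row in csv.DictReader(csv_file):
        for label in label_columns:
            counts[label] += int(row[label])
    return counts


# In-memory stand-in for the real file, with a hypothetical layout.
sample = io.StringIO(
    "text,joy,anger,grief\n"
    "what a great day,1,0,0\n"
    "this is outrageous,0,1,0\n"
    "so happy for you,1,0,0\n"
    "happy but annoyed,1,1,0\n"
)
counts = label_counts(sample, ["joy", "anger", "grief"])
for label, n in counts.most_common():
    print(label, n)
```

For the real dataset, pass a file opened with `open(path, newline="", encoding="utf-8")` in place of the `io.StringIO` object, along with the actual emotion column names.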

👥 Authors

Mahmud - GitHub - LinkedIn
Tanjila Hussen
