This project explores how dietary habits, financial stress, and work/study hours relate to self-reported depression levels among students. The analysis uses real-world data and employs visualization, statistical testing, and predictive modeling in R.
- Understand patterns between student lifestyle factors and depression.
- Use visualizations and t-tests to examine significant group differences.
- Build a logistic regression model to predict depression based on key features.
- `student_depression_analysis.R`: Main analysis script.
- `student_depression_dataset.csv`: Dataset used for the analysis.
- `.Rproj` and RStudio-related files (optional for reproduction).
- `README.md`: This file.
- Data Cleaning: Handling missing values, formatting categorical/numerical data.
- Exploratory Data Analysis (EDA): Boxplots, bar charts, and correlation plots.
- Statistical Testing: Welch Two Sample t-tests on Financial Stress and Work/Study Hours.
- Predictive Modeling: Logistic regression using `glm()` with depression as the binary outcome.
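
The testing and modeling steps above can be sketched roughly as follows. This is a minimal illustration on simulated data, not the code in the analysis script: the data frame `students` and the column names `financial_stress`, `work_study_hours`, `healthy_diet`, and `depression` are hypothetical stand-ins, and the real CSV may name and code these variables differently.

```r
# Simulated stand-in data; column names are hypothetical, not taken from the real CSV.
set.seed(42)
n <- 500
students <- data.frame(
  financial_stress = sample(1:5, n, replace = TRUE),
  work_study_hours = round(runif(n, 0, 12)),
  healthy_diet     = sample(c(0, 1), n, replace = TRUE)
)

# Generate a binary outcome whose log-odds depend on the three predictors.
log_odds <- -3 + 0.5 * students$financial_stress +
  0.15 * students$work_study_hours - 0.3 * students$healthy_diet
students$depression <- rbinom(n, size = 1, prob = plogis(log_odds))

# Welch two-sample t-tests: t.test() uses var.equal = FALSE by default.
welch_stress <- t.test(financial_stress ~ depression, data = students)
welch_hours  <- t.test(work_study_hours ~ depression, data = students)
print(welch_stress)
print(welch_hours)

# Logistic regression with depression as the binary outcome.
model <- glm(depression ~ financial_stress + work_study_hours + healthy_diet,
             data = students, family = binomial)
print(summary(model))

# Exponentiated coefficients can be read as odds ratios.
odds_ratios <- exp(coef(model))
print(round(odds_ratios, 2))
```

Because the data here are simulated, the printed estimates say nothing about the real dataset; the sketch only shows the shape of the calls.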
- Financial stress and long work/study hours show strong associations with depression.
- Healthy dietary habits may have a protective effect but were less predictive.
- Logistic regression confirms these variables significantly contribute to depression predictions.
- Language: R
- Libraries: `ggplot2`, `stats`, base R
- Environment: RStudio
- Clone the repository.
- Open `student_depression_analysis.R` in RStudio.
- Make sure the CSV file is in your working directory.
- Run the script from top to bottom.
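
If the script fails to find the data, a quick check from the R console along these lines may help. It is only a sketch: it uses the file names listed in this README and assumes the script reads the CSV by a relative path, which is inferred from the working-directory step above rather than confirmed.

```r
# File names as listed in this README.
script_file <- "student_depression_analysis.R"
data_file   <- "student_depression_dataset.csv"

# Report which required files are visible from the current working directory.
required <- c(script_file, data_file)
found <- file.exists(required)
names(found) <- required
print(getwd())
print(found)

# Run the analysis only when both files are present.
if (all(found)) {
  source(script_file, echo = TRUE)
} else {
  message("Missing: ", paste(required[!found], collapse = ", "),
          ". Use setwd() to point at the cloned repository.")
}
```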
This project was shaped by the guidance and experience I gained in my statistics classes, particularly through working on the LAPD project and a happiness study. Those earlier analyses helped build the skills and understanding necessary for this depression-focused exploration.
Created by Kaitlyn Kirt
Feel free to reach out for questions, suggestions, or collaboration!