This project scrapes quotes from the web, explores them through exploratory data analysis (EDA), and uses SQL queries to dig deeper into the data. The goal is to uncover patterns in author popularity, sentiment distribution, common themes, and keyword trends. It is well suited to practicing web scraping, data cleaning, SQL querying, and data visualization.
Key Features:
• Web Scraping: Collected quotes using Python libraries like requests and BeautifulSoup.
• Data Cleaning: Removed duplicates, cleaned text, and standardized author names.
• EDA (Exploratory Data Analysis):
- Top authors with the most quotes
- Most frequent keywords
- Quote length distribution
- Common themes and topics
• SQL Analysis:
- Insights using advanced SQL queries (GROUP BY, SUBSTRING, subqueries, ORDER BY, LIMIT)
- Frequency of words or phrases
- Author-wise sentiment or theme analysis
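
The SQL analysis listed above could look roughly like the following. This is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table name `quotes`, its columns, and the sample rows are hypothetical placeholders, not the project's actual schema or data.

```python
import sqlite3

# In-memory SQLite database with a hypothetical schema and made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quotes (id INTEGER PRIMARY KEY, author TEXT, text TEXT)")
conn.executemany(
    "INSERT INTO quotes (author, text) VALUES (?, ?)",
    [
        ("Author One", "Keep the code simple."),
        ("Author One", "Test the code often."),
        ("Author Two", "Simple code is easy to test."),
    ],
)

# GROUP BY + ORDER BY + LIMIT: authors ranked by number of quotes.
top_authors = conn.execute(
    """
    SELECT author, COUNT(*) AS n_quotes
    FROM quotes
    GROUP BY author
    ORDER BY n_quotes DESC, author
    LIMIT 5
    """
).fetchall()

# Subquery: quotes longer than the average quote length (in characters).
long_quotes = conn.execute(
    """
    SELECT author, text
    FROM quotes
    WHERE LENGTH(text) > (SELECT AVG(LENGTH(text)) FROM quotes)
    ORDER BY LENGTH(text) DESC
    """
).fetchall()

# SUBSTR: a 12-character preview of each quote.
previews = conn.execute(
    "SELECT author, SUBSTR(text, 1, 12) AS preview FROM quotes ORDER BY id"
).fetchall()

# Phrase frequency: number of quotes that mention a given word.
code_mentions = conn.execute(
    "SELECT COUNT(*) FROM quotes WHERE LOWER(text) LIKE '%code%'"
).fetchone()[0]

print(top_authors)
print(long_quotes)
print(previews)
print(code_mentions)
```

The same queries should carry over to MySQL with small changes. For example, MySQL's `LENGTH` counts bytes, so `CHAR_LENGTH` is the closer equivalent for character counts.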
Tools and Technologies:
• Python (BeautifulSoup, Pandas, Matplotlib, Seaborn, Regex)
• SQL (MySQL or SQLite)
• Jupyter Notebook
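
As a rough illustration of how the scraping, cleaning, and EDA steps fit together, here is a small sketch that parses an inline HTML snippet with BeautifulSoup, removes duplicates, standardizes author names, and counts authors and keywords. The HTML markup, class names, sample quotes, and the helper functions `parse_quotes`, `clean_quotes`, `top_authors`, and `top_keywords` are all assumptions made for this example. In the project itself the HTML would be fetched with requests, and Pandas would likely replace the `Counter`-based steps, which use only the standard library here.

```python
import re
from collections import Counter

from bs4 import BeautifulSoup

# Hypothetical markup and made-up quotes; the real site's tags and classes may differ.
# In practice: html = requests.get(url, timeout=10).text  (url not given in the source)
SAMPLE_HTML = """
<div class="quote"><span class="text">“Keep the   code simple.”</span>
  <small class="author">author  one</small></div>
<div class="quote"><span class="text">“Keep the code simple.”</span>
  <small class="author">Author One</small></div>
<div class="quote"><span class="text">“Simple code is easy to test.”</span>
  <small class="author">Author Two</small></div>
<div class="quote"><span class="text">“Test the code often.”</span>
  <small class="author">Author One</small></div>
"""

STOPWORDS = frozenset({"the", "is", "to", "a", "and", "of"})


def parse_quotes(html):
    """Extract raw text/author pairs from the assumed markup."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for block in soup.find_all("div", class_="quote"):
        text = block.find("span", class_="text").get_text()
        author = block.find("small", class_="author").get_text()
        rows.append({"text": text, "author": author})
    return rows


def clean_quotes(rows):
    """Collapse whitespace, strip quote marks, title-case authors, drop duplicates."""
    seen, cleaned = set(), []
    for row in rows:
        text = re.sub(r"\s+", " ", row["text"]).strip(" “”\"")
        author = re.sub(r"\s+", " ", row["author"]).strip().title()
        key = (text.lower(), author)
        if key not in seen:
            seen.add(key)
            cleaned.append({"text": text, "author": author})
    return cleaned


def top_authors(rows, n=3):
    """Authors with the most quotes."""
    return Counter(r["author"] for r in rows).most_common(n)


def top_keywords(rows, n=3):
    """Most frequent words, ignoring a small stopword list."""
    words = (w for r in rows for w in re.findall(r"[a-z']+", r["text"].lower()))
    return Counter(w for w in words if w not in STOPWORDS).most_common(n)


rows = clean_quotes(parse_quotes(SAMPLE_HTML))
quote_lengths = [len(r["text"].split()) for r in rows]  # length in words
print(top_authors(rows, 1), top_keywords(rows, 1), quote_lengths)
```

Deduplicating on the cleaned text and the standardized author name means that variants differing only in spacing or capitalization collapse into one row before any counts are taken.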