This project is a beginner-to-intermediate level data analysis of sales data using pandas, NumPy, and Matplotlib. It demonstrates how to read, clean, analyze, and visualize sales information from a CSV file.
- sales2_1.csv: The dataset.
- sales2.py: Main Python script that processes and visualizes the data.
- revenue_profit_chart.png: Output chart showing revenue per product.
- Convert raw sales data into useful insights.
- Calculate total revenue per product.
- Use NumPy for array manipulation and slicing.
- Visualize results with a colorful, labeled bar chart.
The dataset contains the following columns:
- Product: The name of the product.
- Quantity: Units sold.
- Price: Unit price in dollars.
- Date: Date of sale.
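As a minimal sketch of how a file with this layout loads, the snippet below reads a few made-up rows from an in-memory string instead of sales2_1.csv. The sample values and the `EXPECTED_COLUMNS` name are illustrative assumptions, not taken from the real dataset or from sales2.py.

```python
import io

import pandas as pd

# Hypothetical sample rows; the real sales2_1.csv may differ in values and formatting.
SAMPLE_CSV = """Product,Quantity,Price,Date
Laptop,2,1000.0,2024-01-05
Mouse,5,20.0,2024-01-05
Laptop,1,1000.0,2024-01-06
"""

# Column names as listed above.
EXPECTED_COLUMNS = ["Product", "Quantity", "Price", "Date"]

# For the real file this would be pd.read_csv("sales2_1.csv", parse_dates=["Date"]).
df = pd.read_csv(io.StringIO(SAMPLE_CSV), parse_dates=["Date"])

# Fail early if a column is missing or misspelled.
missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
if missing:
    raise ValueError(f"Missing columns: {missing}")

print(df.dtypes)
```

Checking the column names up front makes a renamed or missing header fail immediately rather than later in the analysis.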
- Read CSV Data using pandas.
- Convert Columns to Numeric types with error handling.
- Calculate Revenue per row (Price × Quantity).
- Convert DataFrame to NumPy Array for slicing and filtering.
- Extract Unique Products and compute:
  - Total revenue per product.
  - Percentage share of total revenue.
- Visualize the Results using Matplotlib:
  - Each product is assigned a unique color.
  - Products are displayed as numbered bars.
  - A dynamic legend explains which number corresponds to which product.
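The data-processing steps above can be sketched roughly as follows. This is not the code from sales2.py; it is one plausible reading of the steps, run on made-up rows. Dropping rows whose Quantity or Price cannot be parsed is an assumption about what "error handling" means here, and all variable names are illustrative.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical rows, including one bad Quantity value to exercise the error handling.
SAMPLE_CSV = """Product,Quantity,Price,Date
Laptop,2,1000,2024-01-05
Mouse,5,20,2024-01-05
Laptop,1,1000,2024-01-06
Mouse,oops,20,2024-01-07
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# errors="coerce" turns unparseable values into NaN instead of raising.
for col in ("Quantity", "Price"):
    df[col] = pd.to_numeric(df[col], errors="coerce")

# Assumption: rows with an invalid Quantity or Price are discarded.
df = df.dropna(subset=["Quantity", "Price"])

# Revenue per row.
df["Revenue"] = df["Price"] * df["Quantity"]

# Convert to a 2-D NumPy array (object dtype, because the columns are mixed) and slice it.
data = df[["Product", "Revenue"]].to_numpy()
products = data[:, 0]
revenue = data[:, 1].astype(float)

# One boolean mask per unique product selects that product's rows.
unique_products = np.unique(products)
totals = np.array([revenue[products == name].sum() for name in unique_products])
shares = totals / totals.sum() * 100

for name, total, share in zip(unique_products, totals, shares):
    print(f"{name}: {total:.2f} ({share:.1f}%)")
```

The same totals could be obtained with a pandas groupby; the array-and-mask version is shown because the project is meant to practice NumPy slicing and filtering.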
- Python
- pandas
- NumPy
- Matplotlib
- Data cleaning with pandas
- NumPy slicing and boolean masking
- Revenue calculation by category
- Building clear, colorful visualizations
- Working with legends and layout in Matplotlib
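For the legend and layout work in particular, a numbered bar chart with a matching legend might look like the sketch below. The `plot_revenue` helper, the tab10 colormap, the figure size, and the sample totals are all assumptions for illustration; the actual script may lay out its chart differently.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def plot_revenue(products, totals):
    """Draw one numbered, individually colored bar per product and return the figure."""
    positions = np.arange(1, len(products) + 1)

    # tab10 has 10 distinct colors; they repeat if there are more products.
    cmap = plt.get_cmap("tab10")
    colors = [cmap(i % 10) for i in range(len(products))]

    fig, ax = plt.subplots(figsize=(8, 5))
    bars = ax.bar(positions, totals, color=colors)
    ax.set_xticks(positions)
    ax.set_xlabel("Product number")
    ax.set_ylabel("Total revenue ($)")
    ax.set_title("Revenue per product")

    # The legend is built from the data, so it maps each bar number to its product.
    labels = [f"{n}: {name}" for n, name in zip(positions, products)]
    ax.legend(bars.patches, labels, title="Products",
              loc="upper left", bbox_to_anchor=(1.02, 1))
    fig.tight_layout()
    return fig


# Hypothetical totals, not results from the real dataset.
fig = plot_revenue(["Laptop", "Mouse"], [3000.0, 100.0])
```

Calling `fig.savefig("revenue_profit_chart.png")` afterwards would write the chart to disk. Placing the legend outside the axes keeps it from covering the bars when there are many products.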
- Group data by date and analyze revenue trends over time.
- Add Seaborn or Plotly for interactive visualizations.
- Build a simple dashboard using Streamlit.
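As a starting point for the first idea, grouping by date could look like the sketch below. It uses made-up rows and is only one possible approach.

```python
import pandas as pd

# Hypothetical rows with the same columns as the dataset.
df = pd.DataFrame(
    {
        "Product": ["Laptop", "Mouse", "Laptop"],
        "Quantity": [2, 5, 1],
        "Price": [1000.0, 20.0, 1000.0],
        "Date": ["2024-01-05", "2024-01-05", "2024-01-06"],
    }
)

# Parse the dates so they sort chronologically rather than as strings.
df["Date"] = pd.to_datetime(df["Date"])
df["Revenue"] = df["Price"] * df["Quantity"]

# Total revenue per day, in chronological order.
daily_revenue = df.groupby("Date")["Revenue"].sum().sort_index()
print(daily_revenue)
```

From here, `daily_revenue.plot()` would give a quick line chart of the trend.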
If you like this project or have questions, feel free to connect:
- GitHub: DataFalcon 🦅
- Email: tammahakki700@gmail.com
This project is open-source and available under the MIT License.