This project demonstrates CNN-LSTM hybrid neural networks for time series forecasting using real-world temperature data from Melbourne (1981-1990). By combining Convolutional Neural Networks for local pattern recognition with Long Short-Term Memory networks for temporal dependencies, we achieve forecasting performance that surpasses traditional approaches, meeting the MSE ≤ 6.0 and MAE ≤ 2.0 targets on held-out validation data.
Temperature patterns reveal seasonal cycles, trends, and complex dependencies. Understanding and predicting these patterns is crucial for climate research, agricultural planning, energy management, and environmental monitoring. This repository showcases how modern deep learning architectures can capture both local patterns and long-term temporal relationships in meteorological data.
💡 Educational Focus: This project emphasizes the practical implementation of hybrid CNN-LSTM architectures with proper learning rate optimization, data windowing strategies, and performance evaluation on real-world time series data.
- Overview
- Project Architecture
- Dataset
- Getting Started
- Code Structure
- Neural Network Architecture
- Data Processing Pipeline
- Learning Rate Optimization
- Results & Performance
- Implementation Details
- Key Features
- Practical Applications
- Future Enhancements
- Acknowledgements
- Contact
The CNN-LSTM forecasting system follows a sophisticated approach combining the best of both architectures:
- Data Parsing: Real-world CSV temperature data preprocessing
- Windowing Strategy: Sliding window technique for supervised learning
- Hybrid Architecture: CNN for local pattern extraction + LSTM for temporal modeling
- Learning Rate Optimization: Empirical LR finding with exponential scheduling
- Training Pipeline: Efficient batch processing with proper validation splits
- Forecasting Engine: Fast batch-based prediction system
- Performance Evaluation: Comprehensive MSE and MAE metrics
Each component is designed for robustness, efficiency, and real-world applicability in meteorological forecasting.
The time series consists of real daily minimum temperature recordings from Melbourne, providing authentic patterns for model validation:
- Source: Daily Minimum Temperatures in Melbourne (1981-1990)
- Time Span: 10 years of daily observations (3,650 data points)
- Temperature Range: Celsius measurements with seasonal variations
- Seasonal Patterns: Clear annual temperature cycles
- Weather Variability: Day-to-day temperature fluctuations
- Long-term Trends: Subtle climate pattern evolution
- Real-world Complexity: Authentic meteorological characteristics
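These characteristics are easy to verify visually. A minimal sketch (assuming the `parse_data_from_file` helper imported as in the quick start below; axis labels are illustrative) that plots the full series:

```python
import matplotlib.pyplot as plt
from temperature_forecasting import parse_data_from_file

# Load and plot the full 10-year series; annual cycles are clearly visible
TIME, SERIES = parse_data_from_file('./data/daily-min-temperatures.csv')
plt.plot(TIME, SERIES)
plt.xlabel("Day index (1981-1990)")
plt.ylabel("Daily minimum temperature (°C)")
plt.title("Melbourne daily minimum temperatures")
plt.show()
```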
- Training Set: First 2,500 days (68%) - Model learning period
- Validation Set: Remaining 1,150 days (32%) - Performance evaluation
- Split Method: Chronological split preserving temporal order (see the sketch after this list)
- Real-world Application: Demonstrates practical forecasting on authentic data
- Complexity Level: Suitable for advanced neural network architectures
- Performance Target: MSE ≤ 6.0, MAE ≤ 2.0 for successful forecasting
- Educational Value: Shows progression from simple to sophisticated models
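A minimal sketch of the chronological split described above, using the `TIME`/`SERIES` arrays returned by `parse_data_from_file` (see Getting Started). The key point is that no shuffling happens before the split, so no future values leak into training:

```python
SPLIT_TIME = 2500  # day 2500 marks the train/validation boundary

# Chronological split: training data strictly precedes validation data
time_train, series_train = TIME[:SPLIT_TIME], SERIES[:SPLIT_TIME]
time_valid, series_valid = TIME[SPLIT_TIME:], SERIES[SPLIT_TIME:]
```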
- Python 3.6+
- TensorFlow 2.x
- NumPy
- Matplotlib
- Pandas (for CSV handling)
```bash
git clone https://github.com/yourusername/cnn-lstm-temperature-forecasting
cd cnn-lstm-temperature-forecasting
pip install -r requirements.txt
```

```python
# Import the main components
from temperature_forecasting import (
    parse_data_from_file, windowed_dataset, create_model, model_forecast
)

# Load Melbourne temperature data
TIME, SERIES = parse_data_from_file('./data/daily-min-temperatures.csv')

# Train on the first 2,500 days (chronological split)
series_train = SERIES[:2500]

# Create windowed dataset for training
train_dataset = windowed_dataset(series_train, window_size=64)

# Build and train the CNN-LSTM hybrid model
model = create_model()
history = model.fit(train_dataset, epochs=50)

# Generate temperature forecasts
forecast = model_forecast(model, SERIES, window_size=64)
```

- Complete Analysis: Run `temperature_forecasting.py` for the full pipeline
- Learning Rate Optimization: Use `adjust_learning_rate()` for LR finding
- Custom Implementation: Import individual functions for specific needs
- `temperature_forecasting.py` - Main implementation with all graded exercises
- `parse_data_from_file.py` - CSV data parsing and preprocessing
- `create_uncompiled_model.py` - CNN-LSTM architecture definition
- `create_model.py` - Model compilation with optimized parameters
- `windowed_dataset.py` - Data windowing and batch processing
- `learning_rate_optimization.py` - LR scheduling and optimization
- `model_forecast.py` - Efficient batch-based forecasting
- `evaluation_metrics.py` - MSE and MAE performance assessment
- `data/daily-min-temperatures.csv` - Melbourne temperature dataset
- `requirements.txt` - Project dependencies
- `tests/` - Unit tests for core functionality
```python
model = tf.keras.models.Sequential([
    tf.keras.Input(shape=(WINDOW_SIZE, 1)),

    # Convolutional layers for local pattern extraction
    tf.keras.layers.Conv1D(filters=64, kernel_size=5, activation='relu'),
    tf.keras.layers.Conv1D(filters=32, kernel_size=3, activation='relu'),

    # LSTM layer for temporal dependencies
    tf.keras.layers.LSTM(64, return_sequences=False),

    # Regularization and output
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(1)
])
```

- Input Layer: Windowed sequences (64 time steps × 1 feature)
- Conv1D Layers: Local pattern recognition with 64→32 filters
- LSTM Layer: 64 units for capturing temporal dependencies
- Dropout Layer: 0.3 rate for regularization and generalization
- Output Layer: Single neuron for temperature prediction
- Loss Function: Mean Squared Error (MSE) for regression
- Optimizer: Adam with optimized learning rate (1e-3)
- Hybrid Approach: CNN extracts local patterns, LSTM models sequences
- Parameter Efficiency: ~31K parameters, well under the 60K reference limit (see the check after this list)
- Performance Focus: Achieves MSE ≤ 6.0 and MAE ≤ 2.0 targets
- Regularization: Dropout prevents overfitting on temperature data
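The parameter count is easy to verify. A quick check (assuming `create_model` from the quick start) that should report roughly 31K trainable parameters for the layers listed above:

```python
model = create_model()
model.summary()  # expect ~31K total parameters for this architecture
```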
```python
def windowed_dataset(series, window_size):
    # Expand dimensions for Conv1D compatibility
    series = tf.expand_dims(series, axis=-1)
    dataset = tf.data.Dataset.from_tensor_slices(series)

    # Create sliding windows
    dataset = dataset.window(window_size + 1, shift=1, drop_remainder=True)
    dataset = dataset.flat_map(lambda window: window.batch(window_size + 1))

    # Shuffle and split features/labels
    dataset = dataset.shuffle(SHUFFLE_BUFFER_SIZE)
    dataset = dataset.map(lambda window: (window[:-1], window[-1]))

    # Batch and prefetch for efficiency
    dataset = dataset.batch(BATCH_SIZE).prefetch(1)
    return dataset
```

- Dimension Expansion: Add feature dimension for Conv1D layers
- Window Creation: 64-day sliding windows for temperature patterns
- Feature-Label Split: Previous 64 days predict next day's temperature
- Shuffling: Randomizes training order (1000 sample buffer)
- Batching: 256 samples per batch for efficient training
- Prefetching: Optimizes data pipeline performance
- Window Size: 64 time steps (about two months of daily context per prediction)
- Batch Size: 256 samples per batch
- Shuffle Buffer: 1000 samples for randomization
- Split Time: Day 2500 for train/validation division
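As a sanity check on the windowing (a sketch assuming `series_train` from the chronological split above), one batch can be inspected directly:

```python
dataset = windowed_dataset(series_train, window_size=64)

# Each element pairs a 64-day window with the following day's temperature
for features, label in dataset.take(1):
    print(features.shape)  # (256, 64, 1) - batch of 64-step windows
    print(label.shape)     # (256, 1)     - next-day targets
```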
```python
def adjust_learning_rate(dataset):
    model = create_uncompiled_model()

    # Exponential learning rate schedule
    lr_schedule = tf.keras.callbacks.LearningRateScheduler(
        lambda epoch: 1e-5 * 10**(epoch / 20)
    )

    model.compile(loss='mse',
                  optimizer=tf.keras.optimizers.Adam(),
                  metrics=["mae"])

    history = model.fit(dataset, epochs=100, callbacks=[lr_schedule])
    return history
```

- Exponential Schedule: LR sweeps from 1e-5 up toward 1.0 over 100 epochs (crossing 1e-1 near epoch 80)
- Loss Monitoring: Track performance across learning rate range
- Optimal Detection: Find the LR where loss is minimized (~1e-3); see the plot sketch below
- Implementation: Use optimal LR in final model training
Optimal Learning Rate: 1e-3
- Sweet Spot: Loss minimized around 1e-3
- Stable Training: Good convergence without gradient explosion
- Performance: Enables target MSE/MAE achievement
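One way to read the optimum off the training history (a sketch assuming `train_dataset` from the quick start) is to plot loss against the learning rate used at each epoch:

```python
import numpy as np
import matplotlib.pyplot as plt

history = adjust_learning_rate(train_dataset)

# Reconstruct the per-epoch learning rates from the schedule above
lrs = 1e-5 * (10 ** (np.arange(100) / 20))
plt.semilogx(lrs, history.history["loss"])
plt.xlabel("Learning rate (log scale)")
plt.ylabel("Training loss (MSE)")
plt.show()  # choose the LR just before the curve destabilizes (~1e-3)
```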
- Target MSE: ≤ 6.0 on validation set ✅
- Target MAE: ≤ 2.0 on validation set ✅
- Training Speed: ~5-10 seconds per epoch
- Convergence: Stable loss reduction over 50 epochs
```
# Final performance metrics
mse: 5.83, mae: 1.88 for forecast
✅ Model achieved MSE <= 6.0 and MAE <= 2.0! Target performance reached.
```

| Iteration | MSE | MAE | Status |
|---|---|---|---|
| Initial | 7.54 | 2.12 | ❌ Close but not passing |
| Optimized | 5.83 | 1.88 | ✅ Passed requirements! |
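A sketch of how these metrics can be computed (assuming `series_valid` and `SPLIT_TIME` from the chronological split above, and the `model_forecast` function shown under Implementation Details). Windows start 64 days before the split so that the first prediction aligns with the first validation day:

```python
import tensorflow as tf

# Forecast the validation period: each window ends the day before its target
forecast = model_forecast(model, SERIES[SPLIT_TIME - 64:-1], window_size=64).squeeze()

mse = tf.keras.losses.MSE(series_valid, forecast).numpy()
mae = tf.keras.losses.MAE(series_valid, forecast).numpy()
print(f"mse: {mse:.2f}, mae: {mae:.2f} for forecast")
```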
```python
SPLIT_TIME = 2500           # Train/validation split point
WINDOW_SIZE = 64            # Input sequence length
BATCH_SIZE = 256            # Training batch size
SHUFFLE_BUFFER_SIZE = 1000  # Randomization buffer
```

```python
import numpy as np

def parse_data_from_file(filename):
    # Load temperatures using np.loadtxt, skipping the CSV header row
    temperatures = np.loadtxt(filename, delimiter=',', skiprows=1,
                              usecols=1, dtype=float)
    times = np.arange(len(temperatures))
    return times, temperatures
```

```python
import tensorflow as tf

def model_forecast(model, series, window_size):
    # Add the feature dimension so windows match the model's (window, 1) input
    series = tf.expand_dims(series, axis=-1)
    ds = tf.data.Dataset.from_tensor_slices(series)
    ds = ds.window(window_size, shift=1, drop_remainder=True)
    ds = ds.flat_map(lambda w: w.batch(window_size))
    ds = ds.batch(32).prefetch(1)
    forecast = model.predict(ds)
    return forecast
```

- Authentic Melbourne temperature dataset
- Proper CSV parsing and preprocessing
- Chronological train/test splits
- Convolutional pattern extraction
- LSTM temporal modeling
- Optimized parameter count
- Empirical LR finding methodology
- Exponential scheduling callbacks
- Performance-driven selection
- Batch-based prediction system
- TensorFlow Dataset API optimization
- Memory-efficient processing
- Multiple performance metrics (MSE, MAE)
- Visual forecast comparisons
- Training progress monitoring
- Robust error handling
- Modular architecture
- Clear documentation
- Weather Prediction: Short-term temperature forecasting
- Climate Research: Long-term pattern analysis
- Agricultural Planning: Growing season optimization
- Energy Management: Heating/cooling demand prediction
- Climate Change Analysis: Temperature trend detection
- Ecosystem Management: Species habitat modeling
- Urban Planning: Heat island effect prediction
- Renewable Energy: Solar/wind resource forecasting
- Retail Planning: Seasonal demand forecasting
- Tourism Industry: Travel pattern prediction
- Insurance Modeling: Weather-related risk assessment
- Supply Chain: Temperature-sensitive logistics
- Model Comparison: Baseline for advanced architectures
- Algorithm Development: Hybrid network experimentation
- Educational Tools: Time series learning frameworks
- Benchmarking: Performance evaluation standards
- Attention Mechanisms: Transformer-based temperature modeling
- Multi-scale CNNs: Different kernel sizes for various patterns
- Bidirectional LSTMs: Forward and backward temporal processing
- Ensemble Methods: Multiple model combination for robustness
- Multi-step Forecasting: Predicting multiple days ahead
- Uncertainty Quantification: Prediction intervals and confidence bounds
- Transfer Learning: Pre-trained models for other locations
- Real-time Processing: Streaming temperature data integration
- Multi-variate Models: Include humidity, pressure, wind data
- Spatial Components: Geographic temperature correlation
- External Factors: Weather station metadata integration
- Data Augmentation: Synthetic temperature pattern generation
- Model Deployment: REST API for temperature predictions
- Monitoring Dashboard: Real-time forecast performance tracking
- Automated Retraining: Model updates with new temperature data
- A/B Testing Framework: Architecture comparison in production
Special thanks to:
- Andrew Ng for foundational machine learning education and time series insights
- Laurence Moroney for excellent TensorFlow instruction and practical deep learning guidance
- The TensorFlow team for providing robust tools for time series modeling
- The time series forecasting community for developing best practices and methodological frameworks
- The Bureau of Meteorology for providing authentic temperature datasets
- The open source community for providing excellent analysis tools and libraries
This project was developed to demonstrate practical applications of CNN-LSTM hybrid architectures to real-world meteorological forecasting challenges, emphasizing both theoretical understanding and production-ready implementation.
For inquiries about this project:
Β© 2025 Melissa Slawsky. All Rights Reserved.