This repository contains my implementation of an algorithmic trading system that combines multiple machine learning methods to predict individual stock returns. The system builds on and extends the work from k-dickinson/quant-simulations-and-risk, adding a multi-level ensemble and risk management techniques.
I've designed this system to treat each stock as an independent time series prediction problem, which offers the following advantages:
- Reduced Model Risk: By avoiding correlation assumptions, the system is less susceptible to correlation breakdown during market stress
- Scalability: Each stock model can be trained independently, allowing for parallel processing and easier system scaling
- Robustness: Individual models capture stock-specific dynamics that might be obscured in factor-based approaches
Note: Future iterations will incorporate cross-asset relationships and regime-dependent correlations as outlined in the risk management framework.
The core of the strategy is a three-level ensemble architecture that combines diverse learning paradigms. While this is in no way revolutionary, it provides more consistency than subjective, discretionary approaches:
The first level employs heterogeneous base models to capture different aspects of market dynamics:
Neural Network (MLP Regressor)
- Architecture: Deep feed-forward network with batch normalization and dropout
- Activation Functions: Alternating ReLU and LeakyReLU to mitigate vanishing gradients and dead units
- Regularization: L1 penalty (λ = 1×10⁻⁵) + gradient clipping + early stopping
- Optimization: AdamW with cosine annealing warm restarts
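The warm-restart schedule can be illustrated without any framework. This is a minimal sketch of the standard cosine annealing formula with restarts; the function name and the values `base_lr`, `min_lr`, `t_0`, and `t_mult` are hypothetical and are not taken from this repository. In PyTorch the equivalent behaviour is provided by `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts`.

```python
import math


def cosine_warm_restart_lr(step, base_lr=1e-3, min_lr=1e-5, t_0=10, t_mult=2):
    """Learning rate at `step` under cosine annealing with warm restarts.

    All default values are illustrative assumptions.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    cycle_len = t_0
    t_cur = step
    # Walk through completed cycles; each cycle is t_mult times longer.
    while t_cur >= cycle_len:
        t_cur -= cycle_len
        cycle_len *= t_mult
    # Cosine decay from base_lr to min_lr within the current cycle.
    cosine = 1.0 + math.cos(math.pi * t_cur / cycle_len)
    return min_lr + 0.5 * (base_lr - min_lr) * cosine
```

With these defaults the rate decays over steps 0 to 9, jumps back to `base_lr` at step 10, and then decays over a cycle twice as long.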
The loss function combines MSE with L1 regularization:
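One form consistent with this description, assuming the penalty is applied to all network weights $\theta$ and that $N$ is the number of training samples (both symbols are introduced here for illustration), is:

$$
\mathcal{L}(\theta) = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2 + \lambda \sum_{j}\left|\theta_j\right|, \qquad \lambda = 1\times10^{-5}
$$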
Gradient Boosting Models (XGBoost & LightGBM)
- XGBoost: GPU-accelerated tree boosting with L1/L2 and tree-complexity regularization
- LightGBM: Leaf-wise tree growth optimized for efficiency
- Hyperparameters: Conservative learning rates (0.01) with high iteration counts for stability
Boosted Residual Models
Here I train traditional ML models (SVR, Random Forest, Lasso) as base predictors, then use XGBoost to learn the patterns left in their residuals:
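In general terms, and assuming the final prediction is the sum of the two stages (the symbols $f_{\text{base}}$ and $g_{\text{XGB}}$ are introduced here for illustration), the scheme can be written as:

$$
r_i = y_i - f_{\text{base}}(x_i), \qquad g_{\text{XGB}} \approx \arg\min_{g} \sum_{i}\left(r_i - g(x_i)\right)^2, \qquad \hat{y}(x) = f_{\text{base}}(x) + g_{\text{XGB}}(x)
$$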
AutoML (AutoGluon)
- Automated hyperparameter optimization across multiple model families
- Stacked generalization with automatic feature selection
- Provides robust baseline predictions with minimal manual tuning
TabPFN (Transformer-based)
- Prior-fitted network leveraging transformer architecture
- Particularly effective for tabular data with complex feature interactions
- Serves as a strong non-parametric baseline
The second level trains meta-models on the predictions from Level 1:
XGBoost Meta-Learner
- Learns optimal combination weights for base model predictions
- Captures non-linear interactions between base model outputs
- Regularized to prevent overfitting to the base models' training-set predictions
Neural Network Meta-Learner
- Smaller architecture (fewer layers) to avoid overfitting
- Learns complex combination rules between base predictions
- Dropout and early stopping for regularization
The third level employs Ridge regression to learn optimal linear weights:
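The standard Ridge objective for this kind of blending, assuming $m_{k,t}$ denotes the prediction of Level 2 model $k$ at time $t$ and that no intercept is fitted (both are assumptions made for illustration), is:

$$
\hat{w} = \arg\min_{w} \sum_{t}\Big(y_t - \sum_{k} w_k\, m_{k,t}\Big)^2 + \lambda \lVert w \rVert_2^2, \qquad \hat{y}_t = \sum_{k} \hat{w}_k\, m_{k,t}
$$

In matrix form the solution is $\hat{w} = (M^\top M + \lambda I)^{-1} M^\top y$, so the penalty $\lambda$ shrinks the weights and keeps them stable when the meta-model predictions are highly correlated.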
The feature engineering incorporates established technical analysis principles:
Momentum Indicators
- RSI (Relative Strength Index): Captures overbought/oversold conditions
- Rate of Change (ROC): Multi-timeframe momentum assessment
- MACD: Trend-following momentum oscillator
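As an illustration of the momentum features, here is a minimal pure-Python sketch of RSI with Wilder's smoothing and a simple rate of change. The function names and the 14-period default are assumptions for the example, not necessarily what the feature pipeline uses.

```python
def rsi(closes, period=14):
    """Relative Strength Index of the latest bar, using Wilder's smoothing."""
    if len(closes) <= period:
        raise ValueError("need more than `period` closing prices")
    deltas = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(d, 0.0) for d in deltas]
    losses = [max(-d, 0.0) for d in deltas]
    # Seed the averages with a simple mean over the first `period` changes.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    # Wilder's recursive smoothing over the remaining changes.
    for gain, loss in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + gain) / period
        avg_loss = (avg_loss * (period - 1) + loss) / period
    if avg_loss == 0:
        return 100.0  # convention used here when there are no down moves
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def rate_of_change(closes, lookback):
    """Fractional price change over the last `lookback` bars."""
    if lookback < 1 or len(closes) <= lookback:
        raise ValueError("need more than `lookback` closing prices")
    return closes[-1] / closes[-1 - lookback] - 1.0
```

Calling `rate_of_change` with several lookbacks (for example 5, 10, and 20 bars) gives the multi-timeframe view described above.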
Volatility Measures
- Bollinger Bands: Statistical volatility bands for mean reversion signals
- Average True Range: Volatility-adjusted position sizing
- Rolling standard deviation: Multi-horizon volatility estimation
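The volatility features can be sketched in a few lines. This example assumes a 20-bar window with 2 standard deviations for the bands and a simple (not Wilder-smoothed) average for ATR; the function names and all defaults are illustrative.

```python
import statistics


def bollinger_bands(closes, window=20, num_std=2.0):
    """Return (lower, middle, upper) bands for the latest bar."""
    if len(closes) < window:
        raise ValueError("need at least `window` closing prices")
    recent = closes[-window:]
    middle = statistics.fmean(recent)
    spread = num_std * statistics.pstdev(recent)  # population std of the window
    return middle - spread, middle, middle + spread


def average_true_range(highs, lows, closes, period=14):
    """Simple average of the last `period` true ranges."""
    if not (len(highs) == len(lows) == len(closes)):
        raise ValueError("highs, lows and closes must have equal length")
    true_ranges = []
    for i in range(1, len(closes)):
        # True range accounts for gaps relative to the previous close.
        true_ranges.append(max(
            highs[i] - lows[i],
            abs(highs[i] - closes[i - 1]),
            abs(lows[i] - closes[i - 1]),
        ))
    if len(true_ranges) < period:
        raise ValueError("need at least `period` + 1 bars")
    return statistics.fmean(true_ranges[-period:])
```

Running `bollinger_bands` with several `window` values is one way to obtain the multi-horizon volatility estimates mentioned in the list.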
Volume Analysis
- On-Balance Volume (OBV): Price-volume confirmation
- Volume Rate of Change: Institutional activity detection
- Volume-weighted indicators: Smart money tracking
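For reference, On-Balance Volume and volume rate of change can be computed as follows. This is a minimal sketch; the function names are hypothetical, and the OBV series is started at zero by convention.

```python
def on_balance_volume(closes, volumes):
    """Cumulative OBV series: add volume on up days, subtract on down days."""
    if len(closes) != len(volumes):
        raise ValueError("closes and volumes must have equal length")
    if not closes:
        return []
    obv = [0.0]
    for i in range(1, len(closes)):
        if closes[i] > closes[i - 1]:
            obv.append(obv[-1] + volumes[i])
        elif closes[i] < closes[i - 1]:
            obv.append(obv[-1] - volumes[i])
        else:
            obv.append(obv[-1])  # unchanged close leaves OBV flat
    return obv


def volume_rate_of_change(volumes, lookback):
    """Fractional change in volume over the last `lookback` bars."""
    if lookback < 1 or len(volumes) <= lookback:
        raise ValueError("need more than `lookback` volume observations")
    return volumes[-1] / volumes[-1 - lookback] - 1.0
```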
While acknowledging the Efficient Market Hypothesis (EMH), the system targets several well-documented market anomalies and empirical regularities:
- Momentum Effects: Short-term continuation of price trends
- Mean Reversion: Long-term price normalization
- Volatility Clustering: GARCH-like volatility patterns
- Microstructure Effects: Intraday trading patterns
The risk management framework extends traditional Modern Portfolio Theory (MPT):
Value at Risk (VaR) Calculation
Using Monte Carlo simulation with Geometric Brownian Motion:
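The standard GBM stochastic differential equation, writing $S_t$ for the asset price at time $t$ (a symbol assumed here), is:

$$
dS_t = \mu\, S_t\, dt + \sigma\, S_t\, dW_t
$$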
- μ = drift parameter (estimated from historical returns)
- σ = volatility parameter (estimated from historical volatility)
- dW_t = Wiener process (random walk component)
Risk-Adjusted Position Sizing
Position weights are calculated using a confidence-weighted approach:
- r_i = predicted return for asset i
- c_i = prediction confidence for asset i
- α = confidence exponent (1.5 for non-linear emphasis)
- A = total allocation constraint (90%)
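One weighting rule consistent with these symbols is $w_i = A \cdot \max(r_i, 0)\, c_i^{\alpha} / \sum_j \max(r_j, 0)\, c_j^{\alpha}$. The exact formula is not shown in this section, so treat this as an assumption; in particular, clipping negative predictions to zero (a long-only portfolio) and requiring confidences in $[0, 1]$ are choices made for this sketch. The function name is hypothetical.

```python
def confidence_weighted_positions(predicted_returns, confidences,
                                  alpha=1.5, total_allocation=0.90):
    """Allocate `total_allocation` in proportion to r_i * c_i**alpha.

    Assumed form: long-only, so negative predicted returns get zero weight.
    """
    if len(predicted_returns) != len(confidences):
        raise ValueError("inputs must have equal length")
    if any(c < 0.0 or c > 1.0 for c in confidences):
        raise ValueError("confidences are assumed to lie in [0, 1]")
    # The exponent alpha > 1 emphasises high-confidence predictions.
    scores = [max(r, 0.0) * c ** alpha
              for r, c in zip(predicted_returns, confidences)]
    total = sum(scores)
    if total == 0.0:
        return [0.0] * len(scores)  # nothing attractive: stay in cash
    return [total_allocation * s / total for s in scores]
```

For example, with equal confidences the weights are simply proportional to the positive predicted returns and sum to 90% of capital, leaving the remaining 10% unallocated.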
The system uses Monte Carlo simulation for portfolio risk assessment:
Geometric Brownian Motion Implementation
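The exact discretization of GBM over a time step $\Delta t$ (with $S_t$ denoting the price and $\Delta t$ the step size, both symbols assumed here) is:

$$
S_{t+\Delta t} = S_t \exp\!\left(\left(\mu - \tfrac{1}{2}\sigma^2\right)\Delta t + \sigma\sqrt{\Delta t}\; Z\right)
$$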
Where Z ~ N(0,1) represents random market shocks.
Risk Metrics Calculation
- 95% Value at Risk: 5th percentile of return distribution
- Expected Shortfall: Mean of worst 5% outcomes
- Maximum Drawdown: Largest peak-to-trough decline
- Probability of Loss: Frequency of negative returns
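The four metrics above can be computed from simulated paths as in the following sketch. It uses only the standard library, simulates a single asset with the exact log-normal GBM update, and reports VaR as the percentile return itself (a negative number for a loss). Function names, the daily step of 1/252, and the choice to report the worst drawdown across paths are assumptions for illustration.

```python
import math
import random


def simulate_gbm_paths(s0, mu, sigma, n_steps=252, n_paths=1000,
                       dt=1.0 / 252, seed=0):
    """Simulate price paths with the exact log-normal GBM update."""
    rng = random.Random(seed)  # seeded so results are reproducible
    drift = (mu - 0.5 * sigma ** 2) * dt
    shock_scale = sigma * math.sqrt(dt)
    paths = []
    for _ in range(n_paths):
        price = s0
        path = [price]
        for _ in range(n_steps):
            price *= math.exp(drift + shock_scale * rng.gauss(0.0, 1.0))
            path.append(price)
        paths.append(path)
    return paths


def max_drawdown(path):
    """Largest peak-to-trough decline along one path, as a fraction."""
    peak = path[0]
    worst = 0.0
    for price in path:
        peak = max(peak, price)
        worst = max(worst, (peak - price) / peak)
    return worst


def risk_metrics(paths, confidence=0.95):
    """Summarise the simulated distribution of horizon returns."""
    returns = sorted(path[-1] / path[0] - 1.0 for path in paths)
    n_tail = max(1, round(len(returns) * (1.0 - confidence)))
    tail = returns[:n_tail]  # the worst (1 - confidence) share of outcomes
    return {
        "var": tail[-1],  # empirical (1 - confidence) percentile return
        "expected_shortfall": sum(tail) / len(tail),
        "worst_max_drawdown": max(max_drawdown(p) for p in paths),
        "prob_loss": sum(r < 0 for r in returns) / len(returns),
    }
```

A call such as `risk_metrics(simulate_gbm_paths(100.0, mu=0.08, sigma=0.25))` returns the metrics for a one-year horizon; the parameter values here are placeholders, and in practice μ and σ would be estimated from historical returns as described earlier.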
The position sizing algorithm incorporates multiple risk factors:
- Prediction Confidence: Higher confidence → larger positions
- Volatility Adjustment: Higher volatility → smaller positions
- Correlation Limits: Reduces exposure to highly correlated positions
- Concentration Limits: Maximum 10% per single position
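Two of these adjustments are easy to sketch: scaling down positions in volatile names and capping any single position at 10%. The `target_vol` parameter and the rule that any trimmed weight stays in cash (rather than being redistributed) are assumptions for this example, and correlation limits are omitted.

```python
def apply_risk_limits(weights, volatilities, target_vol=0.20, max_weight=0.10):
    """Shrink weights for high-volatility assets, then cap each position.

    `target_vol` is a hypothetical annualised volatility level; assets above
    it are scaled down proportionally, assets below it are left unchanged.
    """
    if len(weights) != len(volatilities):
        raise ValueError("inputs must have equal length")
    adjusted = []
    for weight, vol in zip(weights, volatilities):
        scale = min(1.0, target_vol / vol) if vol > 0 else 1.0
        # Concentration limit: no single position above max_weight.
        adjusted.append(min(weight * scale, max_weight))
    return adjusted
```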
Each stock is processed independently with its own:
- Feature engineering pipeline
- Model ensemble training
- Hyperparameter optimization
- Validation framework
This approach ensures that:
- Model failures are isolated
- Stock-specific patterns are captured
- System scales linearly with universe size
The system leverages GPU acceleration where available:
- PyTorch neural networks on CUDA
- XGBoost tree methods with GPU histograms
- LightGBM GPU training
- Parallel Monte Carlo simulations
Trained models are serialized with:
- Complete ensemble architecture
- Training metadata and statistics
- Feature engineering parameters
- Risk model coefficients
The multi-level ensemble addresses the bias-variance tradeoff by:
- Reducing Bias: Multiple diverse base models
- Controlling Variance: Regularization at each level
- Optimal Combination: Meta-learning for weight optimization
Multiple regularization techniques prevent overfitting:
- Cross-validation: Time-series aware validation splits
- Early stopping: Based on validation loss plateaus
- Dropout: Stochastic regularization in neural networks
- L1/L2 penalties: Parameter shrinkage
- Ensemble diversity: Decorrelated base models
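Time-series aware validation means every training window ends before its test window begins. A minimal expanding-window splitter is sketched below; the function name and the optional `gap` parameter (bars dropped between train and test to limit leakage from overlapping features) are assumptions for illustration. scikit-learn's `TimeSeriesSplit` offers similar functionality.

```python
def expanding_window_splits(n_samples, n_splits=5, gap=0):
    """Return (train_indices, test_indices) pairs in chronological order."""
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("not enough samples for the requested splits")
    splits = []
    for k in range(1, n_splits + 1):
        test_start = k * test_size
        test_end = test_start + test_size
        train_end = test_start - gap  # leave `gap` bars unused before the test
        if train_end <= 0:
            continue  # gap consumed the whole training window
        splits.append((list(range(train_end)),
                       list(range(test_start, test_end))))
    return splits
```

Each successive split trains on a longer history and tests on the next unseen block, so no future observation ever leaks into training.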
- Cross-Asset Modeling: Incorporate sector and market regime information
- Alternative Data: Integration of sentiment, news, and satellite data
- Reinforcement Learning: Dynamic strategy adaptation
- Options Integration: Volatility surface modeling for options strategies
- High-Frequency Components: Intraday pattern recognition
Key equations used throughout the system:
Ensemble Prediction:
Risk-Adjusted Return:
Position Weight:
Value at Risk:
This system represents my approach to quantitative trading: it combines machine learning techniques with established financial theory and aims to generate consistent risk-adjusted returns through independent stock modeling and multi-level ensemble methods.