Scripts for data collection
- yahoo: get US/CN stock data from Yahoo Finance
- fund: get fund data from http://fund.eastmoney.com
- cn_index: get CN index data (CSI300/CSI100) from http://www.csindex.com.cn
- us_index: get US index data (SP500/NASDAQ100/DJIA/SP400) from https://en.wikipedia.org/wiki
- contrib: scripts for some auxiliary functions
Reference implementation: https://github.com/microsoft/qlib/tree/main/scripts/data_collector/yahoo
- Create a dataset code directory in the current directory
- Add collector.py
  - add collector class:

        import sys
        from pathlib import Path

        CUR_DIR = Path(__file__).resolve().parent
        sys.path.append(str(CUR_DIR.parent.parent))
        from data_collector.base import BaseCollector, BaseNormalize, BaseRun

        class UserCollector(BaseCollector):
            ...

  - add normalize class:

        class UserNormalize(BaseNormalize):
            ...

  - add CLI class:

        class Run(BaseRun):
            ...

- Add README.md
- Add requirements.txt

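
The steps above can be scaffolded with a short script. This is a minimal sketch, not part of the repository: the helper name `scaffold_collector` and the dataset name `my_dataset` are hypothetical, and the generated collector.py contains only the class stubs listed above.

```python
from pathlib import Path

# Stub written to collector.py; it mirrors the class skeleton listed above.
COLLECTOR_STUB = '''\
import sys
from pathlib import Path

CUR_DIR = Path(__file__).resolve().parent
sys.path.append(str(CUR_DIR.parent.parent))
from data_collector.base import BaseCollector, BaseNormalize, BaseRun


class UserCollector(BaseCollector):
    ...


class UserNormalize(BaseNormalize):
    ...


class Run(BaseRun):
    ...
'''


def scaffold_collector(base_dir, name):
    """Create <base_dir>/<name> with the three files listed in the steps above.

    Hypothetical helper for illustration; existing files are left untouched.
    """
    target = Path(base_dir) / name
    target.mkdir(parents=True, exist_ok=True)
    files = {
        "collector.py": COLLECTOR_STUB,
        "README.md": f"# {name}\n",
        "requirements.txt": "",
    }
    for filename, content in files.items():
        path = target / filename
        if not path.exists():
            path.write_text(content, encoding="utf-8")
    return target


if __name__ == "__main__":
    print(scaffold_collector(".", "my_dataset"))
```

The stubs still need their methods filled in; the yahoo collector referenced above shows what a complete implementation looks like.
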
| | Basic data |
|---|---|
| Features | Price/Volume:<br>- $close/$open/$low/$high/$volume/$change/$factor |
| Calendar | \<freq>.txt:<br>- day.txt<br>- 1min.txt |
| Instruments | \<market>.txt:<br>- required: all.txt<br>- csi300.txt/csi500.txt/sp500.txt |

- Features: the data must be numeric
  - if the data is not adjusted, factor=1

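
As an illustration of the factor rule, the sketch below converts raw price/volume fields to numbers and sets factor to 1 when the source provides no adjustment factor. The helper name `to_feature_row` and the input layout (one dict of strings per bar) are assumptions made for this example, not part of the collectors.

```python
FEATURE_FIELDS = ("open", "close", "high", "low", "volume")


def to_feature_row(raw):
    """Return a numeric feature row; factor defaults to 1.0 for unadjusted data."""
    row = {field: float(raw[field]) for field in FEATURE_FIELDS}
    # Unadjusted data carries no adjustment factor, so use factor=1.
    row["factor"] = float(raw.get("factor", 1.0))
    return row


raw_bar = {"open": "10.0", "close": "10.5", "high": "10.8", "low": "9.9", "volume": "120000"}
print(to_feature_row(raw_bar))
```
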
For each component to run correctly, the data it depends on must be available:

| Component | Required data |
|---|---|
| Data retrieval | Features, Calendar, Instruments |
| Backtest | Features[Price/Volume], Calendar, Instruments |
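
The dependency table can be expressed as a small check that reports which components have the data they need. This is only a sketch: the helper `runnable_components` is hypothetical, and it assumes that a backtest needs the full Price/Volume field set listed earlier, which the table does not state explicitly.

```python
PRICE_VOLUME_FIELDS = {"close", "open", "low", "high", "volume", "change", "factor"}


def runnable_components(feature_fields, has_calendar, has_instruments):
    """List components whose required data (per the table above) is present."""
    fields = set(feature_fields)
    components = []
    if fields and has_calendar and has_instruments:
        components.append("Data retrieval")
        # Assumption: backtest needs every Price/Volume field.
        if PRICE_VOLUME_FIELDS <= fields:
            components.append("Backtest")
    return components


print(runnable_components(["close", "volume"], has_calendar=True, has_instruments=True))
```
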