This issue recommends Qlib, an open-source AI quantitative investment platform from Microsoft. It contains modules for data processing, model training, and backtesting, and covers alpha mining, risk modeling, and portfolio optimization.
Project Description
Qlib is an AI-oriented quantitative investment platform that aims to realize the potential, empower the research, and create the value of AI technology in quantitative investment.
It contains a complete ML pipeline for data processing, model training, and backtesting, and it covers the entire chain of quantitative investing: alpha seeking, risk modeling, portfolio optimization, and order execution.
Using Qlib, users can easily try out ideas for creating better quantitative investment strategies.
Qlib framework
At the module level, Qlib is a platform made up of the components listed below. The components are designed as loosely coupled modules, and each one can be used independently.
Name | Description |
Infrastructure layer | The Infrastructure layer provides the underlying support for Quant research. DataServer provides a high-performance infrastructure for users to manage and retrieve raw data. Trainer provides a flexible interface for controlling how models are trained, which lets algorithms drive the training process. |
Workflow layer | The Workflow layer covers the entire quantitative investment workflow. Information Extractor extracts data for the models. Forecast Model generates prediction signals (e.g. alpha, risk) for other modules. From these signals, Decision Generator produces the target trading decisions (i.e. portfolio, orders) to be executed by Execution Env (i.e. the trading market). There may be multiple levels of Trading Agent and Execution Env: for example, an order-executor trading agent and an intraday order execution environment can together behave like an interday trading environment and be nested inside a daily portfolio management trading agent and an interday trading environment. |
Interface layer | The Interface layer attempts to provide a user-friendly interface to the underlying system. The Analyser module gives users detailed analysis reports on the prediction signals, portfolios, and execution results. |
Quick start
- It is very easy to build a complete Quant research workflow with Qlib and try out your ideas.
- Even with public data and simple models, machine learning techniques work well in practical quantitative investing.
Installation
Note:
- It is recommended to use Conda to manage your Python environment.
- Installing cython under Python 3.6 raises errors when Qlib is installed from source. If you use Python 3.6, it is recommended to upgrade to Python 3.7 or to use conda's Python to install Qlib from source.
- Qlib requires the tables package, and the hdf5 support in tables does not support Python 3.9.
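The version notes above can be turned into a quick pre-install check. This is a minimal sketch that only encodes the two caveats stated in the notes; the function name `qlib_python_notes` is hypothetical and is not part of Qlib.

```python
import sys


def qlib_python_notes(version=None):
    """Return the installation notes above that apply to a Python version.

    `version` is a (major, minor, ...) tuple; it defaults to the running
    interpreter. Hypothetical helper, not part of Qlib.
    """
    major, minor = (version or sys.version_info)[:2]
    notes = []
    if (major, minor) == (3, 6):
        notes.append(
            "Python 3.6: installing cython may fail when building Qlib from "
            "source; upgrade to 3.7 or use conda's Python."
        )
    if (major, minor) == (3, 9):
        notes.append(
            "Python 3.9: the hdf5 support in the tables package is not available."
        )
    return notes


# Print whichever notes apply to the current interpreter (possibly none).
for note in qlib_python_notes():
    print(note)
```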
Users can install Qlib via pip with the following command:
pip install pyqlib
In addition, users can install the latest development version from source code by following these steps:
Before installing from the source, the user needs to install some dependencies:
pip install numpy
pip install --upgrade cython
Clone the repository and install Qlib as follows:
git clone https://github.com/microsoft/qlib.git && cd qlib
pip install .
Note: You can also install Qlib with python setup.py install, but this is not the recommended approach. It skips pip and can cause obscure problems. For example, only pip install . can overwrite a stable version that was installed with pip install pyqlib; python setup.py install cannot.
Data preparation
Load and prepare data by running the following code:
# get 1d data
python scripts/get_data.py qlib_data --target_dir ~/.qlib/qlib_data/cn_data --region cn
# get 1min data
python scripts/get_data.py qlib_data --target_dir ~/.qlib/qlib_data/cn_data_1min --region cn --interval 1min
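Once the daily data has been downloaded, it can be read back from Python. The sketch below assumes Qlib is installed and that the data sits in the `--target_dir` used above. The wrapper name `load_sample_features`, the instrument `SH600000`, the fields, and the date range are illustrative choices, not something this article prescribes.

```python
from pathlib import Path


def load_sample_features(data_dir="~/.qlib/qlib_data/cn_data"):
    """Initialize Qlib on the downloaded daily data and fetch a few fields.

    Hypothetical wrapper for illustration. Qlib is imported lazily so that
    this file can be loaded even when Qlib is not installed.
    """
    import qlib
    from qlib.data import D

    # Point Qlib at the directory used as --target_dir; "cn" matches --region cn.
    qlib.init(provider_uri=str(Path(data_dir).expanduser()), region="cn")

    # Illustrative instrument, fields, and dates (assumptions).
    return D.features(
        ["SH600000"],
        ["$close", "$volume"],
        start_time="2017-01-01",
        end_time="2017-12-31",
        freq="day",
    )


# Usage, once the data has been downloaded:
#   df = load_sample_features()
#   print(df.head())
```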
Automated quantitative research workflow
Qlib provides a tool called qrun that automates the entire workflow (including building datasets, training models, backtesting, and evaluation).
Run qrun with the LightGBM workflow configuration:
cd examples  # avoid running the program from a directory that contains `qlib`
qrun benchmarks/LightGBM/workflow_config_lightgbm_Alpha158.yaml
To run qrun in debug mode, use the following command:
python -m pdb qlib/workflow/cli.py examples/benchmarks/LightGBM/workflow_config_lightgbm_Alpha158.yaml
The results are as follows:
'The following are analysis results of the excess return without cost.'
risk
mean 0.000708
std 0.005626
annualized_return 0.178316
information_ratio 1.996555
max_drawdown -0.081806
'The following are analysis results of the excess return with cost.'
risk
mean 0.000512
std 0.005626
annualized_return 0.128982
information_ratio 1.444287
max_drawdown -0.091078
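To make the figures above easier to read, here is a minimal sketch of how such metrics can be derived from a series of daily excess returns. It assumes 252 trading days per year and simple (non-compounded) accumulation of returns. These assumptions are inferred from the numbers shown (for example, 0.000708 × 252 ≈ 0.1784, close to the reported 0.178316) and are not confirmed as Qlib's exact implementation. The function name `risk_summary` is hypothetical.

```python
import math

TRADING_DAYS = 252  # assumed annualization factor


def risk_summary(daily_returns, periods=TRADING_DAYS):
    """Summarize a non-constant series of daily excess returns."""
    n = len(daily_returns)
    mean = sum(daily_returns) / n
    # Sample standard deviation (n - 1 in the denominator).
    std = math.sqrt(sum((r - mean) ** 2 for r in daily_returns) / (n - 1))

    # Max drawdown on the cumulative sum of returns (no compounding).
    cum, peak, max_dd = 0.0, float("-inf"), 0.0
    for r in daily_returns:
        cum += r
        peak = max(peak, cum)
        max_dd = min(max_dd, cum - peak)

    return {
        "mean": mean,
        "std": std,
        "annualized_return": mean * periods,
        "information_ratio": mean / std * math.sqrt(periods),
        "max_drawdown": max_dd,
    }


# Toy input for illustration only; these are not real backtest returns.
summary = risk_summary([0.01, -0.02, 0.015, 0.005])
for name, value in summary.items():
    print(f"{name:>18} {value: .6f}")
```

Comparing the two tables with this in mind, transaction costs lower the mean daily return while the standard deviation is unchanged at the precision shown, so the annualized return and the information ratio fall by roughly the same proportion.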
Graphical report analysis
Prediction signal (model prediction) analysis:
Cumulative return of groups
Return distribution
Information coefficient (IC)
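The information coefficient is commonly defined as the correlation, across instruments on one date, between the predicted scores and the returns realized afterwards. The rank variant uses Spearman correlation instead of Pearson. The sketch below follows that common definition with the standard library only; it is not taken from Qlib's code, and the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def information_coefficient(predicted, realized):
    """IC: Pearson correlation between predictions and realized returns."""
    return pearson(predicted, realized)


def rank_ic(predicted, realized):
    """Rank IC: Spearman correlation, i.e. Pearson correlation of the ranks."""
    return pearson(ranks(predicted), ranks(realized))


# Toy cross-section of four instruments on one date (made-up numbers).
scores = [0.10, 0.40, 0.20, 0.30]
next_returns = [0.01, 0.03, -0.01, 0.02]
print(information_coefficient(scores, next_returns))
print(rank_ic(scores, next_returns))
```

In a report, this value would be computed for each date and then plotted or averaged over time.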
Portfolio analysis:
Qlib data server performance
The performance of data processing is important for data-driven approaches such as artificial intelligence technologies. Qlib, as an AI-oriented platform, provides solutions for data storage and data processing. To demonstrate the performance of the Qlib data server, we compared it to several other data storage solutions.
- +(-)E indicates with (without) ExpressionCache
- +(-)D indicates with (without) DatasetCache