# Why We Chose DQN over Traditional ML for Trading Decisions
In traditional applications of machine learning to financial data, the process is typically split into two stages:
- Prediction – Use ML models (e.g., XGBoost, LSTM) to forecast future prices, returns, or probabilities of price movement.
- Decision-making – Apply manually crafted rules to determine whether to buy, sell, or hold based on predictions.
However, we found the second stage problematic: the rule-based decisions typically depend on arbitrary thresholds, which introduce subjective bias and are hard to tune systematically. For example:
- “Buy if predicted return > 1%”
- “Sell if probability of drop > 70%”
This approach lacks consistency and does not directly optimize for long-term profit.
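As a rough illustration, the sketch below shows what such a rule-based decision stage typically looks like. The 1% and 70% thresholds are the example values quoted above, and `predict_return` is a hypothetical stand-in for any forecasting model (e.g., XGBoost or LSTM), not code from our project.

```python
# Illustrative two-stage pipeline: an ML forecast followed by hand-tuned rules.
# `predict_return` is a hypothetical placeholder; the thresholds are the
# example values quoted above.

def predict_return(features):
    """Stand-in for a forecasting model (e.g., XGBoost or LSTM) that would
    return (expected_return, probability_of_drop) for the next period."""
    expected_return, prob_drop = 0.012, 0.15   # dummy values for illustration
    return expected_return, prob_drop

def rule_based_action(expected_return, prob_drop,
                      buy_threshold=0.01, drop_threshold=0.70):
    if expected_return > buy_threshold:   # "Buy if predicted return > 1%"
        return "buy"
    if prob_drop > drop_threshold:        # "Sell if probability of drop > 70%"
        return "sell"
    return "hold"

print(rule_based_action(*predict_return(features=None)))  # -> "buy"
```

The thresholds in `rule_based_action` are exactly the kind of hand-picked constants that must be re-tuned whenever the model or the market regime changes.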
To address this issue, we adopted a Deep Q-Network (DQN) approach. DQN allows the model to directly output trading actions (buy/sell/hold) based on input market data, removing the need for manually defined rules.
Instead of predicting the future, DQN learns a policy that maximizes expected long-term rewards, which aligns better with the ultimate goal of trading strategies.
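A minimal sketch of this idea is given below, assuming a fixed-length vector of market features and a PyTorch Q-network. The layer sizes, feature dimension, and epsilon-greedy exploration are illustrative choices, not the exact configuration of our models.

```python
import random
import torch
import torch.nn as nn

ACTIONS = ["buy", "sell", "hold"]  # the action space; sizes below are illustrative

class QNetwork(nn.Module):
    """Maps a window of market features to one Q-value per action."""
    def __init__(self, n_features: int, n_actions: int = len(ACTIONS)):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64),
            nn.ReLU(),
            nn.Linear(64, 64),
            nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def select_action(q_net: QNetwork, state: torch.Tensor, epsilon: float) -> str:
    """Epsilon-greedy policy: explore with probability epsilon,
    otherwise take the action with the highest predicted Q-value."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    with torch.no_grad():
        q_values = q_net(state.unsqueeze(0))   # shape: (1, n_actions)
    return ACTIONS[int(q_values.argmax(dim=1))]
```

During training, the Q-values are regressed toward the Bellman target `r + γ · max_a' Q(s', a')`, so the network is optimized for cumulative reward over the trading episode rather than for one-step forecast accuracy.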
We also train separate models for buyer and seller behavior. This specialization helps each model learn the best timing and conditions for its respective task:
- The buyer model focuses on identifying optimal buy timing and price ranges.
- The seller model learns when and how to sell most effectively for maximum return.
This separation improves the learning efficiency and decision quality of each model.
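One way to picture the split is as two independent DQN agents that share the same network architecture but are trained with role-specific rewards. The reward functions below are illustrative placeholders, not the exact rewards used in our training.

```python
# Illustrative reward shaping for the two specialized agents
# (placeholder formulas, not the exact rewards used in our training).

def buyer_reward(entry_price: float, later_reference_price: float) -> float:
    """Buyer agent: rewarded for entering below a later reference price,
    i.e. for good buy timing and price range."""
    return (later_reference_price - entry_price) / entry_price

def seller_reward(entry_price: float, exit_price: float) -> float:
    """Seller agent: rewarded for the realized return of the position,
    i.e. for exiting at the right time and price."""
    return (exit_price - entry_price) / entry_price

# Each agent would get its own Q-network (as sketched above) and its own
# replay buffer, and would only see the decisions relevant to its role.
```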
| Traditional ML + Rule-Based | DQN-Based End-to-End |
|---|---|
| Predicts future price | Directly outputs actions |
| Needs manual rule tuning | Learns optimal policy via reward |
| Indirect optimization | Directly maximizes long-term profit |
| Two-stage pipeline | Unified, learnable framework |
By leveraging DQN, we aim for a more elegant and purpose-aligned solution to algorithmic trading.