
Why We Chose DQN over Traditional ML for Trading Decisions

ai-lab-projects edited this page May 1, 2025 · 1 revision

Motivation

In traditional applications of machine learning to financial data, the process is typically split into two stages:

  1. Prediction – Use ML models (e.g., XGBoost, LSTM) to forecast future prices, returns, or probabilities of price movement.
  2. Decision-making – Apply manually crafted rules to determine whether to buy, sell, or hold based on predictions.

However, we found the second stage problematic. These rule-based decisions often require arbitrary threshold tuning, which introduces subjective biases and is hard to optimize systematically. For example:

  • “Buy if predicted return > 1%”
  • “Sell if probability of drop > 70%”

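Thresholds like these are chosen by hand, so results vary with whoever sets them, and the forecasting model is trained to reduce prediction error rather than to maximize long-term profit.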
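To make the threshold problem concrete, here is a minimal sketch of the second stage. The `Forecast` class, its field names, and the function `rule_based_decision` are hypothetical and exist only for illustration; the threshold values come from the two example rules above.

```python
from dataclasses import dataclass


@dataclass
class Forecast:
    """Hypothetical stage-1 output; field names are illustrative only."""
    expected_return: float  # 0.012 means a predicted return of +1.2%
    prob_drop: float        # predicted probability of a price drop, in [0, 1]


def rule_based_decision(forecast, buy_threshold=0.01, sell_threshold=0.70):
    """Stage 2: map a forecast to an action using hand-picked thresholds."""
    if forecast.prob_drop > sell_threshold:
        return "sell"
    if forecast.expected_return > buy_threshold:
        return "buy"
    return "hold"


forecast = Forecast(expected_return=0.012, prob_drop=0.30)
print(rule_based_decision(forecast))                       # buy
print(rule_based_decision(forecast, buy_threshold=0.015))  # hold
```

The same forecast produces a different action when the buy threshold moves from 1% to 1.5%, and nothing in the forecasting model's training signal indicates which value is better.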

Our Approach: End-to-End Trading with DQN

To address this issue, we adopted a Deep Q-Network (DQN) approach. A DQN takes market data as input and estimates a value for each trading action (buy, sell, or hold); the agent acts on the highest-valued action, so no manually defined decision rules are needed.

Instead of forecasting prices, DQN learns a policy that maximizes the expected cumulative (discounted) reward, which aligns better with the ultimate goal of a trading strategy.
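The two core pieces of this idea can be sketched in a few lines of plain Python. This is an illustration of standard one-step Q-learning, not the project's actual code: the function names, the `ACTIONS` ordering, the discount factor, and the sample Q-values are all assumptions, and a list of numbers stands in for the Q-network's output.

```python
import random

ACTIONS = ("buy", "sell", "hold")  # assumed ordering of the action indices


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, else pick argmax Q."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_values, gamma=0.99, done=False):
    """One-step Q-learning target: r + gamma * max over next-state Q-values."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


# Stand-in for the Q-network's output for one market state (made-up numbers).
q_values = [0.2, -0.1, 0.05]
action = select_action(q_values, epsilon=0.0)  # epsilon=0 means purely greedy
print(ACTIONS[action])  # buy

# The network is trained to move Q(state, action) toward this target.
target = td_target(reward=0.01, next_q_values=[0.3, 0.1, 0.2], gamma=0.9)
print(round(target, 4))  # 0.28
```

Because the target includes the discounted value of the next state, the reward signal propagates backward through time, which is how the agent comes to prefer actions that pay off later rather than only immediately.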

Buyer-Seller Model Separation

We also train separate models for buyer and seller behavior. This specialization helps each model learn the best timing and conditions for its respective task:

  • The buyer model focuses on identifying optimal buy timing and price ranges.
  • The seller model learns when and how to sell most effectively for maximum return.

This separation improves the learning efficiency and decision quality of each model.
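One way the two models could hand control to each other is sketched below. It assumes the buyer acts only while no position is held and the seller acts only while one is open; the page does not specify this, and `run_episode`, `toy_buyer`, and `toy_seller` are hypothetical placeholders for trained networks.

```python
def run_episode(prices, buyer_policy, seller_policy):
    """Alternate control: buyer acts while flat, seller acts while holding."""
    entry_price = None  # None means no open position
    trades = []         # completed (entry_price, exit_price) pairs
    for t, price in enumerate(prices):
        if entry_price is None:
            if buyer_policy(t, price) == "buy":
                entry_price = price
        elif seller_policy(t, price, entry_price) == "sell":
            trades.append((entry_price, price))
            entry_price = None
    return trades


# Toy placeholder policies; a trained buyer or seller DQN would replace these.
def toy_buyer(t, price):
    return "buy" if price < 100 else "wait"


def toy_seller(t, price, entry_price):
    return "sell" if price >= entry_price * 1.02 else "hold"


prices = [101, 99, 100, 101, 102, 98, 103]
print(run_episode(prices, toy_buyer, toy_seller))  # [(99, 101), (98, 103)]
```

Under this arrangement each model faces a two-action problem (buy or wait, sell or hold), and the seller can condition on the entry price, which the buyer never needs to see.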

Summary

| Traditional ML + Rule-Based | DQN-Based End-to-End                |
|-----------------------------|-------------------------------------|
| Predicts future price       | Directly outputs actions            |
| Needs manual rule tuning    | Learns optimal policy via reward    |
| Indirect optimization       | Directly maximizes long-term profit |
| Two-stage pipeline          | Unified, learnable framework        |

By leveraging DQN, we aim for a more elegant and purpose-aligned solution to algorithmic trading.
