Buyer‐Seller Model Design
ai-lab-projects edited this page Apr 29, 2025 · 1 revision
This project uses a dual-agent architecture: two separate models, a buyer and a seller, each learn their own policy, but their rewards are interdependent. Their coordination lets trading behavior emerge from two distinct perspectives, entry and exit.
- The buying decision and the selling decision are fundamentally different in nature.
- Input features and relevant timing considerations differ.
- Training a single model to output Buy, Sell, and Hold decisions requires it to learn two logically opposing behaviors, which may be inefficient or unstable.
Instead, we train:
- A buyer model that learns when to enter.
- A seller model that learns when to exit.
Each agent’s reward depends on the other's decision:
- The buyer’s reward depends on how well the seller exits.
- The seller’s reward depends on the quality of the entry chosen by the buyer.
This mutual dependence results in a cooperative learning dynamic. Improvements in one agent can lead to improved learning signals for the other.
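One way to picture this coupling is a reward computed from the completed round trip and credited to both roles. The sketch below is illustrative only: the `Trade` class, the function names, and the choice of fractional return as the shared reward are assumptions, not the project's actual reward definition.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    """One completed round trip (hypothetical container for illustration)."""
    entry_price: float  # price at which the buyer entered
    exit_price: float   # price at which the seller exited


def trade_return(trade: Trade) -> float:
    """Fractional return of the round trip."""
    return (trade.exit_price - trade.entry_price) / trade.entry_price


def assign_rewards(trade: Trade) -> dict[str, float]:
    # Assumption: both roles receive the same realized return, so the
    # buyer's reward reflects the seller's exit and the seller's reward
    # reflects the buyer's entry.
    r = trade_return(trade)
    return {"buyer": r, "seller": r}


rewards = assign_rewards(Trade(entry_price=100.0, exit_price=110.0))
print(rewards)
```

Under this assumption, a better exit raises the buyer's reward for the same entry, and a better entry raises the seller's reward for the same exit.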
This design is only partially multi-agent reinforcement learning (RL). It resembles multi-agent RL because there are multiple policies with shared consequences, but there are key differences:
- The buyer and seller do not act simultaneously in a shared environment.
- Their actions are temporally separated (entry and exit).
- The setup is better described as a role decomposition of a single-agent task than as true multi-agent interaction.
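The temporal separation of the two roles can be sketched as a control loop in which only one policy is consulted at each step, depending on whether a position is open. This is a minimal illustration under assumed conventions: a long-only setup with at most one open position, policies modeled as plain callables, and toy rule-based policies standing in for learned models. All names are hypothetical.

```python
from typing import Callable, Sequence

# Hypothetical policy signature: (prices, current step) -> act now?
Policy = Callable[[Sequence[float], int], bool]


def run_episode(
    prices: Sequence[float], buyer: Policy, seller: Policy
) -> list[tuple[int, int]]:
    """Return (entry_step, exit_step) pairs for each completed trade."""
    trades = []
    entry_step = None  # None means no open position
    for t in range(len(prices)):
        if entry_step is None:
            # Flat: only the buyer is consulted.
            if buyer(prices, t):
                entry_step = t
        elif seller(prices, t):
            # Holding: only the seller is consulted, on a later step.
            trades.append((entry_step, t))
            entry_step = None
    return trades


# Toy stand-ins for learned policies (assumptions for illustration).
def buy_on_dip(prices: Sequence[float], t: int) -> bool:
    return t > 0 and prices[t] < prices[t - 1]


def sell_on_rise(prices: Sequence[float], t: int) -> bool:
    return t > 0 and prices[t] > prices[t - 1]


trades = run_episode([10, 9, 11, 12, 8, 9], buy_on_dip, sell_on_rise)
print(trades)
```

Because the buyer and seller are never queried on the same step, the loop behaves like a single agent whose policy switches with the position state.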
We also considered using one agent with three actions:
- Buy
- Sell
- Hold
However:
- It introduces action constraints (e.g., cannot sell without having bought).
- It increases the complexity of learning.
- It lacks the clean separation of roles, which makes the model's behavior harder to interpret.
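The action constraint mentioned for the single-agent alternative is commonly handled with an action mask. The sketch below shows what such a mask might look like, assuming a long-only setup with at most one open position. The `Action` enum and `valid_actions` helper are hypothetical and not part of this project.

```python
from enum import Enum


class Action(Enum):
    BUY = 0
    SELL = 1
    HOLD = 2


def valid_actions(holding: bool) -> set[Action]:
    """Actions a unified agent may take, given its position state."""
    if holding:
        # A position is open: selling is allowed, buying again is not.
        return {Action.SELL, Action.HOLD}
    # No position: the agent cannot sell what it has not bought.
    return {Action.BUY, Action.HOLD}


print(valid_actions(holding=False))
print(valid_actions(holding=True))
```

In the dual-agent design this mask is implicit: the buyer is only consulted while flat and the seller only while holding.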
By contrast, the dual-agent design offers:
- Clear roles for each model's behavior
- Modular components that are easier to debug or adjust
- Specialized learning in each part of the decision process
Directions for future work:
- Empirical comparison with the unified-agent approach
- Exploration of shared representations between buyer and seller
- Potential extensions to market-making or pair-trading scenarios