Data Preprocessing
ai-lab-projects edited this page Apr 29, 2025 · 1 revision
The preprocessing step prepares historical ETF data for training Deep Q-Network (DQN) agents.
- Historical daily data for the ETF (1655.T) is downloaded using yfinance.
- Only dates where the closing price exceeds 50 JPY are kept to ensure data quality.
- Close and Open prices are extracted separately.
- Close prices are used to calculate moving averages and technical indicators.
- Open prices are used for evaluating real-world buy/sell execution prices.
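The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the page does not say which yfinance call is used, so `Ticker.history(period="max")` is an assumption, and the helper names `download_daily` and `extract_prices` are hypothetical. The small sample frame is made-up data used only to show the filter.

```python
import pandas as pd

MIN_CLOSE_JPY = 50.0  # quality threshold stated in the text


def download_daily(ticker: str = "1655.T") -> pd.DataFrame:
    """Fetch daily bars (assumed call; needs network access and yfinance)."""
    import yfinance as yf  # imported lazily so the helper below works offline

    return yf.Ticker(ticker).history(period="max", interval="1d")


def extract_prices(df: pd.DataFrame, min_close: float = MIN_CLOSE_JPY):
    """Keep rows whose Close exceeds min_close; return (close, open) series."""
    kept = df[df["Close"] > min_close]
    return kept["Close"], kept["Open"]


# Hand-made frame (not real market data): the middle row fails the filter.
sample = pd.DataFrame(
    {"Open": [100.0, 0.0, 102.0], "Close": [101.0, 0.0, 103.0]},
    index=pd.to_datetime(["2024-01-04", "2024-01-05", "2024-01-09"]),
)
close, open_ = extract_prices(sample)
```

With real data, `close, open_ = extract_prices(download_daily())` would yield the two series on the same set of dates, so indicator values and execution prices stay aligned.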
The data is divided into three sets:
- Training Set (60%)
- Validation Set (20%)
- Test Set (20%)
This lets the models be evaluated on data they were not trained on, which gives an estimate of how well they generalize.
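A 60/20/20 split could look like the sketch below. The page gives only the ratios, so the chronological ordering (oldest data for training, newest for testing) is an assumption, and `chronological_split` is a hypothetical helper name.

```python
def chronological_split(values, train_frac=0.6, val_frac=0.2):
    """Split a time-ordered sequence into train/validation/test slices."""
    n = len(values)
    train_end = int(n * train_frac)
    val_end = int(n * (train_frac + val_frac))
    # Slices are contiguous and ordered, so no later sample precedes an earlier one.
    return values[:train_end], values[train_end:val_end], values[val_end:]


prices = list(range(10))  # stand-in for ten days of prices
train, val, test = chronological_split(prices)
```

Keeping the slices contiguous means the test set always lies after the training and validation periods.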
For the seller agent, the following features are calculated:
- Return from Buy Price (%)
- Return from Average Price (%)
- RSI (Relative Strength Index)
- Elapsed Time Since Purchase (logarithmic scaling)
These features form a 4-dimensional input to the selling agent's network.
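The four seller features might be computed as in the sketch below. Several details are assumptions because the page does not specify them: the RSI uses simple averages over a 14-day period, "average price" is taken to be a trailing mean of recent closes, and the logarithmic scaling is `log1p` of the days held. The function names and default windows are hypothetical.

```python
import numpy as np


def rsi(closes: np.ndarray, period: int = 14) -> float:
    """Simple-average RSI over the last `period` price changes (0 to 100)."""
    deltas = np.diff(closes[-(period + 1):])
    gains = deltas[deltas > 0].sum()
    losses = -deltas[deltas < 0].sum()
    if gains + losses == 0:
        return 50.0  # flat window: treat as neutral
    return 100.0 * gains / (gains + losses)


def seller_features(closes, buy_price, days_held, ma_window=20, rsi_period=14):
    """Return the 4-dimensional seller input using closes up to today only."""
    closes = np.asarray(closes, dtype=float)
    price = closes[-1]
    avg_price = closes[-ma_window:].mean()  # assumed meaning of "average price"
    return np.array([
        100.0 * (price / buy_price - 1.0),  # return from buy price (%)
        100.0 * (price / avg_price - 1.0),  # return from average price (%)
        rsi(closes, rsi_period),            # RSI
        np.log1p(days_held),                # elapsed time, log-scaled
    ])


feats = seller_features([100, 102, 101, 105], buy_price=100.0, days_held=3,
                        ma_window=4, rsi_period=3)
```

Every input is drawn from prices at or before the current day, in keeping with the causal-feature requirement.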
For the buyer agent:
- Recent close prices (look-back window) are normalized with one of the following scikit-learn scalers, selected at random:
  - StandardScaler
  - MinMaxScaler
  - RobustScaler
Normalization is intended to improve model convergence during training.
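The random scaler choice could be sketched as follows. The page does not say what data each scaler is fitted on, so fitting on the look-back window itself is an assumption (it keeps the feature causal); the window length of 5 and the name `scale_window` are also hypothetical.

```python
import random

import numpy as np
from sklearn.preprocessing import MinMaxScaler, RobustScaler, StandardScaler

SCALERS = (StandardScaler, MinMaxScaler, RobustScaler)


def scale_window(closes, window=5, rng=random):
    """Normalize the last `window` closes with a randomly chosen scaler."""
    recent = np.asarray(closes[-window:], dtype=float).reshape(-1, 1)
    scaler = rng.choice(SCALERS)()  # pick a scaler class, then instantiate it
    # Fit on the window only, so no price after the current day is used.
    return scaler.fit_transform(recent).ravel(), type(scaler).__name__


rng = random.Random(0)  # seeded only to make the example repeatable
scaled, name = scale_window([100, 101, 103, 102, 105, 107], window=5, rng=rng)
```

Passing a seeded `random.Random` makes a run reproducible, while the default module-level generator gives a fresh choice each call.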
- No future information is leaked into the past (strictly causal features).
- Random factors such as scaling choice are intended to improve model robustness.
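One way to check the no-leakage property is to alter future prices and confirm that earlier feature values do not change. The sketch below is illustrative only: `trailing_mean` stands in for any causal feature, and `is_causal` is a hypothetical helper, not part of the project.

```python
import numpy as np


def trailing_mean(closes, window):
    """Mean of the `window` closes ending at each day (NaN until filled)."""
    closes = np.asarray(closes, dtype=float)
    out = np.full(len(closes), np.nan)
    for t in range(window - 1, len(closes)):
        out[t] = closes[t - window + 1 : t + 1].mean()
    return out


def is_causal(feature_fn, closes, t):
    """True if features up to day t are unchanged when later prices change."""
    closes = np.asarray(closes, dtype=float)
    tampered = closes.copy()
    tampered[t + 1:] += 1000.0  # perturb only the future
    before = feature_fn(closes)[: t + 1]
    after = feature_fn(tampered)[: t + 1]
    return bool(np.allclose(before, after, equal_nan=True))


prices = [100, 101, 103, 102, 105, 107, 106, 110]
causal_ok = is_causal(lambda c: trailing_mean(c, 3), prices, t=4)
# Subtracting the mean of the whole series uses future prices, so it fails.
leaky_ok = is_causal(lambda c: np.asarray(c, dtype=float) - np.mean(c), prices, t=4)
```

Here `causal_ok` is `True` and `leaky_ok` is `False`, since the second feature depends on the full-series mean.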