This PR implements the Ridge regression model as part of the `RustQuant_ml` crate. Ridge regression extends linear regression by adding an L2 regularisation term to the loss function, penalising large coefficient values to reduce overfitting.
This implementation is designed to closely align with Scikit-Learn's `linear_model.Ridge` model. The Ridge regression implementation from Scikit-Learn using the same data as the unit tests in this PR is available here.

Take a feature matrix $X$, a response vector $\textbf{y}$, and a regularisation parameter $\lambda > 0$.
The loss function for a Ridge regression model is:

$$L(\beta) = \lVert \textbf{y} - X \beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2$$
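To make the objective concrete, here is a minimal sketch of evaluating this loss with the `nalgebra` crate. It is illustrative only: the helper name `ridge_loss` and the toy data are hypothetical, not the PR's API.

```rust
use nalgebra::{DMatrix, DVector};

/// Evaluate the Ridge loss ||y - X*beta||^2 + lambda * ||beta||^2.
/// Hypothetical helper for illustration; not part of the PR's code.
fn ridge_loss(x: &DMatrix<f64>, y: &DVector<f64>, beta: &DVector<f64>, lambda: f64) -> f64 {
    let residual = y - x * beta;
    residual.norm_squared() + lambda * beta.norm_squared()
}

fn main() {
    // Toy data: 3 samples, 2 features (the first column acts as an intercept).
    let x = DMatrix::from_row_slice(3, 2, &[1.0, 2.0, 1.0, 3.0, 1.0, 5.0]);
    let y = DVector::from_vec(vec![1.0, 2.0, 4.0]);
    let beta = DVector::from_vec(vec![0.1, 0.7]);
    println!("loss = {}", ridge_loss(&x, &y, &beta, 0.5));
}
```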
The optimal values for $\beta$ have a closed-form solution. The loss function above can be written as

$$L(\beta) = \left( \textbf{y} - X \beta \right)^T \left( \textbf{y} - X \beta \right) + \lambda \beta^T \beta$$
Expanding gives

$$L(\beta) = \textbf{y}^T \textbf{y} - \underbrace{\textbf{y}^T X \beta}_{*} - \beta^T X^T \textbf{y} + \underbrace{\beta^T X^T X \beta + \lambda \beta^T I_{\text{d}} \beta}_{**}$$

where $I_{\text{d}}$ is the $d \times d$ identity matrix, so that $\lambda \beta^T \beta = \lambda \beta^T I_{\text{d}} \beta$.
Note that $*$ is a scalar value, therefore

$$\textbf{y}^T X \beta = \left( \textbf{y}^T X \beta \right)^T = \beta^T X^T \textbf{y}$$

and the two cross terms combine into $-2 \beta^T X^T \textbf{y}$.
We can also combine the terms in $**$ to give:

$$\beta^T \left( X^T X + \lambda I_{\text{d}} \right) \beta$$
Now we can further simplify the loss function:

$$L(\beta) = \textbf{y}^T \textbf{y} - 2 \beta^T X^T \textbf{y} + \beta^T \left( X^T X + \lambda I_{\text{d}} \right) \beta$$
Now calculate the derivative with respect to $\beta$ and set it to $0$ to find the optimal values of $\beta$:

$$\frac{\partial L}{\partial \beta} = -2 X^T \textbf{y} + \underbrace{2 \left( X^T X + \lambda I_{\text{d}} \right) \beta}_{***} = 0$$
Note that $***$ was derived using the fact that

$$\frac{\partial}{\partial \textbf{x}} \left( \textbf{x}^T A \textbf{x} \right) = \left( A + A^T \right) \textbf{x}$$

and if $A$ is symmetric, the above can be simplified to $2 A \textbf{x}$; here $A = X^T X + \lambda I_{\text{d}}$, which is symmetric.
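As a quick sanity check on this matrix-calculus identity, the sketch below compares the analytic gradient $\left( A + A^T \right) \textbf{x}$ against central finite differences of $\textbf{x}^T A \textbf{x}$. Again this uses `nalgebra` and is an illustration under assumed toy values, not code from the PR.

```rust
use nalgebra::{DMatrix, DVector};

fn main() {
    // A small symmetric matrix A and a point x at which to check the identity.
    let a = DMatrix::from_row_slice(2, 2, &[2.0, 1.0, 1.0, 3.0]);
    let x = DVector::from_vec(vec![0.5, -1.0]);

    // Analytic gradient of x^T A x: (A + A^T) x, i.e. 2 A x for symmetric A.
    let analytic = (&a + a.transpose()) * &x;

    // Central finite differences of f(x) = x^T A x.
    let f = |v: &DVector<f64>| (v.transpose() * &a * v)[(0, 0)];
    let h = 1e-6;
    let mut numeric = DVector::zeros(2);
    for i in 0..2 {
        let (mut xp, mut xm) = (x.clone(), x.clone());
        xp[i] += h;
        xm[i] -= h;
        numeric[i] = (f(&xp) - f(&xm)) / (2.0 * h);
    }

    // The two gradients should agree up to finite-difference error.
    println!("analytic: {}", analytic);
    println!("numeric:  {}", numeric);
}
```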
Solving for $\hat{\beta}$:

$$\hat{\beta} = \left( X^T X + \lambda I_{\text{d}} \right)^{-1} X^T \textbf{y}$$
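This closed form translates directly into code. Below is a minimal sketch of the normal-equations fit, assuming `nalgebra` as the linear-algebra backend; `ridge_fit` is a hypothetical name for illustration, not the PR's API. An LU solve is used instead of forming the explicit inverse, which is the usual numerically preferable choice.

```rust
use nalgebra::{DMatrix, DVector};

/// Closed-form Ridge fit: beta_hat = (X^T X + lambda * I_d)^{-1} X^T y.
/// Hypothetical illustration of the formula above, not the PR's code.
fn ridge_fit(x: &DMatrix<f64>, y: &DVector<f64>, lambda: f64) -> Option<DVector<f64>> {
    let d = x.ncols();
    let gram = x.transpose() * x + lambda * DMatrix::identity(d, d);
    let rhs = x.transpose() * y;
    // Solve (X^T X + lambda * I_d) beta = X^T y rather than inverting explicitly.
    gram.lu().solve(&rhs)
}

fn main() {
    // Toy data: 4 samples, 2 features (constant first column as an intercept).
    let x = DMatrix::from_row_slice(4, 2, &[1.0, 1.0, 1.0, 2.0, 1.0, 3.0, 1.0, 4.0]);
    let y = DVector::from_vec(vec![6.0, 5.0, 7.0, 10.0]);
    if let Some(beta_hat) = ridge_fit(&x, &y, 1.0) {
        println!("beta_hat = {}", beta_hat);
    }
}
```

One design note: this sketch treats a constant column like any other feature, so the intercept is penalised; Scikit-Learn's `linear_model.Ridge` by default fits an unpenalised intercept separately, which is worth keeping in mind when comparing outputs.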