@yfnaji commented Oct 12, 2025

This PR implements the Ridge regression model as part of the RustQuant_ml crate.

Ridge regression extends linear regression by adding an L2 regularisation term to the loss function, penalising large coefficient values to reduce overfitting.

This implementation is designed to closely align with Scikit-Learn's linear_model.Ridge. A Scikit-Learn script that fits Ridge on the same data used by this PR's unit tests is available here.

Take a feature matrix $X \in \mathbb{R}^{n \times d}$, a response vector $\mathbf{y} \in \mathbb{R}^n$ and a regularisation parameter $\lambda > 0$.

The loss function for a Ridge regression model is:

$$ C:=\lVert \mathbf{y} - X\beta \rVert^2_2 + \lambda\lVert\beta\rVert^2_2 $$
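
For concreteness, here is a minimal sketch of this loss in Rust, assuming the nalgebra crate; the helper name is hypothetical and the PR's actual implementation may differ:

```rust
use nalgebra::{DMatrix, DVector};

/// Ridge loss C = ‖y − Xβ‖²₂ + λ‖β‖²₂.
/// Illustrative helper, not part of this PR's API.
fn ridge_loss(x: &DMatrix<f64>, y: &DVector<f64>, beta: &DVector<f64>, lambda: f64) -> f64 {
    // Residual vector y − Xβ.
    let residual = y - x * beta;
    // Squared L2 norm of the residual plus the L2 penalty on the coefficients.
    residual.norm_squared() + lambda * beta.norm_squared()
}
```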

The optimal values for $\beta$ have a closed-form solution. The loss function above can be written as

$$ \left(\mathbf{y}-X\beta\right)^T\left(\mathbf{y}-X\beta\right) + \lambda\beta^T\beta $$

Expanding gives

$$ \mathbf{y}^T\mathbf{y} -\beta^TX^T\mathbf{y} - \underbrace{\mathbf{y}^TX\beta}_{*}+\underbrace{\beta^TX^TX\beta+\lambda\beta^T I_\text{d}\,\beta}_{**} $$

where $I_{\text{d}}$ is the identity matrix.

Note that * is a scalar value, so it is equal to its own transpose:

$$ \mathbf{y}^TX\beta = \left(\mathbf{y}^TX\beta\right)^T = \beta^TX^T\mathbf{y} $$
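
This can be verified with a quick dimension check:

$$ \underbrace{\mathbf{y}^T}_{1\times n}\,\underbrace{X}_{n\times d}\,\underbrace{\beta}_{d\times 1} \in \mathbb{R}^{1\times 1} $$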

We can also combine the terms in ** to give:

$$ \beta^TX^TX\beta+\lambda\beta^T I_\text{d}\,\beta = \beta^T\left(X^TX + \lambda \cdot I_\text{d} \right)\beta $$

Now we can further simplify the loss function:

$$ \mathbf{y}^T\mathbf{y} -\beta^TX^T\mathbf{y} - \beta^TX^T\mathbf{y}+\beta^T\left(X^TX + \lambda \cdot I_\text{d} \right)\beta $$

$$ \Rightarrow \mathbf{y}^T\mathbf{y} -2\beta^TX^T\mathbf{y} + \beta^T\left(X^TX + \lambda \cdot I_\text{d} \right)\beta $$

Now differentiate with respect to $\beta$ and set the derivative to $0$ to find the optimal coefficients $\hat{\beta}$:

$$ \left.\frac{\partial C}{\partial \beta} \right\vert_{\beta=\hat{\beta}}= -2 X^T \mathbf{y} + \underbrace{2\left(X^TX + \lambda I_{\text{d}}\right)\hat{\beta}}_{***} = 0 $$

Note that *** was derived using the fact that

$$ \frac{\partial}{\partial \mathbf{x}}\left[\mathbf{x}^TA\mathbf{x}\right]=\left(A + A^T\right)\mathbf{x} $$

and if $A$ is symmetric, the above simplifies to $2A\mathbf{x}$.
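
This applies here because the quadratic-form matrix is symmetric:

$$ \left(X^TX + \lambda I_\text{d}\right)^T = \left(X^TX\right)^T + \lambda I_\text{d}^T = X^TX + \lambda I_\text{d} $$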

Solving for $\hat{\beta}$:

$$ \left(X^TX + \lambda I_{\text{d}}\right)\hat{\beta} = X^T \mathbf{y} $$

$$ \Rightarrow \hat{\beta} = \left(X^TX + \lambda I_{\text{d}}\right)^{-1}X^T \mathbf{y} $$
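
For $\lambda > 0$, $X^TX + \lambda I_\text{d}$ is symmetric positive definite, so it is always invertible and the system can be solved with a Cholesky factorisation rather than by forming the inverse explicitly. A minimal sketch in Rust, again assuming nalgebra; the function name and signature are illustrative rather than the crate's actual API:

```rust
use nalgebra::{DMatrix, DVector};

/// Closed-form Ridge fit: solves (XᵀX + λI) β̂ = Xᵀy.
/// Illustrative helper, not the API introduced by this PR.
fn ridge_fit(x: &DMatrix<f64>, y: &DVector<f64>, lambda: f64) -> Option<DVector<f64>> {
    let d = x.ncols();
    // XᵀX + λI is symmetric positive definite for λ > 0, so Cholesky applies.
    let gram = x.transpose() * x + DMatrix::identity(d, d) * lambda;
    let rhs = x.transpose() * y;
    gram.cholesky().map(|chol| chol.solve(&rhs))
}
```

Solving the linear system this way is cheaper and numerically more stable than computing $\left(X^TX + \lambda I_\text{d}\right)^{-1}$ and multiplying it out.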
