Proximal algorithms¶

We consider a convex objective function:

\[f(\theta) = L(\theta) + R(\theta)\]
  • \(L(\theta)\) is convex and differentiable

  • \(R(\theta)\) is convex but not necessarily differentiable

Then the proximal operator has the form:

\[prox_R(y) = \arg \min_z (R(z) + \frac{1}{2}|| z - y ||_2^2)\]
  • \(prox_R(y)\) is the proximal operator

The goal of the proximal operator is to minimize \(R\) while also staying close to \(y\). The key is to be able to compute the proximal operator efficiently for different choices of \(R\).
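
As a sanity check, the definition can be evaluated numerically for small problems. The sketch below (assuming NumPy and SciPy, with an illustrative \(\lambda\) and a derivative-free solver, since \(R\) need not be differentiable) minimizes \(R(z) + \frac{1}{2}||z - y||_2^2\) directly; it is only a crude check, not how proximal operators are computed in practice:

```python
import numpy as np
from scipy.optimize import minimize

def prox_numeric(R, y):
    """Crude numerical evaluation of prox_R(y) = argmin_z R(z) + 0.5 * ||z - y||_2^2."""
    obj = lambda z: R(z) + 0.5 * np.sum((z - y) ** 2)
    # Nelder-Mead is derivative-free, so a non-differentiable R is acceptable for tiny problems
    return minimize(obj, x0=np.zeros_like(y), method="Nelder-Mead").x

lam = 0.5                                   # illustrative regularization strength
R_l1 = lambda z: lam * np.sum(np.abs(z))    # R(z) = lambda * ||z||_1
y = np.array([2.0, -0.3, 1.0])
print(prox_numeric(R_l1, y))                # shrinks y toward zero while staying close to it
```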

Proximal operators¶

L1 norm¶

\[\begin{split} R(\theta) = \lambda || \theta||_1 \\ prox_R(\theta) = \text{soft}(\theta, \lambda) \end{split}\]
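
A minimal sketch of the soft-thresholding operator, applied elementwise; the test vector and \(\lambda\) below are illustrative:

```python
import numpy as np

def soft_threshold(theta, lam):
    """soft(theta, lam)_j = sign(theta_j) * max(|theta_j| - lam, 0)."""
    return np.sign(theta) * np.maximum(np.abs(theta) - lam, 0.0)

# large entries shrink by lam, small entries become exactly 0
print(soft_threshold(np.array([2.0, -0.3, 1.0]), 0.5))
```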

L0 norm¶

\[\begin{split} R(\theta) = \lambda || \theta||_0 \\ prox_R(\theta) = \text{hard}(\theta, \sqrt{2\lambda}) \end{split}\]
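
A corresponding sketch of hard thresholding, which zeroes entries whose magnitude is at most \(\sqrt{2\lambda}\) and leaves the rest unchanged; the values are illustrative:

```python
import numpy as np

def hard_threshold(theta, thresh):
    """hard(theta, thresh)_j = theta_j if |theta_j| > thresh else 0."""
    return np.where(np.abs(theta) > thresh, theta, 0.0)

lam = 0.5
print(hard_threshold(np.array([2.0, -0.3, 1.0]), np.sqrt(2 * lam)))  # keeps only the large entry
```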

Indicator function¶

\[\begin{split} R(\theta) = I_C(\theta) \\ prox_R(\theta) = \argmin_{z \in C} || z - \theta ||_2^2 = proj_C(\theta) \end{split}\]

This is the projection operator. For many convex sets it can be computed in closed form; a code sketch of the three examples below follows the list.

  1. For a rectangular box \(C = \{ \theta : l_j \le \theta_j \le \mu_j \}\):

\[ proj_C(\theta)_j = \begin{cases} l_j & \theta_j \le l_j \\ \theta_j & l_j \le \theta_j \le \mu_j \\ \mu_j & \theta_j \ge \mu_j \end{cases} \]

  2. For the Euclidean ball \(C = \{ \theta: ||\theta||_2 \le 1 \}\):

\[ proj_C(\theta) = \begin{cases} \frac{\theta}{|| \theta||_2} & ||\theta||_2 > 1 \\ \theta & ||\theta||_2 \le 1 \end{cases} \]

  3. For the 1-norm ball \(C = \{ \theta: ||\theta||_1 \le 1 \}\):

\[ proj_C(\theta) = \text{soft}(\theta, \lambda) \]
  • \(\lambda = 0\) if \(||\theta||_1 \le 1\) otherwise \(\lambda\) is the solution to \(\sum_{j=1}^D \max (|\theta_j| - \lambda, 0) = 1\)
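
A sketch of the three projections above; the helper names and the bisection used to solve for \(\lambda\) in the 1-norm case are assumptions, not a specific library API:

```python
import numpy as np

def proj_box(theta, lower, upper):
    """Componentwise clipping onto the box {l_j <= theta_j <= u_j}."""
    return np.clip(theta, lower, upper)

def proj_l2_ball(theta):
    """Rescale onto the unit L2 ball if ||theta||_2 > 1, otherwise keep theta."""
    norm = np.linalg.norm(theta)
    return theta / norm if norm > 1 else theta

def proj_l1_ball(theta):
    """Soft-threshold with lambda chosen (here by bisection) so the result has L1 norm 1."""
    if np.sum(np.abs(theta)) <= 1:
        return theta  # already inside the ball, lambda = 0
    lo, hi = 0.0, np.max(np.abs(theta))
    for _ in range(100):  # bisection on sum_j max(|theta_j| - lam, 0) = 1 (decreasing in lam)
        lam = 0.5 * (lo + hi)
        if np.sum(np.maximum(np.abs(theta) - lam, 0.0)) > 1:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    return np.sign(theta) * np.maximum(np.abs(theta) - lam, 0.0)

theta = np.array([2.0, -0.3, 1.0])
print(proj_box(theta, -1.0, 1.0), proj_l2_ball(theta), proj_l1_ball(theta))
```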

Proximal gradient method¶

We can extend gradient descent to handle the non-smooth term \(R\) by applying the proximal operator at each step. The basic idea is to form a quadratic approximation of the loss function around \(\theta_k\):

\[ \theta_{k+1} = \argmin_z R(z) + L(\theta_k) + g_k^T(z - \theta_k) + \frac{1}{2t_k} ||z - \theta_k||_2^2 \]
  • \(g_k = \nabla L(\theta_k)\)

  • \(\frac{1}{t_k}I \approx \nabla^2 L(\theta_k)\) approximates the Hessian, so this remains a first-order method

We can simplify by dropping the terms that do not depend on \(z\), multiplying by \(t_k\), and completing the square. This yields a proximal operator evaluated at a gradient step:

\[\begin{split} \theta_{k+1} = \argmin_z [t_k R(z) + \frac{1}{2}||z - u_k||_2^2 ] = prox_{t_kR}(u_k) \\ u_k = \theta_k - t_k g_k \\ g_k = \nabla L(\theta_k) \end{split}\]
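
A minimal sketch of the resulting iteration, assuming a fixed step size \(t\), a fixed iteration count, and a user-supplied prox; with the identity prox (i.e. \(R = 0\)) it reduces to plain gradient descent:

```python
import numpy as np

def proximal_gradient(grad_L, prox, theta0, t=0.1, n_iters=200):
    theta = theta0.copy()
    for _ in range(n_iters):
        u = theta - t * grad_L(theta)  # u_k = theta_k - t_k * g_k (gradient step on L)
        theta = prox(u, t)             # theta_{k+1} = prox_{t_k R}(u_k)
    return theta

# With prox(u, t) = u (R = 0) this is plain gradient descent on a small quadratic:
grad_L = lambda theta: theta - np.array([1.0, -2.0])   # L(theta) = 0.5 * ||theta - [1, -2]||^2
print(proximal_gradient(grad_L, lambda u, t: u, np.zeros(2)))  # converges to roughly [1, -2]
```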

Depending on \(R(\theta)\) we get:

  • \(R(\theta) = 0\) we get gradient descent

  • \(R(\theta) = I_C(\theta)\) we get projected gradient

  • \(R(\theta) = \lambda ||\theta||_1\) we get iterative soft thresholding (ISTA), sketched below
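
A worked sketch of the last case: ISTA applied to a small lasso problem with \(L(\theta) = \frac{1}{2}||X\theta - y||_2^2\) and \(R(\theta) = \lambda ||\theta||_1\). The synthetic data, \(\lambda\), step size, and iteration count below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
theta_true = np.zeros(10)
theta_true[:3] = [3.0, -2.0, 1.5]            # sparse ground truth
y = X @ theta_true + 0.1 * rng.normal(size=50)

lam = 1.0
t = 1.0 / np.linalg.norm(X, 2) ** 2          # step size <= 1 / Lipschitz constant of grad L

theta = np.zeros(10)
for _ in range(500):
    u = theta - t * X.T @ (X @ theta - y)                      # gradient step on L
    theta = np.sign(u) * np.maximum(np.abs(u) - t * lam, 0.0)  # prox step: soft(u, t * lam)
print(np.round(theta, 2))                    # most coordinates end up exactly zero
```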