.. ******************************************************************************
.. * Copyright 2020-2021 Intel Corporation
.. *
.. * Licensed under the Apache License, Version 2.0 (the "License");
.. * you may not use this file except in compliance with the License.
.. * You may obtain a copy of the License at
.. *
.. *     http://www.apache.org/licenses/LICENSE-2.0
.. *
.. * Unless required by applicable law or agreed to in writing, software
.. * distributed under the License is distributed on an "AS IS" BASIS,
.. * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
.. * See the License for the specific language governing permissions and
.. * limitations under the License.
.. *******************************************************************************/

.. _iterative_solver:

Iterative Solver
================

The iterative solver provides an iterative method to minimize an objective function
that can be represented as a sum of functions in composite form

.. math::
    \theta_{*} = \text{argmin}_{\theta \in R^p} K(\theta) = \text{argmin}_{\theta \in R^p} F(\theta) + M(\theta)

where:

- :math:`F(\theta) = \sum_{i=1}^{n} F_i(\theta)`, :math:`\theta \in R^p`, where
  :math:`F_i(\theta): R^p \to R` is a convex, continuously differentiable
  :math:`F_i(\theta) \in C^{l \geq 1}` (smooth) function, :math:`i = 1, \ldots, n`
- :math:`M(\theta): R^p \to R` is a convex, non-differentiable (non-smooth) function

.. rubric:: The Algorithmic Framework of an Iterative Solver

All solvers presented in the library follow a common algorithmic framework.
Let :math:`S_t` be a set of intrinsic parameters of the iterative solver for updating
the argument of the objective function. This set is algorithm-specific and can be empty.
The solver determines the choice of :math:`S_0`.

To do the computations, iterate :math:`t` from :math:`1` until :math:`\text{nIterations}`:

#. Choose a set of indices without replacement :math:`I = \{ i_1, \ldots, i_b \}`,
   :math:`1 \leq i_j \leq n`, :math:`j = 1, \ldots, b`, where :math:`b` is the batch size.

#. Compute the gradient :math:`g(\theta_{t-1}) = \nabla F_I (\theta_{t-1})`,
   where :math:`F_I (\theta_{t-1}) = \sum_{i \in I} F_i (\theta_{t-1})`.

#. Convergence check: stop if
   :math:`\frac {{\| U \|}_d} {\max (1, {\| \theta_{t-1} \|}_d )} < \epsilon`,
   where :math:`U` is an algorithm-specific vector (argument or gradient) and
   :math:`d` is an algorithm-specific power of the Lebesgue space.

#. Compute :math:`\theta_t` using the algorithm-specific transformation :math:`T`
   that updates the function's argument:

   .. math::
       \theta_t = T(\theta_{t-1}, g(\theta_{t-1}), S_{t-1})

#. Update the set of intrinsic parameters: :math:`S_t = U(S_{t-1})`,
   where :math:`U` is an algorithm-specific update of the set of intrinsic parameters.

The result of the solver is the argument :math:`\theta_{.}` and the set of parameters
:math:`S_{.}` after the exit from the loop.

.. note::

    You can resume the computations to get a more precise estimate of the objective
    function minimum. To do this, pass to the algorithm the results :math:`\theta_{.}`
    and :math:`S_{.}` of the previous run of the optimization solver.

    By default, the solver does not return the set of intrinsic parameters.
    If you need it, set the ``optionalResultRequired`` flag for the algorithm.
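To make the framework concrete, below is a minimal Python sketch of the five-step
loop above. It is illustrative only and does not use the library's API:
``grad_fn``, ``batch_size``, and the other names are hypothetical, the
transformation :math:`T` is a plain gradient-descent step, the set of intrinsic
parameters :math:`S_t` is empty, and the convergence check uses
:math:`U = g(\theta_{t-1})` with :math:`d = 2` (the Euclidean norm).

.. code-block:: python

    import numpy as np

    def iterative_solver(grad_fn, theta0, n, batch_size, n_iterations, eps,
                         learning_rate=0.01, rng=None):
        """Minimize F(theta) = sum_i F_i(theta) with mini-batch gradient steps.

        grad_fn(theta, indices) must return the gradient of
        F_I(theta) = sum_{i in I} F_i(theta) over the chosen batch I.
        Hypothetical sketch of the framework, not the library's solver.
        """
        rng = rng or np.random.default_rng()
        theta = np.asarray(theta0, dtype=float)
        for _ in range(n_iterations):
            # Step 1: choose a batch of indices I without replacement.
            batch = rng.choice(n, size=batch_size, replace=False)
            # Step 2: compute the gradient over the batch.
            g = grad_fn(theta, batch)
            # Step 3: convergence check with U = g and d = 2.
            if np.linalg.norm(g) / max(1.0, np.linalg.norm(theta)) < eps:
                break
            # Step 4: algorithm-specific transformation T;
            # here, a plain gradient-descent step.
            theta = theta - learning_rate * g
            # Step 5: update of the intrinsic parameters S_t
            # (empty for this plain variant).
        return theta

    # Example usage: least squares, F_i(theta) = (x_i . theta - y_i)**2,
    # so grad F_I(theta) = 2 * X_I^T (X_I theta - y_I).
    X = np.random.default_rng(0).normal(size=(100, 3))
    y = X @ np.array([1.0, -2.0, 0.5])
    grad = lambda theta, I: 2.0 * X[I].T @ (X[I] @ theta - y[I])
    theta_star = iterative_solver(grad, np.zeros(3), n=100, batch_size=10,
                                  n_iterations=1000, eps=1e-6)

In the solvers listed below, the choice of :math:`T` and :math:`S_t` is what
distinguishes the algorithms: for example, L-BFGS keeps curvature information in
:math:`S_t`, while the adaptive subgradient method accumulates squared gradients.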
.. toctree::
   :maxdepth: 1
   :caption: Iterative solvers

   solvers/computation.rst
   solvers/limited-memory-broyden-fletcher-goldfarb-shanno-algorithm.rst
   solvers/stochastic-gradient-descent-algorithm.rst
   solvers/adaptive-subgradient-method.rst
   solvers/coordinate-descent.rst
   solvers/stochastic-average-gradient-accelerated-method.rst