Solving Consistent Linear Systems

A system linear of equations–-that is, a linear system–-is specified by a coefficient matrix, $A$, and a constant vector $b.$ A linear system is consistent if there exists at least one vector $x^*$ such that $Ax^* = b$. The set of vectors, $x^*$, satisfying $Ax^* = b$ are called solutions. A linear system can be solved by using a Randomized Linear Solver, which leverages randomization to (hopefully) reduce the amount of computation or resources needed to (approximately) find a solution to a linear system.

Randomized linear solvers come in a number of varieties and under a number of names. In this library, a broad set of randomized linear solvers are encapsulated using a flexible yet uniform interface that is described below. Once a randomized linear solver is specified, it is passed to a solver function (i.e., rsolve or rsolve!) along with the linear system, which then produces an approximate solution to the linear system.

We will begin by specifying the quickest way of solving a linear system by using the default linear solver. We will then show how to pass a user supplied randomized linear solver to rsolve. Then, we will introduce the general interface for specifying randomized linear solvers.

Solving a linear system using defaults

Given a consistent linear system whose coefficient matrix is specified in A and whose constant vector is provided in b, an approximate solution to the system is returned by calling rsolve(A,b). For example, sol is an approximate solution to the generated system in the following script.

using RLinearAlgebra

# Generate a system
A = rand(20, 5);
x = rand(5);
b = A * x;

# Solve the system
sol = rsolve(A, b);

The default solver is hidden in the preceding example, but we can explicitly construct the default solver and pass it to rsolve as shown in the following script.

using RLinearAlgebra

# Generate a system
A = rand(20, 5);
x = rand(5);
b = A * x;

# Specify the default solver and stop it after 200 iterations
solver = RLSSolver(200);

# Solve the system
sol = rsolve(solver, A, b);
Warning

Owing to randomness, the two scripts will produce two different solutions. This is expected as randomized linear solvers do not have determinism. You can attempt to enforce this by specifying a seed if you wish to check that the solutions produced are identical.

In the above example, the randomized linear solver, solver, is given by the constructor RLSSolver with an argument 200, which specifies the total number of iterations allowed before the algorithm must stop. While we have explicitly constructed the solver and passed it to rsolve, we still have hidden many of the components of the randomized solver. In the next section, we discuss these components.

Note

For an in-place solver or to provide an initial iterate, see the documentation for rsolve!.

Components of a randomized linear system solver

A Randomized Linear Solver specifies the algorithm that is used to find an approximate solution to a linear system. In this library, a randomized linear solver is encapsulated by four general and flexible components.

  • A sampler (i.e., sketch, selection) is a way of extracting row or column space information from a given linear system.
  • A routine for generating a solution or updating an iterate given the information generated by the sampler.
  • A logger for recording essential information about the behavior of the randomized linear solver.
  • A stopping condition for determining when to terminate the linear solver (e.g., stopping at one iteration for a sketch-then-solve method).

These four components are the fields of an RLSSolver struct. Specifically, these fields are

Example: A randomized Kaczmarz method

Suppose we have a consistent linear system with coefficient matrix $A$ and constant vector $b$.

Note

The system can be overdetermined, square or underdetermined. The method will work regardless.

To solve this system, we will use a randomized linear solver which

  1. Randomly cycles through the equations of the linear system. That is, it will randomly permute the equations of the system, cycle through this permutation, generate another random permutation of the equations of the system, and repeat this process until a stopping condition is reached.
  2. Generates a sequence of iterates by projecting onto the hyperplane specified by the equation generated by the random cycling.
  3. Records the (Euclidean) residual norms using the entire system at each iteration. For a random algorithm, it makes very little sense to compute the residual norm at each iteration or even at any iteration as this requires accessing the entire original system, which is likely inefficient for realistic problems.
  4. The solver stops once the system's equations have been visited ten times each.

Each of these four components is addressed by the choice of sampler, routine, log, and stop condition. In sum, these four choices will be used to specify a solver.

  1. The sampler that implements the random permutation cycling is implemented as LinSysVecRowRandCyclic.
  2. The routine that implements the projection onto the hyperplane is implemented as LinSysVecRowProjStd.
  3. The log that implements the residual tracking is implemented as LSLogFull.
  4. The stop condition that implements the iteration requirement is implemented as LSStopMaxIterations. We will pass the iteration at which we want the algorithm to stop as an argument.

This example solver can be implemented and used as follows.

using RLinearAlgebra

# Generate a system
A = rand(20, 5);
x = rand(5);
b = A * x;

# Specify solver
solver = RLSSolver(
    LinSysVecRowRandCyclic(),   # Random Cyclic Sampling
    LinSysVecRowProjStd(),      # Hyperplane projection
    LSLogFull(),                # Full Logger: maintains residual history
    LSStopMaxIterations(200),   # Maximum iterations stopping criterion
    nothing                     # System solution (not solved yet)
);

# Solve the system
sol = rsolve(solver, A, b)

Block Methods for Linear Systems

Because of the way that computers operate, it is often more efficient to work using blocks of data rather than single vectors to generate updates to solutions.