Numerical methods for Data Assimilation

Solution algorithms

Newton method

  • Applicability depends on the problem size: for problems of reasonable size, each iteration requires the solution of the Newton system $\nabla^2 \phi(x_k)\, s_k = -\nabla \phi(x_k)$, where $\phi$ denotes the cost function, followed by the update $x_{k+1} = x_k + s_k$.

  • It converges locally quadratically to a critical point $x^*$ under some assumptions ($\nabla^2 \phi$ is Lipschitz continuous around $x^*$ and $\nabla^2 \phi(x^*)$ is SPD).

  • It is possible to make the convergence global under mild assumptions with trust-region techniques. A linesearch is also possible.

  • If feasible (size, computational cost) this is a method of choice.

  • Impracticable for the ocean and atmosphere data assimilation problems.
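As an illustration, the Newton iteration can be sketched on a small strictly convex problem; the cost function, starting point, and iteration count below are illustrative choices, not taken from the lecture:

```python
import numpy as np

# Newton iteration x_{k+1} = x_k + s_k, where s_k solves the Newton system
# Hess(x_k) s_k = -grad(x_k), on the strictly convex toy cost
# phi(x) = cosh(x_1) + cosh(x_2) + 0.5 * x^T C x, whose unique minimizer is 0.
C = np.array([[1.0, 0.5],
              [0.5, 1.0]])

def grad(x):
    return np.sinh(x) + C @ x

def hess(x):
    return np.diag(np.cosh(x)) + C      # SPD everywhere for this cost

x = np.array([1.0, -0.5])
for k in range(20):
    s = np.linalg.solve(hess(x), -grad(x))   # solve the Newton system
    x = x + s
print(np.linalg.norm(x))                     # essentially 0
```

For large-scale problems, forming and solving this linear system at each step is exactly what becomes prohibitive.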

Quasi-Newton Method

  • Several variants are available (SR1, PSB, DFP, BFGS, ...).

  • For BFGS, the Hessian matrix is not computed; its inverse is instead approximated by a matrix $H_k$ built from the secant equation: $H_{k+1} y_k = s_k$, with $s_k = x_{k+1} - x_k$ and $y_k = \nabla \phi(x_{k+1}) - \nabla \phi(x_k)$.

  • The matrix $H_{k+1}$ is the closest matrix to $H_k$, in some weighted norm, among the matrices satisfying the secant equation.

  • Updating formula: $H_{k+1} = \left(I - \rho_k s_k y_k^T\right) H_k \left(I - \rho_k y_k s_k^T\right) + \rho_k s_k s_k^T$, with $\rho_k = 1 / (y_k^T s_k)$.

  • Limited-memory implementation: $H_k$ is kept in factored form and only the last $m$ pairs $(s_k, y_k)$ are taken into account. Examples: L-BFGS, M1QN3.
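A minimal numerical check of the inverse BFGS updating formula (random vectors and an identity initial matrix, all illustrative), verifying that the updated matrix satisfies the secant equation:

```python
import numpy as np

# One inverse BFGS update, then a check of the secant equation H_new @ y == s.
rng = np.random.default_rng(0)
n = 5
H = np.eye(n)                       # current inverse Hessian approximation
s = rng.standard_normal(n)          # step x_{k+1} - x_k
y = rng.standard_normal(n)          # gradient difference
if s @ y < 0:                       # enforce the curvature condition y^T s > 0
    y = -y
rho = 1.0 / (y @ s)
I = np.eye(n)
H_new = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
        + rho * np.outer(s, s)
print(np.allclose(H_new @ y, s))    # True: secant equation holds
```

The check is exact by construction: $(I - \rho y s^T) y = 0$, so $H_{k+1} y = \rho s (s^T y) = s$.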

Quasi-Newton method for data assimilation

  • Gradient computation: $\nabla \phi(x) = B^{-1}(x - x_b) + M'^T H'^T R^{-1}\left(H(M(x)) - y\right)$,

  • where $M'$ and $H'$ are the Jacobian matrices of the model $M$ and of the observation operator $H$, evaluated at $x$ and at $M(x)$ respectively.

  • Computing and storing these Jacobian matrices at every step is not feasible in large-scale data assimilation.

The Gauss-Newton algorithm

  • Solution of a sequence of linear least-squares problems $\min_s \left\| J(x_k)\, s + f(x_k) \right\|_2$ and update $x_{k+1} = x_k + s_k$, where $J(x)$ is the Jacobian matrix of the residual function $f$.

  • It can be seen as a stationary iteration $x_{k+1} = G(x_k)$ with $G(x) = x - \left(J(x)^T J(x)\right)^{-1} J(x)^T f(x)$.

  • Does not require second order information. The linear least-squares problem can be solved by a direct or an iterative method (truncated Gauss-Newton method).
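A minimal Gauss-Newton sketch on a small zero-residual fitting problem; the model $a e^{bt}$, the synthetic data, and the starting point are illustrative assumptions:

```python
import numpy as np

# Gauss-Newton for min ||f(x)||^2 with residuals f_i(x) = a*exp(b*t_i) - y_i,
# x = (a, b). The data are synthetic and noise-free (zero residual at the
# solution), so convergence is fast.
t = np.linspace(0.0, 1.0, 8)
a_true, b_true = 2.0, -1.0
y = a_true * np.exp(b_true * t)

x = np.array([1.0, 0.0])                         # starting guess
for k in range(15):
    a, b = x
    r = a * np.exp(b * t) - y                    # residual f(x_k)
    J = np.column_stack([np.exp(b * t),          # d f / d a
                         a * t * np.exp(b * t)]) # d f / d b
    s, *_ = np.linalg.lstsq(J, -r, rcond=None)   # linear least-squares step
    x = x + s
print(x)                                         # close to (2, -1)
```

Here the linear least-squares subproblem is solved directly with `lstsq`; in large-scale settings an iterative inner solver would be used instead (truncated Gauss-Newton).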

Local convergence of stationary iterations

Definition

The process $x_{k+1} = G(x_k)$ is locally convergent to $x^*$ if there exists $\delta > 0$ such that any sequence $(x_k)$ generated by the iteration converges to $x^*$ whenever $\|x_0 - x^*\| \le \delta$. In this case, $x^*$ is called a point of attraction of the iteration.

  • If $x^*$ is a point of attraction of the iteration and if $G$ is continuous at $x^*$, then $G(x^*) = x^*$, that is, $x^*$ is a fixed point of $G$.

  • In the case where $x^*$ is a fixed point of $G$, but not a point of attraction of the iteration, the sequence may behave in a variety of ways:

    it may diverge, cycle, or even exhibit chaotic behaviour.
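A one-line map already exhibits both behaviours: $G(x) = x^2$ has the attracting fixed point $0$ ($G'(0) = 0$) and the repelling fixed point $1$ ($G'(1) = 2$). An illustrative example, not from the lecture:

```python
# Fixed points of G(x) = x^2: x* = 0 is attracting (G'(0) = 0) while
# x* = 1 is repelling (G'(1) = 2). Starting points are illustrative.
def iterate(G, x, n):
    for _ in range(n):
        x = G(x)
    return x

G = lambda x: x * x
near_attracting = iterate(G, 0.9, 30)   # starts near 1, driven to 0
near_repelling = iterate(G, 1.1, 6)     # pushed away from 1 (1.1**64 ~ 446)
print(near_attracting, near_repelling)
```

Any starting point in $(-1, 1)$ is driven to $0$; any starting point with $|x_0| > 1$ is pushed away from the fixed point $1$.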

Ostrowski's sufficient convergence condition

Let $G(x) = Ax + b$, $A \in \mathbb{R}^{n \times n}$, and $b \in \mathbb{R}^n$. We denote by $I$ the identity matrix of order $n$.

It is well known that, if $\rho(A) < 1$, the iteration $x_{k+1} = G(x_k)$ converges to $x^* = (I - A)^{-1} b$ for any $x_0$. In particular, $x^*$ is a point of attraction of the iteration.
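This linear case is easy to verify numerically; the matrix, right-hand side, and starting point below are illustrative choices:

```python
import numpy as np

# Linear stationary iteration x_{k+1} = A x_k + b. Since rho(A) < 1, the
# iterates converge to x* = (I - A)^{-1} b from any starting point.
A = np.array([[0.5, 0.2],
              [0.1, 0.3]])
b = np.array([1.0, 2.0])

rho = max(abs(np.linalg.eigvals(A)))         # spectral radius, about 0.57
x_star = np.linalg.solve(np.eye(2) - A, b)   # the fixed point

x = np.array([100.0, -50.0])                 # arbitrary x0, far from x*
for _ in range(200):
    x = A @ x + b
print(rho, np.linalg.norm(x - x_star))
```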

Generalization of this result to the case where $G$ is no longer a linear mapping:

Fundamental Theorem

Suppose that $G$ has a fixed point $x^*$, that $G$ is Fréchet differentiable at $x^*$, and that $\rho(G'(x^*)) < 1$.

Then $x^*$ is a point of attraction of the iteration $x_{k+1} = G(x_k)$.

Asymptotic speed of convergence

Let $x^*$ be a fixed point of $G$. Let ${\cal C}(x^*)$ be the set of all sequences generated by the iteration $x_{k+1} = G(x_k)$ that converge to $x^*$.

Fundamental Theorem

Suppose $G$ and $x^*$ satisfy the assumptions of the previous theorem. For any sequence $(x_k)$ of ${\cal C}(x^*)$, one has

$\limsup_{k \to \infty} \|x_k - x^*\|^{1/k} \le \rho(G'(x^*)).$

Moreover, for any $\epsilon > 0$, there exists a norm $\|\cdot\|_\epsilon$ such that

$\|x_{k+1} - x^*\|_\epsilon \le \left(\rho(G'(x^*)) + \epsilon\right) \|x_k - x^*\|_\epsilon$

for $k$ large enough.

Remarks on Ostrowski's Theorem

  1. The quantity $\limsup_{k \to \infty} \|x_k - x^*\|^{1/k}$ is referred to as the root convergence factor of the sequence $(x_k)$,

    converging to $x^*$. It can be shown (Ortega, Rheinboldt) that the root convergence factor does not depend on the norm chosen on $\mathbb{R}^n$.

  2. The theorem shows that, if $\rho(G'(x^*)) < 1$, for any sequence generated by the iteration

    that converges to $x^*$ and for any $\gamma$ such that $\rho(G'(x^*)) < \gamma < 1$, there exists a constant $C > 0$ such that

    $\|x_k - x^*\| \le C\, \gamma^k.$

    Therefore, the convergence of $(x_k)$ is as rapid as a geometric progression of rate $\gamma$.

    Using the norm $\|\cdot\|_\epsilon$ of the theorem, there exists $\gamma < 1$ such that

    $\|x_{k+1} - x^*\|_\epsilon \le \gamma\, \|x_k - x^*\|_\epsilon$ for $k$ large enough,

    that is, $(x_k)$ converges linearly to $x^*$.

  3. By the theorem, for $k$ large enough, the smaller $\rho(G'(x^*))$, the faster the error $\|x_k - x^*\|$ decreases as $k$ increases.

    Therefore, if $G$ and $x^*$ satisfy the assumptions of the theorem, the quantity

    $\rho(G'(x^*))$

    can be used to measure the asymptotic speed of convergence of the stationary iteration $x_{k+1} = G(x_k)$.
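This asymptotic rate can be observed numerically on a simple nonlinear scalar iteration; the map and starting point are illustrative assumptions:

```python
import math

# Stationary iteration with G(x) = 0.5*sin(x): fixed point x* = 0 and
# G'(0) = 0.5, so successive error ratios approach rho = |G'(x*)| = 0.5.
x = 0.5                       # illustrative starting point
errs = []
for _ in range(40):
    x = 0.5 * math.sin(x)
    errs.append(abs(x))       # error |x_k - x*| with x* = 0
ratio = errs[-1] / errs[-2]
print(ratio)                  # close to 0.5
```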

Local convergence and linear rate of convergence

We denote by $\nabla^2 f_j(x)$ the Hessian matrix of the component function $f_j$ at $x$.

Let $\phi(x) = \frac{1}{2} \|f(x)\|_2^2$, with Jacobian matrix $J(x)$.

We define $G$ by

$G(x) = x - \left(J(x)^T J(x)\right)^{-1} J(x)^T f(x),$

so that the Gauss-Newton iteration reads $x_{k+1} = G(x_k)$.

Obviously, $x^*$ is a fixed point of $G$ if $J(x^*)^T f(x^*) = 0$.

The following theorem of local convergence holds:

Fundamental Theorem

Suppose that $x^*$ is a fixed point of $G$, that $f$ is twice continuously Fréchet differentiable in a neighborhood of $x^*$,

and that $J(x^*)$ has full rank $n$. If $\rho(G'(x^*)) < 1$, then the Gauss-Newton iteration converges locally to $x^*$.

Proof

A direct computation of $G'(x^*)$, using $J(x^*)^T f(x^*) = 0$, reveals that

$G'(x^*) = -\left(J(x^*)^T J(x^*)\right)^{-1} \sum_{j=1}^{m} f_j(x^*)\, \nabla^2 f_j(x^*).$

The result then follows from Ostrowski's theorem.

Let us denote by ${\cal S}$ the surface in $\mathbb{R}^m$ given by the parametric representation $z = f(x)$, $x \in \mathbb{R}^n$.

The fixed point condition $J(x^*)^T f(x^*) = 0$ has the geometrical interpretation that follows.

Let $P$ be the point of ${\cal S}$ of coordinates $f(x^*)$ and $O$ the origin of the coordinate system.

The vector $\overrightarrow{PO} = -f(x^*)$ is orthogonal to the plane tangent through $P$ to the surface ${\cal S}$.

Fundamental Theorem

Suppose that the assumptions of the previous theorem hold and $f(x^*)$ is nonzero.

One has

$\rho(G'(x^*)) = \|f(x^*)\|_2\, \chi,$

where $\chi$ is the maximal principal curvature of the surface ${\cal{S}}$ at point $P$ with respect to the normal direction $-f(x^*)/\|f(x^*)\|_2$.

Proof

We consider a curve through $P$ on the surface ${\cal S}$, given by the parametric representation $t \mapsto z(t) = f(x(t))$, and let $t^*$ be such that $x(t^*) = x^*$. The intrinsic (arc-length) parameter $s$ on the curve satisfies $\frac{ds}{dt} = \left\| J(x(t))\, \dot{x}(t) \right\|_2$,

where $\dot{x} = dx/dt$. The curvature vector of the curve at $P$ is by definition $\frac{d^2 z}{ds^2}$. Using the chain rule yields $\frac{dz}{ds} = J(x)\, \dot{x}\, \frac{dt}{ds}$.

Again, the chain rule enables one to obtain $\frac{d^2 z}{ds^2}$ in terms of $J(x)$, $\dot{x}$, $\frac{dt}{ds}$ and the Hessian matrices $\nabla^2 f_j(x)$.

The maximal principal curvature $\chi$ of the surface at point $P$ with respect to the normal direction $n = -f(x^*)/\|f(x^*)\|_2$ is by definition the supremum, over all the curves on ${\cal S}$ passing through $P$, of the scalar product $n^T \frac{d^2 z}{ds^2}$.

Since $x^*$ is a fixed point of the iterations, $J(x^*)^T f(x^*) = 0$; therefore the contribution of the tangential terms to this scalar product vanishes,

and matrix manipulations show that the supremum sought is equal to $\rho(G'(x^*)) / \|f(x^*)\|_2$, which proves the result.

Local convergence properties of the Gauss-Newton algorithm

  • The algorithm is not globally convergent. It is not even guaranteed to be locally convergent: for some problems, a fixed point may be a repelling fixed point.

    • Consider $f(x) = \left(x + 1,\ \lambda x^2 + x - 1\right)^T$, $\lambda \in \mathbb{R}$.

    • The unique local minimum is $x^* = 0$.

    • It is possible to show that $G'(0) = \lambda$.

    • Therefore, if $|\lambda| > 1$, $x^* = 0$ is a repelling fixed point.
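A classical residual exhibiting a repelling Gauss-Newton fixed point is $f(x) = (x + 1,\ \lambda x^2 + x - 1)^T$, with fixed point $x^* = 0$ and $G'(0) = \lambda$ (an assumption here, standard in the least-squares literature); the dichotomy can be checked numerically, with illustrative starting points and iteration counts:

```python
import numpy as np

# Scalar Gauss-Newton x_{k+1} = x_k - (J^T f)/(J^T J) for the residual
# f(x) = (x + 1, lam*x**2 + x - 1): fixed point x* = 0 with G'(0) = lam,
# attracting if |lam| < 1, repelling if |lam| > 1.
def gn_iterates(lam, x0, n):
    x, out = x0, []
    for _ in range(n):
        f = np.array([x + 1.0, lam * x**2 + x - 1.0])
        J = np.array([1.0, 2.0 * lam * x + 1.0])   # Jacobian (one column)
        x = x - (J @ f) / (J @ J)                  # Gauss-Newton step
        out.append(x)
    return out

attracted = gn_iterates(lam=-0.5, x0=0.1, n=30)[-1]  # |G'(0)| = 0.5 < 1
repelled = gn_iterates(lam=-2.0, x0=0.01, n=10)      # |G'(0)| = 2 > 1
print(abs(attracted), max(abs(v) for v in repelled))
```

With $\lambda = -0.5$ the iterates contract toward $0$; with $\lambda = -2$ the iterates are pushed away from the fixed point even when started very close to it.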

Example of nonlocal convergence of Gauss-Newton

The incremental 4D Var algorithm (Gauss-Newton)

  • Take $x^{(0)} = x_b$.

  • For $j = 0, 1, \dots$, solve iteratively the quadratic minimization problem

    $\min_{\delta x}\ \frac{1}{2} \left\| \delta x - (x_b - x^{(j)}) \right\|_{B^{-1}}^2 + \frac{1}{2} \sum_i \left\| H_i M_i\, \delta x - d_i^{(j)} \right\|_{R_i^{-1}}^2,$

    with $H_i$ and $M_i$ the linearized observation operator and model.

  • Update $x^{(j+1)} = x^{(j)} + \delta x^{(j)}$.

  • End for

  • where we set $d_i^{(j)} = y_i - {\cal H}_i\!\left({\cal M}_i(x^{(j)})\right)$ (the innovation vectors).
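The outer/inner loop structure can be sketched end-to-end on a toy scalar problem; the model $M$, the identity observation operator, the variances and all values below are illustrative assumptions, and the inner quadratic problem is scalar, so it is solved exactly in one step:

```python
import numpy as np

# Toy incremental 4D-Var (Gauss-Newton) with a scalar state.

def M(x):                         # one nonlinear model step (hypothetical)
    return x + 0.1 * np.sin(x)

def M_tl(x):                      # tangent-linear (Jacobian) of one step
    return 1.0 + 0.1 * np.cos(x)

def trajectory(x0, n):
    xs = [x0]
    for _ in range(n):
        xs.append(M(xs[-1]))
    return xs

n_steps = 4
x_true = 1.0
y = trajectory(x_true, n_steps)[1:]      # noise-free observations of x_1..x_n
x_b, B, R = 0.5, 1.0, 0.01               # background and error variances

x = x_b                                  # outer-loop state x^(0) = x_b
for outer in range(10):
    xs = trajectory(x, n_steps)
    d, h = [], []                        # innovations and TL sensitivities
    jac = 1.0
    for i in range(n_steps):
        jac *= M_tl(xs[i])               # Jacobian of M_{0->i+1} at x
        d.append(y[i] - xs[i + 1])       # innovation d_i = y_i - H(M_i(x))
        h.append(jac)
    # inner loop: minimize the quadratic cost in dx (exact scalar solve)
    grad = (x - x_b) / B - sum(hi * di / R for hi, di in zip(h, d))
    hess = 1.0 / B + sum(hi * hi / R for hi in h)
    x = x - grad / hess                  # update x^(j+1) = x^(j) + dx
print(abs(x - x_true))
```

With noise-free observations and a small observation variance, the outer loop drives the analysis close to the true state after a few Gauss-Newton iterations.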

S. Gratton and Ph. Toint, submitted to Open Learn. Res. Ed. INPT 0502 (2013). Attribution - Share Alike.