Frequency domain Full-Waveform Inversion (FWI) with constraints

Julia script for this example

Seismic full-waveform inversion (FWI) estimates rock properties (acoustic velocity in this example) from seismic signals (pressure) measured by hydrophones. FWI is a partial-differential-equation (PDE) constrained optimization problem where after eliminating the PDE constraint, the simulated data, \(d_\text{predicted}(m) \in \mathbb{C}^M\), are connected nonlinearly to the unknown model parameters, \(m \in \mathbb{R}^N\). We assume that we know the source and receiver locations, as well as the source function. A classic example of an objective for FWI is the nonlinear least-squares misfit \(f(m)=1/2 \| d_\text{obs} - d_\text{predicted}(m) \|_2^2\), which we use for this numerical experiment.

FWI is a problem hampered by local minima. Empirical evidence suggests that we can mitigate issues with parasitic local minima by insisting that all model iterates be elements of the intersection of multiple constraint sets. This means that we add regularization to the objective \(f(m) : \mathbb{R}^N \rightarrow \mathbb{R}\) in the form of multiple constraints—i.e., we have \[ \begin{equation} \min_{m} f(m) \quad \text{s.t.} \quad m \in \mathcal{V} = \bigcap_{i=1}^p \mathcal{V}_i. \label{FWI_prob} \end{equation} \] While many choices exist to solve this constrained optimization problem, we use the spectral projected gradient (SPG) algorithm with a non-monotone line search to solve the above problem. SPG uses information from the current and previous gradient of \(f(m)\) to approximate the action of the Hessian of \(f(m^k)\) with the scalar \(\alpha\): the Barzilai-Borwein step length. At iteration \(k\), SPG updates the model iterate as follows: \[ \begin{equation} m^{k+1} = (1-\gamma) m^k - \gamma \mathcal{P}_{\mathcal{V}} (m^k - \alpha \nabla_{m}f(m^k)), \label{SPG_iter} \end{equation} \] where the non-monotone line search determines \(\gamma \in (0,1]\).

The prior knowledge consists of: (a) minimum and maximum velocities (\(2350 - 2650\) m/s); (b) The anomaly is, but we do not know the size, aspect ratio, or location. The following constraint sets follow from the prior information:

  1. \(\{ x \: | \: \operatorname{card}( (D_z \otimes I_x) x ) \leq n_x \}\)
  2. \(\{ x \: | \: \operatorname{card}( (I_z \otimes D_x) x ) \leq n_z \}\)
  3. \(\{ x \: | \: \operatorname{rank}(x) \leq 3 \}\)
  4. \(\{ x \: | \: 2350 \leq x[i] \leq 2650 \: \forall i\}\)
  5. \(\{ x \: | \: \operatorname{card}( D_x X[i,:] ) \leq 2 \:\: \text{for} \:\: i \in \{1,2,\dots,n_z\} \}\), \(X[i,:]\) is a row of the 2D model
  6. \(\{ x \: | \: \operatorname{card}( D_z X[:,j] ) \leq 2 \:\: \text{for} \:\: j \in \{1,2,\dots,n_x\}\}\), \(X[:,j]\) is a column of the 2D model

We use slightly overestimated rank and matrix cardinality constraints compared to the true model to mimic the more realistic situation that not all prior knowledge was correct. The results in Figure 1 use PARSDMM to compute projections onto the intersection of constraints, and show that an intersection of non-convex constraints and bounds can lead to improved model estimates. Figure 1(e) is the result of working with constraints \([1,2,4]\), Figure 1(f) uses constraints \([1,2,4,5,6]\), and Figure 1(g) uses all constraints \([1,2,3,4,5,6]\). The result with rank constraints and both matrix and row/column-based cardinality constraints on the discrete gradient of the model is the most accurate in terms of the recovered anomaly shape. All results in Figure 1 that work with non-convex sets are at least as accurate as the result obtained with the true TV in terms of anomaly shape. Another important observation is that all non-convex results estimate a lower-than-background velocity anomaly, although not as low as the true anomaly. Contrary, the models obtained using convex sets show incorrect higher-than-background velocity artifacts in the vicinity of the true anomaly location.



Figure1True, initial, and estimated models with various constraint combinations for the full-waveform inversion example. Crosses and circles represent sources and receivers, respectively. All projections inside the spectral projected gradient algorithm are computed using PARSDMM.