
Official Journal of the Asia Oceania Geosciences Society (AOGS)

Simulating 2-D magnetotelluric responses using vector-quantized temporal associative memory artificial neural network-based approaches


In this research, we explore the application of artificial neural networks, specifically the vector-quantized temporal associative memory (VQTAM) and VQTAM coupled with locally linear embedding (VQTAM-LLE) techniques, for simulating 2-D magnetotelluric forward modeling. The study introduces the concepts of VQTAM and VQTAM-LLE in the context of simulating 2-D magnetotelluric responses, outlining their underlying principles. We rigorously evaluate the accuracy and efficiency of both VQTAM variants through extensive numerical experiments conducted on diverse benchmark resistivity and real-terrain models. The results demonstrate the remarkable capability of VQTAM and VQTAM-LLE in accurately and efficiently predicting apparent resistivity and impedance phases, surpassing the performance of traditional numerical methods. This study underscores the potential of VQTAM and VQTAM-LLE as valuable computational alternatives for simulating magnetotelluric responses, offering a viable choice alongside conventional methods.


In the traditional approach, magnetotelluric (MT) modeling is described using boundary-value problems (BVP) of partial differential equations (PDEs). Approximate solutions are derived through classical numerical methods, including the finite difference method (FDM) (Mackie et al. 1994; Siripunvaraporn et al. 2002; Tan et al. 2003; Tong et al. 2018), finite element method (FEM) (Wannamaker et al. 1987; Mogi 1996; Franke et al. 2007; Sarakorn 2017; Sarakorn and Vachiratienchai 2018; da Conceição Batista and Sampaio 2019), finite volume method (FVM) (Jahandari and Farquharson 2015; Du et al. 2016; Han et al. 2018; Wang et al. 2019; Khampichit and Sarakorn 2021), boundary element method (BEM) (Xu and Zhou 1997; Ren et al. 2014; Wittke and Tezkan 2014; Yang et al. 2019), hybrid mixed finite element domain decomposition method (HMFEDDM) (Zyserman et al. 1999), and hybrid finite difference-finite element method (HBFDFEM) (Sarakorn and Vachiratienchai 2018). Each method has its own set of advantages and disadvantages. FDM is known for its speed and accuracy in handling simple models, while FEM, although slower, provides superior accuracy for more realistic models. Similarly, FVM and BEM offer comparable accuracy but require more computational resources. HBFDFEM combines the strengths of both FDM and FEM, but it is slower than FDM.

In recent years, artificial neural networks (ANN), computational models inspired by the human nervous system, have witnessed significant advances in solving, simulating, and optimizing problems across diverse fields, and their applications to electromagnetic problems have been notable. For instance, in 2003, ANN were utilized to analyze magnetotelluric time-series data (Manoj and Nagarajan 2003). Subsequently, during 2005–2006, ANN employing different learning paradigms were applied to determine subsurface layer structures from magnetotelluric data (Khalil et al. 2006) and to invert geo-electrical resistivity sounding data (Singh et al. 2005). The development of convolutional neural networks (CNN) led to their use in MT inversion (Puzyrev 2019; Puzyrev and Swidinsky 2021). Furthermore, in 2021, ANN with multitask learning were employed in 2-D MT forward modeling to predict apparent resistivity and phase (Shan et al. 2021). Recent work in 2023 introduced deep-learning MT forward modeling and its application in inversion (Deng et al. 2023). Despite these advances, improving the accuracy and efficiency of these ANN models remains challenging.

Among these ANN techniques, the vector-quantized temporal associative memory (VQTAM) method (Barreto and Araujo 2004) is one of the most efficient supervised neural network models. It was developed based on the self-organizing map (SOM) architecture (Kohonen 2013), an unsupervised clustering method that allows VQTAM to perform both recognition and clustering simultaneously. Originally, SOM was an unsupervised neural algorithm utilizing competitive learning to represent spatial neighborhood relationships within unlabeled datasets, designed to uncover topological structures within multidimensional data spaces. SOM has found applications in diverse fields, such as pattern classification (Chetchotsak et al. 2015), machine vision (Wongsriworaphon et al. 2015), image compression (Arnonkijpanich et al. 2011), data visualization (Arnonkijpanich et al. 2010), and even kinematic modeling of parallel manipulators (Limtrakul and Arnonkijpanich 2019). Subsequently, an improved version of VQTAM, known as vector-quantized temporal associative memory with locally linear embedding (VQTAM-LLE) technique, was introduced to enhance its accuracy (Wongsriworaphon et al. 2015; Limtrakul and Arnonkijpanich 2019).

This study employed the VQTAM and VQTAM-LLE techniques alongside the traditional FE method to simulate 2-D magnetotelluric responses, specifically apparent resistivity and phases. In addition, we demonstrated the potential of our technique to generate additional reliable data for restricted cases where an MT survey cannot be conducted due to limitations such as rugged topographic zones, as part of the pre-inversion process. Various benchmark resistivity models and real-terrain models were selected. The training dataset was generated using the finite element (FE) approach (Sarakorn 2017), and another dataset of responses was estimated using these two neural networks. The accuracy and efficiency of both VQTAM and VQTAM-LLE were thoroughly examined, analyzed, and discussed.

2-D magnetotelluric modeling

Traditionally, MT modeling describes natural electromagnetic (EM) fields interacting with the Earth. The EM fields act like a plane wave with harmonic diffusion, displacement currents are neglected, and the time dependence is assumed to be \(e^{-i\omega t}\), where \(\omega\) is the angular frequency. When the strike is aligned with the x-direction, the electrical conductivity \(\sigma\) varies only in the y- and z-directions, i.e., \(\sigma = \sigma (y, z)\), and the boundary-value problem of the second-order partial differential equation for the 2-D MT problem is expressed by

$$\begin{aligned} \frac{\partial }{\partial y}\left( \mu ^{-1}\alpha \frac{\partial G_x}{\partial y} \right) + \frac{\partial }{\partial z}\left( \mu ^{-1}\alpha \frac{\partial G_x}{\partial z} \right) + \beta G_x&= 0. \end{aligned}$$

The notations \(\alpha\), \(\beta\), and G for the two polarizations, E- and H-polarization, are defined as follows:

$$\begin{aligned}&\textbf{E}\text { -Polarization }: \quad \alpha = 1, \quad \beta = i \omega \sigma , \quad G=E_x, \end{aligned}$$
$$\begin{aligned}&\textbf{H}\text { -Polarization }: \quad \alpha = \frac{1}{\sigma }, \quad \beta = i \omega , \quad G=H_x, \end{aligned}$$

where \(E_{x}\) and \(H_{x}\) are the strike aligned electric and magnetic fields, respectively, \(\mu\) is the magnetic permeability in free space and \(\mu =\mu _{0}=4 \pi \times 10^{-7}(V s / A m)\). The computational domain for the 2-D case is considered as a region \(\Omega = \Omega _1 \cup \Omega _2 \cup \Gamma \cup \Gamma ^{int}\subset \text{ R}^2\).

As shown in Fig. 1, the subregion \(\Omega _1\) is the air layer, the subregion \(\Omega _2\) is the Earth layer, \(\Gamma\) is the outer boundary, and \(\Gamma ^{int}\) is the air–earth interface. Across \(\Gamma ^{int}\), the electrical conductivity \(\sigma\) (or the electrical resistivity \(\rho\)) is discontinuous. The partial differential equation (1) is subject to the Dirichlet boundary conditions:

$$\begin{aligned} G=G_{0}(y, z) \text{ on } \Gamma , \end{aligned}$$

where \(G_{0}(y, z)\) is obtained analytically or numerically from the 1-D MT forward problem. Equation (1) can be solved analytically when the model structure is simple; otherwise, solving (1) with numerical techniques is more suitable (Wannamaker et al. 1987; Key and Weiss 2006; Franke et al. 2007; Sarakorn 2017; Tong et al. 2018; Sarakorn and Vachiratienchai 2018; Khampichit and Sarakorn 2021). Once the electric and magnetic fields are obtained for each period, the impedances, apparent resistivities, and phases at each sample station are calculated by (5)-(7), respectively,

$$\begin{aligned} Z_{ij}&=\frac{E_{i}}{H_{j}},\; Z_{ii}=Z_{jj}=0, \end{aligned}$$
$$\begin{aligned} \rho _{ij}^a&=\frac{1}{\omega \mu }|Z_{ij}|^2,\end{aligned}$$
$$\begin{aligned} \phi _{ij}&=\text {arg}(Z_{ij}), \end{aligned}$$

where \(i,j=x,y\).
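To make Eqs. (5)-(7) concrete, the sketch below (an illustrative Python example with our own function names, not the paper's FE code) evaluates the surface impedance of a 1-D layered earth, the kind of solution used for the boundary values \(G_0\), and converts it to apparent resistivity and phase. For a homogeneous half-space the apparent resistivity recovers the true resistivity and the phase is \(45^{\circ}\).

```python
import numpy as np

MU_0 = 4e-7 * np.pi  # magnetic permeability of free space (V s / A m)

def impedance_1d(rho, h, omega):
    """Surface impedance of a 1-D layered earth via the standard
    layer recursion; such 1-D solutions supply the boundary values G_0.
    rho: layer resistivities top to bottom (last entry = half-space),
    h:   thicknesses of the layers above the half-space (meters)."""
    Z = np.sqrt(1j * omega * MU_0 * rho[-1])        # bottom half-space
    for j in range(len(h) - 1, -1, -1):             # propagate upward
        Z0 = np.sqrt(1j * omega * MU_0 * rho[j])    # intrinsic impedance of layer j
        t = np.tanh(1j * omega * MU_0 * h[j] / Z0)  # tanh(k_j h_j), k_j = i*omega*mu/Z0
        Z = Z0 * (Z + Z0 * t) / (Z0 + Z * t)
    return Z

def apparent_resistivity_phase(Z, omega):
    """Eqs. (6)-(7): apparent resistivity and impedance phase (degrees)."""
    rho_a = np.abs(Z) ** 2 / (omega * MU_0)
    phi = np.degrees(np.angle(Z))
    return rho_a, phi
```

For example, a 100 \(\Omega\)m half-space at any period yields \(\rho^a = 100\) \(\Omega\)m and \(\phi = 45^{\circ}\), which is a convenient sanity check for any forward-modeling code.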

The VQTAM and VQTAM-LLE techniques

VQTAM as SOM-based neural network

The self-organizing map (SOM) is a neural network based on unsupervised learning. SOM is categorized as an unsupervised vector quantization technique with topology preservation among the neurons. A SOM consists of neurons arranged on a regular two-dimensional grid \({\mathcal {A}}\). Each neuron i on the grid corresponds to a prototype vector in the input space, i.e., \(w_i^{in} \in {\mathbb {R}}^n\); this feature later serves to connect the input and output modules. At each prototype adjustment during SOM training, the prototype of the winner neuron \(i^*\) and those of its neighboring neurons are updated to approach the fed input sample. An elastic net formed by the links between prototypes thereby unfolds and spans the input space. In this way, SOM can represent data faithfully using a set of prototypes, where the learned prototypes act as centers of subclusters in the input space; SOM is therefore well suited to data clustering. Interestingly, the prototype distribution still preserves neighborhood-based topology and ordering on the 2D neural grid. In addition, SOM also transforms n-dimensional patterns in the input space into a two-dimensional array of neurons. VQTAM was proposed to extend SOM to mapping patterns from the input space into an output space. The output space must be included as a third module in the structure of the classical SOM. Thereby, a weight vector in the output space, \(w_i^{out} \in {\mathbb {R}}^z\), is also attached to each \(w_i^{in}\) belonging to neuron i. The underlying idea of VQTAM is that, during training on both spaces, the SOM algorithm adjusts the pairs of prototypes \(\left\{ w_i^{in}, w_i^{out}\right\}\) so as to represent the data manifold in both the input and output spaces.
Note that a prototype vector \(w_i^{in}\) corresponds to \(w_i^{out}\), just as a sample in the input space \(x^{in}\) corresponds to its target in the output space \(x^{out}\). A mapping from \(w_i^{in}\) to \(w_i^{out}\) can thus be regarded as a forward mapping from the input space to the output space. Note that this mapping explicitly acts as an approximator/predictor, much like a supervised ANN based on the multilayer perceptron (MLP) (Wongsriworaphon et al. 2015).

Because MT modeling can be considered a nonlinear system identification problem, neural networks for the nonlinear MT system can be constructed by training on input/output data of the system. In this work, the input vector \(x^{in}\) contains values from periods and stations, while the output vector \(x^{out}\) consists of apparent resistivity and phases. VQTAM is used to map patterns in the input space to the desired outputs. The training vector \(\textrm{x}\) fed into the VQTAM and the module of prototypes \(\textrm{w}_{i}\) are defined as follows:

$$\begin{aligned} \textrm{x}= \left( \begin{array}{c} x^{in} \\ x^{out} \end{array}\right) \quad \text {and} \quad \textrm{w}_i=\left( \begin{array}{c} w_i^{in} \\ w_i^{out} \end{array}\right) . \end{aligned}$$

During the training process, the winning neuron \(i^*\) is determined using only the portion corresponding to \(x^{in}\):

$$\begin{aligned} i^*=\underset{i \in {\mathcal {A}}}{{\text {argmin}}}\left\{ \left\| x^{i n}-w_i^{i n}\right\| \right\} . \end{aligned}$$

Note that each neuron i is arranged on a 2D lattice structure \({\mathcal {A}}\) such that the prototype (weight) vectors in the input and output spaces, i.e., \(w_i^{in}\) and \(w_i^{out}\), are associated with each neuron i. The learning rules are then

$$\begin{aligned}&\Delta w_i^{in}=a(m) h\left( i^*, i ; m\right) \left[ x^{in}-w_i^{in}\right] , \end{aligned}$$
$$\begin{aligned}&\Delta w_i^{out}=a(m) h\left( i^*, i ; m\right) \left[ x^{out}-w_i^{out}\right] . \end{aligned}$$

Thus, the weight vectors on both input and output spaces are updated by

$$\begin{aligned}&w_i^{in} \Longleftarrow w_i^{in}+a(m) h\left( i^*, i ; m\right) \left[ x^{i n}-w_i^{in}\right] , \end{aligned}$$
$$\begin{aligned}&w_i^{out}\Longleftarrow w_i^{out}+a(m) h\left( i^*, i ; m\right) \left[ x^{out}-w_i^{out}\right] , \end{aligned}$$

where \(0<a(m)<1\) is the learning rate in epoch m, and \(h\left( i^*, i; m\right) =\exp \left( -\frac{\left\| r_i-r_{i^*}\right\| ^2}{2 b^2(m)}\right)\) is a Gaussian neighborhood function, where \(r_i\) and \(r_{i^*}\) are the locations of neurons i and \(i^*\) in the lateral lattice space. Note that \(r_i\) and \(r_{i^*}\) correspond to the positions of \(\textrm{w}_i\) and \(\textrm{w}_{i^*}\), respectively. The variables a(m) and b(m) are exponential decay functions used to guarantee that the weight vectors converge to steady states. These functions are given by

$$\begin{aligned} a(m)=a_{0}\left( \frac{a_{M}}{a_{0}}\right) ^{\frac{m}{M}} \quad \text{ and } \quad b(m)=b_{0}\left( \frac{b_{M}}{b_{0}}\right) ^{\frac{m}{M}}, \end{aligned}$$

where \(a_{0}\) and \(b_{0}\) are initial values, while \(a_{M}\) and \(b_{M}\) are the final ones after training for M epochs. After training, we obtain the set of weight vectors \(\left\{ w_i^{in}, w_i^{out}\right\}\) belonging to each neuron i. A trained VQTAM network can then be used in the testing stage, such that each vector of the test set, \(x_{test}^{in}\), is used for the winner assignment: \(i^*=\) \(\underset{i \in {\mathcal {A}}}{{\text {argmin}}}\left\{ \left\| x_{test}^{in}-w_i^{in}\right\| \right\}\). Afterward, the weight vector \(w_{i^*}^{out}\), which corresponds to the winner neuron's weight \(w_{i^*}^{in}\), can be used as an estimate of the outputs. This means that the approximate apparent resistivity and phase of \(x_{test}^{in}\) are obtained by the following equation:

$$\begin{aligned} x_{test}^{out} = w_{i^*}^{out}. \end{aligned}$$
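The training and testing procedure above (Eqs. 10-15) can be sketched in Python as follows. This is a minimal illustration under our own choices of initialization and hyperparameter defaults, not the authors' implementation:

```python
import numpy as np

def train_vqtam(x_in, x_out, grid=(10, 10), M=100,
                a0=0.5, aM=0.01, b0=5.0, bM=0.5, rng=None):
    """Minimal VQTAM training sketch: the winner is chosen from the input
    parts only (Eq. 10), but both the input and output prototypes are
    updated with the same Gaussian neighborhood (Eqs. 12-13)."""
    rng = np.random.default_rng(rng)
    n_neurons = grid[0] * grid[1]
    # neuron positions r_i on the 2-D lattice A
    r = np.array([(p, q) for p in range(grid[0]) for q in range(grid[1])], float)
    # initialize prototypes from random training samples
    idx = rng.integers(0, len(x_in), n_neurons)
    w_in, w_out = x_in[idx].copy(), x_out[idx].copy()
    for m in range(M):
        a = a0 * (aM / a0) ** (m / M)        # learning-rate decay, Eq. (14)
        b = b0 * (bM / b0) ** (m / M)        # neighborhood-width decay, Eq. (14)
        for x, y in zip(x_in, x_out):
            i_star = np.argmin(np.linalg.norm(x - w_in, axis=1))   # Eq. (10)
            h = np.exp(-np.sum((r - r[i_star]) ** 2, axis=1) / (2 * b * b))
            w_in += (a * h)[:, None] * (x - w_in)                  # Eq. (12)
            w_out += (a * h)[:, None] * (y - w_out)                # Eq. (13)
    return w_in, w_out

def predict_vqtam(x_test, w_in, w_out):
    """Eq. (15): the winner's output prototype is the estimate."""
    i_star = np.argmin(np.linalg.norm(x_test - w_in, axis=1))
    return w_out[i_star]
```

Because the winner is always selected from the input parts alone, the output prototype of each neuron converges toward the mean response of the inputs falling in its quantization cell, which is exactly why Eq. (15) works as a predictor.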

The algorithm of the VQTAM technique is summarized as Algorithm 1.

Algorithm 1

Algorithm of VQTAM scheme

Improvement of classical VQTAM

Note that the estimates of \(x_{test}^{out}\) produced by Eq. (15) can be unreliable because of high error values. The number of neurons must be sufficiently large to obtain high-accuracy predictions, which makes the training process time-consuming. An efficient strategy to overcome this restriction is to train VQTAM with a few neurons to capture the data topology and then use an autoregressive (AR) model or the locally linear embedding (LLE) technique (Roweis and Saul 2000) to improve the accuracy of the estimates derived from VQTAM. In practice, however, the AR model relying on the pseudo-inverse derived from the normal equations becomes problematic when the pseudo-inverse matrix is close to singular, leading to a badly scaled optimization problem. In addition, the AR model is a linear mapping and is suited to fitting linearly distributed data. The LLE algorithm, by contrast, is one of the most widely used nonlinear dimensionality reduction techniques. LLE reduces the dimensionality of high-dimensional data while preserving local geometries in a low-dimensional representation of the original data. Unlike the AR model, LLE preserves the local relationships of the data manifold; being a nonlinear mapping, it is appropriate for representing nonlinear data. Furthermore, as suggested in Wongsriworaphon et al. (2015) and Limtrakul and Arnonkijpanich (2019), VQTAM based on LLE gives more accurate and robust predictions than VQTAM combined with the AR model. Thus, in this work, only the LLE technique is chosen to improve classical VQTAM. Applying LLE in our work consists of three steps. In the first step, the k winner neurons of \(x_{test}^{in}\) are defined as \(w_{i_1^*}^{in}, w_{i_2^*}^{in},\ldots , w_{i_k^*}^{in}\), where

$$\begin{aligned} \begin{aligned} i_1^*&=\underset{i \in {\mathcal {A}}}{{\text {argmin}}}\left\{ \left\| x_{test}^{in}-w_i^{in}\right\| \right\} , \\ i_2^*&=\underset{i \in {\mathcal {A}}, i \notin \left\{ i_1^*\right\} }{{\text {argmin}}}\left\{ \left\| x_{test}^{in}-w_i^{in}\right\| \right\} , \\ &\;\;\vdots \\ i_k^*&=\underset{i \in {\mathcal {A}}, i \notin \left\{ i_1^*, \ldots , i_{k-1}^*\right\} }{{\text {argmin}}}\left\{ \left\| x_{test}^{in}-w_i^{in}\right\| \right\} . \end{aligned} \end{aligned}$$

Then, each vector \(x_{test}^{in}\) in the test set can be written as a linear combination of its k nearest neural weights:

$$\begin{aligned} x_{test}^{in}=c_1 w_{i_1^*}^{in}+c_2 w_{i_2^*}^{in}+\cdots +c_k w_{i_k^*}^{in}. \end{aligned}$$

In the second step, for each vector \(x_{test}^{in}\), the coefficients \(C=\left\{ c_l \mid l=1, \ldots , k\right\}\) of the linear combination are calculated so as to best represent the local geometry of the input space. In this step, LLE reconstructs the k coefficients by minimizing the reconstruction error in the cost function

$$\begin{aligned} \phi (C)=\left\| x_{test}^{in}-\sum _{l=1}^k c_l w_{i_l^*}^{in}\right\| . \end{aligned}$$

The cost function is minimized under two constraints. First, each vector \(x_{test}^{in}\) is reconstructed only from its k nearest winner neurons; thus \(c_l=0\) if \(w_{i_l^*}^{in}\) is not a neighbor of \(x_{test}^{in}\). Second, the coefficients sum to one, i.e., \(\sum _{l=1}^k c_l=1\). The optimal coefficients C can be computed by solving a constrained least-squares problem. In the third step, the estimates of \(x_{test}^{out}\) are improved using the coefficients from the previous step:

$$\begin{aligned} x_{test}^{out}=c_1 w_{i_1^*}^{out}+c_2 w_{i_2^{*}}^{out}+\cdots +c_k w_{i_k^*}^{out}. \end{aligned}$$
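The three steps above can be sketched as follows. This is a hedged Python illustration; the regularization of the local Gram matrix is our own choice for handling the near-singular case, following common LLE practice, and is not specified in the paper:

```python
import numpy as np

def vqtam_lle_predict(x_test, w_in, w_out, k=4, reg=1e-8):
    """VQTAM-LLE testing sketch: reconstruct x_test from its k nearest
    input prototypes (Eqs. 16-18) and apply the same coefficients to the
    corresponding output prototypes (Eq. 19)."""
    d = np.linalg.norm(x_test - w_in, axis=1)
    nbrs = np.argsort(d)[:k]                  # the k winner neurons, Eq. (16)
    # local Gram matrix of the neighbors, centered on x_test
    V = w_in[nbrs] - x_test                   # k x n
    G = V @ V.T
    G += reg * np.trace(G) * np.eye(k)        # regularize a near-singular G
    c = np.linalg.solve(G, np.ones(k))        # minimize Eq. (18) s.t. sum(c) = 1
    c /= c.sum()
    return c @ w_out[nbrs]                    # Eq. (19)
```

Because the coefficients sum to one, the estimate is exact whenever the input-output relationship is locally affine, which is the source of the accuracy gain over the single-winner rule of Eq. (15).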

The algorithm of the VQTAM-LLE technique is summarized as Algorithm 2. Algorithms 1 and 2 show that VQTAM and VQTAM-LLE share the same training stage; therefore, the number of iterations required to reach the stopping criterion is identical. The two methods, however, perform different numbers of operations and consume different amounts of CPU time in the testing stage.

Algorithm 2

Algorithm of VQTAM-LLE scheme

Applications of VQTAM and VQTAM-LLE techniques for simulating MT responses

As described above, both VQTAM and VQTAM-LLE have been applied to a wide range of tasks, such as function approximation (Barreto and Araujo 2004), computer vision (Wongsriworaphon et al. 2015), and robot modeling (Limtrakul and Arnonkijpanich 2019). These algorithms have shown promising results in various applications and are considered effective methods for dynamic modeling and pattern recognition. In MT problems, ANN have been employed for both forward and inverse modeling, as evidenced by previous studies (Singh et al. 2005; Khalil et al. 2006; Puzyrev 2019; Shan et al. 2021; Puzyrev and Swidinsky 2021; Deng et al. 2023). However, the application of VQTAM and VQTAM-LLE to MT modeling remains unexplored: despite their successful track record in other domains, their adaptation and performance in MT modeling have yet to be thoroughly investigated, making this an intriguing and challenging avenue of research.

The process of utilizing the VQTAM approaches to estimate 2-D magnetotelluric (MT) responses begins with the preparation of responses for the training dataset. Consider the sampling of electromagnetic (EM) periods \(T_i \in [A, B]\) for \(i=1,2, \ldots , N_T\), where \(N_T\) represents the total number of sampling periods in the training stage. Typically, the EM period interval for MT falls within \([A, B]=[10^{-1},10^{4}]\) seconds. Additionally, sampling stations on the Earth’s surface are denoted as \(S_j=\left( y_j, z_j\right) \in \left[ Y_l, Y_r\right] \times \left[ Z_a, Z_b\right] \subset \textrm{R}^2\) within the training domain, where \(j=1,2,\ldots , N_S\) and \(N_S\) signifies the total number of sampling stations. Note that \(\left( y_j, z_j\right)\) is measured in meters and that the z-direction is taken as positive downward, so \(z_j=0\) represents sea level; \(z_j\) on the air–earth interface can be positive or negative depending on the terrain. These data, along with the resistivity structure model, serve as inputs for the finite element (FE) approach utilizing quadrilateral elements (Sarakorn 2017). The outputs consist of matrices representing apparent resistivity

$$\begin{aligned} \mathbf {\rho }^{a}_{xy}=[\rho ^{a_{xy}}_{ij}]_{N_{S}\times N_{T}},\;\; \mathbf {\rho }^{a}_{yx}=[\rho ^{a_{yx}}_{ij}]_{N_{S}\times N_{T}}, \end{aligned}$$

and phase

$$\begin{aligned} \mathbf {\phi }_{xy}=[\phi ^{xy}_{ij}]_{N_{S}\times N_{T}},\;\; \mathbf {\phi }_{yx}=[\phi ^{yx}_{ij}]_{N_{S}\times N_{T}}. \end{aligned}$$

The input vector \(x^{in}=[x^{in}_{pq}]_{N_{d}\times 3}\), where \(N_{d}=N_{T}\cdot N_{S}\), is constructed as

$$\begin{aligned} x^{in}=\begin{bmatrix} T_{1}&{} y_{1}&{} z_{1}\\ T_{1}&{} y_{2}&{} z_{2}\\ \vdots &{}\vdots &{}\vdots \\ T_{1}&{} y_{N_{S}}&{} z_{N_{S}}\\ T_{2}&{} y_{1}&{} z_{1}\\ T_{2}&{} y_{2}&{} z_{2}\\ \vdots &{}\vdots &{}\vdots \\ T_{2}&{} y_{N_{S}}&{} z_{N_{S}}\\ \vdots &{}\vdots &{}\vdots \\ T_{N_{T}-1}&{} y_{N_{S}}&{} z_{N_{S}}\\ T_{N_{T}}&{} y_{1}&{} z_{1}\\ T_{N_{T}}&{} y_{2}&{} z_{2}\\ \vdots &{}\vdots &{}\vdots \\ T_{N_{T}}&{} y_{N_{S}}&{} z_{N_{S}} \end{bmatrix}, \end{aligned}$$

and the vector \(x^{out}=[x^{out}_{pq}]_{N_{d}\times 2}\)

$$\begin{aligned} x^{out}=\begin{bmatrix} \rho ^{a_{xy}}_{11}&{} \phi ^{{xy}}_{11}\\ \rho ^{a_{xy}}_{21}&{} \phi ^{{xy}}_{21}\\ \vdots &{}\vdots \\ \rho ^{a_{xy}}_{N_{S}1}&{} \phi ^{{xy}}_{N_{S}1}\\ \rho ^{a_{xy}}_{12}&{} \phi ^{{xy}}_{12}\\ \rho ^{a_{xy}}_{22}&{} \phi ^{{xy}}_{22}\\ \vdots &{}\vdots \\ \rho ^{a_{xy}}_{N_{S}2}&{} \phi ^{{xy}}_{N_{S}2}\\ \vdots &{}\vdots \\ \rho ^{a_{xy}}_{1N_{T}}&{} \phi ^{{xy}}_{1N_{T}}\\ \rho ^{a_{xy}}_{2N_{T}}&{} \phi ^{{xy}}_{2N_{T}}\\ \vdots &{}\vdots \\ \rho ^{a_{xy}}_{N_{S}N_{T}}&{} \phi ^{{xy}}_{N_{S}N_{T}}\\ \end{bmatrix} \end{aligned}$$

for E-polarization and

$$\begin{aligned} x^{out}=\begin{bmatrix} \rho ^{a_{yx}}_{11}&{} \phi ^{{yx}}_{11}\\ \rho ^{a_{yx}}_{21}&{} \phi ^{{yx}}_{21}\\ \vdots &{}\vdots \\ \rho ^{a_{yx}}_{N_{S}1}&{} \phi ^{{yx}}_{N_{S}1}\\ \rho ^{a_{yx}}_{12}&{} \phi ^{{yx}}_{12}\\ \rho ^{a_{yx}}_{22}&{} \phi ^{{yx}}_{22}\\ \vdots &{}\vdots \\ \rho ^{a_{yx}}_{N_{S}2}&{} \phi ^{{yx}}_{N_{S}2}\\ \vdots &{}\vdots \\ \rho ^{a_{yx}}_{1N_{T}}&{} \phi ^{{yx}}_{1N_{T}}\\ \rho ^{a_{yx}}_{2N_{T}}&{} \phi ^{{yx}}_{2N_{T}}\\ \vdots &{}\vdots \\ \rho ^{a_{yx}}_{N_{S}N_{T}}&{} \phi ^{{yx}}_{N_{S}N_{T}}\\ \end{bmatrix} \end{aligned}$$

for H-polarization, respectively.
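The stacking of periods and stations into \(x^{in}\), and of responses into \(x^{out}\), can be written compactly as below. This is an illustrative Python sketch with our own function name; the paper does not prescribe an implementation:

```python
import numpy as np

def build_training_vectors(T, stations, rho_a, phi):
    """Assemble the VQTAM training pairs: x_in rows are (T_i, y_j, z_j)
    with the station index varying fastest, and x_out rows hold the
    matching (apparent resistivity, phase) responses.

    T        : (N_T,) sampling periods in seconds
    stations : (N_S, 2) station coordinates (y_j, z_j) in meters
    rho_a    : (N_S, N_T) apparent resistivity matrix
    phi      : (N_S, N_T) phase matrix
    """
    N_T, N_S = len(T), len(stations)
    x_in = np.column_stack([
        np.repeat(T, N_S),               # T_1,...,T_1, T_2,...,T_2, ...
        np.tile(stations[:, 0], N_T),    # y_1,...,y_{N_S} repeated per period
        np.tile(stations[:, 1], N_T),
    ])
    # responses flattened in the same period-major, station-minor order
    x_out = np.column_stack([rho_a.T.ravel(), phi.T.ravel()])
    return x_in, x_out
```

The transpose before `ravel()` is what enforces the period-major ordering of the matrices shown above, so that row p of `x_out` is the response measured at the station and period listed in row p of `x_in`.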

For the testing phase, a different set of periods \(T_i^{test} \in [A,B]\) for \(i=1,2,\ldots , N_{T}^{test}\), where \(N_T^{test}\) represents the total number of testing periods, and \(S_{j}^{test}=\left( y_j^{test},z_j^{test}\right) \in \left[ Y_l, Y_r\right] \times \left[ Z_{a}, Z_{b}\right]\) for \(j=1,2,..., N_{S}^{test}\), where \(N_{S}^{test}\) is the total number of testing stations, are utilized as input. The construction of the input vector \(x_{test}^{in}\) mirrors the training stage, although the size of matrices may vary. Ultimately, the estimated MT responses obtained through the VQTAM or VQTAM-LLE approaches are stored in \(x_{test}^{out}\) with dimensions \(\left( N_T^{test} \cdot N_{S}^{test}\right) \times 2\) for each polarization. Note that for all experiments in this study, we keep \(S^{test}_{j}= S_{j}\) and \(N^{test}_{S} = N_{S}\). This consistency is crucial for making valid comparisons among the MT sites within the specified domain.

Numerical experiments

In this section, we present a comprehensive evaluation of the VQTAM and VQTAM-LLE approaches, focusing on their performance, accuracy, and reliability. For these experiments, we employ three distinct models. The initial benchmark model is derived from the COMMEMI project (Zhdanov et al. 1997). The second model is a modification of another original COMMEMI model, while the final model incorporates a topographic zone. To assess the accuracy of the developed VQTAM and VQTAM-LLE methods, the mean absolute percentage error (MAPE), which measures the prediction accuracy of a forecasting method, is utilized. It is defined by

$$\begin{aligned} \textrm{MAPE}=\frac{1}{N^{test}_{T}N^{test}_{S}} \sum _{j=1}^{N^{test}_{T}}\sum _{i=1}^{N^{test}_{S}}\left| \frac{d_{ij}-\hat{d}_{ij}}{d_{ij}}\right| \times 100, \end{aligned}$$

where \(d_{ij}\) is the reference response obtained from the FEM scheme (Sarakorn 2017) and \(\hat{d}_{ij}\) is the prediction obtained by the VQTAM or VQTAM-LLE method. This evaluation involves the apparent resistivity and phase data for both \(E-\) and \(H-\)polarizations. Furthermore, we gauge the efficiency of our approaches by analyzing overall CPU times and the number of iterations during the training stage. All experiments were performed on a MacBook Pro (M2) with 16 GB of memory and a 10-core CPU.
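The MAPE of the equation above reduces to a one-line computation over the response matrices; the sketch below is our own illustrative helper, assuming reference and predicted responses stored as same-shaped arrays:

```python
import numpy as np

def mape(d, d_hat):
    """Mean absolute percentage error between reference responses d
    (e.g., an N_S x N_T matrix from the FEM scheme) and predictions
    d_hat of the same shape; assumes all reference values are nonzero."""
    return np.mean(np.abs((d - d_hat) / d)) * 100.0
```
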

COMMEMI2D-2 model

The resistivity structures of the COMMEMI2D-2 model (Zhdanov et al. 1997) are illustrated in Fig. 2. The Earth’s surface is flat in this representation, with \(z=0\) denoting the air–earth interface and sea level. We positioned \(N_{S}=21\) stations at intervals of 5 km, covering the range \([-50,50]\) km on the Earth’s surface. Each station is given by \(S_{j}=(y_{j},z_{j})=(-50+5(j-1)~\text {km},0~\text {km})\), where j ranges from 1 to \(N_{S}\). These station locations are marked by red triangles in Fig. 2. For testing purposes, we selected 61 periods of electromagnetic (EM) waves within the range [1, 1000] seconds, spaced at intervals of 0.05 on a logarithmic scale.

Several critical parameters influence the performance of our neural networks, including the size of the training dataset, the number of neurons, and the stopping criteria used to determine the stability of the objective functions. The size of the training dataset plays a pivotal role in the accuracy and efficiency of the neural networks: a larger training dataset typically yields higher accuracy, but it can also reduce overall computational performance. Note that the parameter k in Algorithm 2 is fixed at \(k=4\) in all experiments.

In this model, we utilize \(N_{T}=31\) periods selected from a total of 61 periods within the specified interval, defined as \(T_{i}=10^{0.1(i-1)}\) s, where i ranges from 1 to \(N_{T}\). These periods are employed for simulating the magnetotelluric (MT) data and as input training data for our algorithms. It is important to note that these period selections remain consistent across all cases for this model. The impact of varying training data will be explored in the subsequent model.

The second parameter is the number of neurons allocated for each input dataset. Utilizing the SOM architecture, the neural network forms a 2D lattice of \(N_{n}\times N_{n}\) neurons. In our experiments, we set \(N_{n}\) to 10, 15, 20, 25, 30, 35, and 40.

Lastly, the third factor is the stopping criterion that determines the stability of the objective functions a(m) and b(m) in equation (14) and dictates when the training step of the VQTAM and VQTAM-LLE approaches terminates before the testing phase begins. The objective functions are considered stable when their values fluctuate by less than a prescribed threshold, which we vary from \(3\%\) down to \(0.05\%\). By adjusting these key parameters, we assess the accuracy of both methods in terms of MAPE. The resulting MAPE values for the VQTAM and VQTAM-LLE approaches, with the stopping criterion for the objective functions set at \(1\%\), are summarized in Table 1.

In our numerical experiments, it is evident that increasing the number of neurons decreases the MAPE across all MT responses, while reducing the number of neurons significantly raises MAPE levels. Specifically, employing a \(40\times 40\) grid configuration keeps the MAPE for all MT responses below \(5\%\). The VQTAM-LLE method displays a smaller MAPE than the conventional VQTAM approach in this scenario. This trend persists when the stopping criterion is reduced to \(0.5\%\), as illustrated in Table 2.

Similar trends are observed in cases where the stopping criteria are set at \(0.1\%\) and \(0.05\%\) (refer to Tables 3 and 4, respectively). In these instances, MAPE levels below \(5\%\) are consistently achieved with neuron configurations ranging from \(30\times 30\) to \(40\times 40\). Notably, the choice of stopping criteria does not significantly impact the MAPE levels within this range.

Regarding efficiency, the relationship between the total number of training iterations and the number of neurons is depicted in Fig. 3. Decreasing the stopping criterion increases the number of iterations, and increasing the number of neurons likewise results in higher iteration counts. Regarding CPU time, Table 5 demonstrates that more neurons lead to increased CPU usage for both VQTAM and VQTAM-LLE across all stopping criteria, and CPU time increases as the stopping criterion decreases in all neural network scenarios. Additionally, for the same number of neurons, the VQTAM scheme exhibits shorter CPU times than the VQTAM-LLE scheme.

COMMEMI2D-1-like model

For this model, we configured the structure according to the COMMEMI2D-1 model (Zhdanov et al. 1997), as illustrated in Fig. 4. Originally, the subsurface anomaly measured 0.5 km \(\times\) 2 km; it was enlarged to 10 km \(\times\) 2 km for a larger scale. Twenty-one stations were placed for measuring responses, spanning the range \([-10, 10]\) km on the Earth’s surface with 1 km spacing. Each station is denoted by a red triangle. The EM wave data comprise 81 periods in the range [0.01, 100] s, spaced logarithmically at 0.05 intervals.

During the training process, the stopping criterion for the change in the levels of a(m) and b(m) in equation (14) was set at \(0.1\%\), and the total number of neurons was set to 1600. The purpose of this model was to assess the efficiency of VQTAM and VQTAM-LLE with respect to the ratio between the number of training periods (\(N_{T}\)) and testing periods (\(N^{test}_{T}\)), while the total numbers of stations in the two phases, \(N_{S}\) and \(N^{test}_{S}\), were kept equal. The ratios considered in this experiment were 14 : 67, 21 : 60, 41 : 40, and 58 : 23.

The MAPE obtained by VQTAM and VQTAM-LLE for each response type at these ratios is depicted in Fig. 5. Increasing the training data reduced the MAPE for both approaches: the highest MAPE occurred at the smallest ratio of 14 : 67, whereas the minimum MAPE occurred at the largest ratio of 58 : 23. VQTAM consistently exhibited lower MAPE than VQTAM-LLE for all response types. Notably, the MAPE values for apparent resistivity were consistently higher than those for phase across all ratios. However, a trade-off was observed in the number of iterations and CPU time, with more training data leading to longer computation times, as illustrated in Fig. 6.

Topographic models

For the topographic models, we selected two profiles denoted as P1 and P2, passing through distinct terrains and locations in Thailand. Sample points along these profiles were collected and utilized as input for the FE method to generate training and testing data. The locations of these profiles are depicted in Fig. 7 (top).

The bottom-left part of Fig. 7 illustrates the P1 profile and its corresponding elevation, covering a length of 335 km on the Khorat Plateau in northeastern Thailand. This region, as depicted in the figure, forms a lengthy basin. Geological evidence below the Khorat Plateau indicates surface rock types such as sandstone, shale, and rock salt (Bunkanpai et al. 2023). For this study, we created a simplified resistivity model with the top layer set at 100 \(\Omega\)m, reaching depths of approximately 4 km, and the bottom layer at 5000 \(\Omega\)m, as shown in Fig. 8 (left). There were \(N_{S}=21\) sites in total, and the data were generated over 41 periods ranging from 1 to 1000 s using the FE method, as illustrated in Fig. 9 (left), which shows significant distortions in the \(\rho ^{a}_{yx}\) responses.

The bottom-right section of Fig. 7 displays the P2 profile, located in Kanchanaburi province, Thailand. This 123 km long profile lies very close to the P4 profile studied by Boonchaisuk et al. (2013) for investigating Miocene dual subduction zones beneath the Shan-Thai terrane in western Thailand. The resistivity model for this profile was simplified from the inversion results of that study, as depicted in Fig. 8 (right). For the P2 profile, \(N_{S}=41\) sites were considered, and data were generated over 31 periods ranging from 0.1 to 100 s using the FE method, as shown in Fig. 9 (right).

In both cases, we set the acceptable average MAPE level at \(5\%\) for each response type, with 1600 neurons and a stopping criterion of \(0.1\%\) for the change in the objective function in both the VQTAM and VQTAM-LLE approaches. The ratio between training and testing data was 21 : 20 for the P1 profile and 16 : 15 for the P2 profile. Figure 10 depicts the calculated MAPE values. For the P1 profile, both approaches reduced the average MAPE to the desired level within 151 iterations, demonstrating similar accuracy; for the P2 profile, they required 141 iterations. The accuracy of VQTAM and VQTAM-LLE was comparable, although the phase MAPE values were significantly lower than those of the apparent resistivity.

Discussion and conclusions

This study develops and implements the VQTAM and VQTAM-LLE approaches for simulating 2-D magnetotelluric responses. Training and testing data, essential for evaluating the effectiveness of these methods, are generated with FE techniques. Both neural networks map input data, namely the coordinates of magnetotelluric (MT) sites at selected electromagnetic (EM) periods, to the corresponding responses, including apparent resistivity and phase. Various parameters, such as the number of neurons, the stopping criterion for the objective function, the ratio between training and testing data, and the type of errors considered, are investigated to determine their impact on the accuracy and efficiency of the neural networks.
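The recall idea underlying this mapping, paired input/output codebooks in which the best-matching input neuron supplies the stored output, can be sketched roughly as follows. This is a simplified winner-take-all illustration on synthetic data, not the authors' implementation: a full VQTAM updates neurons through a SOM neighborhood function during training, and VQTAM-LLE additionally builds locally linear interpolations among neighboring neurons.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training set: inputs are (site coordinate in km, log10 period in s),
# outputs are (apparent resistivity, phase). All values are synthetic placeholders.
X = rng.uniform([-10.0, -2.0], [10.0, 2.0], size=(200, 2))
Y = np.column_stack([100.0 + 5.0 * X[:, 0], 45.0 + 2.0 * X[:, 1]])

n_neurons = 16
idx = rng.choice(len(X), n_neurons, replace=False)
W_in, W_out = X[idx].copy(), Y[idx].copy()  # paired input/output codebooks

# Simplified competitive training: the winning neuron's input and output
# prototypes both move toward each training pair, with a decaying learning rate.
for lr in np.linspace(0.5, 0.01, 50):
    for x, y in zip(X, Y):
        w = np.argmin(np.linalg.norm(W_in - x, axis=1))  # best-matching unit
        W_in[w] += lr * (x - W_in[w])
        W_out[w] += lr * (y - W_out[w])

def recall(x):
    """Return the output prototype of the best-matching input neuron."""
    return W_out[np.argmin(np.linalg.norm(W_in - np.asarray(x), axis=1))]

print(recall([0.0, 0.0]))  # a 2-vector: (apparent resistivity, phase)
```

The quantization error of this winner-take-all recall is what motivates the LLE coupling: interpolating among several nearby neurons smooths the predicted response between prototypes.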

Benchmark models and real-world scenarios are employed to assess the potential of both VQTAM and VQTAM-LLE approaches. The results indicate that sufficient neurons are crucial for achieving high accuracy. Setting the stopping criteria for the objective function at an appropriate level is essential. Moreover, the quantity of training data should be substantial compared to testing data, and these data points need to be distributed within the specified range. When these parameters are appropriately configured, the VQTAM and VQTAM-LLE schemes demonstrate excellent performance for benchmark and confirmed cases, achieving the desired error levels.

Due to their impressive performance, these techniques may serve as alternative forward operators when incorporated alongside traditional methods, or within hybrid neural network-traditional schemes. Furthermore, the neural networks may play a role in the pre-inversion process by generating additional MT data to enrich the input data sets, or to fill in stations or periods where an MT survey cannot be carried out because of practical limitations. However, enhancing the speed of these approaches for practical applications is crucial, which may require enhancements, redesigns, or re-architecting of these methods.

Fig. 1
figure 1

The computational domain \(\Omega = \Omega _1 \cup \Omega _2 \cup \Gamma \cup \Gamma ^{int}\) for 2-D MT case

Fig. 2
figure 2

The resistivity structures of COMMEMI2D-2 model. The red triangles are the locations of \(N_{S}=21\) stations

Fig. 3
figure 3

The number of iterations during the training stage of the VQTAM and VQTAM-LLE methods versus the number of neurons

Fig. 4
figure 4

The resistivity distributions of COMMEMI2D-1-like model. The red triangles are the locations of 21 stations

Fig. 5
figure 5

The graph displays MAPE values with respect to the ratios between training and testing data for VQTAM (left) and VQTAM-LLE (right) approaches

Fig. 6
figure 6

The graph on the left illustrates the number of iterations, while the graph on the right depicts CPU time, both with respect to the ratios between training and testing data used by VQTAM and VQTAM-LLE approaches

Fig. 7
figure 7

The profiles passing through part of northeastern Thailand (P1: bottom left) and Kanchanaburi province (P2: bottom right)

Fig. 8
figure 8

The resistivity models for the P1 (left) and P2 (right) profiles. The red triangles denote the assumed MT sites for modeling

Fig. 9
figure 9

The MT responses generated with the FE method for the P1 (left) and P2 (right) profiles, used as training and testing data

Fig. 10
figure 10

The MAPE obtained by the VQTAM and VQTAM-LLE schemes for the P1 (left) and P2 (right) profiles

Table 1 The mean absolute percentage error (MAPE) of the MT responses obtained through the VQTAM and VQTAM-LLE approaches, with a stopping criterion of \(1\%\) for the change in the objective function
Table 2 The mean absolute percentage error (MAPE) of the MT responses obtained through the VQTAM and VQTAM-LLE approaches, with a stopping criterion of \(0.5\%\) for the change in the objective function
Table 3 The mean absolute percentage error (MAPE) of the MT responses obtained through the VQTAM and VQTAM-LLE approaches, with a stopping criterion of \(0.1\%\) for the change in the objective function
Table 4 The mean absolute percentage error (MAPE) of the MT responses obtained through the VQTAM and VQTAM-LLE approaches, with a stopping criterion of \(0.05\%\) for the change in the objective function
Table 5 The CPU time (seconds) for different numbers of neurons utilized by the VQTAM and VQTAM-LLE approaches

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.


Acknowledgements


The authors express their gratitude to Dr. Chatchai Vachiratienchai from Curl-E Geophysics Co. Ltd. for their technical guidance and valuable discussions throughout this research. Additionally, the authors appreciate the reviewers for their helpful feedback, which greatly enhanced the quality of this manuscript.


Funding

This research was supported by the Department of Mathematics, Faculty of Science, Khon Kaen University, Fiscal Year 2022.

Author information

Authors and Affiliations



Contributions

Phongphan Mukwachi is credited with the study conception, design, data collection, analysis, and initial drafting of the manuscript. Banchar Arnonkijpanich is acknowledged for contributing to study design and substantial revisions of the manuscript. Weerachai Sarakorn supervised the entire research project, provided guidance on methodology, reviewed and interpreted the data, and played a significant role in manuscript writing and revisions.

Corresponding author

Correspondence to Weerachai Sarakorn.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit


About this article


Cite this article

Mukwachi, P., Arnonkijpanich, B. & Sarakorn, W. Simulating 2-D magnetotelluric responses using vector-quantized temporal associative memory artificial neural network-based approaches. Geosci. Lett. 11, 10 (2024).
