Double-difference earthquake relocation using waveform cross-correlation in Central and East Java, Indonesia

The Central and East Java region, which is part of the Sunda Arc, has relatively high seismic rates due to the convergence of two major tectonic plates in the Indonesian region; i.e., the Indo-Australian Plate subducting under the Eurasian Plate. Many devastating earthquakes have occurred in this area as a result of the interaction between these two plates. Two examples are the 1994 Banyuwangi earthquake (Mw 7.6) and the 2006 Yogyakarta earthquake (Mw 6.3). This study aims to determine precise earthquake locations and analyze the pattern of seismic distribution in Central and East Java, Indonesia. We manually re-picked P and S-wave arrival times that were recorded by the Agency for Meteorology, Climatology and Geophysics (BMKG) of the Indonesian earthquake network during the time period January 2009–September 2017. We then determined the earthquake locations using a non-linear method. To improve the accuracy of the earthquake locations, we relocated 1,127 out of 1,529 events, using a double-difference algorithm with waveform cross-correlation data. Overall, the seismicity in the Central and East Java region is predominantly distributed in the south of Java Island; e.g., the Kebumen, Yogyakarta, Pacitan, Malang, and Banyuwangi clusters. These clusters are probably related to the subduction activity in these regions. Meanwhile, there are clusters of earthquakes having shallow depths on the mainland that indicate the activity of inland faults in the region; e.g., the Opak Fault, the Kendeng Thrust, and the Rembang–Madura–Kangean–Sakala (RMKS) Fault Zone. Several other active inland faults have not shown any significant seismicity over the time period mentioned, i.e., the Pasuruan Fault, the Lasem Fault, the Muria Fault, the Semarang Thrust, and the Probolinggo Fault.


Introduction
Central and East Java are part of the Sunda Arc, which has relatively high seismicity and a complex geological system as a result of the Indo-Australian Plate subducting under the Eurasian Plate. The convergence rate varies from ~ 5.6 cm/year in the western part of Java to ~ 6.5 cm/year in the eastern part (Koulali et al. 2017). This has produced several active faults, i.e., the Semarang Thrust Fault, the Kendeng Thrust Fault, the Opak Fault, the Lasem Fault, the Probolinggo Fault and the Pasuruan Fault, as well as the volcanoes that most likely control the seismicity in the study area (Marliyani 2016;Pusat Studi Gempa Nasional (PuSGeN), 2017) (Fig. 1). In contrast with the oblique convergence that occurs in Sumatra, the convergence is normal in the western part of the Sunda Arc up to the plate boundaries at Java Island (Malod et al. 1995). Consequently, the seismic rate in Central and East Java is relatively lower than in Sumatra and West Java (the transitional zone from oblique to normal subduction) (Newcomb and Mccann 1987). However, the study area still has a potential for destructive earthquakes since the seismic gap that is found in this area threatens the region with potential future megathrust events (Widiyantoro et al. 2020). Based on historical earthquake data, many large earthquakes have occurred in Central and East Java, such as the 1994 large subduction thrust earthquake (Mw 7.6) that produced a tsunami in Banyuwangi. This earthquake was caused by slip over a subducting seamount, which is a locked patch within a decoupled subduction zone (Abercrombie et al. 2001). Another event, the 2006 Yogyakarta earthquake (Mw 6.3), occurred on the inland Opak Fault; the geometry of which has been subsequently determined by SAR interferometry (Tsuji et al. 2009). There have also been other historical earthquakes (M > 6) along the Sunda Arc dating from the 1900s that have been documented by Newcomb and McCann (1987).
Previous studies have evaluated the seismicity in the study area using the Agency for Meteorology, Climatology, and Geophysics of Indonesia (BMKG) regional network. These include: hypocenter determination using a non-linear method in West Java (Rosalia et al. 2017) and  Muttaqy et al. Geoscience Letters (2023) 10:5 in Central and East Java (Muttaqy et al. 2019); hypocenter relocation using a double-difference method in West Java (Supendi et al. 2018), and in East Java (Cahyaningrum et al. 2015); and teleseismic double-difference along the Sunda Arc . Many local seismic networks have been deployed and have also contributed to seismicity and tomography studies in Central and East Java. These include: the DOMERAPI network that was used to comprehensively study the crustal structure beneath the Merapi volcano (Ramdhan et al. 2015(Ramdhan et al. , 2016(Ramdhan et al. , 2017a(Ramdhan et al. , b, 2019; the MERAMEX network, consisting of onshore and offshore seismographic stations in Central Java, that successfully determined the crustal and upper mantle structure beneath Central and East Java; as well as studies related to volcanic activities in the study area (Koulakov et al. , 2009Wagner et al. 2007;Rohadi et al. 2013;Bohm et al. 2013;Zulfakriza et al. 2014;Haberland et al. 2014;Wölbern and Rümpker 2016); and ambient noise tomography, using both the BMKG network and portable seismographs in East Java (Martha et al. 2017). Central and East Java are considered to be the most densely populated region in Indonesia; over 73 million people live in this highly seismic area (Central Bureau of Statistics of Indonesia (BPS) 2012). Due to the potential of high seismic hazard, the investigation of earthquake clusters in this region is essential in order to improve and support the Indonesian seismic hazard map. Therefore, this study aims to determine precise hypocenter locations and analyze the pattern of seismic distribution in Central and East Java.

Data and method
We manually re-picked P-and S-arrivals on waveforms recorded at 34 BMKG stations in Central and East Java ( Fig. 1) in the period January 2009 to September 2017, using Seisgram2K (Lomax and Michelini 2009). The following criteria were used for selecting events for determining the hypocenters: (i) recorded by at least four stations and having clear onset P-and S-arrival times, and (ii) having magnitude (Mw) > 3 (Fig. 2a). To assure quality control during the picking process, we plotted a Wadati diagram to independently check the linear relationship between phase data (Fig. 2b); a Vp/Vs ratio of 1.75 was obtained. The hypocenter locations were determined using a non-linear method in the NLLoc program (Lomax et al. 2000) with the global 1-D seismic velocity model AK135 (Kennett et al. 1995). This method uses the oct-tree importance sampling to produce an estimation of the posterior density function (PDF) for the hypocenter location in 3D. A similar method was implemented to determine hypocenters in West Java (Rosalia et al. 2017), as well as in doing an aftershock analysis of the May 27, 2006, M 6.4 Yogyakarta earthquake (Husni et al. 2018;Wulandari et al. 2018); it was also used in the Pannonian Basin in Hungary (Wéber and Süle 2014), the Central-Eastern Alps of North Italy (Viganò et al. 2015), and the eastern border faults of the Main Ethiopian Rift (Lapins et al. 2020), among many others.
In order to have a more reliable seismic velocity model of the area beneath the study area, we updated the 1-D seismic velocity model from VELEST code which simultaneously inverts the hypocenter, velocity and station corrections. The code performs an iterative damped least-squares inversion where each iteration solves ray tracing and inverse problems. We applied the damping to control which parameters of earthquake locations, layer velocities, and station corrections needed to be adjusted. The higher the damping value, the fewer parameters are allowed to vary in the inversion process (Kissling et al. 1994). In this study, we selected events that have a maximum azimuthal gap of 180° to assure that the events are well localized by the seismograph network and are representative of the subsurface information in the study area. The 1-D priori seismic velocity model considered for this study was taken from Koulakov et al. (2007) as it successfully defined crustal and upper mantle P-average velocity (Vp) beneath Central Java and combined well with the global AK135 model (Kennett et al. 1995) for the deeper part of the structure (> 210 km). We used the Vp/Vs ratio of 1.75 derived by using a Wadati Diagram to scale the initial vs model. We then randomly generated 10 initial velocity models that were uniformly distributed within ± 20% relative to the priori model.
We then ran the HypoDD program (Waldhauser 2001), which implements the double-difference algorithm (Waldhauser and Ellsworth 2000), to relocate earthquakes that had previously been determined using the non-linear method. The double-difference algorithm is based on the assumption that if the distance between two earthquakes is smaller than their distances to the station and the length scale of the structure, then the ray paths of these earthquakes are similar. HypoDD can minimize the residuals between observed and calculated travel-time differences for pairs of earthquakes recorded at the same station. Thus, the errors due to an inaccurate velocity model can be minimized without using station corrections.
In addition to double-difference relocation, we also used waveform cross-correlation (WCC) data to obtain more reliable relative travel time data. Using waveform cross-correlation data minimizes the error commonly associated with the arrival-time picking process (Hauksson and Shearer 2005;Schaff and Waldhauser 2005). This process relies on the similarity between waveforms  Fig. 3) recorded by the nearest stations (GMJI, JAGI, KRK, BYJI, PWJI, RTBI, IGBI, and ABJI as shown in Fig. 1). Red and blue lines indicate the arrival times of P and S-waves, respectively. b Wadati Diagram showing a linear relationship between picked phases. The Vp/Vs ratio in this study is 1.75. Red dashed line indicates deviations from a constant Vp/Vs ratio and/or data reading errors recorded at the same station. The WCC technique is initially performed by selecting the seismogram with the highest signal-to-noise ratio (SNR) to be the master event of each earthquake cluster as determined by a double-difference algorithm. We assumed that the onsets of P-and S-waves on a highly SNR seismogram were clear enough to be identified. The other seismograms at the same cluster and the same station were cross-correlated and the picked arrival times were refined to the shifted time. Cross-correlation has been widely used, in addition to the double-difference algorithm, to relocate hypocenters; e.g., Sumatra (Pesicek et al. 2010;Waldhauser et al. 2012;Muksin et al. 2014), Central Java (Sipayung et al. 2018), the Nicoya Peninsula in Costa Rica (Hansen et al. 2006), the 2019 Ridgecrest earthquake sequence in eastern California (Lin 2020), the Alboran slab of the westernmost part of the Mediterranean Sea (Sun and Bezada 2020), among others.

Results and discussion
The hypocenter determination results consisted of the location of 1,529 events, using 11,192 phases for each P and S-wave (Fig. 3). To quantify the capability of the BMKG network in detecting earthquakes, we plotted both the cumulative number of earthquakes and the frequency-magnitude relationship in the time period 2009 to 2017 using the maximum likelihood method, which was applied in the Zmap package (Wiemer 2001). The regional BMKG network has a 3.4 magnitude of completeness (Mc), with many more earthquakes that could still be recorded; as compared to global networks such as USGS, which has a Mc of 4.2 with fewer earthquakes that could be recorded (Fig. 4).
We conducted the updated 1-D seismic velocity model by employing the selected 154 located events that have a maximum azimuthal gap of 180° and which were expected to represent the average velocity of Central and East Java. This is a trial-and-error process done by defining various initial models and parameters, iteratively. For each initial model, we used various velocity dampings from 0.1 to 1.0, while the hypocenter and station correction dampings were set to 0.01 and 0.1, respectively. This resulted in 100 1-D seismic velocity model solutions for each Vp and Vs. We selected 1 out of 100 updated models that was considered to be the best solution having minimal residual (Fig. 5).
Several earthquakes that may be generated by the same source mechanism will produce high waveform similarity at a common station. Therefore, the waveform cross-correlation (WCC) process ensures the consistency of P-and S-wave phase identification. We computed the cross-correlation functions for P-and S-waves using a time window of 0.2 s before and 2 s after the onset of P-arrival time and 1.4 s before and 5 s after S-arrival time onset. We used the Butterworth filter between 1-6 Hz and coefficient correlation criteria that are greater than 0.7. Figure 6 shows an example of the cross-correlation results at RTBI and PWJI stations. The output of the WCC process that was saved as input for HypoDD was the lag time and coefficient correlation. We also used this technique to estimate the uncertainty of observed data, resulting in the average of picking errors for P-and S-arrivals as 0.1886 s and 0.297 s, respectively. Fig. 4 a Earthquake cumulative numbers and b earthquake magnitude-frequency in relation to the regional BMKG network, compared to c earthquake cumulative numbers and d earthquake magnitude-frequency in relation to the global USGS network We applied both catalog and cross-correlation differential time data in HypoDD to improve the quality of event clustering. The weighting of the distance between paired events for catalog data (WDCT) was set to 45 km in the first four iterations; it was then set to 15 km and 35 km for correlation data (WDCC) in the second four iterations. These parameters are distance cut-off parameters used in HypoDD to remove data for event pairs with separation distances larger than the given values (Waldhauser 2001). The selection of the optimum damping factor depends on the system conditions to be resolved, which is represented as the condition number (CND) (Hauksson and Shearer 2005). We used the damping factors of 85 and 70, resulting in a condition number between 40 and 80.
As a result, we were successful in relocating 1,127 out of 1,529 events in the Central and East Java region (Additional file 1: Table S1) that form more of a cluster in several areas than the initial locations indicated (Fig. 7). The average shifting of earthquake locations in X (east-west), Y (north-south), and Z (depth) directions are 3.37, 4.76, and 10.4 km, respectively; with the maximum shifted locations being 29.2, 44.36, and 49.98 km, respectively (Additional file 2: Figure S1). This somewhat significant improvement was also statistically proven by the histogram of residual times (Fig. 8) which had standard deviations of 0.912, 0.476, and 0.402 s 2 before relocation, after relocation without, and with WCC, respectively. The distribution of location errors in X, Y and Z directions is also provided in Additional file 2: Figure S2.
Based on the relocation results, the seismicity in Central and East Java are predominantly distributed in the south of Java Island. The vertical cross-section of blocks B-F (Fig. 9) shows subduction-related events that are compatible with the slab 1.0 model (Hayes et al. 2012). The dipping angle of the slab steepens from west to east. Each block represents several interesting clusters in the study area, such as the Kebumen, Yogyakarta, Pacitan, Malang, and Banyuwangi clusters.
Block B contains the Kebumen cluster where the Mw 6.2 Kebumen earthquake occurred on January 25, 2014 (Fig. 9). The focal mechanism of the Global Centroid Moment Tensor (GCMT) (Dziewonski et al. 1981;Ekström et al. 2012) (https:// www. globa lcmt. org/), shows a normal faulting mechanism, while the surrounding events in the cluster are dominated by a thrusting mechanism (Fig. 12). Based on the location and focus depth, the seismicity in this cluster consists of intraslab events associated with an intense deformation zone due to plate collision (Serhalawan et al. 2017).
The vertical cross-sections of blocks C, D, and E depict the Yogyakarta, Pacitan, and Malang clusters, respectively (Fig. 9). These seismic clusters are located at the forearc of the Java subduction system and are dominated by subduction-related events with thrusting mechanisms and normal-faulting mechanisms in certain areas, based on the GCMT focal mechanism (Fig. 12). The steeper angle of the slab causes an increase in the number of earthquakes towards the east with depths of up to 200 km. On April 10, 2021, a Mw 6.1 earthquake with a  Koulakov et al. (2007) and the AK135 (Kennett et al. 1995) Muttaqy et al. Geoscience Letters (2023 10:5 Fig. 6 Example of the waveform cross-correlation (WCC) process for events recorded at common stations. a P-waves recorded at RTBI station. b S-waves recorded at PWJI station Fig. 7 Comparison of seismic distribution in the Central and East Java region: a before relocation, b after the relocation. Blocks A-F are the area used to plot the vertical cross-sections shown in Fig. 9. The solid-colored circles represent earthquake focus depth, while the grey circles are earthquakes which were eliminated in the relocation process thrusting fault mechanism occurred near the Malang cluster. The event produced strong shaking with MMI V in East Java (http:// shake map. bmkg. go. id/), causing fatalities and damage to buildings. Block F represents an interesting cluster in the south of Banyuwangi, close to the location of the Mw 7.8 Banyuwangi earthquake that occurred in 1994 (Fig. 9). The seismicity in this area is most likely related to the subducting plate behind seamount which triggered the normal faulting earthquake at the outer rise of the Indo-Australian Plate (Abercrombie et al. 2001) (Fig. 12). Additionally, the shallow clustered earthquakes are probably controlled by active inland faults, such as in block A, northern block D and block F, and are associated with the Opak Fault, the Kendeng Thrust Fault, and the Rembang-Madura-Kangean-Sakala (RMKS) Fault Zones, respectively (Fig. 9). The Opak Fault is considered be the cause of the 2006 Yogyakarta earthquake (Mw 6.3); the aftershocks of which were still observed in the data during the period of our study. The geometry of the Opak Fault is still debatable, whether the fault plane is east-or west-dipping. Based on the vertical cross-section of block A, the relocated events are clustered in the east of the Opak Fault lineament with depths between 5 and 20 km, indicating that the fault plane is more likely east-dipping. Based on SAR interferometry observations, it was concluded that the geometry of the Opak Fault is considered to be an east-dipping left-lateral fault which ensures that the hypocenter distribution is in the eastern part of the fault (Tsuji et al. 2009). Several previous studies also support this result and show that the aftershock distribution of the 2006 Yogyakarta earthquake is parallel to the Opak Fault lineament and located 5-10 km to the east (Husni et al. 2018;Wulandari et al. 2018). Furthermore, a recent crustal deformation study suggests that the distribution of these aftershocks is most likely related to the activity of unmapped local faults, instead of the Opak Fault, which are currently accumulating stress in Yogyakarta as the results of an ongoing postseismic deformation of the 2006 Yogyakarta earthquake (Widjajanti et al. 2020).
Furthermore, shallow clustered events at depths of less than 30 km were observed in the northern part of block D, suggesting activity in the Kendeng Thrust Zone (Figs. 9 and 10). This is a major fault zone in the study area; it extends for 200 km from Central to East Java and is an accumulation of thrusts and folds (Pusat Studi Gempa Nasional (PuSGeN) 2017). Evidence of movement in this fault can be observed by the presence of uplifted alluvial terrace along with this fault's activity (Marliyani 2016). Based on their geodetic study, Koulali et al. (2017) estimate the average slip rate of Kendeng Thrust Fault to be about 2.3-4.1 mm/year. However, whether the seismicity is controlled by the local fault or by volcanic activity of Mt. Pandan and Mt. Wilis is still debatable. In 2015, an earthquake in Madiun (Mw 4.2) caused damage to several houses due to its shallow depth and the amplification effect in the north of Mt. Pandan . Previous studies suggest that this event may be related to the local strike-slip fault Sipayung et al. 2018). In contrast, a gravity survey that was conducted around Mt. Pandan indicated that a low-density anomaly, possibly related to hot material or a magma  (Santoso et al. 2018). The survey suggests that the subduction process resulted in fault movement which triggered a magma flow to the surface at the same time. Thus, we conclude that the seismicity in this cluster might be associated with both Kendeng Thrust activity and a magmatic process.
There is a shallow seismic cluster around Rembang and Madura in the northern part of East Java (Fig. 11) which most likely corresponds to the Rembang-Madura-Kangean-Sakala (RMKS) Fault Zone. We suggest that this fault extends to the north of Surabaya where shallow events are observed. Recent destructive earthquakes have occurred in the RMKS Fault Zone, i.e., the Madura earthquake (Mw 4.3) and the Situbondo earthquake (Mw 6.3), both in 2018 but with different mechanisms. The Madura earthquake (Mw 4.3) was more likely related to the strike-slip RMKS Fault, while the Situbondo earthquake (Mw 6.3) has a thrusting mechanism based on the GCMT focal mechanism solution (Fig. 12). This suggests that the Situbondo earthquake had a strong connection with the Back Arc Thrust that may extend from the east.
Several other active inland faults may control the seismicity in the Central and East Java region, for example, the Pasuruan Fault, the Lasem Fault, the Muria Fault, the Semarang Thrust Fault, and the Probolinggo Fault. These have not shown a significant number of earthquakes during the time period of 2009-2017. Hence, "unpaired" events that are not clustered beyond distance weighting were eliminated by the double-difference algorithm. Moreover, earthquakes associated with volcanic activities were also not well-determined due to the limited seismograph network used in this study.

Conclusions
We have successfully determined 1,529 earthquakes in the Central and East Java region in the time period of January 2009-September 2017, using a manual re-picking process. We then relocated 1,127 events by applying waveform cross-correlation data in the double-difference algorithm. Overall, our results show that the seismic pattern in Central and East Java is predominantly distributed in the south of Java Island, such as the Kebumen, Shallow clustered earthquakes in the mainland of the Central and East Java region were also observed; these correspond to active inland faults that include the Opak Fault, the Kendeng Thrust Fault, and the Rembang-Madura-Kangean-Sakala (RMKS) Fault Zone. Based on the relocation results, the seismicity around the Opak Fault indicates east-dipping geometry, since the relocated events were distributed to the east of the Opak Fault lineament at depths between 5-20 km. Meanwhile, the shallow seismic cluster (< 30 km depths) around the Kendeng Thrust Fault in the north of Madiun coincide with volcanoes present there, suggesting that these are triggered by both active local faults and magmatic processes beneath Mt. Pandan and Mt. Wilis. We suggest that the RMKS Fault in the northern part of East Java extends to the north of Surabaya where shallow events are observed. Several other active inland faults have not shown significant seismicity, and earthquakes caused by volcanic activities were not well-determined by the seismic network used in this study.
Additional file 1: Table S1. Relocated earthquake catalog around Central and East Java region determined by this study.
Additional file 2: Figure S1. Histograms of shifted earthquake locations in X (east-west), Y (north-south), and Z (depth) direction after relocation process by using double-difference algorithm with waveform crosscorrelation. Figure S2. Histograms of earthquake locations error in X (eastwest), Y (north-south), and Z (depth) direction after relocation process by using double-difference algorithm with waveform cross-correlation.