Evaluation of the seasonal to decadal variability in dynamic sea level simulations from CMIP5 to CMIP6

Previous studies have revealed little progress in the ensemble mean of Coupled Model Intercomparison Project Phase 6 (CMIP6) models compared to Phase 5 (CMIP5) models in simulating global dynamic sea level (DSL). This study investigates the performance of the CMIP5 and CMIP6 ensembles in simulating the spatial pattern and magnitude of DSL climatology, seasonal variability, interannual variability, and decadal variability by using the pattern correlation coefficient (PCC) and root-mean-square error (RMSE) as metrics. We show that the top models of the CMIP6 ensemble perform better than those of the CMIP5 ensemble in the simulation of DSL climatology and seasonal and interannual variability, but not DSL decadal variability. An intermodel linear relationship between the RMSE and PCC is found for both the CMIP5 and CMIP6 ensembles; however, this intermodel relationship is more linearly correlated in the CMIP6 ensemble and not significant for DSL decadal variability. The results show that the finer-horizontal resolution models tend to yield a smaller RMSE and a larger PCC in the DSL climatology, seasonal variability, interannual variability but not decadal variability simulations, and the relationship is more evident for the CMIP6 ensemble than for the CMIP5 ensemble.


Introduction
As one of the most severe impacts of anthropogenic climate change, sea level rise has been a key challenge in responding to global warming and future adaptation (Nicholls et al. 2010; Oppenheimer et al. 2019). Tide gauge records show that the global mean sea level rose at a mean rate of approximately 1.7 mm yr−1 during the twentieth century (Church and White 2006; Bindoff et al. 2007). However, the rate has increased to approximately 3.25 [2.88-3.61] mm yr−1 and is estimated to increase with an acceleration of approximately 0.094 [0.082-0.115] mm yr−2 during 2013-2018 according to the latest IPCC AR6 (Fox-Kemper et al. 2021). Rises in sea level pose increased risks of storm surges, flooding, coastal erosion, and salt tide intrusion to coastal environments and communities (Nicholls et al. 2010). Hence, the projection of future sea level change is critical for coastal cities and communities to formulate adaptation policies (Horton et al. 2020; Pörtner et al. 2022).
Currently, projections of long-term dynamic sea level (DSL) changes from anthropogenic forcings mostly rely on the simulations of global coupled climate models from the Coupled Model Intercomparison Project (CMIP) (Yin et al. 2010; Slangen et al. 2014; Lyu et al. 2020). DSL is the dynamic component of regional sea level change caused by changes in seawater density and ocean currents, defined as the local height of the sea surface above the geoid with zero global mean (Gregory et al. 2019; Lyu et al. 2020). The reliability of future DSL projections depends on the CMIP climate models' fidelity in simulating the present-day DSL state (Willis and Church 2012; Lyu et al. 2020). Hence, it is essential to evaluate the representation of DSL fields in global climate model simulations. Modulated by various modes of internal climate variability at regional scales, changes in ocean DSL occur at multiple timescales, e.g., seasonal, interannual, and decadal (Zhang and Church 2012; Stammer et al. 2013; Griffies et al. 2014; Han et al. 2017). In addition to DSL climatology, previous studies have also evaluated the global climate model-simulated magnitude of the tropical Pacific DSL seasonal cycle and global DSL interannual variability, and the pattern of the Pacific DSL decadal variability (Landerer et al. 2014; Lyu et al. 2016). The modes of internal variability that drive DSL variations differ among timescales (Nerem et al. 1999; Sturges and Hong 2001; Feng et al. 2004; Landerer et al. 2008; Zhang and Church 2012), and the ability of CMIP models to simulate these modes also differs (Bellenger et al. 2014; Lyu et al. 2016; Fasullo et al. 2020), so it is necessary to separate the components of DSL at different timescales for evaluation.
*Correspondence: Hailong Liu, lhl@lasg.iap.ac.cn. 1 State Key Laboratory of Numerical Modeling for Atmospheric Sciences and Geophysical Fluid Dynamics (LASG), Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China. 2 College of Earth and Planetary Sciences, University of Chinese Academy of Sciences, Beijing 100049, China.
As the internal variability of the state-of-the-art coupled models is not constrained to be in phase with observations, most studies focus on comparing the spatial patterns and magnitudes of DSL components at various timescales to assess model performance for those DSL components. Landerer et al. (2014) noted that most models in CMIP phase 5 (CMIP5) overestimated the magnitude of the DSL seasonal cycle. However, the CMIP5 ensemble mean underestimated the magnitude of the DSL seasonal cycle. Compared with CMIP phase 3 (CMIP3), CMIP5 did not significantly improve the simulation of the magnitude of the DSL seasonal cycle. Since DSL interannual variability is relatively large in the equatorial Pacific between 20°S and 20°N, Landerer et al. (2008) focused on the simulation of DSL interannual variability over this region and found a large spread of CMIP5 model performance, with a pattern correlation coefficient (PCC) ranging from 0.05 to 0.8. Lyu et al. (2016) noted that most models and multimodel means (MMMs) in the CMIP5 ensemble tend to underestimate the magnitude of Pacific decadal sea-level variability patterns. However, CMIP5 offers a better simulation of decadal variability patterns in the Pacific than CMIP3, which is consistent with the slightly improved representation of the climatological mean states from CMIP3 to CMIP5 (Flato et al. 2014; Landerer et al. 2014). It is still unknown whether the latest-generation CMIP phase 6 (CMIP6) ensemble improves the simulation of DSL components at different timescales compared to the CMIP5 ensemble.
The ocean model resolution in CMIP6 has been significantly improved, with the average horizontal resolution increasing from 87 km in CMIP5 to 58 km in CMIP6. There are different views on the effect of the horizontal resolution of coupled models on DSL simulations. One is that mean sea level bias originates from atmospheric or air-sea coupled processes; hence, improving the ocean model resolution does not affect the DSL mean state bias (Morim et al. 2020; Lyu et al. 2020). Another is that a finer model resolution will improve the simulation of DSL mean state bias in regions of the Antarctic Circumpolar Current (ACC) and western boundary currents (WBCs), such as the Kuroshio and the Gulf Stream, where active mesoscale eddies exist (Penduff et al. 2010; Higginson et al. 2015; Liu et al. 2016; van Westen et al. 2020). Therefore, it is also necessary to thoroughly investigate the effect of the horizontal resolution of the CMIP5 and CMIP6 models on the DSL components at different timescales.
This paper evaluates how well the CMIP5 and CMIP6 models simulate the spatial patterns and magnitudes of the global DSL seasonal cycle, interannual variability, and decadal variability. It also discusses the influence of CMIP5 and CMIP6 coupled model resolutions on the simulation of DSL seasonal to decadal variability on the global scale.

Observational datasets
An absolute dynamic topography with a horizontal resolution of 0.25° from Archiving, Validation, and Interpretation of Satellite Oceanographic data (AVISO, www.aviso.oceanobs.com) from 1993 to 2014 is employed as the observational reference. The global area-weighted mean sea level rise from 1993 to 2014 is removed from the original values before evaluation. Considering the limited period of AVISO, the sea level data over the longer term from the latest European Centre for Medium-Range Weather Forecasts (ECMWF) Ocean Reanalysis System 5 (ORAS5, www.ecmwf.int/en/forecasts/dataset/ocean-reanalysis-system-5) are chosen as the observational reference for the magnitude of DSL decadal variability.

CMIP5 and CMIP6 simulations
The DSL from the output variable zos in historical simulations of 42 CMIP5 and 36 CMIP6 models is used in this study (available from https://esgf-node.llnl.gov/projects/esgf-llnl/). Detailed information on the two phases of CMIP models used in this study, including model names, resolutions, and modeling centers, is given in Additional file 1: Tables S1 and S2. In the multimodel analysis, r1i1p1 data are chosen for CMIP5 and r1i1p1f1 for CMIP6, with equal weight. The DSL values from the CAMS-CSM1.0, GISS-E2.1-G, and MIROC5 models have been converted into the effective sea level by removing the inverse barometer effect from sea ice (Griffies et al. 2014). Some marginal or enclosed seas in the selected CMIP models that exhibit unrealistic DSL biases are masked before the DSL is derived from the model sea surface height above the geoid by removing its time-dependent global area-weighted mean (Landerer et al. 2014; Griffies et al. 2016). For instance, DSL values in MIROC-ESM models are 15 m over Hudson Bay and −15 m over the Mediterranean. The simulation data are monthly means from 1993 to 2014 for CMIP6 and from 1984 to 2005 for CMIP5.
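The removal of the time-dependent global area-weighted mean can be illustrated with a minimal sketch. It assumes a regular latitude-longitude grid on which cell area scales with the cosine of latitude, and it uses `None` to mark land or masked marginal seas; the function names and the nested-list layout are hypothetical and are not taken from the study.

```python
import math

def area_weights(lats_deg):
    """Relative cell-area weights on a regular lat-lon grid (assumption: cos(latitude))."""
    return [math.cos(math.radians(lat)) for lat in lats_deg]

def remove_global_mean(field, lats_deg):
    """Subtract the area-weighted global mean from one 2-D sea surface height field.

    field[j][i] is the value at latitude index j and longitude index i;
    None marks land or masked seas and is ignored in the mean.
    """
    weights = area_weights(lats_deg)
    total = 0.0
    weight_sum = 0.0
    for j, row in enumerate(field):
        for value in row:
            if value is not None:
                total += weights[j] * value
                weight_sum += weights[j]
    mean = total / weight_sum
    return [[None if v is None else v - mean for v in row] for row in field]

def dynamic_sea_level(monthly_fields, lats_deg):
    """Apply the mean removal independently to every time step."""
    return [remove_global_mean(field, lats_deg) for field in monthly_fields]
```

Because the mean is recomputed at every time step, the resulting field has zero global mean in each month, which matches the definition of DSL given in the Introduction.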

Method
All data are first regridded to a common resolution of 1° by bilinear interpolation for further analysis. Based on multiyear climatology, the magnitude of the seasonal cycle is computed as half the difference between the annual maximum and minimum values (Landerer et al. 2014). Before obtaining the DSL interannual signal, the trend and the mean seasonal cycle are removed for each dataset at each grid point. The magnitude of interannual variability is derived as the standard deviation at each grid point of the band-pass-filtered fields with cutoff frequencies of 0.1 and 1 year−1, similar to the procedures in Gleckler et al. (2014) and Landerer et al. (2014). The procedure used to obtain the DSL decadal signal is the same as that for the DSL interannual signal, except that a Lanczos low-pass filter with a cutoff frequency of 0.1 year−1 is applied.
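The following sketch shows one way to compute the seasonal-cycle magnitude and to build a Lanczos low-pass filter for a single grid-point time series. Several choices here are assumptions rather than details from the study: the window length is left to the caller, the weights are normalized to sum to one, and the population standard deviation is used. All function names are hypothetical.

```python
import math
import statistics

def seasonal_magnitude(monthly_climatology):
    """Half the difference between the annual maximum and minimum of a 12-month climatology."""
    return 0.5 * (max(monthly_climatology) - min(monthly_climatology))

def lanczos_lowpass_weights(window, cutoff):
    """Lanczos low-pass weights.

    window: odd number of weights; cutoff: cutoff frequency in cycles per time step
    (for monthly data, 0.1 year^-1 corresponds to 0.1 / 12).
    """
    half = (window - 1) // 2
    n = half + 1
    weights = []
    for k in range(-half, half + 1):
        if k == 0:
            w = 2.0 * cutoff
        else:
            ideal = math.sin(2.0 * math.pi * cutoff * k) / (math.pi * k)
            sigma = math.sin(math.pi * k / n) / (math.pi * k / n)  # Lanczos smoothing factor
            w = ideal * sigma
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]  # normalization is an assumption of this sketch

def apply_filter(series, weights):
    """Weighted running mean; only points with a full window are returned."""
    m = len(weights)
    return [sum(w * x for w, x in zip(weights, series[i:i + m]))
            for i in range(len(series) - m + 1)]

def variability_magnitude(filtered_series):
    """Standard deviation of a filtered series (population form, an assumption)."""
    return statistics.pstdev(filtered_series)
```

Under these assumptions, a band-pass signal between 0.1 and 1 year−1 could be approximated as the difference between two low-pass outputs computed with the same window, one with cutoff 1/12 and one with cutoff 0.1/12 cycles per month.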

Evaluation of DSL components at different timescales
The CMIP6 MMM shows no improvement in simulating the magnitude and spatial pattern of the DSL climatology relative to the CMIP5 MMM. The RMSEs for CMIP6 and CMIP5 are very similar, 0.103 m and 0.105 m, respectively, and the PCC between the simulated DSL climatology and the observation is as high as 0.99 for both the CMIP6 and CMIP5 MMMs. Our results are consistent with Lyu et al. (2020). Both the CMIP5 and CMIP6 MMMs yield similarly good simulations of the DSL climatology, exhibiting positive biases in the Indian Ocean, Northeast Pacific, and tropical Northeast Pacific, negative biases in the Atlantic, and a downsloping meridional gradient bias in the Southern Ocean (Additional file 1: Fig. S1a-c). However, compared with those from the CMIP5 MMM, the positive biases in the North Pacific simulated by the CMIP6 MMM are significantly larger, and the meridional gradient of the Southern Ocean DSL biases is also further increased (Additional file 1: Fig. S1d).
There is also no significant improvement from the CMIP5 to CMIP6 MMM in simulating the magnitudes of DSL seasonal, interannual, and decadal variability. On the global scale, the magnitudes of DSL seasonal, interannual, and decadal variability simulated by the CMIP6 MMM all present slightly smaller RMSEs and slightly larger PCCs relative to the corresponding observations than those simulated by the CMIP5 MMM (Table 1). However, no large, significant difference between the two model generations was found (Fig. 1j, k, and l).
The observed DSL seasonal cycle signals are mainly distributed in the tropics and western boundary regions at the mid-latitudes (Fig. 1a). The spatial patterns of the DSL seasonal cycle signals simulated by the CMIP5 and CMIP6 MMMs are basically consistent (Fig. 1d, g), and their global PCCs with observations are 0.62 and 0.64, respectively. Compared with that from the CMIP5 MMM, the magnitude of the DSL seasonal cycle simulated by the CMIP6 MMM is only significantly improved in regions of the tropical eastern Pacific and the Kuroshio extension (Fig. 1j). The RMSEs of the CMIP5 and CMIP6 MMMs relative to the observations are 0.024 m and 0.022 m, respectively.
The DSL interannual variability signals in the observations are mainly distributed in the tropics and the areas with strong currents, such as the Kuroshio, Gulf Stream, and ACC (Fig. 1b). The spatial features of DSL interannual variability in the tropics are broadly replicated by both the CMIP5 and CMIP6 MMMs (Fig. 1e, h). The global PCCs of the CMIP5 and CMIP6 MMMs relative to the observations are only 0.51 and 0.58, respectively. Compared with that of the CMIP5 MMM, the magnitude of DSL interannual variability simulated by the CMIP6 MMM significantly improves in regions such as the Gulf Stream and the ACC (Fig. 1k). This may largely be related to the finer resolution of the CMIP6 models than the CMIP5 models. The RMSEs of the CMIP5 and CMIP6 MMM-simulated magnitudes of DSL interannual variability relative to the observations are also very close, at 0.024 m and 0.023 m, respectively. It is worth noting that the magnitudes of the DSL seasonal, interannual, and decadal variabilities simulated by both the CMIP5 and CMIP6 MMMs are all weaker than those of the corresponding observations (Fig. 1a-i).
In the large signal areas of the DSL seasonal cycle, such as the eastern Pacific, Kuroshio, and Gulf Stream, the magnitudes simulated by both the CMIP5 and CMIP6 MMMs are approximately 40%-80% weaker than those from the observations. The simulated magnitudes are 40%-100% weaker for the interannual variability and 60%-100% weaker for the decadal variability. These results indicate that the average resolution for CMIP6 may still not be high enough compared to the resolution of the observations.

The representation of top models in the CMIP5 and CMIP6 ensemble
Further, taking RMSE and PCC as metrics to quantify how well the models can simulate the magnitude and spatial pattern of DSL components at various timescales, the intermodel probability density function (PDF) distributions of these two metrics for the mean, seasonal cycle, interannual and decadal DSL components are shown in Fig. 2. Box edges represent the 25th and 75th percentiles in each CMIP ensemble, and white circles are the medians. Although the simulated DSL climatology and the spatial patterns of the seasonal to interannual variability show only slight improvement from CMIP5 to CMIP6 in terms of the ensemble mean or median (smaller RMSE and larger PCC), the top models in the CMIP6 ensemble are better than those in the CMIP5 ensemble. From the perspective of the intermodel PDF distribution of RMSE (PCC), the peak value of the CMIP6 ensemble is smaller (larger) than that of the CMIP5 ensemble, and the overall distribution is skewed to the low- (high-) value regions (Fig. 2a-c and e-g).
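For reference, the two metrics can be sketched as below for a pair of fields flattened over valid ocean points. The use of area weights and of a centered (mean-removed) pattern correlation are assumptions of this sketch; the text does not state the exact formulation, and the function name is hypothetical.

```python
import math

def weighted_rmse_pcc(model, obs, weights):
    """Area-weighted RMSE and centered pattern correlation coefficient.

    model, obs, weights: flat lists of equal length covering valid ocean points only.
    """
    weight_sum = sum(weights)
    # Root-mean-square error of the model field relative to the observations
    rmse = math.sqrt(
        sum(w * (m - o) ** 2 for m, o, w in zip(model, obs, weights)) / weight_sum
    )
    # Weighted spatial means, removed before correlating the two patterns
    model_mean = sum(w * m for m, w in zip(model, weights)) / weight_sum
    obs_mean = sum(w * o for o, w in zip(obs, weights)) / weight_sum
    cov = sum(w * (m - model_mean) * (o - obs_mean)
              for m, o, w in zip(model, obs, weights))
    var_model = sum(w * (m - model_mean) ** 2 for m, w in zip(model, weights))
    var_obs = sum(w * (o - obs_mean) ** 2 for o, w in zip(obs, weights))
    pcc = cov / math.sqrt(var_model * var_obs)
    return rmse, pcc
```

A field that differs from the observations only by a constant offset has a PCC of 1 but a nonzero RMSE, which is why the two metrics are reported together: PCC measures the spatial pattern and RMSE the magnitude.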
The top models that simulate the magnitude of the DSL decadal variability in the CMIP6 ensemble do not perform better than those in the CMIP5 ensemble (Fig. 2d, h); however, there is an increase in the number of top models that best simulate the magnitude and spatial pattern of the DSL decadal variability in the CMIP6 ensemble. The RMSEs of the top models in both the CMIP5 and CMIP6 ensembles are basically the same, but from the perspective of the intermodel PDF distribution of RMSE, the proportion of models that simulate smaller RMSE values in the CMIP6 ensemble is higher than that in the CMIP5 ensemble (Fig. 2d). The situation is essentially the same for the PCC distribution (Fig. 2h).
To further validate the aforementioned conclusion, we chose the top 15% of models in the CMIP5 and CMIP6 ensembles with the lowest RMSE or highest PCC as the high-skill ensemble (Additional file 1: Tables S3 and S4). The composite biases for the magnitude of the DSL climatology, seasonal, interannual, and decadal variability in both the CMIP5 and the CMIP6 high-skill ensembles with the metric RMSE are shown in Fig. 3. Since the results are similar, the results for PCC are shown in Additional file 1: Fig. S3. The improvements in the CMIP6 high-skill ensemble are mainly reflected in the simulation of the DSL seasonal cycle and interannual variability (Fig. 3c-f). For the DSL climatology biases in the CMIP6 high-skill ensemble, even though there are some improvements in the Indian Ocean and Atlantic Ocean, more obvious biases appear in the Northern Pacific Ocean and Southern Ocean, which are closely associated with the biases in wind stress in these areas (Lyu et al. 2020).
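The selection of a high-skill ensemble can be sketched as a simple ranking. The rounding rule for the number of retained models is an assumption, since the text gives only the 15% fraction, and the function name and the score dictionary layout are hypothetical.

```python
def high_skill_models(scores, fraction=0.15, lower_is_better=True):
    """Return the names of the best-scoring models.

    scores: dict mapping model name to a metric value (RMSE or PCC).
    lower_is_better: True for RMSE, False for PCC.
    The count is round(fraction * N), with at least one model kept (assumption).
    """
    n_keep = max(1, round(fraction * len(scores)))
    ranked = sorted(scores, key=scores.get, reverse=not lower_is_better)
    return ranked[:n_keep]
```

With this rounding, a 42-member ensemble would retain 6 models and a 36-member ensemble would retain 5; the actual membership used in the study is the one listed in Additional file 1: Tables S3 and S4.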
The CMIP5 and CMIP6 high-skill ensembles show similar biases in the simulation of the DSL decadal variability (Fig. 3g, h), which is consistent with Fig. 2d. Compared with the CMIP5 high-skill ensemble, the CMIP6 high-skill ensemble shows better skill in the Southern Ocean and in most WBCs (Fig. 3c-f). These regions are the main areas of mesoscale eddy activity; improving the horizontal resolution in CMIP6 models may facilitate the simulation of mesoscale eddies therein (Penduff et al. 2010; Higginson et al. 2015; Liu et al. 2016; van Westen et al. 2020). However, we also find that not all the models in the CMIP5 and CMIP6 high-skill ensembles have higher resolutions (Additional file 1: Tables S3 and S4). Meanwhile, the CMIP6 high-skill ensemble has more high-resolution models than the CMIP5 high-skill ensemble. Model resolution plays an important, but not indispensable, role in DSL simulation.

The effect of models' horizontal resolution
Fig. 3 The biases of the high-skill ensemble (take RMSE as reference, see Table S3 for a list of selected models) mean DSL climatology for a CMIP5 and b CMIP6. c, d are the same as a, b, but for the magnitude of the DSL seasonal cycle. e, f are the same as a, b, but for the magnitude of the DSL interannual variability. g, h are the same as a, b, but for the magnitude of the DSL decadal variability
To explore the effect of model horizontal resolution, we present the scatterplot between RMSE and PCC for the CMIP5 and CMIP6 ensembles, with the circle sizes and colors representing the horizontal resolution (Fig. 4). Here, the model's horizontal resolution is defined as the product of the total longitude degree (360°) and latitude degree (180°) divided by the total grid number. The horizontal resolution defined above is physically equivalent to the area represented by a single grid cell in the model. That is, the smaller the value is, the higher the horizontal resolution of the model. As stated previously, a model with smaller RMSE and larger PCC values has better skill in reproducing the magnitude and spatial pattern of DSL components at various timescales. In general, the two metrics are approximately negatively correlated. That is, models having a smaller RMSE also tend to have a larger PCC. This relationship generally holds for climatology, seasonal cycle, and interannual variability but not for decadal variability (Fig. 4).
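The resolution measure defined in this section reduces to a one-line calculation. The sketch below follows that definition directly; the function name and the example grid sizes are hypothetical.

```python
def horizontal_resolution(n_lon, n_lat):
    """Area per grid cell in degree*degree: (360 * 180) / total number of grid points.

    Smaller values mean a finer (higher) horizontal resolution.
    """
    return (360.0 * 180.0) / (n_lon * n_lat)
```

For example, a 360 x 180 grid gives 1.0 degree*degree per cell, while a 1440 x 720 grid gives 0.0625 degree*degree.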
It is worth noting that the intermodel relationship between RMSE and PCC is more linearly correlated in the CMIP6 ensemble. The intermodel correlation coefficients between RMSE and PCC in simulating DSL climatology and seasonal and interannual variability for CMIP6 reach −0.71, −0.65, and −0.87, respectively, but they are −0.53, −0.64, and −0.61 for CMIP5 (Fig. 4a-f). This means that CMIP6 models tend to have good or bad skill more consistently in terms of both DSL spatial pattern and magnitude than CMIP5 models. However, the linear intermodel relationship between RMSE and PCC does not exist or is not significant in the DSL decadal variability simulation, particularly for the CMIP5 ensemble (Fig. 4g, h). The intermodel correlation coefficients between RMSE and PCC for the CMIP5 and CMIP6 ensembles are 0.12 and −0.16, respectively (Fig. 4g, h). This may be attributed to the large RMSE in some models, such as MIROC4h in CMIP5 with an RMSE of 0.0127 m and AWI-CM-1-1-MR in CMIP6 with an RMSE of 0.0128 m. If the abovementioned models were eliminated, the intermodel correlation coefficients between RMSE and PCC for the CMIP5 and CMIP6 ensembles would become −0.15 and −0.37, respectively.
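The intermodel correlation discussed here is taken to be an ordinary Pearson correlation across models, which is an assumption of the sketch below. The numbers in the usage note are made up for illustration and are not values from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-model metrics: four well-behaved models plus one outlier
rmse = [0.10, 0.12, 0.14, 0.16, 0.30]
pcc = [0.90, 0.80, 0.70, 0.60, 0.95]
r_all = pearson_r(rmse, pcc)                   # outlier included
r_trimmed = pearson_r(rmse[:-1], pcc[:-1])     # outlier removed
```

In this toy case a single model with an unusually large RMSE is enough to mask an otherwise clear negative relationship, which illustrates why the coefficients change once the two large-RMSE models are excluded.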
The models with higher horizontal resolution tend to have larger PCC and smaller RMSE values in simulating DSL climatology, seasonal, and interannual variability (Fig. 4a-f). Higher (lower) horizontal resolution models tend to have better (worse) simulation skill. However, this is not always the case in either the CMIP5 or the CMIP6 models. For the CMIP5 ensemble, not only do the finest-resolution models not show good simulation skill, but some of the coarsest-resolution models also show good skill, especially in simulating DSL climatology and seasonal cycle (Fig. 4a, c). Further calculation shows that the smallest-RMSE and highest-PCC models have finer horizontal resolutions in both the CMIP5 and CMIP6 ensembles. This relationship is more pronounced in the CMIP6 ensemble (Fig. 5) than in the CMIP5 ensemble (Additional file 1: Fig. S2).
Further, the relationship between the model skill and the horizontal resolution also does not hold in the simulation of the DSL decadal variability, particularly for the CMIP5 models (Additional file 1: Fig. S2d, h). However, the higher-resolution models have a higher skill in simulating the DSL decadal variability in CMIP6 (Fig. 5d, h). The two models with the largest RMSEs, MIROC4h in CMIP5 and AWI-CM-1-1-MR in CMIP6, have a relatively high horizontal resolution, indicating that the increased resolution may not guarantee an improvement in the simulation of DSL decadal variability. Other factors, such as physical processes, may dominate the simulation abilities.
The increase in horizontal resolution in both the CMIP5 and CMIP6 ensembles can rarely help improve the simulation of the magnitude of the DSL decadal variability (Fig. 5d, Additional file 1: Fig. S2d). This result shows that the representation of the DSL decadal signal involves more factors than horizontal resolution. As shown by Lyu et al. (2016), large sea level variations are associated with climate modes, such as the IPO; both the simulated magnitude and pattern of such climate modes are essential to simulating sea level variation. Thus, the limited skill in simulating the magnitude and pattern of various climate modes hinders the improvement of DSL decadal variability. Due to the obvious resolution improvement from the CMIP5 to CMIP6 ensemble, an individual CMIP6 model with a higher horizontal resolution can simulate a better spatial pattern of DSL decadal variability (Fig. 5h). This differs from the previous CMIP3 and CMIP5 results obtained by Lyu et al. (2016), who found no clear relationship between an individual model's resolution and its performance in simulating the DSL decadal variability patterns. Although the excellent simulation of the DSL decadal signal requires an understanding of various complicated physical processes, the continuous improvement in model resolution will help achieve this step by step.
The higher the horizontal resolution of a CMIP6 model is, the better it simulates the spatial pattern of the DSL decadal variability. Nevertheless, the improvement in simulating the magnitude of the DSL decadal variability for the CMIP6 models is not significant (Fig. 5d). The DSL decadal variability accounts for only a small proportion of the total DSL signal, so it is difficult to simulate. Additionally, the lack of improvement in the simulated magnitude in CMIP6 models may be attributed to large intermodel uncertainty in the main distribution areas of the DSL decadal signal. In the CMIP5 ensemble, by contrast, finer model horizontal resolution did not significantly improve the ability to simulate either the spatial pattern or the magnitude of DSL decadal variability (Additional file 1: Fig. S2d, h). On a global scale, although the CMIP6 high-resolution models improve the simulation of the spatial pattern of DSL decadal variability, their simulation of its magnitude still needs to be improved.

Conclusions and discussion
Regarding spatial pattern and magnitude, we evaluate DSL climatology and seasonal, interannual, and decadal variability in the CMIP5 and CMIP6 ensembles. The effect of model horizontal resolution on DSL simulations at different timescales is also discussed. The main conclusions are as follows:
1. From the perspective of the ensemble mean, the magnitudes of DSL climatology, seasonal variability, interannual variability, and decadal variability simulated by CMIP6 show no significant improvement compared with those simulated by CMIP5. However, the top models in the CMIP6 ensemble are better than those in the CMIP5 ensemble, except at the decadal timescale.
2. The magnitudes of DSL seasonal, interannual, and decadal variability simulated by both the CMIP5 and CMIP6 models in the tropics, Kuroshio, Gulf Stream, and ACC are significantly weaker than the observations by 40-80%, 40-100%, and 60-100%, respectively.
3. In simulating the spatial pattern and magnitude of DSL climatology, seasonal variability, and interannual variability, models with a smaller RMSE tend to have a higher PCC in both CMIP5 and CMIP6. However, this relationship is not significant at the decadal timescale.
4. The finer-horizontal-resolution models tend to yield a smaller RMSE and a larger PCC. This is true for the simulation of DSL climatology, seasonal variability, and interannual variability. The relationship is more evident in the CMIP6 ensemble than in the CMIP5 ensemble.
Our work reveals that CMIP6 high-resolution models better simulate the spatial pattern and magnitude of DSL climatology, seasonal cycle, and interannual variability. However, this does not mean that a higher-resolution model is necessarily more capable of simulating the DSL components at different timescales, because the internal signals at different timescales involve various complicated physical processes. Hence, better simulation of these DSL internal variabilities requires not only an improvement in resolution but also an accurate representation of various complex physical processes, such as the IPO and air-sea coupling processes mentioned by Lyu et al. (2016, 2020). The reasons for the reduction in specific simulation biases need to be further analyzed and should be connected to the underlying physical processes. Our conclusions on the effect of the horizontal resolution of the CMIP6 coupled models indicate that the ability of the CMIP6 high-resolution models to simulate DSL decadal variability needs to be improved on the global scale. The results above will guide our further projections of future DSL changes so that we can choose suitable models to project such changes, regardless of the timescale or region.
The observed DSL decadal variability signals are mainly located over the tropical northwestern Pacific, the tropical northeastern Pacific, and the areas with strong currents, such as the Kuroshio, Gulf Stream, and ACC (Fig. 1c). The CMIP5 and CMIP6 MMMs barely describe the spatial pattern of the DSL decadal variability signals in the tropics (Fig. 1f, i), particularly for the large signal in the tropical northeastern Pacific; thus, the PCCs of the CMIP5 and CMIP6 MMM-simulated DSL decadal variability signals relative to the observations are only 0.42 and 0.45, respectively, which are lower than those for the seasonal and interannual variabilities. The CMIP6 MMM-simulated DSL decadal variability signal is only significantly improved in a tiny part of the Gulf Stream and the ACC relative to that from the CMIP5 MMM (Fig. 1l), with the RMSE decreasing from 0.0081 m for the CMIP5 MMM to 0.0079 m for the CMIP6 MMM.

Fig. 1 The magnitude of DSL (unit: m) annual cycle for a AVISO, d the CMIP5 MMM, and g the CMIP6 MMM, and j the difference between CMIP6 MMM and CMIP5 MMM. b, e, h, k are the same as a, d, g, j, but for the magnitude of DSL interannual variability. c, f, i, and l are the same as a, d, g, and j, but for the magnitude of DSL decadal variability. Stippling in j, k and l indicates where the difference between CMIP6 and CMIP5 is statistically significant at the 99% confidence level based on the two-sample t-test

Fig. 2 The intermodel PDF distribution of DSL RMSE for CMIP5 and CMIP6 models of the magnitude of a the climatology, b the seasonal, c the interannual, and d the decadal variability. e, f, g, and h are the same as a, b, c, and d, but for the PCC. Box edges indicate the 25th and 75th percentiles in each CMIP ensemble, with the median as a white circle

Fig. 4 The intermodel scatterplot between the RMSE and the PCC of a the climatology, c the seasonal, e the interannual, and g the decadal variability magnitude from DSL in the CMIP5 ensemble. b, d, f, h are the same as a, c, e, g but for the CMIP6 ensemble. The number for each model can be found in Tables S1 and S2. The circles' sizes and colors denote each model's horizontal resolution; the unit of the colorbar is degree*degree. The higher the horizontal resolution, the smaller the circle. The red lines are linear regressions between the RMSE and the PCC, and pink shadings denote the 95% confidence range. The correlation coefficients between the two metrics are also listed at the top right of each panel. The correlations with an asterisk are significant at the 95% confidence level. Note that model inmcm4 is not included in a

Fig. 5 Intermodel scatterplot of DSL RMSE and the model horizontal resolution for a climatology, b annual cycle, c interannual variability, and d decadal variability for CMIP6 models. e, f, g, h are the same as a, b, c, d, but for the intermodel relationship between the PCC and the model horizontal resolution. The blue number denotes the number of the CMIP6 model in Table S1. The black lines are linear regressions between the two metrics, and the red lines denote the 95% confidence range. The correlation coefficient with an asterisk is significant at the 95% confidence level

Table 1 Comparison of DSL RMSE (unit: m) and PCC for the magnitude of the climatology, the seasonal, the interannual, and the decadal variability between CMIP5 and CMIP6 MMM (in square brackets)