1 Introduction
Chaotic dynamics are ubiquitous in natural and engineered systems, ranging from weather and climate processes to electronic circuits and structural mechanics. Unlike regular periodic motions, chaotic time series are characterized by sensitive dependence on initial conditions, nonlinearity, and high dimensionality, which make their long-term prediction extremely challenging. Traditional approaches based on analytical modelling or numerical simulation often become intractable when the governing equations are highly nonlinear or the system complexity increases (Zhang et al., 2019). In this context, data-driven methods, particularly neural networks (Sestanovic and Kalinic Milicevic, 2025; Calvo-Rolle et al., 2014), have emerged as effective alternatives for capturing the underlying dynamics and predicting the evolution of chaotic systems.
The success of neural networks in chaotic time series prediction (Fernandes et al., 2020), however, is not determined solely by network architecture; the choice of loss function plays an equally crucial role. A well-designed loss function not only guides the optimization process but also influences the model's ability to capture long-term dynamics rather than merely short-term patterns. Over time, the development of loss functions has become a central factor in improving both predictive accuracy and generalization performance across different applications. Early loss functions mainly include Mean Square Error (MSE), Mean Absolute Error (MAE), and cross-entropy, the latter rooted in the information theory of Shannon (1948). To address robustness issues, Huber (1964) proposed a hybrid of MSE and MAE. Chopra et al. (2005) proposed the contrastive loss to minimize distances between similar samples and maximize those between dissimilar ones, on which the triplet loss (Hadsell et al., 2006) built by introducing additional constraints through triplet comparisons. Customized objectives have also been developed: BLEU-based loss for machine translation (Papineni et al., 2002; Ranzato et al., 2015), BPR loss for recommender systems (Rendle et al., 2009), and focal loss for class imbalance in object detection (Lin et al., 2017). For multi-task learning, a principled deep learning approach weighs multiple loss functions according to the homoskedastic uncertainty of each task (Kendall et al., 2018). In physics-related domains, auxiliary loss functions have been integrated into neural networks to incorporate physical dynamics. Examples include PIDynNet (Liu and Meidani, 2023), a physics-informed neural network for structural systems, and a two-coefficient loss function based on linear multistep methods (Zhang et al., 2023). In addition, Ghazvini et al. (2024) combine MSE and LogCosh to mitigate vanishing gradients, while a weight-constrained RNN optimization (Wu et al., 2020) incorporates parameter constraints and regularization to preserve the input-output relationship.
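For reference, the classical pointwise losses discussed above admit compact definitions. The sketch below (plain NumPy, purely illustrative and not part of the cited works) shows MSE, MAE, and the Huber hybrid, which behaves quadratically for small residuals and linearly beyond a threshold delta:

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean Square Error: quadratic penalty, sensitive to outliers."""
    return np.mean((y_true - y_pred) ** 2)

def mae(y_true, y_pred):
    """Mean Absolute Error: linear penalty, robust but non-smooth at zero."""
    return np.mean(np.abs(y_true - y_pred))

def huber(y_true, y_pred, delta=1.0):
    """Huber loss: quadratic for residuals below delta, linear beyond it."""
    r = np.abs(y_true - y_pred)
    return np.mean(np.where(r <= delta, 0.5 * r**2, delta * (r - 0.5 * delta)))
```

The switch at delta is what gives the Huber loss the robustness to outliers that motivated its introduction.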
The combination of numerical algorithms and neural networks has advanced deep learning for differential equations. The Runge-Kutta Convolutional Neural Network (RKCNN) builds network models using higher-order Runge-Kutta methods (Zhu et al., 2022). Physics-informed neural networks (PINNs) have been integrated with Runge-Kutta schemes for parameter estimation and dynamic modelling (Zhai et al., 2023; He et al., 2023). Connections between the implicit Euler method and neural networks led to the Adaptive Implicit Network (AIM-NET), allowing flexible convergence for improved parameter estimation (Yuan et al., 2020). Implicit Adams Predictor-Corrector Blocks (IABs) combined with Non-local Sparse Attention (NSA) and Attention Feature Fusion (AFF) form the Implicit Adams Predictor-Corrector Module (IAM) (Yin et al., 2023), while discrete ZNN (DZNN) models of Adams-Bashforth type enhance accuracy for time-varying problems (Yang et al., 2020). A linear multistep framework based on the implicit Adams-Moulton scheme approximates full-order models in low-dimensional space (Xie et al., 2019). Finally, MultiPINN, a multi-head neural network enriched with PINNs, incorporates RBF interpolation and embeds physics-based priors from differential equations and boundary conditions (Li, 2024). These methods improve efficiency and performance by integrating numerical schemes with physics-informed loss functions.
Building on the integration of numerical algorithms and neural networks, interpolation techniques have become important in deep learning for data augmentation, feature extraction, and regularization, improving accuracy and robustness. Conditional Encoder-Decoder GANs (CEDGANs) treat interpolation as a conditional generation task, capturing spatial relationships (Zhu et al., 2019). Interpolated Adversarial Training enhances adversarial robustness while preserving generalization (Lamb et al., 2019). D'Ambrosio et al. (2021) combine PINNs with a functional interpolation technique, the Theory of Functional Connections (TFC), to learn optimal controls. Interpolation Consistency Training (ICT) ensures predictions for interpolated points align with interpolated predictions, reducing overfitting (Verma et al., 2022). RAKI performs nonlinear interpolation of missing k-space lines, improving noise resilience in MRI reconstruction (Akçakaya et al., 2019), while a CNN-based interpolation method addresses missing projection data in sparse-view CT (Lee et al., 2017).
Previous studies on loss functions remain inadequate for modelling and predicting complex, high-dimensional, nonlinear dynamical systems. Traditional loss functions, such as MSE, are widely adopted for their simplicity, but they often overlook critical dynamic features in long-term predictions. Although several custom loss functions have been proposed, they generally suffer from limited accuracy and poor stability, which restrict their applicability to a broader class of dynamic systems. To address these challenges, this paper introduces an error term in the prediction phase, thereby proposing an improved custom loss function. The improved version further integrates interpolation and recursive strategies to better capture temporal dependencies and enhance long-term prediction accuracy. This paper selects the dynamic modelling of a ring truss antenna as a numerical case study, representing a typical chaotic system within aerospace structural dynamics. Numerical simulations demonstrate that the improved loss function consistently outperforms both the traditional MSE and the original custom loss function in terms of accuracy and stability. This study presents a systematic approach to designing advanced loss functions and demonstrates its effectiveness in predicting complex, high-dimensional nonlinear dynamical systems.
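To make the recursive strategy concrete, the following sketch is a minimal illustration, not the paper's exact formulation: the one-step predictor `f`, the mean-squared error metric, and the function name are all assumptions. The idea is to roll the model forward in closed loop so that the training loss is exposed to the same compounding error that arises in the prediction phase:

```python
import numpy as np

def recursive_loss(f, x0, targets):
    """Closed-loop (recursive) loss: each prediction is fed back as the
    next input, so the loss accumulates the compounding multi-step error
    that a purely one-step (teacher-forced) loss never sees."""
    loss, x = 0.0, x0
    for y in targets:
        x = f(x)                       # iterate the one-step predictor
        loss += np.mean((x - y) ** 2)  # compare against the true trajectory
    return loss / len(targets)
```

With a perfect one-step map this loss vanishes, while any one-step bias is amplified over the prediction horizon, which is precisely the error term the improved loss function penalizes during training.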
6 Conclusions
In this paper, a new loss function strategy is proposed based on the original linear multistep loss function, under the condition that the time step remains consistent. To test the effects of the traditional loss function (MSE), the custom loss function, and the improved custom loss function on network prediction, different neural networks are used to predict the time series of the beam ring system. Based on the analysis, the key conclusions are summarized as follows:
1. For neural networks of the same size, the traditional MSE loss function is largely ineffective and cannot accurately capture the complex dynamic behaviour of the system. In contrast, a custom loss function based on three types of errors captures the complex dynamics of the system more effectively, thereby significantly enhancing prediction accuracy.
2. Numerical simulations show that the improved loss function achieves higher prediction accuracy for the same neural network size. In addition, the extra computation time incurred by the improved loss function is negligible, and the improvement strategy is generalisable.
3. To further optimise the loss function, we introduced RecursiveLoss. By evaluating the loss in recursive form, it simulates the error that arises in the prediction phase, which effectively improves prediction accuracy. Interpolation methods are then introduced on this basis to reduce the step size of the prediction phase under the same input conditions. Adding interpolation to the prediction error term of the loss function overcomes the dependence of the adapted loss function on the step size and broadens its scope of application. Finally, the applicability of the interpolation method is discussed.
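The interpolation idea in point 3 can be illustrated as follows. This is a hedged sketch only: it assumes simple linear (midpoint) interpolation between successive recursive predictions, and the function name and interface are not taken from the paper. The interpolated states let the loss be evaluated against a reference series sampled at half the prediction step:

```python
import numpy as np

def interpolated_recursive_loss(f, x0, targets_fine):
    """Roll the one-step predictor forward recursively, then linearly
    interpolate midpoints between successive states so the loss is
    evaluated on a grid with half the prediction step size.
    targets_fine: reference series sampled at the finer (half) step."""
    preds, x = [x0], x0
    for _ in range(len(targets_fine) // 2):
        x = f(x)
        preds.append(x)
    fine = []
    for a, b in zip(preds[:-1], preds[1:]):
        fine.append(0.5 * (a + b))  # interpolated midpoint
        fine.append(b)              # predicted grid point
    return float(np.mean((np.asarray(fine) - np.asarray(targets_fine)) ** 2))
```

Because the network itself still advances with its original step, the finer comparison grid comes entirely from interpolation, which is what decouples the loss from the training step size.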
It is worth noting that our improvement of the loss function does not add problem-specific formulas; rather, it incorporates the iterative prediction procedure into the loss function in order to control the errors generated during iteration. This strategy is applicable to any iterative prediction method and can therefore support interdisciplinary research in applied mechanics and machine learning.