ITS-Mina: A Harris Hawks Optimization-Based All-MLP Framework with Iterative Refinement and External Attention for Multivariate Time Series Forecasting

Pourya Zamanvaziri
Amirhossein Sadr
Aida Pakniyat
Dara Rahmati
Published on 4/30/2026
Cross-asset
Machine learning
Deep learning

ITS-Mina is a novel all-MLP framework for multivariate time series forecasting that addresses key limitations of existing MLP-based models. The framework integrates three innovations: (1) iterative refinement via a shared-parameter residual mixer stack, which deepens effective computation without increasing parameter count by reapplying the same mixer block multiple times; (2) an external attention module that replaces self-attention with learnable memory units, capturing cross-sample global dependencies with linear complexity and acting as an implicit regularizer; and (3) Harris Hawks Optimization (HHO) for automatic dropout rate tuning, enabling adaptive regularization tailored to each dataset.

The architecture processes input through instance normalization, iterative refinement (N rounds of a depth-M mixer stack), external attention, and temporal projection with denormalization. Extensive experiments on six benchmark datasets (Traffic, Electricity, ETTh1, ETTh2, ETTm1, ETTm2) show that ITS-Mina achieves state-of-the-art or highly competitive performance compared to eleven baselines, including Transformer-based and MLP-based models. The results demonstrate the effectiveness of iterative refinement, external attention, and HHO-based optimization in improving forecasting accuracy while maintaining computational efficiency.
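
As a rough illustration of the first and last stages of that pipeline, the sketch below normalizes each input window, applies a linear readout along the time axis, and then inverts the normalization. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the per-instance mean/std normalization without learnable affine terms, the function names, and the sizes (lookback 96, horizon 24, 7 channels) are all illustrative choices that the summary does not specify. The mixer rounds and external attention that sit between these two stages are omitted here.

```python
import numpy as np

def instance_normalize(x, eps=1e-5):
    """Normalize each window over its time axis. x has shape (batch, L, C)."""
    mean = x.mean(axis=1, keepdims=True)
    std = x.std(axis=1, keepdims=True) + eps
    return (x - mean) / std, mean, std

def temporal_projection(z, w, b):
    """Linear readout along time: (batch, L, C) -> (batch, H, C)."""
    return np.einsum("blc,lh->bhc", z, w) + b[None, :, None]

def denormalize(y, mean, std):
    """Undo instance normalization using the statistics of the input window."""
    return y * std + mean

rng = np.random.default_rng(0)
batch, L, C, H = 4, 96, 7, 24            # hypothetical sizes
x = rng.normal(loc=5.0, scale=3.0, size=(batch, L, C))

z, mean, std = instance_normalize(x)
w = rng.normal(scale=0.02, size=(L, H))  # untrained readout weights
b = np.zeros(H)
forecast = denormalize(temporal_projection(z, w, b), mean, std)
print(forecast.shape)
```

Because the statistics come from the input window itself, the forecast is mapped back to the scale of that particular sample, which is the usual motivation for pairing instance normalization with a denormalization step.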

Highlights

  • Proposes ITS-Mina, an all-MLP framework for multivariate time series forecasting with three key innovations: iterative refinement via shared-parameter mixer loops, external attention for efficient global context, and HHO-based dropout optimization.
  • Achieves state-of-the-art or highly competitive performance on six benchmark datasets (Traffic, Electricity, ETTh1, ETTh2, ETTm1, ETTm2) against eleven baselines across multiple forecasting horizons.
  • Demonstrates that iterative refinement with weight tying deepens effective computation without increasing parameter count, improving representation quality.
  • Shows that external attention captures cross-sample global dependencies with linear complexity and acts as an implicit regularizer.
  • Introduces HHO for automatic dropout rate tuning, providing adaptive regularization tailored to each dataset.
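
The weight-tying claim can be made concrete with a small parameter count. The sketch below reapplies one depth-M stack of residual mixer blocks N times and compares its parameter count with an untied stack of the same effective depth. The block layout (a time-mixing MLP followed by a channel-mixing MLP, with no biases, normalization, or dropout) is an assumption in the style of TSMixer; the summary only says "residual mixer stack", and the names and sizes are hypothetical.

```python
import numpy as np

def init_mixer_block(rng, L, C, hidden):
    """Weights of one residual mixer block (assumed layout, biases omitted)."""
    return {
        "wt1": rng.normal(scale=0.02, size=(L, hidden)),  # time mixing, in
        "wt2": rng.normal(scale=0.02, size=(hidden, L)),  # time mixing, out
        "wc1": rng.normal(scale=0.02, size=(C, hidden)),  # channel mixing, in
        "wc2": rng.normal(scale=0.02, size=(hidden, C)),  # channel mixing, out
    }

def mixer_block(x, p):
    """Apply time mixing then channel mixing, each with a residual connection."""
    xt = np.swapaxes(x, 1, 2)                       # (batch, C, L)
    xt = np.maximum(xt @ p["wt1"], 0.0) @ p["wt2"]  # ReLU MLP over time
    x = x + np.swapaxes(xt, 1, 2)
    return x + np.maximum(x @ p["wc1"], 0.0) @ p["wc2"]

def iterative_refinement(x, stack, n_rounds):
    """Reapply the same stack n_rounds times; the weights are shared."""
    for _ in range(n_rounds):
        for p in stack:
            x = mixer_block(x, p)
    return x

def count_params(stack):
    return sum(w.size for p in stack for w in p.values())

rng = np.random.default_rng(0)
L, C, hidden, M, N = 96, 7, 32, 2, 3    # hypothetical sizes; M and N as in the text
stack = [init_mixer_block(rng, L, C, hidden) for _ in range(M)]
x = rng.normal(size=(4, L, C))

refined = iterative_refinement(x, stack, n_rounds=N)
tied_params = count_params(stack)       # independent of N
untied_params = N * tied_params         # an untied stack of depth N * M
print(tied_params, untied_params)
```

With these sizes the tied stack holds 13,184 weights regardless of N, while an untied stack with the same number of block applications would hold three times as many.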

Methods

  • Iterative refinement via shared-parameter residual mixer stack: applies the same depth-M mixer stack N times with tied weights, deepening computation without multiplying parameters.
  • External attention module: uses two learnable memory matrices (slots) to capture global inter-sample correlations with O(LCS) complexity, replacing self-attention's O(L^2).
  • Harris Hawks Optimization (HHO) for dropout rate tuning: formulates dropout optimization as a continuous problem and uses HHO's exploration-exploitation balance to find optimal rates.
  • Instance-wise normalization and temporal projection: normalizes input before mixing and applies linear readout with inverse normalization for forecasting.
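
The external attention module in the list above can be sketched in a few lines. The two memory matrices are shared across all samples, so the cost per sample is the two matrix products of size L x C x S, which is where the O(LCS) figure comes from. The double normalization used here (softmax over time steps, then L1 normalization over slots) is borrowed from the original external attention formulation and is an assumption; the summary does not say which normalization ITS-Mina uses. Names and sizes are illustrative.

```python
import numpy as np

def external_attention(x, m_k, m_v, eps=1e-9):
    """x: (batch, L, C); m_k and m_v: (S, C) learnable memories shared by all samples."""
    attn = x @ m_k.T                                       # (batch, L, S)
    attn = np.exp(attn - attn.max(axis=1, keepdims=True))  # stable softmax over L
    attn = attn / attn.sum(axis=1, keepdims=True)
    attn = attn / (attn.sum(axis=2, keepdims=True) + eps)  # L1 normalization over S
    return attn @ m_v, attn                                # (batch, L, C), (batch, L, S)

rng = np.random.default_rng(0)
batch, L, C, S = 2, 96, 7, 64            # S = 64 slots, other sizes hypothetical
x = rng.normal(size=(batch, L, C))
m_k = rng.normal(scale=0.02, size=(S, C))
m_v = rng.normal(scale=0.02, size=(S, C))

out, attn = external_attention(x, m_k, m_v)
print(out.shape, attn.shape)
```

No L x L attention map is ever formed, so the cost grows linearly with the lookback length L for a fixed number of slots.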

Results

  • ITS-Mina achieves state-of-the-art MSE/MAE on the majority of dataset-horizon combinations across six benchmarks.
  • Outperforms Transformer-based models (Informer, Autoformer, FEDformer) and MLP-based models (DLinear, TSMixer) on most settings.
  • Iterative refinement with N=3 rounds and M=2 mixer blocks yields the best average performance.
  • External attention with S=64 slots provides effective global context while maintaining linear complexity.
  • HHO-based dropout tuning selects rates of roughly 0.1 to 0.3, improving generalization over fixed rates.
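
To show how a dropout rate can be treated as a continuous search variable, the sketch below runs a deliberately simplified Harris Hawks Optimization loop over a single rate. It keeps the escaping-energy schedule, the exploration phase, and the plain soft and hard besiege updates, and it omits the Levy-flight "rapid dive" variants of the full algorithm. The objective `surrogate_val_loss` is a hypothetical stand-in for "train the model with this rate and return the validation error"; its shape, the search bounds, and the population settings are assumptions and do not come from the paper.

```python
import numpy as np

def surrogate_val_loss(rate):
    """Hypothetical stand-in for a real train-and-validate run at this dropout rate."""
    return (rate - 0.2) ** 2 + 0.4

def hho_tune_dropout(loss_fn, n_hawks=8, n_iters=30, lb=0.0, ub=0.9, seed=0):
    """Simplified HHO over one scalar; Levy-flight dives are left out."""
    rng = np.random.default_rng(seed)
    hawks = rng.uniform(lb, ub, size=n_hawks)
    losses = np.array([loss_fn(h) for h in hawks])
    rabbit = float(hawks[losses.argmin()])      # best rate found so far
    rabbit_loss = float(losses.min())
    initial_best = rabbit_loss
    for t in range(n_iters):
        for i in range(n_hawks):
            # Escaping energy shrinks over time, shifting from exploration to exploitation.
            energy = 2.0 * rng.uniform(-1.0, 1.0) * (1.0 - t / n_iters)
            if abs(energy) >= 1.0:              # exploration
                if rng.random() >= 0.5:
                    x_rand = hawks[rng.integers(n_hawks)]
                    new = x_rand - rng.random() * abs(x_rand - 2.0 * rng.random() * hawks[i])
                else:
                    new = (rabbit - hawks.mean()) - rng.random() * (lb + rng.random() * (ub - lb))
            elif abs(energy) >= 0.5:            # soft besiege
                jump = 2.0 * (1.0 - rng.random())
                new = (rabbit - hawks[i]) - energy * abs(jump * rabbit - hawks[i])
            else:                               # hard besiege
                new = rabbit - energy * abs(rabbit - hawks[i])
            hawks[i] = np.clip(new, lb, ub)     # keep the rate inside the bounds
            loss = loss_fn(hawks[i])
            if loss < rabbit_loss:
                rabbit, rabbit_loss = float(hawks[i]), float(loss)
    return rabbit, rabbit_loss, initial_best

best_rate, best_loss, initial_best = hho_tune_dropout(surrogate_val_loss)
print(best_rate, best_loss)
```

In a real setting each call to the objective is a full training run, so the population size and iteration count directly set the tuning budget.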