OpenTS-Leaderboards
Leaderboard for multivariate time series forecasting
Rules:
For multivariate forecasting algorithms, we consider 25 datasets and 2 error metrics, i.e., MAE and MSE. For each dataset, we consider 4 forecasting horizons. This gives 200 (25 * 4 * 2) unique evaluation settings. Details of the settings, the detailed results, and downloadable evaluation results for each of the 25 multivariate time series are linked from this page.
For each forecasting algorithm, we count the number of times that the algorithm receives a gold, silver, or bronze medal, i.e., achieves the lowest, 2nd lowest, or 3rd lowest error in an evaluation setting, shown as 🥇, 🥈, and 🥉, respectively.
We provide three different types of scores for ranking the forecasting algorithms. First, the score equals the number of gold medals. Second, the score is the sum of the numbers of gold, silver, and bronze medals. Third, the score is the weighted sum of the numbers of gold, silver, and bronze medals, where the weights can be customized. The larger the score, the higher the ranking.
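The medal counting and the three score types described above can be sketched as follows. This is a minimal illustration, not the leaderboard's actual implementation: the function names, the input layout, the default weights (3, 2, 1), and the tie handling (ties keep input order) are all assumptions, and the error values in the usage note are made up.

```python
from collections import Counter

MEDALS = ("gold", "silver", "bronze")


def count_medals(errors):
    """Count medals per algorithm.

    errors maps an evaluation setting, e.g. (dataset, horizon, metric),
    to a dict {algorithm: error}. Lower error is better.
    Assumption: ties are broken by input order (sorted() is stable).
    """
    medals = {}
    for by_algo in errors.values():
        # Make sure every algorithm appears, even with zero medals.
        for algo in by_algo:
            medals.setdefault(algo, Counter())
        ranked = sorted(by_algo, key=by_algo.get)
        # zip stops after three entries, so only the top 3 get medals.
        for medal, algo in zip(MEDALS, ranked):
            medals[algo][medal] += 1
    return medals


def compute_scores(medals, weights=(3, 2, 1)):
    """Return the three score types; the default weights are hypothetical."""
    scores = {}
    for algo, c in medals.items():
        g, s, b = c["gold"], c["silver"], c["bronze"]
        scores[algo] = {
            "gold_only": g,
            "total": g + s + b,
            "weighted": weights[0] * g + weights[1] * s + weights[2] * b,
        }
    return scores
```

For example, with two settings and four placeholder algorithms, `count_medals({"s1": {"A": 0.1, "B": 0.2, "C": 0.3, "D": 0.4}, "s2": {"A": 0.5, "B": 0.2, "C": 0.3, "D": 0.1}})` gives A and D one gold medal each, B two silver medals, and C two bronze medals. Ranking by any of the three scores then amounts to sorting the algorithms by the chosen score in descending order.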
Profile1 refers to a subset of ten datasets commonly used in recent literature, including Electricity, ETTm1, ETTm2, ETTh1, ETTh2, Traffic, Solar, Weather, ILI, and Exchange.
We consider four forecasting horizons F: {24, 36, 48, 60} for FredMd, NASDAQ, NYSE, NN5, ILI, Covid-19, and Wike2000, and another four forecasting horizons, {96, 192, 336, 720}, for all other datasets, which are longer. We test look-back window lengths H of {36, 104} for FredMd, NASDAQ, NYSE, NN5, ILI, Covid-19, and Wike2000, and {96, 336, 512} for all other datasets. For each method, we adhere to the hyper-parameters specified in its original paper. Additionally, we perform a hyper-parameter search over multiple hyper-parameter sets, with a limit of 16 sets. The best result is then selected from these evaluations, contributing to a comprehensive and unbiased assessment of each method’s performance. Please note that we have retested the algorithms, so the results may differ from those in the TFB paper.
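The horizon and look-back rule above, together with the count of 200 evaluation settings, can be expressed as a small lookup. This is only a sketch: the helper names are hypothetical, and the dataset names follow the spelling used on this page.

```python
# Datasets that use the shorter horizons and look-back windows.
SHORT_DATASETS = {"FredMd", "NASDAQ", "NYSE", "NN5", "ILI", "Covid-19", "Wike2000"}
METRICS = ("MAE", "MSE")


def horizons(dataset):
    """Forecasting horizons F for a dataset."""
    return (24, 36, 48, 60) if dataset in SHORT_DATASETS else (96, 192, 336, 720)


def lookbacks(dataset):
    """Candidate look-back window lengths H for a dataset."""
    return (36, 104) if dataset in SHORT_DATASETS else (96, 336, 512)


def evaluation_settings(datasets):
    """Enumerate (dataset, horizon, metric) evaluation settings."""
    return [(d, h, m) for d in datasets for h in horizons(d) for m in METRICS]
```

Every dataset contributes 4 horizons times 2 metrics, i.e., 8 settings, so passing all 25 dataset names to `evaluation_settings` yields the 200 settings mentioned in the rules. The look-back window is not part of a setting; it is one of the quantities tuned per method.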