Datasets
Table of Contents
Overview
Univariate time series
The univariate dataset includes 8,068 time series carefully curated from 16 open-source datasets across multiple domains. Table 1 groups the univariate dataset by sampling frequency; for each frequency category, it reports the number of time series exhibiting each characteristic (seasonality, trend, shifting, transition, and stationarity) and the average series length.
Frequency | #Series | Seasonality | Trend | Shifting | Transition | Stationarity | Average Length |
---|---|---|---|---|---|---|---|
Yearly | 1,500 | 611 | 1,086 | 978 | 633 | 354 | 32 |
Quarterly | 1,514 | 486 | 933 | 889 | 894 | 471 | 97 |
Monthly | 1,674 | 883 | 884 | 778 | 1,212 | 667 | 259 |
Weekly | 805 | 253 | 330 | 445 | 407 | 372 | 536 |
Daily | 1,484 | 374 | 502 | 487 | 1,176 | 714 | 4,951 |
Hourly | 706 | 435 | 276 | 284 | 680 | 472 | 5,109 |
Other | 385 | 75 | 248 | 236 | 195 | 124 | 1,678 |
Total | 8,068 | 3,117 | 4,259 | 4,097 | 5,197 | 3,174 | 1,569 |
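The Total row can be cross-checked from the per-frequency rows. The snippet below is a minimal sketch that assumes the overall average length is the series-count-weighted mean of the per-frequency averages; the table does not state how that figure was computed, and the per-frequency averages are already rounded, so the result is approximate. The variable names are illustrative only.

```python
# (number of series, average length) per frequency, copied from Table 1.
rows = {
    "Yearly": (1500, 32),
    "Quarterly": (1514, 97),
    "Monthly": (1674, 259),
    "Weekly": (805, 536),
    "Daily": (1484, 4951),
    "Hourly": (706, 5109),
    "Other": (385, 1678),
}

# Total number of series across all frequency categories.
total_series = sum(n for n, _ in rows.values())

# Assumption: the overall average is weighted by the number of series.
weighted_avg_len = sum(n * avg for n, avg in rows.values()) / total_series

print(total_series)             # 8068
print(round(weighted_avg_len))  # 1569
```

Under this assumption, both values agree with the Total row of the table.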
Multivariate time series
The multivariate datasets include 25 multivariate time series from 10 domains. Sampling frequencies range from 5 minutes to 1 month, feature dimensions range from 5 to 2,000, and series lengths range from 728 to 57,600. This diversity enables comprehensive benchmarking of different forecasting methods. To ensure fair comparisons, each dataset is split chronologically using a fixed ratio, e.g., 7:1:2 or 6:2:2, for training, validation, and testing.
Dataset | Domain | Frequency | Lengths | Dim | Split | Description |
---|---|---|---|---|---|---|
METR-LA | Traffic | 5 mins | 34,272 | 207 | 7:1:2 | Traffic speed dataset collected from loop detectors in the LA County road network |
PEMS-BAY | Traffic | 5 mins | 52,116 | 325 | 7:1:2 | Traffic speed dataset collected from the CalTrans PeMS |
PEMS04 | Traffic | 5 mins | 16,992 | 307 | 6:2:2 | Traffic flow time series collected from the CalTrans PeMS |
PEMS08 | Traffic | 5 mins | 17,856 | 170 | 6:2:2 | Traffic flow time series collected from the CalTrans PeMS |
Traffic | Traffic | 1 hour | 17,544 | 862 | 7:1:2 | Road occupancy rates measured by 862 sensors on San Francisco Bay area freeways |
ETTh1 | Electricity | 1 hour | 14,400 | 7 | 6:2:2 | Power transformer 1, comprising seven indicators such as oil temperature and useful load |
ETTh2 | Electricity | 1 hour | 14,400 | 7 | 6:2:2 | Power transformer 2, comprising seven indicators such as oil temperature and useful load |
ETTm1 | Electricity | 15 mins | 57,600 | 7 | 6:2:2 | Power transformer 1, comprising seven indicators such as oil temperature and useful load |
ETTm2 | Electricity | 15 mins | 57,600 | 7 | 6:2:2 | Power transformer 2, comprising seven indicators such as oil temperature and useful load |
Electricity | Electricity | 1 hour | 26,304 | 321 | 7:1:2 | Hourly electricity consumption in kWh from 2012 to 2014 |
Solar | Energy | 10 mins | 52,560 | 137 | 6:2:2 | Solar production records collected from 137 PV plants in Alabama |
Wind | Energy | 15 mins | 48,673 | 7 | 7:1:2 | Wind power records from 2020-2021 at 15-minute intervals |
Weather | Environment | 10 mins | 52,696 | 21 | 7:1:2 | Recorded every 10 minutes for the whole year 2020; contains 21 meteorological indicators |
AQShunyi | Environment | 1 hour | 35,064 | 11 | 6:2:2 | Air quality datasets from a measurement station, over a period of 4 years |
AQWan | Environment | 1 hour | 35,064 | 11 | 6:2:2 | Air quality datasets from a measurement station, over a period of 4 years |
ZafNoo | Nature | 30 mins | 19,225 | 11 | 7:1:2 | From the Sapflux data project; includes sap flow measurements and environmental variables |
CzeLan | Nature | 30 mins | 19,934 | 11 | 7:1:2 | From the Sapflux data project; includes sap flow measurements and environmental variables |
FRED-MD | Economic | 1 month | 728 | 107 | 7:1:2 | Time series showing a set of macroeconomic indicators from the Federal Reserve Bank |
Exchange | Economic | 1 day | 7,588 | 8 | 7:1:2 | ExchangeRate collects the daily exchange rates of eight countries |
NASDAQ | Stock | 1 day | 1,244 | 5 | 7:1:2 | Records opening price, closing price, trading volume, lowest price, and highest price |
NYSE | Stock | 1 day | 1,243 | 5 | 7:1:2 | Records opening price, closing price, trading volume, lowest price, and highest price |
NN5 | Banking | 1 day | 791 | 111 | 7:1:2 | Daily cash withdrawals from ATMs in the UK |
ILI | Health | 1 week | 966 | 7 | 7:1:2 | Patient indicators recorded by the Centers for Disease Control and Prevention |
Covid-19 | Health | 1 day | 1,392 | 948 | 7:1:2 | Provides opportunities for researchers to investigate the dynamics of COVID-19 |
Wike2000 | Web | 1 day | 792 | 2,000 | 7:1:2 | Daily page views of 2,000 Wikipedia pages |
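The ratios in the Split column describe a chronological partition into training, validation, and test segments. The following is a minimal sketch of such a split; `chronological_split` is a hypothetical helper written for illustration, not a function of the benchmark, and the integer rounding of the boundaries is an assumption that may differ from the actual implementation.

```python
def chronological_split(n_steps, ratio=(7, 1, 2)):
    """Return (train, val, test) slices over time steps, in time order.

    Hypothetical helper: boundaries are floored to whole time steps,
    which is an assumption, not the benchmark's documented behavior.
    """
    total = sum(ratio)
    train_end = n_steps * ratio[0] // total
    val_end = n_steps * (ratio[0] + ratio[1]) // total
    return (
        slice(0, train_end),        # earliest segment: training
        slice(train_end, val_end),  # middle segment: validation
        slice(val_end, n_steps),    # latest segment: testing
    )


# Example with ETTh1 from the table: 14,400 time steps, split 6:2:2.
train, val, test = chronological_split(14400, ratio=(6, 2, 2))
print(train.stop - train.start)  # 8640
print(val.stop - val.start)      # 2880
print(test.stop - test.start)    # 2880
```

Because the slices are contiguous and ordered in time, no future observations leak into the training segment, which is the point of splitting chronologically rather than at random.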