Application of Artificial Neural Networks and Singular-Spectral Analysis in Forecasting the Daily Traffic in the Moscow Metro

In this paper, we investigate several approaches to medium-term forecasting of daily passenger traffic volumes in the Moscow metro (MM): 1) artificial neural networks (ANN); 2) singular-spectral analysis as implemented in the “Caterpillar”-SSA package; 3) combined use of the ANN and the “Caterpillar”-SSA approach. We demonstrate that the developed methods and algorithms allow us to conduct medium-term forecasting of passenger traffic in the MM with reasonable accuracy.


Introduction
The Moscow metro is the main transport system in Moscow, carrying up to 10 million passengers a day. Given the dynamics of the MM development and the construction of modern transport infrastructure in Moscow, the role of the MM in passenger transportation will grow. To ensure effective functioning of the MM, constant monitoring of passenger traffic and its near- and medium-term forecasting are necessary.

Preparation of initial data for forecasting
The initial time series, reflecting the dynamics of the daily volume of passenger transportation by the Moscow metro over the past 5 years, contains 1,793 observations. Our analysis has shown [1] that the intensity of passenger traffic differs greatly between working days and weekends or holidays. Moreover, the corresponding distributions of passenger traffic practically do not overlap [1]. This observation made it possible to split the initial series into two series: 1) working days; 2) weekends and holidays. The present study analyzes the traffic in the Moscow metro on working days.
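The split into two series described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `split_by_day_type`, the holiday set, and the toy volumes are all hypothetical, and the rule "Saturday, Sunday, or listed holiday" is an assumption that ignores, for example, weekends officially transferred to working days.

```python
from datetime import date, timedelta


def split_by_day_type(records, holidays):
    """Split (day, volume) records into working days and weekends/holidays.

    Assumption: a day is non-working if it is a Saturday or Sunday, or if it
    appears in the user-supplied `holidays` set.
    """
    working, non_working = [], []
    for day, volume in records:
        is_off = day.weekday() >= 5 or day in holidays
        (non_working if is_off else working).append((day, volume))
    return working, non_working


# Toy week with synthetic volumes (not data from the paper).
start = date(2017, 1, 2)  # a Monday
records = [(start + timedelta(days=i), 5.0 + i) for i in range(7)]
holidays = {date(2017, 1, 2)}  # hypothetical holiday list

working, non_working = split_by_day_type(records, holidays)
print(len(working), len(non_working))
```

Each of the two resulting series can then be analyzed separately, as is done for the working-day series in this paper.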
EPJ Web of Conferences 173, 05009 (2018), https://doi.org/10.1051/epjconf/201817305009, Mathematical Modeling and Computational Physics 2017
Forecasting a time series is possible only when subsequent values of the series are linked to the previous ones. Therefore, at the preliminary stage, a high-frequency component was excluded from the initial series by means of wavelet filtering (see Fig. 1). As an estimate of the correlation interval (forecasting horizon), we used the lag at which the autocorrelation function first crosses the time axis within the confidence interval corresponding to white noise [2]. We have shown that realizations of the noisy component are independent of each other, symmetric with respect to zero, and that their distribution agrees well with the Gaussian law. In this way we isolated the noisy component from the source data, and the filtered series was used for forecasting.

Forecasting passenger traffic using ANN
As the ANN model we used a multilayer perceptron (MLP) with 6 input variables representing factors that affect passenger traffic in the MM: Var1, year of observation; Var2, month; Var3, day of the week; Var4, type of day; Var5, daily consumption of electricity in the Moscow region; and Var6, described below.
Var6 takes different values depending on the stage: 1) the daily traffic in the MM at the ANN training stage; 2) the forecast for the previous day at the ANN-based forecasting stage; 3) the forecast for the current day based on the SSA (when the ANN and SSA approaches are combined). These variables were chosen because they reflect the seasonality of metro traffic, predicted values for them can be obtained by alternative means of obtaining information, and they do not exhibit multicollinearity [3]. The MLP contained 6 input neurons, two hidden layers of 16 and 8 neurons, respectively, and one output neuron, which gives the predicted value. We conducted a series of experiments using the TMVA 4.2.0 package in the ROOT environment [4], and the Fletcher-Reeves learning method was adopted as optimal. Training ran for 1500 epochs; upon its completion the state of the network was fixed and the network was used for forecasting. At the forecasting stage, the passenger traffic value predicted by the network at the current step was fed back to the network input (as variable Var6). The results of the ANN predictions for 30 and 50 days are shown in Figs. 4 and 5. We have found that the relative error, $R_{\mathrm{error}} = (\mathrm{Real} - \mathrm{Forecast})/\mathrm{Real}$, is symmetric with respect to zero and does not exceed 3-5%. This result means that, using the ANN, we can predict passenger traffic in the metro quite accurately.
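The forecasting loop and the relative error described above can be illustrated with a small sketch. It is not the TMVA configuration used in the paper: the network below has the stated 6-16-8-1 layout but random, untrained weights; the tanh hidden activation, the linear output, and all function names are assumptions; and training by the Fletcher-Reeves method is omitted entirely. Inputs are assumed to be scaled to a modest range beforehand.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights for one dense layer; the last weight of each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def build_mlp(sizes, rng):
    """Build layers for the given sizes, e.g. [6, 16, 8, 1]."""
    return [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(layers, inputs):
    """Forward pass: tanh on hidden layers (assumed), linear output neuron."""
    values = list(inputs)
    for index, layer in enumerate(layers):
        is_output = index == len(layers) - 1
        new_values = []
        for weights in layer:
            z = weights[-1] + sum(w * v for w, v in zip(weights, values))
            new_values.append(z if is_output else math.tanh(z))
        values = new_values
    return values[0]


def recursive_forecast(layers, future_features, last_traffic):
    """Multi-step forecast: each prediction becomes Var6 for the next step.

    future_features holds [Var1..Var5] for every day to be predicted.
    """
    forecasts = []
    var6 = last_traffic
    for features in future_features:
        var6 = forward(layers, list(features) + [var6])
        forecasts.append(var6)
    return forecasts


def relative_error(real, forecast):
    """R_error = (Real - Forecast) / Real."""
    return (real - forecast) / real


rng = random.Random(0)
mlp = build_mlp([6, 16, 8, 1], rng)
# Hypothetical scaled calendar/electricity features for 30 future days.
future = [[0.8, 0.1, (d % 5) / 4.0, 0.0, 0.5] for d in range(30)]
predictions = recursive_forecast(mlp, future, last_traffic=0.2)
print(len(predictions))
```

Because each prediction is reused as an input, errors can accumulate over the horizon, which is one reason to check the relative error separately for the 30-day and 50-day forecasts.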

Forecasting passenger traffic using the “Caterpillar”-SSA approach
For forecasting, we used the latest version of the CaterpillarSSA program (version 3.40, Professional M Edition; for details see [5]). The filtered time series was used as input for the CaterpillarSSA program. Before conversion to multivariate form, the analyzed time series was standardized by means of CaterpillarSSA [5]. The length of the caterpillar (the number of components for decomposition) was set equal to 492. This is approximately half the length of the analyzed series, which makes it possible to distinguish all its characteristic features. In the reconstruction of a one-dimensional series, 13 main components were used. Their total contribution was 99.91%. At the prediction stage, the confidence interval was set equal to 0.25.
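The decomposition and reconstruction steps can be illustrated with a minimal basic-SSA sketch. It is not the CaterpillarSSA program and does not include the forecasting step: the function names are hypothetical, the eigenvectors are found by a simple power iteration with re-orthogonalization (chosen only to keep the example dependency-free), and the toy window of 20 with 4 components stands in for the values 492 and 13 used in the paper.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def trajectory_matrix(series, window):
    """Embed the series into a window x K trajectory (Hankel) matrix."""
    k = len(series) - window + 1
    return [[series[i + j] for j in range(k)] for i in range(window)]


def leading_vectors(s, rank, iters=200):
    """Leading eigenvectors of a symmetric matrix by power iteration,
    re-orthogonalized against the vectors already found."""
    n = len(s)
    found = []
    for k in range(rank):
        v = [math.sin((k + 1) * (i + 1)) + 0.5 for i in range(n)]  # fixed start
        for _ in range(iters):
            v = [dot(row, v) for row in s]
            for u in found:
                c = dot(v, u)
                v = [x - c * y for x, y in zip(v, u)]
            norm = math.sqrt(dot(v, v))
            v = [x / norm for x in v]
        found.append(v)
    return found


def ssa_reconstruct(series, window, rank):
    """Return (reconstructed series, contribution share of each component)."""
    x = trajectory_matrix(series, window)
    k = len(x[0])
    s = [[dot(x[i], x[j]) for j in range(window)] for i in range(window)]  # X X^T
    vectors = leading_vectors(s, rank)
    total = sum(s[i][i] for i in range(window))
    shares = [dot(u, [dot(row, u) for row in s]) / total for u in vectors]
    # Project the trajectory matrix onto the span of the chosen vectors.
    approx = [[0.0] * k for _ in range(window)]
    for u in vectors:
        coeffs = [sum(u[i] * x[i][j] for i in range(window)) for j in range(k)]
        for i in range(window):
            for j in range(k):
                approx[i][j] += u[i] * coeffs[j]
    # Diagonal averaging turns the matrix back into a one-dimensional series.
    sums = [0.0] * len(series)
    counts = [0] * len(series)
    for i in range(window):
        for j in range(k):
            sums[i + j] += approx[i][j]
            counts[i + j] += 1
    return [a / c for a, c in zip(sums, counts)], shares


# Synthetic series: linear trend plus a 12-step seasonal wave (not paper data).
series = [0.01 * t + math.sin(2 * math.pi * t / 12) for t in range(60)]
reconstructed, shares = ssa_reconstruct(series, window=20, rank=4)
print(round(sum(shares), 6))
```

The sum of the shares plays the role of the total contribution quoted above: when it is close to 1, the discarded components carry almost none of the variance of the series.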
Figure 2 shows the graphs of the first 13 components of the decomposition of the analyzed series when forecasting 30 days of observation. It can be seen from the graphs that the components under consideration are responsible for the trend and periodic (seasonal) components of the analyzed series. Figure 3 presents: 1) at the top, the original series with its approximation, reconstructed on the basis of the first 13 components; 2) at the bottom, a series rebuilt on the basis of the discarded components. We found that the discarded data are in good agreement with the normal distribution, so it can be assumed that the corresponding process is close to Gaussian noise in its behavior. The results of forecasting by the SSA method for 30 and 50 days are shown in Figs. 4 and 5.

Discussion of results and conclusion
The best forecast was achieved with a feed-forward ANN when a number of factors influencing passenger traffic in the metro were supplied to the ANN input. The prediction by the SSA method turned out to be shifted relative to the data being forecast in the region of large values of passenger traffic. This was also the reason for the displacement of the forecast in the region of large volumes of passenger traffic when the ANN and SSA were used together.
The technique for forecasting traffic in the MM developed in this paper can enhance the efficiency and speed of making management decisions in transport.

Figure 1. From top to bottom: 1) the original time series (transformed to the interval [−1, 1]) containing data on 1024 working days; 2) this series after wavelet filtering; 3) the excluded noisy component

Figure 2. Graphs of the first 13 components when predicting 30 days of observation

Figure 3. Top: the original series with its approximation; bottom: the series corresponding to the discarded components

Figure 4. Comparison of forecasting results for 30 days of observation using the approaches developed in this paper