[IEEE International Joint Conference on Neural Networks, 2003. - Portland, Oregon USA (July 20 - 24,...

Neural Smoothing Transition Coefficients for Nonlinear Processes in Mean and Variance

Maria Luiza F. Velloso, Marley M.B.R. Vellasco, Marco A. P. Cavalcante, Cristiano C. Fernandes

DETEL, UERJ - Rio de Janeiro State University; DEE, PUC-Rio

Rio de Janeiro - BRAZIL

Abstract-Additive models have been the preferred choice in nonlinear modeling, parametric or nonparametric, of the conditional mean or variance. A new class of nonlinear additive varying coefficient models is presented in this paper. The coefficients are modeled by neural networks (multilayer perceptrons), and both the conditional mean and the conditional variance are explicitly modeled. The learning algorithm of the neural network is based on likelihood maximization. Case studies with a nonlinear-in-variance synthetic series and a nonlinear-in-mean real series are presented.


Index Terms- Nonlinear time series, conditional variance, varying coefficient models.

I. INTRODUCTION

Linear Gaussian models have dominated the development of time series model building for the past decades. These models have been reasonably successful as a practical tool for analysis, forecasting, and control. They represent the objective world to a good first approximation. Although linear models are conceptually convenient and analytically tractable, they do not always provide an adequate framework for modeling real problems. It is well known that real systems are usually nonlinear, and linear models cannot correctly capture certain features, such as limit cycles, asymmetry, amplitude-dependent frequency responses, jump phenomena, and chaos.

Over recent years, several nonlinear time series models have been proposed. Where interpretation is one of the main concerns, nonlinearity has been treated more as a simple one-step extension of the linear formulation, and several nonlinear models were proposed as generalizations of well-known linear models. Many nonlinear models have been suggested by economic theory. One class that does seem to be potentially relevant involves switching regimes.

Additive models replace the sum of linear functions of the regressors by a sum of nonlinear functions. Allowing the coefficients to vary with the values of other covariates, one obtains a class of varying coefficient models.

Varying coefficient, or switching, models describe the changes in the structure that generates the observations, or the transitions between regimes.
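As a generic illustration of the three model families just described, the forms below contrast a linear model, an additive model, and a varying coefficient model. The symbols $k$, $\beta_i$, $f_i$, and $z$ are notation assumed here for illustration and are not taken from the paper.

```latex
% Linear model: fixed coefficients on each regressor
y = \beta_0 + \sum_{i=1}^{k} \beta_i \, x_i + \varepsilon

% Additive model: each linear term is replaced by a nonlinear function
y = \beta_0 + \sum_{i=1}^{k} f_i(x_i) + \varepsilon

% Varying coefficient model: coefficients depend on other covariates z
y = \beta_0(z) + \sum_{i=1}^{k} \beta_i(z) \, x_i + \varepsilon
```

In the third form the model stays linear in the regressors for any fixed $z$, which is what keeps the coefficients interpretable.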

If the conditional mean of a process is found to involve certain variables, there is every reason to expect these, and possibly other variables, to affect higher moments. Usually, the variance is the next most important moment after the mean. If the variance is not constant, the process is said to be heteroskedastic.

This paper proposes a formulation that combines ideas from additive varying coefficient models and from artificial neural networks. A new class of nonlinear additive varying coefficient models is presented, the Neural Smoothing Transition Coefficients Model (NSTCM). The coefficients are modeled by neural networks (multilayer perceptrons), and both the conditional mean and the conditional variance are explicitly modeled. The learning algorithm of the neural network is based on likelihood maximization. A feedforward neural network with one hidden layer estimates the coefficients of the mean and variance models.

Section II gives a brief description of the proposed model. Case studies with a nonlinear-in-variance synthetic series and a nonlinear-in-mean real series are presented in Section III. Concluding remarks are made in Section IV.

II. THE NSTCM MODEL

NSTCM models can be described as follows.

$$y_t = \alpha_0 + \sum_{i=1}^{p} \lambda_{i,t}\, y_{t-i} + \sum_{i=1}^{m} \beta_{i,t}\, x_{i,t} + \sum_{i=1}^{q} \delta_{i,t}\, e_{t-i} + e_t, \qquad (1)$$

with $e_t = \sigma_t \varepsilon_t$, where $\varepsilon_t$ are independent and identically distributed, $x_{i,t}$ represent exogenous variables, and $\lambda_{i,t}$, $\beta_{i,t}$, and $\delta_{i,t}$ are neural varying coefficients. These neural coefficients are modeled by neural networks with one hidden layer and a squashing transfer function (universal approximators).
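A minimal sketch of how a one-hidden-layer network could produce the varying coefficients of Eq. (1) is given below. It covers only the autoregressive part of the model (no exogenous or moving-average terms). The function names, the weight names `W_h`, `b_h`, `W_o`, `b_o`, the constant intercept `alpha0`, and the choice of `tanh` as the squashing function are all assumptions made for illustration, not details given in the paper.

```python
import numpy as np

def neural_coefficients(z, W_h, b_h, W_o, b_o):
    """Map the network input z_t to a vector of time-varying coefficients."""
    hidden = np.tanh(W_h @ z + b_h)  # squashing hidden layer (tanh is an assumption)
    return W_o @ hidden + b_o        # linear output layer: one output per coefficient

def conditional_mean(y_lags, z, alpha0, params):
    """Autoregressive part of Eq. (1): alpha0 + sum_i lambda_{i,t} * y_{t-i}."""
    lambdas = neural_coefficients(z, *params)
    return alpha0 + lambdas @ y_lags

# Hypothetical sizes: one network input, three hidden units, two lags.
rng = np.random.default_rng(0)
n_in, n_hidden, n_lags = 1, 3, 2
params = (
    rng.normal(size=(n_hidden, n_in)),    # W_h: input-to-hidden weights
    rng.normal(size=n_hidden),            # b_h: hidden biases
    rng.normal(size=(n_lags, n_hidden)),  # W_o: hidden-to-output weights
    rng.normal(size=n_lags),              # b_o: output biases
)
z_t = np.array([0.5])          # network input at time t
y_lags = np.array([1.2, 0.8])  # y_{t-1}, y_{t-2}
mean_t = conditional_mean(y_lags, z_t, alpha0=0.1, params=params)
```

When the hidden-to-output weights are zero, the coefficients reduce to the output biases and the model collapses to an ordinary linear autoregression.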

Therefore, the varying coefficients are modeled as follows.

$z_t$ is the input data set of the neural networks and $F$ is the hidden layer transfer function.

Let $\sigma_t^2$ be the variance of $e_t$. If the process is heteroskedastic, it could be modeled as a GARCH [4] process with varying coefficients.

(3)

0-7803-7898-9/03/$17.00 ©2003 IEEE

Assuming a Gaussian distribution, the learning algorithm of the neural network is based on likelihood maximization and estimates the mean and variance coefficients simultaneously, by maximizing

$$\log p(e_t \sigma_t^{-1}) = -0.5 \log 2\pi - 0.5\, e_t^2 \sigma_t^{-2}. \qquad (4)$$
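As a hedged illustration of a likelihood-based training criterion, the sketch below computes a Gaussian negative log-likelihood for residuals with time-varying variance. It uses the standard form, which includes a log-variance term that does not appear in Eq. (4) as printed. That term, the function name `gaussian_nll`, and the summation over observations are assumptions made here.

```python
import numpy as np

def gaussian_nll(e, sigma2):
    """Negative Gaussian log-likelihood of residuals e_t with variances sigma2_t."""
    e = np.asarray(e, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    # Per-observation terms: 0.5*log(2*pi) + 0.5*log(sigma2_t) + 0.5*e_t^2/sigma2_t
    return 0.5 * np.sum(np.log(2.0 * np.pi) + np.log(sigma2) + e**2 / sigma2)

# A zero residual with unit variance leaves only the 0.5*log(2*pi) constant.
nll_unit = gaussian_nll([0.0], [1.0])
```

Minimizing this quantity is equivalent to maximizing the Gaussian likelihood. For a single residual, the log-variance term makes the criterion smallest when the variance equals the squared residual, so the variance cannot grow without penalty.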

III. EXPERIMENTAL RESULTS

The first case concerns a synthetic time series that combines an ARCH process [7] (GARCH models are generalizations of ARCH models) with a SETAR process [11].

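$$y_t = \sigma_t \varepsilon_t$$

$$\sigma_t^2 = \begin{cases} 0.1 + 0.7\, y_{t-1}^2, & y_{t-1} \le 0 \\ 0.3 + 0.5\, y_{t-1}^2, & y_{t-1} > 0 \end{cases}$$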

where $\varepsilon_t$ is white noise with unit variance, and $\varepsilon_t$ and $y_{t-1}$ are independent.
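The synthetic process above can be simulated with a short loop. The sketch below is one possible way to do it. Gaussian white noise, the starting value `y0 = 0`, the series length, and the seed are assumptions, since the paper does not state them.

```python
import numpy as np

def simulate_threshold_arch(n, seed=0, y0=0.0):
    """Simulate y_t = sigma_t * eps_t with a regime-dependent ARCH(1) variance."""
    rng = np.random.default_rng(seed)
    y = np.empty(n)
    sigma2 = np.empty(n)
    prev = y0  # plays the role of y_{t-1}
    for t in range(n):
        if prev <= 0:
            sigma2[t] = 0.1 + 0.7 * prev**2  # regime y_{t-1} <= 0
        else:
            sigma2[t] = 0.3 + 0.5 * prev**2  # regime y_{t-1} > 0
        y[t] = np.sqrt(sigma2[t]) * rng.standard_normal()  # Gaussian noise assumed
        prev = y[t]
    return y, sigma2

y, sigma2 = simulate_threshold_arch(500)
```

Returning the true conditional variance alongside the series makes it possible to compare an estimated variance against the real one.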

The model structure to be estimated is

The gradient used for training the neural network was

In order to compare our model with the traditional approach and assess the effectiveness of this technique, we fitted a traditional ARCH model [4] (GARCH models are generalizations of ARCH models).
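For reference, the conditional variance of a fixed-coefficient ARCH model can be sketched as follows. The order (ARCH(1)), the parameter names `a0` and `a1`, the example values, and the use of the unconditional variance as the starting value are assumptions, since the paper does not give the order or the fitted parameters of its baseline.

```python
import numpy as np

def arch1_variance(y, a0, a1):
    """Conditional variance of an ARCH(1) model: sigma2_t = a0 + a1 * y_{t-1}^2."""
    y = np.asarray(y, dtype=float)
    sigma2 = np.empty_like(y)
    sigma2[0] = a0 / (1.0 - a1)       # unconditional variance as start (needs a1 < 1)
    sigma2[1:] = a0 + a1 * y[:-1]**2  # same coefficients in every regime
    return sigma2

sigma2_hat = arch1_variance([0.0, 1.0, -2.0], a0=0.2, a1=0.6)
```

Unlike the synthetic process, this baseline applies the same `a0` and `a1` whatever the sign of the previous observation.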

Figure 1 shows a subset of the real variance and the estimated NSTCM variance.

Figure 2 shows a subset of the real variance and the estimated ARCH variance.

We used the Jarque-Bera test, a parametric hypothesis test of composite normality. The Jarque-Bera test is a two-sided test of composite normality with the sample mean and sample variance used as estimates of the population mean and variance, respectively.

The test statistic is based on estimates of the sample skewness and kurtosis of the normalized data.

The null hypothesis is that $\varepsilon_t$ is normal with unspecified mean and variance.

The test did not reject the null hypothesis at the 5% significance level for the NSTCM estimates, but rejected it for the ARCH model.
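The Jarque-Bera statistic described above can be computed directly from the sample skewness and kurtosis. The sketch below uses the standard formula $JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$ and compares it with the 5% critical value of a chi-square distribution with two degrees of freedom. The function name and the example samples are illustrative assumptions and do not reproduce the paper's data.

```python
import numpy as np

def jarque_bera(x):
    """Jarque-Bera statistic based on sample skewness and kurtosis."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    m2 = np.mean(d**2)               # biased sample variance
    skew = np.mean(d**3) / m2**1.5   # sample skewness
    kurt = np.mean(d**4) / m2**2     # sample kurtosis (3 for a normal)
    return n / 6.0 * (skew**2 + (kurt - 3.0)**2 / 4.0)

CHI2_2DF_5PCT = 5.991  # 5% critical value, chi-square with 2 degrees of freedom

rng = np.random.default_rng(1)
jb_normal = jarque_bera(rng.standard_normal(2000))    # Gaussian sample
jb_skewed = jarque_bera(rng.exponential(size=2000))   # strongly skewed sample
reject_skewed = jb_skewed > CHI2_2DF_5PCT
```

Normality is rejected at the 5% level when the statistic exceeds the critical value. The chi-square reference distribution is asymptotic, so the comparison is only approximate for short series.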

Figure 3 shows a normal probability plot of the NSTCM estimated innovations.

Figure 1 - NSTCM estimated variance (continuous line) and the real variance (dashed line), plotted against sample index.

Figure 2 - ARCH estimated variance (continuous line) and the real variance (dashed line), plotted against sample index.

The other data set was the classic Canadian lynx data set, which consists of the annual record of the numbers of Canadian lynx trapped in Northwest Canada for the period 1821-1934. The first time series model built for this data set was probably that of Moran, who applied a log transformation to the data. In this work we used the same transformed data set. Moran's model was a second-order autoregressive, AR(2), model:


$$X_t = 1.05 + 1.41 X_{t-1} - 0.77 X_{t-2} + \varepsilon_t.$$
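As a small worked example, the AR(2) model above gives a one-step forecast by setting the noise term to its mean of zero. The function name and the input values below are hypothetical.

```python
def ar2_forecast(x_prev1, x_prev2):
    """One-step forecast of the AR(2) model, with the noise term set to zero."""
    return 1.05 + 1.41 * x_prev1 - 0.77 * x_prev2

# Forecast from two hypothetical log-transformed observations.
forecast = ar2_forecast(3.0, 2.5)

# Level at which the forecast reproduces its inputs (the implied process mean).
implied_mean = 1.05 / (1.0 - 1.41 + 0.77)
```

The implied mean is the value `m` satisfying `m = 1.05 + 1.41*m - 0.77*m`, that is, `1.05 / 0.36`.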

Tong [11] developed a SETAR(2;7,2) model, as follows, with the standard errors in brackets.

where the variances of $\varepsilon_t^{(1)}$ and $\varepsilon_t^{(2)}$ are 0.0259 and 0.0505, respectively. A total of 12 parameters were used.

Cox (1977) suggested a polynomial AR as follows, with zero mean data.

X, =0.3452X,+,+l.0994X,.,+O.l204X~., t

0.1 162X,.,X,-, - 0.3838X:., + &,

where $\varepsilon_t$ is i.i.d. with zero mean and constant variance 0.0469.


The inputs tested for the network that generated the coefficients were $X_{t-1}, X_{t-2}, X_{t-3}, X_{t-4}, \ldots, X_{t-8}$ and their squares, jointly or separately. The best result (also the most parsimonious) was obtained using $X_{t-2}$ as the input.

In order to assess the accuracy of the fits, we used two descriptive measures: the root mean square error (RMS) and the mean absolute percentage error (MAPE). Table 1 summarizes the results obtained.

Table 1 - Results for the fit of the Canadian lynx time series.

Model    RMS       MAPE     Parameters
NSTCM    3.736     0.0542   12
Moran    5.7192    0.0692   3
Cox      57.3185   0.2344   6
Tong     3.9709    0.0566   12
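The two accuracy measures reported in the table can be computed as sketched below. Expressing MAPE as a fraction rather than a percentage is an assumption based on the magnitude of the reported values. The function names and the example numbers are hypothetical.

```python
import numpy as np

def rms_error(actual, predicted):
    """Root mean square error between observed and fitted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (actual must be nonzero)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)))

# Toy example: errors of -1 and +1 on observations 2 and 4.
rms_example = rms_error([2.0, 4.0], [1.0, 5.0])
mape_example = mape([2.0, 4.0], [1.0, 5.0])
```

RMS depends on the scale of the data, while MAPE is scale-free, which is why the two columns of the table are not directly comparable with each other.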

Figure 3 - Normal plot of NSTCM estimated innovations.

The fitted NSTCM model is described by:

X , = ~ ~ + ~ ~ X ~ ~ , + r * , X , ~ * + ~ , X , ~ ~ + r ~ , , ~ , . , + & ~

where $\lambda_{i,t}$ are neural network outputs and $\varepsilon_t$ is i.i.d. Gaussian with constant variance.

IV. CONCLUSION

In this paper, a new class of nonlinear additive varying coefficient models is proposed. This technique improved the estimation of heteroskedastic processes by exploiting the characteristics of neural networks.

Further research should be conducted to test the potential improvements associated with this approach. A strategy for choosing the input variables of the model could be used, and other probability distributions could be tried. Criteria for specifying the appropriate model structure could be tested.

In spite of the simplicity adopted, the experimental results confirm the effectiveness of the presented technique.

REFERENCES

[1] BAILLIE, R. T.; BOLLERSLEV, T. The Message in Daily Exchange Rates: A Conditional Variance Tale. Journal of Business and Economic Statistics, 7, 297-305, 1989.

[2] BISHOP, C. M. Neural Networks for Pattern Recognition. Oxford University Press, Oxford, 1995.

[3] BOLLERSLEV, T. A Conditional Heteroskedastic Time Series Model for Speculative Prices and Rates of Return. Review of Economics and Statistics, 69, 542-547, 1987.

[4] BOLLERSLEV, T. Generalized Autoregressive Conditional Heteroskedasticity. Journal of Econometrics, 31, 307-327, 1986.

[5] BOLLERSLEV, T.; CHOU, RAY Y.; KRONER, KENNETH F. ARCH Modelling in Finance: A Review of the Theory and Empirical Evidence. Journal of Econometrics, Vol. 52, pp. 5-59, November 1990.

[6] BOLLERSLEV, T.; ENGLE, R. F. Common Persistence in Conditional Variances: Definition and Representation. J. L. Kellogg Graduate School of Management, Northwestern University, 1990.

[7] ENGLE, R. F. Autoregressive conditional heteroskedasticity with estimates of the variance of UK inflation. Econometrica, 50, 987-1008, 1982.

[8] HAGGAN, V.; OZAKI, T. Modelling non-linear random vibrations using an amplitude-dependent autoregressive time series model. Biometrika, 68, 189-196, 1981.

[9] NARENDRA, K. S.; PARTHASARATHY, K. Gradient methods for the optimization of dynamical systems containing neural networks. IEEE Transactions on Neural Networks, Vol. 2, pp. 252-262, 1991.

[10] SEBER, G. A. F.; WILD, C. J. Nonlinear Regression. Wiley & Sons, New York, 1989.

[11] TONG, H. Non-Linear Time Series: A Dynamical System Approach. Oxford University Press, 1990.

[12] VELLOSO, M. L. F. Time Series Model with Neural Coefficients for Non-linear Processes in Mean and Variance. Doctoral thesis (in Portuguese), Catholic University, Rio de Janeiro, 1999.

[13] MELLEM, M. T. N. Hybrid autoregressive-neural models for time series. Master's thesis (in Portuguese), Catholic University, Rio de Janeiro, Brazil, 1997.
