

Journal of Asset Management (2013) 14, 52–71. doi:10.1057/jam.2013.2; published online 14 February 2013

A hybrid genetic algorithm–support vector machine approach in the task of forecasting and trading

Christian L Dunis 1, Spiros D Likothanassis 2, Andreas S Karathanasopoulos 3, Georgios S Sermpinis 4 and Konstantinos A Theofilatos 5

Correspondence: Konstantinos A. Theofilatos, Pattern Recognition Laboratory, Department of Computer Engineering & Informatics, University of Patras, Greece. E-mail: theofilk@ceid.upatras.gr

1 is Emeritus Professor of Banking and Finance at Liverpool John Moores University where he also directed the Centre for International Banking, Economics and Finance (CIBEF) from February 1999 through August 2011. He is a Visiting Professor of Quantitative Finance at the Universities of Venice (Italy) and Aix-en-Provence (France), and at the ECE School of Electronic Engineering in Paris. He is also a consultant to asset management firms, specializing in the application of nonlinear methods to financial management problems.

2 is currently Professor and Director of the Pattern Recognition Laboratory, Department of Computer Engineering and Informatics, University of Patras. His research interests include intelligent signal processing and adaptive control, neural networks, genetic algorithms and applications, intelligent agents and applications, bioinformatics, web-based applications, virtual e-learning environments, artificial intelligence / expert systems, data and knowledge mining and intelligent tutoring systems.

3 is a senior lecturer at London Metropolitan Business School. In 2008, he received the Master of Science in International Banking and Finance from the Department of Banking and Finance of Liverpool John Moores University. His research interests include financial forecasting, trading strategies, time series prediction, artificial and computational intelligence and neural networks.

4 joined the Glasgow Business School in September 2011. He holds degrees from the National Kapodistrian University of Athens and the Liverpool John Moores University. He previously worked at the University of Bedfordshire and Liverpool John Moores University. His research interests include risk management, financial forecasting, trading strategies and artificial intelligence models.

5 is a PhD candidate in the Department of Computer Engineering and Informatics of the University of Patras, Greece. In 2009, he received a Master’s degree from the Department of Computer Engineering and Informatics of the University of Patras. His research interests include computational and artificial intelligence, evolutionary computation, time series modeling and forecasting, bioinformatics, data mining and web technologies.

Received 21 December 2012; Revised 21 December 2012

Advance online publication 14 February 2013

Abstract

The motivation of this article is to introduce a novel hybrid Genetic Algorithm–Support Vector Machine method and to apply it to the task of forecasting and trading the daily and weekly returns of the FTSE 100 and ASE 20 indices. Its results are benchmarked against a Higher-Order Neural Network, a Naïve Bayesian Classifier, an autoregressive moving average model, a moving average convergence / divergence model, plus a naïve and a buy and hold strategy. More specifically, the trading performance of all models is investigated in forecast and trading simulations on the FTSE 100 and ASE 20 time series over the period January 2001–May 2010, using the last 18 months for out-of-sample testing. As it turns out, the proposed hybrid model does remarkably well and outperforms its benchmarks in terms of correct directional change and trading performance.

Keywords:

ASE 20; FTSE 100; trading simulation; genetic algorithms; support vector machines

INTRODUCTION

Forecasting financial time series is a difficult task because of their complexity and their nonlinear, dynamic and noisy behavior. Traditional methods such as autoregressive moving average (ARMA) and moving average convergence / divergence (MACD) models fail to capture the complexity and the nonlinearities that exist in financial time series. On the other hand, nonlinear approaches such as Artificial Neural Networks (ANNs) have given promising empirical evidence, but their numerous limitations often create skepticism about their use among practitioners (Dunis et al, 2009). Support Vector Machines (SVMs) (Vapnik, 2000) address some of the limitations of ANNs, as they can be trained more effectively and theoretically provide classification models with enhanced generalization abilities. However, their performance depends heavily on their parameters and on the choice of inputs, both of which should therefore be selected in a computational manner.

The purpose of this article is to introduce a hybrid Genetic Algorithms (GA) and SVM model, which can overcome some of the limitations of ANNs and simple SVMs and excel in financial forecasting. More specifically, in our hybrid methodology, a GA is used to optimize the SVM parameters and to find the optimal feature subset. Furthermore, the proposed hybrid methodology uses a problem-specific fitness function, which is believed to produce more profitable prediction models.

In our application, we developed a hybrid GA-SVM model and applied it to the task of forecasting and trading the daily and weekly returns of the FTSE 100 and ASE 20 indices. Its results are benchmarked against a Higher-Order Neural Network (HONN), a Naïve Bayesian Classifier, an ARMA model, an MACD model, plus a naïve and a buy and hold strategy. More specifically, the performance of all models is investigated in a forecast and trading simulation on the FTSE 100 and ASE 20 time series over the period January 2001–May 2010, using the last 18 months for out-of-sample testing.

As it turns out, the proposed hybrid model does remarkably well and outperforms its benchmarks in terms of trading performance. This superiority is also confirmed after transaction costs and leverage to exploit the high information ratios are applied.

The rest of the article is organized as follows. In the next section, we present some relevant recent applications, whereas the subsequent section provides a detailed description of the FTSE 100 and the ASE 20 time series. A detailed overview of the proposed methodology and its benchmarks is given in the section that follows it, whereas in the penultimate section we present our results. The final section provides some concluding remarks.

LITERATURE REVIEW

The main objective of this article is to introduce a novel hybrid GA and SVM model that can overcome some of the limitations of ANNs and simple SVMs and excel in financial forecasting applications.

Panda and Narasimhan (2007) use a single hidden layer feedforward Neural Network (NN) to produce statistically accurate forecasts of the INR / USD exchange rate, with several linear autoregressive models as benchmarks, whereas Andreou et al (2008) use NNs to forecast and trade European options with disappointing results. On the other hand, Kiani and Kastens (2008) forecast the GBP / USD, the CAD / USD and the JPY / USD exchange rates with feedforward and recurrent NNs, with several ARMA models as benchmarks. In their application, NNs outperform their ARMA benchmarks in statistical terms in forecasting the GBP / USD and USD / JPY but not in forecasting the USD / CAD exchange rate. Adeodato et al (2011) won the NN3 Forecasting Competition problem with an innovative approach based on the use of the median for combining MLP forecasts, and Matias and Reboredo (2012) successfully forecast intraday stock market returns with NNs and other nonlinear models. In a forecasting competition, Dunis et al (2009, 2011) and Sermpinis et al (2013) compare several Higher-Order NNs and autoregressive models in forecasting and trading the EUR exchange rates. Their results demonstrate the forecasting superiority of a class of NNs, the Psi Sigma, which are able to capture higher-order correlation within their data set.

Until now, many approaches have been based on SVMs for the modeling of financial time series. Cao and Tay (2003) apply SVMs to the problem of forecasting several futures contracts from the Chicago Mercantile Market and demonstrate the superiority of SVMs over Back Propagation and regularized Radial Basis Function (RBF) NNs. In the same year, Kim (2003) used SVMs to successfully predict the direction of change of the daily Korean composite stock index, whereas Huang et al (2005) used SVMs to predict correctly the directional movement of the NIKKEI 225 index. More recently, Ince and Trafalis (2008) successfully apply Support Vector Regression to the task of forecasting 10 NASDAQ financial indices.

The proposed model combines GAs with SVMs. Lately, some other research groups have tried to forecast financial and other time series using algorithmic combinations of GAs and SVMs. Nguyen et al (2009) propose a hybrid methodology that uses a GA to locate the optimal feature subset, which is then used by an SVM classifier. This methodology was applied to financial indices with great success, even though the GA was not used to optimize the SVM’s parameters and the classification models were not combined with advanced trading strategies. Wu et al (2009) developed a novel methodology that used a GA to find the optimal Kernel function and parameters of a Support Vector Regression model. This algorithm was applied to forecast the maximum daily electrical load and outperformed previous models. However, no feature selection procedure was applied in this methodology, and it could be improved if the GA searched for the optimal feature subset in parallel with the optimization of the Kernel and its parameters. Min et al (2006) introduced a GA to optimize both the feature subset and the SVM’s parameters. Our article extends this methodology by using a novel problem-specific fitness function, by estimating decimal regression values from the distance of each sample from the classification margin, and by combining the final prediction models with advanced trading strategies such as confirmation filters and leverage analysis.

THE FTSE 100 AND ASE 20 INDICES

The FTSE 100 index is a share index of the 100 companies listed on the London Stock Exchange with the highest market capitalization. The ASE 20 index consists of the 20 largest Athens Stock Exchange stocks and represents over 50 per cent of the exchange’s total capitalization. Both indices are traded through futures contracts that are cash settled upon maturity of the contract, with the value of the index fluctuating on a daily basis. The cash settlement of each index is simply determined by calculating the difference between the traded price and the closing price of the index on the expiration day of the contract. Both series were provided by Bloomberg’s financial information services.

The FTSE 100 and ASE 20 daily and weekly time series are non-normal (the Jarque-Bera statistic confirms this at the 99 per cent confidence level), containing slight skewness and high kurtosis. They are also nonstationary, so we decided to transform them into stationary series of daily and weekly rates of return. 1

Given the price levels P_1, P_2, …, P_t, the rate of return at time t is formed by:

R_t = \frac{P_t}{P_{t-1}} - 1 \qquad (1)

The summary statistics of the FTSE 100 and ASE 20 daily and weekly returns series reveal positive skewness and high kurtosis. The Jarque-Bera statistic confirms again that the return series are non-normal at the 99 per cent confidence level. These return series are the ones forecast by our models.
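As an illustration of the transformation and the normality check described above, the following minimal Python sketch computes simple rates of return and the skewness, kurtosis and Jarque-Bera statistic of a series. It assumes simple (arithmetic) returns and population moments; the function names and the price values are hypothetical and are not taken from the article's data.

```python
def simple_returns(prices):
    """Simple rate of return between consecutive price levels: P_t / P_{t-1} - 1."""
    return [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]


def jarque_bera(xs):
    """Return (skewness, kurtosis, JB statistic) computed from population moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # variance
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2  # equals 3 for a normal distribution
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return skew, kurt, jb


# Hypothetical price levels, for illustration only.
prices = [100.0, 102.0, 99.96, 101.9592, 101.9592]
returns = simple_returns(prices)
skew, kurt, jb = jarque_bera(returns)
```

Under normality the Jarque-Bera statistic is asymptotically chi-squared with two degrees of freedom, so a value above roughly 9.21 rejects normality at the 99 per cent level.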

For each time series under study, as inputs to our algorithms and our networks, we selected the first 14 autoregressive lags of the series. In order to train our artificial intelligence models, we further divided our data set as shown in Table 1 .
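The construction of the autoregressive inputs can be sketched as follows. This is only an illustrative Python fragment: the function name, the ordering of the lags within each row and the toy series are assumptions, and the actual sub-period boundaries are those of Table 1, which are not reproduced here.

```python
def make_lagged_dataset(returns, n_lags=14):
    """Pair each return with the n_lags returns that precede it.

    Each input row lists the lags most recent first:
    [r_{t-1}, r_{t-2}, ..., r_{t-n_lags}], and the target is r_t.
    """
    inputs, targets = [], []
    for t in range(n_lags, len(returns)):
        inputs.append([returns[t - k] for k in range(1, n_lags + 1)])
        targets.append(returns[t])
    return inputs, targets


# Toy series 0.0, 1.0, ..., 19.0 so that the lag alignment is easy to inspect.
toy_returns = [float(i) for i in range(20)]
X, y = make_lagged_dataset(toy_returns, n_lags=14)
```

Because every row only contains values dated before its target, a chronological split of the rows into training, test and out-of-sample sub-periods does not leak future information into the inputs.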

FORECASTING MODELS

Naïve strategy

The naïve strategy simply takes the most recent period change as the best prediction of the future change. The model is defined by:

\hat{Y}_{t+1} = Y_t \qquad (2)

where Y_t is the actual rate of return at period t; Ŷ_{t+1} is the forecast rate of return for the next period.

The performance of the strategy is evaluated in terms of trading performance via a simulated trading strategy.
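A minimal Python sketch of the naïve forecast and of a simple trading simulation is given below. The trading rule used here (long when the forecast is positive, short when it is negative, flat at zero, with no transaction costs) is an assumption made for illustration, and the return values are hypothetical.

```python
def naive_forecasts(returns):
    """The forecast for period t+1 is the realised return of period t."""
    return list(returns[:-1])


def simulate_trading(forecasts, realised):
    """Return per-period strategy returns and the share of correct directions.

    Assumed rule: +1 (long) for a positive forecast, -1 (short) for a
    negative forecast, 0 (flat) otherwise.
    """
    positions = [(f > 0) - (f < 0) for f in forecasts]
    pnl = [p * r for p, r in zip(positions, realised)]
    hits = sum(1 for f, r in zip(forecasts, realised) if f * r > 0)
    return pnl, hits / len(forecasts)


# Hypothetical return series, for illustration only.
returns = [0.01, 0.02, -0.01, -0.03, 0.02]
forecasts = naive_forecasts(returns)
realised = returns[1:]  # the returns that the forecasts refer to
pnl, hit_rate = simulate_trading(forecasts, realised)
```

The same `simulate_trading` helper can be reused to compare any set of forecasts on correct directional change and cumulative return.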

Moving average

The moving average model is defined as:

M_t = \frac{Y_t + Y_{t-1} + Y_{t-2} + \dots + Y_{t-n+1}}{n} \qquad (3)

where M_t is the moving average at time t; n is the number of terms in the moving average; Y_t is the actual rate of return at period t.

The MACD strategy used is quite simple. Two moving average series are created with different moving average lengths. The decision rule for taking positions in the market is straightforward. Positions are taken if the moving averages intersect. If the short-term moving average intersects the long-term moving average from below, a ‘long’ position is taken. Conversely, if the long-term moving average is intersected from above, a ‘short’ position is taken.

The forecaster must use judgment when determining the number of periods, n, on which to base the moving averages. The combination that performed best over the in-sample sub-period was retained for out-of-sample evaluation. The models selected were a combination of (1,3) for the FTSE 100 daily, (2,9) for the FTSE 100 weekly, (1,7) for the ASE 20 daily and (1,3) for the ASE 20 weekly series.
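The crossover rule can be sketched in Python as follows. This is an illustrative fragment under stated assumptions: the position is held until the next crossover, the strategy is flat before the first one, and the input series and function names are hypothetical.

```python
def moving_average(series, n, t):
    """Mean of the n most recent observations up to and including index t."""
    window = series[t - n + 1 : t + 1]
    return sum(window) / n


def crossover_positions(series, short_n, long_n):
    """Positions (+1 long, -1 short, 0 flat) from a moving average crossover.

    A long position is opened when the short-term average crosses the
    long-term average from below, a short position when it crosses from
    above. The position is kept until the next crossover (an assumption).
    """
    positions = []
    position = 0  # flat until the first crossover is observed
    prev_diff = None
    for t in range(long_n - 1, len(series)):
        diff = moving_average(series, short_n, t) - moving_average(series, long_n, t)
        if prev_diff is not None:
            if prev_diff <= 0 < diff:
                position = 1
            elif prev_diff >= 0 > diff:
                position = -1
        prev_diff = diff
        positions.append(position)
    return positions


# Hypothetical series, using the (1,3) combination as an example.
series = [1.0, 2.0, 3.0, 2.0, 1.0, 0.0, 1.0, 2.0, 3.0]
positions = crossover_positions(series, short_n=1, long_n=3)
```

Selecting the best combination in-sample then amounts to running `crossover_positions` over a grid of (short, long) pairs and keeping the pair with the best in-sample trading performance.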

ARMA model

ARMA models assume that the value of a time series depends on its previous values (the autoregressive component) and on previous residual values (the moving average component). 2

The ARMA model takes the form:

where Y_t is the dependent variable at time t; Y_{t−j}, j = 1…p, are the lagged dependent variables; φ_j, j = 1…p, are regression coefficients; ε_t is the residual term; ε_{t−m}, m = 1…q, are previous values of the residual; w_m, m = 1…q, are weights.
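For reference, a generic ARMA(p, q) specification that uses the symbols defined above can be written as follows. The constant term \phi_0 and the minus sign in front of the moving average terms follow a common convention and are assumptions here, since the displayed equation is not available in this text.

```latex
Y_t = \phi_0 + \sum_{j=1}^{p} \phi_j Y_{t-j} + \varepsilon_t - \sum_{m=1}^{q} w_m \varepsilon_{t-m}
```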

Using the correlogram and the information criteria in the training and the test sub-periods as a guide, we choose our ARMA models for the four series under study (for more information see Tables A1–A4). All of their coefficients are significant at the 99 per cent confidence level. The selected ARMA models for the daily FTSE 100, weekly FTSE 100, daily ASE 20 and weekly ASE 20 series are presented in equations (5), (6), (7) and (8), respectively:

The models selected are retained for out-of-sample estimation. The performance of the strategy is evaluated in terms of trading performance.

The HONN architecture

NNs exist in several forms in the literature. The most popular architecture is the Multi-Layer Perceptron (MLP). A standard NN has at least three layers. The first layer is called the input layer (the number of its nodes corresponds to the number of explanatory variables). The last layer is called the output layer (the number of its nodes corresponds to the number of response variables). An intermediary layer of nodes, the hidden layer, separates the input from the output layer. Its number of nodes defines the amount of complexity the model is capable of fitting. In addition, the input and hidden layers contain an extra node called the bias node. This node has a fixed value of one and has the same function as the intercept in traditional regression models. Normally, each node of one layer has connections to all the other nodes of the next layer.

The training of the network (the adjustment of its weights so that the network maps the input values of the training data to the corresponding output values) starts with randomly chosen weights and proceeds by applying a learning algorithm called backpropagation of errors 3 (Shapiro, 2000). The learning algorithm simply tries to find the weights that minimize an error function (normally the sum of all squared differences between target and actual values). As networks with sufficient hidden nodes are able to learn the training data (as well as their outliers and their noise) by heart, it is crucial to stop the training procedure at the right time to prevent overfitting (this is called ‘early stopping’). This can be achieved by dividing the data set into three subsets: the training and test sets, which simulate the data currently available to fit and tune the model, and the validation set, which simulates future values. The network parameters are then estimated by fitting the training data using the above-mentioned iterative procedure (backpropagation of errors). The iteration length is optimized by maximizing the forecasting accuracy for the test data set. Then the predictive value of the model is evaluated by applying it to the validation data set (out-of-sample data set).
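The early stopping procedure can be illustrated with the short Python sketch below. To keep it self-contained, a one-input linear model trained by batch gradient descent stands in for a network trained by backpropagation; the data, the learning rate and the patience parameter are all hypothetical. Following the terminology of the text, the test set decides when to stop and the validation set is kept for the final out-of-sample check.

```python
def mse(w, b, data):
    """Mean squared error of the linear model w * x + b on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train_with_early_stopping(train, test, lr=0.05, max_epochs=500, patience=10):
    """Fit on `train`, keep the weights with the lowest error on `test`.

    Returns (best_test_error, w, b, best_epoch). Training stops once the
    test error has not improved for `patience` consecutive epochs.
    """
    # The text starts from random weights; zeros keep this sketch deterministic.
    w, b = 0.0, 0.0
    best = (mse(w, b, test), w, b, 0)
    for epoch in range(1, max_epochs + 1):
        # One batch gradient step on the training error.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in train) / len(train)
        grad_b = sum(2 * (w * x + b - y) for x, y in train) / len(train)
        w -= lr * grad_w
        b -= lr * grad_b
        err = mse(w, b, test)
        if err < best[0]:
            best = (err, w, b, epoch)
        elif epoch - best[3] >= patience:
            break
    return best


# Hypothetical data: training, test (used for stopping) and validation sets.
train = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.2), (3.0, 2.8)]
test = [(0.5, 0.5), (1.5, 1.5), (2.5, 2.5)]
validation = [(4.0, 4.0)]

initial_test_error = mse(0.0, 0.0, test)
best_err, best_w, best_b, best_epoch = train_with_early_stopping(train, test)
out_of_sample_error = mse(best_w, best_b, validation)
```

Only the weights saved at `best_epoch` are carried forward, so the validation data never influence either the fitting or the stopping decision.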

HONNs were first introduced by Giles and Maxwell (1987) and were called ‘Tensor Networks’. Although the extent of their use in finance has so far been limited, Knowles et al (2011) show that, with shorter computational times and limited input variables, ‘the best HONN models show a profit increase over the MLP of around 8 per cent’ on the EUR / USD time series (p. 7). For Zhang et al (2002), a significant advantage of HONNs is that ‘HONN models are able to provide some rationale for the simulations they produce and thus can be regarded as “open box” rather than “black box”. HONNs are able to simulate higher frequency, higher-order nonlinear data, and consequently provide superior simulations compared with those produced by ANNs’ (p. 188). Furthermore, HONNs clearly outperform in terms of annualized return, which enables Dunis et al (2008) to conclude with confidence in favor of their forecasting superiority and their stability and robustness through time.

Although they have already experienced some success in the field of pattern recognition and associative recall, 4 HONNs have only started recently to be used in finance. The architecture of a three-input second-order HONN is shown in Figure 1 .

Figure 1: Left, MLP with three inputs and two hidden nodes; right, second-order HONN with three inputs.
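To make the idea of a second-order HONN more concrete, the following Python sketch computes the output of a three-input network whose output node receives the inputs together with all of their pairwise products. The sigmoid output activation, the ordering of the terms and the function names are assumptions made for illustration; they are not taken from the figure.

```python
import math
from itertools import combinations_with_replacement


def second_order_terms(x):
    """The inputs followed by all products x_i * x_j with i <= j."""
    pairs = combinations_with_replacement(range(len(x)), 2)
    products = [x[i] * x[j] for i, j in pairs]
    return list(x) + products


def honn_output(x, weights, bias):
    """Weighted sum of first- and second-order terms through a sigmoid."""
    terms = second_order_terms(x)
    z = bias + sum(w * t for w, t in zip(weights, terms))
    return 1.0 / (1.0 + math.exp(-z))


# Three inputs give 3 first-order and 6 second-order terms, hence 9 weights.
x = [1.0, 2.0, 3.0]
terms = second_order_terms(x)
neutral = honn_output(x, weights=[0.0] * 9, bias=0.0)
```

In this sketch the nonlinearity comes from the product terms rather than from a hidden layer, so only one layer of weights has to be trained.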
