Parameters Optimization Using Genetic Algorithms in Support Vector Regression for Sales Volume

Posted on: 16 March 2015

Department of Information Management, Yuan Ze University, Chung-Li, Chinese Taipei

Email: imyuan@saturn.yzu.edu.tw

Received July 10, 2012; revised August 10, 2012; accepted August 17, 2012

Keywords: Budgeting Planning; Sales Volume Forecasting; Artificial Intelligence; Support Vector Regression; Genetic Algorithms; Artificial Neural Network

ABSTRACT

Budget planning plays an important role in coordinating activities in organizations, and an accurate sales volume forecast is the key to the entire budgeting process. All other parts of the master budget depend on the sales volume forecast in some way. If the forecast is done carelessly, the rest of the budgeting process is largely a waste of time. Sales volume forecasting is therefore a critical process for most businesses and also a difficult area of management. Most researchers and companies use statistical methods, regression analysis, or sophisticated computer simulations to forecast sales volume. Recently, various Artificial Intelligence (AI) prediction techniques have been proposed for forecasting. Support Vector Regression (SVR) has been applied successfully in numerous fields and has proved to be a strong prediction model. However, selecting appropriate SVR parameters is difficult. To improve the accuracy of SVR, this research presents a hybrid intelligent support system based on evolutionary computation that addresses the parameter selection problem: Genetic Algorithms (GAs) are used to optimize the free parameters of SVR. The experimental results indicate that GA-SVR achieves better forecasting accuracy and performance than traditional SVR and artificial neural network (ANN) prediction models in sales volume forecasting.

1. Introduction

Sales forecasting is a self-assessment tool for a company. The managers have to keep taking the pulse of their company to know how healthy it is. A sales forecast reports, graphs and analyzes the pulse of the business. It can make the difference between just surviving and being highly successful in business. It is a vital cornerstone of a company’s budget. The future direction of the company may rest on the accuracy of sales forecasting [1].

For sales forecasting to be valuable to the business, it must not be treated as an isolated exercise. Rather, it must be integrated into all facets of the organization. Thus, all enterprises are working on the exploitation of prediction methods, which decide the success and failure of an enterprise [2,3].

Business forecasting has consistently been a critical organizational capability for both strategic and tactical business planning [4]. How to improve the quality of forecasts therefore remains an open question [5]. For data containing trend and/or seasonal patterns, failure to account for these patterns may result in poor forecasts. Over the last few decades, traditional time series forecasting methods such as exponential smoothing, moving average, Box-Jenkins ARIMA, and multivariate regression have been proposed and widely used in practice to account for these patterns, but they do not always work well when the market fluctuates frequently and at random [6,7]. Research on novel business forecasting techniques has therefore attracted researchers from various disciplines, such as computational artificial intelligence.

The artificial neural network (ANN) is a newer contender for forecasting data with sophisticated trend and seasonal patterns. Artificial intelligence models are more flexible and can estimate non-linear relationships without the limits of traditional time series models [8]. Therefore, more and more researchers use AI forecasting models to deal with forecasting problems. ANNs have strong parallel processing and fault-tolerance abilities. However, their practicability is limited by several weaknesses, such as over-fitting, slow convergence, and a tendency to become trapped in local extrema [9].

Support Vector Machines (SVM), a more recent learning algorithm developed from statistical learning theory [10,11], have a very strong mathematical foundation and have been shown to exhibit excellent performance in time series forecasting [7,12-14] and in classification [15,16]. SVM addresses the problems of over-fitting, local optima, and slow convergence found in ANN, and it generalizes well when the sample is small. When SVM is used for regression, it is called support vector regression (SVR). However, selecting appropriate SVR parameters is difficult. A highly effective model can be built once the parameters of SVR are carefully determined [17].

Because GA has strong global search capability [18], support vector regression optimized by a genetic algorithm (GA-SVR) is proposed to forecast the sales volume, with GA used to determine the training parameters of the support vector regression [19,20]. The GA proposed by Holland [21] is a derivative-free stochastic optimization method based on the concepts of natural selection and evolutionary processes. The GA encodes each point in the parameter space as a binary bit string called a chromosome. Major components of the algorithm include encoding schemes, fitness evaluation, parent selection, crossover, and mutation operators. GA-SVR has been used in many fields and has proved to be a very effective method [11,19,22], but it has not been applied to sales volume forecasting. Therefore, this research discusses the hybrid intelligent model GA-SVR for forecasting the monthly sales volume of the car industry and compares it with ANN and other traditional models.

The rest of the paper is organized as follows. Section 2 describes the theory of support vector regression. Section 3 presents the experiment design. The data of sales volume of a car manufacturer in Taiwan is used as a case study to test the reliability and accuracy of the proposed model. Section 4 contains experimental results and analysis. Finally, Section 5 concludes the paper.

2. Theory of Support Vector Regression

The basic concept of SVR is to map the original data x_i nonlinearly into a high-dimensional feature space. Given a data set $\{(x_i, y_i)\}_{i=1}^{n}$, where x_i is the input vector and y_i is the associated output value of x_i, the SVR regression function is:

$$f(x) = w^{T}\varphi(x) + b \tag{1}$$

where φ(·) denotes the non-linear mapping function, w is the weight vector and b is the bias term. The goal of SVR is to find a function that deviates from the targets y_i by at most ε for all the training data and, at the same time, is as flat as possible. In SVR, the ε-insensitive loss function is introduced to ensure the sparsity of the support vectors. It is defined as:

$$L_{\varepsilon}\bigl(y, f(x)\bigr) = \begin{cases} 0, & |y - f(x)| \le \varepsilon \\ |y - f(x)| - \varepsilon, & \text{otherwise} \end{cases} \tag{2}$$

where the loss equals zero if the forecasting error is at most ε; otherwise the loss equals the amount by which the error exceeds ε.
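The ε-insensitive loss of Equation (2) can be sketched in a few lines of Python. This is only an illustration: the function name and the sample numbers are assumptions and do not come from the paper's data set.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Return the epsilon-insensitive loss for a single observation."""
    error = abs(y_true - y_pred)
    # Errors inside the epsilon-tube cost nothing; outside it, only the
    # part of the error that exceeds epsilon is penalized.
    return max(0.0, error - epsilon)


# Illustrative values only (not taken from the paper).
inside_tube = epsilon_insensitive_loss(100.0, 103.0, epsilon=5.0)
outside_tube = epsilon_insensitive_loss(100.0, 110.0, epsilon=5.0)
print(inside_tube, outside_tube)
```

Points that fall inside the tube contribute no loss, which is why only a subset of the training points ends up as support vectors.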

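As with the classification problem, non-negative slack variables ξ_i and ξ_i* can be introduced to represent the distance from the actual values to the corresponding boundary values of the ε-tube. The constrained form can then be formulated as follows:

$$\min_{w,\,b,\,\xi,\,\xi^{*}} \; \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{n}\left(\xi_i + \xi_i^{*}\right) \tag{3}$$

Subject to

$$\begin{aligned} y_i - w^{T}\varphi(x_i) - b &\le \varepsilon + \xi_i, \\ w^{T}\varphi(x_i) + b - y_i &\le \varepsilon + \xi_i^{*}, \\ \xi_i,\ \xi_i^{*} &\ge 0, \quad i = 1, \dots, n, \end{aligned}$$

where C denotes the cost parameter that weights the empirical risk against the flatness of the function.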

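Finally, the constrained optimization problem is solved using the following Lagrange form:

Max

$$-\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}(\alpha_i - \alpha_i^{*})(\alpha_j - \alpha_j^{*})\,K(x_i, x_j) - \varepsilon\sum_{i=1}^{n}(\alpha_i + \alpha_i^{*}) + \sum_{i=1}^{n} y_i(\alpha_i - \alpha_i^{*}) \tag{4}$$

Subject to

$$\sum_{i=1}^{n}(\alpha_i - \alpha_i^{*}) = 0, \qquad 0 \le \alpha_i,\ \alpha_i^{*} \le C, \quad i = 1, \dots, n,$$

where α_i and α_i* are Lagrange multipliers and $K(x_i, x_j) = \varphi(x_i)^{T}\varphi(x_j)$ is a so-called kernel function. By using a kernel function, it is possible to compute the SVR without explicitly mapping into the feature space. The kernel function must satisfy Mercer's condition, which allows the kernel substitution to represent dot products in some Hilbert space. SVM constructed with the Gaussian radial basis function (RBF), commonly written as $K(x_i, x_j) = \exp\bigl(-\|x_i - x_j\|^{2} / (2\sigma^{2})\bigr)$, has excellent nonlinear forecasting performance. Thus, in this work, RBF is used in the SVR.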
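As a minimal sketch of the kernel idea, the Gaussian RBF kernel can be computed directly from two input vectors without ever constructing the feature-space mapping. The parameterization exp(-||x - z||^2 / (2σ^2)) used here is the common textbook form and is an assumption, since the paper's own kernel formula is not reproduced in this text; the function name is also illustrative.

```python
import math


def rbf_kernel(x, z, sigma):
    """Gaussian RBF kernel K(x, z) = exp(-||x - z||^2 / (2 * sigma^2)).

    The 2 * sigma^2 scaling is an assumed (textbook) parameterization.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


# Identical points give the maximum similarity; distant points tend to 0.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0], sigma=1.0))
print(rbf_kernel([0.0, 0.0], [3.0, 4.0], sigma=1.0))
```

A small σ makes the kernel very local (each training point influences only its close neighbours), while a large σ smooths the regression function, which is one reason σ has to be tuned together with C and ε.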

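Equation (1) can now be rewritten as follows:

$$w = \sum_{i=1}^{n}(\alpha_i - \alpha_i^{*})\,\varphi(x_i),$$

and therefore

$$f(x) = \sum_{i=1}^{n}(\alpha_i - \alpha_i^{*})\,K(x_i, x) + b \tag{5}$$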
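To make Equation (5) concrete, the following sketch evaluates the regression function from a set of support vectors and their dual coefficients. It assumes the Gaussian RBF kernel in its common textbook form; the function names, the coefficient values, and the bias are hypothetical and are not results from the paper.

```python
import math


def rbf_kernel(x, z, sigma):
    """Gaussian RBF kernel (assumed textbook parameterization)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def svr_predict(x, support_vectors, dual_coefs, bias, sigma):
    """Evaluate f(x) = sum_i (alpha_i - alpha_i*) * K(x_i, x) + b.

    dual_coefs[i] holds the difference alpha_i - alpha_i* that belongs
    to support_vectors[i].
    """
    return bias + sum(
        coef * rbf_kernel(sv, x, sigma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )


# Hypothetical two-support-vector model, for illustration only.
prediction = svr_predict(
    x=[0.0],
    support_vectors=[[0.0], [1.0]],
    dual_coefs=[1.0, -1.0],
    bias=2.0,
    sigma=1.0,
)
print(prediction)
```

Only training points with a non-zero coefficient difference contribute to the sum, so the cost of a prediction grows with the number of support vectors rather than with the size of the training set.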

The training parameters C, σ and ε greatly affect the forecasting performance of SVR, but selecting appropriate values for them is very difficult. In this research, GA is used to optimize the training parameters. GA has strong global search capability and can reach a good solution in a short time [20], so it is used to search for better combinations of the SVR parameters. After a series of iterative computations, GA returns the best parameter combination it has found. The method and process of optimizing the SVR parameters with the genetic algorithm are described as follows:

2.1. Initial Value of SVR Parameters

In our proposed GA-SVR model, the training parameters C, σ and ε of SVR are dynamically optimized through an evolutionary process that starts from a randomly generated initial population of chromosomes; the SVR model then performs the prediction task using the resulting parameter values. Our approach simultaneously determines the appropriate type of kernel function and optimal kernel parameter values for optimizing the SVR model. The process of optimizing the SVR parameters with the genetic algorithm is shown in Figure 1.

Figure 1. The process of SVR parameters optimized by genetic algorithm.
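One way to picture the initial population is as a set of random bit strings, each of which decodes to one candidate triple (C, σ, ε). The sketch below shows such a binary encoding. The search ranges, the 10-bit resolution, and all names are assumptions made for illustration; the paper does not state the values it used.

```python
import random

# Hypothetical search ranges and resolution; the paper does not state them.
PARAM_RANGES = {"C": (1.0, 1000.0), "sigma": (0.01, 10.0), "epsilon": (0.001, 1.0)}
BITS_PER_PARAM = 10


def random_chromosome(rng):
    """Create one random bit string covering all three parameters."""
    return [rng.randint(0, 1) for _ in range(BITS_PER_PARAM * len(PARAM_RANGES))]


def decode(chromosome):
    """Map each BITS_PER_PARAM-bit slice linearly onto its parameter range."""
    params = {}
    max_int = 2 ** BITS_PER_PARAM - 1
    for index, (name, (low, high)) in enumerate(PARAM_RANGES.items()):
        bits = chromosome[index * BITS_PER_PARAM:(index + 1) * BITS_PER_PARAM]
        value = int("".join(str(bit) for bit in bits), 2)
        params[name] = low + (high - low) * value / max_int
    return params


rng = random.Random(42)  # fixed seed so the sketch is reproducible
population = [random_chromosome(rng) for _ in range(20)]
print(decode(population[0]))
```

Each decoded dictionary would then be handed to an SVR training run, and the resulting forecasting error would determine the chromosome's fitness.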

2.2. Genetic Operations

Generally, a genetic algorithm uses selection, crossover and mutation operations to generate the offspring of the existing population, as described below:

“Selection” operator: Selection is performed to choose excellent chromosomes for reproduction. Based on the fitness function, chromosomes with higher fitness values are more likely to yield offspring in the next generation; the roulette wheel or tournament method decides whether or not a chromosome survives into the next generation. The surviving chromosomes are then placed in a mating pool for the crossover and mutation operations. Once a pair of chromosomes has been selected for crossover, one or more randomly selected positions are assigned to the to-be-crossed chromosomes. The newly crossed chromosomes then combine with the rest of the chromosomes to generate a new population. Suppose there are m individuals: we select [m/2] individuals and discard the others. The selected individuals are the fitter ones, that is, those whose profits are greater.

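“Crossover” operator: Crossover is performed randomly to exchange genes between two chromosomes. Suppose S₁ = ⟨s_11, s_12, …, s_1n⟩ and S₂ = ⟨s_21, s_22, …, s_2n⟩ are two chromosomes. Select a random integer 0 ≤ r ≤ n; S₃ and S₄ are the offspring of crossover(S₁, S₂), where S₃ = ⟨s_i | if i ≤ r, then s_i = s_1i, else s_i = s_2i⟩ and S₄ = ⟨s_i | if i ≤ r, then s_i = s_2i, else s_i = s_1i⟩.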

“Mutation” operator: The mutation operation follows the crossover and determines whether or not a chromosome should mutate in the next generation. Suppose a chromosome S₁ = ⟨s_11, s_12, …, s_1n⟩. Select a random integer 0 ≤ r ≤ n; S₃ is a mutation of S₁, where S₃ = ⟨s_i | if i ≠ r, then s_i = s_1i, else s_i = random(s_1i)⟩.

Through these three operations, the offspring replace the old population and form a new population in the next generation. The evolutionary process proceeds until the stop conditions are satisfied.
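The three operators can be sketched for binary chromosomes as follows. This is an illustrative reading of the descriptions above, not the authors' implementation: roulette-wheel selection is shown (tournament selection is the alternative the text mentions), crossover is single-point, and mutation is implemented as a bit flip at position r, which is one possible interpretation of random(s_1i). Fitness values are assumed to be non-negative with larger meaning better, so an error measure such as MAPE would first have to be converted, for example to 1/MAPE.

```python
import random


def roulette_select(population, fitnesses, rng):
    """Pick one chromosome with probability proportional to its fitness."""
    total = sum(fitnesses)
    threshold = rng.uniform(0.0, total)
    cumulative = 0.0
    for chromosome, fitness in zip(population, fitnesses):
        cumulative += fitness
        if cumulative >= threshold:
            return chromosome
    return population[-1]  # guard against floating-point round-off


def crossover(s1, s2, r):
    """Single-point crossover: keep the first r genes, swap the rest."""
    return s1[:r] + s2[r:], s2[:r] + s1[r:]


def mutate(s1, r):
    """Flip the bit at 1-based position r; r = 0 leaves the chromosome unchanged."""
    s3 = list(s1)
    if r > 0:
        s3[r - 1] = 1 - s3[r - 1]
    return s3


rng = random.Random(0)
parent_a, parent_b = [1, 1, 1, 1], [0, 0, 0, 0]
child_a, child_b = crossover(parent_a, parent_b, r=rng.randint(0, 4))
print(child_a, child_b, mutate(child_a, r=rng.randint(0, 4)))
```

Repeating selection, crossover and mutation until a new population of the original size has been built gives one generation of the evolutionary process.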

2.3. Calculation of the Fitness Value

A fitness function assessing the performance for each chromosome must be designed before searching for the optimal values of the SVR parameters. Several measurement indicators have been proposed and employed to evaluate the prediction accuracy of models such as MAPE and RMSE in time-series prediction problems. To compare the results achieved by the present model, this research adopts Mean Absolute Percentage Error (MAPE) to evaluate the performance.

$$\mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{A_t - F_t}{A_t}\right| \times 100\% \tag{6}$$

where A_t is the actual value for period t, F_t is the forecast value for period t, and n is the number of training samples. The smaller the MAPE, the better the forecasting model: smaller values mean that the calculated results are closer to the historical actual data.
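A minimal sketch of the MAPE computation in Equation (6) is given below. The function name and the sample figures are illustrative assumptions and are not values from the paper's tables.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Illustrative monthly sales figures (not from the paper).
actual_sales = [100.0, 200.0, 400.0]
forecast_sales = [110.0, 180.0, 400.0]
print(mape(actual_sales, forecast_sales))
```

Because each error is divided by the actual value, MAPE is undefined when an actual value is zero; the sketch would raise a ZeroDivisionError in that case.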

3. Experiment Design

In this research, the monthly sales volume of trucks and small cars of a car manufacturer in Taiwan, together with other input variables such as stock index, jobless rate, GDP per person, CPI, CCI, US dollars, Yen, Euro, and average gasoline price, was collected for the period from 2003 to 2009.

For comparison purposes, several commonly used forecasting models, namely the Least-Mean Square algorithm (LMS), Artificial Neural Networks (ANNs), and Support Vector Regression (SVR), are also applied.

In the experiments, the models are trained on training data and then applied to testing data. The models are first trained with input data from the year 2003 and output data (monthly sales volume) from 2004. The data from the year 2004 are then entered as testing data in order to forecast the monthly sales volume for 2005. For later years, the data from all the previous years are used in the training phase. In a subsequent cross-section analysis, the mean absolute percentage error (MAPE) is used to evaluate the forecasting accuracy.
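The training scheme described above can be read as an expanding-window design, sketched below. This is one interpretation of the text, stated as an assumption: each pair is (year of the input data, year of the sales volume to be forecast), and the function name is hypothetical.

```python
def expanding_window_splits(first_year, last_year):
    """Return a list of (training_pairs, test_pair) for each forecast year.

    Each pair is (input_year, target_year). Training uses every pair that
    ends before the forecast year; the test pair forecasts that year.
    """
    splits = []
    for target_year in range(first_year + 2, last_year + 1):
        training_pairs = [(year, year + 1) for year in range(first_year, target_year - 1)]
        test_pair = (target_year - 1, target_year)
        splits.append((training_pairs, test_pair))
    return splits


for training_pairs, test_pair in expanding_window_splits(2003, 2009):
    print("train:", training_pairs, "test:", test_pair)
```

Under this reading, the first split trains on 2003 inputs with 2004 outputs and tests on 2004 inputs to forecast 2005, and the forecast years run from 2005 to 2009.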

4. Experimental Results and Analysis

After applying the computational procedure to the above-mentioned data, the monthly prediction results of the forecasting models for trucks, based on previous monthly sales volume and other factors during 2003-2009, are summarized in Table 1 and Figures 2-6. Table 2 and Figure 7 report the MAPE results. All experimental results indicate that GA-SVR performs better than the other models in forecasting monthly sales volume.

To examine the adaptability of GA-SVR, we also use the four models above to predict the small car data set. As shown in Table 3 and Figure 8, the experimental results indicate that the GA-SVR model achieves better forecasting accuracy and performance than the other models.

Table 1. Comparison of the prediction results for trucks from each model for 12 months from 2005-2009.

Figure 8. Comparison of the MAPE results for small cars.