Short-Term Wind Power Forecasting Using Support Vector Regression and Harmony Search Algorithm



Introduction
Wind power generation is a clean, renewable energy source that produces electricity with low pollution and diversifies the electricity supply. Wind power output is strongly related to wind speed. Because wind speed is irregular, wind power generation is intermittent, which poses a great challenge to energy dispatchers.
To allow efficient planning for distributed generation (DG), a grid operator needs forecasts of wind power generation. Accurate short-term wind power forecasts improve the wind power penetration level, increase system reliability, reduce operating costs and allow for efficient load management strategies. However, because it depends on wind speed, wind power generation is highly uncertain and difficult to predict accurately. To make effective use of wind power generation, accurate forecasting of wind power output is essential for a DG system.
Since wind power output is greatly affected by wind speed, many studies focus on predicting wind speed from historical meteorological information. The forecasted wind speeds are then converted to wind power outputs using a power curve model. These techniques include the adaptive wavelet neural network (AWNN) method [1], the wind vector prediction method [2], a smoothing technique [3] and an adaptive local learning technique [4]. Other studies use historical data and associated meteorological information to forecast wind power output directly. These studies use radial basis function neural network (RBFNN) and fuzzy methods [5], the ridgelet neural network method [6], and wavelet transformation and support vector regression (SVR) techniques [7].
Most of the studies mentioned above improve the forecasting accuracy for wind power output. However, the artificial neural network (ANN) based methods [1][5-6], although they approximate a wide range of nonlinear functions well, require a complicated structure to capture the nonlinear relationship between the input and output datasets. The fuzzy method [5] uses rule-based knowledge to design an efficient inference engine, but it is difficult to draw conclusions from a large number of linguistic terms. Through a multi-resolution technique, the wavelet-based method [7] decomposes the original time series into stationary series in different frequency bands. The structure of this method is complicated and it requires a long execution time. The SVR method [7] has been proven effective for regression prediction. However, the parameters of the SVR are often tuned according to the operators' experience.
To achieve more accurate forecasting, this paper proposes a hybrid method that combines SVR and a harmony search algorithm (HSA) to predict the wind power output 3 hours ahead in steps of 15 minutes. First, a k-means clustering algorithm is used to classify the historical wind power generation data into various classes. Five HSA-based SVR models are then trained on the collected data, with the HSA providing an efficient scheme to optimize the SVR parameters. The forecasting model is selected according to the wind speed forecast issued every three hours by the Taiwan Central Weather Bureau (TCWB) and is then used to forecast the wind power generation.
The remainder of this paper is organized as follows. Section II briefly reviews the characteristics of wind power generation. Section III describes the proposed approach to produce wind power forecasts for the next 3 hours in steps of 15 minutes. Section IV presents the forecasting results for the ANN and proposed methods. Finally, conclusions are given in Section V.

Characteristics of wind power generation
The output of wind power generation is affected by meteorological conditions, such as the wind speed and air density, as follows [8]:

$$P_{wind} = \frac{1}{2} \rho A v^{3} C_p \tag{1}$$

where $P_{wind}$ is the power output of a wind turbine, $A$ is the rotor area of the wind turbine (m^2), $\rho$ is the density of air (kg/m^3), $v$ is the wind speed (m/s) and $C_p$ is the utilization coefficient of wind energy. The conversion efficiency of a wind turbine is about 20-40%.
As shown in (1), the power output of a wind turbine is proportional to the cube of the wind speed. There are three types of methods to control turbine generation: pitch control, tip control and stall control. Fig. 1 shows the wind power curve for a 2 kW wind turbine generator. Control devices limit the wind power generation according to the wind speed. The speed at which the turbine starts generating is called the cut-in speed, while a cut-out speed is set to protect the turbine from damage. The speed between 12 m/s and the cut-out speed is called the rated speed, which indicates the output capability of a wind turbine. Table 1 shows the wind speed and wind grade classification provided by the TCWB. Considering the rated output of the wind turbine, wind grades 2 to 6 are used to develop the training models.
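The cubic relation in (1) and the cut-in, rated and cut-out behaviour described above can be sketched as follows. This is only an illustration: the air density, the value of C_p and the cut-in and cut-out speeds are assumed values, not parameters reported for the turbine in this paper, and the rotor area is chosen here so that the 2 kW rating is reached at 12 m/s.

```python
import math

AIR_DENSITY = 1.225   # kg/m^3, assumed sea-level value
CP = 0.3              # assumed utilization coefficient (within the 20-40% range)
RATED_POWER = 2000.0  # W, the 2 kW turbine considered in the paper
RATED_SPEED = 12.0    # m/s, from the text
CUT_IN = 3.0          # m/s, hypothetical
CUT_OUT = 25.0        # m/s, hypothetical

# Rotor area chosen so that equation (1) gives RATED_POWER at RATED_SPEED.
ROTOR_AREA = RATED_POWER / (0.5 * AIR_DENSITY * RATED_SPEED ** 3 * CP)


def wind_power(v, rotor_area=ROTOR_AREA, cp=CP, rho=AIR_DENSITY):
    """Equation (1): P = 0.5 * rho * A * v^3 * Cp, in watts."""
    return 0.5 * rho * rotor_area * v ** 3 * cp


def turbine_output(v):
    """Idealized power curve: zero outside [cut-in, cut-out), capped at the rating."""
    if v < CUT_IN or v >= CUT_OUT:
        return 0.0
    return min(wind_power(v), RATED_POWER)
```

Doubling the wind speed multiplies the unconstrained power by eight, which is why small wind speed forecast errors translate into large power forecast errors below the rated speed.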
Since the power output of a wind power system is intrinsically unstable and intermittent, providing an accurate forecast of wind power output is important for a DG system. This paper presents a hybrid method to forecast wind power output more accurately. Details of the proposed scheme are given in the next section.

Data classification
The collected historical wind power generation patterns are classified using a k-means clustering algorithm. It is a method of vector quantization used for cluster analysis in data mining. Given a set of observations ($X_1$, $X_2$, …, $X_n$), k-means clustering partitions the n observations into k clusters, as follows:

$$\min \sum_{i=1}^{k} \sum_{j=1}^{n} w_{ji} \lVert X_j - R_i \rVert^{2} \tag{2}$$

where $X_j$ is the jth observation, $R_i$ is the ith clustering center, $w_{ji}$ is a synaptic weight and $\lVert \cdot \rVert$ is the Euclidean distance. $R_i$ and $w_{ji}$ are respectively expressed as follows:

$$R_i = \frac{\sum_{j=1}^{n} w_{ji} X_j}{\sum_{j=1}^{n} w_{ji}} \tag{3}$$

$$w_{ji} = \begin{cases} 1, & \text{if } X_j \text{ belongs to the ith cluster} \\ 0, & \text{otherwise} \end{cases} \tag{4}$$

As shown in (2), k-means clustering partitions n observations into k clusters (k ≤ n) so as to minimize the within-cluster sum of squares. The algorithm assigns each observation to the cluster whose center has the least squared Euclidean distance to it, in order to reach a near-global optimum.
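The alternating assignment and center-update steps can be sketched in plain Python as below. This is a minimal illustration of the standard Lloyd iteration, not the authors' implementation; the function names, the random initialization and the sample data are assumptions.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Return (centers, labels) after alternating assignment and update steps."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct observations as initial centers
    labels = None
    for _ in range(iterations):
        # Assignment step: w_ji = 1 for the nearest center, 0 otherwise.
        new_labels = [min(range(k), key=lambda i: sq_dist(p, centers[i]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centers[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels


# Hypothetical one-dimensional daily mean wind speeds (m/s): calm and windy days.
days = [(1.0,), (1.2,), (0.8,), (9.0,), (9.5,), (8.5,)]
centers, labels = kmeans(days, k=2)
```

In the paper's setting, each observation would be a daily wind power pattern rather than a single number, and k would correspond to the number of wind speed types.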

The creation of training models
As described in the data classification section, the k-means clustering algorithm provides information on similar days and their associated wind speed types. A similar day is defined as a day having the same wind speed type. In this stage, an SVR evolved using an HSA is trained on the collected input-output datasets.
Since the TCWB provides verbal wind speed forecasts for the next 3 hours, in terms such as slight wind, breeze wind, moderate wind, cool breeze and strong wind, the training stage consists of five different models.
Figure 2 shows the structure of the training stage. The inputs for each training model are the actual wind power output for the corresponding 3 hours, in steps of 15 minutes, on the latest similar day, a prediction of the wind speed for the next 3 hours, a prediction of the wind direction for the next 3 hours and the time zone for the next 3 hours. The outputs are the wind power forecasts for the next 3 hours in steps of 15 minutes. The five training models have the same training structure. In this paper, an HSA-based SVR is trained on the input/output datasets. The SVR and the HSA-based SVR models are described below.
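The input layout and the choice among the five models can be pictured with the sketch below. The numeric encoding of the wind speed, wind direction and time zone inputs is not given in the text, so treating each as a single number is an assumption, as are all function and variable names.

```python
# The five verbal wind speed terms named in the text, one model per term.
WIND_TYPES = ("slight wind", "breeze wind", "moderate wind",
              "cool breeze", "strong wind")
STEPS = 12  # 3 hours in steps of 15 minutes


def build_input(similar_day_power, wind_speed, wind_direction, time_zone):
    """Concatenate the 12 similar-day power values with the three forecast inputs."""
    if len(similar_day_power) != STEPS:
        raise ValueError("expected %d power values, got %d"
                         % (STEPS, len(similar_day_power)))
    return list(similar_day_power) + [wind_speed, wind_direction, time_zone]


def select_model(models, verbal_forecast):
    """Pick the trained model that matches the verbal wind speed forecast."""
    if verbal_forecast not in WIND_TYPES:
        raise KeyError("unknown wind speed type: %r" % verbal_forecast)
    return models[verbal_forecast]


# Placeholder models: each would be a trained HSA-based SVR in practice.
models = {name: "model for " + name for name in WIND_TYPES}
features = build_input([150.0] * STEPS, wind_speed=6.5,
                       wind_direction=90.0, time_zone=3)
chosen = select_model(models, "moderate wind")
```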
(1) SVR
SVR was first introduced by Vapnik in 1995 [10]. It is a technique for data classification and regression analysis. SVR can be roughly divided into linearly separable SVR, linearly inseparable SVR and nonlinear SVR. In this paper, a nonlinear SVR is used to find the best hyperplane in an n-dimensional space to capture the nonlinear mapping between input and output data. Fig. 3 shows the hyperplane of an SVR. The SVR used to map the data into a high-dimensional space is described by the optimization problem (5) and its Lagrangian (6) [11], where $u$ is a unit normal vector to the hyperplane, $h$ is the distance from the origin to the hyperplane and $n$ is the number of training data. In (6), $\alpha_k$ and $\beta_k$ are the Lagrange multipliers.
Fig. 3 The hyperplane of an SVR.
Minimizing (6) by setting its partial derivatives with respect to $u$, $h$ and the slack variable $\xi_k$ to zero gives conditions (7)-(9). Substituting (7)-(9) into (6) forms the dual optimization problem (10). The inner product of the nonlinear mappings, $H(x_k)^T H(x_l)$, in (10) is defined as the kernel function, $K(x_k, x_l)$, and must satisfy the following condition:

$$\iint K(x, y)\, g(x)\, g(y)\, dx\, dy \ge 0 \tag{11}$$

where $g(x)$ is an integrable function. The kernel function used in this paper is the radial basis function (12), where $\sigma$ is the dilation parameter.
The weight of the penalty function, $\gamma$, in (10) and the dilation parameter, $\sigma$, in (12) determine the mapping properties of the SVR. In this paper, the HSA is used to tune these two parameters automatically.
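To make the role of the dilation parameter concrete, the sketch below evaluates a radial basis function kernel and checks a discrete analogue of condition (11) on a few sample points. The Gaussian form exp(-||x - y||^2 / (2 sigma^2)) is an assumption: it is the most common convention, but the exact scaling of the dilation parameter used in this paper is not recoverable from the text. All names are illustrative.

```python
import math


def rbf_kernel(x, y, sigma):
    """Gaussian kernel; the 2 * sigma^2 scaling is an assumed convention."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))


def gram_matrix(samples, sigma):
    """Kernel values K(x_k, x_l) for every pair of samples."""
    return [[rbf_kernel(a, b, sigma) for b in samples] for a in samples]


def quadratic_form(K, g):
    """Discrete analogue of (11): sum over k, l of g_k * K_kl * g_l."""
    n = len(g)
    return sum(g[i] * K[i][j] * g[j] for i in range(n) for j in range(n))


samples = [(0.0,), (1.0,), (3.0,)]
K = gram_matrix(samples, sigma=1.0)
q = quadratic_form(K, [1.0, -1.0, 0.5])
```

A small dilation parameter makes the kernel decay quickly, so each training point influences only its close neighbours; a large one produces a smoother mapping. This sensitivity is the reason the parameter is tuned rather than fixed by hand.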
(2) HSA for tuning the parameters of SVR
An HSA [12] operates similarly to the process of musical improvisation, in which musicians tune the pitches of their instruments to attain better harmony. It is a population-based random search algorithm and has been applied to various fields. The HSA used to tune the parameters of the SVR is described as follows.

1) Initialization
In this step, an initial harmony matrix (HM) is randomly generated as follows:

$$HM = \begin{bmatrix} z^1_1 & z^1_2 & \cdots & z^1_S \\ z^2_1 & z^2_2 & \cdots & z^2_S \end{bmatrix} \tag{13}$$

where $z^1_j$ is the jth weight of the penalty function, $z^2_j$ is the dilation parameter of the jth kernel function, j = 1, 2, …, S, and S is the size of the harmony matrix.
As shown in (13), each column vector $[z^1_j, z^2_j]^T$ represents a feasible solution, which is randomly generated as follows:

$$z^h_j = z^h_{min} + rand \times (z^h_{max} - z^h_{min}) \tag{14}$$

where h = 1, 2 and j = 1, 2, …, S, $z^h_{min}$ and $z^h_{max}$ are the lower and upper bounds of $z^h_j$, and rand is a uniformly distributed value between 0 and 1.
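The initialization in (13) and (14) can be sketched as follows. The search bounds for the two parameters are hypothetical, since the text does not report them, and the matrix is stored here as a list of S solution vectors (the transpose of the layout in (13)) purely for convenience.

```python
import random

# Hypothetical bounds: (penalty weight range, dilation parameter range).
BOUNDS = [(0.1, 100.0), (0.01, 10.0)]


def init_harmony_matrix(size, bounds, rng):
    """Equation (14): z = z_min + rand * (z_max - z_min) for each decision variable."""
    return [[lo + rng.random() * (hi - lo) for lo, hi in bounds]
            for _ in range(size)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
harmony_matrix = init_harmony_matrix(size=10, bounds=BOUNDS, rng=rng)
```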
2) Determination of the best hyperplane
For each solution vector obtained from (14), the hyperplane is evaluated using (10). The solution with the maximum value in the HM is regarded as the optimal solution for the SVR.

3) Creation of a new HM
A new harmony matrix, $Z'$, is generated using the memory consideration, pitch adjustment and random selection processes. In the memory consideration, the decision variables in $Z'$ are updated by the difference ($z_1 - z_S$). Then the harmony memory considering rate (HMCR) is used to improve the diversity of the solution vector according to (15): the HSA has an 85% probability of choosing decision values from historically stored values and a 15% probability of choosing decision values from the entire range.
For a new harmony matrix obtained using (15), the pitch adjustment (16) is applied to improve the solution vector, where $R_{par}$ is the pitch adjusting rate and $B_W$ is the bandwidth of the variations.
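One improvisation combining memory consideration, pitch adjustment and random selection might look like the sketch below. It follows the standard harmony search scheme rather than the exact forms of (15) and (16), which are not recoverable here; the HMCR of 0.85 matches the 85%/15% split stated above, while the pitch adjusting rate, the bandwidth (expressed as a fraction of each variable's range) and the clipping to the bounds are assumptions.

```python
import random


def improvise(harmony_matrix, bounds, hmcr=0.85, r_par=0.3, bandwidth=0.05,
              rng=random):
    """Build one new solution vector from the stored harmonies."""
    new = []
    for h, (lo, hi) in enumerate(bounds):
        if rng.random() < hmcr:
            # Memory consideration: reuse a stored value of this variable.
            value = rng.choice(harmony_matrix)[h]
            if rng.random() < r_par:
                # Pitch adjustment: small random shift within the bandwidth.
                value += (2.0 * rng.random() - 1.0) * bandwidth * (hi - lo)
        else:
            # Random selection from the entire range.
            value = lo + rng.random() * (hi - lo)
        new.append(min(max(value, lo), hi))  # keep the value inside its bounds
    return new


bounds = [(0.1, 100.0), (0.01, 10.0)]  # hypothetical search ranges
memory = [[10.0, 1.0], [20.0, 2.0], [30.0, 3.0]]
candidate = improvise(memory, bounds, rng=random.Random(1))
```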

4) Parameter update
In the parameter adjustment process, $R_{par}$ is tuned according to (17) to improve the performance of the HSA, where $R^{min}_{par}$ and $R^{max}_{par}$ are the lower and upper bounds of $R_{par}$, t is the iteration number and $I_P$ is the number of improvisations so far in the optimization process. An improvisation means that the best hyperplane of the offspring harmony matrix is better than that of the parent harmony matrix.
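As a rough picture of a dynamic pitch adjusting rate, the sketch below moves the rate linearly from its lower to its upper bound over the run, as is done in improved harmony search variants. This is a stand-in and not equation (17): the paper's update also depends on the number of improvisations so far, and that dependence cannot be recovered from the text. The bounds 0.1 and 0.9 are hypothetical.

```python
def pitch_adjusting_rate(t, max_iterations, r_min=0.1, r_max=0.9):
    """Assumed linear schedule between the lower and upper bounds of R_par."""
    if max_iterations <= 1:
        return r_max
    fraction = min(max(t / (max_iterations - 1), 0.0), 1.0)
    return r_min + (r_max - r_min) * fraction


schedule = [pitch_adjusting_rate(t, 5) for t in range(5)]
```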

5) Selection of the best vector solution
After adjusting the parameters, the size of the HM is kept at 2S. The S better solution vectors in the harmony matrix are then selected.
Steps 3) to 5) are repeated until the maximum number of iterations is reached. The best solution vector in the HM is regarded as the optimal hyperplane for the SVR.
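Steps 1) to 5) can be put together as in the sketch below. It is a generic harmony search under stated assumptions, not the authors' code: the fitness function is a toy stand-in for the hyperplane value in (10) (a real run would train and score an SVR for each candidate pair of parameters), the pitch adjusting rate follows an assumed linear schedule, and the bounds, memory size and iteration count are hypothetical.

```python
import random


def harmony_search(fitness, bounds, size=10, iterations=200, hmcr=0.85,
                   r_min=0.1, r_max=0.9, bandwidth=0.05, seed=0):
    """Maximize `fitness` over box `bounds`; returns the best solution vector."""
    rng = random.Random(seed)

    def random_solution():
        return [lo + rng.random() * (hi - lo) for lo, hi in bounds]

    memory = [random_solution() for _ in range(size)]          # step 1
    for t in range(iterations):
        # Step 4 (assumed linear schedule for the pitch adjusting rate).
        r_par = r_min + (r_max - r_min) * t / max(iterations - 1, 1)
        offspring = []
        for _ in range(size):                                  # step 3
            new = []
            for h, (lo, hi) in enumerate(bounds):
                if rng.random() < hmcr:
                    value = rng.choice(memory)[h]
                    if rng.random() < r_par:
                        value += (2.0 * rng.random() - 1.0) * bandwidth * (hi - lo)
                else:
                    value = lo + rng.random() * (hi - lo)
                new.append(min(max(value, lo), hi))
            offspring.append(new)
        # Step 5: parents plus offspring give 2S vectors; keep the better S.
        memory = sorted(memory + offspring, key=fitness, reverse=True)[:size]
    return memory[0]


def toy_fitness(solution):
    """Stand-in objective with its maximum at penalty weight 10, dilation 2."""
    gamma, sigma = solution
    return -((gamma - 10.0) ** 2 + (sigma - 2.0) ** 2)


search_bounds = [(0.1, 100.0), (0.01, 10.0)]  # hypothetical search ranges
best = harmony_search(toy_fitness, search_bounds)
```

Keeping parents and offspring together before selection means the best solution found so far is never discarded, so the best fitness cannot decrease from one iteration to the next.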

Data description and evaluation criteria
The proposed method was used to forecast the power output of a 2 kW wind turbine system. The forecasting period is 3 hours ahead in steps of 15 minutes. To verify the superiority of the proposed approach, the traditional ANN method was also tested using the same database. The wind power generation, wind speed and wind direction data were sampled every 5 seconds from January to December 2013. The 15-minute wind power output data were then obtained by averaging the data collected over each 15-minute interval.
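With one sample every 5 seconds, each 15-minute interval contains 900 / 5 = 180 samples. The averaging step can be sketched as follows; the function name and the choice to drop an incomplete trailing block are assumptions.

```python
SAMPLES_PER_BLOCK = (15 * 60) // 5  # 180 five-second samples per 15 minutes


def block_average(samples, block_size=SAMPLES_PER_BLOCK):
    """Average consecutive blocks of samples; an incomplete last block is dropped."""
    return [sum(samples[i:i + block_size]) / block_size
            for i in range(0, len(samples) - block_size + 1, block_size)]


# Two synthetic 15-minute intervals of constant output (W).
raw = [100.0] * 180 + [300.0] * 180
averaged = block_average(raw)
```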
To verify the proposed approach, the data for the last day of each month were used for testing and the data for the other days were used for training. Note that the traditional ANN and proposed methods have the same number of training models and the same input/output variables for each model, while their parameters were determined independently.
To evaluate the forecasting performance of each method, the root mean square error (RMSE), which measures the average magnitude of the errors, was used as follows:

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(P_f^i - P_t^i\right)^2} \tag{18}$$

where $P_f^i$ is the forecast value and $P_t^i$ is the actual value at the ith forecasting point, and N is the number of forecasting points.
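Besides the RMSE, the mean relative error (MRE) was also used as an evaluation index for trend estimation, as follows:

$$MRE = \frac{1}{N} \sum_{i=1}^{N} \frac{\left|P_f^i - P_t^i\right|}{P_{total}} \times 100\% \tag{19}$$

where $P_f^i$ and $P_t^i$ are the forecast and actual values at the ith forecasting point and $P_{total}$ is the capacity of the wind power system.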
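Both criteria are straightforward to compute. The sketch below assumes that the MRE normalizes the mean absolute error by the system capacity, which is consistent with the definition of the capacity term above; the function names and the sample values are illustrative only.

```python
import math


def rmse(forecast, actual):
    """Root mean square error, in the same unit as the inputs."""
    n = len(forecast)
    return math.sqrt(sum((f - a) ** 2 for f, a in zip(forecast, actual)) / n)


def mre(forecast, actual, capacity):
    """Mean absolute error relative to the system capacity, in percent."""
    n = len(forecast)
    return 100.0 * sum(abs(f - a) for f, a in zip(forecast, actual)) / (n * capacity)


# Made-up values for a 2 kW (2000 W) system.
forecast = [100.0, 200.0]
actual = [130.0, 160.0]
error_rmse = rmse(forecast, actual)
error_mre = mre(forecast, actual, capacity=2000.0)
```

Because the MRE is normalized by the capacity rather than by the actual output, it stays well defined when the actual output is close to zero.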

Forecasting results
Table 2 shows the parameters of the ANN and proposed methods. Figure 4 shows the forecasting results for Mar. 31, 2013. Both the ANN and proposed methods give a large error from the 3rd to the 5th forecasting periods. The RMSE is 127.0492 W for the ANN method and 113.4993 W for the proposed method. The MRE is 4.8304% for the ANN method and 4.48674% for the proposed method.
Figure 5 shows the forecasting results for Jun. 30, 2013. The ANN and proposed methods again give a large error from the 3rd to the 5th forecasting periods. However, the RMSE and MRE are 10.0654 W and 0.3534% for the ANN method, and 7.9569 W and 0.2970% for the proposed method. The forecasting errors are very low because the wind power levels were very low on that day.
Figure 6 shows the forecasting results for Sep. 30, 2013. Since the wind speed changed frequently, both the ANN and proposed methods produce larger prediction errors. Fig. 7 shows the forecasting results for Dec. 31, 2013. Compared with the proposed method, the ANN method forecasts better in the 5th forecasting period but worse in the 6th.
Table 3 summarizes the RMSE and MRE for each method. The average RMSE is 68.0519 W for the ANN method and 58.0490 W for the proposed method. The average MRE is 2.3587% for the ANN method and 2.0542% for the proposed method. The proposed method therefore produces better forecasts than the ANN method in terms of both the RMSE and MRE criteria.
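The size of the gap between the two methods can be expressed as a relative reduction, computed here from the average values quoted above. The helper name is illustrative and the calculation is plain arithmetic on the reported averages, not an additional result.

```python
def relative_improvement(baseline, proposed):
    """Percentage reduction of an error measure relative to the baseline."""
    return 100.0 * (baseline - proposed) / baseline


# Average values quoted in the text for the ANN and proposed methods.
rmse_gain = relative_improvement(68.0519, 58.0490)  # roughly 14.7 percent
mre_gain = relative_improvement(2.3587, 2.0542)     # roughly 12.9 percent
```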

Conclusions
A hybrid approach that combines SVR and HSA for short-term wind power forecasting is proposed in this paper. The proposed approach comprises data classification, training and forecasting stages. The data classification stage uses a k-means clustering algorithm to classify the wind power output patterns into different wind speed types. In the training stage, five training models, representing different wind speed types, are created using an HSA-based SVR. The forecasting stage uses the verbal wind speed forecasts of the TCWB for the next 3 hours to choose an adequate model for accurate forecasting. On one year of testing data, the proposed approach achieves better forecasting accuracy than the ANN method in terms of the MRE and RMSE criteria. Furthermore, the proposed method can be extended to more forecasting models for more accurate forecasts.

Fig. 1 Wind power curve for a 2 kW wind turbine generator (marked speeds: cut-in speed, rated speed, cut-out speed).
K-means clustering was first proposed by Lloyd in 1982 [9].
Fig. 2 Structure of the training stage.

Financial support from the Ministry of Science and Technology, Taiwan, R.O.C., under Grant No. 104

Table 1 Wind speed classification
In the SVR formulation, $n$ is the number of training data, $\xi_k$ is the kth slack variable, $\gamma$ is a weight for the penalty function, $x_k$ is the kth input data set and $H(x_k)$ is a nonlinear mapping function. The term $\gamma \sum_{k} \xi_k$ is a penalty function.

Table 2
Parameters of ANN and proposed methods