Stock Price Forecasting Using Support Vector Regression

Stock market research based on network behavior data has become a focus of behavioral finance. In this paper, we first construct a proxy variable for investors’ attention from comment data collected from Snowball Finance, a popular online financial community in China. We then analyze the lead-lag relationship between investors’ attention and the stock price for all A-shares listed on the Shanghai Stock Exchange and the Shenzhen Stock Exchange, using the Thermal Optimal Path (TOP) method. We find that investors’ attention and stock price have a dynamic relationship that differs from stock to stock, and that only a small number of stocks show investors’ attention changing ahead of the stock price. These two facts may account for the conflicting conclusions drawn by different studies. Finally, we build two support vector regression models to compare the predictive power of investors’ attention for the two selected kinds of stocks. The results show that adding investors’ attention to the models improves prediction precision only for the “leading stocks”, while it has little effect on the “lagging stocks”.
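The comparison described above can be illustrated with a minimal sketch. It is not the paper's actual pipeline: it assumes scikit-learn's SVR with an RBF kernel and default-like hyperparameters, and it uses synthetic data in which attention leads the return by one day. The names svr_test_mse and make_synthetic_stock, the feature choice (previous-day return, previous-day attention) and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def svr_test_mse(features, target, train_frac=0.7):
    """Fit an RBF-kernel SVR on the earliest observations; return MSE on the rest."""
    split = int(len(target) * train_frac)  # chronological split, no shuffling
    model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0, epsilon=0.1))
    model.fit(features[:split], target[:split])
    return mean_squared_error(target[split:], model.predict(features[split:]))


def make_synthetic_stock(n_days=300, lead_strength=0.8, seed=0):
    """Synthetic 'leading stock' (assumption): today's return depends on yesterday's attention."""
    rng = np.random.default_rng(seed)
    attention = rng.normal(size=n_days)
    returns = 0.1 * rng.normal(size=n_days)
    returns[1:] += lead_strength * attention[:-1]
    return attention, returns


attention, returns = make_synthetic_stock()
target = returns[1:]                                            # return on day t
baseline_X = returns[:-1].reshape(-1, 1)                        # return on day t-1 only
augmented_X = np.column_stack([returns[:-1], attention[:-1]])   # plus attention on day t-1

mse_baseline = svr_test_mse(baseline_X, target)
mse_augmented = svr_test_mse(augmented_X, target)
print(f"baseline MSE:  {mse_baseline:.4f}")
print(f"augmented MSE: {mse_augmented:.4f}")
```

Setting lead_strength=0 in make_synthetic_stock gives a rough stand-in for a stock whose price is not led by attention, in which case the attention feature would not be expected to help.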

Existing system:

Existing studies often treat the relationship between investors’ attention and stock price as linear and study it with static statistical or econometric models, such as correlation analysis (Zhang et al., 2011), the Granger causality test (Gilbert and Karahalios, 2010), multiple regression (Chen et al., 2012) and time series models (Karabulut, 2012). These methods estimate parameters under an assumed linear relationship and imply a long-term equilibrium. They cannot capture the dynamic characteristics of the lead-lag relationship between investors’ attention and stock price (Guo et al., 2017). Even if the original methods are extended with a sliding time window, it remains difficult to identify the lead-lag order correctly over the entire time interval, and the approach requires a large amount of data.
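To make the sliding-window variant of correlation analysis concrete, the following is a minimal sketch of a windowed lagged-correlation estimate. It illustrates the kind of baseline discussed above, not the TOP method and not any cited study's code. The function names, the window length, the maximum lag and the synthetic data (attention leading by 1 day in the first half and by 3 days in the second half) are all assumptions.

```python
import numpy as np


def lagged_correlation(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag]; lag > 0 means x leads y."""
    if lag > 0:
        a, b = x[:-lag], y[lag:]
    elif lag < 0:
        a, b = x[-lag:], y[:lag]
    else:
        a, b = x, y
    return float(np.corrcoef(a, b)[0, 1])


def best_lag(x, y, max_lag=5):
    """Lag in [-max_lag, max_lag] with the largest absolute correlation."""
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda k: abs(lagged_correlation(x, y, k)))


def rolling_best_lag(x, y, window=60, step=60, max_lag=5):
    """Estimate the best lag separately inside each sliding window."""
    starts = range(0, len(x) - window + 1, step)
    return [best_lag(x[s:s + window], y[s:s + window], max_lag) for s in starts]


# Synthetic series (assumption): the lead of attention over price changes mid-sample.
rng = np.random.default_rng(1)
n_days = 240
attention = rng.normal(size=n_days)
lead_1 = np.roll(attention, 1)   # attention shifted forward by 1 day
lead_3 = np.roll(attention, 3)   # attention shifted forward by 3 days
price_change = np.where(np.arange(n_days) < 120, lead_1, lead_3)
price_change = price_change + 0.2 * rng.normal(size=n_days)

window_lags = rolling_best_lag(attention, price_change)
full_sample_lag = best_lag(attention, price_change)
print("per-window lags:", window_lags)
print("full-sample lag:", full_sample_lag)
```

A single full-sample estimate returns one lag for the whole interval, whereas the windowed estimate can change over time. Its resolution is still limited by the window length, which is one reason the approach needs a large amount of data.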


The forecast results show that not every stock’s price can be predicted from investors’ attention. On the contrary, the number of predictable stocks, which we call “leading stocks”, is rather small, and predictability differs from stock to stock. This may be explained by investors’ limited attention and by the characteristics of the stocks themselves. The theory of limited attention suggests that investors’ attention has an impact on stock prices.