Abstract
Making efficient use of text data is important in investor sentiment research and related fields. Classifying texts that carry investor sentiment makes it possible to identify that sentiment effectively and accurately. This paper uses futures market forecast texts published by 21 futures companies as its data source and, based on the characteristics of these texts, builds a sentiment classification model based on BERT (Bidirectional Encoder Representations from Transformers). The forecast texts are classified with both the BERT-based model and a model based on a classical classification algorithm, and the classification performance of the models is compared. The results show that the optimized BERT model achieves the best classification performance. The work enriches the methods available for measuring investor sentiment in finance and improves the accuracy of such measurements.
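To make the comparison concrete, the following is a minimal sketch of what a classical baseline for this task might look like. The abstract does not name the classical algorithm used, so multinomial naive Bayes is an assumption chosen only for illustration. The English sentences, the `bullish`/`bearish` labels, and all function and class names are hypothetical and are not taken from the paper's data or code.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (a stand-in for real word segmentation)."""
    return text.lower().split()


class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore tokens never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, texts, labels):
    """Fraction of texts whose predicted label matches the given label."""
    correct = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return correct / len(labels)


# Hypothetical toy data, invented for this sketch.
train_texts = [
    "copper prices expected to rise on strong demand",
    "soybean futures likely to rally as supply tightens",
    "crude oil expected to fall on weak demand",
    "iron ore likely to decline as inventories build",
]
train_labels = ["bullish", "bullish", "bearish", "bearish"]

test_texts = [
    "strong demand should push prices to rise",
    "weak demand and inventories build so prices fall",
]
test_labels = ["bullish", "bearish"]

model = NaiveBayesClassifier().fit(train_texts, train_labels)
baseline_accuracy = accuracy(model, test_texts, test_labels)
```

A BERT-based classifier would be scored with the same `accuracy` function on the same held-out texts, which is the kind of side-by-side comparison the abstract describes.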
Acknowledgements
This work was supported by the National Natural Science Fund (71701099) and the National Key Research and Development Program of China (2018YFC0830400).
About this article
Cite this article
Xiaofeng, W., Jinghua, Z., Chenxi, J. et al. Research on sentiment classification of futures predictive texts based on BERT. Computing (2021). https://doi.org/10.1007/s00607-021-00989-9
DOI: https://doi.org/10.1007/s00607-021-00989-9