1 Introduction

Since January 2020, the outbreak of the novel coronavirus has seriously threatened the global economy and people's lives, and the government immediately took a series of prevention and control measures to curb the spread of the epidemic. Meanwhile, possible prevention methods for the virus, ranging from Shuanghuanglian to traditional Chinese medicine prescriptions, have attracted great attention. A press release attributed to Oxford University, stating that the theaflavins (TF3) contained in Pu'er tea can inhibit coronavirus replication, brought Pu'er tea back into the spotlight. The hype of 2006 is now some years in the past, but it is still widely remembered: Pu'er tea was auctioned for 160 thousand RMB at a Guangzhou tea fair, setting a new record for Chinese tea auctions, and it was labeled “sky-high price” and “crazy Pu'er.” Its drinking, investment and medical value together make it a leader in the tea industry. Therefore, predicting the Pu'er tea price and understanding market fluctuations are of great significance for the government in adopting scientific and effective control measures and for the development of the industry.

The Pu'er tea industry standard was introduced in 2013, and attention to the industry has gradually increased since then. Price forecasting methods are still being developed and refined, but because sample data and methods are limited, they are mainly qualitative: practitioners forecast prices from experience. For example, drought affects Pu'er tea production and so raises prices, and economic conditions affect consumers' spending power and so cause price fluctuations. Qualitative analysis has a certain rationality and offers some guidance and reference for industry development, but it is easily affected by human factors, and its predictions are highly uncertain and unstable. The collapses of the Pu'er tea market in 2007 and 2014 made price prediction particularly important. The qualitative methods used previously assumed a stable market environment, so applying quantitative analysis in this field is imperative. Quantitative methods play a great role in price prediction in many fields, but for Pu'er tea it is difficult to collect sample data, and the price gap between different tea regions and products is wide, which makes price forecasting research difficult. At present, most price prediction studies on Pu'er tea are narrow in scope and take a single perspective. For example, Xu (2012) studied the Pu'er tea price through field investigation [1], but owing to limited manpower and technical resources she surveyed only three sites, which cannot fully reflect the situation of the tea industry. In Dou L's (2014) paper, Feng KF concluded from an analysis of raw material and labor costs that the spring tea price in 2013 would rise somewhat, within a very reasonable range of 10% to 15% [2]; however, the price fluctuates greatly and such an estimate is not accurate enough. China Agricultural Information (2012) concluded from market research that drought reduced the Yunnan spring tea output and pushed the tea price up [3]. The existing methods are therefore not ideal for Pu'er tea price prediction, and a method with higher prediction accuracy is needed.

More than 150 price forecasting methods now exist. Among them, the autoregressive integrated moving average (ARIMA) model has become a widely used time series model because of its simplicity, feasibility and flexibility [4, 5]. Scholars have applied the ARIMA model to pork prices [6], the China road logistics freight index [7], second-hand housing prices [8] and the total fixed asset investment of Guangxi [9], and obtained fairly accurate predictions. In addition to the ARIMA model, intelligent algorithms are widely used in price prediction, mainly including artificial neural networks, grey prediction, genetic algorithms and wavelet analysis. An artificial neural network is a mathematical model for information processing that simulates a biological nervous system; it is more intelligent and achieves higher prediction accuracy, and improved variants avoid the disadvantages of the traditional algorithm and raise the accuracy further [10]. Scholars have applied the back-propagation (BP) neural network model to the prediction of stock [11, 12], commercial housing [13], second-hand housing [14], carbon market [15], shale gas production [16] and zinc futures prices [17]. However, comparisons between the ARIMA and BP models have produced conflicting results. For example, Zhang et al. (2003) compared BP and ARIMA experimentally and concluded that BP predicts nonlinear data more accurately than ARIMA [18]. Ma (2014) built a prediction model using the annual sales of enterprises in China's appliance industry as an example, which showed that BP had a higher prediction accuracy than the ARMA model [19]. Chen (2017) predicted the closing prices of Baidu and Alibaba with ARIMA and BP and compared the prediction accuracy of the two models; the results showed that BP was less accurate than ARIMA [20].

Each of these price forecasting methods has its own advantages, disadvantages and field of application. Short-term price forecasts are more accurate and can guide practitioners' short-term business decisions, consumer behavior and investors' investments, but they cannot meet the long-term needs of the government and tea enterprises, so long-term forecasts also have important guiding significance for business development. At present, there are few quantitative forecasting methods for the Pu'er tea price in academic circles, and their accuracy is not high. In view of this research status, this paper first takes 412 weekly prices of each of four products (the current year of Dayi 7542 raw tea, the current year of Dayi 7572 ripe tea, 2011 Dayi 7542 raw tea and 2011 Dayi 7572 ripe tea) from June 12, 2012, to May 3, 2020, as sample data and establishes an ARIMA model for short-term price prediction. Second, from macro and micro perspectives, the annual prices of the four products and their impact factors from 2012 to 2019 are taken as samples, the TOPSIS method is used to rank the weights of 16 factors affecting price, and a BP model is established to test the price prediction results of different combinations of impact factors. Finally, the two models are compared to identify the more accurate one, and objective suggestions for the industry are offered on the basis of the empirical results.

First, this paper takes as its research object the prices from 2012 to 2020 of the current year and 2011 Dayi 7542 raw tea and 7572 ripe tea, so the selected products and data are representative. Second, in terms of research methods, both short-term and long-term price prediction models are selected, which meets the need for high prediction accuracy and also provides guidance on the future long-term price for the industry. Comparing the two models identifies the better prediction method for analysis.

2 Short-term price forecast of Pu'er tea

Yunnan Dayi Tea Group Co., Ltd., founded in 1940, is a pioneer in China's tea industry. It holds the titles "Time-honored brand of China" and "Well-known trademark of China," is a national high-tech company and has won the Yunnan Provincial People's Government Quality Award. Pu'er tea is the company's core business, and its operations cover four sectors: tea, water, vessels and tea ceremony. It is a large modern group spanning the whole industry chain of scientific research, cultivation, production, marketing and culture. Its production scale, sales, profits and overall brand influence rank first in the industry, and it has the largest number of brand stores worldwide. The company is regarded as the founder of the Pu'er tea industry, the setter of tea value standards, the owner of classic tea formulas, the inventor of the global microbial tea-making process and the definer of the OTCA Pu'er tea value concept. Dayi also actively fulfills its corporate social responsibility: over more than 10 years it has spent about 5 billion RMB on purchasing raw tea materials, helping hundreds of thousands of tea farmers out of poverty.

Dayi 7542 and 7572 are the benchmark products of the industry and are known as its price weathervane and barometer. 7542 is considered the standard taste of Pu'er for its long history, classic materials and variety of storage styles. 7572 is a successful product introduced by Menghai Tea Factory in 1975; its materials, blending technology and fermentation technology are all relatively stable, and it remains active in the market despite decades of agricultural improvement and changes in the raw material market.

This part takes the weekly prices of the current year of Dayi 7542 raw tea, the current year of Dayi 7572 ripe tea, 2011 Dayi 7542 raw tea and 2011 Dayi 7572 ripe tea from June 12, 2012, to May 3, 2020, as sample data and establishes an ARIMA model for short-term prediction and analysis of the Pu'er tea price.

2.1 Construction of ARIMA model

The autoregressive integrated moving average (ARIMA) model is a well-known time series forecasting model proposed by Box and Jenkins in 1970. First, the nonstationary time series is made stationary by differencing it \(d\) times. Autoregressive and moving average terms are then fitted to the new series to determine the model orders \(p\) and \(q\), an ARMA(\(p,q\)) model is established for the new series, and prediction is carried out after model diagnosis. The ARIMA(\(p,d,q\)) model equation is:

$$X_{t} = \phi_{1} X_{t-1} + \phi_{2} X_{t-2} + \cdots + \phi_{p} X_{t-p} + \varepsilon_{t} - \theta_{1} \varepsilon_{t-1} - \theta_{2} \varepsilon_{t-2} - \cdots - \theta_{q} \varepsilon_{t-q}$$
(1)

where \(\phi_{1},\phi_{2},\ldots,\phi_{p}\) are the autoregressive coefficients; \(\theta_{1},\theta_{2},\ldots,\theta_{q}\) are the moving average coefficients; \(p\) is the autoregressive order; \(q\) is the moving average order; \(d\) is the difference order; and \(\left\{\varepsilon_{t}\right\}\) is a white noise sequence.
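As an illustration of how Eq. (1) produces a forecast, the following minimal Python sketch computes a one-step-ahead value for an already differenced series by setting the unknown future shock to zero. The function name `arma_one_step` and all numbers are hypothetical and chosen only for the example; they are not estimates from the paper.

```python
def arma_one_step(x, eps, phi, theta):
    """One-step-ahead value from Eq. (1), with the future shock set to zero.

    x     : past values of the (already differenced) series, oldest first
    eps   : past residuals aligned with x, oldest first
    phi   : autoregressive coefficients [phi_1, ..., phi_p]
    theta : moving average coefficients [theta_1, ..., theta_q]
    """
    p, q = len(phi), len(theta)
    if len(x) < p or len(eps) < q:
        raise ValueError("need at least p past values and q past residuals")
    ar_part = sum(phi[i] * x[-1 - i] for i in range(p))
    ma_part = sum(theta[j] * eps[-1 - j] for j in range(q))
    # Eq. (1) subtracts the moving average terms.
    return ar_part - ma_part


# Illustrative ARMA(2,1) coefficients and history (made up for the example).
x_hist = [0.4, -0.2, 0.6]
eps_hist = [0.1, -0.3, 0.2]
forecast = arma_one_step(x_hist, eps_hist, phi=[0.5, 0.2], theta=[0.4])
print(round(forecast, 4))
```

Here the autoregressive part is 0.5 × 0.6 + 0.2 × (−0.2) and the moving average part is 0.4 × 0.2, so the sketch only restates the arithmetic of Eq. (1).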

A characteristic of this model is that it does not directly consider changes in other related random variables. Instead, it treats the data sequence that the predicted object forms over time as a random sequence that can be generated by an autoregressive moving average process. The time series is thus explained by its own past values, lagged terms and random disturbance terms [21]. If the time series is stationary, its behavior does not change significantly over time, so its past and present values can be used to predict future values, which is the advantage of random time series analysis models [22].

This paper uses the weekly prices of four Pu'er tea products from June 12, 2012, to May 3, 2020, as sample data. There are 412 observations for each product, and the data come from the China Pu'er tea website.

2.1.1 Stationarity test

The price trends of the four products are shown in Figs. 1, 2, 3 and 4.

Fig. 1 Price of the current year of Dayi 7542 raw tea
Fig. 2 Price of the current year of Dayi 7572 ripe tea
Fig. 3 Price of 2011 Dayi 7542 raw tea
Fig. 4 Price of 2011 Dayi 7572 ripe tea

The figures show that the time series of the four products can be regarded as nonstationary. Their autocorrelation coefficients do not decay rapidly to zero but fluctuate on one side of the zero axis, which further supports this conclusion.

2.1.2 Sequence stabilization

After first-order differencing of the original time series, the results are shown in Figs. 5, 6, 7 and 8. The differenced series of the four products show no obvious trend and can be considered stationary.
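First-order differencing itself is a simple operation, sketched below in Python. The helper names `difference` and `undifference` and the price values are made up for illustration; the second helper shows how forecasts of differences are turned back into price levels.

```python
def difference(series, d=1):
    """Apply d rounds of first-order differencing."""
    out = list(series)
    for _ in range(d):
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


def undifference(first_value, diffs):
    """Rebuild price levels from a first value and first-order differences."""
    levels = [first_value]
    for delta in diffs:
        levels.append(levels[-1] + delta)
    return levels


prices = [100.0, 103.0, 107.0, 106.0, 110.0]  # made-up weekly prices
diffs = difference(prices, d=1)
restored = undifference(prices[0], diffs)
print(diffs)
print(restored)
```

Each round of differencing shortens the series by one observation, so a 412-point series yields 411 first differences.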

Fig. 5 First-order difference of the current year of Dayi 7542 raw tea
Fig. 6 First-order difference of the current year of Dayi 7572 ripe tea
Fig. 7 First-order difference of 2011 Dayi 7542 raw tea
Fig. 8 First-order difference of 2011 Dayi 7572 ripe tea

The significance levels commonly used for the unit root test are 0.01, 0.05 and 0.1, and a level of 0.05 is selected in this paper. After the unit root test, the \(p\) values of the four differenced series were all well below the significance level of 0.05, so the null hypothesis of a unit root is rejected and the series are stationary.
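The idea behind the unit root test can be sketched in a few lines of Python. This is only the simplest Dickey-Fuller regression (constant, no lagged difference terms), not the test configuration used in the paper, which is not specified. The function name `dickey_fuller_t` and the simulated data are assumptions; the value −2.86 is the standard asymptotic 5% critical value for the constant-only case.

```python
import math
import random


def dickey_fuller_t(series):
    """t statistic of rho in the regression diff(x)_t = a + rho * x_{t-1} + e_t."""
    lagged = series[:-1]
    delta = [series[t] - series[t - 1] for t in range(1, len(series))]
    n = len(delta)
    mean_x = sum(lagged) / n
    mean_y = sum(delta) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, delta))
    rho = sxy / sxx
    intercept = mean_y - rho * mean_x
    ssr = sum((y - intercept - rho * x) ** 2 for x, y in zip(lagged, delta))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    return rho / std_err


CRITICAL_5PCT = -2.86  # asymptotic 5% value, constant-only Dickey-Fuller case

# Simulated random walk with 412 points (hypothetical data, not tea prices).
random.seed(0)
walk, level = [], 0.0
for _ in range(412):
    level += random.gauss(0.0, 1.0)
    walk.append(level)

t_level = dickey_fuller_t(walk)
t_diff = dickey_fuller_t([walk[t] - walk[t - 1] for t in range(1, len(walk))])
print("levels:", round(t_level, 2), "reject unit root:", t_level < CRITICAL_5PCT)
print("first difference:", round(t_diff, 2), "reject unit root:", t_diff < CRITICAL_5PCT)
```

A statistic below the critical value rejects the unit root hypothesis, which is what the differenced series is expected to do.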

2.1.3 Parameter estimation

The previous part applied first-order differencing to the original time series, that is, \(d\) = 1. This part determines the orders \(p\) and \(q\). Letting \(p\) and \(q\) each take the values 0, 1 and 2 gives nine candidate models in total. The choice among ARIMA models is generally based on minimum information criteria, such as the AIC (Akaike information criterion), SC (Schwarz criterion) and HQ (Hannan–Quinn criterion). This paper uses all three as selection criteria in order to obtain the best trade-off: the AIC, SC and HQ values of the candidate models are compared, and the combination with the minimum values is the optimal model. The results of the different combinations for the four products are shown in Tables 1, 2, 3 and 4.

Table 1 The current year of Dayi 7542 raw tea model comparison results
Table 2 The current year of Dayi 7572 ripe tea model comparison results
Table 3 2011 Dayi 7542 raw tea model comparison results
Table 4 2011 Dayi 7572 ripe tea model comparison results

Comparing the AIC, SC and HQ values of the candidate models for the four products shows that the optimal model for both the current year of Dayi 7542 raw tea and the current year of Dayi 7572 ripe tea is ARIMA(1,1,0). The optimal model for 2011 Dayi 7542 raw tea is ARIMA(0,1,1), and that for 2011 Dayi 7572 ripe tea is ARIMA(2,1,2).
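The selection rule can be sketched as follows, assuming each candidate model has already been fitted and its log-likelihood is known. The textbook forms of the three criteria are used; some packages report them divided by the sample size, which does not change which model attains the minimum. The log-likelihood values and the parameter count `k = p + q + 1` are hypothetical and not taken from Tables 1 to 4.

```python
import math


def information_criteria(log_lik, k, n):
    """Return (AIC, SC, HQ) for a model with k parameters fitted to n points."""
    aic = -2.0 * log_lik + 2.0 * k
    sc = -2.0 * log_lik + k * math.log(n)
    hq = -2.0 * log_lik + 2.0 * k * math.log(math.log(n))
    return aic, sc, hq


def pick_order(log_liks, n, d=1):
    """log_liks maps (p, q) to a log-likelihood; returns the best (p, d, q) per criterion."""
    best = {}
    for idx, name in enumerate(("AIC", "SC", "HQ")):
        scores = {}
        for (p, q), ll in log_liks.items():
            k = p + q + 1  # AR terms + MA terms + constant (assumed parameter count)
            scores[(p, d, q)] = information_criteria(ll, k, n)[idx]
        best[name] = min(scores, key=scores.get)
    return best


# Hypothetical log-likelihoods for the nine (p, q) pairs; not the paper's values.
fake_log_liks = {
    (0, 0): -1520.0, (0, 1): -1505.0, (0, 2): -1504.5,
    (1, 0): -1500.0, (1, 1): -1499.8, (1, 2): -1499.5,
    (2, 0): -1499.9, (2, 1): -1499.6, (2, 2): -1499.0,
}
selected = pick_order(fake_log_liks, n=411)
print(selected)
```

With these made-up inputs, adding terms beyond one autoregressive lag improves the likelihood too little to offset the penalty, so all three criteria agree on the same order.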

2.1.4 Model adaptability test

After the optimal model is determined, its adequacy is tested by examining the model residual sequence. If the residual sequence is not white noise, some important information has not been extracted and the model should be respecified. The autocorrelation and partial autocorrelation of the residual sequences of the four products are shown in Figs. 9, 10, 11 and 12.

Fig. 9 Autocorrelation and partial autocorrelation of the current year of Dayi 7542 raw tea
Fig. 10 Autocorrelation and partial autocorrelation of the current year of Dayi 7572 ripe tea
Fig. 11 Autocorrelation and partial autocorrelation of 2011 Dayi 7542 raw tea
Fig. 12 Autocorrelation and partial autocorrelation of 2011 Dayi 7572 ripe tea
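A residual check of this kind can be approximated numerically. The sketch below computes sample autocorrelations and compares them with the usual approximate 95% band of ±1.96/√n for white noise. It is an illustration only: the helper names and the patterned test series are hypothetical, and the paper's own judgment is based on the correlogram figures.

```python
import math


def autocorrelation(values, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    num = sum((values[t] - mean) * (values[t - lag] - mean) for t in range(lag, n))
    return num / denom


def looks_like_white_noise(values, max_lag):
    """True if every autocorrelation up to max_lag lies inside +/- 1.96/sqrt(n)."""
    bound = 1.96 / math.sqrt(len(values))
    return all(abs(autocorrelation(values, k)) <= bound
               for k in range(1, max_lag + 1))


# A strongly patterned "residual" series: alternating signs (made up).
patterned = [1.0 if t % 2 == 0 else -1.0 for t in range(100)]
r1 = autocorrelation(patterned, 1)
print(r1)
print(looks_like_white_noise(patterned, 5))
```

A series that fails this check still contains structure the model has not captured, which is the situation in which the model should be respecified.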

According to the autocorrelation and partial autocorrelation results for the four products, the residual sequences show no autocorrelation, which means they are white noise, so the models are appropriate. The model fits are shown in Figs. 13, 14, 15 and 16.

Fig. 13 Model fitting diagram of the current year of Dayi 7542 raw tea
Fig. 14 Model fitting diagram of the current year of Dayi 7572 ripe tea
Fig. 15 Model fitting diagram of 2011 Dayi 7542 raw tea
Fig. 16 Model fitting diagram of 2011 Dayi 7572 ripe tea

2.2 Verification and analysis of ARIMA model

The experimental environment is an Intel(R) Pentium(R) N3540 CPU @ 2.16 GHz, 4.00 GB of installed memory, a 64-bit Windows 7 Ultimate operating system and EViews 10 software.

2.2.1 Model validation

The data are divided into two parts: the 401 weekly observations from June 12, 2012, to February 16, 2020, form the training set, and the 11 weekly observations from February 17, 2020, to May 3, 2020, form the validation set. The ARIMA model is used to forecast the validation period, and the predicted values are compared with the real values; the results are shown in Tables 5 and 6.
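The validation procedure can be sketched as a holdout split followed by recursive multi-step forecasting. The example below assumes an ARIMA(1,1,0) model without a constant term and a hypothetical coefficient `phi`; the series is synthetic, and the function names are illustrative rather than part of any library.

```python
def split_holdout(series, n_valid):
    """Keep the last n_valid observations for validation."""
    return series[:-n_valid], series[-n_valid:]


def forecast_arima_110(train, phi, steps):
    """Recursive forecasts from ARIMA(1,1,0) without a constant term.

    The differenced series follows dx_t = phi * dx_{t-1} + e_t, so each
    forecast difference is phi times the previous one, and the price level
    is rebuilt by adding the forecast differences to the last level.
    """
    last_level = train[-1]
    last_diff = train[-1] - train[-2]
    forecasts = []
    for _ in range(steps):
        last_diff = phi * last_diff
        last_level = last_level + last_diff
        forecasts.append(last_level)
    return forecasts


weekly = [100.0 + 0.5 * t for t in range(412)]  # synthetic 412-week series
train, valid = split_holdout(weekly, n_valid=11)
forecasts = forecast_arima_110(train, phi=0.5, steps=len(valid))
print(len(train), len(valid))
print(forecasts[:3])
```

Because every step reuses the previous forecast rather than an observed value, the forecast differences shrink toward zero and errors can accumulate over the horizon, which is consistent with treating ARIMA as a short-term tool.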

Table 5 ARIMA model verification results of the current year of Dayi 7542 raw tea and the current year of Dayi 7572 ripe tea
Table 6 ARIMA model verification results of 2011 Dayi 7542 raw tea and 2011 Dayi 7572 ripe tea

In order to describe the prediction results of the ARIMA model for the four Pu'er tea prices more comprehensively and objectively, this paper follows the relevant literature and uses two indexes, mean square error (MSE) and mean absolute error (MAE), to measure the prediction quality of the model [23]. The mean relative error (MRE) is added as a third index. The indexes are defined as follows:

$$MSE=\frac{1}{N}\sum_{i=1}^{N}\left(y_{i}-\hat{y}_{i}\right)^{2}$$
(2)
$$MAE=\frac{1}{N}\sum_{i=1}^{N}\left|y_{i}-\hat{y}_{i}\right|$$
(3)
$$MRE=\frac{1}{N}\sum_{i=1}^{N}\frac{\left|y_{i}-\hat{y}_{i}\right|}{y_{i}}$$
(4)

where \(y_{i}\) denotes the actual value, \(\hat{y}_{i}\) denotes the predicted value and \(N\) denotes the number of predicted samples. For different models, smaller values of MSE, MAE and MRE indicate higher prediction accuracy. The prediction accuracy results of the model are shown in Table 7.
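The three error measures in Eqs. (2) to (4) translate directly into code. The following Python sketch uses made-up actual and predicted prices purely to show the arithmetic.

```python
def mse(actual, predicted):
    """Mean square error, Eq. (2)."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error, Eq. (3)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mre(actual, predicted):
    """Mean relative error, Eq. (4); actual values must be nonzero."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


actual = [100.0, 200.0, 400.0]      # made-up real prices
predicted = [110.0, 190.0, 400.0]   # made-up forecasts
print(mse(actual, predicted), mae(actual, predicted), mre(actual, predicted))
```

MSE and MAE depend on the price scale of each product, whereas MRE is dimensionless, which makes it the more convenient measure when products with very different price levels are compared.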

Table 7 Prediction accuracy results of ARIMA model

2.2.2 Result analysis

Tables 5, 6 and 7 show that the relative prediction error of the four products is smallest in the first 2 weeks; in particular, the price prediction error of the current year of Dayi 7572 ripe tea in the first week is 0.0008998. After a few weeks the prediction error of all four products increases. The error of the current year of Dayi 7542 raw tea at week 10 and that of the current year of Dayi 7572 ripe tea at weeks 5 to 9 decrease to some extent, but they remain larger than in the first two weeks. The ARIMA model therefore predicts prices best in the first two weeks; the error grows in later weeks but stays within an acceptable range, which indicates that the ARIMA model predicts the product prices well and is best suited to short-term prediction. Table 7 shows that the 2011 Dayi products have smaller errors than the two current year products on all three indexes (MSE, MAE and MRE), and that among the current year products the 7572 ripe tea is predicted more accurately than the raw tea. The ARIMA model thus has a better prediction effect on the 2011 products.

3 Long-term price forecast of Pu'er tea

This part takes the annual prices of four Pu'er tea products and the annual data of 16 price impact factors from 2012 to 2019 as sample data. Based on a weight analysis of the Pu'er tea impact factors by the TOPSIS method, a BP model is constructed to predict and analyze the annual price of Pu'er tea.

3.1 BP model construction

The back-propagation (BP) network model was proposed in 1986 by Rumelhart and McClelland. It is a multilayer feedforward network trained with the error back-propagation algorithm and is one of the most widely used neural network models, applied to function approximation, pattern recognition and classification, data compression and time series prediction. It consists of an input layer, one or more hidden layers and an output layer. Each neuron in the input layer receives input information from the outside world and transmits it to the neurons of the middle layer. The middle layer is the internal information processing layer, responsible for information transformation; depending on the required transformation capacity, it can be designed with a single hidden layer or several. The last hidden layer transmits the information to the neurons of the output layer, where further processing completes one forward propagation pass, and the output layer delivers the result to the outside world. When the actual output differs from the expected output, the error back-propagation stage begins. The error is propagated from the output layer back through the hidden layers to the input layer, layer by layer, and the weights of each layer are revised by gradient descent on the error. Repeating forward propagation and error back-propagation continuously adjusts the weights of each layer; this is the learning and training process of the neural network, and it continues until the network output error falls to an acceptable level or a preset number of training iterations is reached. The BP model is illustrated in Fig. 17.

Fig. 17 BP model diagram

The BP model has strong nonlinear mapping and generalization ability, but it also has shortcomings such as slow convergence, a large number of iterations, a tendency to fall into local optima and poor global search ability.
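The forward and backward passes described above can be made concrete with a small pure-Python sketch of a network with one hidden layer, tanh hidden units and a linear output. It uses plain batch gradient descent, which is a deliberate simplification and not the Levenberg-Marquardt routine (trainlm) reported later in the paper. The function name `train_bp`, the learning rate and the toy data are assumptions made for the example.

```python
import math
import random


def train_bp(inputs, targets, n_hidden, lr=0.1, epochs=2000, goal=1e-7, seed=0):
    """Train a 1-hidden-layer network (tanh hidden units, linear output).

    Returns (predict, history), where history holds the mean squared
    error measured at the start of each epoch.
    """
    rng = random.Random(seed)
    n_in, n = len(inputs[0]), len(inputs)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = [0.0]  # single output bias, kept in a list so it can be updated in place

    def forward(x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(w1, b1)]
        return hidden, sum(w * h for w, h in zip(w2, hidden)) + b2[0]

    history = []
    for _ in range(epochs):
        gw1 = [[0.0] * n_in for _ in range(n_hidden)]
        gb1 = [0.0] * n_hidden
        gw2 = [0.0] * n_hidden
        gb2 = 0.0
        sq_err = 0.0
        for x, y in zip(inputs, targets):
            hidden, out = forward(x)          # forward propagation
            err = out - y
            sq_err += err * err
            gb2 += err                        # error back-propagation starts here
            for j in range(n_hidden):
                gw2[j] += err * hidden[j]
                delta = err * w2[j] * (1.0 - hidden[j] ** 2)  # tanh derivative
                gb1[j] += delta
                for i in range(n_in):
                    gw1[j][i] += delta * x[i]
        history.append(sq_err / n)
        if history[-1] <= goal:               # stop once the error goal is met
            break
        step = lr / n  # averaged gradient; the factor 2 is absorbed into lr
        b2[0] -= step * gb2
        for j in range(n_hidden):
            w2[j] -= step * gw2[j]
            b1[j] -= step * gb1[j]
            for i in range(n_in):
                w1[j][i] -= step * gw1[j][i]

    return (lambda x: forward(x)[1]), history


# Toy data already scaled to [-1, 1]; the target is a made-up linear rule.
xs = [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0], [0.0, 0.5]]
ys = [0.5 * a - 0.3 * b for a, b in xs]
predict, history = train_bp(xs, ys, n_hidden=3)
print("first MSE:", history[0], "last MSE:", history[-1])
```

The loop stops on either of the two conditions named in the text: the error goal is reached or the preset number of epochs runs out.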

3.1.1 Training samples determination

Based on an analysis of the Pu'er tea industry chain, drawing on research by domestic and foreign scholars on the Pu'er tea price impact mechanism, and using data from the China Statistical Yearbook, the China Agriculture Website, the Bulletin of World Tea Industry Development Report and the Yunnan Statistical Yearbook, this paper builds an evaluation system for the Pu'er tea price impact mechanism for 2012–2019. The system comprises 5 evaluation dimensions (economic development, national tea supply and demand, tea supply in Yunnan Province, Pu'er tea supply and demand, and the Internet development level in Yunnan Province) and 16 indicators, as shown in Table 8. Most of the 16 impact factors of the Pu'er tea price are low-frequency (annual) data, so the prices of the four products should have the same frequency as the impact factors; that is, annual prices are the more reasonable choice. The annual price data of the four Pu'er tea products are shown in Table 9.

Table 8 Impact factors of Pu'er tea price
Table 9 Annual price of four Pu'er tea products (unit: RMB)

3.1.2 Weight model of Pu'er tea price impact mechanism

The technique for order preference by similarity to ideal solution (TOPSIS) was proposed by Hwang and Yoon in 1981. As a comprehensive evaluation method, it places no strict limits on data distribution, sample size or the number of indexes, and it is applicable to small samples as well as to multiple evaluation units. It can be used both for horizontal comparison across units and for longitudinal analysis across years. Converting the original data to a common trend and normalizing them eliminates the influence of different index dimensions while making full use of the original data to quantitatively evaluate the pros and cons of different units, with objective and accurate results. As a common multi-objective decision analysis method for a finite set of alternatives, it can effectively avoid the information overlap caused by correlation among indexes. This method is used here to determine the weight ranking of each impact factor. In the field of Pu'er tea, scholars have used grey correlation analysis to study the impact factors of the total output value of Yunnan tea. However, given the research objects and data characteristics, the TOPSIS method is more suitable for this paper.

Build the original data matrix of Pu'er tea price impact mechanism according to the evaluation indexes and the number of evaluation objects:

$$X=\left\{x_{ij}\right\}_{n\times m},\quad m=16,\ n=8$$
(5)

Transform the original data matrix of Pu'er tea price impact mechanism into a standardized matrix:

$$Y=\left\{y_{ij}\right\}_{n\times m}$$
(6)

where the positive index is \(y_{ij}=\frac{x_{ij}-a_{j}}{A_{j}-a_{j}}\), the negative index is \(y_{ij}=\frac{A_{j}-x_{ij}}{A_{j}-a_{j}}\), and the neutral index is \(y_{ij}=\frac{x_{ij}-a_{j}}{x_{0}-a_{j}}\) for \(x_{ij}<x_{0}\) and \(y_{ij}=\frac{A_{j}-x_{ij}}{A_{j}-x_{0}}\) for \(x_{ij}\geq x_{0}\), with \(A_{j}=\max_{i}x_{ij}\) and \(a_{j}=\min_{i}x_{ij}\).

Normalize \(y{ij}\):

$$Z_{ij}=\frac{y_{ij}}{\sum_{i=1}^{n}y_{ij}}$$
(7)

Calculate the entropy value of index \(j\):

$$e_{j}=-k\sum_{i=1}^{n}Z_{ij}\ln\left(Z_{ij}\right),\quad j=1,2,\ldots,m;\ k=\frac{1}{\ln\left(n\right)}\ \text{(constant)}$$
(8)

Differentiation coefficient:

$$g_{j}=1-e_{j}$$
(9)
$${\text{Entropy weight}}:c_{j}=\frac{g_{j}}{\sum_{j=1}^{m}g_{j}}$$
(10)
$${\text{Index weight}}:w_{j}=\frac{c_{j}}{\sum_{j=1}^{m}c_{j}}$$
(11)
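The weighting steps in Eqs. (5) to (11) can be sketched in Python as follows. The sketch covers only positive and negative indexes (the neutral case is omitted), and it treats 0·ln(0) as 0, which is a common convention but an assumption here because min-max scaling always produces a zero entry. The function names and the small data matrix are hypothetical.

```python
import math


def min_max_standardize(column, positive=True):
    """Scale one index to [0, 1] as a positive or negative index."""
    lo, hi = min(column), max(column)
    if hi == lo:
        raise ValueError("index has no variation")
    if positive:
        return [(x - lo) / (hi - lo) for x in column]
    return [(hi - x) / (hi - lo) for x in column]


def entropy_weights(matrix, positive_flags):
    """matrix[i][j] is the value of index j for evaluation object (year) i."""
    n, m = len(matrix), len(matrix[0])
    divergences = []
    for j in range(m):
        y = min_max_standardize([row[j] for row in matrix], positive_flags[j])
        total = sum(y)
        z = [v / total for v in y]                                   # Eq. (7)
        # Assumption: terms with z = 0 contribute nothing to the entropy.
        entropy = -sum(v * math.log(v) for v in z if v > 0) / math.log(n)  # Eq. (8)
        divergences.append(1.0 - entropy)                            # Eq. (9)
    g_sum = sum(divergences)
    c = [g / g_sum for g in divergences]                             # Eq. (10)
    c_sum = sum(c)
    return [cj / c_sum for cj in c]                                  # Eq. (11)


# Three evaluation objects, two positive indexes (made-up numbers).
data = [[1.0, 1.0], [2.0, 3.0], [3.0, 3.0]]
weights = entropy_weights(data, positive_flags=[True, True])
print(weights)
```

Indexes whose normalized values are spread less evenly have lower entropy and therefore receive larger weights; since the entropy weights in Eq. (10) already sum to one, Eq. (11) leaves them unchanged.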

The weights of the impact factors are obtained with the TOPSIS method. The three highest-ranked factors have a combined weight of 0.5638 and are therefore the main factors affecting price. Accordingly, for 2012–2019, factor sets ranging from the top three factors to all 16 factors, 14 combinations in total, are used as input samples, with the annual tea prices as output samples, in order to compare the price forecasting accuracy of the different factor combinations. Table 10 gives the weights of the 16 impact factor indexes of Pu'er tea and ranks them from the most to the least influential on price, laying a foundation for the subsequent BP model price prediction.

Table 10 Weight ranking of Pu'er tea price impact factors

3.1.3 Data preprocessing

To prevent the scale of the original sample data from unduly affecting the training speed and sensitivity of the network, this paper normalizes the original data into dimensionless standard values using the premnmx function. After BP training, the normalized predicted output is reverse-normalized to restore the Pu'er tea price values.
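MATLAB's premnmx maps each variable linearly onto the interval [−1, 1]. A Python stand-in for this step and for the reverse transformation is sketched below; the function names `scale_to_unit_range` and `restore_from_unit_range` and the price values are hypothetical and are not MATLAB or library functions.

```python
def scale_to_unit_range(values):
    """Map values linearly so the minimum becomes -1 and the maximum +1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    scaled = [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]
    return scaled, lo, hi


def restore_from_unit_range(scaled, lo, hi):
    """Invert the scaling using the stored minimum and maximum."""
    return [0.5 * (s + 1.0) * (hi - lo) + lo for s in scaled]


prices = [100.0, 150.0, 200.0]  # made-up annual prices
scaled, lo, hi = scale_to_unit_range(prices)
restored = restore_from_unit_range(scaled, lo, hi)
print(scaled)
print(restored)
```

The minimum and maximum found on the training data have to be stored, because the same values are needed to convert the network output back into a price.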

3.1.4 BP model structure design

The number of input layer neurons ranges from 3 to 16, and the number of output layer neurons is 1. In a BP neural network, choosing an appropriate number of hidden layer nodes is very important because it strongly affects network performance. With too few hidden nodes, training does not converge easily; with too many, the network tends to overtrain and the learning time becomes too long. There is no authoritative method for setting the number of hidden nodes. The usual approach is trial and error: start with a small estimate, keep all other conditions unchanged, gradually increase the number of nodes, train and test repeatedly, and select the number of hidden nodes that gives the smallest error. The values given by existing formulas are only empirical estimates, not necessarily the optimal number of nodes. The three commonly used formulas are:

$$m=\sqrt{nl}$$
(12)
$$m=\log_{2}{n}$$
(13)
$$m=\sqrt{n+l}+\alpha$$
(14)

where \(m\) is the number of hidden layer nodes, \(n\) is the number of input layer nodes, \(l\) is the number of output layer nodes, and \(\alpha\) is a constant between 1 and 10. In this paper, the more commonly used formula [14] is selected, and the optimal number of hidden layer nodes is then determined from the training errors obtained in repeated experiments with different node numbers; it therefore differs across input configurations. The hidden layer activation function is tansig, the output layer activation function is purelin, and the training function is trainlm. The maximum number of training epochs is set to 5000, and the target training error is 0.0000001.
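To show what the three empirical formulas suggest in practice, the following Python sketch evaluates Eqs. (12) to (14) for a given network size. The function name `hidden_node_estimates` is hypothetical, and the example uses 4 inputs and 1 output, the configuration the paper later reports as performing well.

```python
import math


def hidden_node_estimates(n_inputs, n_outputs, alphas=range(1, 11)):
    """Empirical starting points for the hidden layer size, Eqs. (12)-(14)."""
    geometric = math.sqrt(n_inputs * n_outputs)                       # Eq. (12)
    logarithmic = math.log2(n_inputs)                                 # Eq. (13)
    additive = [math.sqrt(n_inputs + n_outputs) + a for a in alphas]  # Eq. (14)
    return geometric, logarithmic, additive


geo, log_est, add = hidden_node_estimates(4, 1)
candidates = sorted({round(v) for v in add})
print(geo, log_est)
print(candidates)
```

For this configuration Eq. (14) yields candidate sizes from 3 to 12 as the constant runs from 1 to 10, so the formulas only narrow the search range and the final choice still comes from repeated training runs.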

3.2 Verification and analysis of BP model

The experimental environment is an Intel(R) Pentium(R) N3540 CPU @ 2.16 GHz, 4.00 GB of installed memory, a 64-bit Windows 7 Ultimate operating system and MATLAB 2018 software.

3.2.1 Model validation

According to the characteristics of the BP model and the relevant literature, 80% of the data is generally used as the training set and the remaining 20% as the validation set. Because the sample covers only 8 years of annual data, the data from 2012 to 2018 are used as the training set and the data of 2019 as the validation set. The 3–16 impact factors from 2012 to 2018 form the input layer, and the prices of the four products from 2012 to 2018 form the output layer. The input sets range from the top three impact factors to all 16 impact factors, 14 combinations in total, and the number of hidden layer neurons differs with the number of input layer neurons. Because space is limited, only the mean square error training results for the best number of hidden layer neurons for each number of input layer neurons are listed in Table 11.

Table 11 Relative error results of optimal hidden layer neuron training with different input layer neuron numbers of four Pu'er tea products

According to Table 11, the prediction accuracy is relatively high when the input layer has 4 neurons, that is, when the top four price impact factors are used, and the hidden layer has 8 neurons. The network training error of the four products fell below the target error after 25, 19, 18 and 20 iterations, respectively, and the network training results are shown in Figs. 18, 19, 20 and 21.

Fig. 18 Training error curve of the current year of Dayi 7542 raw tea
Fig. 19 Training error curve of the current year of Dayi 7572 ripe tea
Fig. 20 Training error curve of 2011 Dayi 7542 raw tea
Fig. 21 Training error curve of 2011 Dayi 7572 ripe tea

After the model training is completed, the obtained results are reverse-normalized and restored to the original order of magnitude to obtain the four products’ price prediction results of 2019. In order to more comprehensively and objectively analyze the prediction results of the four kinds of Pu'er tea price, this paper adopts relative error to measure the model prediction quality, and the prediction accuracy results are shown in Table 12.

Table 12 Prediction accuracy results of BP model

3.2.2 Result analysis

The BP model was used to test the Pu'er tea price impact factors. The results show that the four highest-ranked factors (Yunnan tea e-commerce sales, the inflation rate, the number of tea production enterprises in Yunnan Province and the interest rate) predict the Pu'er tea price well, which further confirms that they are the main price impact factors. In the "Internet +" environment, forms of online consumption have been further enriched, including online–offline and multi-channel interaction among online stores [24], so the share of Pu'er tea sales completed on e-commerce platforms has increased year by year; the development of Yunnan tea e-commerce is the main factor promoting Pu'er tea sales and steady price growth. The inflation rate is another factor behind Pu'er tea price fluctuations, and including it makes the analysis more objective. Since most enterprises have the technology and ability to produce Pu'er tea, the number of tea production enterprises in Yunnan Province also reflects the capacity of the Pu'er tea industry, or the quantity of products flowing into the market, so it directly reflects the market supply of Pu'er tea. Qu Q [25] proposed that the market interest rate reflects market borrowing costs and directly affects commodity investment behavior. Pu'er tea has become a new favorite for investment because its value appreciates; small-scale collectors in particular are likely to put their spare funds into Pu'er tea and similar products when the market interest rate is low.

The prediction for the current year of Dayi 7542 raw tea is poor and the error is large. The main reason is that this raw tea is sought after by consumers and investors; besides macro and micro factors, its price is also affected by qualitative factors such as government promotion, media publicity and investor hype, so its price fluctuates widely and is hard to predict accurately with the BP model. The prediction errors for the current year of Dayi 7572 ripe tea and for 2011 Dayi 7542 raw tea are relatively small, and the predictions are close to the real prices. The main reason is that the prices of these two products have fluctuated little and have not risen much in recent years. 2011 Dayi 7572 ripe tea became popular because of its cost-performance advantages, and its price was easily affected by other factors, which made prediction more difficult. Its prediction error was therefore larger than that of the previous two products, but the result still has some research significance.

4 Price forecast and analysis of Pu'er tea

This paper selects the ARIMA and BP models to forecast the prices of the four Pu'er tea products. The short-term forecasting performance of ARIMA is good, but the BP model's prediction error for the current year of Dayi 7542 raw tea is large. In addition, the annual impact-factor data for the other three products have not been released, and these data follow no simple linear or exponential relationship, so it is difficult to predict the impact factors accurately and then forecast the four product prices for 2020. On the other hand, if the prediction period of the ARIMA model is too long, prediction accuracy may be low and the results may offer little practical guidance. Therefore, this paper mainly uses the ARIMA model to make short-term price predictions for the four products over the next 10 weeks, and the results are shown in Table 13.

Table 13 ARIMA model prediction results of the next 10 weeks Pu'er tea price
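To illustrate why the forecast horizon is kept to 10 weeks, the sketch below uses the simplest member of the ARIMA family, ARIMA(0,1,0) with a constant (a random walk with drift). It is a simplified stand-in, not the model fitted in this paper, and the weekly prices are hypothetical. The point forecast extends the average weekly change, while the forecast standard deviation grows with the square root of the horizon.

```python
import math
import statistics


def drift_forecast(prices, horizon):
    """Forecast with a random walk with drift, i.e. ARIMA(0,1,0) plus a constant.

    Returns a list of (point_forecast, forecast_std) for steps 1..horizon.
    The std ignores the uncertainty in the estimated drift itself.
    """
    diffs = [b - a for a, b in zip(prices, prices[1:])]  # first differences (d = 1)
    drift = statistics.mean(diffs)                       # average weekly change
    sigma = statistics.stdev(diffs)                      # sample std of weekly shocks
    last = prices[-1]
    return [(last + h * drift, sigma * math.sqrt(h)) for h in range(1, horizon + 1)]


# Hypothetical weekly prices in RMB, for illustration only.
weekly_prices = [150.0, 151.0, 153.0, 152.0, 154.0, 156.0, 155.0, 157.0]

forecast = drift_forecast(weekly_prices, horizon=10)
for week, (point, std) in enumerate(forecast, start=1):
    print(f"week {week:2d}: {point:7.2f} RMB (std about {std:.2f})")
```

Because the uncertainty widens at every step, forecasts far beyond a few weeks carry little practical guidance, which is consistent with the choice of a short horizon here.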

In terms of the overall price level of the four products, the current year of Dayi 7542 raw tea shows a large price gain and a rising trend over the next few weeks, which objectively reflects this product's price behavior in recent years. The forecast price of the current year of Dayi 7572 ripe tea over the next few weeks is only about 100 RMB; the price of 2011 Dayi 7542 raw tea rises little, by at most 4.4 RMB; and the price of 2011 Dayi 7572 ripe tea over the next 10 weeks is nearly 160 RMB, close to the current price. It can be seen that raw tea is better than ripe tea in both price and room for future appreciation. The price forecasts for the four products reflect both the industry's current situation and the future trend of the Pu'er tea market.

4.1 Among the four products, the current year of Dayi 7542 raw tea has the highest price, and it continues to increase substantially. The main reason is that raw tea is not suitable for drinking immediately after it is produced; after long-term storage, its color and taste improve, so consumers regard it as a product for long-term storage and buy it for later drinking. Pu'er tea also has investment value and the characteristics of a financialized, futures-like product, and investors regard it as a product that can be stored and held as a long-term investment, so the raw tea price is higher than that of the other three products. However, it is still in the advanced phase of the first stage of commodity financialization, and its degree of financialization is relatively low compared with red sandalwood, jade and real estate, which are in the second stage [26] (see Footnote 1), so its forecast price does not fluctuate drastically.

4.2 The predicted price of the current year of Dayi 7572 ripe tea in the next 10 weeks is the lowest among the four products, at only about 100 RMB. The reasons are mainly as follows. First, because of its taste and fermentation technology, consumers ignored the value of ripe tea for a long time. After 2013, tea companies improved ripe tea quality by improving raw materials and process technology, thus increasing demand. Second, the consumer market has a habit of “drinking ripe tea, collecting raw tea and tasting old tea,” so between drinking value and collecting value, the main value of ripe tea is drinking. Third, across the Pu'er tea market as a whole there is serious excess capacity, with supply far exceeding demand.

4.3 2011 Dayi 7542 raw tea is representative of the Pu'er old tea market. According to industry standards, tea aged 10–15 years meets the mid-aged tea standard, with better quality and higher drinking value, and this product is close to that standard. It should therefore be a popular product in the consumer market and command a higher price, but the forecast for the next 10 weeks shows a maximum price increase of only 4.4 RMB. The main reason is excessive storage: Pu'er tea storage rose from 337,800 tons in 2012 to 718,900 tons in 2019. Although market recognition and brand awareness of Pu'er tea have increased year by year, with sales rising from 47,000 tons in 2012 to 91,500 tons in 2019, there is too much tea in storage and a large quantity of fresh tea products on the market, so supply far exceeds demand. Thus, although the 2011 product is in its best drinking period, its future price rise still cannot reflect its actual value.
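The storage and sales figures quoted in 4.3 can be compared on a common basis by computing their compound annual growth rates and absolute increases. The sketch below uses only the four figures stated in the text; the helper name `cagr` is introduced here for illustration.

```python
def cagr(start, end, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1 / years) - 1


# Figures quoted in the text (tons), for 2012 and 2019.
storage_2012, storage_2019 = 337_800, 718_900
sales_2012, sales_2019 = 47_000, 91_500
years = 2019 - 2012

storage_growth = cagr(storage_2012, storage_2019, years)
sales_growth = cagr(sales_2012, sales_2019, years)
added_storage = storage_2019 - storage_2012  # tons added to storage over the period
added_sales = sales_2019 - sales_2012        # increase in annual sales over the period

print(f"storage: {storage_growth:.1%} per year, +{added_storage:,} tons")
print(f"sales:   {sales_growth:.1%} per year, +{added_sales:,} tons")
```

Both series roughly doubled, but storage grew from a base about seven times larger than annual sales, so the absolute gap between stock and sales widened over the period.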

4.4 The artificial pile fermentation process of ripe tea leaves little room for subsequent transformation. Its main value is drinking: it can be drunk immediately after fermentation and does not need long storage. 2011 Dayi 7572 ripe tea is priced higher than the current year of Dayi 7572 ripe tea, so consumers tend to choose the current-year ripe tea. Although the 2011 tea has reached the mid-aged tea standard, its quality has changed little through aging; its price is similar to that of the Dayi 7542 raw tea of the same year, but it is inferior in quality and taste. It therefore has no advantage in price or quality, and the predicted price of nearly 160 RMB is to some extent inevitable.

5 Conclusion and suggestions

5.1 Conclusion

This paper first takes the weekly prices of the four products from June 12, 2012, to May 3, 2020, as sample data and uses the ARIMA model to make short-term price forecasts. Second, the annual prices of the four products from 2012 to 2019 and the annual data of the impact factors are taken as sample data. From the macro and micro perspectives, a BP model is established based on the weight ranking of 16 price impact factors obtained by the TOPSIS method, and the long-term price predictions obtained from different combinations of impact factors are verified.

The results show that ARIMA is more suitable for short-term prediction and that its prediction error increases as the prediction period lengthens. The long-term price predictions of the BP model show that its accuracy for the current year of Dayi 7542 raw tea is low and the error is large, while the price predictions for the other three products contain some error but still have reference value.

Therefore, this paper uses the ARIMA model to make short-term price forecasts for the next 10 weeks. The results show that the price of the current year of Dayi 7542 raw tea will increase greatly and keep rising over the next few weeks, while the price of the current year of Dayi 7572 ripe tea will be only about 100 RMB. The price of 2011 Dayi 7542 raw tea will not increase much, and the price of 2011 Dayi 7572 ripe tea will be nearly 160 RMB in the next 10 weeks, close to the current actual price. The main reason is that Pu'er tea has always been sought after and hyped by consumers and investors, whether for drinking or investment, so its price has gradually increased. At the same time, Pu'er tea storage has increased year by year, leading to serious excess capacity and an imbalance between supply and demand, so even though a product has reached its optimal drinking period, its price has not increased much and does not reflect its real value. Moreover, with labor costs rising in recent years, tea farmers and tea enterprises lose money even when Pu'er tea prices rise slightly or hold steady, and in some cases no one harvests the tea. Low prices also lead to reduced research and inadequate innovation and promotion capacity, which is not conducive to the healthy development of the industry. In view of the current situation and problems of the industry, this paper puts forward the following suggestions.

5.2 Suggestions

5.2.1 Pu'er tea has both drinking and investment value. This unique advantage brings considerable benefits to Pu'er tea, but it also brings hidden dangers. Too much attention to investment value leads to a sharp increase in market stock. So far, there are 718,900 tons of Pu'er tea in storage, and annual production is 148,800 tons, while annual sales are only 91,500 tons, resulting in excess capacity and a serious imbalance between supply and demand. If this stock flows onto the market in large quantities, it will cause a precipitous drop in tea prices and seriously harm the healthy development of the industry. Therefore, in view of the current situation, the industry should emphasize the drinking value of Pu'er tea and reduce market stock, so that Pu'er tea can truly enter people's daily life as an everyday drink, accelerating market consumption and increasing effective demand. The government should give scientific guidance to industry practitioners according to the market situation and future development trends. Tea farmers should not plant blindly and should adhere to the sustainable development of tea resources through scientific planting and picking. Tea enterprises should adjust their inventory according to their actual situation.
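The scale of the imbalance described in 5.2.1 follows directly from the three quoted figures. The short calculation below is a sketch using only those numbers; the derived quantities (annual surplus, years of stock cover, share of production sold) are introduced here for illustration and assume that sales stay at the current annual level.

```python
# Figures quoted in the text (tons).
storage = 718_900           # Pu'er tea currently in storage
annual_production = 148_800
annual_sales = 91_500

# Tea added to stock each year if production and sales stay unchanged.
annual_surplus = annual_production - annual_sales

# Years needed to sell the existing stock at the current sales rate,
# assuming no new production at all.
years_of_cover = storage / annual_sales

# Share of each year's production that is actually sold.
share_sold = annual_sales / annual_production

print(f"annual surplus: {annual_surplus:,} tons")
print(f"stock cover:    {years_of_cover:.1f} years of sales")
print(f"share sold:     {share_sold:.0%} of annual production")
```

Under these assumptions the existing stock alone exceeds seven years of sales, and the stock keeps growing each year, which is why reducing market stock and raising effective demand are suggested together.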

5.2.2 The consumer group of Pu'er tea is becoming younger, and the post-1990s and post-2000s generations have gradually become the main consumers. They have a wide range of channels for learning about new things, pay more attention to the timeliness of information and have new requirements for the raw materials, prices and packaging of tea products. The Pu'er tea industry should adapt to this change: speed up publicity and promotion through the Internet, expand its consumer base, understand consumers' needs and buying intentions through data analysis, carry out targeted product promotion, meet personalized needs and cultivate loyal consumer groups. To support this, the government should strengthen infrastructure, improve Internet penetration, provide skills training for tea-related personnel and improve the overall quality of employees. Tea farmers and tea companies should use the Internet to understand customers' needs in light of their actual situation and adjust their business strategies according to customer feedback.

5.2.3 The Pu'er tea industry has more than 8 million tea farmers, more than 11 million tea workers and 1751 tea production enterprises. From planting, picking and processing to reprocessing, Pu'er tea production involves more than 30 major processes, and every step needs scientific evaluation. Making good-quality Pu'er tea requires science, careful data analysis and strict operation, but most practitioners rely on past production experience, so tea quality is inconsistent and production is difficult to standardize. There are few large-scale tea companies in the industry, and few enterprises have high brand awareness. Product quality varies, which easily leads to adverse selection among consumers. Therefore, tea practitioners should strictly control quality, improve production processes, describe product information accurately and enhance brand awareness in order to restore consumer confidence and make prices truly reflect product value.