1 Introduction

With the ever-increasing significance of model pigs in fields such as the life sciences, medicine, and health, the identification and tracking of model pigs have become a hot issue in machine vision research [1, 2]. Based on videos of model pig motion captured by pig farm video monitoring systems, comparative research on the detection and tracking of pig objects has been conducted with machine vision [3, 4]. In combination with the living habits of pigs, abnormality evaluation systems built on pig movement tracks have been established [5–7]. However, due to the complexity of the environment and the interference of similar backgrounds, existing methods fall short in recognizing the target object. Complex environment: there are railings, fodder, and so on within the breeding environment [8, 9]. Similar background: (1) the colors of the model pigs are indistinguishable from their surroundings [10, 11]; (2) the skin colors of different model pigs are similar [12, 13]. Moreover, in light of the strong real-time demands of a typical tracking system, stringent requirements are imposed on the computational complexity of the target recognition and tracking algorithm: the moving targets must be accurately identified and tracked with only a handful of calculations [14–16].

In view of these challenges, we develop a probabilistic graphical model (PGM) mechanism that simultaneously suppresses the probability values of the background and the target in the projected image, with the background suppressed more strongly, and we put forward a novel Camshift tracking approach based on a correlation probability graph, called CamTracorPG. CamTracorPG effectively achieves a good tradeoff among recognition accuracy, scalability, and computational expense.

The remainder of this paper is organized as follows. In Section 2, the novel Camshift tracking method is put forward. In Section 3, we compare experimental results in three different cases and demonstrate the superiority of the proposed method. Related work is briefly surveyed in Section 4. In Section 5, we summarize the paper and point out future research directions.

2 The design of the tracking method

2.1 Correlated calculation of probabilistic projection graphs

The Camshift tracking method employs the chrominance (H) information in the HSV color space to establish a histogram model of the target [17, 18]. Subsequently, on the basis of the established histogram, an inverse probability projection graph of the target in the tracking window is set up [19, 20]. Since the histogram model is relatively simple to establish, the computational cost of the tracking method is low [21, 22].

Assume that the chromaticity value in the HSV color space is divided into \(m\) levels [23] and that the target area contains a total of \(S\) pixels, where the coordinate position of the \(i\)th pixel is \((x_{i},y_{i})\), \(i=1,2,\ldots,S\), and the corresponding chromaticity value of that point is \(b(x_{i},y_{i})\) [24, 25]. Then the chromaticity histogram model of the target area is obtained by the following formula.

$$ \begin{aligned} &{\mathbf{q}}= \left\{{q_{u}}\right\},u=1,2,\ldots,m \\ &q_{u}=\sum\limits_{i = 1}^{S} \delta[b(x_{i},y_{i})-u] \end{aligned} $$
(1)

Let the chromaticity value of the pixel at position \((x,y)\) in the tracking window be \(u\). According to the target histogram model of Eq. (1), the inverse projection probability value at this point can be obtained:

$$ p(x,y)=\frac{q_{u}}{\text{max}\left\lbrace q_{i} \mid i = 1,2,\ldots,m\right\rbrace } $$
(2)

The pixel gray value represents the probability value, and the projection gray value \(p_{g}(x,y)\) corresponding to the above inverse probability value is:

$$ p_{g}(x,y)=\left\lfloor \frac{q_{u}}{\text{max}\left\lbrace q_{i} \mid i = 1,2,\ldots,m\right\rbrace }\times 255\right\rfloor $$
(3)

The symbol \(\lfloor \cdot \rfloor\) denotes rounding down (the floor operation). Computing the inverse probability projection gray value of Eq. (3) for every pixel within the tracking window yields the inverse probability projection graph, whose gray values range from 0 to 255. In particular, a pixel with gray value 255 appears white, meaning that it belongs to the target area with high probability, while a pixel with gray value 0 appears black, indicating that the probability it belongs to the target area is comparatively tiny.
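To make Eqs. (1)–(3) concrete, the following minimal NumPy sketch computes the histogram model and the inverse probability projection gray values. It assumes the chroma channel has already been quantized into \(m\) levels; the function names and the quantization step are illustrative, not taken from the original implementation.

```python
import numpy as np

def chroma_histogram(h_target, m):
    # Eq. (1): un-normalized chromaticity histogram of the target area;
    # h_target holds the quantized chroma level b(x_i, y_i) of each pixel.
    return np.bincount(h_target.ravel(), minlength=m).astype(np.float64)

def inverse_projection_gray(h_window, q):
    # Eq. (2): back-project the histogram onto the tracking window, then
    # Eq. (3): scale the probabilities to gray values in [0, 255].
    p = q[h_window] / q.max()
    return np.floor(p * 255).astype(np.uint8)
```

With OpenCV-style HSV data, a plausible quantizer is `h = hue * m // 180`, which maps the 0–179 hue channel onto \(m\) levels before calling the functions above.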

The target inverse probability projection graph described above considers only the individual information of each pixel and does not take into account the correlation between adjacent pixels. Therefore, if the chromaticity of the background is similar to that of the target, background pixels will also obtain high probability values in the inverse projection graph, which causes severe interference with object identification.

Considering this challenge, there are two possible remedies: increase the probability gray values of the target area within the inverse projection graph, or suppress the probability gray values of the background area. Hence, we propose a chromaticity correlation calculation on the inverse probability projection graph obtained by Eq. (2): each pixel is associated with the probability values of its surroundings in order to determine its inverse projection correlation probability value. The distribution of a pixel \(a_{0,0}\) and its surrounding pixels in the inverse projection graph is depicted in Fig. 1.

Fig. 1 Pixel distribution

Here, \(a_{i,j}\) denotes the \(j\)th pixel on the \(i\)th ring around \(a_{0,0}\), and the probability value calculated by Eq. (2) for this pixel is \( p^{0}_{i,j} \). A correlation calculation is performed on the probability value \(p^{0}_{0,0}\) of the pixel \(a_{0,0}\) to obtain \(p^{1}_{0,0}\):

$$ p^{1}_{0,0} = \frac{1}{N}\left[\sum\limits_{i = 1}^{N}{\frac{1}{8i}\sum\limits_{j = 1}^{8i} p^{0}_{0,0} \cdot p^{0}_{i,j}}\right] $$
(4)

According to Eq. (4), the results of multiple correlation calculations can be deduced. For example, the k-time correlation result for pixel \(a_{0,0}\) is:

$$ p^{k}_{0,0} = \frac{1}{N}\left[\sum\limits_{i = 1}^{N}{\frac{1}{8i}\sum\limits_{j = 1}^{8i} p^{k-1}_{0,0} \cdot p^{k-1}_{i,j}}\right] $$
(5)

Although the probability values of the target area are also suppressed by Eq. (5), the probability values of the background area are suppressed far more significantly. In this way, we achieve a good tradeoff between highlighting the target area and restraining the background area, which ultimately makes the target more prominent. Next, the probability values obtained by Eq. (5) are normalized, and the inverse projection probability gray value is calculated as follows:

$$ {p_{i,j}}=\left\lfloor \frac{p^{k}_{i,j}}{p^{k}_{\text{max}}}{\times255}\right\rfloor $$
(6)

Here, \(p^{k}_{\text {max}}\) denotes the maximum probability value in the tracking window after the k-time correlation calculation, that is, \(p^{k}_{\text {max}}= \text {max}\left \lbrace p^{k}_{i,j}\right \rbrace \). Identifying and locating the target with the inverse projection probability graph obtained after the correlation calculation considerably improves the accuracy of target recognition.
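As a concrete reading of Eqs. (4)–(6), the sketch below assumes that ring \(i\) in Fig. 1 is the square ring of \(8i\) pixels at Chebyshev distance \(i\) from the center, and that window borders are handled by edge replication, a detail the paper does not specify.

```python
import numpy as np

def ring_mean(p, i):
    # Mean probability over the square ring of radius i around each pixel;
    # the ring at Chebyshev distance i contains exactly 8i pixels.
    padded = np.pad(p, i, mode="edge")
    h, w = p.shape
    total = np.zeros_like(p, dtype=np.float64)
    for dy in range(-i, i + 1):
        for dx in range(-i, i + 1):
            if max(abs(dy), abs(dx)) == i:  # keep only the ring itself
                total += padded[i + dy:i + dy + h, i + dx:i + dx + w]
    return total / (8 * i)

def correlate(p, N=1, k=1):
    # Eq. (5): apply the ring correlation k times, then Eq. (6): rescale
    # the result to gray values in [0, 255].
    for _ in range(k):
        acc = sum(ring_mean(p, i) for i in range(1, N + 1)) / N
        p = p * acc  # p^{k} = p^{k-1} times the averaged ring means
    return np.floor(p / p.max() * 255).astype(np.uint8)
```

With N = 1 and k = 1, as used in the experiments of Section 3, each probability is simply multiplied by the mean of its eight neighbors, which damps isolated background responses more strongly than the spatially coherent target region.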

2.2 Algorithm description

Specifically, the target recognition and tracking algorithm proceeds as follows:

Step 1. Select the tracked target and employ Eq. (1) to establish its chromaticity histogram model \(\mathbf{q}=\left\{ q_{u}\right\}\), \(u=1,2,\ldots,m\).

Step 2. Calculate the inverse projection probability values by Eq. (2).

Step 3. Compute the k-time correlation probability values through Eq. (5).

Step 4. Build the inverse projection probability graph using Eq. (6).

Step 5. Compute the zeroth- and first-order moments of the search window based on the gray values of the inverse probability projection graph:

$$ {M_{00}} =\sum\limits_{x}^{} {\sum\limits_{y}^{} {p_{x,y}}} $$
(7)
$$ {M_{10}} =\sum\limits_{x}^{} {\sum\limits_{y}^{}x{p_{x,y}}} $$
(8)
$$ {M_{01}} =\sum\limits_{x}^{} {\sum\limits_{y}^{}y{p_{x,y}}} $$
(9)

Step 6. Compute the centroid position \((x_{c},y_{c})\) of the search window using the zeroth- and first-order moments obtained in step 5:

$$ {x_{c}} = \frac{M_{10}}{M_{00}} $$
(10)
$$ {y_{c}} = \frac{M_{01}}{M_{00}} $$
(11)

Step 7. Adaptively adjust the side length of the search window:

$$ s =2 \sqrt {\frac{M_{00}}{256}} $$
(12)

The center of the search window is then shifted to the centroid of the search window, and the drift distance is compared with a preset threshold: if the drift distance is greater than the threshold, steps 5–7 are repeated; once the drift distance falls below the threshold, the algorithm continues with step 8.
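A minimal sketch of steps 5–7 (Eqs. (7)–(12)) follows, assuming `p` is the 2-D gray-value projection of the current search window; the 256 divisor in Eq. (12) reflects gray values capped at 255.

```python
import numpy as np

def centroid_and_size(p):
    # Zeroth- and first-order moments of the window (Eqs. (7)-(9)).
    ys, xs = np.mgrid[0:p.shape[0], 0:p.shape[1]]
    m00 = p.sum()
    m10 = (xs * p).sum()
    m01 = (ys * p).sum()
    xc, yc = m10 / m00, m01 / m00   # centroid, Eqs. (10)-(11)
    s = 2 * np.sqrt(m00 / 256)      # adaptive side length, Eq. (12)
    return (xc, yc), s
```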

Step 8. Calculate the second-order moments of the search window based on the correlation probability values:

$$ {M_{11}} =\sum\limits_{x}^{} {\sum\limits_{y}^{}xy {p_{x,y}}} $$
(13)
$$ {M_{20}} =\sum\limits_{x}^{} {\sum\limits_{y}^{}x^{2} {p_{x,y}}} $$
(14)
$$ {M_{02}} =\sum\limits_{x}^{} {\sum\limits_{y}^{}y^{2} {p_{x,y}}} $$
(15)

According to the second-order moments obtained above, the following three parameters are calculated:

$$ a=\frac{M_{20}}{M_{00}}-x^{2}_{c} $$
(16)
$$ b=2\left(\frac{M_{11}}{M_{00}}-{x_{c}}{y_{c}}\right) $$
(17)
$$ c=\frac{M_{02}}{M_{00}}-y^{2}_{c} $$
(18)

Accordingly, on the basis of the obtained parameters, the size and direction of the target area are adaptively updated as in (19), (20), and (21). The length of the target area is updated as:

$$ L=\sqrt{\frac{(a+c)+\sqrt{b^{2}+(a-c)^{2}}}{2}} $$
(19)

The width of the target area is updated as:

$$ W=\sqrt{\frac{(a+c)-\sqrt{b^{2}+(a-c)^{2} }}{2}} $$
(20)

The direction of the target area is updated to:

$$ \theta=\frac{1}{2}\arctan\left(\frac{b}{a-c}\right) $$
(21)

At this point, the recognition and tracking of the current frame are complete.
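A corresponding sketch of step 8 (Eqs. (13)–(21)) is given below, reusing the centroid \((x_{c},y_{c})\) from step 6; `np.arctan2` replaces the paper's \(\arctan(b/(a-c))\) to avoid division by zero when \(a = c\), which is an implementation choice on our part.

```python
import numpy as np

def target_ellipse(p, xc, yc):
    # Second-order moments of the converged window (Eqs. (13)-(15)).
    ys, xs = np.mgrid[0:p.shape[0], 0:p.shape[1]]
    m00 = p.sum()
    m11 = (xs * ys * p).sum()
    m20 = (xs ** 2 * p).sum()
    m02 = (ys ** 2 * p).sum()
    a = m20 / m00 - xc ** 2              # Eq. (16)
    b = 2 * (m11 / m00 - xc * yc)        # Eq. (17)
    c = m02 / m00 - yc ** 2              # Eq. (18)
    root = np.sqrt(b ** 2 + (a - c) ** 2)
    L = np.sqrt(((a + c) + root) / 2)    # length, Eq. (19)
    W = np.sqrt(((a + c) - root) / 2)    # width, Eq. (20)
    theta = 0.5 * np.arctan2(b, a - c)   # direction, Eq. (21)
    return L, W, theta
```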

Step 9. Return to step 1, move on to the next frame, and re-identify, locate, and track the target by employing steps 1–8.

Through the above steps of our proposed CamTracorPG, a model pig can be tracked and recognized more effectively, with better precision, in a scalable manner. In the tracking process described above, the target is recognized and tracked well by means of multi-correlation probability grayscales. A sketch of the complete procedure is given below, and the performance of our proposal is evaluated in Section 3.
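Since the original pseudocode figure is not reproduced in this version, the following Python-style sketch reconstructs steps 1–9 under stated assumptions: it reuses the helper functions sketched above (`chroma_histogram`, `correlate`, `centroid_and_size`, `target_ellipse`), the hue quantizer is hypothetical, and boundary clamping of the window is omitted for brevity.

```python
import numpy as np

def quantize_hue(frame_hsv, m=16):
    # Hypothetical quantizer: map an OpenCV-style hue channel (0..179)
    # onto m chroma levels.
    return (frame_hsv[..., 0].astype(np.int64) * m) // 180

def cam_trac_or_pg(frames_hsv, window, m=16, N=1, k=1, eps=1.0, max_iter=20):
    # frames_hsv: iterable of HSV frames; window: initial (x, y, w, h).
    x, y, w, h = window
    frames_hsv = iter(frames_hsv)
    first = next(frames_hsv)
    q = chroma_histogram(quantize_hue(first, m)[y:y+h, x:x+w], m)  # step 1
    for frame in frames_hsv:
        hq = quantize_hue(frame, m)
        for _ in range(max_iter):                        # steps 2-7
            p = q[hq[y:y+h, x:x+w]] / q.max()            # Eq. (2)
            p = correlate(p, N, k).astype(np.float64)    # Eqs. (4)-(6)
            (xc, yc), s = centroid_and_size(p)           # Eqs. (7)-(12)
            dx, dy = xc - w / 2, yc - h / 2              # centroid drift
            x, y = int(x + dx), int(y + dy)
            w = h = max(int(s), 4)                       # adaptive window
            if np.hypot(dx, dy) < eps:                   # drift < threshold
                break
        L, W, theta = target_ellipse(p, xc, yc)          # step 8
        yield (x, y, w, h), (L, W, theta)                # step 9
```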

3 Results and discussion

To verify the effectiveness of the proposed method, it is compared with the basic Camshift method and the multi-feature fusion Camshift method, both under a similar background and in an actual complex environment. The parameters of the chromaticity correlation calculation are selected as N = 1, k = 1.

The experimental data come from the Zhuozhou experimental demonstration base of China Agricultural University. The data total 425 GB, span 160 days, and record the whole process of the model pig from entry to exit. The test data are randomly selected from the full data set to compare detection and tracking performance, and some frames with large interference are selected for comparative analysis.

3.1 Motion target tracking in actual complex environment

In many practical applications, tracking a target is complicated, and there are a large number of interference areas [26–28]. The three tracking methods mentioned above are used to track model pigs in an actual farm. As shown in Figs. 2, 3, 4, 5, 6, and 7, the body color of the model pig is solid black, the background in the scene is also grayish black, and the illumination is not uniform. When one model pig is selected for tracking, the other becomes a disturbance; it is then difficult for a tracking algorithm to distinguish between the two model pigs, and the tracking task cannot be accomplished.

Fig. 2 Tracking results of the basic Camshift method

Fig. 3 Probability projection graph of the basic Camshift method

Fig. 4 Tracking results of the multi-feature fusion Camshift method

Fig. 5 Probability projection graph of the multi-feature fusion Camshift method

Fig. 6 Tracking results of the method in this paper

Fig. 7 The associated probability projection graph of the method in this paper

In this paper, the small target at the right eye of the left model pig is selected as the tracking object, and a motion tracking experiment is performed on the pig. The tracking results obtained by the above methods and the corresponding probability projection graphs are shown in Figs. 2, 3, 4, 5, 6, and 7.

As can be seen from Figs. 2 and 3, with the basic Camshift tracking method, when the pig's posture changes, disturbances such as changes in illumination reduce the difference in chromaticity between the target and the background. As a result, the target is easily mislocalized, which leads to inaccurate tracking.

As can be seen in Figs. 4 and 5, with the multi-feature fusion Camshift tracking method, the fusion of several features can partly overcome the weakening of the target features caused by changes in illumination and posture.

However, because the tracking template is small, there is little obvious difference between the fused features and the background features, resulting in low probability values in the inverse projection graph. The algorithm therefore still lacks strong anti-interference ability; eventually, the target is incorrectly positioned, and the tracking task under this complex background cannot be completed.

As can be seen in Figs. 6 and 7, the proposed method builds the projection probability graph from the target's correlation probability results. Although the target's inverse projection probability values are close to those of the background, the correlation calculation at each pixel position suppresses the probability values of the background area around the target; that is, a relatively salient target region is obtained in the local area. The interference caused by illumination changes and by the background is thereby overcome, valid localization of the target area is guaranteed, and the tracking task in a complex background is completed.

On this basis, the tracking accuracy of the three approaches is tested and compared, and the average overlap rates of the tracking targets for the three methods are presented in Fig. 8.

Fig. 8 The average overlap rate of the tracking targets for the three methods

As shown in Fig. 8, with the basic Camshift tracking method, the average overlap rates of the tracking targets are 71% (136th frame), 66% (288th frame), 49% (304th frame), and 28% (410th frame); with the multi-feature fusion Camshift method, they are 60% (136th frame), 39% (288th frame), 34% (304th frame), and 20% (410th frame); and with the method in this paper, they are 91% (136th frame), 85% (288th frame), 88% (304th frame), and 89% (410th frame). The proposed method thus outperforms the two commonly used methods in these experiments.

The average overlap rate is one of the core criteria in target tracking evaluation. We further analyze the proportion of trajectories on which the target is mostly tracked, that is, the MT metric. As shown in Fig. 9, with the basic Camshift tracking method, the MT rates of the tracking targets are 85% (1 h), 78% (2 h), 71% (3 h), and 60% (4 h); with the multi-feature fusion Camshift method, they are 80% (1 h), 70% (2 h), 50% (3 h), and 10% (4 h); and with the method in this paper, they are 90% (1 h), 88% (2 h), 85% (3 h), and 83% (4 h). Again, the proposed method outperforms the two commonly used methods.

Fig. 9 The MT rate of the tracking targets for the three methods

4 Related work

Owing to its stability in practical applications, the Meanshift algorithm has become one of the most effective techniques in various target tracking fields [29]. In this section, we briefly survey related tracking approaches from two perspectives: the Camshift algorithm and methods that combine multiple features.

4.1 Camshift algorithm

The Camshift algorithm, an improvement of the Meanshift algorithm [30], adjusts the target size adaptively. At the same time, its computational cost is low enough to meet the requirements of real-time tracking [31]. However, the Camshift algorithm is mainly suitable for tracking salient targets. That is to say, when the difference between the target and the surrounding tones is obvious, the Camshift algorithm obtains ideal performance; when the target is similar to the surrounding background, the target is submerged in the background and cannot be identified or tracked. This is one of the key technical challenges in target identification and tracking [32]; in other words, the complexity of the environment and the similarity of the background have a significant influence on the performance of a target identification algorithm [33]. Even if the environment or background looks straightforward, the target will be submerged in the background and be difficult to identify and track effectively whenever the chromaticity of the background is similar to that of the tracked target [34].
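For reference, the baseline Camshift tracker is available through OpenCV's public API; a minimal usage example follows, in which the video path and the initial region of interest are placeholders.

```python
import cv2

cap = cv2.VideoCapture("pig_farm.mp4")       # placeholder video source
ok, frame = cap.read()
x, y, w, h = 300, 200, 80, 60                # placeholder initial ROI
roi = cv2.cvtColor(frame[y:y+h, x:x+w], cv2.COLOR_BGR2HSV)
hist = cv2.calcHist([roi], [0], None, [180], [0, 180])   # hue histogram
cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    back = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
    box, (x, y, w, h) = cv2.CamShift(back, (x, y, w, h), crit)  # rotated box + window
```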

4.2 The method of combining multiple features

Methods combining multiple features are employed to compensate for the failure of a single feature to identify the target accurately [35, 36]. For example, to enhance targeting accuracy, a promising approach is to add features such as texture and edges on top of the chromaticity features, exploiting the complementary relationship between multiple features. In practice, however, features such as chromaticity, texture, and edges may also interfere with one another [37]. For instance, if an area of the background is similar to the texture or edges of the target, the tracking effect may be considerably compromised [38].
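As a schematic illustration of the fusion idea, and of why mutual interference arises, the sketch below combines two hypothetical per-pixel probability maps multiplicatively; the maps and the weight are placeholders, not the fusion rule of any specific method cited here.

```python
import numpy as np

def fuse(p_chroma, p_texture, alpha=0.5):
    # Weighted geometric mean of two back-projection maps: a pixel scores
    # high only if both features agree, but a background patch that matches
    # one feature can still leak through when alpha is unbalanced.
    fused = (p_chroma ** alpha) * (p_texture ** (1 - alpha))
    return fused / fused.max()
```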

5 Conclusion

To deal with the problem of model pig tracking under similar and complex backgrounds, a Camshift tracking approach based on correlation probability calculation is proposed. Specifically, each pixel in the inverse projection graph is correlated with the projection probability values of its surroundings, which effectively suppresses the probability values of the background area while, in effect, highlighting the relative probability values of the target area, so that the target is not submerged in the background; this fulfills the requirement of improving target tracking performance. The extensive experiments show that, on the one hand, when the model pigs have chromaticity characteristics similar to the background area, the approach can clearly separate the target from the background; on the other hand, when the target is in a complex background, the method can effectively suppress the probability gray values of the interference area, while realizing a good tradeoff between the accuracy and effectiveness of model pig identification.