Abstract

Optical coherence tomography (OCT) is a noninvasive cross-sectional imaging technology used to examine the retinal structure and pathology of the eye. Evaluating the thickness of the choroid using OCT images is of great interests for clinicians and researchers to monitor the choroidal thickness in many ocular diseases for diagnosis and management. However, manual segmentation and thickness profiling of choroid are time-consuming which lead to low efficiency in analyzing a large quantity of OCT images for swift treatment of patients. In this paper, an automatic segmentation approach based on convolutional neural network (CNN) classifier and - () fitter is presented to identify boundaries of the choroid and to generate thickness profile of the choroid from retinal OCT images. The method of detecting inner choroidal surface is motivated by its biological characteristics after light reflection, while the outer chorioscleral interface segmentation is transferred into a classification and fitting problem. The proposed method is tested in a data set of clinically obtained retinal OCT images with ground-truth marked by clinicians. Our numerical results demonstrate the effectiveness of the proposed approach to achieve stable and clinically accurate autosegmentation of the choroid.

1. Introduction

Choroid is the vascular layer located between retina and sclera. Its inner surface is connected with the retinal pigment epithelium (RPE) through Bruch’s membrane (BM), and the outer surface is connected with the sclera. Recent researches indicated that the changes of the choroidal thickness could be related to some ocular conditions such as macular degeneration and myopia [14]. Therefore, segmentation and the ability to accurately measure the thickness of the choroid are of clinical importance.

Optical coherence tomography (OCT) is a technique for obtaining subsurface images of translucent or opaque materials with high resolution [5, 6]. It uses low-coherence interferometry and imaging reflections from interior tissues to generate cross-sectional images. Comparing with traditional imaging methods, OCT has some obvious advantages of being nondestructive, high resolution, and minimally invasive, and it has been widely applied in ocular detections for many years [3, 7, 8]. However, the imaging quality of the choroid is not good enough in retinal OCT images due to the shortage of penetration depth [9]. The major challenges of the choroidal segmentation are from low contrast of the lower boundary and unknown noise in the images, which will make the detection result inaccurate and unreliable.

To segment the choroid efficiently, many researchers studied model-based methods with prior assumptions for the structure of the input images: A-scan [10, 11], active contour [1215], sparse high order potentials [16], and 2D/3D graph [1722] methods. The main limitation of these traditional approaches is their high dependence on the feature extraction phase for the accurate segmentation. However, extraction of appropriate image features is difficult for a definite medical image recognition problem, and the traditional methods may provide disappointing segmentation results. In recent years, machine learning-based methods have achieved excellent performance in computer vision and medical image analysis. Convolutional neural network (CNN) is one of the extensive application approaches for image processing and also effective for multilayer segmentation in OCT images. Sui et al. used a graph-searching-based segmentation technique with learning an optimal graph weight by using CNN architecture [23]. Masood et al. proposed a two-stage segmentation method to segment out the BM and choroid and calculate the thickness map [24]. A series of morphological operations were used to segment BM, while the choroid was segmented using CNN. Fang et al. presented a novel framework combining CNN and graph search methods (CNN-GS) to segment nine layer boundaries on retinal OCT images [25]. Alonso-Caneiro et al. used a fully-convolutional network (FCN) technique based on graph search theory to segment the choroidal boundary and obtain the choroidal thickness profile from OCT images [26]. There are also some other outstanding networks for OCT image segmentation such as U-shape convolutional network (U-Net), which is considered as the most widely applicable architecture for medical image segmentation [27, 28].

It is worth noting that a major disadvantage of many machine learning-based methods, such as CNN-GS, FCN-GS, and U-net, is their reliance on the availability of a large supervised/marked data set. However, marking medical images manually requires highly professional technique, which leads to the lack of accurately labeled data in great quantity. In addition, the number of negative samples is far more than the number of positive samples, i.e., the pixels within a single OCT image are largely labeled as “0” (not contained in the Choroidal-Scleral interface boundary) as opposed to being positively labeled as “1” (contained in the Choroidal-Scleral interface boundary), which may affect the training results. In extreme cases, the loss of negative samples will dominate in the training process and may lead to a high accuracy even when the model predicts all the samples to be negative. Taking these into considerations, we propose an improved CNN model-based method which performs well on a small data set and reduces the adverse impact of unbalanced samples. With the choroid boundaries obtained by neural network, we further adopt a - () regression model to fit the choroidal layer curve. This model ensures not only the fitting accuracy but also the simplicity of fitting function, leading to a better generalization segmentation result.

This paper is organized as follows. In Section 2, we describe the details of the proposed method including segmentations of the BM and Choroidal-Scleral interface. Experimental evaluation and comparison with other methods are discussed in Section 3. Concluding remarks are given in Section 4.

2. Materials and Methods

2.1. OCT Data

OCT images were obtained in Chinese schoolchildren aged 8-13 years using spectral domain OCT (SD-OCT, Spectralis HRA+OCT, Heidelberg Engineering, Germany) in Optometry Research Clinic of The Hong Kong Polytechnic University. Consents were obtained from both children and their parents/guardians. The study protocol has been approved by the Human Subjects Ethics Subcommittee of The Hong Kong Polytechnic University and met the tenets of the Declaration of Helsinki. Cross-sectional OCT images with axial resolution of 3.9 μm and transverse resolution of 14 μm were obtained using a light source (peak wavelength of 870 nm) together with a scanning speed of 40000 A-scan/sec. In order to better capture the boundary of choroid, an enhanced depth image scanning mode was adopted. Choroid was then manually segmented using the built-in software in SD-OCT by trained clinicians. There are 146 marked retinal OCT images from 108 patients in total, and we divide them into a training set of 70 images, while the rest are categorized as in the testing set. In our experiments, all the OCT images are preprocessed as grey images with the size of () pixels.

2.2. Overview

The proposed approach is divided into two parts: BM segmentation and Choroidal-Scleral interface segmentation. Physiological tissues mentioned in this work are visualized by diverse colors in Figure 1, and red curves mark the known ground-truth provided by clinicians. For BM segmentation, we start by recognizing the approximate position of RPE, which is the brightest layer in retinal OCT images. We then identify the BM by its physiological feature in the vicinity of RPE. After the extraction of the BM, we consider to segment the Choroidal-Scleral interface by a two-stage process: (1)Partition the OCT image into small patches and input them into the CNN based classifier. The likelihood of the Choroidal-Scleral interface passing through the patch increases with the predicted value’s proximity to 1.(2)Generate a heat map according to the predicted values of patches. Choose appropriate points in the heat map and fit these points to obtain the curve of the Choroidal-Scleral interface.

The details will be discussed in the later sections.

Our numerical experiments in this paper are implemented in Tensorflow 1.13.1, Python 3.6.6, and Cuda 9.0, running on a server with 2 Tesla P100-PCIE GPU with 16 GB memory at 1.3285 GHz and an operating system of 64 bits in the University Research Facility in Big Data Analytics (UBDA) of the Hong Kong Polytechnic University. (UBDA website: https://www.polyu.edu.hk/ubda/.)

2.3. Retinal Pigment Epithelium Recognition

Since RPE is the brightest part in all retinal OCT images after the reflection of light, we can locate the approximate position of RPE. Regarding the OCT image as a matrix, the point with the largest pixel value in each column can be found and denoted by , . Fitting the points and then we can obtain a curve lying in the region of RPE. Here, we adopt the cubic regression method in consideration of the simplicity of the RPE curve and the high accuracy of extracted points.

2.4. Bruch’s Membrane Segmentation

From Figure 1, we can observe that the BM is at the boundary between RPE and the choroid, and below the curve extracted in the previous step. Therefore, we intend to find appropriate points , in the lower neighborhood of , where is with the largest difference in magnitude in the th column. In our experiments, the set is obtained in the region between and ; represents the curve whose point in each column is 5 pixels lower than the corresponding point in . The width of gap between two curves is decided by the observation that the thickness of RPE is about 5 pixels in OCT images. Finally, the curve acquired by fitting points is regarded as the BM. Some results of BM segmentation are shown in Figure 2; we can find that our result (green curve) and ground-truth of BM (red curve) coincide with each other. The error for each image is calculated by where and represent the corresponding numbers of row for th column in our result and ground-truth of BM, respectively. The average error and variance for 76 test images are 1.5189 and 1.1325.

2.5. Choroidal-Scleral Interface Segmentation

In this section, we present the details of the method based on CNN classifier and - () fitter to segment the Choroidal-Scleral interface.

2.5.1. Data Preprocessing and Clipping

In order to improve the recognition accuracy of the lower choroidal boundary, we first remove the irrelevant information in the OCT images above the BM. A large size of patch may lead to a small number of samples, while a small size of patch will result in the lack of features in samples. In our proposed approach, we cut images from the training set into a group of square patches with pixels from left to right, top to bottom with a step-size of 8.

For each minipatch, if the marked ground-truth passes through its center area, this minipatch will be labeled as “1” (positive sample) and as “0” (negative sample) otherwise. As shown in Figure 3, the minipatch in red is labeled as “1,” and the minipatch in yellow is labeled as “0.” The blue frame in the center is the recognition area with a size of pixels. In [24], the criterion to define positive samples is the marked ground-truth passes through the patch, otherwise is a negative sample. Thus, the number of positive samples made in [24] is more than ours, and the ratio of positive and negative samples is relatively balanced. However, the positive samples whose edges are passed through by the marked ground-truth may impact the extraction of features in training process. Therefore, we set the recognition area in our samples to avoid this problem. After finishing the labeling, the layer segmentation problem is converted into a binary classification problem.

2.5.2. CNN Training

The structure of the neural network used in our work is the Lenet-5 model [29]. It includes 3 convolutional layers and 3 fully connected layers, which is shown in Table 1. Each convolutional layer consists of a layer of convolution and function of local response normalization. The purpose of 3 continuous convolutional layers is to extract features and map the original data into a feature space. The kernel size in each convolution layer is . After the process of feature extraction, 3 fully connected layers are arranged to provide a classification.

Before applying this CNN model to segment the choroid, it is necessary to pretrain it by using training data. After data preprocessing and clipping, we can get 260000 patches with label as a revised training set. Among these patches, about 24000 samples are on-line samples with label 1, while the other 236000 patches are off-line samples with label 0. Note that the ratio of positive and negative samples is about 1 : 11 due to the way of making samples and specialty of images. As mentioned earlier, unbalanced samples will have a negative effect on the training result, and it is necessary to provide a reasonable solution. For clear display, we use to represent the CNN model; then, the relationship between the input patches and the output values can be denoted as where is the th patch of our input and is the set of parameters in the CNN model. The problem of unbalanced samples, the ratio of on-line samples, and off-line samples is about 1 : 11 and is treated by our adoption of a loss function called Focal Loss [30]: where is the label of the corresponding patch, ensures a higher weight for the rare online samples, and provides a higher weight for samples hard to be classified.

The Focal Loss, a dynamically scaled cross-entropy loss function, is proposed for dealing with samples’ imbalance in [30]. From formula (3), we can find that the value of loss function is reducing with a higher , which means the class with lower accuracy will dominate in network training.

Next, we need to minimize . Let , and the problem can be formulated as where denotes the total number of input patches. To prevent overfitting in CNN, Dropout and regularization have been used [3133]. In our numerical experiments, we add regularization in the objective function to prevent overfitting; hence, the regularized problem is written as

Adaptive Moment Estimation (Adam) optimizer [34] is applied to solve problem (5).

After finishing the training of the CNN model, we then apply it to segment choroid. Let be a matrix for the storage of output values corresponding to the image matrix. At the first stage, we need to implement the following operations: (1)Cut the input OCT image into a group of patches with size of from left to right, top to bottom with a step-size 3. Every patch matches a submatrix of with a size of ,(2)Input these patches into the trained CNN model one by one and obtain the corresponding predicted value, denoted by for the th patch, . All the values of elements in the corresponding submatrix are ,(3)The element value of corresponding to the th patch is replaced by . The element value of in the overlaps of two or four patches is replaced by a weighted average value. For example in Figure 4, if it lies on the overlaps of four patches: the th patch for , its element value is ,(4)Find the second largest element (i.e., the center element of three continuous largest elements) in each column of matrix as the desired point.

The results of the above process are shown in Figure 5, which is a special example with a lot of noises and mistaken points. Hence, we need to further recorrect the data.

2.5.3. Data Recorrection

From the above results, we notice that the major area of the choroid can be accurately detected. However, there may be some misjudgments because of the interference information in the image, and these errors will make a negative influence on the segmentation result if we directly fit all the sample points without filter. For this reason, the random sample consensus (RANSAC) algorithm [35] is applied to recorrect the detected points. RANSAC algorithm estimates parameters of a mathematical model from a set of observed data that contains “outliers” by iterations. Basic assumptions of RANSAC are in the following: (i)Data consist of “inliers”, i.e., the distribution of data can be explained by some set of model parameters.(ii)“Outliers” do not fit the model.(iii)Other data are regarded as noise data.

For RANSAC, it is also assumed that there exists a process of estimating model parameters for a set of given “inliers,” and this model can optimally explain or fit the data.

In addition, a shape constraint rule is enforced to help identifying the outlines, namely, the location of the choroid is always below the BM.

2.5.4. Fitter

This step sketches out the outer boundary of the choroid with recorrected data. Through the observation of the ground truth, a high-order polynomial is applied for curve fitting to improve accuracy. Moreover, to avoid the fitting function being too complicated for stability, we preferably set a relatively sparse group of coefficients . In recent years, regularization attaches attention and has advantages over smooth, convex regularization for variable selection. Motivated by this, a regression model consists of a data fitting term and a regularization term is used to ensure the sparsity of polynomial coefficients in this paper. The regression model is as follows: where

() denotes the set of recorrected coordinate points, is the vector consists of the fitting polynomial coefficients, , and is a parameter.

Some existing methods such as iteratively reweighted (IRL1) and (IRL2) minimization algorithms have been widely studied for solving problem (6), see [3639] and references therein. We adopt the hybrid orthogonal matching pursuit-smoothing gradient (OMP-SG) based on lower bound theory in [36] to solve problem (6). First, we use the OMP method to generate an initial point and its support set; then, the SG method is employed to further reduce the objective value of (6), and finally, the numerical solution is purified by deleting its entries with small values. The framework of the hybrid OMP-SG algorithm is given in Algorithm 1, where denotes the number of elements in the set .

Input:.
Step 1. Use the OMP method to obtain ,
Step 2. Use the SG method with as an initial point to solve
where . Let be the output of the SG method.
Step 3. Output a numerical solution satisfying

We refer the interested readers to [36] for more details on the OMP-SG algorithm. Moreover, the smoothing function used in the SG method is where

is a smoothing parameter. It is easy to verify that is continuously differentiable for any fixed .

Since , the objective function is bounded below and if . Moreover, the set of local minimizers of problem (6) is nonempty and bounded. According to Theorem 2.2 in [36], for any local minimizer of problem (6) satisfying , a lower bound theory of nonzero entries and an upper bound of are presented as follows.

Theorem 1. Let be a local minimizer of problem (6) satisfying for an arbitrarily given point . Let ; then, we have Moreover, the number of nonzero entries in is bounded by

From the upper bound in Theorem 1, the number of nonzero elements of is less than . The sparsity of is dependent on the choice of . A sufficient condition on for minimizers of problem (6) to have desirable sparsity can be found in [40].

With the output result by Algorithm 1, we can obtain a high-order polynomial whose coefficient vector is to fit the curve as shown in Figure 6(a). The red part is the scatter of recorrection points, i.e., , and the fitting curve is shown in green. In Figure 6(b), we compare the fitting result and ground-truth in green and red, respectively. Figure 7 shows some of the choroid segmentation results with images in the testing set. These examples display that the regression fitting is visually reasonable and conforms to the ground-truth.

2.6. Summary

In our experiments, we set , , , , and . For achieving a more stable fitting result, we choose the value of in (6) according to the recorrection data of the given image as following: (i), if and ,(ii), if or ,(iii), if the above two conditions are not satisfied.

In Figure 8, we give a flowchart to show the process of our choroid segmentation method. The RPE and BM are recognized first by their physiological characteristics. The labeled patches cut from OCT images are used to train a CNN classifier. Then, the input image can be transferred into a group set of points with a high predicted value by the trained CNN model. After data recorrection by the RANSAC method, the residual points are fitted by a regression model, and the fitting curve is the desired outer interface of the choroid. Thus, we obtain the segmentation results of the choroid.

3. Results and Discussion

In this section, we discuss the experimental analysis based on thickness evaluation. We give the measurement criteria of thickness, and then compare the average error, maximal error, and Dice coefficient of our experimental results with the results by the method in [24].

3.1. F Hypothesis Testing and T Hypothesis Testing

We use the Anderson-Darling test to verify both the average thickness marked by clinicians and predicted by our model with approximate normal distribution. Therefore, we can adopt F Hypothesis testing and T Hypothesis testing to analyze the thickness results obtained by our method.

By F Hypothesis testing, we can compute the confidence level of the thickness results. Denote by the thickness sample set in which the element represents an average thickness of choroid given by medical staff in one OCT image. is the thickness sample set given by our proposed method. First, we use the -test to verify whether the variances between the given thickness samples and our thickness results are similar. (i): there is no significant difference in the variance between the given sample and our result sample(ii): there is a significant difference in the variance between the given sample and our result sample

Value of -testing is where and are the variances of samples in and , respectively. The value of -test is , i.e., can not be rejected. In other words, it can be ensured that the results by our proposed method are not significantly different from the ground-truth.

After verifying the variance, the next step is to verify whether there is a significant difference between the mean values of the two sample sets. (i): there is no significant difference in the average value between the given sample and our result sample(ii): there is a significant difference in the average value between the given sample and our result sample

Value of -testing is where and are mean values of samples in and , respectively. The value of -test is , i.e., can not be rejected. Therefore, the mean value of our results is not significantly different from the data given by optometrists.

3.2. Minimum Distance Method

In this part, we compare the results of our proposed method and other methods in detail. Since finding the minimum distance point directly in the image will produce a zigzag error, we calculate the distance based on the regression functions of the BM and choroid. Define the set of horizontal ordinate in the result image as ; and represent the regression function value at , respectively. The minimum distance method starts from a point on the choroidal curve, then finding the corresponding point on the BM which is closest to point , and calculating the distance between these two points. As shown in Figure 9, the length of the yellow line is desired.

The thickness function at point is defined as follows:

Using this minimum distance method, we can obtain the thickness results by our proposed method. Next, we would like to present some measurements to compare our results with those by other methods.

3.3. Average Error and Maximal Error

In order to evaluate the thickness results, the average error and maximal error in an image are calculated to compare. The average error is calculated by where for represents the horizontal coordinate in each image, is the thickness function defined in (12) by our proposed method, is the thickness function provided by the ophthalmologists, and is the image width ( in this paper). The maximal error is

3.4. Dice Coefficient

Dice coefficient, a metric function in set comparing, is considered to measure the similarity between the segmentation result of the proposed method and the ground truth. It is defined as where sets and consist of the pixels in the segmented choroidal region by our proposed method and the manually labeled choroidal region by experts, respectively. denotes the number of elements in the set .

4. Discussion

The details of the comparison in , , and Dice coefficient over the whole testing set (76 images) are given in Table 2. The two-stage segmentation is the method proposed in [24] which is verified that it performs better than some classical approaches such as Graph cut [41], -means [41], and Graph Search Theory [26]. From Table 2, we can find that our proposed method possesses a smaller average error and larger Dice coefficient than the two-stage segmentation, which implies the effectiveness of the - regression model and other improvements in our method. The comparison in Dice coefficient is visually shown in Figure 10. Yellow and red points represent the average Dice coefficient of our proposed method and the two-stage segmentation, respectively. It is clearly displayed that the Dice coefficient data obtained by our proposed method is generally superior to the data generated by the two-stage segmentation.

5. Conclusion

In this paper, we propose and implement an automated segmentation method based on CNN classifier and - fitter to detect the region of the choroid. The BM, next to the inner surface of the choroid, is segmented by its physiological characteristic with the recognition of RPE. The extraction of the Choroidal-Scleral interface curve, outer surface of the choroid, is divided into two steps. First, we cut the images into small patches with label “on-line” or “off-line” to train the CNN classifier. The Focal Loss function and ADAM optimizer are used in the process of training the CNN model. Then, a binary classification problem is solved by using this CNN model with input test images. After obtaining the classified data with predicted values, we filter the mistake and noise points by the RANSAC method. In the second step, we adopt a - () regression model to fit the discrete and filtered points to generate the desired curve. The hybrid OMP-SG algorithm is employed to solve the - minimization. Segmentation results in some test images are given in Figure 7. Finally, we discuss and evaluate the experimental results in the aspects of average error, maximal error, and Dice coefficient. Comparison details with other methods are shown in Table 2 illustrating the effectiveness of the proposed method. With the help of the proposed method in autosegmentation of choroid from OCT images, the changes of choroidal thickness in response to experimental treatment or diseases could be effectively and accurately evaluated.

Data Availability

The data used to support the findings of this study were supplied by Optometry Research Clinic, School of Optometry, The Hong Kong Polytechnic University, under license and so cannot be made freely available. Requests for access to these data should be made to the Optometry Clinic in the Hong Kong Polytechnic University (Rachel Ka Man Chun, [email protected]).

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This paper is partially supported by the PolyU research project “Analysis and applications of low information density signal processing” Hong Kong Research Grant Council, PolyU 153016/16P, PolyU 153000/17P, PolyU 1-ZVS5, and PolyU 1-ZVN2, and Henry G. Leong Endowed Professorship in Elderly Vision Health (8-8475). The authors would like to thank the PolyU Optometry Research Clinic for providing OCT images and PolyU University Research Facility in Big Data Analytics (UBDA) for providing servers for numerical experiments in this paper.