Independent Thresholds on Multi-scale Gradient Images

In this paper we propose a multi-scale edge detection algorithm based on proportional scale summing. Our analysis shows that proportional scale summing improves the edge detection rate by applying independent thresholds to multi-scale gradient images. The method improves edge detection and localization by summing the gradient images with a proportional parameter $c^n$ ($c < 1$), which keeps the detected edges as close as possible to the fine scale. We then apply non-maximum suppression and a thinning step, similar to the Canny edge detection framework, to the summed gradient images. Experimental results show that the proposed method leads to better edge detection performance than the Canny edge detector and the scale multiplication edge detector.


Introduction
Edge detection is a fundamental image processing method that has been studied for decades (1,2,3,4). In general, the aim of edge detection is to significantly reduce the amount of data in an image while retaining the structural properties needed for further image processing. Images usually contain noise, which can be caused by electrical noise, illumination, shadows, reflections, etc. (5). Edge detection has therefore remained a challenging topic for the last two decades. The common approach is to apply the first (or second) derivative to a smoothed image and then find the local maxima (or zero-crossings) (6). Canny (1) first presented the well-known three criteria for edge detectors: good detection, good localization, and only one response to a single edge.
State-of-the-art gradient-based edge detectors suffer from an important issue: selecting the scale of the detection filter.
One detector running at one scale does not yield all edges in an image (5). Small-scale filters are sensitive to edge signals but also prone to noise, whereas large-scale filters are robust to noise but can filter out fine details (5). A small step in this direction has been accomplished by using multiple detectors and multiple scales, as described in (5,7). The characterization of signals from multi-scale edges has been studied extensively in the literature (3). Hasegawa et al. (7) implemented the IMPRESS system, which is able to choose the appropriate detector to find a given edge. It applies all the detectors and retains the one that generates the edge most similar to a reference edge. Lindeberg (8) implemented a system that automatically selects edge detectors and their scales to extract a given edge.
However, many of these multi-scale schemes need to explicitly select the best detector scale. Rosenfeld (9) exploited the idea of scale multiplication, which can improve edge detection results without explicit scale selection. Bao et al. (10) further evaluated the scale multiplication method against Canny's three criteria. They showed that scale multiplication can significantly improve localization accuracy with only a small loss in the detection criterion, and that the product of the two criteria is greater for scale multiplication than for a single scale. However, their method only demonstrated multiplication of two scales and does not guarantee that all edges will be detected. They did not present any results for multiplying more than two scales. Furthermore, multiplying several scales instead of two is not promising, because it may reject not only noise but also correct signals (i.e., desired edges): scale multiplication improves the localization criterion at a loss in the detection criterion (10).
In this paper, we propose a novel proportional scale summing method instead of scale multiplication. With scale summing, each scale can have its threshold set independently, and the fine scale gradient image can be multiplied with a parameter $c^n$ ($c < 1$) before being added to the large scale gradient images, resulting in a localization as close as possible to the fine scale. The obvious benefit of scale summing is that independent thresholds make it possible to find 'good' thresholds. In a single-scale detection method, it is challenging to find a trade-off between noise reduction and a good detection rate. Small thresholds detect more edges but admit more noise, while large thresholds filter out more noise but may also reject correct edges. In contrast, with the proposed proportional scale summing method, one can set a large threshold to filter out most of the noise of a small kernel and let the edges that were filtered out by mistake be detected by the next larger kernel. In general, blurred edges produce low responses on the fine scale, and a low filter response is sometimes misjudged as noise. In our method the blurred edges are not ignored, because the large scale filters work independently and independent thresholds are applied. Finally, thinning is applied to the summed gradient image. Thinning is done by finding local maxima and applying hysteresis (1).
This paper is organized as follows: section 2 provides an overview of related edge detection work; section 3 describes our edge detection approach; finally, section 4 presents an analysis based on the experimental results and concluding remarks.

Background
Edges normally represent changes in the intensity function of an image, i.e., image intensity variations such as steps, lines and junctions (5). Widely used edge detection methods detect edges by finding the local maxima of the first-order derivative, or the zero-crossings of the second-order derivative, of the intensity profile of a given image. In practice, image gradients can be estimated by convolving images with first-order derivative operators (also known as kernels), such as the Roberts cross operator, the Prewitt operator and the Sobel operator (6). The kernel convolution finds the abrupt changes in image intensity. In the case of the Sobel kernel, both horizontal and vertical changes are approximated. Formally, if $I(x, y)$ is the intensity level at a point in a given source image, and $G_x$, $G_y$ are the horizontal and vertical derivative approximations, then the gradient magnitude is given as: $G = \sqrt{G_x^2 + G_y^2}$. Once the gradient magnitude is computed, the next step is to apply a threshold to decide whether an edge is present at an image point. An appropriate threshold filters out most noise and keeps the edge points. After this step, the resulting edge is still thick. Canny (1) introduced the notion of non-maximum suppression (NMS) to find edges of one-pixel thickness by modifying the Sobel method, which gives the gradient direction as: $\theta = \arctan(G_y / G_x)$. Canny's method also introduces hysteresis to further reduce noise. Hysteresis uses the upper threshold to find the start of an edge. The edge is then traced from the start point, marking an edge point wherever the gradient value is above the lower threshold. The hysteresis step deletes weak edge points that are not connected to a strong edge. However, appropriate threshold values vary from image to image, and it is challenging to find a trade-off between detection rate and noise cancellation rate. Bao et al. (10) developed the scale multiplication edge detector (SMED), which increases the edge detection rate and reduces noise. Scale multiplication is defined by the product of the gradient magnitudes on two different scales as: $P(x, y) = G_{s_1}(x, y) \cdot G_{s_2}(x, y)$, where $G_{s_1}$ and $G_{s_2}$ denote the gradient magnitudes on scales $s_1$ and $s_2$. However, their method only demonstrated multiplication of two scales and does not guarantee that all edges will be detected.
In order to evaluate the performance of edge detectors, Canny (1) generalized three criteria:
1. Good detection: The probability of detecting real edge points should be maximized while the probability of falsely detecting non-edge points should be minimized. This corresponds to maximizing the signal-to-noise ratio.
2. Good localization: The detected edges should be as close as possible to the real edges.
3. Low spurious response: One real edge should not result in more than one detected edge.
Our edge detector is based on the Canny edge detector framework. The major difference is that we apply independent thresholds on each scale and sum the gradient images proportionally. Independent thresholds are the key to good detection. Because applying thresholds makes the formulation non-linear, the first criterion is better evaluated in an analytical way than by mathematical inference (section 3.2). Proportional scale summing is the key to good localization, which is also demonstrated in an analytical way (section 3.3).
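The gradient magnitude and direction described in this section can be sketched as follows. This is a minimal illustration using the standard 3x3 Sobel kernels and NumPy only; the helper names `filter3x3` and `sobel_gradient` and the edge-replicated border handling are assumptions made for the example, not part of the paper's implementation.

```python
import numpy as np

# Standard 3x3 Sobel kernels for horizontal (x) and vertical (y) changes.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T


def filter3x3(image, kernel):
    """Correlate a 2-D image with a 3x3 kernel (borders are replicated)."""
    padded = np.pad(image.astype(float), 1, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def sobel_gradient(image):
    """Return (magnitude, direction in radians) of the Sobel gradient."""
    gx = filter3x3(image, SOBEL_X)
    gy = filter3x3(image, SOBEL_Y)
    magnitude = np.hypot(gx, gy)     # sqrt(gx**2 + gy**2)
    direction = np.arctan2(gy, gx)   # quadrant-aware arctan(gy / gx)
    return magnitude, direction


# Vertical step edge: three dark columns followed by three bright columns.
step = np.zeros((5, 6))
step[:, 3:] = 10.0
mag, ang = sobel_gradient(step)
```

On this step image the response is two pixels wide (columns 2 and 3), which illustrates why a thinning step is still needed after thresholding.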

Implementation
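In our algorithm, the original image is convolved with an extended Sobel kernel (11) on each scale to obtain the x- and y-directional gradients of the image. The gradient images are then summed proportionally with a parameter $c^n$ ($c < 1$), where $c$ is a constant and $n$ is the power. Formally, the x-derivative image is $G_x(x, y) = \sum_s c^n G_x^s(x, y)$ (if $G^s(x, y) > T_{low}^s$) (5), and the y-derivative image is $G_y(x, y) = \sum_s c^n G_y^s(x, y)$ under the same condition, where $G_x^s$ and $G_y^s$ are the directional gradients on scale $s$, $G^s$ is the gradient image on scale $s$, and $T_{low}^s$ is the low threshold of that scale. The proportional scale summing edge detection algorithm (PSSED) is described in Fig. 1 using a flow chart and pseudo code.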
The gradient image on each scale is handled with two scale-independent thresholds, one low threshold and one high threshold, which define weak edges and strong edges. In our experiment, each filter size is set to approximately twice the next smaller one. In this work, we use four scales with 3x3, 5x5, 9x9, and 17x17 Sobel operators, resulting in eight independent thresholds, i.e., 4 lower thresholds (21, 15, 13, 8) and 4 upper thresholds (42, 30, 26, 16). The gradient calculated from the Sobel filter is standardized to a range between 0 and 256. Scale (filter width) and parameters can be adjusted according to task requirement. The empirical principle for setting the thresholds is that the lower threshold is half of the upper threshold, whereas the threshold on a fine scale must be higher than on a coarse scale to fulfill the assumption that the fine scale contains more noise.
Fig. 1. Proposed PSSED algorithm. Scale (filter width) and parameters can be adjusted according to task requirement.
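The thresholding and summing step can be sketched as below. This is only an illustration under stated assumptions: the per-scale directional gradients are taken as given inputs (the extended Sobel kernels are not reproduced here), the power $n$ is assumed to be the scale index with $n = 0$ on the finest scale, and the function name `proportional_scale_sum` and the value `c=0.5` are hypothetical. The threshold tuples are the values quoted above.

```python
import numpy as np

# Thresholds quoted in the text, ordered from fine (3x3) to coarse (17x17).
T_LOW = (21, 15, 13, 8)
T_HIGH = (42, 30, 26, 16)


def proportional_scale_sum(gx_scales, gy_scales, t_low=T_LOW, c=0.5):
    """Sum per-scale directional gradients with weights c**n.

    Assumption (not stated explicitly in the text): n is the scale index,
    with n = 0 for the finest scale, so c < 1 favours the fine scale.
    A scale contributes at a pixel only where its own gradient magnitude
    exceeds that scale's low threshold.
    """
    sum_x = np.zeros_like(gx_scales[0], dtype=float)
    sum_y = np.zeros_like(gy_scales[0], dtype=float)
    for n, (gx, gy) in enumerate(zip(gx_scales, gy_scales)):
        keep = np.hypot(gx, gy) > t_low[n]   # independent threshold per scale
        weight = c ** n
        sum_x += weight * np.where(keep, gx, 0.0)
        sum_y += weight * np.where(keep, gy, 0.0)
    return sum_x, sum_y


# Toy usage with two scales and hypothetical gradient values.
gx_scales = [np.array([[10.0, 50.0]]), np.array([[20.0, 20.0]])]
gy_scales = [np.zeros((1, 2)), np.zeros((1, 2))]
sum_x, sum_y = proportional_scale_sum(gx_scales, gy_scales,
                                      t_low=(21, 15), c=0.5)
```

In the toy input, the first pixel is rejected on the fine scale (10 is below 21) but still receives a contribution from the coarser scale, which is the behaviour the independent thresholds are meant to provide.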
The last step, finding local maxima and applying hysteresis, is similar to the Canny edge detection framework. The difference is that we use the proportionally summed directional gradients to calculate the edge direction. The hysteresis step uses an edge possibility image, which defines weak edges and strong edges. As the scale increases, the number of strong edges increases, because we never lower the possibility of a pixel from strong edge (defined on a small scale) to weak edge. Hysteresis starts tracing an edge from a strong edge point and stops where the weak points vanish.
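A minimal sketch of the edge possibility image and the hysteresis trace is given below. The label values, the function names `edge_possibility` and `hysteresis`, and the use of 8-connectivity are assumptions made for illustration; in the full pipeline the trace would run on the thinned (non-maximum-suppressed) pixels only.

```python
import numpy as np
from collections import deque

WEAK, STRONG = 1, 2


def edge_possibility(mag_scales, t_low, t_high):
    """Label each pixel 0 (none), 1 (weak) or 2 (strong) across all scales.

    Taking the maximum label over scales means a pixel marked strong on a
    small scale is never demoted to weak by a larger scale.
    """
    labels = np.zeros(mag_scales[0].shape, dtype=int)
    for mag, lo, hi in zip(mag_scales, t_low, t_high):
        scale_labels = np.where(mag > hi, STRONG, np.where(mag > lo, WEAK, 0))
        labels = np.maximum(labels, scale_labels)
    return labels


def hysteresis(labels):
    """Keep strong pixels plus weak pixels 8-connected to a strong pixel."""
    h, w = labels.shape
    edges = labels == STRONG
    queue = deque((int(y), int(x)) for y, x in zip(*np.nonzero(edges)))
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    if labels[ny, nx] == WEAK and not edges[ny, nx]:
                        edges[ny, nx] = True
                        queue.append((ny, nx))
    return edges


# Toy usage: two scales with hypothetical gradient magnitudes.
mags = [np.array([[50.0, 10.0, 5.0]]), np.array([[20.0, 20.0, 5.0]])]
labels = edge_possibility(mags, t_low=(21, 15), t_high=(42, 30))
edges = hysteresis(labels)
```

The first pixel stays strong even though the coarser scale would only rate it weak, and the second pixel is kept because it is a weak point connected to a strong one.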
One advantage of PSSED is that it can distinguish between noise and blurred edges. Setting higher thresholds removes more noise, while multi-scale summing increases the detection rate on blurred images (see Fig. 5-7).

Good Detection Criterion Evaluation
Good detection means that an edge detection algorithm should have a high detection rate on true edges and a low false alarm rate on false edges. Fig. 2 demonstrates how independent thresholds work in our PSSED method. The noise on a fine edge is filtered out by the independent thresholds on each scale, and the edge is detected correctly on the next larger scale. In contrast, the Canny edge detector uses only one scale, which either admits noise if a small threshold is used or fails to detect blurred edges if a large threshold is employed (11). Here, noise refers to the gradient peaks on the fine scale (indicated by circles in Fig. 2(b)) that cannot accurately reflect the gradient peak position of blurred edges (indicated by the circle in Fig. 2(c)). This occurs because the edge scale is larger than the gradient operator. By successfully detecting blurred edges, the detection rate is increased. The difference between our proposed method and the scale multiplication edge detector (SMED) (10) is that SMED improves SNR by decreasing noise, such as salt-and-pepper noise, whereas our proposed method improves SNR by significantly increasing the detection rate. At the same time, good detection performance also benefits good localization performance by suppressing the delocalization effect caused by inaccurate local maxima on the fine scale. It can be noticed that the gradient values from the Sobel 3x3 kernel have two peaks because of noise, which can be filtered out by independent thresholds.
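The effect of independent thresholds on a blurred edge can be illustrated with a 1-D toy example. Everything here is a simplifying assumption rather than the paper's setup: a central difference of half-width `k`, scaled by `1/k`, stands in for the normalized multi-scale Sobel response, and the synthetic row, the helper name `central_difference` and the single-scale threshold of 15 are hypothetical. The thresholds 21 and 8 are the finest and coarsest low thresholds quoted in the Implementation section.

```python
import numpy as np


def central_difference(signal, k):
    """Return (signal[i + k] - signal[i - k]) / k with replicated borders."""
    padded = np.pad(np.asarray(signal, dtype=float), k, mode="edge")
    return (padded[2 * k:] - padded[:-2 * k]) / k


# Synthetic row: a blurred edge (ramp of height 80 over pixels 10..20)
# followed by a one-pixel noise spike of height 18 at pixel 45.
row = np.zeros(60)
row[10:21] = 8.0 * np.arange(11)
row[21:] = 80.0
row[45] += 18.0

fine = np.abs(central_difference(row, 1))     # small operator
coarse = np.abs(central_difference(row, 8))   # large operator

# Independent low thresholds (finest and coarsest scale values from the text).
fine_hits = np.nonzero(fine > 21)[0]
coarse_hits = np.nonzero(coarse > 8)[0]

# A single fine-scale threshold low enough to catch the ramp also admits
# the noise spike.
single_scale_hits = np.nonzero(fine > 15)[0]
```

With these numbers the fine scale rejects both the spike and the ramp, the coarse scale recovers the ramp at pixels 11 to 19 and ignores the spike, and the lowered single-scale threshold responds to the ramp and to the spike at pixels 44 and 46.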

Good Localization Criterion Evaluation
Fig. 3 shows the gradient images of a synthetic image and its edge images. It can be clearly seen that the corners become less sharp as the scale increases. After applying non-maximum suppression, the corners around the T-junction even lose continuity. This scale effect has been explained in the literature (1,4). Fig. 4 shows a 1-D extraction from the 2-D gradient image. The choice of $c$ also opens up more possibilities for real applications, since some applications may need a more smoothed edge image (see Fig. 5).

Low Spurious Response Criterion Evaluation
In order to fulfill the criterion of low spurious response, it is necessary to apply non-maximum suppression. Although some studies have tried to develop methods for obtaining thin edges, it is still impossible to obtain an edge of one-pixel width without non-maximum suppression. If non-maximum suppression is applied, this criterion is fulfilled. Enforcing this criterion may worsen the overall performance, because it may reduce the detection rate and cause localization error. Fig. 6 shows that, within the restriction of the third criterion, our proposed method can successfully detect blurred edges.
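Non-maximum suppression can be sketched as follows. This is a simplified variant assumed for illustration: the gradient direction is quantized to four sectors and each pixel is compared with its two neighbours along that direction, without the sub-pixel interpolation that some implementations use. The function name `non_maximum_suppression` is hypothetical.

```python
import numpy as np


def non_maximum_suppression(mag, direction):
    """Keep pixels whose magnitude is a local maximum along the gradient.

    direction holds gradient angles in radians, e.g. arctan2(gy, gx).
    """
    h, w = mag.shape
    out = np.zeros((h, w), dtype=float)
    # Quantize the angle to 0, 45, 90 or 135 degrees (opposite directions
    # share a sector because both neighbours on the line are checked).
    sector = np.round(direction / (np.pi / 4)).astype(int) % 4
    # (dy, dx) step along the gradient for each sector.
    steps = {0: (0, 1), 1: (1, 1), 2: (1, 0), 3: (1, -1)}
    for y in range(h):
        for x in range(w):
            dy, dx = steps[int(sector[y, x])]
            neighbours = []
            for sy, sx in ((y + dy, x + dx), (y - dy, x - dx)):
                if 0 <= sy < h and 0 <= sx < w:
                    neighbours.append(mag[sy, sx])
            if mag[y, x] > 0 and all(mag[y, x] >= m for m in neighbours):
                out[y, x] = mag[y, x]
    return out


# Toy usage: a thick horizontal gradient ridge is thinned to one pixel.
ridge = np.array([[0.0, 1.0, 3.0, 2.0, 0.0]] * 3)
thin = non_maximum_suppression(ridge, np.zeros_like(ridge))
```

Because ties are kept, a perfectly flat plateau would remain more than one pixel wide in this sketch; a stricter comparison on one side is a common way to break such ties.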

Results and Concluding Remarks
The most common way to evaluate edge detectors is to compare the detected edges with datasets that were benchmarked on natural images by human observers. The publicly available datasets that some authors use for evaluation are suited to contour detection rather than edge detection (12,13). The main difference is that edge detection methods detect edges based on pixel-to-pixel intensity variations, while contours are salient region boundaries in an image. These contour datasets are usually benchmarked by human observers who consider only the boundaries of interesting regions, whereas most fine edges, which are important for feature-based image analysis, are ignored (see (3,12,13) for more discussion). Therefore, the contour datasets are not well suited to evaluating edge detectors. Some previous work has employed synthetic images to generate benchmarks. However, those synthetic images are usually not suitable for analyzing multi-scale edges, since synthetic images cannot mimic the natural blurring and noise of real images.
We have demonstrated the advantages of PSSED by subjective comparison with the Canny edge detector and SMED, because they fulfill the same low spurious response criterion. There are many edge or boundary detection methods that do not fulfill the low spurious response criterion. For example, the algorithms described in (13,14) do not show edge detection with non-maximum suppression, even though they were developed in recent years. As shown in Fig. 6-7, PSSED is able to detect blurred edges where the Canny edge detector fails (marked with an ellipse in Fig. 6). Compared with the Canny edge detector, SMED shows improved results on blurred edges (Fig. 6(d)), but it has low localization performance. In order to detect all edges, SMED has to use a relatively low threshold. When large scales are applied in SMED (Fig. 6(e) and (f)), edges in blurred areas become smoother, but this causes unwanted blurring across the whole edge image. Overall, PSSED shows the best performance in generating smooth edges, with or without blurred detail (see Fig. 7). To demonstrate this, we have selected some images with strong blurring effects (3). Blurring can be caused by focal blur, penumbrae, shading, etc. In Fig. 7(a) the bridge image is blurred because the lens is out of focus, and the edges are not detected by the Canny edge detector (Fig. 7(d)).
Similarly, the shadow of the doll (Fig. 7(b)) has finer edges around the foot and coarser edges around the head. Our PSSED algorithm detects most edges of the shadow, whereas the Canny edge detector only finds the edge close to the foot. One limitation of the PSSED algorithm is that its noise reduction rate is not as high as that of SMED, especially for images with strong salt-and-pepper noise (10). However, such extreme cases may also exist in natural images (e.g., animal hair or the texture of a tree), where the fine structure is not technically noise. Our algorithm works well in such cases if higher thresholds are set, at the cost of a small loss in localization performance, because the response is then more likely to come from a coarse scale than from the fine scale.
In this work, we have presented a novel technique of proportional scale summing for edge detection. Proportional scale summing improves the edge detection rate by applying independent thresholds to multi-scale gradient images. Summing the gradient images with a proportional parameter $c^n$ ($c < 1$) keeps the edge as close as possible to the fine scale. In our subjective comparison, the method performs comparably to or better than the Canny edge detector and SMED, particularly on blurred edges. The results also show that the PSSED algorithm can distinguish between noise and blurred edges with good detection rate and localization performance.
For the 1-D extraction from the 2-D gradient image in Fig. 4, we extract one row from the gradient image on each scale. The dashed line in Fig. 4(a) shows the location of the extracted row. Fig. 4(b) shows the gradient values of the extracted rows on each scale. It can clearly be seen that the location of the local maxima shifts significantly as the scale changes (left peaks in Fig. 4(b)), while it remains stable on the edge (right peaks in Fig. 4(b)). According to our methodology, by setting a suitable $c$, the local maxima of the proportionally summed gradient image should be as close as possible to the edge on the fine scale. Fig. 4(c) and (d) demonstrate this. By setting $c < 1$ or $c > 1$ one can determine whether the local maxima lie close to the fine scale edge or to the coarse scale edge (marked with circles).
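The influence of $c$ on the location of the summed maximum can be illustrated with a 1-D toy example. The two Gaussian-shaped profiles, their peak positions and the helper `summed_peak` are hypothetical, and the weights are assumed to be $c^0$ for the fine scale and $c^1$ for the coarse scale.

```python
import numpy as np

x = np.arange(50)
# Hypothetical 1-D gradient profiles of one row: the fine scale peaks at the
# true edge (x = 20), the coarse scale peaks at a shifted position (x = 26).
fine = 100.0 * np.exp(-((x - 20) ** 2) / (2 * 2.0 ** 2))
coarse = 100.0 * np.exp(-((x - 26) ** 2) / (2 * 4.0 ** 2))


def summed_peak(c):
    """Location of the maximum of fine + c * coarse (weights c**0, c**1)."""
    return int(np.argmax(fine + c * coarse))


peak_small_c = summed_peak(0.5)   # c < 1: maximum stays at the fine peak
peak_large_c = summed_peak(2.0)   # c > 1: maximum moves to the coarse peak
```

In this toy setting, `c = 0.5` leaves the summed maximum at the fine scale position, while `c = 2.0` moves it to the coarse scale position.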