INTRODUCTION

Reference micromarking is used to define the boundaries of a certain surface area [1, 2], to identify products, and to indicate the location of protective nanomarkings [3]. Reference marking can be applied using a cantilever probe microscope or a nanoindenter and may consist of individual imprints, series of adjacent imprints, continuous lines, or geometric shapes. Markings on a surface are searched for using high-resolution microscopy. In this study, the application of scanning probe microscopy (SPM) for these purposes was considered; nevertheless, the results obtained can be applied to processing images from other types of microscopes.

The process of searching for a marked surface area using a probe microscope is shown in Fig. 1. The microscope field of view moves along a serpentine curve with overlapping of neighboring regions. Each SPM image is subjected to automatic processing and binary classification (recognition) to separate images containing marking elements from those containing only background. If no marking is found, the field of view is shifted. When marking elements are recognized, the operator is informed. Acquiring one SPM image may take several minutes, and the entire search may take tens of minutes; thus, the need to automate the main search procedures (especially classification) is evident.

Fig. 1.

Schematic of searching for the marking with a shift of the field of view of the scanning probe microscope.

MARKING SHAPE

The application of discrete reference marking was described in [1, 3]; in this technique, individual imprints indicated the directions of shift of the microscope field of view toward a target surface area and marked the corner points of this area. In some cases (Fig. 2), the use (recognition) of discrete markings may be limited for the following reasons:

Fig. 2.

Examples of hard-to-recognize discrete micromarkings.

(i) presence of background elements with close sizes (Figs. 2a, 2b);

(ii) fragmentation of marking imprints (Fig. 2c);

(iii) absence of a sufficient number of imprints in the microscope field of view (Fig. 2c) (when one to three imprints fall in the field of view, the probability of successful marking recognition is low).

The first two reasons can be eliminated via additional image processing (smoothing, threshold filtering, background clipping, detection of feature points, and analysis of feature-point ordering). An obvious way to eliminate the third reason is to use extended clustered (Figs. 3a, 3c, 3e, 3j, 3k, 3l) or continuous (Figs. 3b, 3d, 3f, 3g, 3h, 3i) reference marks forming regular geometric shapes: lines, crosses, or rectangles. These marks have clearer and more structured boundaries, which makes them easier to detect and recognize. However, as shown in Fig. 3, extended reference marks, while fairly diverse in shape, may exhibit individual defects and deviations from a regular shape. To date, there are no universal algorithms for recognizing such objects. Therefore, the development of such algorithms, focused, in particular, on conventional image processing methods, is an urgent task. Here, the methods implemented in the open-source library OpenCV are of greatest practical interest.

Fig. 3.

Examples of reference markings.

ANALYSIS OF THE APPLICABILITY OF EXISTING RECOGNITION METHODS

The existing methods that are potentially suitable for recognition of marking images can be divided into low-level, feature, contour, and structural ones. Low-level methods imply functional image transformations; among them, the Fourier–Mellin transform is distinguished, being invariant to image shift and rotation. However, this invariance covers only relatively simple changes, and the transform may be inefficient under significant differences in image brightness and contrast [4], changes in scale [5], noise, and distortion of the mark shape. For example, Fig. 4 shows that the results of comparing two images (Figs. 3b, 3d) using the Fourier–Mellin transform may exhibit high error even at significant overlap.

Fig. 4.

Result of comparison of two images (Figs. 3b, 3d) using the Fourier–Mellin transform (obtained based on [6]).

Feature methods use a certain functional representation of the vicinity of image pixels. These methods include the feature-point detector–descriptors FAST, ORB, BRIEF, and SIFT and algorithms for matching them. During recognition, feature points of the acquired image are compared with those of the reference marking image. Two factors hinder the comparison. The first is the sensitivity of detector–descriptors to noise [7] and to the low-level distortions of object shapes that are inherent in SPM images. The second is false matches between feature points due to the presence of many similar objects (grains, pores, etc.). For example, Fig. 5 demonstrates the ambiguity in interpreting the results of comparing images (Figs. 3b, 3d) obtained using the SIFT detector and the RANSAC method [8, 9].

Fig. 5.

Comparison of feature points using the SIFT detector–descriptor [9].

Contour methods are based on the comparison of contours, e.g., by calculating their moments. The use of these methods is limited by partial visibility of the marking in the field of view, image distortions, and the presence of false contours at marking fragments (Fig. 6).

Fig. 6.

Selection of contours of the marking presented in Figs. 3a–3c.

Structural methods are best adapted to the recognition of reference markings; they are based on the selection of objects in an image and their subsequent analysis. Structural methods applicable to marking recognition include template matching and the Hough transform. Template matching is limited when the marking only partially falls within the microscope field of view. The Hough transform makes it possible to detect circles, rectangles, and straight-line segments. Taking into account the typical marking shapes presented above (Fig. 3), it is expedient to use the Hough transform to search for parallel and perpendicular segments approximating the marking fragments. The Hough transform can be implemented using the function cv::HoughLinesP from the OpenCV library, which requires preliminary edge detection using the Canny method (cv::Canny). The parameters of these functions are configured after making the marking and obtaining its preliminary images. However, the subsequent search for the marking may call for preprocessing of SPM images and additional configuration of the Hough transform parameters because of changes in the background noise, interference, changes in the probe parameters, surface contamination, etc.
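The voting principle behind the Hough transform can be illustrated with a minimal pure-Python sketch (the function name hough_lines and the toy edge image are ours for illustration; the actual search uses cv::HoughLinesP, the probabilistic variant):

```python
import math

def hough_lines(edges, n_theta=180):
    """Accumulate votes in (theta, rho) space for a binary edge image.

    edges: 2D list of 0/1 values; returns the (theta_deg, rho) cell
    with the most votes. A toy illustration of the voting principle
    behind the Hough transform, not the probabilistic variant itself.
    """
    h, w = len(edges), len(edges[0])
    acc = {}
    for y in range(h):
        for x in range(w):
            if not edges[y][x]:
                continue
            for t in range(n_theta):
                theta = math.radians(t)
                # Normal form of a line: rho = x*cos(theta) + y*sin(theta)
                rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
                acc[(t, rho)] = acc.get((t, rho), 0) + 1
    return max(acc, key=acc.get)

# A vertical edge at x = 3 should peak near theta = 0 at rho = 3.
img = [[1 if x == 3 else 0 for x in range(8)] for _ in range(8)]
print(hough_lines(img))
```

All collinear edge pixels vote for the same (theta, rho) cell, so the accumulator maximum identifies the line even when the pixels are fragmented.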

IMAGE PREPROCESSING

The methods described below are focused on processing single-channel images, because an SPM image is essentially a map of surface heights measured by the probe. Preliminary analysis showed that the Hough transform is sensitive to a number of specific features of SPM images, which impede recognition of the reference marking. These features are as follows.

(i) SPM images contain low-frequency components caused by a tilt of the sample or nonideality of the scanner (in general, described by a second-order surface), which reduce the contrast of the marking elements against the low-frequency background. If the marking is in a dark part of the image, fewer segments are detected in it, and the probability of misclassification increases.

(ii) The background relief of the surface near the marking may contain some elements (folds, terraces, fibers, grains, interference, particles, etc.), which leads to preferred registration of straight-line segments along their boundaries.

(iii) Interference, contamination, or non-optimum settings of the probe microscope reduce the marking contrast.

(iv) If the marking clusterization is insufficient, excessive fragmentation of the segments approximating marking fragments is observed.

Let us consider techniques for mitigating these specific features.

Elimination of the Tilt

To remove low-frequency image components caused by a tilt of the sample or nonideality of the scanning device, a first- or second-order approximating surface (P(1) or P(2), respectively) is constructed and then subtracted from the initial image function Z [10]:

$$Z = Z - {{P}^{{(1)}}}.$$

The tilt elimination function is a basic one for the probe microscope software. If necessary, one can implement a simple algorithm of tilt elimination, which consists in the following.

(i) The tilt coefficients mx and my of a single-channel image are calculated, which implies determination of the sum of values of the image intensity function Z in the outer rows (columns), calculation of the difference of these sums, and division by the number of rows (columns) N:

$${{m}_{x}} = \frac{{\sum\limits_{i = 0}^N {{{Z}_{{N,i}}}} - \sum\limits_{i = 0}^N {{{Z}_{{1,i}}}} }}{N};\quad {{m}_{y}} = \frac{{\sum\limits_{i = 0}^N {{{Z}_{{i,N}}}} - \sum\limits_{i = 0}^N {{{Z}_{{i,1}}}} }}{N}.$$

(ii) After calculating the tilt coefficients mx and my, the initial image is corrected using the following expression:

$${{Z}_{{i,j}}} = {{Z}_{{i,j}}} + \frac{{N{\text{/2}} - i}}{N}{{m}_{x}} + \frac{{N{\text{/2}} - j}}{N}{{m}_{y}}.$$
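The two steps above can be sketched in pure Python (the function name remove_tilt and the toy height map are ours; real probe microscope software operates on the measured image arrays):

```python
def remove_tilt(z):
    """Subtract a first-order (planar) tilt from a square height map z.

    Pure-Python sketch of the two-step procedure above; indices follow
    the text: the first index is the row, the second the column.
    """
    n = len(z)
    # Step (i): tilt coefficients from the difference of the sums over
    # the outer rows (for mx) and outer columns (for my), divided by N.
    mx = (sum(z[n - 1]) - sum(z[0])) / n
    my = (sum(z[i][n - 1] for i in range(n))
          - sum(z[i][0] for i in range(n))) / n
    # Step (ii): correct every pixel by the plane defined by (mx, my).
    return [[z[i][j] + (n / 2 - i) / n * mx + (n / 2 - j) / n * my
             for j in range(n)] for i in range(n)]

# A planar tilt z = i + 2*j: the correction strongly reduces the
# spread of height values across the image.
tilted = [[i + 2 * j for j in range(4)] for i in range(4)]
flat = remove_tilt(tilted)
print(max(max(r) for r in flat) - min(min(r) for r in flat))
```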

Threshold Filtering

If there are background surface elements (grains, defects, etc.) in the image near the marking, its recognition will be difficult. To remove background elements, one can apply threshold filtering, for example, the function cv::threshold from the OpenCV library with a threshold determined using the IsoData method [11].

Examples of images for which threshold filtering is efficient are presented in Figs. 3f and 3g. For other images, threshold filtering may remove part of the marking. In this context, this type of filtering was used in further experiments only for low-contrast images (standard deviation, as returned by cv::meanStdDev, below 7) in the THRESH_TRUNC mode, which is described by the expression:

$${{Z}_{{i,j}}} = \left\{ \begin{gathered} {\text{threshold}}\quad {\text{if}}\quad {{Z}_{{i,j}}} \geqslant {\text{threshold}} \hfill \\ {{Z}_{{i,j}}}\quad {\text{otherwise}}{\text{.}} \hfill \\ \end{gathered} \right.$$
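A minimal pure-Python stand-in for this mode (the helper name thresh_trunc is ours; in practice cv::threshold with THRESH_TRUNC is used):

```python
def thresh_trunc(z, threshold):
    """THRESH_TRUNC clipping as in the expression above: values at or
    above the threshold are replaced by the threshold; the rest are
    kept unchanged, so the marking relief below the cut survives."""
    return [[threshold if v >= threshold else v for v in row] for row in z]

img = [[10, 200, 30], [250, 40, 180]]
print(thresh_trunc(img, 100))  # tall background peaks clipped to 100
```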

Increase of the Contrast

To increase the contrast, the CLAHE (Contrast Limited Adaptive Histogram Equalization) illumination-equalization algorithm was used; it is implemented as the function cv::createCLAHE in the OpenCV library. Illumination equalization increases the contrast of the marking and the probability of its recognition. However, the probability of false classification of background images also increases. Therefore, as with threshold filtering, the CLAHE method was applied only to low-contrast images.

Morphological Erosion

Erosion (operator Θ; function cv::erode in OpenCV) of image Z by structural element B at the point xy is described by the expression

$$Z\Theta B = \{ {{Z}_{{xy}}}\,|\,{{(B)}_{{xy}}} \in Z\} ,$$
(1)

where (B)xy is the structural element localized at the point xy. Expression (1) means that an image pixel remains unchanged if the structural element centered at this pixel coincides with the vicinity of the image; otherwise, the pixel is replaced by the minimum over the vicinity. Thus, erosion blurs objects protruding above the surface and closes up recessed objects. As a result, individual marking imprints merge and the segments recorded by the Hough method become longer. Background images become more blurred after erosion, which reduces the number of recorded segments and increases the probability of correct classification.
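The merging of imprints can be illustrated with a minimal grayscale erosion in pure Python (the helper name erode and the toy profile are ours; in practice cv::erode is used):

```python
def erode(z, k=3, iterations=1):
    """Grayscale erosion with a k x k square structural element: every
    pixel is replaced by the minimum over its neighborhood (border
    pixels use the available part of the window). A minimal stand-in
    for cv::erode."""
    h, w = len(z), len(z[0])
    r = k // 2
    for _ in range(iterations):
        z = [[min(z[y2][x2]
                  for y2 in range(max(0, y - r), min(h, y + r + 1))
                  for x2 in range(max(0, x - r), min(w, x + r + 1)))
              for x in range(w)] for y in range(h)]
    return z

# Two dark imprints separated by single bright pixels merge into one
# continuous dark run after a single erosion pass.
row = [9, 0, 0, 9, 0, 0, 9]
print(erode([row], k=3)[0])
```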

ALGORITHM OF RECOGNITION USING THE HOUGH TRANSFORM

The recognition algorithm consists of the following stages: registration of segments with additional tuning, selection of mutually parallel and perpendicular segments, and decision making. The goal of the additional tuning is to reduce the number of segments to a certain value N0 (N0 = 35). To this end, the threshold parameter (accumulator threshold) of the function cv::HoughLinesP() was used; it controls the minimum relative intensity of a recorded segment.

The additional configuration algorithm can be written as follows.

Step 1. Registration of segments using the Hough method.

Step 2. Determination of the number of segments N.

Step 3. If N < N0, then end; otherwise, increase the threshold and go to Step 1.
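The tuning loop above can be sketched as follows (the callable count_segments is a hypothetical stand-in for a cv::HoughLinesP call returning the number of detected segments; the start value, step, and upper safeguard are illustrative, not taken from the text):

```python
def tune_threshold(count_segments, n0=35, start=30, step=10, max_threshold=500):
    """Raise the Hough accumulator threshold until fewer than n0
    segments are registered (Steps 1-3 above). The max_threshold guard
    prevents an endless loop on pathological images."""
    threshold = start
    while count_segments(threshold) >= n0 and threshold < max_threshold:
        threshold += step
    return threshold

# Toy model: the number of detected segments falls as the accumulator
# threshold grows, so the loop terminates once the count drops below n0.
thr = tune_threshold(lambda t: 2000 // t)
print(thr)
```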

To decide whether the marking is present, the fraction R of segment pairs with relative rotation angles of 0° ± 10° and 90° ± 10° was calculated. When the R value exceeded a certain threshold, the image was classified as containing marking elements. The relative rotation angles were determined from the expressions:

$$\varphi = \arctan \frac{a}{b};$$
$$a = \frac{{({{x}_{1}} - {{x}_{2}})(y_{1}^{'} - y_{2}^{'}) - ({{y}_{1}} - {{y}_{2}})(x_{1}^{'} - x_{2}^{'})}}{{{{{({{x}_{1}} - {{x}_{2}})}}^{2}} + {{{({{y}_{1}} - {{y}_{2}})}}^{2}}}};$$
$$b = \frac{{({{x}_{1}} - {{x}_{2}})(x_{1}^{'} - x_{2}^{'}) + ({{y}_{1}} - {{y}_{2}})(y_{1}^{'} - y_{2}^{'})}}{{{{{({{x}_{1}} - {{x}_{2}})}}^{2}} + {{{({{y}_{1}} - {{y}_{2}})}}^{2}}}},$$

where \(({{x}_{1}},{{y}_{1}})\), \(({{x}_{2}},{{y}_{2}})\) and \((x_{1}^{'},y_{1}^{'})\), \((x_{2}^{'},y_{2}^{'})\) are the coordinates of the ends of the two segments.
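A pure-Python sketch of this computation (the function name mutual_angle is ours; b is taken as the dot product of the segment direction vectors, and the common denominator of a and b cancels in arctan(a/b)):

```python
import math

def mutual_angle(s1, s2):
    """Mutual rotation angle of two segments, each given as
    ((x1, y1), (x2, y2)). tan(phi) = cross/dot of the direction
    vectors; the result is folded into [0, 90] degrees, since only
    relative orientation matters for parallel/perpendicular tests."""
    (x1, y1), (x2, y2) = s1
    (u1, v1), (u2, v2) = s2
    dx, dy = x1 - x2, y1 - y2
    du, dv = u1 - u2, v1 - v2
    a = dx * dv - dy * du          # cross product (numerator of tan)
    b = dx * du + dy * dv          # dot product (denominator of tan)
    phi = abs(math.degrees(math.atan2(a, b)))
    return min(phi, 180 - phi)

print(mutual_angle(((0, 0), (5, 0)), ((2, 1), (2, 7))))  # perpendicular pair
```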

After processing by cv::Canny(), the detected boundaries may be much longer or shorter than the marking segments; therefore, the analyzed segments must be filtered by length. The function cv::HoughLinesP allows filtering only by minimum length; therefore, an additional maximum-length limit was implemented at the stage of calculating the mutual orientation angles of the segments. Note that, to better adapt to the shape and size of the marking, several maximum-length thresholds from an empirically determined list were used: the largest element of the list is determined by the maximum length of the marking segments, and the rest are arranged in descending order with a certain step.

The image classification algorithm (after additional configuration of the Hough transform) can be written as follows.

Step 1. Formation of set Λ of segments with a certain length:

$$\Lambda = \bigcup {\{ {{l}_{i}},{{l}_{{\min }}} \leqslant {{l}_{i}} \leqslant {{l}_{{\max }}}\} ,} $$

where lmin is the minimum length of the segments under consideration (15% of the image size in pixels), lmax is the maximum length of the segments under consideration (100% of the image size in pixels), and li is the length of the segment connecting a pair of points:

$${{l}_{i}} = \sqrt {{{{({{x}_{2}} - {{x}_{1}})}}^{2}} + {{{({{y}_{2}} - {{y}_{1}})}}^{2}}} .$$

Step 2. Formation of a set of angles of mutual orientation of segments from the set Λ:

$$\Phi = \bigcup\limits_{{{L}_{i}} \in \Lambda } {\bigcup\limits_{{{L}_{j}} \in \Lambda } {\{ \varphi = \angle {{L}_{i}}{{L}_{j}}\} .} } $$

Step 3. Formation of a truncated set Ω of mutual orientation angles with the given values:

$$\begin{gathered} \Omega = \bigcup\limits_{{{L}_{i}} \in \Lambda } {\bigcup\limits_{{{L}_{j}} \in \Lambda } {\{ \varphi = \angle {{L}_{i}}{{L}_{j}}:(\varphi = 0^\circ \pm 10^\circ )} } \\ {\text{or}}\;(\varphi = 90^\circ \pm 10^\circ )\} . \\ \end{gathered} $$

Step 4. Calculation of the ratio \(R = \left| \Omega \right|/\left| \Phi \right|\).

Step 5. If R ≥ 0.5, then end (the image contains the marking).

Step 6. Selection of the next maximum-length threshold p from the list and setting lmax = p.

Step 7. If the list is exhausted, then end (the image does not contain the marking); otherwise, go to Step 1.
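Steps 1 to 5 can be sketched in pure Python (the function names, tolerance handling, and the toy rectangular marking are ours for illustration; the length bounds and the 0.5 decision threshold follow the text):

```python
import math

def classify(segments, l_min, l_max, r_threshold=0.5, tol=10.0):
    """Keep segments with length in [l_min, l_max] (Step 1), form all
    pairwise mutual orientation angles (Step 2), count the fraction R
    of near-parallel (0 +/- tol) or near-perpendicular (90 +/- tol)
    pairs (Steps 3-4), and decide (Step 5)."""
    def length(s):
        (x1, y1), (x2, y2) = s
        return math.hypot(x2 - x1, y2 - y1)

    def angle(s1, s2):
        # tan(phi) = cross/dot of the segment direction vectors.
        (x1, y1), (x2, y2) = s1
        (u1, v1), (u2, v2) = s2
        a = (x1 - x2) * (v1 - v2) - (y1 - y2) * (u1 - u2)
        b = (x1 - x2) * (u1 - u2) + (y1 - y2) * (v1 - v2)
        phi = abs(math.degrees(math.atan2(a, b)))
        return min(phi, 180 - phi)

    lam = [s for s in segments if l_min <= length(s) <= l_max]
    phis = [angle(p, q) for i, p in enumerate(lam) for q in lam[i + 1:]]
    if not phis:
        return False
    good = [phi for phi in phis if phi <= tol or abs(phi - 90) <= tol]
    return len(good) / len(phis) >= r_threshold

# A rectangular marking: two horizontal and two vertical segments, so
# every pair is parallel or perpendicular and R = 1.
rect = [((0, 0), (40, 0)), ((0, 30), (40, 30)),
        ((0, 0), (0, 30)), ((40, 0), (40, 30))]
print(classify(rect, l_min=10, l_max=100))
```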

RESULTS

Table 1 presents the values of the ROC-AUC metric (quality of the image classification) for different preprocessing techniques: (technique 1) threshold filtering of images; (technique 2) erosion with a 5 × 5 structural element and three iterations; (technique 3) erosion with threshold filtering; (technique 4) erosion with illumination equalization using the CLAHE method; and (technique 5) without processing. The results (Table 1) were obtained using a database of 111 images [12], 59 of which contained marking elements, while the remaining 52 did not. It can be seen in Table 1 that the erosion transformation is one of the key processing stages increasing the accuracy of marking recognition. Nevertheless, an analysis of the precision–recall curve (Fig. 7) and the average precision (AP) of the marking recognition shows that the average recognition precision is relatively low for all techniques.

Table 1. ROC-AUC quality of image classification
Fig. 7.

Precision–recall curves: (solid line) technique 1 with AP = 0.6787, (dashed line) technique 2 with AP = 0.7016, and (dotted line) technique 5 with AP = 0.6707.

To increase the recognition precision, it is proposed to use a binary threshold instead of the classification based on the R value. An analysis of the histograms of distribution of the angles of mutual orientation of segments showed that, in the presence of the marking in an image, the histograms have local maxima near the values of 0° and 90°. Therefore, the presence of such extrema was used as a binary threshold. Estimates of the recognition precision with the binary threshold are listed in Table 2, where TP (true positive) is the number of correctly recognized images with the marking; FP (false positive) is the number of images incorrectly recognized as marking images; FN (false negative) is the number of unrecognized images with the marking; TN (true negative) is the number of correctly recognized background images (without marking); P is precision; and R is recall:

$$R = \frac{{TP}}{{TP + FN}};\quad P = \frac{{TP}}{{TP + FP}}.$$
Table 2. Metrics of evaluating the recognition quality for the binary threshold
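The binary criterion can be sketched as follows (a simplified version that checks only the global histogram maximum, whereas the text checks for local maxima near both values; the function name, bin width, and toy angle lists are ours):

```python
def has_marking(angles, bin_width=5, tol=10):
    """Binary criterion from the text: build a histogram of mutual
    orientation angles (0-90 degrees) and report whether its global
    maximum falls near 0 or near 90 degrees."""
    n_bins = 90 // bin_width + 1
    hist = [0] * n_bins
    for phi in angles:
        hist[min(int(phi // bin_width), n_bins - 1)] += 1
    peak = max(range(n_bins), key=lambda i: hist[i]) * bin_width
    return peak <= tol or peak >= 90 - tol

print(has_marking([1, 2, 3, 88, 89, 90, 45]))   # peaks near 0 and 90
print(has_marking([20, 25, 30, 35, 40, 45, 50]))
```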

An analysis of the recognition errors (FP, FN) showed that they occur in the following cases.

(i) There are several destabilizing factors in images with the marking: low contrast, interference, and relief elements (Figs. 3f, 3g).

(ii) Background images are characterized by a developed substrate relief (grains, terraces, and cracks) or contain step noise at the edges of the images.

Table 2 confirms that erosion is a key operation in the image processing. If maximizing TP is the priority when searching for markings, then threshold filtering and illumination equalization by the CLAHE method serve as additional tools for improving the recognition quality. Threshold filtering can be recommended as a mandatory procedure if it has been established in advance that the surface in the marking area contains protruding elements (which this filtering removes). The CLAHE method is recommended when the background relief near the marking is homogeneous, in order to avoid excessive detection of segments in background elements. The positive effect of erosion and threshold filtering is demonstrated in Table 3, which contains the binary-threshold values for the images shown in Fig. 3.

Table 3. Binary thresholds for different techniques of image preprocessing

CONCLUSIONS

It was shown that the problem of binary classification of micromarking images can be solved using conventional processing methods implemented in open-source software (OpenCV). It was established that marking fragments can be adequately approximated by straight-line segments using the Hough transform. To improve the approximation and recognition quality, morphological erosion and threshold filtering can be used. It was proposed to use, as recognition criteria, the fraction of registered segments with specified mutual orientation angles or the local dominance of such segments in the distribution histogram. The adequacy of the proposed methods was confirmed for different types of images of the marking and background relief.