Open access

Shape and orientation classification of objects based on their electromagnetic signatures using convolutional neural networks


Published 13 March 2024 © 2024 The Author(s). Published by IOP Publishing Ltd
Special issue: New Trends in Electromagnetic Inverse Problems
Citation: Yasmina Zaky et al 2024 Inverse Problems 40 045027, DOI 10.1088/1361-6420/ad2ec9


Abstract

This study addresses the classification of objects using their electromagnetic signatures with convolutional neural networks (CNNs) trained on noiseless data. The singularity expansion method (SEM) was applied to establish a compact model that accurately represents the ultra-wideband scattered field (SF) of an object, independently of its orientation and observation angle. To perform the classification, we used a CNN associated with a noise-robust SEM technique to classify different objects based on their characteristic parameters. To validate this approach, we compared the performance of the classifier with and without SEM pre-processing of the SF for different noise levels and for object sizes not present in the training set. Moreover, we propose a procedure that determines the direction of the receiving antenna and orientation of an object based on the residues associated with each complex natural resonance. This classification procedure using pre-processed SEM data is promising and easy to train, especially when generalizing to object sizes not included in the training set.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Automatic target classification (ATC) using radar has mobilized considerable research effort in recent decades [1]. Developments in this area have led to the emergence of several new techniques capable of meeting this need. As a result, artificial intelligence (AI) and the study of its various techniques have become popular among researchers in many fields. Algorithms range from simple models, such as support vector machines (SVMs), decision trees (DTs), K-nearest neighbors (KNN), and naive Bayes (NB), to more complex models, such as artificial neural networks (ANNs) [2].

The type of radar data to be used for target classification is a crucial choice that directly impacts classifier performance. It can be divided into two main categories: pre-processed and raw data. In the first category, pre-processing has many uses, one of which is to perform identification and classification in a space different from that of the measurement. Several feature extraction techniques can enhance classification by allowing an object to be characterized by its scattered field (SF) [3–5]. One of the most mature and studied is certainly synthetic aperture radar (SAR) or inverse SAR (ISAR) [6, 7], which can be used to generate images for classification purposes. Classification can also be based on templates [8] or on geometric features extracted from images, but the former requires huge libraries and the latter a large amount of training data. In addition, both depend directly on image quality or feature extraction quality, i.e. on noise. Radar experts therefore consider that ATC applications are not yet commensurate with such a research investment and have identified three objectives: the first is to design a low-power real-time radar, the second is to reduce misjudgment in complex environments, and the third is to require limited training data to classify new targets [9].

To obtain less complex systems, target classification from raw data offers a number of advantages. In this category, we find radar cross section (RCS) responses or micro-Doppler measurements that can be used directly to classify different objects, moving targets, or human activities [10–12]. Recent works on object classification based on raw SAR measurements have even shown results that are only slightly inferior to those obtained with pre-processed data, but at much lower computational cost [13, 14]. For these reasons, we use the results of classifications based on raw data as a reference in this work.

However, all the methods presented above depend on the angle from which the target is illuminated, making the classifiers sensitive to changes in the object orientation or in the antenna position. Moreover, they require numerous observations, including spatial diversity, in order to classify and make decisions. Therefore, from a hardware perspective, it is also beneficial to implement a frugal, easy-to-train method that permits the recognition and classification of objects from a few parameters and a single measurement. The work presented in this article addresses these issues by using ultra-wideband SFs, which offer frequency diversity.

Natural resonances are parameters that allow modeling of the ultra-wide-band (UWB) SF of a target using the singularity expansion method (SEM) [15]. They provide an extremely compact model of the impulse response of an object illuminated by a UWB radar signal. These resonances are intrinsic to the object and independent of the incident and observation angles, making SEM a very interesting method in an operational context when the target position is not completely controlled. SEM has been largely explored for the characterization of objects, but rarely for classification owing to its sensitivity to noise [16–19]. The few studies in the literature that use SEM data to classify simple objects highlight not only the qualities of SEM data (excellent classification from a few data points regardless of the target's aspect angle) but also its sensitivity to noise, which affects the higher-order complex natural resonances (CNRs) [20, 21]. Therefore, their approach was limited to the first or second CNRs to build the dataset. Moreover, they either trained their classifiers at different signal-to-noise ratio (SNR) levels [20] or applied principal component analysis (PCA) before CNR extraction to address the intrinsic weakness of SEM in terms of noise [21]. These solutions either complicate the constitution of the dataset or increase the computation time before classification.

In this study, we propose another solution that overcomes the main drawback of SEM, its sensitivity to noise, while retaining its advantages. In fact, compressing the diffracted field into normalized data independent of the observation angle enables us to propose a frugal, easy-to-train method of object classification. The solutions provided concern both the constitution of the SEM dataset and the choice of the classification algorithm. Our first objective is to address the multi-class classification of objects with different geometries using convolutional neural networks (CNNs). The choice of CNN was made considering previous studies that highlighted its efficiency compared with other classifiers applied to real data (DT, multilayer perceptron (MLP), SVM) [22, 23]. Classifying an object's shape, regardless of its size, is an interesting task that has not yet been addressed using SEM data. Additionally, we consider the CNRs of an object as well as their associated residues, which depend on the aspect angle. For this reason, the residues have never been considered for object characterization and, to the best of our knowledge, have never been exploited. We propose a novel use of these residues that allows the direction of the receiving antenna and the orientation of an object to be determined. This is similar to determining the orientation of an object with respect to the antennas of a radar or measurement system.

The remainder of this paper is organized as follows. In section 2, we construct datasets based on raw SF and SEM data, and then present the results of object classification using CNNs to compare the classification performances with and without SEM pre-processing. Additionally, we demonstrate how the algorithms generalize to object sizes and noisy data that are not included in the training sets. In section 3, the residues are explored to classify the observation angle and to determine the orientation of an object with respect to the location of the receiving antenna. Finally, we conclude the study in section 4.

2. Classification of objects

2.1. CNN as a good trade-off between accuracy and complexity

The field of machine learning continues its rapid and massive evolution, with new variants of existing network models regularly proposed. New families of networks appear less frequently, but the trend is clearly for networks to increase in complexity, whatever the field of application and the type of data considered.

As studies [24, 25] have shown, this significant increase in complexity does not bring a proportional gain in performance. The most classical networks, together with the underlying machine-learning mechanisms based on the optimization power of their associated frameworks, provide in most cases very satisfactory results for a reasonable computational cost and memory footprint.

It is worth noting that the larger the network, the greater the amount of data required to train it, and such data are not always available for scientific problems. In other words, the size of the network must be adapted to the size of the data generated, which here is limited to 12 objects multiplied by the number of observation angles (36 by 73). Increasing the number of observations means increasing simulation times, which are already substantial, and is therefore of no real interest. The reduction of the angular step to 2 degrees was already dictated by the need to generate sufficient data per object for our simple CNN architecture in the angular classification part of this work. More complex networks would thus have required a significant increase in the size of the databases.

More recent network architectures such as ResNet [26], U-Net [27], or transformers [28] have often proved indispensable for solving complex ML problems such as object detection, natural language processing (NLP), or multimodal fusion. But on simpler classification problems, such as the one considered in this paper, their use is often unjustified, as they typically gain only the last few performance points at the cost of a significant multiplication in complexity. This is illustrated by the example of image classification, which serves as a benchmark for all the models in the literature.

As an example, ConvMLP-S is a hierarchical convolutional MLP: a light-weight, stage-wise co-design of convolution layers and MLPs. ConvMLP achieves 76.8% top-1 accuracy on ImageNet-1k with 9 M parameters and 2.4 G MACs. In comparison, a more recent model such as ViT-e reaches 90.9%, but at the cost of 3900 M parameters and 5777 GFLOPs. Even ResNet-152, with 152 layers and 11.3 billion FLOPs, only reaches 78.5%. On a less complex dataset such as CIFAR-10, ConvMLP already obtains 98.6%, whereas the ViT models obtain between 99% and 99.5%.

The cost/performance trade-off offered by CNNs is thus perfectly suited to the nature of the data and the objectives of our study. Indeed, as we show in sections 2.6 and 3.1, a convolutional model is already capable of achieving over 90% accuracy on our inverse problem, which does not justify increasing model complexity by several orders of magnitude. This is especially true since one of our long-term objectives is full integration of the SEM data capture and processing system on an embedded target, in order to create an autonomous and portable detection system.

For all these reasons, we adopted in this work the LeNet-5 architecture to train both SF datasets [29]. The filter size was changed to be applied to the 1-D input signals. For the SEM data, we applied one convolution layer with 6 filters, followed by one hidden layer with 32 neurons, and a ReLU activation function applied in all layers.

2.2. Input data

The aim of this study is to classify objects based on a single observation of the SF over an ultra wide frequency range (i.e. ultra wide band (UWB) illumination). The first step is to generate a SF database of all the categories of objects to be classified. The UWB SF from 12 objects with different geometries or materials was computed in the far-field region from 10 MHz to 5 GHz. They were illuminated using a plane wave, as shown in figure 1, where the incident wave vector $\vec{k}_{\textrm{inc}}$ is defined as follows:

Equation (1)

Figure 1. Thin wire along the z axis illuminated using a plane wave with incident wave vector $\vec{k}_{\textrm{inc}}$.

First, we have five spheres with different permittivity εr and conductivity σ, for which the SF is computed analytically using a Mie series algorithm implemented in Matlab [30]. The spheres were illuminated using an x-polarized incident plane wave propagating along the z axis. The SF is recovered in a bistatic configuration, where θ varies from 0° to 180° in 5° steps and φ from 0° to 90° in 10° steps. Second, we have seven different perfect electric conductor (PEC) objects, for which the simulations are carried out in a monostatic configuration using the time-domain solver of CST Microwave Studio, a 3D electromagnetic simulation software [31]. Hence, they were illuminated using plane waves with multiple incident directions, chosen according to the symmetry planes of each object.

The 12 objects are enumerated as class 0 to 11 as follows:

  • Class 0: PEC sphere
  • Class 1: lossy sphere εr = 4 and σ = 0.5
  • Class 2: lossless sphere εr = 2
  • Class 3: lossless sphere εr = 4
  • Class 4: lossless sphere εr = 9
  • Class 5: PEC ring
  • Class 6: thin wire
  • Class 7: thick cylinder
  • Class 8: ovoid
  • Class 9: cube
  • Class 10: rectangular solid
  • Class 11: equilateral pyramid

In this manner, we have at our disposal raw UWB SF data for these 12 objects. We then apply the SEM to pre-process the raw data and extract the characteristic parameters of each object from its UWB SF.

2.3. SEM

Pre-processing of SF allows the estimation of CNRs and their associated residues of an object from its complex frequency response. Vector fitting (VF) is a widely used method for fitting a measured or simulated frequency-domain response H(s) using the following rational function [32]:

Equation (2):

$$H(s) \simeq \sum_{m = 1}^{M} \frac{R_{m}}{s - a_{m}} + d + s\,e$$

where M is the model order, $s = j\omega$, am are the poles, and Rm are the residues; d and e are optional real constants [32]. The poles have the form $a_{m} = -\sigma_{m} + j\omega_{m}$, where σm is the damping factor and ωm is the natural pulsation of the mth singularity. The VF method was selected to determine the CNRs of an object because it shows better noise robustness and accuracy than other SEM techniques (e.g. the TLS Cauchy method) [33].

The first step of the VF algorithm is to estimate the CNRs by assuming a set of starting complex poles uniformly distributed over the frequency range. Subsequently, through successive iterations, the algorithm relocates them to the actual poles [34, 35]. If the data are noise-free, two iterations are sufficient to obtain accurate CNRs when the number of starting poles is greater than M. Further iterations are required for convergence in the presence of noise; for example, at 10 dB SNR, we applied ten iterations of the VF algorithm. Once the CNRs have been determined, the residues can be computed by solving the corresponding least-squares (LS) problem.
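As an illustration of this last step: once the CNRs are fixed, equation (2) is linear in the residues, so they follow from a single least-squares solve. The numpy sketch below uses made-up pole and residue values and is not the full VF algorithm (which also iterates on the poles).

```python
import numpy as np

# Synthetic example: a response built from two known complex poles.
poles = np.array([-0.1 + 1.0j, -0.3 + 2.5j])          # a_m = -sigma_m + j*omega_m
true_residues = np.array([1.0 + 0.5j, 0.4 - 0.2j])

omega = np.linspace(0.1, 5.0, 200)
s = 1j * omega
H = sum(r / (s - a) for r, a in zip(true_residues, poles))

# Given the poles, H(s) = sum_m R_m / (s - a_m) is linear in the R_m,
# so the residues follow from one least-squares solve.
A = 1.0 / (s[:, None] - poles[None, :])
residues, *_ = np.linalg.lstsq(A, H, rcond=None)
```

In practice the poles occur in complex-conjugate pairs, and the optional constants d and e of equation (2) simply add further columns to the system matrix.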

We also computed the quality factor (Q-factor) associated with each pole [16]. This is computed as follows:

Equation (3):

$$Q_{m} = \frac{\omega_{m}}{2\,\sigma_{m}}$$

The Q-factor allows the description of an object from its CNRs independently of its size. Strong resonating objects, such as thin wires, exhibit resonances with a high Q-factor, whereas weak resonating objects, such as PEC spheres, are characterized by a low Q-factor.
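For instance, assuming the usual definition $Q_m = \omega_m / (2\sigma_m)$, the Q-factors of a strongly and a weakly damped pole sharing the same natural pulsation can be compared directly (a small numpy sketch with illustrative pole values):

```python
import numpy as np

# Poles a_m = -sigma_m + j*omega_m; a strong and a weak resonance
# at the same natural pulsation (made-up values).
poles = np.array([-0.05 + 3.0j, -1.0 + 3.0j])

sigma = -poles.real        # damping factors
omega = poles.imag         # natural pulsations
Q = omega / (2.0 * sigma)  # Q-factor of each pole
```

The first pole (small damping) yields a high Q-factor, as for a thin wire; the second (large damping) yields a low Q-factor, as for a PEC sphere.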

2.4. CNN

CNNs are among the most commonly used ANN algorithms. They are composed of multiple layers of different types, the most common being convolution, pooling, and fully connected layers. The convolution layer is composed of kernels, or filters, of specific length and width that slide along the input data to extract specific features from an input vector [36].
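The sliding-filter behavior of a convolution layer can be illustrated in a few lines of numpy (a toy sketch with a hand-picked kernel; in a CNN the kernel weights are learned):

```python
import numpy as np

def conv1d_valid(x, kernel):
    """Slide a 1-D kernel along the input and return the valid-mode feature map."""
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

x = np.array([0.0, 1.0, 2.0, 1.0, 0.0])
edge_kernel = np.array([1.0, -1.0])       # responds to the local slope of the signal
features = conv1d_valid(x, edge_kernel)   # -> [-1., -1., 1., 1.]
```

Stacking several such feature maps, one per filter, is what a convolution layer with multiple filters (e.g. the 6 filters used for the SEM data) produces.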

The pooling layer is a sub-sampling layer used to reduce the size of the features computed by the convolution layer. Here, we used maximum pooling, which outputs the maximum input value. Fully connected layers process the output features of the last convolution or pooling layer. They are composed of neurons connected to all the neurons in the previous layer, and they apply a non-linear activation function to the weighted sum of their inputs [36]. Neuron j in layer m + 1 performs the calculation indicated in equation (4):

Equation (4):

$$y_{j}^{m+1} = f\left(\sum_{i = 1}^{N_{m}} w_{ij}\, y_{i}^{m}\right)$$

where $y_{j}^{m}$ is the output of neuron j in layer m, $N_{m}$ is the number of neurons in layer m, $w_{ij}$ is the weight connecting neuron i of layer m to neuron j of layer m + 1, and $f(.)$ is a non-linear activation function. In this study, we use the rectified linear unit (ReLU) function because of its computational efficiency, which enables the network to converge quickly and helps prevent over-fitting [37].
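A fully connected layer as in equation (4) then reduces to a matrix-vector product followed by the activation (a toy numpy sketch with made-up weights; bias terms are omitted, as in equation (4)):

```python
import numpy as np

def relu(x):
    """Rectified linear unit, f(x) = max(x, 0)."""
    return np.maximum(x, 0.0)

def dense_layer(y_prev, W):
    """Fully connected layer: apply f to the weighted sums of the previous layer's outputs."""
    return relu(W @ y_prev)

y_prev = np.array([1.0, -2.0, 0.5])      # outputs of the 3 neurons of layer m
W = np.array([[0.2, 0.1, 0.4],           # one row of weights per neuron of layer m + 1
              [-0.3, 0.5, 0.1]])
out = dense_layer(y_prev, W)             # outputs of the 2 neurons of layer m + 1
```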

2.5. Dataset construction

To classify an object, we can directly use raw SF data or pre-processed SEM data, as shown in figure 2. We evaluated these approaches to determine the most efficient data type for object classification.

Figure 2. Flowchart representing three approaches for object classification using CNN.

2.5.1. SF dataset.

The first dataset is constructed using the amplitude of the SF response in the frequency domain. We intentionally conducted a preliminary study on real data to allow a direct comparison between different classifiers, some of which do not support complex numbers. Moreover, some datasets (i.e. time-domain datasets) are real; thus, we decided to include only real numbers in the input vector to conduct our comparative study. In this preliminary study, we noticed that using the phase alone had the same performance as using the amplitude, whereas adding the phase along with the amplitude was not relevant, as it doubled the vector's size while maintaining the same performance. Hence, the input vector is a 1-D signal with 500 frequency points, and is composed of two channels. The first and second channels represent the amplitudes of Eθ and Eφ components of the SF, respectively.

The second dataset represents the SF in the time domain. The transient impulse response was obtained by computing the inverse fast Fourier transform (IFFT) of the complex frequency-domain SF response. Here, we include only the first 10 ns of each signal (100 time points), as the signal has essentially decayed beyond this point. The input vector is likewise a 1-D signal of 100 points composed of two channels, one for each polarization.
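This step amounts to an inverse FFT followed by truncation. A numpy sketch on a toy frequency response (the response here is a made-up pure 1 ns delay; the 0.1 ns time step follows from the 5 GHz bandwidth):

```python
import numpy as np

# Complex frequency response sampled at 500 points from 10 MHz to 5 GHz
# (toy data: a pure 1 ns propagation delay).
f = np.linspace(10e6, 5e9, 500)
H = np.exp(-2j * np.pi * f * 1e-9)

# Inverse real FFT gives a real impulse response with a time step of
# roughly 1 / (2 * 5 GHz) = 0.1 ns; the first 100 samples then span 10 ns.
h = np.fft.irfft(H, n=1000)
h_first_10ns = h[:100]
```

For the toy delay above, the energy of the truncated response peaks near the 1 ns mark, i.e. around the tenth time sample.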

In these two cases, we simulated multiple object sizes so that an object can be recognized regardless of its size. Therefore, we simulated dimensions of 3 cm, 6 cm, 9 cm, 12 cm, and 15 cm. The size of an object is defined by its largest dimension (diameter for a sphere, length for a cylinder, etc).

2.5.2. SEM dataset.

For the third dataset, we used the pre-processed data provided by VF. We retain the first five CNRs of each object because the higher-order poles are the most affected by noise [33]. We then included the following parameters in the dataset.

  • Resonant frequencies were normalized with respect to the fundamental frequency. By normalizing, we always obtain the same values regardless of the size of the object.
  • Q-factor was used instead of the commonly used damping factor. This parameter is an important representation of an object, as it is independent of its size, which will help in generalizing the classification to objects of any size.
  • Residues, and more specifically their distribution in frequency, contain additional information about the objects and make training easier, with fewer neurons. Hence, the classification accuracy was more stable than that obtained with a dataset constructed without residues.

Thus, the input vector is composed of four channels: normalized natural frequencies (NRFs), Q-factors, and the residues (amplitude) of the θ and φ components; each channel has a length of 5. In this manner, the SEM dataset was easily constructed from a single object size of 15 cm.

In addition, to anticipate the case of objects having fewer than five resonances in the frequency range, we complemented the missing data with zeros to preserve a channel length of 5. This ensures that predictions can be made despite the missing values in the row. Hence, we have sparse SEM data in this dataset.
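The resulting four-channel input can be assembled as follows (a sketch; the function name and numerical values are made up for illustration):

```python
import numpy as np

def sem_features(freqs, q_factors, res_theta, res_phi, n=5):
    """Build the 4-channel SEM input: NRFs, Q-factors, |residues| (theta, phi),
    zero-padded so that every channel has length n."""
    def pad(v):
        out = np.zeros(n)
        out[:len(v)] = v[:n]
        return out
    nrf = np.asarray(freqs) / freqs[0]          # normalize by the fundamental frequency
    chans = [nrf, q_factors, np.abs(res_theta), np.abs(res_phi)]
    return np.stack([pad(c) for c in chans])    # shape (4, n)

# Object with only 3 resonances in band -> channels are padded with zeros.
x = sem_features([0.5e9, 1.1e9, 1.8e9], [12.0, 8.0, 5.0],
                 [0.9 + 0.1j, 0.4, 0.2], [0.3, 0.2, 0.1])
```

The zero padding reproduces the sparse rows described above, so the same input shape is kept even when fewer than five CNRs fall in the frequency band.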

The following abbreviations were used to refer to the datasets: FD, TD, and SEM which represent frequency domain, time domain, and SEM data, respectively.

2.6. Results & discussion

Each dataset was divided into 80% for training and 20% for validation (tables 1 and 2). The held-out 20% was composed of random observation angles not present in the training set, with an equal number of samples per class. The results were averaged over 10 runs. We then tested noisy data, i.e. data affected by additive white Gaussian noise (AWGN) at different SNR levels. Next, we present the results on the generalization ability of the CNN for noisy data from unseen objects (i.e. unseen sizes). The object sizes were either larger than those included in the training datasets (>15 cm) or smaller (fewer than 5 resonances for the SEM data), as reported in table 3. Indeed, decreasing the object size shifts the first five natural frequencies upward within the frequency band.

Table 1. Training phase conditions.

Training  SNR            Object size simulation           Antenna position wrt object
TD        Without noise  5 dimensions (3 to 15 cm)/class  80% of θ [0:5:180], φ [0:5:360]
FD        Without noise  5 dimensions (3 to 15 cm)/class  80% of θ [0:5:180], φ [0:5:360]
SEM       Without noise  1 dimension (15 cm)/class        80% of θ [0:5:180], φ [0:5:360]

Table 2. Validation phase conditions.

Validation  SNR            Object size simulation           Antenna position wrt object
TD          Without noise  5 dimensions (3 to 15 cm)/class  20% of θ [0:5:180], φ [0:5:360]
FD          Without noise  5 dimensions (3 to 15 cm)/class  20% of θ [0:5:180], φ [0:5:360]
SEM         Without noise  1 dimension (15 cm)/class        20% of θ [0:5:180], φ [0:5:360]

Table 3. Test phase conditions.

Test  SNR          Object size simulation               Antenna position wrt object
TD    10 to 65 dB  8 new dimensions (2 to 30 cm)/class  Average over 20 random antenna positions
FD    10 to 65 dB  8 new dimensions (2 to 30 cm)/class  Average over 20 random antenna positions
SEM   10 to 65 dB  8 new dimensions (2 to 30 cm)/class  Average over 20 random antenna positions

2.6.1. Training phase.

As described in section 2.1, we adopted the LeNet-5 architecture for both SF datasets [29], with the filter size adapted to the 1-D input signals, and a smaller network (one convolution layer with 6 filters followed by one hidden layer of 32 neurons, with ReLU activations) for the SEM data.

Training was performed over 200 epochs for the SEM data and 800 epochs for the SF data, with a batch size of 100. The output layer was composed of 12 neurons using the softmax activation function to compute the output probabilities. The learning rate was updated using the Adam optimizer [38].

Figure 3 shows the training and validation accuracies as a function of the number of epochs for the three datasets. The training curve converges quickly to an accuracy of 100% for the SEM data, from 125 epochs, while exhibiting far fewer fluctuations than both SF data curves, because the SEM data are easier to learn.

Figure 3. Training and validation accuracy of the CNN classifiers vs number of epochs when learning with: (a) FD data, (b) TD data and (c) SEM data. 800 epochs are used with FD and TD data, while 200 epochs are used with SEM data.

Table 4 lists the execution times recorded during the CNN training phase. The training runtime using the SEM data was much faster than that using raw data. The SEM pre-processing using VF accounts for about 1 s when applied to all observation angles of an object. Even when the VF time is included in the overall procedure, the computation time therefore remains favorable to the SEM data.

Table 4. Time consumption of the CNN trained using the different datasets.

Database  FD   TD   SEM
Time (s)  950  658  84

2.6.2. Test phase.

First, we present the mean accuracy (% of samples classified correctly) obtained when testing the remaining 20% of each dataset. The CNNs trained using FD and TD data did not classify all classes accurately, and their mean accuracy did not exceed 94%. In contrast, when trained with SEM data, the accuracy reached 99.8%, with only some samples of class 1 (lossy sphere) misclassified as class 11 (pyramid), as seen in the confusion matrix in figure 4.

Figure 4. Normalized confusion matrix of the CNN classifier when testing the 20% remaining samples of the SEM dataset.

In the following subsections, we kept the datasets unchanged.

2.6.3. Noisy data.

Now, we test the classifiers' ability to handle noisy data. Several AWGN levels were added to the SF responses, with SNR values varying from 65 dB, one of the highest values obtainable in practice, down to 10 dB, corresponding to an unfavorable noise condition. First, we test noisy data on an object size already seen during training (15 cm), for which the first five resonant frequencies are present. VF was applied to extract the CNRs and residues directly from the noisy signals.
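AWGN at a prescribed SNR can be generated by scaling the noise power to the mean signal power (a generic sketch, not tied to the solver outputs; the function name is ours):

```python
import numpy as np

def add_awgn(signal, snr_db, rng=np.random.default_rng(0)):
    """Add white Gaussian noise so that 10*log10(P_signal / P_noise) = snr_db."""
    p_signal = np.mean(np.abs(signal) ** 2)
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(scale=np.sqrt(p_noise), size=signal.shape)
    return signal + noise

t = np.linspace(0.0, 1.0, 10_000)
clean = np.sin(2 * np.pi * 5 * t)     # toy signal standing in for an SF response
noisy = add_awgn(clean, snr_db=10.0)
```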

The results in figure 5 indicate that the CNN trained using normalized SEM data performed better than those trained on SF data, with a gain of approximately 20% for SEM at high noise levels (10 dB SNR). The CNN can classify noisy responses processed with SEM because it takes advantage of the extra information provided by the original structure of the SEM dataset proposed in this study (sparse data, Q-factors, and residues). It can easily identify the first NRF and its associated Q-factor and residues, which are barely affected by noise.

Figure 5. Accuracy (%) of noisy responses of objects with dimensions included in the training phase, for various SNR levels, when using the CNN classifier.

These results show that the first objective of this study, high robustness to noise, is achieved by the classification based on SEM data.

2.6.4. Generalization to noisy data from unseen objects.

The larger dimension was chosen to be 30 cm, while the smaller one was chosen for each object so that it is smaller than those included in the constructed dataset (but with at least one CNR in the frequency band). The results in figures 6 and 7 confirm that, in contrast to the SF data, the SEM data allow the object classes to be distinguished correctly from their noisy responses regardless of size. Moreover, the SEM dataset only requires knowledge of the SF for one size of each object, in contrast to the FD and TD datasets, which comprise five sizes per object. This is due to the use of Q-factors and NRFs, which make it possible to accurately distinguish objects of sizes not included in the training set. In addition, the sparse SEM data maintain high classification rates, which can reach 80% even with a single CNR and noisy data.

Figure 6. Accuracy (%) of noisy responses of 30 cm objects at various SNRs when using the CNN classifier with the different datasets.

Figure 7. Accuracy (%) of noisy responses of smaller objects at various SNRs when using the CNN classifier with the different datasets.

These results prove that a CNN trained using SEM data can efficiently distinguish the classes of different objects of any size from the knowledge of a single object size, even at low SNR levels.

3. Identification of object orientation

After identifying the class of an object in section 2, it is interesting to determine which part of the object is illuminated by the incident wave. The residues are related to the SF response and vary with the observation and incident directions. Hence, they can be used to detect the location of the observer (i.e. the receiving antenna) and the orientation of the object through the orientation of the EM field (i.e. the polarization of the field).

To determine the direction and orientation of the antenna in the coordinate system of the object, two steps were followed.

  • First, we identified the angular sector containing the direction of the wave vector of the SF (the green surface in figure 8). For a spherical object, the position of the receiving antenna is determined in a bistatic configuration (it makes no sense in a mono-static scenario). In the case of a non-spherical object, measurements are performed in a mono-static scenario, where the SF varies with the incident direction of the illuminating wave (i.e. the position of the Rx/Tx antenna).
  • Second, once the angular sector is identified, we define a plane perpendicular to the incident wave vector to represent the rotation of the incident field polarization in this plane. This is similar to rotating the object with respect to the direction of the incident wave. The angle α is used to define the orientation of the electric field of the incident wave.

Figure 8. Example of the angular sector and rotation of the antenna (angle α) relative to the wire.

Therefore, by identifying the angular sector and the angle α, for example for the wire shown in figure 8, we can determine the orientation of the antenna in the coordinate system of the object, or vice versa.

3.1. Detection of the angular sector

To identify the angular sector, the space surrounding each object is split into multiple angular sectors in order to locate the receiving antenna. This splitting is performed according to the symmetries of each object's geometry. For example, a thin wire oriented along the z-axis has a symmetry plane in the azimuth plane; therefore, we only consider the upper half-space, as shown in figure 9. Additionally, it has rotational symmetry about the z-axis; therefore, the mono-static SF does not vary in the azimuth plane.

Figure 9. Various possible positions of the antenna relative to the thin wire, given its symmetries.

Using a CNN, we determined the angular sector containing the scattered wave vector, which is equivalent to determining the location of the receiving antenna. We present the results obtained for two specific classes of objects: a thin wire, which is a strongly resonating smooth object, and a rectangular solid, which is a weakly resonating object with edges.

3.1.1. Dataset construction.

For each object, a separate dataset was created because each has its own geometric symmetries. The SFs simulated in section 2.2 are used to construct the datasets. To take into account the polarization of the incident wave, we include data derived from various rotation angles α varying from 0° to 90° in 10° steps. This diversifies the polarization of the incident wave and, consequently, the SF. Thus, we can detect the angular sector for various rotations of the object or of the incident wave.

First, for the FD and TD datasets, the lengths of the input vectors were the same as in section 2.5.1. Thus, for FD data we include the amplitudes of both Eθ and Eφ field components over 500 frequency points, whereas for TD data we include the first 10 ns of the signal for both field components.

Second, for the SEM dataset, the input vector has a length of 10 and comprises two channels representing the amplitudes of the residues related to the two field components, Eθ and Eφ, respectively. The natural frequencies and Q-factors are eliminated because they do not vary with the location of the antennas or the polarization of the field. Additionally, the dataset was completed with sparse data related to objects of small size, for the same reasons as those mentioned in section 2.5.2. In fact, using residues instead of raw SF data has some merits.

  • First, there are significantly fewer data points in the input vector. The residue dataset is 100 and 20 times smaller than the SF datasets in the frequency and time domains, respectively.
  • Second, the amplitude of the residues is independent of the size of the object, which aids in the generalization process.

Indeed, the natural frequencies at which the residues are derived are inversely proportional to the size of the object; consequently, for a given object, the residues are independent of its size because they are always computed at the same electrical length. They are therefore unique to each object shape while remaining informative about its orientation.
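This scale invariance can be illustrated with the classical half-wavelength approximation for a thin wire (a textbook result, not taken from the paper): the resonant frequencies scale as c/(2L), so the product f·L, i.e. the electrical length at which the residues are computed, is the same for every wire size. A minimal sketch:

```python
# Sketch: why SEM parameters generalize across object sizes. For a thin
# wire of length L, the natural resonances fall near f_n ~ n*c/(2L)
# (half-wavelength resonance and its harmonics), so the product f_n * L,
# the "electrical length", is size-independent.
c = 299_792_458.0  # speed of light (m/s)

def thin_wire_resonances(length_m, n_modes=5):
    """Approximate resonant frequencies (Hz) of a thin straight wire."""
    return [n * c / (2.0 * length_m) for n in range(1, n_modes + 1)]

for L in (0.05, 0.15, 0.30):  # 5 cm, 15 cm, 30 cm wires
    f1 = thin_wire_resonances(L)[0]
    print(f"L = {L*100:4.0f} cm  ->  f1 = {f1/1e9:5.2f} GHz,  f1*L = {f1*L:.4e}")
```

The printed product f1·L is identical for all three lengths, which is exactly why residues computed at these poles carry no size information.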

The space around the thin wire is divided into nine sectors, yielding nine classes:

  • sector 0: 2° $\unicode{x2A7D}\theta\unicode{x2A7D}$ 10°;
  • sector 1: 12° $\unicode{x2A7D}\theta\unicode{x2A7D}$ 20°;
  • sector 2: 22° $\unicode{x2A7D}\theta\unicode{x2A7D}$ 30°;
  • sector 3: 32° $\unicode{x2A7D}\theta\unicode{x2A7D}$ 40°;
  • sector 4: 42° $\unicode{x2A7D}\theta\unicode{x2A7D}$ 50°;
  • sector 5: 52° $\unicode{x2A7D}\theta\unicode{x2A7D}$ 60°;
  • sector 6: 62° $\unicode{x2A7D}\theta\unicode{x2A7D}$ 70°;
  • sector 7: 72° $\unicode{x2A7D}\theta\unicode{x2A7D}$ 80°;
  • sector 8: 82° $\unicode{x2A7D}\theta\unicode{x2A7D}$ 90°.
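The nine-class labeling above can be encoded as a small lookup. `thin_wire_sector` is a hypothetical helper, not code from the paper; it assumes that angles falling in the 2° gaps between consecutive sectors are simply absent from the sampling grid:

```python
# Hypothetical helper (not from the paper): maps an elevation angle theta
# (degrees) to the thin-wire angular-sector label defined above.
# Sector k covers 10k+2 <= theta <= 10k+10 for k = 0..8; angles such as
# 11 deg fall between two sectors and map to None here.
def thin_wire_sector(theta_deg):
    for k in range(9):
        if 10 * k + 2 <= theta_deg <= 10 * k + 10:
            return k
    return None  # outside the sampled upper half-space or between sectors

print(thin_wire_sector(5), thin_wire_sector(45), thin_wire_sector(90))  # -> 0 4 8
```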

The rectangular solid is a polyhedral object with edges. We tested two angular sector configurations, as shown in figure 10. A preliminary study showed that the classification performance was lower with the configuration shown in figure 10(a). This is because some of the misclassified samples belonged to data related to the edge area of this object (sectors 2 to 5). However, when using the configuration shown in figure 10(b), where sectors 4 and 5 cover the edge area, the classification accuracy within this area increased by 8%. Therefore, we present the study in which this object is divided into ten sectors, as shown in figure 10(b):

  • sector 0: 0° $\unicode{x2A7D}\phi\unicode{x2A7D}$ 21°; 0° $\unicode{x2A7D}\theta\unicode{x2A7D}$ 21°
  • sector 2: 0° $\unicode{x2A7D}\phi\unicode{x2A7D}$ 21°; 24° $\unicode{x2A7D}\theta\unicode{x2A7D}$ 33°
  • sector 4: 0° $\unicode{x2A7D}\phi\unicode{x2A7D}$ 21°; 36° $\unicode{x2A7D}\theta\unicode{x2A7D}$ 51°
  • sector 6: 0° $\unicode{x2A7D}\phi\unicode{x2A7D}$ 21°; 54° $\unicode{x2A7D}\theta\unicode{x2A7D}$ 69°
  • sector 8: 0° $\unicode{x2A7D}\phi\unicode{x2A7D}$ 21°; 72° $\unicode{x2A7D}\theta\unicode{x2A7D}$ 90°
  • sector 1: 24° $\unicode{x2A7D}\phi\unicode{x2A7D}$ 45°; 0° $\unicode{x2A7D}\theta\unicode{x2A7D}$ 21°
  • sector 3: 24° $\unicode{x2A7D}\phi\unicode{x2A7D}$ 45°; 24° $\unicode{x2A7D}\theta\unicode{x2A7D}$ 33°
  • sector 5: 24° $\unicode{x2A7D}\phi\unicode{x2A7D}$ 45°; 36° $\unicode{x2A7D}\theta\unicode{x2A7D}$ 51°
  • sector 7: 24° $\unicode{x2A7D}\phi\unicode{x2A7D}$ 45°; 54° $\unicode{x2A7D}\theta\unicode{x2A7D}$ 69°
  • sector 9: 24° $\unicode{x2A7D}\phi\unicode{x2A7D}$ 45°; 72° $\unicode{x2A7D}\theta\unicode{x2A7D}$ 90°.
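The ten-sector partition can likewise be encoded as a band lookup. `rect_solid_sector` is a hypothetical helper for illustration, not code from the paper; even sectors lie in the first φ band, odd sectors in the second, and the non-uniform θ bands place sectors 4 and 5 over the edge region:

```python
# Hypothetical helper (not from the paper): maps (phi, theta) in degrees to
# the ten sector labels of figure 10(b). Angles between two bands (e.g.
# phi = 22 deg) are assumed absent from the sampling grid and map to None.
THETA_BANDS = [(0, 21), (24, 33), (36, 51), (54, 69), (72, 90)]
PHI_BANDS = [(0, 21), (24, 45)]

def rect_solid_sector(phi_deg, theta_deg):
    col = next((j for j, (lo, hi) in enumerate(PHI_BANDS) if lo <= phi_deg <= hi), None)
    row = next((i for i, (lo, hi) in enumerate(THETA_BANDS) if lo <= theta_deg <= hi), None)
    if col is None or row is None:
        return None
    return 2 * row + col  # even sectors: first phi band; odd: second

print(rect_solid_sector(10, 40), rect_solid_sector(30, 80))  # -> 4 9
```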

Figure 10. Rectangular solid when partitioned into (a) eight sectors and (b) ten sectors.

Note that the step sizes in θ and φ are a trade-off between the field distribution (which is strongly linked to the shape of the object), the frequency, and the desired angular accuracy, and can be set according to these parameters.

3.1.2. Results & discussion.

We divided each dataset into 80% for training and 20% for testing, and the results of the angular sector classification were averaged over 10 runs. We then tested the generalization ability for noisy data and object sizes that were not included in the training sets.

3.1.2.1. Training phase.

The LeNet-5 architecture was adopted for the SF datasets, whereas for the residue dataset we used one convolution layer with 12 filters, followed by two hidden layers with 32 neurons each. The number of epochs was 256 for the residues and 512 for the SF data, with a batch size of 100. The output layer had 9 and 10 neurons for the thin wire and the rectangular solid, respectively.
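For concreteness, the residue-branch network can be sketched as a plain forward pass in numpy. The kernel size (3), 'valid' padding, ReLU activations, and random weights are assumptions for illustration only; the paper specifies only the layer counts and widths:

```python
# Dimension sketch (not the paper's implementation) of the small CNN used
# for the residue dataset: one convolution layer with 12 filters, two
# hidden dense layers of 32 neurons, and a 9-class output (thin wire).
# Kernel size 3 and 'valid' padding are assumptions.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 10))                  # |residues| of E_theta, E_phi

def conv1d(x, w):                             # w: (filters, in_ch, kernel)
    f, _, k = w.shape
    n = x.shape[1] - k + 1
    return np.array([[np.sum(x[:, i:i + k] * w[j]) for i in range(n)]
                     for j in range(f)])      # -> (filters, n)

w_conv = rng.normal(size=(12, 2, 3))
h = np.maximum(conv1d(x, w_conv), 0.0)        # ReLU, shape (12, 8)
h = h.reshape(-1)                             # flatten -> 96 features
for width in (32, 32):                        # two hidden dense layers
    h = np.maximum(rng.normal(size=(width, h.size)) @ h, 0.0)
logits = rng.normal(size=(9, 32)) @ h         # 9 sectors for the thin wire
probs = np.exp(logits - logits.max())
probs /= probs.sum()                          # softmax over the 9 classes
print(probs.shape)
```

The input vector is two orders of magnitude smaller than the FD input, which is why this network trains with far fewer epochs than LeNet-5 on the raw SF.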

3.1.2.2. Test phase.

For the thin wire, the accuracy obtained using residues as input data was 97.8%, whereas for both SF datasets we obtained 96%. These results show that the illuminated angular sector of a thin wire can be detected among the nine sectors using either residues or raw data.

For the rectangular solid, the accuracy obtained using residues was 90%, whereas for both SF datasets we obtained 95%. Raw data performed slightly better in this case, which might be due to the discontinuities in the shape of this object, making residue computation difficult. Additionally, a significant portion of the misclassified samples belonged to data with only one or two resonances in the frequency band.

The classification results for the observation angle using the SEM data and the raw data were good and almost identical during the test phase. At this stage, the main benefit of the SEM method is the simplicity of building the dataset, which requires only one object dimension in the learning phase and very few parameters, which reduces the computational cost.

3.1.2.3. Noisy data.

After the test phase, we evaluated noisy data for the 15 cm object size, whose noiseless response is already included in the training datasets. Multiple SNR levels were used to corrupt the data.

  • Thin wire: figure 11 shows the accuracy results for the three datasets. The accuracy obtained using residues or raw data is nearly identical, with only a 2% difference at 10 dB SNR. This is because the thin wire is a strongly resonating object, so the extraction of CNRs in a noisy environment is not very difficult; therefore, the computation of the residues remains quite precise even at low SNRs.
  • Rectangular solid: figure 12 shows the accuracy results for the three datasets and highlights a weakness of the proposed method: the residues are more sensitive to noise than the raw data because the input vector contains significantly fewer data points. However, most of the misclassified samples, when using residues, belong to data having only one or two resonances in the frequency band (small object sizes) and to data lying at the borders of each sector (small error).
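A common way to corrupt a record at a prescribed SNR is to add white Gaussian noise scaled to the signal power. The sketch below makes that assumption (the paper does not detail its noise model), and `add_awgn` is a hypothetical helper:

```python
# Sketch (assumption: additive white Gaussian noise, with SNR defined on
# mean signal power) of corrupting a noiseless scattered-field record at a
# prescribed SNR for the robustness tests.
import numpy as np

def add_awgn(signal, snr_db, rng=None):
    """Return signal plus white Gaussian noise at the requested SNR (dB)."""
    if rng is None:
        rng = np.random.default_rng()
    p_signal = np.mean(np.abs(signal) ** 2)
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(scale=np.sqrt(p_noise), size=signal.shape)
    return signal + noise

t = np.linspace(0.0, 10e-9, 2000)             # first 10 ns, as in the TD data
clean = np.exp(-t / 2e-9) * np.sin(2 * np.pi * 1e9 * t)  # toy damped sinusoid
noisy = add_awgn(clean, snr_db=10, rng=np.random.default_rng(1))
```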

Figure 11. Accuracy (%) of noisy data of the thin wire at different SNRs when using CNN.
Figure 12. Accuracy (%) of noisy data of the rectangular solid at different SNRs when using CNN.
3.1.2.4. Generalization to noisy data from unseen object sizes.

The large dimension chosen was 30 cm for both objects, and the small dimension, for which there is only a single resonance, was 5 cm for the thin wire and 3 cm for the rectangular solid.

  • Thin wire: figure 13 shows the confusion matrices for the larger and smaller thin wires at 10 dB SNR when using residues. The classification accuracy is high for the large object, reaching 80% at 10 dB SNR. For the small thin wire, the accuracy was lower (62%), which is expected because there is only a single CNR in the frequency band. With raw data, the accuracy was low and did not exceed 30%. These results are summarized in table 5.
  • Rectangular solid: the accuracy obtained at 10 dB SNR when using the residues of the 30 cm object is 62%. However, for the 3 cm object, the accuracy decreased to 52%, as it is more difficult to classify from the residues associated with the first CNR alone. In addition, with raw data (FD or TD), we obtained very low performance that did not exceed 30%. These results are summarized in table 6. The normalized confusion matrices showing the classification of each class using residues at 10 dB SNR are presented in figure 14. The confusion mostly arises between adjacent sectors.

Figure 13. Normalized confusion matrices of noisy data when identifying the angular sectors of the thin wire using residues for two different sizes (a) 30 cm and (b) 5 cm.
Figure 14. Normalized confusion matrices of noisy data when identifying the angular sectors of the rectangular solid using residues for two different sizes (a) 30 cm and (b) 3 cm.

Table 5. Comparison of classification accuracy for unseen thin wire sizes @10 dB SNR.

Size     | Included in database | 30 cm (not included) | 5 cm (not included)
Raw data | 85%                  | <30%                 | <30%
SEM data | 82%                  | 80%                  | 62%

Table 6. Comparison of classification accuracy for unseen rectangular solid sizes @10 dB SNR.

Size     | Included in database | 30 cm (not included) | 3 cm (not included)
Raw data | 76%                  | <30%                 | <30%
SEM data | 65%                  | 62%                  | 52%

Comparing the classification results for the two objects, we observe that the thin wire has better performance. This is due to the strongly resonating nature of this object, which makes the extraction of resonances and the computation of residues more accurate in a noisy environment than for a weakly resonating object. Both objects evaluated in this section were chosen because they are representative of all the other simulated objects, which can be divided into two categories: objects without sharp edges (thick cylinder, ovoid, etc), which perform similarly to the thin wire, and the remaining objects, which behave more like the rectangular solid, where the edges strongly scatter the field; this is detrimental in a monostatic configuration and makes the estimation of residues using the SEM more delicate.

The advantages of using residues are the compactness of the dataset and its ease of construction, because the residues are independent of the object size (i.e. the SF from a single object size suffices). Moreover, the generalization performance for other object sizes is excellent and largely surpasses that obtained from the SF, even when the latter is trained with many object sizes. Finally, the sparse dataset shows good robustness to noise, maintaining very high performance given the small number of parameters in the input vector (20 to 100 times more compact).

3.2. Calculation of object rotation

Once the receiving antenna is localized (i.e. the angular sector is identified), the next information to be retrieved is the polarization of the incident wave, which indicates the rotation of the antenna with respect to the object. This is equivalent to determining the object orientation in the antenna coordinate system. When the incident wave or the object is rotated, the scattered wave is decomposed into both the θ and φ field components. The angle α used to define the orientation of the linear polarization of the incident wave can be determined according to figure 8 as follows:

Equation (5)

Equation (6)

where $Res_{\theta_{\textrm{meas}}}$ and $Res_{\phi_{\textrm{meas}}}$ are the residues of the field components in the plane perpendicular to the direction of $\vec{k}_i$.

To estimate the angle α using equation (6), we considered only the residues associated with the first resonant frequency. For noisy signals, these residues are the least affected by noise, unlike the residues associated with higher-order poles.
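Although equation (6) is not reproduced here, for a linearly polarized incident wave the two measured residues scale as cos α and sin α, so α can be recovered from their ratio. The sketch below rests on that assumption; `estimate_alpha` is a hypothetical helper, not the paper's code:

```python
# Sketch of the orientation-angle estimate (assumption: equation (6)
# reduces to the ratio of the first-pole residues of the two field
# components, i.e. |Res_phi| / |Res_theta| = tan(alpha) for a linearly
# polarised incident wave).
import numpy as np

def estimate_alpha(res_theta, res_phi):
    """Polarisation rotation angle (degrees) from first-pole residues."""
    return np.degrees(np.arctan2(np.abs(res_phi), np.abs(res_theta)))

# Synthetic check: residues of an object rotated by 45 degrees.
R = 3.2e6                                      # arbitrary residue magnitude
alpha_true = 45.0
res_theta = R * np.cos(np.radians(alpha_true))
res_phi = R * np.sin(np.radians(alpha_true))
print(estimate_alpha(res_theta, res_phi))
```

Using `arctan2` of the magnitudes keeps the estimate well defined even when one component vanishes (α = 0° or 90°).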

As an example, we consider a more complex object formed by merging two PEC objects: a thick cylinder and a hemisphere. The cylinder has a length of 15 cm and the hemisphere has a radius of 7.5 cm (figure 15).

Figure 15. The simulated PEC cylinder with a hemispherical end oriented along z axis.

To verify whether α can be computed accurately to determine the orientation of this object, we rotate it by 45° around the y-axis and recover the SF at multiple incident angles, where θ varies from 0° to 180° in 5° steps. Additionally, we tested this approach with noiseless data and with noisy data at 30 dB and 10 dB SNR. VF was then applied to the raw data to extract the residues of the Eθ and Eφ field components. Equation (6) was used to compute the orientation angle of the object. For the noiseless case, we obtained the exact value, α = 45°, for all observation angles. When testing with noisy signals, we noticed that the noise affects both field polarizations, and a small error appears in the computed α, which increases as the SNR decreases. Figure 16 shows the estimated angle α for various observation angles and noise levels.

Figure 16. Estimated angle α computed for each observation angle and at two SNR values.

Thus, the orientation of the incident wave with respect to the object (or, conversely, of the object with respect to the wave) can be easily determined. Hence, we have developed a simple and complete scheme for determining the orientation of an object.

4. Conclusion

This study describes an efficient workflow for classifying the shape of an object and determining its orientation according to the antenna system. Three datasets were constructed based on the raw SF in the time and frequency domains and the proposed pre-processed SEM data. They were constructed using noiseless responses only. An original SEM dataset including sparse data was proposed and validated. This format improves the classification of small objects (i.e. those with few resonances in the frequency band).

The comparison between object classification based on SEM and raw data confirms that the proposed method allows classification from a single observation angle, while being efficient, less time consuming, and aspect-independent. The residues were integrated into the dataset to make the object classification more accurate. Moreover, the use of the Q-factor and normalized resonant frequencies allows the accurate distinction of objects of different sizes not included in the training dataset, even at low SNRs. This makes the construction of the SEM dataset easier and faster than that of raw data, as it is not necessary to include multiple object sizes. Thus, combining the CNN algorithm with pre-processed SEM data from the VF method can compensate for the noise sensitivity of the SEM method by surpassing the results obtained from FD or TD data, even at low SNRs and without prior training with noisy data.

Finally, we used the residues to identify the position of the receiving antenna and its orientation relative to the object. The position was obtained by dividing the space surrounding each object into multiple angular sectors. The classification results indicate that, even though the residues are affected by noise, the proposed approach offers excellent classification rates regardless of the size of the object. This generalization capability across object sizes when using residues facilitates dataset construction and offers interesting perspectives for radar applications. The orientation of each object with respect to the antenna system (the angle α) was then computed from the residues of both field components. Thus, we have presented a procedure that efficiently describes object orientation in free space from a few parameters, regardless of object size.

This study provides promising results for future object classification with UWB radar signals using techniques that are simpler, yet more reliable and faster, than those using raw data. It should be noted that the classification results using raw data could be enhanced by applying deeper neural network topologies. However, this would considerably increase the time necessary to train the network, which is already much higher than that needed for training with the SEM data.

Data availability statement

For the scattered field of spherical objects, we used the Mie series, which is a well-known method. For non-spherical objects, we used CST Microwave Studio. The organisation of the data is complex and making it reusable would require additional work; however, from the geometries of the objects in the paper, the scattered-field data can easily be recomputed with CST. The SEM processing of the raw data was performed using vector fitting, for which toolboxes are available for Matlab and Python. The data cannot be made publicly available upon publication because they are not available in a format that is sufficiently accessible or reusable by other researchers. The data that support the findings of this study are available upon reasonable request from the authors.
