Abstract

Earth observation systems rely heavily on sophisticated remote sensing satellites, which are an important means of obtaining global high-precision geospatial products and a strategic area of development for the world’s major scientific and technological powers. Although China’s satellites currently provide real-time or quasi-real-time observations with high temporal resolution, their positioning accuracy still lags well behind the world’s advanced level. This paper studies an efficient ground image processing technology and applies it to high-resolution satellite remote sensing images. The convolutional neural network (CNN) is an efficient deep learning method for image recognition and feature extraction. In this paper, a CNN is used to identify ground images, a support vector machine (SVM) is used to classify and summarize the images, and a Kalman filter is then used for noise reduction, so as to obtain high-resolution remote sensing images. In the experiment, 100 satellite remote sensing images from the GeoImageDB database were selected for the simulation test and divided into 5 types, and their recognition accuracy, classification accuracy, image signal-to-noise ratio (SNR), and resolution were analyzed. The results show that the CNN recognition accuracy across the image types ranges from about 85% to about 94%. The SVM classification accuracy is above 80%, with a maximum of about 95%. The SNR of the images after noise reduction is generally above 6.5, and in some cases above 8.0. The image resolution is generally above 800 ppi, with the highest reaching an ultra-high resolution of 1400 ppi. Overall, the processed images are of high quality. This shows that the proposed method, which uses a CNN for image recognition, an SVM for classification, and finally image denoising, is feasible and achieved good results in the experiments.

1. Introduction

As space launch technology, satellite platforms, sensors, and related technologies advance, the “four highs” of Earth observation satellites (high spatial resolution, high spectral resolution, high temporal resolution, and high positioning accuracy) have become prominent. High-resolution remote sensing satellites are widely used in global surveying and mapping, national defense surveillance, intelligence collection, accurate mapping, and other fields, and are important strategic, forward-looking infrastructure for countries. Building a high-resolution satellite remote sensing system involves high technology content, large capital investment, a long construction cycle, and a strong driving effect on industry. Such systems and their data resources have become a crucial pillar of a state’s economic, military, and social progress. However, the images currently obtained by China’s satellite remote sensing systems are not as clear as those of other countries, and the image quality still needs to be improved. It is therefore of great value to study ground image processing technology in this paper to improve the image quality of high-resolution satellite remote sensing systems.

Ground image processing technology is the key to improving the quality of satellite remote sensing images, and many scholars have carried out related research. To distinguish vegetation from the background, Wang A. created several color indices and classification strategies and summarized the development of weed identification using ground-based machine vision and image processing techniques, including color index-based, threshold-based, and learning-based classification methods [1]. Xuan designed an ISAL receive channel layout that combines an orthogonal short baseline in the inner field with an orthogonal long baseline in the outer field, which improves the focus of the two-dimensional image and yields high-precision three-dimensional imaging results [2]. Wang F. proposed an improved region-growing image processing method with modifications to the region-growing seeds and growing criteria obtained by the background subtraction method. This method can obtain the integral area of the cloud, which can be used to extract geometric parameters [3]. Demi developed and described a method for calculating ground-specific tire pressure using image processing theory [4]. However, most of these methods emphasize the theoretical basis and pay little attention to practical application and effect analysis, so further practice and exploration are needed.

To study satellite remote sensing images further, some scholars have conducted more in-depth work. Gong C., after reviewing several datasets and methods for scene classification from remote sensing images, proposed a large-scale dataset named NWPU-RESISC45. The dataset contains 31,500 images covering 45 scene classes, with 700 images per class [5]. Wang proposed a new kernel clustering algorithm to segment high-resolution remote sensing images and verified its effectiveness and reliability by comparing the experimental results with those of the mean-shift and watershed algorithms [6]. Wenjie proposed a technique for segmenting high-resolution remote sensing images that combines the RHMRF-FCM algorithm with static minimum spanning tree (MST) subdivision, taking shape information into account [7]. Jiang suggested a sealed approach where a length signal in the form is induced using both multispectral data [8]. Although these methods have achieved some results, their effect on improving the quality of remote sensing images is not obvious, so they need further improvement and innovation.

As science and technology continue to develop, satellite remote sensing images are required to be of higher quality, because high-quality images serve society better. The innovation of this paper is a method that uses a CNN for ground image recognition and combines it with an SVM for ground image classification, so that the images obtained from remote sensing are more stable. A Kalman filter is then used to denoise the images, improving image quality and yielding higher-resolution satellite remote sensing images.

2. Ground Image Processing Technology for Satellite Remote Sensing

2.1. Satellite Remote Sensing Technology

At present, China’s satellite development has achieved good results, but its satellite remote sensing technology still lags considerably behind that of developed countries [9]. As shown in Figure 1, the progression from China’s first artificial near-Earth satellite, Dong Fang Hong-1, to the moon-orbiting satellite Chang’e-1, and then to the space station and Mars probe is clear proof of technological improvement. Chinese satellite remote sensing technology has made some achievements, and the high-resolution satellites manufactured offer stable operation, efficient detection, and global coverage, but it is still challenging to meet the demands of large-scale mapping and high-precision reconstruction of ground objects. Taking the use of high-resolution remote sensing images for precision-guided weapon strikes in national defense as an example, it is necessary to solve not only the problem of “seeing clearly” but also the problem of “accurate positioning” [10, 11].

At present, high-resolution mapping satellites generally use the linear array push-broom imaging mode, as shown in Figure 2. The key to high-precision geometric positioning is to recover the imaging rays and orientation parameters at the imaging time and to establish a strict imaging model based on the collinearity of the image point, the projection center, and the ground point at that time. The core of this is to obtain accurate satellite orbit and attitude data at the time of imaging. High-precision geometric positioning of high-resolution satellite remote sensing images has long attracted the attention of aerospace photogrammetry scholars [12, 13]. Photogrammetric satellites with good resolution, at home and abroad, all use linear array cameras as imaging sensors. During imaging, push-broom scanning is performed line by line along the flight direction of the satellite, and each scan line is a central projection. That is, each scanned image line has a strict geometric imaging relationship with the ground and its own independent exterior orientation elements. However, when the satellite is actually in orbit, platform flutter, equipment aging, gravitational perturbation, temperature change, and other factors often introduce systematic errors into the position and attitude associated with the satellite image. It is difficult to describe the exterior orientation elements of each scan line with a standard, unified model, and even more difficult to process them with conventional frame-based aerial photogrammetry methods. Therefore, given the geometric characteristics of high-resolution remote sensing imaging systems, such as linear array push-broom scanning and line central projection, high-precision positioning models and methods have always been a hot and difficult topic in aerospace photogrammetry [14].
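The collinearity of image point, projection center, and ground point for a single push-broom scan line can be written compactly. The block below is the common textbook vector form, given only as a sketch; every symbol in it is an assumption introduced here for illustration and is not notation taken from this paper: $(X, Y, Z)$ is the ground point, $(X_S(t), Y_S(t), Z_S(t))$ is the projection center at the imaging time $t$ of the scan line, $R(t)$ is the rotation from the sensor frame to the ground frame at that time, $y$ is the across-track image coordinate, $f$ is the focal length, and $\mu$ is a scale factor.

```latex
\begin{equation*}
\begin{bmatrix} X - X_S(t) \\ Y - Y_S(t) \\ Z - Z_S(t) \end{bmatrix}
= \mu \, R(t)
\begin{bmatrix} 0 \\ y \\ -f \end{bmatrix}
\end{equation*}
```

Because the along-track image coordinate is zero within a line, each scan line carries its own position and attitude, which is why the orbit and attitude data at every line time matter for positioning.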

The bundle adjustment (beam method) commonly used in satellite remote sensing, which treats the interior and exterior orientation elements, model point coordinates, and self-calibration parameters as unknowns in an overall adjustment, is the most rigorous method and the theoretical basis for high-precision positioning of remote sensing images. It is an effective way to achieve a fast, accurate, and reliable solution. Linear array push-broom imaging on high-resolution remote sensing satellites is characterized by high orbits, highly dynamic imaging, narrow beams, a small field of view, and short exposure times. The exterior orientation parameters are strongly correlated. When the exterior orientation is characterized with classical Euler angles or quaternions, the adjustment algorithm often suffers from ill-conditioned equations, poor iterative efficiency, and low solution accuracy. In addition, self-calibration parameter polynomials and a large number of tie points are introduced in the bundle adjustment of linear array satellite images. There are many unknowns to be solved and the normal equations are huge, so fast, reliable, and stable solution methods are urgently needed [15].

Satellite remote sensing images are generally either hyperspectral images or high-resolution images, as shown in Figure 3. A hyperspectral image is formed from spectral information, which allows ground objects to be distinguished in more detail. High-resolution images refer to high-definition images with a resolution of 720p or higher. In high-resolution images, the human eye perceives the scene more intuitively and can acquire target information more quickly. The advantage of a high-resolution Earth observation system is that it can obtain global remote sensing data, but there are still underdeveloped areas of the world that remain unmapped. In difficult or unmanned areas such as oceans, deserts, and overseas regions, topographic surveying is often hard to carry out because ground features are lacking and personnel cannot conduct surveys on site. The most effective way to survey and map these difficult or unmapped areas is therefore photogrammetry without ground control points. When remote sensing satellite images are used to produce topographic maps at different scales, there are specific requirements for elevation accuracy and plane accuracy. Satellite engineering practice shows that once high-precision orbit determination of the satellite platform is achieved with a single-frequency or dual-frequency receiver system, plane positioning accuracy can be well guaranteed. Therefore, to realize topographic mapping without ground control points, the elevation positioning accuracy must be improved [16, 17].

The shift from independent observation by a single satellite on a large platform to networked detection by multiple satellites or constellations on small platforms has become a new trend in high-resolution Earth observation systems, meeting new application requirements for shorter revisit times, wider detection range, and faster response. The move from large satellite platforms towards general-purpose small satellite platforms is another development frontier. At an International Forum on Small Satellites previously held in Silicon Valley in the United States, well-known aerospace experts and scholars from various countries pointed out that, as the technology matures, small satellite platforms will progress from single professional applications to networked formations of several satellites, and then to large-scale cluster applications. In the future, small satellite platforms will achieve clustered shared launches [18].

2.2. Ground Image Processing

During joint observation by satellites in orbit, on the order of hundreds of images and auxiliary data sets are obtained every day. Such massive data requires coordinated satellite and ground processing, and developing new intelligent processing mechanisms has become an urgent problem for high-resolution Earth observation systems. Optimizing the allocation of satellite-to-ground computing resources according to users’ task requirements, such as real-time imaging, high-precision positioning, and change monitoring, enables intelligent processing of satellite-to-ground data and quasi-real-time, high-precision, and highly reliable information services. This is a crucial strategy for raising the usability of Earth observation systems, so it is necessary to study the efficient processing of high-resolution satellite images under limited resources. To meet the requirements of future joint processing technology for Earth observation, ground image processing technology must be expanded and supplemented [19, 20].

The processing flow for ground images is shown in Figure 4. First, the data is preprocessed and transferred to remote sensing image management, and features are extracted from the images. Then, a classification model is used to classify and aggregate them into cloud images, which are filtered and denoised to generate high-resolution images. At present, China is constrained by the level of hardware such as star sensors, and its attitude determination technology is relatively backward. Even after in-orbit calibration of the camera parameters, the positioning accuracy without ground control points is insufficient and lags behind that of foreign satellites, which makes it difficult to meet current needs for high-precision positioning and mapping with satellite images. Without accurate satellite remote sensing images, accurate three-dimensional positioning cannot be provided, and bundle (beam method) adjustment theory with ground control points cannot be applied [21].

For image processing, a CNN can be used to extract features from ground images, and various classification models can then be used to classify and summarize the images. The similarity of grayscale and spatial information between regional elements in the image can also be used to measure the correlation of ground objects. In satellite remote sensing images, grayscale changes between pixels are gentle, closely spaced pixels are more similar, and the image as a whole exhibits a certain spatial correlation, whereas different types of ground objects differ considerably and are weakly correlated. The correlation can therefore also be used to distinguish clouds from ground objects. The inverse moment is a feature quantity that describes the same or similar local textures in the image. The basic principle of the CNN image feature extraction used in this paper is shown in Figure 5. A convolution step extracts a feature map from the original image, and the feature map is then pooled and passed through fully connected layers to obtain the final extracted features.
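The convolution, pooling, and fully connected stages described above can be illustrated with a toy forward pass. This is a minimal sketch in plain Python, not the network used in the paper: the function names, the single 2×2 kernel, the ReLU activation, and the hand-picked weights are all assumptions made for illustration, and a real CNN would learn its kernels and weights from training data.

```python
def conv2d(image, kernel):
    """Valid 2-D cross-correlation of a grayscale image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Element-wise rectifier: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def fully_connected(feature_map, weights, bias):
    """Flatten the pooled map and apply one dense layer."""
    flat = [v for row in feature_map for v in row]
    return [
        sum(w * x for w, x in zip(w_row, flat)) + b
        for w_row, b in zip(weights, bias)
    ]


def extract_features(image, kernel, weights, bias):
    """Convolution -> ReLU -> max pooling -> fully connected layer."""
    pooled = max_pool(relu(conv2d(image, kernel)))
    return fully_connected(pooled, weights, bias)


# Hypothetical 5x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]          # responds to left-to-right steps
dense_weights = [[1, 0, 1, 0], [0, 1, 0, 1]]
dense_bias = [0, 0]
features = extract_features(image, edge_kernel, dense_weights, dense_bias)
```

The resulting feature vector is what a downstream classifier such as an SVM would receive as input.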

In recent years, the domestic three-dimensional surveying and mapping satellite “Tianhui-1” has been successfully launched and is operating normally in orbit. Using a variety of technical methods, current work has focused heavily on improving the spatial positioning accuracy of domestic satellite and aerial images, and rich research results have been achieved. Because no ground information is available for reference in uncontrolled positioning of high-resolution satellite remote sensing images, the observed values can only be corrected with a system error compensation model and an appropriate adjustment method, based on the strict imaging model of the satellite images. Ground information is also complex and diverse, including building information, terrain relief, and traffic information. As shown in Figure 6, displaying and processing this information in real time and at high resolution requires more precise positioning and image processing technology.

Attitude accuracy is the core index affecting the positioning accuracy of linear array satellite images without ground control, and altimetry auxiliary data must be introduced to make the elevation accuracy error less than meters in uncontrolled positioning. Higher-resolution satellite remote sensing images and more accurate ephemeris and attitude measurements provide important prerequisites for obtaining more precise target position information. Aerospace photogrammetry rests on strict mathematical theory and models the satellite attitude and orbit data. Using the collinearity condition equations and an appropriate error model, the spatial position of the area imaged by the camera can be calculated and the target position determined. With the practical application of high-precision sensors such as high-resolution three-line array cameras and arcsecond-level star sensors, the positioning methods of high-resolution Earth observation satellites are gradually moving towards control-free positioning, that is, positioning that does not depend on control points. In addition, uncertain vibration of the space-based platform makes the attitude measurements of the spaceborne sensor inaccurate. In engineering practice, the exterior orientation and attitude information collected by the star sensor must be interpolated, modeled, and otherwise processed according to the characteristics of the linear array image [22]. The processing method adopted introduces systematic errors of varying degree, which affect the accuracy of subsequent spatial positioning [23].

In addition to image processing techniques, determining satellite attitude is also critical. A high-precision satellite attitude is the primary condition for building a rigorous model of high-resolution satellite remote sensing images and achieving high-precision positioning. In high-resolution push-broom imaging, the image line transfer period is much shorter than the sampling interval of the attitude determination equipment, so a suitable attitude interpolation and fitting model must be built, based on the operating characteristics of the satellite, to obtain the orbit and attitude data at the imaging time of each image line. High-precision attitude determination is thus an important prerequisite for high-precision positioning. Satellite attitude was first characterized and processed using Euler angles. According to the “rotation theorem” proposed by the mathematician Euler, any rotation can be characterized by three independent rotation angles, each about a given axis, thereby determining the current attitude of an object.
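The two ideas in this paragraph, interpolating sparse attitude samples to each scan-line time and turning Euler angles into an attitude matrix, can be sketched as follows. This is an illustrative example only: the Z-Y-X (yaw, pitch, roll) rotation order, the plain linear interpolation, and the function names are assumptions, since the paper does not specify its interpolation or fitting model. Linear interpolation of Euler angles is only reasonable when the angles change slowly between samples.

```python
import math
from bisect import bisect_right


def rotation_from_euler(roll, pitch, yaw):
    """Attitude matrix R = Rz(yaw) * Ry(pitch) * Rx(roll), angles in radians."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def interpolate_attitude(times, angles, t):
    """Linearly interpolate (roll, pitch, yaw) samples to scan-line time t.

    `times` must be strictly increasing with at least two entries, and
    `angles[i]` is the attitude measured at `times[i]`.
    """
    if not times[0] <= t <= times[-1]:
        raise ValueError("scan-line time lies outside the sampled interval")
    i = min(bisect_right(times, t), len(times) - 1)
    t0, t1 = times[i - 1], times[i]
    w = (t - t0) / (t1 - t0)
    return tuple(a0 + w * (a1 - a0) for a0, a1 in zip(angles[i - 1], angles[i]))


# Hypothetical star-sensor samples one second apart; scan line imaged at 0.25 s.
sample_times = [0.0, 1.0]
sample_angles = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.2)]
line_attitude = interpolate_attitude(sample_times, sample_angles, 0.25)
line_rotation = rotation_from_euler(*line_attitude)
```

In practice, a quaternion representation with spherical interpolation is a common alternative that avoids the singularities of Euler angles.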

2.3. SVM Classification

The classification model used in this paper is the widely used SVM, and its classification principle is shown in Figure 7. The SVM classifier offers high accuracy, reliability, and robustness when used to classify ground images. The formal derivation of the model involves the following steps.

Assume that the input image information set is linearly separable. The classification decision surface then satisfies the following:

If the set is not linearly separable, a relaxation term λ is introduced to satisfy the following:

The optimal decision surface should then meet the following conditions:

That is, the Lagrangian method is used to solve the following:

For any data vector, the corresponding factor α satisfies the following:

The dataset is represented in a high-dimensional space by a function f(x), given as follows:

From this, the optimal decision surface can be obtained as follows:

Among them,

The SVM function is given as follows:

Here, sgn denotes the sign function.
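For orientation, the steps of this subsection follow the general pattern of the standard soft-margin SVM. The block below gives that textbook formulation as a sketch; it is not a reconstruction of the paper's own numbered equations, and its notation may differ from theirs. The symbols are assumptions introduced here: $x_i$ are feature vectors with labels $y_i \in \{-1, +1\}$, $w$ and $b$ define the decision surface, $\lambda_i$ is the relaxation term (written with the same letter as in the text), $C$ is a penalty weight, $\alpha_i$ are the Lagrange multipliers, $K$ is a kernel function, and $\hat{y}(x)$ is the predicted label.

```latex
\begin{align*}
  & w^{\top} x + b = 0, \qquad y_i \,(w^{\top} x_i + b) \ge 1, \quad i = 1, \dots, n \\
  & y_i \,(w^{\top} x_i + b) \ge 1 - \lambda_i, \qquad \lambda_i \ge 0 \\
  & \min_{w,\, b,\, \lambda} \ \tfrac{1}{2} \lVert w \rVert^{2} + C \sum_{i=1}^{n} \lambda_i \\
  & \max_{\alpha} \ \sum_{i=1}^{n} \alpha_i
      - \tfrac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j)
      \quad \text{s.t.} \quad \sum_{i=1}^{n} \alpha_i y_i = 0, \ \ 0 \le \alpha_i \le C \\
  & \hat{y}(x) = \operatorname{sgn}\Big( \sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b \Big)
\end{align*}
```

Only the training vectors with $\alpha_i > 0$ (the support vectors) contribute to the final sum, which is why the decision function stays compact even for large training sets.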

2.4. Image Filtering and Noise Reduction

After CNN recognition and SVM classification, the image is filtered and denoised to make it clearer. The denoising process is shown in Figure 8. The Kalman filter noise reduction used in this paper proceeds as follows.

In image processing, the relationship between the real state of the system at time k and noise can be expressed as follows:

Here, the noise term represents system noise.

Then, the prior value at time k−1 is given as follows:

The sensor state is represented by z, and its relationship to noise is given as follows:

Here, the noise term represents sensor noise.

From this, the posterior value can be obtained by weighting, as follows:

Then, from the difference between the actual value and the posterior value, the mean square error is obtained as follows:

The covariance is given as follows:

The Kalman gain K can be obtained as follows:

Finally, substituting K into Formula (15) gives the covariance as follows:
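The predict, weight, and update cycle described in this subsection can be sketched for the simplest case. The example below is a scalar Kalman filter in plain Python, offered as an illustration rather than the paper's implementation: it assumes a constant-state model (state transition and observation both equal to 1), treats the input as a sequence of repeated noisy measurements of one pixel, and uses hypothetical names and variance values. The paper does not state how its filter is applied to image data.

```python
def kalman_denoise(measurements, process_var, sensor_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a constant-state model.

    measurements: noisy observations z_k of one quantity (e.g. one pixel)
    process_var:  variance of the system noise
    sensor_var:   variance of the sensor noise
    x0, p0:       initial estimate and initial error covariance
    Returns the list of posterior estimates, one per measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: prior estimate and prior covariance from the previous step.
        x_prior = x
        p_prior = p + process_var
        # Kalman gain: how much weight the new measurement receives.
        gain = p_prior / (p_prior + sensor_var)
        # Update: posterior estimate as a weighted blend of prior and measurement.
        x = x_prior + gain * (z - x_prior)
        # Posterior covariance after substituting the gain.
        p = (1.0 - gain) * p_prior
        estimates.append(x)
    return estimates


# Hypothetical pixel with true value 10, observed with alternating +/-1 noise.
noisy = [10 + (1 if i % 2 == 0 else -1) for i in range(200)]
smoothed = kalman_denoise(noisy, process_var=1e-4, sensor_var=1.0)
```

A smaller ratio of `process_var` to `sensor_var` gives a smaller steady-state gain and therefore stronger smoothing, at the cost of a slower response to genuine changes in the signal.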

3. Simulation Test of Ground Image Processing Effect

3.1. Experimental Design

This paper selects 100 satellite remote sensing images from the GeoImageDB database, covering different types such as agriculture, military, transportation, and urban buildings, as shown in Table 1.

In addition, the image size of each type of image is shown in Table 2.

The test environment of the simulation experiment is shown in Table 3. OpenCV is a cross-platform computer vision and machine learning software library that runs on Linux, Windows, and other operating systems. It is lightweight and efficient, provides interfaces for languages such as Python, Ruby, and MATLAB, and implements many common algorithms in image processing and computer vision.

In the testing process, the selected 100 images are first shuffled and input into the image processing software in random order. The CNN performs feature extraction on each image, the SVM then classifies and summarizes the feature-extracted images, and finally a high-resolution image is obtained by filtering and denoising.

3.2. Experimental Results

First, this paper uses the CNN to identify and extract features from the selected 100 satellite remote sensing images and records the recognition accuracy. It then uses the SVM to classify the extracted images and summarizes the classification accuracy. The results are shown in Figure 9.

As the figure demonstrates, the recognition accuracy of the CNN for these 100 images is generally above 85%, and the highest is about 97%, which shows that CNN recognition works relatively well. Across the five image types, the CNN achieves its highest recognition accuracy on type C3, the agricultural images, at about 94%. For C1 and C5 the recognition accuracy is lower, but both exceed 85%. The SVM classification accuracy is above 80%, and the highest is about 95%. Comparing the five types, the SVM classifies buildings most accurately, at about 95%, and agriculture least accurately, at only 83.6%. A possible reason is that there are many kinds of agricultural scenes and many crops look alike, which leads to a high rate of classification errors, while buildings are generally more prominent and may be easier to identify. Overall, both CNN recognition and SVM classification perform well, with high accuracy.
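Per-type accuracies such as those reported above are typically computed by comparing predicted and true labels within each class. The following is a minimal sketch of that bookkeeping; the function name and the tiny label lists are hypothetical and are not the paper's data.

```python
from collections import Counter


def per_class_accuracy(true_labels, predicted_labels):
    """Fraction of correctly labeled images within each true class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    totals = Counter(true_labels)
    correct = Counter(
        t for t, p in zip(true_labels, predicted_labels) if t == p
    )
    return {label: correct[label] / totals[label] for label in sorted(totals)}


# Made-up example with two of the five type labels used in the experiment.
truth = ["C1", "C1", "C2", "C2", "C2"]
predicted = ["C1", "C2", "C2", "C2", "C1"]
accuracy = per_class_accuracy(truth, predicted)
```

Reporting accuracy per class, rather than one overall figure, is what makes differences such as the one between building and agricultural images visible.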

After the images have been classified, they are filtered and denoised. The image SNR and image resolution obtained with the Kalman filter denoising in this paper are shown in Figure 10.

As can be observed from the figure, the SNR of the images after noise reduction is generally above 6.5, and some even exceed 8.0. The resolution of the images is generally above 800 ppi, which is high resolution, and the highest even reaches an ultra-high resolution of 1400 ppi. Among the image types, C2, the building type, has a lower SNR and a lower resolution of only about 800 ppi, while the SNR and resolution of C4 and C5 are higher, at around 7.6 and 1400 ppi. A possible reason is that building scenes contain more light reflections, such as glass and wall reflections, which distort the image and lower the SNR and resolution. On the whole, however, the quality of the processed images is very high: their resolution is above 720 ppi, so they are high-resolution or ultra-high-resolution images. The experimental test shows that the Kalman filter noise reduction used in this paper achieves good results.
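The paper does not state how its SNR values are defined or in which unit they are expressed. One common definition, usable in a simulation where a clean reference image is available, is the ratio of signal power to residual noise power in decibels. The sketch below implements that definition as an assumption; the function name and the 2×2 example images are hypothetical, and the numbers it produces are not comparable to the values reported above.

```python
import math


def snr_db(reference, processed):
    """SNR in dB: 10 * log10(signal power / power of the residual error).

    Both images are lists of rows of grayscale values with the same shape.
    """
    signal = sum(v * v for row in reference for v in row)
    noise = sum(
        (r - p) ** 2
        for ref_row, proc_row in zip(reference, processed)
        for r, p in zip(ref_row, proc_row)
    )
    if noise == 0:
        return math.inf  # identical images: no residual noise
    return 10.0 * math.log10(signal / noise)


# Made-up clean image and a denoised estimate with two residual errors.
clean = [[10, 10], [10, 10]]
denoised = [[11, 9], [10, 10]]
value = snr_db(clean, denoised)
```

Stating the chosen definition and unit alongside reported SNR values makes results from different denoising methods directly comparable.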

This paper uses CNN feature extraction, SVM classification, and filtering noise reduction, and some results have been achieved in the experiment. However, there is still little research on the imaging technology of remote sensing satellites, and the mutual conversion between the strict imaging model and the rational function model still needs to be realized. The strict imaging model is used in the uncontrolled positioning of high-resolution remote sensing satellite images and in professional processing. The strict geometric relationship at the satellite imaging time is constructed through orbit modeling, system error elimination, and other methods to achieve high-precision positioning. For professional users, obtaining strict imaging geometry data is of great practical significance for studying satellite imaging characteristics, multi-sensor fusion processing, and long-strip regional network adjustment under uncontrolled conditions.

4. Conclusions

It is of great significance to study the formation of high-resolution satellite remote sensing images and the processing of ground images. Linear array satellite single-image positioning, two-strip image positioning, multiple-strip image positioning, and regional network adjustment under uncontrolled conditions are the basis of high-resolution satellite remote sensing imaging, and research on these topics has important reference value for the positioning processing of high-resolution remote sensing satellite images. This paper first studies satellite remote sensing images and finds that current images cannot meet the needs of large-scale mapping and high-precision reconstruction of ground objects. Its review of satellite remote sensing technology finds that most remote sensing satellites currently use linear array push-broom imaging and that the quality of the images obtained in this way needs to be improved. The paper then analyzes current ground image processing technology and finds that using a CNN for image feature extraction and an SVM for image classification is a good approach. It therefore selects multiple satellite remote sensing images from the database and divides them into different types for image processing and testing. The results show that the recognition accuracy of the CNN for these 100 images is generally above 85%, with agricultural images recognized most accurately, and that the accuracy of SVM classification is above 80%, with buildings classified most accurately. The quality of the images after noise reduction is very high, and their resolution is above 720 ppi. This shows that the image processing method in this paper has achieved good results. However, this paper still has some deficiencies: the principles of satellite remote sensing imaging are not described clearly enough, the experimental part is not complete enough, and the work as a whole needs to be improved.

Data Availability

The data for this paper can be obtained by e-mailing the authors.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this work.