Abstract
This study investigates a hazard classification system for air traffic control, combining the Human Factors Analysis and Classification System (HFACS) framework with natural language processing, with the aim of preventing hazardous situations. Building on the HFACS standard, an air traffic control hazard classification system is constructed. Hazard data from the aviation safety management system are selected as the corpus, then classified and labeled at five levels. To address the small sample size, large number of labels, and random sample distribution of air traffic control hazard data, the experiments use a text classification method based on key content extraction with TF-IDF and TextRank, together with text classification models based on CNN and BERT. The results show that the overall cost-effectiveness, which weighs model training time against classification accuracy, is highest when about 8 keywords are extracted. As the number of keywords increases, the time cost decreases, which affects accuracy. When the number of keywords reaches about 93, the time cost increases while classification accuracy stays close to 0.7, so the added time lowers the overall cost-effectiveness. These results indicate that key content extraction can address text classification problems with small samples and can support further research on safety management systems.
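The abstract does not give implementation details, so the following is only a minimal sketch of the TF-IDF side of key content extraction: each report is reduced to its k highest-scoring terms before being passed to a classifier. The function name `top_k_keywords`, the sample reports, the smoothed IDF formula, and whitespace tokenization are all assumptions made for illustration; TextRank and the CNN and BERT classifiers are not shown.

```python
import math
from collections import Counter


def top_k_keywords(docs, k=8):
    """Return the k highest-scoring TF-IDF terms for each tokenized document.

    Hypothetical helper: the scoring formula is an assumed smoothed TF-IDF,
    not necessarily the variant used in the paper.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    results = []
    for doc in docs:
        tf = Counter(doc)
        scores = {
            # Relative term frequency times a smoothed inverse document frequency.
            term: (count / len(doc)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically for a stable order.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:k])
    return results


# Made-up hazard report texts, used only to exercise the function.
reports = [
    "aircraft deviated from assigned altitude after readback error".split(),
    "controller issued late clearance and aircraft deviated from route".split(),
    "radar display failure during peak traffic".split(),
]
keywords = top_k_keywords(reports, k=3)
```

Varying `k` in a sketch like this is one way to explore the trade-off the abstract describes between the number of extracted keywords, training time, and accuracy.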
Index Terms
- A Natural Language Processing System for Text Classification Corpus Based on Machine Learning