-
CAT: A lightweight Color-aware Transformer for sandstorm image enhancement Displays (IF 4.3) Pub Date : 2024-04-15 Zhengwei Guo, Bo Wang, Chongyi Li
Sandstorm images are characterized by color casts and reduced contrast due to the presence of suspended sand particles, which significantly impacts the performance of high-level computer vision tasks. Recently, numerous deep learning-based methods have been proposed for sandstorm image enhancement. However, most of them are either ineffective or have excessive parameters. In this paper, we introduce
-
Wide-viewing-angle holographic 3D display using lens array for point cloud data Displays (IF 4.3) Pub Date : 2024-04-14 Soma Fujimori, Fan Wang, Tomoyoshi Ito, Tomoyoshi Shimobaba
A method using a lens array to expand the viewing angle of holographic three-dimensional displays endows simplicity to optical systems and offers robustness against misalignment. In this study, we develop a method for generating holograms from point cloud data using the lens-array method. The proposed method utilizes the fact that the lens array generates a sparse light field in a checkered pattern
-
BVA-Transformer: Image-text multimodal classification and dialogue model architecture based on Blip and visual attention mechanism Displays (IF 4.3) Pub Date : 2024-04-12 Kaiyu Zhang, Fei Wu, Guowei Zhang, Jiawei Liu, Min Li
-
Multi-exposure high dynamic range imaging based on LSGAN Displays (IF 4.3) Pub Date : 2024-04-09 Yongqing Huo, Jing Gan, Wenke Jiang
Deep learning-based multi-exposure HDR imaging methods achieve better performance than traditional methods in recovering image information and removing ghosting artifacts, but still produce images with severe ghosting artifacts when dealing with poorly exposed dynamic scenes. In this paper, we propose a least squares generative adversarial network (LSGAN) based HDR imaging algorithm for dynamic scenes
-
Dual side transparent organic light-emitting diodes with a modified Ag top cathode Displays (IF 4.3) Pub Date : 2024-04-08 Tianzhuofu Wu, Yichao Jin, Zhaoyue Lü, Yiyang Zhao, Qihao Teng, Leyi Li
Transparent organic light-emitting diodes (TrOLEDs) offer promising applications in next-generation consumer electronic devices. Herein, TrOLEDs based on a silver (Ag) top cathode are fabricated via vacuum deposition. The semi-transparent and efficient Ag cathode has been realized by combining the optimization of the thickness and deposition rate of Ag films with the modification of lithium
-
CDMS: A real-time system for EEG-guided cybersickness mitigation through adaptive adjustment of VR content factors Displays (IF 4.3) Pub Date : 2024-04-08 Ufuk Uyan, Ufuk Celikcan
Cybersickness remains a major issue that can severely impact the user’s comfort, performance, and enjoyment of VR. While there are various approaches to combat cybersickness, only a few have been developed for real-time mitigation based on user biofeedback, and these do not aim to distinguish causal factors and apply mitigation accordingly. In this paper, we propose a novel real-time cybersickness
-
Enhanced ADHD detection: Frequency information embedded in a visual-language framework Displays (IF 4.3) Pub Date : 2024-04-06 Runze Hu, Kaishi Zhu, Zhenzhe Hou, Ruideng Wang, Feifei Liu
This paper presents the Frequency-Integrated Visual-Language Network (FIVLNet), a deep learning (DL) framework tailored to improve the diagnostic accuracy for Attention Deficit Hyperactivity Disorder (ADHD) using magnetic resonance imaging (MRI) scans. Traditional DL approaches in ADHD diagnosis often overlook the sequential dependencies of MRI images or fail to adequately capture their complex structural
-
A flexible optically driven liquid crystal display based on RM257 doping Displays (IF 4.3) Pub Date : 2024-04-04 Tangwu Li, Jingxin Sang, Jianhua Shang, Yunfei He, Xue Pei, Ming Xiao, Jiatong Sun
-
Learning topic emotion and logical semantic for video paragraph captioning Displays (IF 4.3) Pub Date : 2024-04-04 Qinyu Li, Hanli Wang, Xiaokai Yi
Video paragraph captioning aims to generate multiple descriptive sentences for videos, which strive to replicate human writing in accuracy, logicality, and richness. However, current research focuses on the accuracy and temporal order of events, ignoring emotion and other critical logical relations embedded in human language, such as causal and adversative relations. This omission impairs the reasonable
-
How symbol and text combine to promote sign comprehension: Evidence from eye-tracking Displays (IF 4.3) Pub Date : 2024-04-01 Yu-Hsiu Hung, Yongsheng Tan
-
Study on the application of 3D imaging based on the γ-photon silk thread method for visualisation and detection of flow fields Displays (IF 4.3) Pub Date : 2024-03-30 Min Yao, Ming Wang, Min Zhao, Ruipeng Guo, Bolin Ma
-
A hybrid classification model with radiomics and CNN for high and low grading of prostate cancer Gleason score on mp-MRI Displays (IF 4.3) Pub Date : 2024-03-29 Feng Liu, Yuanshen Zhao, Jukun Song, Guilan Tu, Yadong Liu, Yunsong Peng, Jiahui Mao, Chongzhe Yan, Rongpin Wang
Prostate cancer is a prevalent malignancy among men, and the precise diagnosis and grading of prostate cancer have become key research areas in contemporary medicine. Magnetic resonance imaging is an ideal non-invasive method for the detection and diagnosis of prostate cancer, offering superior spatial resolution compared with invasive biopsy procedures. We developed a hybrid model based on radiomics
-
Impact of mobile application loading icon type and animation frequency on user time perception and emotion Displays (IF 4.3) Pub Date : 2024-03-29 Ruoheng Lao, Liang Chen, Jing Yang, Anlan Fan
As the status indicator of system programs, loading icons have a significant role in human–computer interaction. This study explored the effect of the visual presentation of mobile application loading icons on users' time perception and emotion. The two variables adopted in this experiment were the type of loading icons and the animation frequencies. Three types of loading icons were used: circle type
-
A multimodal visual fatigue assessment model based on back propagation neural network and XGBoost Displays (IF 4.3) Pub Date : 2024-03-28 Lixiu Jia, Lixin Jia, Jian Zhao, Lihang Feng, Xiaohua Huang
An experiment was conducted using a subjective questionnaire, ophthalmological parameters, electroencephalogram (EEG) signals, electrocardiogram (ECG) signals, and eye-tracking parameters. The goal was to investigate the impact of display modes (2D, normal 3D, and enhanced 3D) on visual fatigue. The results of paired samples t-tests for both subjective and objective parameters indicated a significant
-
Combinatorial progressive architecture search for crowd counting Displays (IF 4.3) Pub Date : 2024-03-27 Qian Li, Chao Ma, Hao Chen, Xinyuan Chen, Xiaokang Yang
Recently, the automated machine learning system, including neural architecture search (NAS), has been introduced to the task of crowd counting. However, there are several concerns about applying existing AutoML methods to crowd counting: The previous AutoML system automates the CNNs of crowd counting without acceleration design and ignores the reproducibility on edge computation circumstances; a simple
-
RTEN-SR: A reference-based texture enhancement network for single image super-resolution Displays (IF 4.3) Pub Date : 2024-03-27 Shuying Huang, Wenjing Deng, Guoqiang Li, Yong Yang, Jichao Wang
Most current super-resolution (SR) reconstruction methods suffer from edge blurring and insufficient detail reconstruction. To avoid these problems, this paper proposes a reference-based texture enhancement network for single image SR (RTEN-SR). Firstly, a preliminary reconstruction module (PRM) is constructed to learn the initial reconstructed high-resolution (HR) image features. Then, a multi-scale
-
Identifying pathological groups from MRI in prostate cancer using graph representation learning Displays (IF 4.3) Pub Date : 2024-03-26 Feng Liu, Yuanshen Zhao, Chongzhe Yan, Jingxian Duan, Lei Tang, Bo Gao, Rongpin Wang
Multiparametric magnetic resonance imaging (mpMRI) plays a critical role in prostate cancer (PCa) diagnosis, aiding in clinical trial evaluation and personalized treatment planning. We propose a novel prediction approach based on graph representation learning (GRL) integrating mpMRI images for classifying International Society of Urological Pathology (ISUP) grade groups. We initially constructed a
-
MRI and RNA-seq fusion for prediction of pathological response to neoadjuvant chemotherapy in breast cancer Displays (IF 4.3) Pub Date : 2024-03-26 Hui Li, Yuanshen Zhao, Jingxian Duan, Jia Gu, Zaiyi Liu, Huailing Zhang, Yuqin Zhang, Zhi-Cheng Li
Accurate prediction of the pathological complete response (pCR) to neoadjuvant chemotherapy (NAC) is crucial for precise treatment of breast cancer. However, current studies mainly rely on single-modal data, with limited studies focusing on multimodal data. In this study, we developed and validated a deep learning-based multimodal fusion model that predicts the response of breast tumor to NAC by integrating
-
DDFL: Dual-Domain Feature Learning for nighttime semantic segmentation Displays (IF 4.3) Pub Date : 2024-03-26 Xiao Lin, Peiwen Tan, Zhengkai Wang, Lizhuang Ma, Yan Li
Nighttime semantic segmentation has been playing a critical role in intelligent transportation, building safety and urban management. However, nighttime scenes present some challenges such as complex structures, multiple light sources, uneven lighting and blurry image noise, which severely degrade the segmentation quality of nighttime images. To address these challenges, we propose a Dual-Domain Feature
-
Vergence eye movements in virtual reality Displays (IF 4.3) Pub Date : 2024-03-18 Ken McAnally, Philip Grove, Guy Wallis
There is increasing interest in incorporating eye tracking in virtual reality (VR) systems to infer gaze in 3 dimensions. Estimation of gaze in depth may be limited by errors of measurement of the small eye vergence angles involved. It may also be limited by true errors of vergence in VR. We found that observers commonly made errors of vergence when viewing in VR, such that the dominant eye was more
-
MUPT-Net: Multi-scale U-shape pyramid transformer network for Infrared Small Target Detection Displays (IF 4.3) Pub Date : 2024-03-16 Junjie Yin, Jingxia Jiang, Weijia Li, Erkang Chen, Liyuan Chen, Lihan Tong, Bin Huang
Infrared Small Target Detection (IRSTD) aims to detect small and dim targets in complex backgrounds. However, the low signal-to-noise ratio and reduced contrast in the infrared domain make it challenging to extract these targets, as the cluttered background can easily overpower them. Existing Convolutional Neural Networks (CNN)-based methods for IRSTD often suffer from information loss due to inadequate
-
JDECMC: Improving JDE based multi-object tracking with Camera Motion Compensation Displays (IF 4.3) Pub Date : 2024-03-16 Melikamu Liyih Sinishaw, Shu Liu
-
Effective augmentation of front opening unified pod filter images Displays (IF 4.3) Pub Date : 2024-03-16 Hao-Sung Chiu, I-Chen Lin, Yu-Bin Chen
-
Histogram shifting based reversible data hiding with multiple expansion bin pairs Displays (IF 4.3) Pub Date : 2024-03-16 Mengyao Xiao, Xiaolong Li, Wei Lu, Yao Zhao
-
UI semantic component group detection: Grouping UI elements with similar semantics in mobile graphical user interface Displays (IF 4.3) Pub Date : 2024-03-15 Shuhong Xiao, Yunnong Chen, Yaxuan Song, Liuqing Chen, Lingyun Sun, Yankun Zhen, Yanfang Chang, Tingting Zhou
-
Phantom sensation: Threshold and quality indicators of a tactile illusion of motion Displays (IF 4.3) Pub Date : 2024-03-13 Byron Remache-Vinueza, Andrés Trujillo-León, Fernando Vidal-Verdú
Utilizing a randomized, blind, controlled experiment, and the ascending method of limits, we determined the minimum amplitude of motion at which individuals perceive a tactile illusion called moving phantom sensation, as well as the perceived level of clarity and continuity of motion. Implementing tactile illusions in virtual/augmented reality, sensory substitution systems, and other human–computer interaction
-
Modeling the carrier density in the exciton formation zone of organic light-emitting diode under high current injection Displays (IF 4.3) Pub Date : 2024-03-11 Dashan Qin
The carrier density in the exciton formation zone is the electrical parameter most relevant to the stability of an organic light-emitting diode: a decrease in carrier density improves the device stability. Here, based on the general mode of carrier device lifetimes, the carrier densities in the exciton formation zone of an organic light-emitting diode have been calculated at current densities ≥ 2.0 kA cm
-
The influence of signal hue and background music pitch on vigilance Displays (IF 4.3) Pub Date : 2024-03-11 Jinghan Wang, Yanqun Huang, Xueqin Huang, Junyu Yang, Jutao Li
Humans generally display vigilance decrement during sustained cognitive workloads, while visual and auditory stimuli have been shown to elicit arousal, which influences the level of user vigilance. This study explored the effects of hue and background music pitch on user vigilance. Thirty-five participants performed a 10-min Psychomotor Vigilance Test with background music playing. Three hue conditions
-
Human pose estimation in crowded scenes using Keypoint Likelihood Variance Reduction Displays (IF 4.3) Pub Date : 2024-03-11 Longsheng Wei, Xuefu Yu, Zhiheng Liu
Human pose estimation can be applied to many computer vision tasks, such as human–computer interaction, motion recognition, and action detection. However, few previous methods focused on the pose estimation problem in crowded scenes. Connection-based bottom-up approaches are the main pipelines in multi-person pose estimation. Keypoint detection, connection detection and pose assembly are the main processes
-
Video-based craniomaxillofacial disease screening system Displays (IF 4.3) Pub Date : 2024-02-28 Kaixun Zhang, Yuhang Men, Yiqiao Shi, Jiajie Chen, Jing Han, Menghan Hu, Jiannan Liu
Craniomaxillofacial disease, which is common all over the world, is difficult to screen at an early stage. It will probably affect the patient’s facial appearance once someone suffers from it. This paper introduces an integrated system that is composed of data collection, three-dimensional reconstruction, and disease screening, facilitating timely detection of craniomaxillofacial diseases. With an expanding
-
The effect of short-form video content, speed, and proportion on visual attention and subjective perception in online food delivery menu interfaces Displays (IF 4.3) Pub Date : 2024-02-24 Mengyao Qi, Kenta Ono, Lujin Mao, Makoto Watanabe, Jinghua Huang
-
A contrastive learning based unsupervised multi-view stereo with multi-stage self-training strategy Displays (IF 4.3) Pub Date : 2024-02-24 Zihang Wang, Haonan Luo, Xiang Wang, Jin Zheng, Xin Ning, Xiao Bai
In recent years, unsupervised multi-view stereo (MVS) methods have achieved excellent success and can produce results comparable to earlier supervised work. However, as unsupervised MVS uses image reconstruction as a pretext task, it faces two vital drawbacks: the RGB value, which is the measurement of an image, is not robust enough across views due to complicated environment like lighting conditions and reconstruction
-
RTHEN: Unsupervised deep homography estimation based on dynamic attention for repetitive texture image stitching Displays (IF 4.3) Pub Date : 2024-02-21 Ni Yan, Yupeng Mei, Tian Yang, Huihui Yu, Yingyi Chen
Homography estimation is regarded as one of the key challenges in image alignment, where the goal is to estimate the projective transformation between two images on the same plane. Unsupervised learning methods are gradually becoming popular due to their excellent performance and lack of need for labeled data. However, in regional scenes with repeated textures, there may be ambiguity in the correspondence
-
Effects of color scheme and visual fatigue on visual search performance and perceptions under vibration conditions Displays (IF 4.3) Pub Date : 2024-02-17 Da Tao, Xinyuan Ren, Kaifeng Liu, Qian Mao, Jian Cai, Hailiang Wang
Visual search represents one of the most encountered human–computer interaction tasks. However, the effect of visual fatigue on visual search, especially in conditions involving vibrations, remains largely unknown. The objective of this study was to assess the effects of color scheme and visual fatigue on visual search performance and perceptions in different vibration conditions. We conducted an experiment
-
The role of lifestyle factors, biological sex, and racial identity for (visually induced) motion sickness susceptibility: Insights from an online survey Displays (IF 4.3) Pub Date : 2024-02-16 Narmada Umatheva, Frank A. Russo, Behrang Keshavarz
Motion sickness (MS) and visually induced motion sickness (VIMS) are common side-effects when travelling or when using visual devices, respectively. A variety of individual factors may determine one’s susceptibility to MS/VIMS. Here, the role of lifestyle factors including video-game usage, physical activity, diet, and substance use on self-reported susceptibility to MS/VIMS was investigated. Additionally
-
Visually guided movement in virtual reality is tolerant of the vergence-accommodation conflict Displays (IF 4.3) Pub Date : 2024-02-16 Ken McAnally, Guy Wallis, Philip Grove
Stereoscopic virtual reality (VR) headsets display vergence cues to object distance but present images at a fixed focus, resulting in a vergence-accommodation conflict (VAC). This study examined the effects of introducing or reducing the VAC with optical lenses in a targeted reaching task implemented in both VR and the real world. Contrary to previous reports of reduced visual performance and fatigue
-
Effectiveness and visual performance assessment of anti-peeping films Displays (IF 4.3) Pub Date : 2024-02-16 Wenqian Xu, Qi Yao, Peiyu Wu, Rongjun Zhang, Wei Zhu, Pengfei Li, Leimin Bao
Some private information inevitably becomes more visible when using wide-angle electronic devices in public places. Thus, some people use anti-peeping films to achieve privacy protection. To examine the effectiveness of anti-peeping films and their influence on visual performance, we investigated the color gamut, luminance, contrast ratio, Bhattacharyya coefficient and structural similarity characteristics
-
Contrastive adaptive frequency decomposition network guided by haze discrimination for real-world image dehazing Displays (IF 4.3) Pub Date : 2024-02-15 Yaozong Mo, Chaofeng Li
Recent unsupervised image dehazing methods used unpaired real-world training data for enhancing generalization on real-world scenes. However, these methods often require dehazing and rehazing cycles with auxiliary networks for training, resulting in high computational costs and extended training time. In this work, we propose an unsupervised dehazing framework called Contrastive Adaptive Frequency
-
Effects of anthropomorphic design on comprehension of self-monitoring test results: Integrating evidence of eye-tracking and event-related potential Displays (IF 4.3) Pub Date : 2024-02-12 Pengbo Su, Kaifeng Liu
This study investigated the effects of anthropomorphic design on individuals’ comprehension of self-monitoring test results. In addition, we employed eye-tracking and event-related potential techniques to explore the underlying mechanisms. A within-group design was employed with presentation format (black-and-white neutral design, black-and-white anthropomorphic design, and colored anthropomorphic design)
-
A dynamic detection and data association method based on probabilistic models for visual SLAM Displays (IF 4.3) Pub Date : 2024-02-10 Jianbo Zhang, Liang Yuan, Teng Ran, Song Peng, Qing Tao, Wendong Xiao, Jianping Cui
Visual Simultaneous Localization and Mapping (VSLAM) is a critical foundation in mobile robotics and augmented reality (AR). However, VSLAM faces challenges in dynamic environments since both the camera and the object are in motion, which contradicts the classical static scene assumption. Generally, multi-view geometry is employed for static features to estimate camera pose and reconstruct environment
-
Automatic quantitative intelligent assessment of neonatal general movements with video tracking Displays (IF 4.3) Pub Date : 2024-02-01 Xinrui Huang, Chunling Huang, Wang Yin, Hesong Huang, Zhuoheng Xie, Yuchuan Huang, Meining Chen, Xinyue Fan, Xiaoteng Shang, Zeyu Peng, You Wan, Tongyan Han, Ming Yi
General movement (GM) assessment (GMA) is an internationally recognised tool for the early screening and diagnosis of neurodevelopmental abnormalities in high-risk infants. Traditional GMA requires multiple internationally certified doctors, which is subjective and time-consuming and therefore limits its widespread use, especially among neonates. Quantifying and accelerating GMA can reduce artificial
-
Improving Braille–Chinese translation with jointly trained and pre-trained language models Displays (IF 4.3) Pub Date : 2024-02-01 Tianyuan Huang, Wei Su, Lei Liu, Chuan Cai, Hailong Yu, Yongna Yuan
-
The effects of representation of industrial icons on visual search performance Displays (IF 4.3) Pub Date : 2024-01-29 Jiang Shao, Yuhan Zhan, Hui Zhu, Mingming Zhang, Lang Qin, Shangxin Tian, Hongwei Qi
With innovations in intelligent manufacturing technology and the enhancement of intelligent manufacturing systems, the quantity of information held and transmitted by interactive interfaces has increased significantly, which also increases the cognitive load on the operators. As a component of an interactive interface, the icon has the vital mission of communicating semantics. The eye-movement experiments
-
Towards better video services: An EEG-based interpretable model for functional quality of experience evaluation Displays (IF 4.3) Pub Date : 2024-01-29 Yifan Niu, Kexin Di, Gangyan Zeng, Tao Wei, Yuan Zhang, Xia Wu
Since emerging video services can provide emotional and social value to users, the setting of their functional parameters directly affects human cognitive and affective states, further influencing video services’ quality of experience (QoE), which we call functional QoE (fQoE). FQoE is highly dependent on human subjective perceptions and the reasons for its generation are important for service providers
-
DSSO-YOLO: A fast detection model for densely stacked small object Displays (IF 4.3) Pub Date : 2024-01-28 Zheng Zhang, Liangchen Liu, Xunyi Zhao, Lijun Zhang, Jun Wu, Yan Zhang, Zhenghao Li
Visual detection for densely stacked small objects (DSSO) has a wide range of applications in the construction, logistics, and import/export industries. Taking the construction industry as an example, intelligent rebar counting can considerably improve the management efficiency in sales, delivery and inventory management. It can also effectively prevent acts such as supervisory theft. However, current
-
Prediction model for indoor light environment brightness based on image metrics Displays (IF 4.3) Pub Date : 2024-01-28 Chao Ruan, Li Zhou, Liangzhuang Wei, Wei Xu, Yandan Lin
Currently, rapid progress in display technology and optical simulation software has enabled the visualization of lighting design, which can provide abundant visual information. However, renderings only allow designers to subjectively judge whether the lighting layout and optical parameters are reasonable. So we want to combine the rendered images and photometric data in the process of optical simulations
-
RAWIW: RAW Image Watermarking robust to ISP pipeline Displays (IF 4.3) Pub Date : 2024-01-24 Kang Fu, Xiaohong Liu, Jun Jia, Zicheng Zhang, Yicong Peng, Jia Wang
Invisible image watermarking is essential for image copyright protection. Compared to RGB images, RAW format images use a higher dynamic range to capture the radiometric characteristics of the camera sensor, providing greater flexibility in post-processing and retouching. RAW images are considered the original format for distribution and image production, thus requiring copyright protection. Existing
-
ARD-SLAM: Accurate and robust dynamic SLAM using dynamic object identification and improved multi-view geometrical approaches Displays (IF 4.3) Pub Date : 2024-01-23 Qamar Ul Islam, Haidi Ibrahim, Pan Kok Chin, Kevin Lim, Mohd Zaid Abdullah, Fatemeh Khozaei
In the evolving landscape of autonomous navigation, traditional Visual Simultaneous Localization and Mapping (SLAM) systems often encounter challenges in dynamic environments, primarily due to their reliance on assumptions of static surroundings. In response to these limitations, we introduce ARD-SLAM, a groundbreaking approach to dynamic SLAM that innovatively combines global dense optical tracking
-
Investigating visual determinants of visuomotor performance in virtual reality Displays (IF 4.3) Pub Date : 2024-01-22 Ken McAnally, Guy Wallis, Philip Grove
We report the relative efficiency of visually guided movement in virtual reality (VR) compared to that in the real world using a standardised visuomotor task based on Fitts’ tapping. Haptic cues were veridical across both displays to ensure that any differences in performance could be attributed to characteristics of the visual display. The presence of binocular cues, and of monocular surface texture
-
Charge generation layer with Yb assistant interlayer for tandem organic light-emitting diodes Displays (IF 4.3) Pub Date : 2024-01-19 Kanghoon Kim, Jae-In Yoo, Sung-Cheon Kang, Hyo-Bin Kim, Eun-young Choi, Sundararajan Parani, Jang-Kun Song
Tandem organic light-emitting diode (OLED) devices require an efficient charge generation layer (CGL) between two stacked OLED units. In this study, a CGL with an Yb assistant interlayer was fabricated and investigated. The optical transmittance and charge generation performances of the CGLs were analyzed with respect to the Yb thickness. The best result was obtained at a Yb thickness of 3 nm, at which
-
A convolutional neural network-based rate control algorithm for VVC intra coding Displays (IF 4.3) Pub Date : 2024-01-19 Jiafeng Wang, Xiwu Shang, Xiaoli Zhao, Yuhuai Zhang
The Versatile Video Coding (VVC) has shown significant improvements in Rate-Distortion (R-D) performance compared to its predecessor, High Efficiency Video Coding (HEVC). However, it still encounters several challenges. One of these challenges is the efficient allocation of bits among all Coding Tree Units (CTUs). Additionally, there is a lack of prior information for intra-frame coding, particularly
-
ReverseGAN: An intelligent reverse generative adversarial networks system for complex image captioning generation Displays (IF 4.3) Pub Date : 2024-01-19 Guoxiang Tong, Wei Shao, Yueyang Li
Towards the inclusion of complex semantic relational images, we propose an intelligent Reverse Generative Adversarial Network (ReverseGAN) with generative task guidance to build an image caption system. The system utilizes regenerated images to learn the concept of image caption generation, using a generative adversarial network as the overall framework of the model. The generative network uses a graph
-
CHDNet: A lightweight weakly supervised segmentation network for lung CT image Displays (IF 4.3) Pub Date : 2024-01-19 Fangfang Lu, Tianxiang Liu, Ting Zhang, Bei Jin, Weiyan Gu
Deep learning methods have ushered in an unprecedented transformation in medical image segmentation by automating the segmentation of computed tomography (CT) slices. However, challenges persist in the application of these deep learning methods, including models with a high number of training parameters which hinder their clinical deployment and practical use. Furthermore, acquiring a large volume
-
Ai-aided diagnosis of oral X-ray images of periapical films based on deep learning Displays (IF 4.3) Pub Date : 2024-01-11 Lifeng Gao, Tongkai Xu, Meiyu Liu, Jialin Jin, Li Peng, Xiaoting Zhao, Jiaqing Li, Mengting Yang, Suying Li, Sheng Liang
Oral X-ray images provide a useful technical means by which dentists examine teeth for dental problems, but the diagnostic process is defective due to its over-reliance on dentists’ subjective judgments, lack of objective criteria, etc. In this context, this study examined the AI-aided diagnosis of periapical films based on deep learning. Based on YOLOv7-X, a YOLO-DENTAL network architecture was used
-
WHRIME: A weight-based recursive hierarchical RIME optimizer for breast cancer histopathology image segmentation Displays (IF 4.3) Pub Date : 2024-01-11 Jie Xing, Ali Asghar Heidari, Huiling Chen, Hanli Zhao
In medical image processing, multi-threshold image segmentation has been challenging, as selecting appropriate thresholds is crucial for distinguishing different structures within an image, especially when dealing with breast cancer images. Breast cancer images are complex with multiple tissue types, which pose challenges to precise diagnosis. A weight-based recursive hierarchical bootstrapping rime
-
A directionally illuminated pixel-selective flickering-free autostereoscopic display Displays (IF 4.3) Pub Date : 2024-01-10 Yong He, Xuehao Chen, Guangyong Zhang, Yunjia Fan, Xingbin Liu, Dongyan Deng, Zhongbo Yan, Haowen Liang, Jianying Zhou
A directionally illuminated pixel-selective flickering-free autostereoscopic display is proposed and demonstrated. The system consists of the U-shaped backlight, the mix-grooves cylindrical Fresnel lens array, a light shaping diffuser film, and a liquid crystal display with a directional light splitting element. Simulation is applied to obtain the crosstalk and the illuminance distribution at each
-
Effect of rough screen on speckle suppression by wavelength and angle diversity in laser projection systems Displays (IF 4.3) Pub Date : 2024-01-09 Yuantong Chen, Linxiao Deng, Binghui Yao, Yuhua Yang, Liquan Zhu, Ting Li, Lixin Xu, Chun Gu
In speckle suppression research, screens play a critical role in formation of speckle. This paper examines screen speckle using diverse light sources and screens of varying roughness. Our findings demonstrate that speckle suppression by screens is the key reason wavelength and angle diversities are not mutually independent in existing literature. Different from the theory, our experiments reveal that
-
Applications of liquid crystal planer optical elements based on photoalignment technology in display and photonic devices Displays (IF 4.3) Pub Date : 2024-01-05 Fangfang Chen, Jihong Zheng, Chenchen Xing, Jingxin Sang, Tong Shen
Liquid crystal (LC) planar optical elements (POEs) based on photoalignment technology have emerged as a promising approach to manipulating light in various ways. Owing to their high diffraction efficiency, polarization sensitivity, and simple fabrication process, LC POEs have found enormous applications in display and photonic devices. In this review, we analyze the Pancharatnam-Berry (PB) phase, polarization
-
Underwater image classification based on image enhancement and information quality evaluation Displays (IF 4.3) Pub Date : 2024-01-04 Shuai Xiao, Xiaotong Shen, Zhuo Zhang, Jiabao Wen, Meng Xi, Jiachen Yang
Underwater target imaging is widely used in the exploration of oceans, rivers and lakes. However, owing to the absorption, scattering and attenuation of light by water, the diffraction limit of the imaging system, aberration distortion and underwater turbulence, underwater images suffer serious degradation, mainly manifested as noise, blur and low resolution. In recent years, some scholars have started