-
Taming Diffusion Probabilistic Models for Character Control arXiv.cs.GR Pub Date : 2024-04-23 Rui Chen, Mingyi Shi, Shaoli Huang, Ping Tan, Taku Komura, Xuelin Chen
We present a novel character control framework that effectively utilizes motion diffusion probabilistic models to generate high-quality and diverse character animations, responding in real-time to a variety of dynamic user-supplied control signals. At the heart of our method lies a transformer-based Conditional Autoregressive Motion Diffusion Model (CAMDM), which takes as input the character's historical
-
CoARF: Controllable 3D Artistic Style Transfer for Radiance Fields arXiv.cs.GR Pub Date : 2024-04-23 Deheng Zhang, Clara Fernandez-Labrador, Christopher Schroers
Creating artistic 3D scenes can be time-consuming and requires specialized knowledge. To address this, recent works such as ARF use a radiance-field-based approach with style constraints to generate 3D scenes that resemble a style image provided by the user. However, these methods lack fine-grained control over the resulting scenes. In this paper, we introduce Controllable Artistic Radiance Fields
-
DreamPBR: Text-driven Generation of High-resolution SVBRDF with Multi-modal Guidance arXiv.cs.GR Pub Date : 2024-04-23 Linxuan Xin, Zheng Zhang, Jinfu Wei, Ge Li, Duan Gao
Prior material creation methods had limitations in producing diverse results, mainly because reconstruction-based methods relied on real-world measurements and generation-based methods were trained on relatively small material datasets. To address these challenges, we propose DreamPBR, a novel diffusion-based generative framework designed to create spatially-varying appearance properties guided by text
-
The Life and Legacy of Bui Tuong Phong arXiv.cs.GR Pub Date : 2024-04-22 Yoehan Oh, Jacinda Tran, Theodore Kim
We examine the life and legacy of pioneering Vietnamese American computer scientist Bùi Tuong Phong, whose shading and lighting models turned 50 last year. We trace the trajectory of his life through Vietnam, France, and the United States, and its intersections with global conflicts. Crucially, we present evidence that his name has been cited incorrectly over the last five decades. His family name
-
FaceFolds: Meshed Radiance Manifolds for Efficient Volumetric Rendering of Dynamic Faces arXiv.cs.GR Pub Date : 2024-04-22 Safa C. Medin, Gengyan Li, Ruofei Du, Stephan Garbin, Philip Davidson, Gregory W. Wornell, Thabo Beeler, Abhimitra Meka
3D rendering of dynamic face captures is a challenging problem, and it demands improvements on several fronts: photorealism, efficiency, compatibility, and configurability. We present a novel representation that enables high-quality volumetric rendering of an actor's dynamic facial performances with minimal compute and memory footprint. It runs natively on commodity graphics soft- and
-
DMesh: A Differentiable Representation for General Meshes arXiv.cs.GR Pub Date : 2024-04-20 Sanghyun Son, Matheus Gadelha, Yang Zhou, Zexiang Xu, Ming C. Lin, Yi Zhou
We present a differentiable representation, DMesh, for general 3D triangular meshes. DMesh considers both the geometry and connectivity information of a mesh. In our design, we first get a set of convex tetrahedra that compactly tessellates the domain based on Weighted Delaunay Triangulation (WDT), and formulate probability of faces to exist on our desired mesh in a differentiable manner based on the
-
MixLight: Borrowing the Best of both Spherical Harmonics and Gaussian Models arXiv.cs.GR Pub Date : 2024-04-19 Xinlong Ji, Fangneng Zhan, Shijian Lu, Shi-Sheng Huang, Hua Huang
Accurately estimating scene lighting is critical for applications such as mixed reality. Existing works estimate illumination by generating illumination maps or regressing illumination parameters. However, methods that generate illumination maps have poor generalization performance, and parametric models such as Spherical Harmonics (SH) and Spherical Gaussians (SG) fall short in capturing high-frequency
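As background for the SH side of this mix, here is a minimal sketch of evaluating an SH-encoded light, using the standard first-order real spherical-harmonics constants; this is generic textbook SH, an illustrative assumption rather than MixLight's actual model:

```python
import numpy as np

def sh_basis(d):
    """First-order real spherical harmonics basis for a unit direction d.

    Low-order SH captures smooth lighting well but misses high-frequency
    detail, which is the gap that motivates mixing in Gaussian lobes.
    """
    x, y, z = d
    return np.array([0.282095,        # Y_0^0  (constant band)
                     0.488603 * y,    # Y_1^{-1}
                     0.488603 * z,    # Y_1^0
                     0.488603 * x])   # Y_1^1

def sh_radiance(coeffs, d):
    """Reconstruct radiance toward unit direction d from SH coefficients."""
    return float(coeffs @ sh_basis(d))
```

Higher orders add basis functions but converge slowly for sharp lights, which is why parametric Gaussian lobes are often used alongside SH.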
-
Rendering Participating Media Using Path Graphs arXiv.cs.GR Pub Date : 2024-04-18 Becky Hu, Xi Deng, Fujun Luan, Miloš Hašan, Steve Marschner
Rendering volumetric scattering media, including clouds, fog, smoke, and other complex materials, is crucial for realism in computer graphics. Traditional path tracing, while unbiased, requires many long path samples to converge in scenes with scattering media, and a lot of work is wasted by paths that make a negligible contribution to the image. Methods to make better use of the information learned
-
Holographic Parallax Improves 3D Perceptual Realism arXiv.cs.GR Pub Date : 2024-04-18 Dongyeon Kim, Seung-Woo Nam, Suyeon Choi, Jong-Mo Seo, Gordon Wetzstein, Yoonchan Jeong
Holographic near-eye displays are a promising technology to solve long-standing challenges in virtual and augmented reality display systems. Over the last few years, many different computer-generated holography (CGH) algorithms have been proposed that are supervised by different types of target content, such as 2.5D RGB-depth maps, 3D focal stacks, and 4D light fields. It is unclear, however, what
-
MeshLRM: Large Reconstruction Model for High-Quality Mesh arXiv.cs.GR Pub Date : 2024-04-18 Xinyue Wei, Kai Zhang, Sai Bi, Hao Tan, Fujun Luan, Valentin Deschaintre, Kalyan Sunkavalli, Hao Su, Zexiang Xu
We propose MeshLRM, a novel LRM-based approach that can reconstruct a high-quality mesh from merely four input images in less than one second. Different from previous large reconstruction models (LRMs) that focus on NeRF-based reconstruction, MeshLRM incorporates differentiable mesh extraction and rendering within the LRM framework. This allows for end-to-end mesh reconstruction by fine-tuning a pre-trained
-
Lazy Diffusion Transformer for Interactive Image Editing arXiv.cs.GR Pub Date : 2024-04-18 Yotam Nitzan, Zongze Wu, Richard Zhang, Eli Shechtman, Daniel Cohen-Or, Taesung Park, Michaël Gharbi
We introduce a novel diffusion transformer, LazyDiffusion, that generates partial image updates efficiently. Our approach targets interactive image editing applications in which, starting from a blank canvas or an image, a user specifies a sequence of localized image modifications using binary masks and text prompts. Our generator operates in two phases. First, a context encoder processes the current
-
AniClipart: Clipart Animation with Text-to-Video Priors arXiv.cs.GR Pub Date : 2024-04-18 Ronghuan Wu, Wanchao Su, Kede Ma, Jing Liao
Clipart, a pre-made graphic art form, offers a convenient and efficient way of illustrating visual content. Traditional workflows to convert static clipart images into motion sequences are laborious and time-consuming, involving numerous intricate steps like rigging, key animation and in-betweening. Recent advancements in text-to-video generation hold great potential in resolving this problem. Nevertheless
-
S3R-Net: A Single-Stage Approach to Self-Supervised Shadow Removal arXiv.cs.GR Pub Date : 2024-04-18 Nikolina Kubiak, Armin Mustafa, Graeme Phillipson, Stephen Jolly, Simon Hadfield
In this paper we present S3R-Net, the Self-Supervised Shadow Removal Network. The two-branch WGAN model achieves self-supervision relying on the unify-and-adapt phenomenon: it unifies the style of the output data and infers its characteristics from a database of unaligned shadow-free reference images. This approach stands in contrast to the large body of supervised frameworks. S3R-Net also differentiates
-
Novel View Synthesis for Cinematic Anatomy on Mobile and Immersive Displays arXiv.cs.GR Pub Date : 2024-04-17 Simon Niedermayr, Christoph Neuhauser, Kaloian Petkov, Klaus Engel, Rüdiger Westermann
Interactive photorealistic visualization of 3D anatomy (i.e., Cinematic Anatomy) is used in medical education to explain the structure of the human body. It is currently restricted to frontal teaching scenarios, where the demonstrator needs a powerful GPU and high-speed access to a large storage device where the dataset is hosted. We demonstrate the use of novel view synthesis via compressed 3D Gaussian
-
MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation arXiv.cs.GR Pub Date : 2024-04-17 Kuan-Chieh (Jackson) Wang, Daniil Ostashev, Yuwei Fang, Sergey Tulyakov, Kfir Aberman
We introduce a new architecture for personalization of text-to-image diffusion models, coined Mixture-of-Attention (MoA). Inspired by the Mixture-of-Experts mechanism utilized in large language models (LLMs), MoA distributes the generation workload between two attention pathways: a personalized branch and a non-personalized prior branch. MoA is designed to retain the original model's prior by fixing
-
Generating Human Interaction Motions in Scenes with Text Control arXiv.cs.GR Pub Date : 2024-04-16 Hongwei Yi, Justus Thies, Michael J. Black, Xue Bin Peng, Davis Rempe
We present TeSMo, a method for text-controlled scene-aware motion generation based on denoising diffusion models. Previous text-to-motion methods focus on characters in isolation without considering scenes due to the limited availability of datasets that include motion, text descriptions, and interactive scenes. Our approach begins with pre-training a scene-agnostic text-to-motion diffusion model,
-
Transforming a Non-Differentiable Rasterizer into a Differentiable One with Stochastic Gradient Estimation arXiv.cs.GR Pub Date : 2024-04-15 Thomas Deliot, Eric Heitz, Laurent Belcour
We show how to transform a non-differentiable rasterizer into a differentiable one with minimal engineering efforts and no external dependencies (no Pytorch/Tensorflow). We rely on Stochastic Gradient Estimation, a technique that consists of rasterizing after randomly perturbing the scene's parameters such that their gradient can be stochastically estimated and descended. This method is simple and
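The technique described is close in spirit to simultaneous-perturbation (SPSA-style) finite differencing. A minimal sketch follows, with a toy quadratic loss standing in for "rasterize, then compare to a target image"; the loss, step sizes, and iteration count are illustrative assumptions, not the paper's setup:

```python
import numpy as np

def spsa_gradient(f, theta, c=1e-2, rng=None):
    """One-sample simultaneous-perturbation gradient estimate.

    f is treated as a black box (e.g. a non-differentiable rasterizer
    followed by an image loss); theta is nudged along a random Rademacher
    direction and the finite difference along that direction gives a
    stochastic estimate of the full gradient.
    """
    rng = np.random.default_rng() if rng is None else rng
    delta = rng.choice([-1.0, 1.0], size=theta.shape)  # random +/-1 per parameter
    return (f(theta + c * delta) - f(theta - c * delta)) / (2 * c) * delta

# Toy stand-in for the rendering loss:
target = np.array([0.3, -1.2, 2.0])
loss = lambda th: float(np.sum((th - target) ** 2))

rng = np.random.default_rng(0)
theta = np.zeros(3)
for _ in range(500):
    theta -= 0.05 * spsa_gradient(loss, theta, rng=rng)  # stochastic descent
```

Only two black-box evaluations are needed per step regardless of the number of parameters, which is what makes this attractive when the renderer cannot be differentiated.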
-
InverseVis: Revealing the Hidden with Curved Sphere Tracing arXiv.cs.GR Pub Date : 2024-04-13 Kai Lawonn, Monique Meuschke, Tobias Günther
Exploratory analysis of scalar fields on surface meshes presents significant challenges in identifying and visualizing important regions, particularly on the surface's backside. Previous visualization methods achieved only a limited visibility of significant features, i.e., regions with high or low scalar values, during interactive exploration. In response to this, we propose a novel technique, InverseVis
-
LetsGo: Large-Scale Garage Modeling and Rendering via LiDAR-Assisted Gaussian Primitives arXiv.cs.GR Pub Date : 2024-04-15 Jiadi Cui, Junming Cao, Yuhui Zhong, Liao Wang, Fuqiang Zhao, Penghao Wang, Yifan Chen, Zhipeng He, Lan Xu, Yujiao Shi, Yingliang Zhang, Jingyi Yu
Large garages are ubiquitous yet intricate scenes in our daily lives, posing challenges characterized by monotonous colors, repetitive patterns, reflective surfaces, and transparent vehicle glass. Conventional Structure from Motion (SfM) methods for camera pose estimation and 3D reconstruction fail in these environments due to poor correspondence construction. To address these challenges, this paper
-
Reconstructing Curves from Sparse Samples on Riemannian Manifolds arXiv.cs.GR Pub Date : 2024-04-15 Diana Marin, Filippo Maggioli, Simone Melzi, Stefan Ohrhallinger, Michael Wimmer
Reconstructing 2D curves from sample points has long been a critical challenge in computer graphics, finding essential applications in vector graphics. The design and editing of curves on surfaces has only recently begun to receive attention, primarily relying on human assistance, and where not, limited by very strict sampling conditions. In this work, we formally improve on the state-of-the-art requirements
-
Oblique-MERF: Revisiting and Improving MERF for Oblique Photography arXiv.cs.GR Pub Date : 2024-04-15 Xiaoyi Zeng, Kaiwen Song, Leyuan Yang, Bailin Deng, Juyong Zhang
Neural implicit fields have established a new paradigm for scene representation, with subsequent work achieving high-quality real-time rendering. However, reconstructing 3D scenes from oblique aerial photography presents unique challenges, such as varying spatial scale distributions and a constrained range of tilt angles, often resulting in high memory consumption and reduced rendering quality at extrapolated
-
Learning Human Motion from Monocular Videos via Cross-Modal Manifold Alignment arXiv.cs.GR Pub Date : 2024-04-15 Shuaiying Hou, Hongyu Tao, Junheng Fang, Changqing Zou, Hujun Bao, Weiwei Xu
Learning 3D human motion from 2D inputs is a fundamental task in the realms of computer vision and computer graphics. Many previous methods grapple with this inherently ambiguous task by introducing motion priors into the learning process. However, these approaches face difficulties in defining the complete configurations of such priors or training a robust model. In this paper, we present the Video-to-Motion
-
CompGS: Efficient 3D Scene Representation via Compressed Gaussian Splatting arXiv.cs.GR Pub Date : 2024-04-15 Xiangrui Liu, Xinju Wu, Pingping Zhang, Shiqi Wang, Zhu Li, Sam Kwong
Gaussian splatting, renowned for its exceptional rendering quality and efficiency, has emerged as a prominent technique in 3D scene representation. However, the substantial data volume of Gaussian splatting impedes its practical utility in real-world applications. Herein, we propose an efficient 3D scene representation, named Compressed Gaussian Splatting (CompGS), which harnesses compact Gaussian
-
GPN: Generative Point-based NeRF arXiv.cs.GR Pub Date : 2024-04-12 Haipeng Wang
Scanning real-life scenes with modern registration devices typically gives incomplete point cloud representations, primarily due to the limitations of partial scanning, 3D occlusions, and dynamic light conditions. Recent works on processing incomplete point clouds have largely focused on point cloud completion. However, these approaches do not ensure consistency between the completed point cloud and
-
AdaContour: Adaptive Contour Descriptor with Hierarchical Representation arXiv.cs.GR Pub Date : 2024-04-12 Tianyu Ding, Jinxin Zhou, Tianyi Chen, Zhihui Zhu, Ilya Zharkov, Luming Liang
Existing angle-based contour descriptors suffer from lossy representation for non-star-convex shapes. By and large, this is the result of the shape being registered with a single global inner center and a set of radii corresponding to a polar coordinate parameterization. In this paper, we propose AdaContour, an adaptive contour descriptor that uses multiple local representations to desirably characterize
-
3DMambaComplete: Exploring Structured State Space Model for Point Cloud Completion arXiv.cs.GR Pub Date : 2024-04-10 Yixuan Li, Weidong Yang, Ben Fei
Point cloud completion aims to generate a complete and high-fidelity point cloud from an initially incomplete and low-quality input. A prevalent strategy involves leveraging Transformer-based models to encode global features and facilitate the reconstruction process. However, the adoption of pooling operations to obtain global feature representations often results in the loss of local details within
-
Efficient and Scalable Chinese Vector Font Generation via Component Composition arXiv.cs.GR Pub Date : 2024-04-10 Jinyu Song, Weitao You, Shuhui Shi, Shuxuan Guo, Lingyun Sun, Wei Wang
Chinese vector font generation is challenging due to the complex structure and huge amount of Chinese characters. Recent advances remain limited to generating a small set of characters with simple structure. In this work, we first observe that most Chinese characters can be disassembled into frequently-reused components. Therefore, we introduce the first efficient and scalable Chinese vector font generation
-
Towards Practical Meshlet Compression arXiv.cs.GR Pub Date : 2024-04-09 Bastian Kuth, Max Oberberger, Felix Kawala, Sander Reitter, Sebastian Michel, Matthäus Chajdas, Quirin Meyer
We propose a codec specifically designed for meshlet compression, optimized for rapid data-parallel GPU decompression within a mesh shader. Our compression strategy orders triangles in optimal generalized triangle strips (GTSs), which we generate by formulating the creation as a mixed integer linear program (MILP). Our method achieves index buffer compression rates of 16:1 compared to the vertex pipeline
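For context on generalized triangle strips, here is a minimal strip decoder sketch showing the textbook behavior (alternating winding, degenerate-triangle restarts); this illustrates strips in general, not the paper's MILP-optimized codec or its GPU decompressor:

```python
def decode_tri_strip(indices):
    """Expand a triangle strip index list into individual triangles.

    Each new index forms a triangle with the previous two; winding is
    flipped on every other triangle so all faces stay consistently
    oriented. Degenerate triangles (repeated indices) are skipped, the
    usual trick for restarting a strip without a primitive-restart index.
    """
    tris = []
    for i in range(len(indices) - 2):
        a, b, c = indices[i], indices[i + 1], indices[i + 2]
        if a == b or b == c or a == c:
            continue  # degenerate triangle: acts as a strip restart
        tris.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return tris
```

A strip of n indices encodes up to n - 2 triangles, versus 3 indices per triangle for a plain index buffer, which is the source of the compression headroom the abstract exploits.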
-
Nanouniverse: Virtual Instancing of Structural Detail and Adaptive Shell Mapping arXiv.cs.GR Pub Date : 2024-04-08 Ruwayda Alharbi, Ondřej Strnad, Markus Hadwiger, Ivan Viola
Rendering huge biological scenes with atomistic detail presents a significant challenge in molecular visualization due to the memory limitations inherent in traditional rendering approaches. In this paper, we propose a novel method for the interactive rendering of massive molecular scenes based on hardware-accelerated ray tracing. Our approach circumvents GPU memory constraints by introducing virtual
-
PhysAvatar: Learning the Physics of Dressed 3D Avatars from Visual Observations arXiv.cs.GR Pub Date : 2024-04-05 Yang Zheng, Qingqing Zhao, Guandao Yang, Wang Yifan, Donglai Xiang, Florian Dubost, Dmitry Lagun, Thabo Beeler, Federico Tombari, Leonidas Guibas, Gordon Wetzstein
Modeling and rendering photorealistic avatars is of crucial importance in many applications. Existing methods that build a 3D avatar from visual observations, however, struggle to reconstruct clothed humans. We introduce PhysAvatar, a novel framework that combines inverse rendering with inverse physics to automatically estimate the shape and appearance of a human from multi-view video data along with
-
Stylizing Sparse-View 3D Scenes with Hierarchical Neural Representation arXiv.cs.GR Pub Date : 2024-04-08 Y. Wang, A. Gao, Y. Gong, Y. Zeng
Recently, a surge of 3D style transfer methods has been proposed that leverage the scene reconstruction power of a pre-trained neural radiance field (NeRF). To successfully stylize a scene this way, one must first reconstruct a photo-realistic radiance field from collected images of the scene. However, when only sparse input views are available, pre-trained few-shot NeRFs often suffer from high-frequency
-
Neural-ABC: Neural Parametric Models for Articulated Body with Clothes arXiv.cs.GR Pub Date : 2024-04-06 Honghu Chen, Yuxin Yao, Juyong Zhang
In this paper, we introduce Neural-ABC, a novel parametric model based on neural implicit functions that can represent clothed human bodies with disentangled latent spaces for identity, clothing, shape, and pose. Traditional mesh-based representations struggle to represent articulated bodies with clothes due to the diversity of human body shapes and clothing styles, as well as the complexity of poses
-
Quand rechercher c'est faire des vagues : Dans et à partir des images algorithmiques (When Researching Means Making Waves: In and From Algorithmic Images) arXiv.cs.GR Pub Date : 2024-04-05 Gaëtan Robillard (UP8)
In Search of the Wave is a computer-generated film made in 2013, highlighting the computation of images through computer simulation, and through text and voice. Originating from a screening of the film at the Gustave Eiffel University, the article presents a reflection on research-creation in and from algorithmic images. Fundamentally, what is it in this research-creation -- especially in research
-
LCM-Lookahead for Encoder-based Text-to-Image Personalization arXiv.cs.GR Pub Date : 2024-04-04 Rinon Gal, Or Lichter, Elad Richardson, Or Patashnik, Amit H. Bermano, Gal Chechik, Daniel Cohen-Or
Recent advancements in diffusion models have introduced fast sampling methods that can effectively produce high-quality images in just one or a few denoising steps. Interestingly, when these are distilled from existing diffusion models, they often maintain alignment with the original model, retaining similar outputs for similar prompts and seeds. These properties present opportunities to leverage fast
-
iSeg: Interactive 3D Segmentation via Interactive Attention arXiv.cs.GR Pub Date : 2024-04-04 Itai Lang, Fei Xu, Dale Decatur, Sudarshan Babu, Rana Hanocka
We present iSeg, a new interactive technique for segmenting 3D shapes. Previous works have focused mainly on leveraging pre-trained 2D foundation models for 3D segmentation based on text. However, text may be insufficient for accurately describing fine-grained spatial segmentations. Moreover, achieving a consistent 3D segmentation using a 2D model is challenging since occluded areas of the same semantic
-
Discontinuity-preserving Normal Integration with Auxiliary Edges arXiv.cs.GR Pub Date : 2024-04-04 Hyomin Kim, Yucheol Jung, Seungyong Lee
Many surface reconstruction methods incorporate normal integration, which is a process to obtain a depth map from surface gradients. In this process, the input may represent a surface with discontinuities, e.g., due to self-occlusion. To reconstruct an accurate depth map from the input normal map, hidden surface gradients occurring from the jumps must be handled. To model these jumps correctly, we
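The normal-integration step the abstract refers to can be illustrated with the classic smooth least-squares baseline, the kind of formulation the paper's auxiliary edges improve upon. A minimal sketch, using a dense solver for brevity and assuming gradient conventions p = dz/dx, q = dz/dy (no discontinuity handling here):

```python
import numpy as np

def integrate_normals(p, q):
    """Least-squares depth map from surface gradients p = dz/dx, q = dz/dy.

    One linear equation per forward difference; real implementations use
    sparse solvers and, as in discontinuity-preserving methods, special
    handling where the surface jumps.
    """
    h, w = p.shape
    n = h * w
    idx = lambda y, x: y * w + x
    rows, rhs = [], []
    for y in range(h):
        for x in range(w - 1):          # horizontal difference equations
            r = np.zeros(n); r[idx(y, x + 1)] = 1; r[idx(y, x)] = -1
            rows.append(r); rhs.append(p[y, x])
    for y in range(h - 1):
        for x in range(w):              # vertical difference equations
            r = np.zeros(n); r[idx(y + 1, x)] = 1; r[idx(y, x)] = -1
            rows.append(r); rhs.append(q[y, x])
    # Pin one depth value: gradients determine z only up to a constant.
    r = np.zeros(n); r[0] = 1
    rows.append(r); rhs.append(0.0)
    z, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return z.reshape(h, w)
```

On a consistent gradient field this recovers the surface exactly (up to the pinned constant); at self-occlusion jumps the smooth model smears depth across the discontinuity, which is the failure mode the paper targets.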
-
MatAtlas: Text-driven Consistent Geometry Texturing and Material Assignment arXiv.cs.GR Pub Date : 2024-04-03 Duygu Ceylan, Valentin Deschaintre, Thibault Groueix, Rosalie Martin, Chun-Hao Huang, Romain Rouffet, Vladimir Kim, Gaëtan Lassagne
We present MatAtlas, a method for consistent text-guided 3D model texturing. Following recent progress we leverage a large scale text-to-image generation model (e.g., Stable Diffusion) as a prior to texture a 3D model. We carefully design an RGB texturing pipeline that leverages a grid pattern diffusion, driven by depth and edges. By proposing a multi-step texture refinement process, we significantly
-
3DStyleGLIP: Part-Tailored Text-Guided 3D Neural Stylization arXiv.cs.GR Pub Date : 2024-04-03 SeungJeh Chung, JooHyun Park, Hyewon Kan, HyeongYeop Kang
3D stylization, which entails the application of specific styles to three-dimensional objects, holds significant commercial potential as it enables the creation of diverse 3D objects with distinct moods and styles, tailored to specific demands of different scenes. With recent advancements in text-driven methods and artificial intelligence, the stylization process is increasingly intuitive and automated
-
Gen4DS: Workshop on Data Storytelling in an Era of Generative AI arXiv.cs.GR Pub Date : 2024-04-02 Xingyu Lan, Leni Yang, Zezhong Wang, Danqing Shi, Sheelagh Carpendale
Storytelling is an ancient and precious human ability that has been rejuvenated in the digital age. Over the last decade, there has been a notable surge in the recognition and application of data storytelling, both in academia and industry. Recently, the rapid development of generative AI has brought new opportunities and challenges to this field, sparking numerous new questions. These questions may
-
Efficient 3D Implicit Head Avatar with Mesh-anchored Hash Table Blendshapes arXiv.cs.GR Pub Date : 2024-04-02 Ziqian Bai, Feitong Tan, Sean Fanello, Rohit Pandey, Mingsong Dou, Shichen Liu, Ping Tan, Yinda Zhang
3D head avatars built with neural implicit volumetric representations have achieved unprecedented levels of photorealism. However, the computational cost of these methods remains a significant barrier to their widespread adoption, particularly in real-time applications such as virtual reality and teleconferencing. While attempts have been made to develop fast neural rendering approaches for static
-
Neural Implicit Representation for Building Digital Twins of Unknown Articulated Objects arXiv.cs.GR Pub Date : 2024-04-01 Yijia Weng, Bowen Wen, Jonathan Tremblay, Valts Blukis, Dieter Fox, Leonidas Guibas, Stan Birchfield
We address the problem of building digital twins of unknown articulated objects from two RGBD scans of the object at different articulation states. We decompose the problem into two stages, each addressing distinct aspects. Our method first reconstructs object-level shape at each state, then recovers the underlying articulation model including part segmentation and joint articulations that associate
-
VortexViz: Finding Vortex Boundaries by Learning from Particle Trajectories arXiv.cs.GR Pub Date : 2024-04-01 Akila de Silva, Nicholas Tee, Omkar Ghanekar, Fahim Hasan Khan, Gregory Dusek, James Davis, Alex Pang
Vortices are studied in various scientific disciplines, offering insights into fluid flow behavior. Visualizing the boundary of vortices is crucial for understanding flow phenomena and detecting flow irregularities. This paper addresses the challenge of accurately extracting vortex boundaries using deep learning techniques. While existing methods primarily train on velocity components, we propose a
-
CCWSIM: An Efficient and Fast Wavelet-Based CCSIM for Categorical Characterization of Large-Scale arXiv.cs.GR Pub Date : 2024-03-30 Mojtaba Bavandsavadkoohi, Erwan Gloaguen, Behzad Tokhmechi, Alireza Arab-Amiri, Bernard Giroux
Over the last couple of decades, there has been a surge in various approaches to multiple-point statistics simulation, commonly referred to as MPS. These methods have aimed to improve several critical aspects of realism in the results, including spatial continuity, conditioning, stochasticity, and computational efficiency. Nevertheless, achieving a simultaneous enhancement of these crucial factors
-
Mirror-3DGS: Incorporating Mirror Reflections into 3D Gaussian Splatting arXiv.cs.GR Pub Date : 2024-04-01 Jiarui Meng, Haijie Li, Yanmin Wu, Qiankun Gao, Shuzhou Yang, Jian Zhang, Siwei Ma
3D Gaussian Splatting (3DGS) has marked a significant breakthrough in the realm of 3D scene reconstruction and novel view synthesis. However, 3DGS, much like its predecessor Neural Radiance Fields (NeRF), struggles to accurately model physical reflections, particularly in mirrors that are ubiquitous in real-world scenes. As a result, reflections are mistakenly perceived as separate entities that physically
-
OmniSDF: Scene Reconstruction using Omnidirectional Signed Distance Functions and Adaptive Binoctrees arXiv.cs.GR Pub Date : 2024-03-31 Hakyeong Kim, Andreas Meuleman, Hyeonjoong Jang, James Tompkin, Min H. Kim
We present a method to reconstruct indoor and outdoor static scene geometry and appearance from an omnidirectional video moving in a small circular sweep. This setting is challenging because of the small baseline and large depth ranges, making it difficult to find ray crossings. To better constrain the optimization, we estimate geometry as a signed distance field within a spherical binoctree data structure
-
OmniLocalRF: Omnidirectional Local Radiance Fields from Dynamic Videos arXiv.cs.GR Pub Date : 2024-03-31 Dongyoung Choi, Hyeonjoong Jang, Min H. Kim
Omnidirectional cameras are extensively used in various applications to provide a wide field of vision. However, they face a challenge in synthesizing novel views due to the inevitable presence of dynamic objects, including the photographer, in their wide field of view. In this paper, we introduce a new approach called Omnidirectional Local Radiance Fields (OmniLocalRF) that can render static-only
-
3DGSR: Implicit Surface Reconstruction with 3D Gaussian Splatting arXiv.cs.GR Pub Date : 2024-03-30 Xiaoyang Lyu, Yang-Tian Sun, Yi-Hua Huang, Xiuzhe Wu, Ziyi Yang, Yilun Chen, Jiangmiao Pang, Xiaojuan Qi
In this paper, we present an implicit surface reconstruction method with 3D Gaussian Splatting (3DGS), namely 3DGSR, that allows for accurate 3D reconstruction with intricate details while inheriting the high efficiency and rendering quality of 3DGS. The key insight is incorporating an implicit signed distance field (SDF) within 3D Gaussians to enable them to be aligned and jointly optimized. First
-
Mil2: Efficient Cloth Simulation Using Non-distance Barriers and Subspace Reuse arXiv.cs.GR Pub Date : 2024-03-28 Lei Lan, Zixuan Lu, Jingyi Long, Chun Yuan, Xuan Li, Xiaowei He, Huamin Wang, Chenfanfu Jiang, Yin Yang
Mil2 pushes the performance of high-resolution cloth simulation, making the simulation interactive (in milliseconds) for models with one million degrees of freedom (DOFs) while keeping every triangle untangled. The guarantee of being penetration-free is inspired by the interior-point method, which converts the inequality constraints to barrier potentials. Nevertheless, we propose a major overhaul of
-
Nearest Neighbor Classification for Classical Image Upsampling arXiv.cs.GR Pub Date : 2024-03-28 Evan Matthews, Nicolas Prate
Given a set of ordered pixel data in the form of an image, our goal is to perform upsampling on the data such that: the resulting resolution is improved by some factor; the final result passes the human test, having added new, believable, and realistic information and detail to the image; and the time complexity for upscaling remains relatively close to that of lossy upscaling implementations.
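Classical nearest-neighbor upsampling, the baseline the title refers to, copies each output pixel from the closest input pixel. A minimal NumPy sketch of that baseline (not the authors' classification-based variant):

```python
import numpy as np

def nn_upsample(img, factor):
    """Nearest-neighbor upsampling by an integer factor.

    Repeating each pixel `factor` times along both axes is equivalent to
    sampling the nearest source pixel for every output location.
    """
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)
```

This runs in time linear in the output size, the benchmark against which more detail-synthesizing upscalers are compared.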
-
TOGS: Gaussian Splatting with Temporal Opacity Offset for Real-Time 4D DSA Rendering arXiv.cs.GR Pub Date : 2024-03-28 Shuai Zhang, Huangxuan Zhao, Zhenghong Zhou, Guanjun Wu, Chuansheng Zheng, Xinggang Wang, Wenyu Liu
Four-dimensional Digital Subtraction Angiography (4D DSA) is a medical imaging technique that provides a series of 2D images captured at different stages and angles as a contrast agent fills the blood vessels. It plays a significant role in the diagnosis of cerebrovascular diseases. Improving the rendering quality and speed under sparse sampling is important for observing the status
-
CoherentGS: Sparse Novel View Synthesis with Coherent 3D Gaussians arXiv.cs.GR Pub Date : 2024-03-28 Avinash Paliwal, Wei Ye, Jinhui Xiong, Dmytro Kotovenko, Rakesh Ranjan, Vikas Chandra, Nima Khademi Kalantari
The field of 3D reconstruction from images has rapidly evolved in the past few years, first with the introduction of Neural Radiance Field (NeRF) and more recently with 3D Gaussian Splatting (3DGS). The latter provides a significant edge over NeRF in terms of the training and inference speed, as well as the reconstruction quality. Although 3DGS works well for dense input images, the unstructured point-cloud
-
Holo-VQVAE: VQ-VAE for phase-only holograms arXiv.cs.GR Pub Date : 2024-03-29 Joohyun Park, Hyeongyeop Kang
Holography stands at the forefront of visual technology innovation, offering immersive, three-dimensional visualizations through the manipulation of light wave amplitude and phase. Contemporary research in hologram generation has predominantly focused on image-to-hologram conversion, producing holograms from existing images. These approaches, while effective, inherently limit the scope of innovation
-
MATTopo: Topology-preserving Medial Axis Transform with Restricted Power Diagram arXiv.cs.GR Pub Date : 2024-03-27 Ningna Wang, Hui Huang, Shibo Song, Bin Wang, Wenping Wang, Xiaohu Guo
We present a novel volumetric RPD (restricted power diagram) based framework for approximating the medial axes of 3D CAD shapes adaptively, while preserving topological equivalence, medial features, and geometric convergence. To solve the topology preservation problem, we propose a volumetric RPD based strategy, which discretizes the input volume into sub-regions given a set of medial spheres. With
-
InstructBrush: Learning Attention-based Instruction Optimization for Image Editing arXiv.cs.GR Pub Date : 2024-03-27 Ruoyu Zhao, Qingnan Fan, Fei Kou, Shuai Qin, Hong Gu, Wei Wu, Pengcheng Xu, Mingrui Zhu, Nannan Wang, Xinbo Gao
In recent years, instruction-based image editing methods have garnered significant attention in image editing. However, despite encompassing a wide range of editing priors, these methods are helpless when handling editing tasks that are challenging to accurately describe through language. We propose InstructBrush, an inversion method for instruction-based image editing methods to bridge this gap. It
-
Modeling uncertainty for Gaussian Splatting arXiv.cs.GR Pub Date : 2024-03-27 Luca Savant, Diego Valsesia, Enrico Magli
We present Stochastic Gaussian Splatting (SGS): the first framework for uncertainty estimation using Gaussian Splatting (GS). GS recently advanced the novel-view synthesis field by achieving impressive reconstruction quality at a fraction of the computational cost of Neural Radiance Fields (NeRF). However, contrary to the latter, it still lacks the ability to provide information about the confidence
-
Predicting Perceived Gloss: Do Weak Labels Suffice? arXiv.cs.GR Pub Date : 2024-03-26 Julia Guerrero-Viu, J. Daniel Subias, Ana Serrano, Katherine R. Storrs, Roland W. Fleming, Belen Masia, Diego Gutierrez
Estimating perceptual attributes of materials directly from images is a challenging task due to their complex, not fully-understood interactions with external factors, such as geometry and lighting. Supervised deep learning models have recently been shown to outperform traditional approaches, but rely on large datasets of human-annotated images for accurate perception predictions. Obtaining reliable
-
Distributed Simulation of Large Multi-body Systems arXiv.cs.GR Pub Date : 2024-03-25 Manas Kale, Paul G. Kry
We present a technique designed for parallelizing large rigid body simulations, capable of exploiting multiple CPU cores within a computer and across a network. Our approach can be applied to simulate both unilateral and bilateral constraints, requiring straightforward modifications to the underlying physics engine. Starting from an approximate partitioning, we identify interface bodies and add them
-
2D Gaussian Splatting for Geometrically Accurate Radiance Fields arXiv.cs.GR Pub Date : 2024-03-26 Binbin Huang, Zehao Yu, Anpei Chen, Andreas Geiger, Shenghua Gao
3D Gaussian Splatting (3DGS) has recently revolutionized radiance field reconstruction, achieving high quality novel view synthesis and fast rendering speed without baking. However, 3DGS fails to accurately represent surfaces due to the multi-view inconsistent nature of 3D Gaussians. We present 2D Gaussian Splatting (2DGS), a novel approach to model and reconstruct geometrically accurate radiance fields
-
GenesisTex: Adapting Image Denoising Diffusion to Texture Space arXiv.cs.GR Pub Date : 2024-03-26 Chenjian Gao, Boyan Jiang, Xinghui Li, Yingpeng Zhang, Qian Yu
We present GenesisTex, a novel method for synthesizing textures for 3D geometries from text descriptions. GenesisTex adapts the pretrained image diffusion model to texture space by texture space sampling. Specifically, we maintain a latent texture map for each viewpoint, which is updated with predicted noise on the rendering of the corresponding viewpoint. The sampled latent texture maps are then decoded
-
Makeup Prior Models for 3D Facial Makeup Estimation and Applications arXiv.cs.GR Pub Date : 2024-03-26 Xingchao Yang, Takafumi Taketomi, Yuki Endo, Yoshihiro Kanamori
In this work, we introduce two types of makeup prior models to extend existing 3D face prior models: PCA-based and StyleGAN2-based priors. The PCA-based prior model is a linear model that is easy to construct and is computationally efficient. However, it retains only low-frequency information. Conversely, the StyleGAN2-based model can represent high-frequency information with relatively higher computational