DreamCar: Leveraging Car-specific Prior for in-the-wild 3D Car Reconstruction

DreamCar reconstructs 3D cars in forward-moving driving scenes, even from a single image of supervision.

Abstract

Self-driving companies usually employ professional artists to build exquisite 3D cars, but crafting large-scale digital assets is expensive. Since numerous datasets already contain a vast number of car images, we focus on reconstructing high-quality 3D car models from these datasets. However, these datasets capture only one side of each car in forward-moving scenes. We try to use existing generative models to provide more supervision, but they struggle to generalize to cars because they are trained on synthetic datasets that are not car-specific. In addition, the reconstructed 3D car texture misaligns due to large errors in camera pose estimation on in-the-wild images. These restrictions make it difficult for previous methods to reconstruct complete 3D cars. To address these problems, we propose a novel method, named DreamCar, which can reconstruct a high-quality 3D car from a few images or even a single image. To generalize the generative model, we collect a car dataset, named Car360, with over 5,600 vehicles; with this dataset, we make the generative model more robust to cars. We use this car-specific generative prior to guide reconstruction via Score Distillation Sampling. To further complement the supervision, we exploit the geometric and appearance symmetry of cars. Finally, we propose a pose optimization method that rectifies poses to tackle texture misalignment. Extensive experiments demonstrate that our method significantly outperforms existing methods in reconstructing high-quality 3D cars.

Reconstructed from nuScenes

Our Car360 Dataset

This work aims to reconstruct a complete 3D model from a limited number of images, typically one to five. Relying solely on this supervision is insufficient, so we integrate a generative prior from a recent large-scale 3D-aware diffusion model, Zero-123-XL, into our method. We found that this model fails to generalize well to realistic cars because it was trained on large-scale synthetic datasets, such as Objaverse, that are not car-specific. In this work, we collect a car dataset, named Car360, containing 5,600 synthetic cars, to make the generative model more robust to realistic cars.
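The generative prior supervises reconstruction through Score Distillation Sampling, as stated in the abstract. The sketch below shows one common form of an SDS step, under stated assumptions: `diffusion_eps` is a hypothetical stand-in for the noise-prediction network of a view-conditioned prior such as Zero-123-XL, and the weighting `w(t) = 1 - alpha_bar_t` is one frequently used choice, not necessarily the paper's exact formulation.

```python
import torch

def sds_loss(rendered, diffusion_eps, t, alphas_cumprod):
    """One Score Distillation Sampling step (sketch).

    rendered: differentiable render of the current 3D car, shape (B, C, H, W).
    diffusion_eps: callable(noisy_image, t) -> predicted noise; a stand-in
        for a frozen view-conditioned diffusion prior (e.g. Zero-123-XL).
    alphas_cumprod: 1-D tensor of cumulative noise-schedule products.
    """
    a_t = alphas_cumprod[t]
    noise = torch.randn_like(rendered)
    # Forward-diffuse the render to timestep t.
    noisy = a_t.sqrt() * rendered + (1.0 - a_t).sqrt() * noise
    with torch.no_grad():  # the prior is frozen; no grads through it
        eps_pred = diffusion_eps(noisy, t)
    w = 1.0 - a_t  # a common SDS weighting (assumption)
    grad = w * (eps_pred - noise)
    # Reparameterized loss whose gradient w.r.t. `rendered` equals `grad`.
    return (grad.detach() * rendered).sum()
```

Minimizing this loss pushes the rendered views toward images the prior considers likely, which is what lets a single reference image be extrapolated into a full 360° car.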


Proposed Method

Our proposed method, DreamCar, reconstructs high-quality 3D car models from a limited number of images by combining several key techniques. It begins with image segmentation and mirroring to create additional reference views, followed by estimating and refining camera poses on datasets such as nuScenes. Geometry is then reconstructed progressively, from coarse to fine, with NeRF, NeuS, and DMTet representations, combined with several loss functions to ensure accuracy. Texture is refined with generative models and DreamBooth for photorealistic results, and a PoseMLP optimizes camera poses to correct texture misalignment.
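The PoseMLP step could be sketched as a small network that predicts per-image pose residuals on top of the initial (noisy) camera estimates; the architecture below is an illustrative guess, since the text only states that an MLP refines poses to fix texture misalignment.

```python
import torch
import torch.nn as nn

class PoseMLP(nn.Module):
    """Predict small camera-pose corrections per reference image (sketch).

    Each image index maps to a learned embedding, and the MLP outputs a
    6-DoF residual (3 axis-angle rotation + 3 translation) that is added
    to that image's initial pose estimate. Layer sizes are assumptions.
    """
    def __init__(self, num_images, embed_dim=16, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(num_images, embed_dim)
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 6),
        )
        # Zero-init the last layer so optimization starts from the
        # unmodified initial poses (identity correction).
        nn.init.zeros_(self.mlp[-1].weight)
        nn.init.zeros_(self.mlp[-1].bias)

    def forward(self, image_idx):
        return self.mlp(self.embed(image_idx))  # (B, 6) pose residuals
```

Optimizing these residuals jointly with the photometric reconstruction losses lets the poses drift toward values that keep the texture aligned across views.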


Comparison