Abstract
3D model reconstruction is applied in a growing number of construction-related fields, such as urban planning, mobile communication planning, and solar power assessment. Previous 3D reconstruction approaches mostly relied on precise measurement, such as laser scanning and ultrasonic mapping. Although these methods achieve very precise results, they require specialized and usually expensive equipment. The essence of the technology presented here is to infer the overall appearance of a building from photographs taken at known viewpoints, and thereby synthesize images from novel, unseen viewpoints. This paper takes the rendering method as its starting point and learns architectural features by training a neural network that supplies the information needed for rendering. Unlike the more popular projection-based raster rendering, this paper uses point-based volume rendering and samples along light rays to detect architectural features. This rendering method requires the color and density of specific sampling points, so this paper trains a neural network to fit a five-dimensional function: the input is a five-dimensional vector consisting of position (x, y, z) and viewing direction (θ, φ), and the output is the color and density of that point when viewed from that direction. This paper adopts positional encoding, which reduces the size of the network and improves both training speed and rendering speed. Our method can train a usable network in dozens of seconds and render a building at 30-60 frames per second.