4D style transfer aims to transfer an arbitrary visual style to the synthesized novel views of a dynamic 4D scene across varying viewpoints and time steps. Existing 3D style transfer methods can effectively combine the visual features of style images with neural radiance fields (NeRF), but they fail on 4D dynamic scenes because of their static-scene assumption. We therefore tackle, for the first time, the novel and challenging problem of 4D style transfer, which additionally requires temporally consistent stylization of dynamic objects. In this paper, we introduce StyleDyRF, a method that represents the 4D feature space by deforming a canonical feature volume and learns a linear style transformation matrix on this volume in a data-driven fashion. To obtain the canonical feature volume, rays at each time step are deformed with the geometric prior of a pre-trained dynamic NeRF to render feature maps under the supervision of pre-trained visual encoders. Given the content cues in the canonical feature volume and the style cues in the style image, we learn the style transformation matrix from their covariance matrices with lightweight neural networks. The learned matrix directly matches the feature covariance of the content volume to that of the given style pattern, in analogy with optimizing the Gram matrix in traditional 2D neural style transfer. Experimental results show that our method not only renders 4D photorealistic style transfer results in a zero-shot manner but also outperforms existing methods in visual quality and consistency.
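The covariance-matching idea above can be illustrated with a classic whitening-and-coloring transform. The sketch below is not the StyleDyRF implementation (the paper learns the transformation with lightweight networks on the canonical feature volume); it is a minimal PyTorch illustration, under assumed tensor shapes and a hypothetical function name, of a linear matrix that maps content feature covariance onto style feature covariance.

# Minimal, illustrative covariance-matching (whitening-and-coloring) transform.
# This is NOT the StyleDyRF learned transformation; it only shows the closed-form
# linear matching of content covariance to style covariance that the learned
# matrix is analogous to. Shapes and names are assumptions for illustration.
import torch

def linear_style_transform(content_feat: torch.Tensor,
                           style_feat: torch.Tensor,
                           eps: float = 1e-5) -> torch.Tensor:
    """content_feat, style_feat: (C, N) feature matrices (channels x samples)."""
    # Center the features.
    c_mean = content_feat.mean(dim=1, keepdim=True)
    s_mean = style_feat.mean(dim=1, keepdim=True)
    c = content_feat - c_mean
    s = style_feat - s_mean

    # Covariance matrices (C x C), regularized for numerical stability.
    cov_c = c @ c.t() / (c.shape[1] - 1) + eps * torch.eye(c.shape[0])
    cov_s = s @ s.t() / (s.shape[1] - 1) + eps * torch.eye(s.shape[0])

    # Whitening: remove the content covariance.
    ec, vc = torch.linalg.eigh(cov_c)
    whiten = vc @ torch.diag(ec.clamp_min(eps).rsqrt()) @ vc.t()

    # Coloring: impose the style covariance.
    es, vs = torch.linalg.eigh(cov_s)
    color = vs @ torch.diag(es.clamp_min(eps).sqrt()) @ vs.t()

    # Combined linear transformation T satisfies T cov_c T^T = cov_s,
    # i.e. a direct match of feature covariance to the style pattern.
    T = color @ whiten
    return T @ c + s_mean

Applied to features rendered from a content volume, such a matrix reproduces the first- and second-order statistics of the style features, which is the same statistic matched by Gram-matrix optimization in 2D neural style transfer.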
Zero-shot 4D Style Transfer. Given a casually captured video containing dynamic objects, StyleDyRF transfers the reference style to the 4D scene in a zero-shot manner. Going one step beyond the 3D multi-view consistency of prior style transfer methods, our model renders novel views with temporal consistency in 4D scenes.
The first row shows style images. The second row shows the rendered results of the dynamic NeRF in zero-shot 4D artistic style transfer. The third row shows the rendered results of the dynamic NeRF in zero-shot 4D photorealistic style transfer.
@misc{xu2024styledyrf,
      title={StyleDyRF: Zero-shot 4D Style Transfer for Dynamic Neural Radiance Fields},
      author={Hongbin Xu and Weitao Chen and Feng Xiao and Baigui Sun and Wenxiong Kang},
      year={2024},
      eprint={2403.08310},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}