1. Resources
Project:
Paper:
Code:
2. Paper
2.1 Abstract
Neural radiance fields can reconstruct high-quality drivable human avatars, but they are expensive to train and render. To reduce this cost, this paper proposes Animatable 3D Gaussians, which learn human avatars from input images and poses. The method extends 3D Gaussians [1] to dynamic human scenes by modeling a set of skinned 3D Gaussians together with a corresponding skeleton in canonical space, and deforming the 3D Gaussians into posed space according to the input pose. It introduces hash-encoded shape and appearance to speed up training, and proposes time-dependent ambient occlusion to achieve high-quality reconstruction in scenes with complex motion and dynamic shadows. On both novel-view synthesis and novel-pose synthesis tasks, the proposed method outperforms existing methods in training time, rendering speed, and reconstruction quality. The method also extends easily to multi-human scenes, achieving high-quality novel-view synthesis of a ten-person scene within 25 seconds of training.
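The core of the approach is the canonical-to-posed deformation. As a sketch, assuming standard linear blend skinning over the skeleton joints (as in SMPL [3]; the paper's exact formulation may differ), a canonical Gaussian center $x_0$ with skinning weights $w_j$ is mapped to the posed space at time $t$ as

$$x_t = \sum_{j=1}^{J} w_j \left( S_{t,j}\, x_0 + T_{t,j} \right),$$

where $S_{t,j}$ and $T_{t,j}$ denote the rotation and translation of joint $j$ under the input pose.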
2.2 Method
The proposed animatable 3D Gaussian representation consists of a set of skinned 3D Gaussians and a corresponding canonical skeleton. Each skinned 3D Gaussian carries a center x0, rotation R, scale S, opacity α0, and skinning weights w. First, we sample spherical harmonic coefficients SH, a vertex displacement δx, and an ambient occlusion value ao from the hash-encoded parameter field at the center x0, where the multilayer perceptron for ao takes an additional frequency-encoded time γ(t) as input. Next, we combine the sampled parameters with the original parameters and the shifted center x0' in canonical space. Finally, we deform the 3D Gaussians into posed space according to the input pose (S_t, T_t) and render them to an image with 3D Gaussian rasterization.
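A minimal sketch of this per-frame pipeline in Python follows. Everything here is a reading of the paragraph above, not the authors' code: `hash_field` stands in for an Instant-NGP-style hash-grid network [2], `ao_mlp` for the ambient-occlusion MLP, the pose (S_t, T_t) is interpreted as per-joint rotations and translations, and the feature layout and the way ao modulates appearance are assumptions.

```python
import torch

def gamma(t: torch.Tensor, num_freqs: int = 4) -> torch.Tensor:
    """Frequency-encode a scalar time t as [sin(2^k*pi*t), cos(2^k*pi*t)]."""
    angles = t * (2.0 ** torch.arange(num_freqs)) * torch.pi
    return torch.cat([angles.sin(), angles.cos()])

def pose_gaussians(g, hash_field, ao_mlp, S_t, T_t, t):
    """Deform one avatar's skinned 3D Gaussians from canonical to posed space.

    g: dict with centers x0 (N,3), rotations R (N,3,3), scales S (N,3),
       opacities alpha0 (N,1), and skinning weights w (N,J).
    S_t: (J,3,3) per-joint rotations; T_t: (J,3) per-joint translations.
    """
    x0, R, S, alpha0, w = g["x0"], g["R"], g["S"], g["alpha0"], g["w"]

    # 1) Sample SH coefficients, vertex displacement, and ambient occlusion
    #    from the hash-encoded parameter field at the canonical centers.
    feat = hash_field(x0)                               # (N, F)
    sh, delta_x = feat[..., :48], feat[..., 48:51]      # hypothetical layout
    ao = ao_mlp(torch.cat([feat, gamma(t).expand(len(x0), -1)], dim=-1))

    # 2) Shift the centers in canonical space: x0' = x0 + delta_x.
    x0 = x0 + delta_x

    # 3) Linear blend skinning: mix per-joint transforms with weights w.
    M = torch.einsum("nj,jab->nab", w, torch.cat([S_t, T_t[..., None]], dim=-1))
    x_posed = torch.einsum("nab,nb->na", M[..., :3], x0) + M[..., 3]
    # Blended rotation matrices are only approximately orthonormal.
    R_posed = torch.einsum("nj,jab->nab", w, S_t) @ R

    # Apply ao as a scalar modulation of appearance (one plausible choice).
    return {"means": x_posed, "rotations": R_posed, "scales": S,
            "opacities": alpha0, "sh": sh * ao}
```

A frame would then be rendered as `rasterize(**pose_gaussians(g, hash_field, ao_mlp, S_t, T_t, t))`, with `rasterize` supplied by a 3D Gaussian splatting backend [1].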
3. Results
3.1 Training
3.2 Multi-Human Scene
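The multi-human case follows directly from the representation: each avatar is an independent set of Gaussians, so avatars can be posed separately and rasterized together in a single pass. A hedged sketch reusing the hypothetical `pose_gaussians` from 2.2:

```python
def render_multi_human(avatars, nets, poses, t, rasterize):
    """Pose each avatar independently, then splat the union of all Gaussians.

    avatars: list of per-avatar Gaussian dicts; nets: matching list of
    (hash_field, ao_mlp) pairs; poses: matching list of (S_t, T_t) pairs.
    `rasterize` stands in for a 3D Gaussian splatting renderer [1].
    """
    posed = [pose_gaussians(g, f, a, S_t, T_t, t)
             for g, (f, a), (S_t, T_t) in zip(avatars, nets, poses)]
    merged = {k: torch.cat([p[k] for p in posed]) for k in posed[0]}
    return rasterize(**merged)
```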
4. Summary
We propose a new human reconstruction method that can reconstruct a high-quality drivable human avatar within seconds. Compared with state-of-the-art methods, our method achieves higher reconstruction quality in less time. Moreover, it extends to multi-human scenes, enabling fast free-viewpoint video synthesis of complex scenes.
5. References
[1] Kerbl, B., Kopanas, G., Leimkühler, T., & Drettakis, G. (2023). 3D Gaussian splatting for real-time radiance field rendering. ACM Transactions on Graphics (ToG), 42(4), 1-14.
[2] Müller, T., Evans, A., Schied, C., & Keller, A. (2022). Instant neural graphics primitives with a multiresolution hash encoding. ACM Transactions on Graphics (ToG), 41(4), 1-15.
[3] Loper, M., Mahmood, N., Romero, J., Pons-Moll, G., & Black, M. J. (2023). SMPL: A skinned multi-person linear model. In Seminal Graphics Papers: Pushing the Boundaries, Volume 2 (pp. 851-866).
[4] Jiang, T., Chen, X., Song, J., & Hilliges, O. (2023). InstantAvatar: Learning avatars from monocular video in 60 seconds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 16922-16932).
[5] Chen, J., Zhang, Y., Kang, D., Zhe, X., Bao, L., Jia, X., & Lu, H. (2021). Animatable neural radiance fields from monocular RGB videos. arXiv preprint arXiv:2106.13629.
[6] Alldieck, T., Magnor, M., Xu, W., Theobalt, C., & Pons-Moll, G. (2018, September). Detailed human avatars from monocular video. In 2018 International Conference on 3D Vision (3DV) (pp. 98-109). IEEE.