SceneNet RGB-D: 5M Photorealistic Images of Synthetic Indoor Trajectories with Ground Truth

John McCormac
Ankur Handa
Stefan Leutenegger
Andrew J. Davison

Dyson Robotics Lab at Imperial College, Department of Computing, Imperial College London


Example renders sampled from the dataset. We are able to render a large variety of such scenes, with objects sampled from ShapeNet and layouts from SceneNet.

Abstract

We introduce SceneNet RGB-D, expanding the previous work of SceneNet to enable large-scale photorealistic rendering of indoor scene trajectories. It provides pixel-perfect ground truth for scene understanding problems such as semantic segmentation, instance segmentation, and object detection, as well as for geometric computer vision problems such as optical flow, depth estimation, camera pose estimation, and 3D reconstruction. Random sampling permits virtually unlimited scene configurations; here we provide a set of 5M rendered RGB-D images from over 15K trajectories in synthetic layouts, with random but physically simulated object poses. Each layout also has random lighting, camera trajectories, and textures. The scale of this dataset is well suited to pre-training data-driven computer vision techniques from scratch with RGB-D inputs, which has previously been limited by the relatively small labelled datasets NYUv2 and SUN RGB-D. It also provides a basis for investigating 3D scene labelling tasks, by providing perfect camera poses and depth data as a proxy for a SLAM system.
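To make the SLAM-proxy use concrete: given the ground-truth depth and camera poses, each frame can be back-projected into a world-frame point cloud for 3D labelling experiments. The Python sketch below is a minimal illustration under stated assumptions, not the dataset's reference code: the image size, focal lengths, millimetre depth scale, and camera-to-world pose convention are assumptions that should be checked against the pySceneNetRGBD repository, which documents the actual camera parameters and the exact depth convention.

import numpy as np

# Assumed pinhole intrinsics for 320x240 renders (hypothetical values);
# check pySceneNetRGBD for the dataset's actual camera parameters and for
# the exact depth convention (planar depth vs. euclidean ray length).
W, H = 320, 240
FX, FY = 277.13, 289.71
CX, CY = (W - 1) / 2.0, (H - 1) / 2.0

def backproject(depth_mm, T_wc):
    """Lift an HxW depth map (assumed millimetres) to an Nx3 world point cloud.

    T_wc is an assumed 4x4 camera-to-world pose, e.g. taken from the
    ground-truth trajectory shipped with the dataset.
    """
    z = depth_mm.astype(np.float64) / 1000.0          # metres
    u, v = np.meshgrid(np.arange(W), np.arange(H))    # pixel coordinates
    x = (u - CX) / FX * z
    y = (v - CY) / FY * z
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=-1).reshape(-1, 4)
    pts_cam = pts_cam[pts_cam[:, 2] > 0]              # drop zero-depth pixels
    return (pts_cam @ T_wc.T)[:, :3]                  # world-frame XYZ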


Paper

Overview




Dataset

Training Set [263GB]    Training Set Protobuf [323MB]

Validation Set [15GB]     Validation Set Protobuf [31MB]

Caveat: untarring can take some time, since the dataset contains a large number of subdirectories (see the Python sketch after the tarball list below).


The training dataset is also split into 17 tarballs:

train_0 [16GB]
train_1 [16GB]
train_2 [16GB]
train_3 [16GB]
train_4 [16GB]
train_5 [16GB]
train_6 [16GB]
train_7 [16GB]
train_8 [16GB]
train_9 [16GB]
train_10 [16GB]
train_11 [16GB]
train_12 [16GB]
train_13 [16GB]
train_14 [16GB]
train_15 [16GB]
train_16 [16GB]
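The split tarballs can be extracted programmatically rather than by hand. The following is a minimal Python sketch, assuming the 17 archives above were saved as train_0.tar.gz through train_16.tar.gz in the current directory; the local filenames and the output directory are illustrative, not part of the dataset.

import tarfile
from pathlib import Path

# Assumed local filenames for the 17 training tarballs; adjust to match
# however your download tool named them.
TARBALLS = [Path(f"train_{i}.tar.gz") for i in range(17)]
OUTPUT_DIR = Path("scenenet_rgbd_train")  # hypothetical output directory
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)

for tarball in TARBALLS:
    if not tarball.exists():
        print(f"skipping missing archive: {tarball}")
        continue
    print(f"extracting {tarball} (this can take a while)...")
    # "r:*" lets tarfile auto-detect the compression (gzip, bz2, or none).
    with tarfile.open(tarball, "r:*") as archive:
        archive.extractall(OUTPUT_DIR)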



Code to parse the dataset

https://github.com/jmccormac/pySceneNetRGBD
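As an illustration of the intended workflow, the sketch below loads the trajectory protobuf and resolves the per-frame file paths. It is a minimal sketch, assuming the scenenet_pb2 module generated by pySceneNetRGBD and the photo/, depth/, and instance/ subdirectory layout described in that repository; the field names (trajectories, render_path, views, frame_num) and the protobuf filename should be checked against the current proto definition there.

import os

import scenenet_pb2 as sn  # generated from scenenet.proto in pySceneNetRGBD

DATA_ROOT = "train"                         # assumed extraction directory
PROTOBUF_PATH = "scenenet_rgbd_train.pb"    # hypothetical protobuf filename

trajectories = sn.Trajectories()
with open(PROTOBUF_PATH, "rb") as f:
    trajectories.ParseFromString(f.read())

for traj in trajectories.trajectories:
    for view in traj.views:
        # Each view has aligned photo / depth / instance renders.
        photo = os.path.join(DATA_ROOT, traj.render_path, "photo", f"{view.frame_num}.jpg")
        depth = os.path.join(DATA_ROOT, traj.render_path, "depth", f"{view.frame_num}.png")
        instance = os.path.join(DATA_ROOT, traj.render_path, "instance", f"{view.frame_num}.png")
        print(photo, depth, instance)
    break  # first trajectory only, for illustration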


SUN RGB-D (alternative links that provide a cleaned-up dataset)

https://github.com/ankurhanda/sunrgbd-meta-data


NYUv2 (alternative links that provide a cleaned-up dataset)

https://github.com/ankurhanda/nyuv2-meta-data


Floor plans from SceneNet

https://drive.google.com/open?id=0B_CLZMBI0zcuRmM4cDIzdUtSdUU




Acknowledgements

Research presented in this paper has been supported by Dyson Technology Ltd. We would also like to thank Patrick Bardow for providing optical flow code, and Phillip Isola for the neat website template (used to host pix2pix) that we modified.



License

GPL.