
 
The model can then be used for online path planning in a given scene: the object is recognized, its pose is estimated, and a suitable grasp, computed off-line beforehand, is executed. Figure 16 shows the trajectory sequence in simulation and on the real robot for our scenario (Figure 1), suggesting that the acquired mesh is suitable for grasping. A more exhaustive evaluation of grasping from a single viewpoint, both in simulation and on our robotic platform, is left for future work.
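To make the off-line/online split concrete, the following minimal sketch shows how a pre-computed grasp set could be queried and executed at run time with OpenRAVE (Diankov, 2010); the scene file and object name are placeholders, and the sketch is illustrative rather than our exact pipeline:

    # Minimal sketch: executing a grasp pre-computed off-line, at run time,
    # with OpenRAVE (Diankov, 2010). 'scene.env.xml' and 'object' are
    # placeholder names for the scene and the recognized object.
    from openravepy import Environment, databases, interfaces

    env = Environment()
    env.Load('scene.env.xml')            # scene with robot and object
    robot = env.GetRobots()[0]
    target = env.GetKinBody('object')    # pose set by the recognition step

    # Grasp set computed off-line for this (robot, object) pair.
    gmodel = databases.grasping.GraspingModel(robot, target)
    if not gmodel.load():
        gmodel.autogenerate()            # fallback: generate grasps off-line

    # Take the first reachable, collision-free grasp and plan towards it.
    validgrasps, _ = gmodel.computeValidGrasps(returnnum=1)
    Tgrasp = gmodel.getGlobalGraspTransform(validgrasps[0], collisionfree=True)
    gmodel.moveToPreshape(validgrasps[0])
    interfaces.BaseManipulation(robot).MoveToHandPosition(matrices=[Tgrasp])

In this sketch the online path planning itself is delegated to OpenRAVE's manipulation interface; only the grasp set is computed off-line.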
6 DISCUSSION AND FUTURE WORK
In this paper, a method that reconstructs a model of everyday, man-made objects from a single view has been proposed. We have validated its precision by evaluating the difference between the reference and the reconstructed model for 12 real objects: the average error over all meshes is less than 4 mm and the standard deviation is less than 1 mm. Furthermore, compared to earlier methods, our approach provides 3D models with significantly shorter run-times at similar accuracy, and for bigger objects it improves both run-time and accuracy.
Experimental results with different objects demonstrate that the obtained models are precise enough to compute reliable grasping points. The current system is thus a simple and effective approach, but it has limitations for objects with very thin structures or objects whose top view is not very informative. Thanks to the generality of the proposed algorithm, however, this could be compensated by adding more cameras as needed, applying the same technique to each view, and finally merging the resulting voxels, as sketched below. Furthermore, symmetry and extrusion could complement one another.
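As a minimal sketch of this merging step, assume each view has already been turned into a boolean occupancy grid in a common reference frame, and that each single-view reconstruction over-estimates the object; intersecting the grids then keeps exactly the voxels consistent with all views (the function and grid names below are illustrative):

    import numpy as np

    def merge_view_voxels(grids):
        # Intersect per-view occupancy grids (visual-hull style merge).
        # Assumes boolean numpy arrays of identical shape, voxelized in a
        # common frame; a voxel survives only if every view marks it occupied.
        merged = grids[0].copy()
        for g in grids[1:]:
            merged &= g
        return merged

    # Example with two hypothetical 64^3 grids from a top and a side camera:
    top = np.ones((64, 64, 64), dtype=bool)    # placeholder occupancy
    side = np.zeros((64, 64, 64), dtype=bool)
    side[16:48, 16:48, 16:48] = True
    model = merge_view_voxels([top, side])     # only common voxels remain

Intersection is the natural choice here because each single-view extrusion can only add spurious volume, never remove true volume, so every additional view can only refine the model.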
In the future, to handle a wider range of objects, we plan to exploit rotational symmetries by combining our method with shape estimation techniques such as the work described in (Marton et al., 2010). Moreover, for manipulation applications, integrating single-view estimation with the incremental model refinement techniques of, e.g., (Krainin et al., 2010) and (Krainin et al., 2011) would be interesting. Finally, we also plan to combine this approach with an online grasp planner to enable fast online grasping and manipulation of unknown objects.
ACKNOWLEDGEMENTS 
The research leading to these results has been 
funded by the HANDLE European project 
(FP7/2007-2013) under grant agreement ICT 231640 
– http://www.handle-project.eu. 
REFERENCES 
Bohg, J., Johnson-Roberson, M., León, B., Felip, J., Gratal, X., Bergström, N., Kragic, D., and Morales, A. (2011). Mind the gap - robotic grasping under incomplete observation. In Proceedings of the IEEE International Conference on Robotics and Automation, pages 686–693, Shanghai, China.
Boykov, Y. and Jolly, M.-P. (2001). Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images. In Proceedings of the IEEE International Conference on Computer Vision, volume 1, pages 105–112, Vancouver, Canada.
Chiu, W., Blanke, U., and Fritz, M. (2011). Improving the Kinect by cross-modal stereo. In Proceedings of the British Machine Vision Conference, pages 116.1–116.10, Dundee, UK.
Diankov, R. (2010). Automated Construction of Robotic 
Manipulation Programs. PhD thesis, Carnegie Mellon 
University, Robotics Institute. 
Kazhdan, M., Bolitho, M., and Hoppe, H. (2006). Poisson 
surface reconstruction. In Symposium on Geometry 
Processing, pages 61–70. 
Krainin, M., Curless, B., and Fox, D. (2011). Autonomous generation of complete 3D object models using next best view manipulation planning. In Proceedings of the IEEE International Conference on Robotics and Automation, pages 5031–5037, Shanghai, China.
Krainin, M., Henry, P., Ren, X., and Fox, D. (2010). 
Manipulator and object tracking for in-hand model 
acquisition. In Proceedings of the Workshop on Best 
Practice in 3D Perception and Modeling for Mobile 
Manipulation at the IEEE International Conference on 
Robotics and Automation, Anchorage, Alaska. 
Kuehnle, J., Xue, Z., Stotz, M., Zoellner, J., Verl, A., and 
Dillmann, R. (2008). Grasping in depth maps of time-
of-flight cameras. In International Workshop on 
Robotic and Sensors Environments, pages 132–137. 
Lombaert, H., Sun, Y., Grady, L., and Xu, C. (2005). A 
multilevel banded graph cuts method for fast image 
segmentation. In Proceedings of the IEEE 
International Conference on Computer Vision, volume 
1, pages 259–265, Beijing, China. 
Marton, Z., Pangercic, D., Blodow, N., Kleinehellefort, J., and Beetz, M. (2010). General 3D modelling of novel objects from a single view. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 3700–3705.
MeshLab (2011). Visual Computing Lab, ISTI-CNR. http://meshlab.sourceforge.net/.