Towards modelling visual ambiguity for visual object detection