A two-level algorithm that integrates contextual information to recover scene structure from a single image is presented. Exploiting the structural regularities of outdoor scenes, we classify the structure of a scene into three categories: sky, ground, and vertical objects. First, we over-segment the image into homogeneous regions. Then, we label the regions with significant features as "definite regions" and the regions we cannot classify as "undetermined regions". Next, every nearby definite region whose features are similar to those of an undetermined region votes for that region, and the class with the most votes is assigned to it. Finally, we construct a 3D model of the scene. Experiments show that, owing to the exploitation of contextual information, about 92.3% of the pixels can be recovered successfully, which is better than the performance of the existing method.
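
The voting step can be illustrated with a minimal sketch. It is not the implementation evaluated in the experiments: the `Region` type, the Euclidean distance measures, the two thresholds `max_centroid_dist` and `max_feature_dist`, and the tie-breaking rule (first class encountered wins) are all assumptions introduced here for illustration only.

```python
from collections import Counter
from dataclasses import dataclass
from math import dist
from typing import Optional, Sequence, Tuple


@dataclass
class Region:
    """Hypothetical region record produced by the over-segmentation step."""
    centroid: Tuple[float, float]   # (x, y) position in pixels
    features: Tuple[float, ...]     # assumed descriptor, e.g. mean colour
    label: Optional[str] = None     # "sky", "ground", "vertical"; None = undetermined


def vote_label(
    target: Region,
    definite: Sequence[Region],
    max_centroid_dist: float = 50.0,  # assumed meaning of "nearby"
    max_feature_dist: float = 0.2,    # assumed meaning of "similar features"
) -> Optional[str]:
    """Return the class with the most votes, or None if no region votes."""
    votes = Counter()
    for region in definite:
        if region.label is None:
            continue  # only definite regions may vote
        near = dist(region.centroid, target.centroid) <= max_centroid_dist
        similar = dist(region.features, target.features) <= max_feature_dist
        if near and similar:
            votes[region.label] += 1
    if not votes:
        return None  # no contextual evidence; the region stays undetermined
    return votes.most_common(1)[0][0]


# Toy example: two sky-like neighbours and one dissimilar ground region.
definite = [
    Region((10, 10), (0.10, 0.20, 0.90), "sky"),
    Region((20, 12), (0.12, 0.22, 0.88), "sky"),
    Region((15, 40), (0.40, 0.30, 0.20), "ground"),
]
target = Region((14, 14), (0.11, 0.21, 0.90))
print(vote_label(target, definite))  # prints: sky
```

In this sketch an undetermined region with no nearby, similar definite region keeps the label `None`; how such regions are handled in practice is not specified above.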