
PanorAMS: Automatic Annotation for Detecting Objects in Urban Context

14 million+ generated bounding boxes in 771,299 panoramic images covering 487 neighbourhoods

147k+ ground-truth bounding boxes in 7,348 panoramic images covering 10 neighbourhoods

Designed to move away from laborious manual annotation

The PanorAMS framework provides a method to automatically generate bounding box annotations in geo-referenced panoramic images based on geospatial context information. Following this method, we acquire large-scale (albeit noisy) annotations solely from open data sources in a fast and automatic manner. For detailed evaluation, the framework includes an efficient protocol, using the generated boxes as a starting point, to crowdsource ground-truth annotations for a subset of the images.

News

Website live - 06/03/2023

BibTeX
If you use PanorAMS in your research, please cite:
 

Automatically generating PanorAMS-noisy annotations for training

Our step-by-step method uses geospatial context information to automatically generate bounding box annotations in geo-referenced panoramic images (code sketches of steps 2 and 3 follow the list):

  1. Based on city observations, geospatial object information, and elevation map data, acquire object attributes and 3D real-world measurements of all objects falling within a 150 meter radius of the image GPS location.
  2. Convert this information to 2D image coordinates using the pinhole camera model in order to generate an initial set of bounding boxes.
  3. Refine and filter the initial set of bounding boxes acquired during step 2 via geometric reasoning based on the percentage of overlap between boxes, the classes associated with overlapping boxes, and the real-world distance between overlapping objects and the camera. Urban knowledge is incorporated at this stage by optimizing these thresholds per class.
  4. Map the final set of bounding boxes onto the image in order to qualitatively analyze the generated bounding boxes per class.
  5. Optimize class rules, thresholds, and estimates of objects’ real-world measurements by qualitative analysis of images and corresponding bounding boxes.
  6. Link object metadata that is available from geospatial object information (e.g. building value) to the automatically generated bounding boxes.
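To make steps 1 and 2 concrete, below is a minimal Python sketch of projecting a geo-referenced object into panorama pixel coordinates. The text names the pinhole camera model; for a full 360° equirectangular panorama we assume the usual azimuth/elevation-to-pixel mapping instead, and the function names, flat-earth offset approximation, and heading parameter are illustrative assumptions on our part (only the 150 metre radius comes from the text).

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, for the offset approximation

def local_offset_m(cam_lat, cam_lon, obj_lat, obj_lon):
    """Approximate east/north offset (metres) of the object from the camera."""
    d_lat = math.radians(obj_lat - cam_lat)
    d_lon = math.radians(obj_lon - cam_lon)
    north = d_lat * EARTH_RADIUS_M
    east = d_lon * EARTH_RADIUS_M * math.cos(math.radians(cam_lat))
    return east, north

def project_to_panorama(east, north, height_diff, heading_deg, img_w, img_h):
    """Map a 3D offset (metres) to (x, y) pixel coordinates.

    Assumes an equirectangular panorama whose centre column points at
    heading_deg; height_diff is object height minus camera height, as
    obtained from the elevation map data in step 1.
    """
    dist = math.hypot(east, north)
    if dist > 150.0:  # step 1: only objects within a 150 metre radius
        return None
    azimuth = math.degrees(math.atan2(east, north))            # 0 = north, 90 = east
    rel_az = (azimuth - heading_deg + 180.0) % 360.0 - 180.0   # wrap to [-180, 180)
    elevation = math.degrees(math.atan2(height_diff, dist))
    x = (rel_az / 360.0 + 0.5) * img_w
    y = (0.5 - elevation / 180.0) * img_h
    return x, y
```

Projecting the corners of an object's 3D extent this way yields an initial bounding box; with heading 0°, an object due north of the camera lands in the centre column of the image.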
 
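Step 3's geometric reasoning could look roughly like the following sketch, which drops a generated box when a nearer object's box covers too much of it. The rule set and the per-class threshold values here are placeholders, not the ones optimized in PanorAMS.

```python
# Placeholder per-class overlap thresholds (these are the values tuned per
# class in steps 3 and 5; the numbers below are invented for illustration).
OVERLAP_THRESH = {"building": 0.9, "tree": 0.7, "playground": 0.8}

def overlap_fraction(a, b):
    """Fraction of box a covered by box b; boxes are (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    return (ix * iy) / area_a if area_a > 0 else 0.0

def filter_occluded(boxes):
    """Keep a box only if no closer object's box covers too much of it.

    Each entry is a dict with 'bbox', 'cls', and 'dist' (metres to camera).
    """
    keep = []
    for i, a in enumerate(boxes):
        thresh = OVERLAP_THRESH.get(a["cls"], 0.8)
        occluded = any(
            j != i
            and b["dist"] < a["dist"]  # b is closer to the camera than a
            and overlap_fraction(a["bbox"], b["bbox"]) > thresh
            for j, b in enumerate(boxes)
        )
        if not occluded:
            keep.append(a)
    return keep
```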

Efficiently crowdsourcing PanorAMS-clean annotations for evaluation

To evaluate the quality of our automatically generated bounding boxes, we crowdsource ground-truth annotations for a subset of the images contained in PanorAMS-noisy. For this, we implement an efficient crowdsourcing protocol using the generated boxes as a starting point. To minimize the required annotation time, the user interface of our crowdsourcing tool is built such that the necessary mouse and eye movements are kept to a minimum. Our crowdsourcing protocol is subdivided into three tasks in order to avoid task-switching, which is well known to increase response time and decrease accuracy.

We introduce the concept of linked bounding boxes, specific to objects split across the left and right sides of 360° images, whereby two bounding boxes are labeled as belonging to the same object. The image depicts two active linked boxes, ready to be broken up (by clicking the linkage button below the left active box), corrected (by dragging the middle point and borders of the active box), or deleted (by clicking the red X mark button) as needed. The linkage icon in the middle of the screen informs the user that there are two active linked boxes. The boxes can be verified by clicking the green check mark button. The orange color is specific to the playground class.
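As an illustration of the linked-bounding-box concept, the sketch below stores an object that wraps around the panorama seam as two boxes sharing an object id. The class and function names are hypothetical and not taken from the crowdsourcing tool.

```python
from dataclasses import dataclass

@dataclass
class Box:
    x1: float
    y1: float
    x2: float
    y2: float
    cls: str
    object_id: int  # boxes sharing an object_id are "linked"

def split_wrapped_box(x1, y1, x2, y2, cls, object_id, img_w):
    """Split a box whose x-range crosses the 360-degree seam into two linked boxes."""
    if x2 <= img_w:  # no wrap-around: a single ordinary box
        return [Box(x1, y1, x2, y2, cls, object_id)]
    return [
        Box(x1, y1, img_w, y2, cls, object_id),        # part at the right image edge
        Box(0.0, y1, x2 - img_w, y2, cls, object_id),  # part at the left image edge
    ]
```

Breaking the link in the annotation tool would then amount to assigning the two halves distinct object ids.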