Building or researching an advanced vehicle-safety model requires a highly accurate annotated dataset, and that dataset must contain safety-critical events. CitySim, developed by researchers at the University of Central Florida, provides highly accurate vehicle trajectories along with several other features. Its drone videos, recorded over high-traffic locations, capture numerous safety-critical events.

As technology progresses into the age of AI, and deep learning in particular, researchers can build more advanced vehicle-safety systems, such as driver-behavior analysis and accident avoidance. These systems, in turn, demand ever more accurately annotated datasets. Sensor-based vehicle trajectory databases have flaws: they miss crucial information that is hard to estimate, such as vehicle geometry and data points outside the sensors' range. Hence the rising demand for video-based vehicle trajectory data captured from a bird's-eye view. However, existing video-based datasets fall short of what safety research needs, which is why CitySim was created.

CitySim deploys multiple drones over each target area to capture a wider observation and stitches their video outputs into a single video. First, each video is stabilized by matching SIFT features between its first and last frames, aligning the frames. A raw traffic video contains the objects of interest alongside road markings and other clutter that cause detection errors, so these must be removed: a Gaussian-mixture-based algorithm segments foreground from background, and an inpainting algorithm removes the undesirable objects. After object filtering, the videos from the individual drones are stitched together; histogram color matching, SIFT feature matching, and other techniques ensure smooth stitching for vehicles that move from one drone's coverage area to another's.

Source: https://arxiv.org/pdf/2208.11036v1.pdf
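The paper names the building blocks of this preprocessing stage (SIFT alignment, Gaussian-mixture segmentation, inpainting) without giving code. The OpenCV sketch below is one plausible rendering of those steps; every function choice and parameter is an illustrative assumption, not the authors' exact pipeline.

```python
import cv2
import numpy as np

def stabilize_frame(reference, frame):
    """Warp `frame` onto `reference` using a homography from SIFT matches."""
    sift = cv2.SIFT_create()
    kp_ref, des_ref = sift.detectAndCompute(reference, None)
    kp_frm, des_frm = sift.detectAndCompute(frame, None)
    # Lowe's ratio test keeps only distinctive correspondences.
    matches = cv2.BFMatcher().knnMatch(des_frm, des_ref, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]
    src = np.float32([kp_frm[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_ref[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    # RANSAC rejects outlier matches before estimating the homography.
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = reference.shape[:2]
    return cv2.warpPerspective(frame, H, (w, h))

# Gaussian-mixture background model: moving vehicles end up in the
# foreground mask, while static road markings stay in the background.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)

def foreground_mask(frame):
    return subtractor.apply(frame)

def remove_objects(frame, unwanted_mask):
    """Inpaint regions flagged as undesirable. The mask would come from the
    segmentation step plus whatever filtering rules the real pipeline applies."""
    return cv2.inpaint(frame, unwanted_mask, 3, cv2.INPAINT_TELEA)
```

In practice the stabilization step would run over every frame of a drone's video, and the cleaned, stabilized frames would then feed the stitching stage.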

After producing a cohesive video for each location, the researchers detected the objects of interest using rotated bounding boxes: Mask R-CNN generated a pixel-to-pixel segmentation mask over each vehicle, and the vehicle's bounding box was then rotated to match the mask. Finally, wrongly detected objects were manually removed to make the dataset more accurate.
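One plausible way to derive a rotated box from a segmentation mask is a minimum-area rectangle fit, sketched below with OpenCV. The paper confirms only that Mask R-CNN masks drive the box rotation; the `minAreaRect` step is an assumption about how that could be done.

```python
import cv2
import numpy as np

def rotated_box_from_mask(mask: np.ndarray):
    """mask: binary uint8 image, nonzero where Mask R-CNN marked the vehicle."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    vehicle = max(contours, key=cv2.contourArea)  # largest blob = the vehicle
    rect = cv2.minAreaRect(vehicle)               # ((cx, cy), (w, h), angle)
    corners = cv2.boxPoints(rect)                 # 4 corners of the rotated box
    return rect, corners.astype(int)
```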

They evaluated the dataset's quality on a handful of test cases. The first task was to detect lane-changing vehicles that cut in from one lane to another. The second was to detect vehicles that diverge from or merge into a lane. The third was to measure the severity of the conflict when two vehicles approach an intersection. In all of these cases, the CitySim dataset gave the best performance.
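For intuition on the third task, one common surrogate measure of conflict severity in trajectory-based safety research is post-encroachment time (PET): the gap between the two vehicles' arrivals at a shared conflict point. Whether CitySim's evaluation uses this exact metric is an assumption, as is the trajectory format of `(t, x, y)` samples below.

```python
import numpy as np

def time_at_point(traj, point, radius=2.0):
    """First timestamp at which `traj` (rows of (t, x, y), seconds/meters)
    comes within `radius` of the conflict point; None if it never does."""
    dist = np.hypot(traj[:, 1] - point[0], traj[:, 2] - point[1])
    hits = np.flatnonzero(dist <= radius)
    return traj[hits[0], 0] if hits.size else None

def post_encroachment_time(traj_a, traj_b, conflict_point):
    """PET between two vehicles; a smaller PET means a more severe conflict."""
    t_a = time_at_point(np.asarray(traj_a), conflict_point)
    t_b = time_at_point(np.asarray(traj_b), conflict_point)
    if t_a is None or t_b is None:
        return None  # one of the vehicles never reaches the conflict point
    return abs(t_b - t_a)
```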

The dataset also provides a highly accurate 3D map of each location, opening the door to research on digital twins. Highly accurate object detection, a wide field of view, and the dataset's full feature set should enable researchers worldwide to build more advanced vehicle-safety systems.

This article is a research summary written by Marktechpost staff based on the research paper 'CitySim: A Drone-Based Vehicle Trajectory Dataset for Safety Oriented Research and Digital Twins'. All credit for this research goes to the researchers on this project. Check out the paper and GitHub.