To capture real-time views of the urban landscape, we will use both a network of static cameras and an
existing low-cost aerial vehicle: a video-camera-carrying blimp that we have already deployed in previous
projects. The blimp will stream video in real time and provide positioning data that can be used in
conjunction with Computer Vision techniques to register the mobile camera with respect to the static
ones. Together with improved Computer Graphics techniques, this will guarantee real-time rendering and
provide an effective tool for interactive simulation of 3D crowds in mixed environments that include both
real and virtual buildings. Urban planners will thus gain an invaluable set of training and simulation
resources: they will be able to interactively overlay fully controllable crowds on real video
sequences. To the best of our knowledge, this has never been done.
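As an illustration of the registration step, the blimp's positioning data and the corresponding landmark positions in the static cameras' frame can be aligned with a least-squares rigid transform (the Kabsch algorithm). This is only a sketch of one possible approach: the availability of point correspondences between the two frames, and the use of NumPy, are assumptions.

```python
import numpy as np

def register_rigid(src, dst):
    """Least-squares rigid transform (R, t) mapping the point set src onto
    dst (Kabsch algorithm). Both are (N, 3) arrays of corresponding points,
    e.g. blimp positions expressed in two coordinate frames."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)           # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t
```

In practice the recovered transform would be refined continuously as new positioning data arrives, since the blimp drifts over time.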
We will use a scanner-based approach to create individual people. Because we want to allow real-time body
deformations and to generate models of variable size, we will use a set of parametric template body models.
Segmenting a template model is not as simple as dividing it into limbs: even though recent skinning
techniques automatically determine the main limbs, the anthropometric approach we will use requires
identifying many more body features, which is a real challenge. In terms of rendering, we must strike a
compromise between efficiency and realism. To ensure real-time performance, we will manage crowds with a
combination of dynamic meshes, static meshes, and impostors. Concerning the generation of Virtual Buildings,
time-efficient rendering is equally essential, so we will model the buildings with Level-of-Detail
representations.
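As a minimal sketch of this crowd-management strategy, each agent's representation can be chosen per frame from its distance to the camera. The thresholds below are illustrative placeholders, not tuned values.

```python
def choose_representation(distance, near=10.0, far=40.0):
    """Pick a rendering representation for one crowd member from its
    camera distance in metres. Thresholds are illustrative only."""
    if distance < near:
        return "dynamic_mesh"  # fully deformable body, most expensive
    if distance < far:
        return "static_mesh"   # rigid precomputed mesh, cheaper
    return "impostor"          # textured billboard, cheapest

# Example: bucket a crowd by representation for one frame.
crowd_distances = [3.5, 18.0, 120.0, 45.0, 9.9]
buckets = {}
for d in crowd_distances:
    buckets.setdefault(choose_representation(d), []).append(d)
```

The same distance-based selection applies to the buildings' Level-of-Detail models, with meshes of decreasing polygon count taking the place of the three crowd representations.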
Regarding realism, the main issue is to render the
buildings and the Virtual Humans with a lighting model that produces shading and shadows consistent with the
real environment. To this end, we will develop a Mixed Reality illumination model in which complex
(multi-segmented, multi-material), dynamic (animatable) skeleton-based scenes, such as deformable virtual
humans, can be rendered in real time under a dynamic, believable area light that is consistent with the real
environment.
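The area-light term can be sketched by numerical integration: summing the Lambertian contribution of sample points distributed over the light's rectangle. This is a deliberately simplified sketch, not the actual illumination model: it omits the visibility (shadow) test, the emitter-side cosine term, and materials, and the function and its parameters are illustrative.

```python
import math

def area_light_irradiance(p, n, light_origin, edge_u, edge_v, res=4):
    """Approximate diffuse irradiance at surface point p with unit normal n
    from a rectangular area light spanned by edge_u and edge_v, using a
    res x res grid of light samples. All vectors are 3-tuples."""
    total = 0.0
    for i in range(res):
        for j in range(res):
            u, v = (i + 0.5) / res, (j + 0.5) / res
            s = tuple(light_origin[k] + u * edge_u[k] + v * edge_v[k]
                      for k in range(3))          # sample point on the light
            d = tuple(s[k] - p[k] for k in range(3))
            r2 = sum(c * c for c in d)
            r = math.sqrt(r2)
            cos_t = max(0.0, sum(n[k] * d[k] for k in range(3)) / r)
            total += cos_t / r2                   # cosine / squared distance
    return total / (res * res)
```

Because the light's position and extent can be updated every frame, the same integration delivers the dynamic, environment-consistent shading described above.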
Virtual crowds will be able to reproduce a variety of behaviors, corresponding to a set of simulation scenarios,
the most important of which will be those that define how people move in urban environments. All of these
need to be customizable according to age, clothing, and accessories. This behavioral aspect will be based on
procedural crowd models and navigation graphs, which will be developed and validated using Computer Vision
techniques to analyze real behaviors and paths in real environments. The most basic behaviors relate to
locomotion. We will handle them by introducing behavioral maps that represent, for a given behavior, the
direction or set of directions most likely to be followed by individuals at each location of the area being
surveyed. If the behavior of a person is known, this information makes the linking of detections across
temporal frames much more robust. An EM approach will also be investigated to alternately decide what the
behavior of a person is and to construct a map describing that behavior. A key challenge in crowd simulation is
to guarantee that the models produce truly realistic behaviors. Our goal will therefore be to extend a current
multi-people tracking approach so that it can be used in real-life conditions to learn the crowd's true behavior
by observing it with cameras that are either static or mounted on an aerial vehicle. These models will in turn be
incorporated into the algorithm to allow prediction and, therefore, further increase the robustness of the
analysis, even when the density of people increases. By running the system on long sequences, we will be able
to fine-tune its parameters and to validate it. The resulting model will then be available not only for crowd
analysis but also for crowd simulation. We will then move on to more sophisticated behaviors such as stopping,
waiting for somebody, or making decisions, and apply a similar strategy.
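The behavioral-map and EM ideas above can be sketched concretely: a behavioral map is a grid whose cells hold histograms over quantised headings, learned from observed trajectories, and an EM-style loop alternates between assigning each trajectory to the map that explains it best and re-estimating the maps. Everything below (the 8-direction quantisation, cell size, deterministic initialisation) is an illustrative assumption, not the final model.

```python
import math
from collections import defaultdict

def heading(p, q):
    """Quantise the displacement p -> q into one of 8 compass directions."""
    a = math.atan2(q[1] - p[1], q[0] - p[0])
    return int(round(a / (math.pi / 4))) % 8

def build_map(tracks, cell=1.0):
    """Behavioral map: for each grid cell, a histogram over the 8 headings
    observed there across all tracks."""
    m = defaultdict(lambda: [0] * 8)
    for tr in tracks:
        for p, q in zip(tr, tr[1:]):
            c = (int(p[0] // cell), int(p[1] // cell))
            m[c][heading(p, q)] += 1
    return dict(m)

def log_likelihood(tr, m, cell=1.0, eps=1e-3):
    """How well one trajectory fits a behavioral map (smoothed)."""
    ll = 0.0
    for p, q in zip(tr, tr[1:]):
        c = (int(p[0] // cell), int(p[1] // cell))
        h = m.get(c, [0] * 8)
        ll += math.log((h[heading(p, q)] + eps) / (sum(h) + 8 * eps))
    return ll

def em_behaviors(tracks, k=2, iters=5):
    """Alternate between rebuilding one map per behavior (M-step) and
    reassigning each track to its best-fitting map (E-step)."""
    labels = [i % k for i in range(len(tracks))]   # deterministic init
    for _ in range(iters):
        maps = [build_map([t for t, l in zip(tracks, labels) if l == j])
                for j in range(k)]
        labels = [max(range(k), key=lambda j: log_likelihood(t, maps[j]))
                  for t in tracks]
    return labels, maps

def link_detection(pos, prior_dir, detections, step=1.0):
    """Prefer the next-frame detection closest to the position predicted by
    the behavior prior (prior_dir: a unit direction taken from the map)."""
    px, py = pos[0] + step * prior_dir[0], pos[1] + step * prior_dir[1]
    return min(detections, key=lambda d: (d[0] - px) ** 2 + (d[1] - py) ** 2)
```

The `link_detection` helper mirrors the claim above that a known behavior makes linking detections across frames more robust: the map supplies a predicted next position, and detections near that prediction are favoured.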