Software for Real-time, Real-world 3D products

For a long time, photogrammetry was based on analogue film, manual image orientation, and manual measurement. Each image was costly, which put great emphasis on minimising the number of images needed for the quality one wanted to achieve. The time from flight to finished product was often weeks or months. Today, we have become accustomed to digital images and automatic processes. The products have evolved from paper-based 2D maps to photo-realistic 3D models and 3D point clouds. With regard to the future development of photogrammetry, we see:
  • Drones and other unmanned platforms are increasing in importance and number. Equipment for aerial mapping becomes widely available.
  • More operational steps are automated, including navigation and route planning. Less and less expertise is required to generate, for example, 3D models.
  • As much data as possible is collected, for example by digital video instead of still images. The term videogrammetry is used for measurements in video clips.
  • Results and products are expected to be available faster and faster, preferably in real-time.
  • New applications and products are expected and developed.

Based on these predictions, at I-CONIC we first take on the challenge of producing static 3D models in real-time from one ‘normal’ 2D video stream. As a by-product, we create 3D stereo viewing from the same 2D video. As a second step, we increase the challenge by using multiple synchronized video streams (e.g. from a swarm of drones), thereby creating Live 3D, where moving objects are also modelled.

Static 3D
Video Stereoscope
Live 3D (4D)

Instant, static 3D models

The two video sequences are taken from the same normal 2D drone video clip, but started a few seconds apart, which means that each object on the ground is seen from two different viewing angles. Each pair of frames is then used as input for the photogrammetric 3D modelling.

Video and still images from drones are increasingly being used to get a better overview or detailed understanding of an area. Very often, the images are also used for 2D or 3D mapping of the area. However, this process is today too slow to influence the flight route or the image capture, or to support other immediate decisions about field activities in the area. I-CONIC took on the challenge to make this possible.

If everything we need to image is static, we can use video from one single drone to generate 3D models. In one and the same video we skip a constant number of frames and create a continuous stream of pairs of frames. For example, we can create a stereo pair from frames 1 and 51, the next stereo pair from frames 2 and 52, etc. From the stereo pairs we take all the classic photogrammetric steps (and some more) to generate block-triangulated 3D point clouds and meshes. This is not hugely different from the way 3D models are created by other types of software packages on the market today – only that we do it from video and we do it instantly (at video rate).
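The frame pairing described above can be sketched in a few lines of Python. This is only an illustration of the indexing, not I-CONIC's implementation; the function name `stereo_pair_indices` and the default offset of 50 frames (taken from the "frames 1 and 51" example) are assumptions.

```python
from typing import Iterator, Tuple


def stereo_pair_indices(num_frames: int, offset: int = 50) -> Iterator[Tuple[int, int]]:
    """Yield 1-based (first, second) frame numbers that are `offset` frames apart."""
    if offset <= 0:
        raise ValueError("offset must be positive")
    # The last usable first frame is the one whose partner is the final frame.
    for first in range(1, num_frames - offset + 1):
        yield first, first + offset


# A hypothetical 53-frame clip yields three stereo pairs.
print(list(stereo_pair_indices(num_frames=53)))  # [(1, 51), (2, 52), (3, 53)]
```

A larger offset gives a longer stereo base between the two frames of a pair, while a clip shorter than the offset yields no pairs at all.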

Leading companies claim that they can create 3D models in just a handful of minutes. The software I-CONIC is developing is many times faster: a truly disruptive technology.

Example of a 3D model from a drone. The left part shows a 3D point cloud (surface model), coloured with the RGB pixel values from the image. To the right: a 3D point cloud (surface model), coloured with height values. Both types can be used in real-time by the analyst. The left one gives a better overview (or detail) of the area by changing viewpoint, view angle and zoom. The right one gives height values in absolute numbers and can help the analyst understand volumes and topography. It is also used for automatic processing into slope maps, etc.

The Video Stereoscope

An anaglyph 3D stereo video created by I-CONIC’s Video Stereoscope. To obtain the 3D effect of this particular video you need a pair of red/cyan anaglyph glasses.

Our ability to interpret images is far better in 3D than in 2D. Anyone who has seen Avatar, Gravity or any other 3D movie knows the increased realism this provides (and, it should be acknowledged, for some viewers even headaches). With the same technique as described above for static 3D models, we can in one and the same video skip a constant number of frames and create a continuous stream of pairs of frames. For example, we can create a stereo pair from frames 1 and 51, the next stereo pair from frames 2 and 52, etc. To create a pleasant stereo view, the stereo pairs have to be processed through so-called epipolar resampling, and we can then play a 3D video from a regular 2D video.
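For red/cyan anaglyph viewing, a stereo pair that has already been resampled can be merged into a single frame by taking the red channel from the left frame and the green and blue channels from the right frame. The sketch below shows only this common channel-mixing step with NumPy; it does not perform the epipolar resampling itself, and the function name `make_anaglyph` and the tiny test frames are assumptions for illustration.

```python
import numpy as np


def make_anaglyph(left_rgb: np.ndarray, right_rgb: np.ndarray) -> np.ndarray:
    """Combine two H x W x 3 RGB frames into one red/cyan anaglyph frame."""
    if left_rgb.shape != right_rgb.shape or left_rgb.shape[-1] != 3:
        raise ValueError("frames must share the same H x W x 3 shape")
    anaglyph = right_rgb.copy()          # green and blue come from the right frame
    anaglyph[..., 0] = left_rgb[..., 0]  # red comes from the left frame
    return anaglyph


# Two hypothetical 2 x 2 frames with constant pixel values.
left = np.full((2, 2, 3), 10, dtype=np.uint8)
right = np.full((2, 2, 3), 200, dtype=np.uint8)
frame = make_anaglyph(left, right)
print(frame[0, 0])  # red from the left frame, green and blue from the right
```

Through red/cyan glasses, the red filter passes mainly the left frame and the cyan filter mainly the right frame, which is what produces the depth impression.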

We have not seen epipolar resampling used in real-time to create 3D stereoscopic video before, but tests with video footage from both drones and mobile cameras have convinced us that this can be a useful tool, e.g. for inspection by drones or forest inventories in the field. We have even held a VIP show in a 3D cinema to show off the capabilities of our Video Stereoscope.

We call this the Video Stereoscope because it works like a classic stereoscope for pictures, but shows a moving video in 3D instead. The stereoscopic video requires the camera to move, preferably perpendicular to the camera’s viewing direction (that is, parallel to the image plane), as when filming from a side window of a car. The stereoscopic viewing also requires some kind of equipment that presents the left frame of the stereo pair to the left eye and the right frame to the right eye. Such equipment can be, for example, VR glasses, a 3D TV or red/cyan anaglyph glasses.

Links:
https://en.wikipedia.org/wiki/Stereoscopy
https://en.wikipedia.org/wiki/3D_stereo_view

There are more 3D videos at I-CONIC’s YouTube channel.

Live 3D models of events and moving objects

Figure 1. Video from a fire exercise, filmed by two flight-synchronized drones from two different viewing angles, and with the video streams time-synchronized. This scenario has been used in our development project for 4D modelling.

Digital 3D models from images have been established as a standard product with many applications. However, our environment is filled with events and movements from vehicles, people, animals, water, fire and smoke. At I-CONIC, we are developing software that generates 4D models, i.e. time-dependent 3D models. The models are generated in real-time to be used when an event, such as a forest fire, takes place. The technology makes it possible to consider a process both stereoscopically and with the aid of a constantly changing point cloud. GPU programming is a crucial component for the extremely fast calculations.

Photogrammetry assumes that each part of the area to be mapped is visible in at least two images, taken from different viewpoints. The target must not have moved or changed its appearance between the exposures. However, there are many applications where moving objects and ongoing events are of primary importance (Fig. 1). In addition to mapping, these include sports, film, scientific experiments, etc.

We want to achieve a user experience similar to that often found in computer games; while we look at and can move around in a digital 3D world, events occur in this world; vehicles and people move, explosions occur, etc. (Fig. 2). In this way, an intervention leader gets a “first-person viewer” perspective of the event without having to move. In addition to the increased understanding provided by this tool, measurements can be made to determine heights, volumes etc., also of moving objects.

Figure 2. A snapshot from a computer game, Emergency20, where you interact in an ever-changing 3D world.

We also want to be able to freeze a moment and then get a more detailed 3D model of the event at this particular moment. For example, Intel and 4DReplay have attractive solutions for this, which, however, require upwards of 160 cameras and are therefore expensive, complex and inflexible (Fig. 3). With our technology we will not reach the same degree of realism, but our solution is far more practical, not least in time-critical efforts.

Figure 3. 4DReplay’s solution with a long row of system cameras, all filming from slightly different angles.

To be able to generate 4D models, we use at least two drones, each of which acquires a video stream and links it to the ground. The drones are synchronized so that they keep a constant base between them, much as if they were connected by a long stick. The base distance is continuously calculated during a mission and thus does not need to be precisely predetermined. The area on the ground must of course be seen simultaneously by both drones. The drones can fly a predefined route, be flown manually, or hover over an event.

Figure 4. Overview of our concept with two or more drones filming an event and producing a continuous stream of time-dependent 3D models, presented to the user in real-time.

The video streams are synchronized, so that for each frame in one video sequence we know which frame was taken at the same time in the other video stream, at least with an accuracy corresponding to the video’s recording rate, e.g. 50 frames/second, corresponding to 0.02 sec accuracy. In this way we get a stream of simultaneously registered image pairs. These image pairs are matched against each other so that their relative orientation can be determined.
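One simple way to picture this pairing is nearest-timestamp matching with a tolerance of one frame period (1/50 s = 0.02 s at 50 frames/second). The sketch below assumes that each stream provides sorted capture times in seconds on a common clock; the function name `pair_frames` and the sample timestamps are hypothetical, and the actual synchronization method is not described here.

```python
from bisect import bisect_left
from typing import List, Tuple


def pair_frames(times_a: List[float], times_b: List[float],
                fps: float = 50.0) -> List[Tuple[int, int]]:
    """Pair each frame of stream A with the nearest-in-time frame of stream B.

    Both lists hold capture times in seconds, sorted ascending. A pair is kept
    only if the time difference is at most one frame period (1 / fps).
    """
    tolerance = 1.0 / fps
    pairs = []
    for i, t in enumerate(times_a):
        j = bisect_left(times_b, t)
        # Only the neighbours around the insertion point can be nearest.
        candidates = [k for k in (j - 1, j) if 0 <= k < len(times_b)]
        if not candidates:
            continue
        best = min(candidates, key=lambda k: abs(times_b[k] - t))
        if abs(times_b[best] - t) <= tolerance:
            pairs.append((i, best))
    return pairs


# Hypothetical timestamps: stream B lags stream A by 5 ms; the last A frame has no partner.
times_a = [0.00, 0.02, 0.04, 0.50]
times_b = [0.005, 0.025, 0.045]
print(pair_frames(times_a, times_b))  # [(0, 0), (1, 1), (2, 2)]
```

This nearest-neighbour version does not enforce a strict one-to-one pairing, so a real pipeline would likely add a check for frames that are dropped or reused.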

In the next step, another type of matching takes place to generate dense 3D point clouds, and from the dense point clouds a so-called textured mesh is created. This is now a 3D model simulating the real world; it is updated at video rate, and moving objects are also modelled.
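To illustrate why the base between the drones matters for the point cloud, the textbook normal-case stereo relation can be used: for a rectified image pair, the distance Z to a point follows from the focal length f (in pixels), the base B and the measured disparity d as Z = f · B / d. The sketch below is this generic relation under those assumptions, not a description of I-CONIC's matching or triangulation; the function name and the numbers are hypothetical.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Return the distance in metres to a point seen in a rectified stereo pair.

    Normal-case stereo geometry is assumed: Z = f * B / d.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Hypothetical values: 2000 px focal length, 10 m base, 400 px disparity.
print(depth_from_disparity(focal_px=2000.0, baseline_m=10.0, disparity_px=400.0))  # 50.0
```

For a given distance, a longer base gives a larger disparity, so a matching error of one pixel translates into a smaller error in distance.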

We believe we are the first in the world to generate 3D models of moving objects from two or more moving drones, and we have applied for patents.

Applications

The user segments for I-CONIC’s software technology can be divided into three categories:
  1. Inspection, Surveying, Monitoring and Mapping, which are segments currently using normal static 3D models from drones, or using drone video as a simple ‘eye-in-the-sky’ tool.
  2. Public Safety and Emergency Services, which consists of several types of users that will benefit from instant, static 3D models but even more so from future 4D models.
  3. Entertainment, including gaming, movie production, sports TV, mass market, etc., will comprise our third market step. Manufacturers servicing these segments will be able to enhance their products through integration of I-CONIC’s 4D software.