Oct 16 2019
Commercial drone products are designed mainly for specific automated tasks, and artistic filming is not among them. A team led by Carnegie Mellon University researchers has proposed a complete system for aerial cinematography that learns humans' visual preferences.
The fully autonomous system requires no scripted scenes, prior maps of the environment, or GPS tags to localize targets.
We're putting the power of a director inside the drone. The drone positions itself to record the most important aspects in a scene. It autonomously understands the context of the scene—where obstacles are, where actors are—and it actively reasons about which viewpoints are going to make a more visually interesting scene. It also reasons about remaining safe and not crashing.
Rogerio Bonatti, Ph.D. Student, Robotics Institute, CMU
As a goal, "artistically interesting" is subjective and hard to quantify mathematically, so the system was trained using deep reinforcement learning. In a user study, participants viewed scenes in a photo-realistic simulator that alternated among frontal, back, left, and right viewpoints.
Shot scale and distance were also examined, as well as the actor's position on the screen. Users scored scenes by how visually attractive and artistically stimulating they found them.
The system learned that some movements are more stimulating than others. For instance, other autonomous drone products often default to a continuous back shot because it lets the drone follow a clear, safe path behind the actor. But participants in the user study reported that a constant back shot becomes dull after a while. The researchers also found that the drone had to change angles for the shot to stay interesting, but not too often.
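The trade-off described above, where holding one shot too long grows dull but switching too often is jarring, can be sketched as a simple scoring rule. Everything here is a hypothetical illustration: the viewpoint names, decay rate, and switch cost are invented for the example, not taken from the paper, which learns these preferences with deep reinforcement learning rather than hand-tuned weights.

```python
# Hypothetical sketch: interest in the current shot decays the longer it
# is held, and switching viewpoints carries a fixed cost, so the drone
# neither lingers forever nor flips angles every frame.
VIEWPOINTS = ["back", "front", "left", "right"]

def shot_interest(base_score, frames_held, staleness_rate=0.02):
    """Interest in the current shot decays the longer it is held."""
    return base_score - staleness_rate * frames_held

def choose_viewpoint(current, frames_held, base_scores, switch_cost=0.5):
    """Keep the current viewpoint unless the best alternative beats it
    by more than the cost of switching."""
    stay_value = shot_interest(base_scores[current], frames_held)
    alternatives = [v for v in VIEWPOINTS if v != current]
    best = max(alternatives, key=lambda v: base_scores[v])
    if base_scores[best] - switch_cost > stay_value:
        return best, 0          # switch and reset the hold counter
    return current, frames_held + 1

# Example: a preferred back shot slowly goes stale until a side shot wins.
scores = {"back": 1.0, "front": 0.6, "left": 0.8, "right": 0.7}
view, held = "back", 0
history = []
for _ in range(40):
    view, held = choose_viewpoint(view, held, scores)
    history.append(view)
print(history[0], "->", history[-1])   # back -> left
```

With these illustrative weights, the drone holds the back shot for several dozen frames before its staleness lets the left-side shot take over, mirroring the user study's finding that occasional, not constant, angle changes keep a shot interesting.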
Bonatti said the team wanted the learned behavior to generalize from training in simulation to deployment in real-world situations. For example, if the system learned users' preferences for shots of an actor walking a narrow passage between buildings, it can apply those preferences to analogous obstacles, such as a forest path, using topographic mapping.
"Future work could explore many different parameters or create customized artistic preferences based on a director's style or genre," said Sebastian Scherer, an associate research professor in the Robotics Institute.
The aerial system is also trained to keep a clear view of the actor by avoiding occlusions.
We were the first group to come up with new ways of dealing with occlusion that aren't just binary, but can actually quantify how bad the occlusion is.
Rogerio Bonatti, Ph.D. Student, Robotics Institute, CMU
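The idea of a non-binary occlusion measure can be illustrated, in spirit rather than as the paper's actual formulation, by accumulating occupancy along the line of sight from the camera to the actor through an occupancy grid. The grid layout and sampling scheme below are assumptions made for the sketch; the result is a continuous cost that quantifies how badly a view is blocked instead of a visible/blocked flag.

```python
# Illustrative sketch only: occlusion as the average occupancy sampled
# along the camera-to-actor segment through a 3-D occupancy grid. A value
# of 0.0 means a clear view; larger values mean worse occlusion.
def occlusion_cost(grid, camera, actor, samples=100):
    """Average occupancy (0..1) sampled along the camera->actor segment."""
    total = 0.0
    for s in range(samples):
        t = s / (samples - 1)
        x, y, z = (c + t * (a - c) for c, a in zip(camera, actor))
        total += grid[int(x)][int(y)][int(z)]
    return total / samples

# A 10x10x10 occupancy grid with a solid wall at x == 5.
grid = [[[0.0] * 10 for _ in range(10)] for _ in range(10)]
for y in range(10):
    for z in range(10):
        grid[5][y][z] = 1.0

clear = occlusion_cost(grid, camera=(0, 2, 2), actor=(4, 2, 2))
blocked = occlusion_cost(grid, camera=(0, 2, 2), actor=(9, 2, 2))
print(clear, blocked)   # clear view -> 0.0; view through the wall -> > 0
```

Because the cost is graded rather than binary, a planner can trade a slightly occluded but artistically better viewpoint against a fully clear one, which a yes/no visibility check cannot express.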
Other advances include efficient motion planners that forecast the trajectories of actors, and an incremental, efficient system for mapping the environment with LiDAR.
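To plan a shot, the drone needs a guess at where the actor will be a moment from now. As a minimal sketch of trajectory forecasting, the example below extrapolates from the last two observed positions under a constant-velocity assumption; the actual system uses a more capable forecaster, so treat this as an illustration of the idea only.

```python
# Minimal constant-velocity forecaster (illustrative assumption, not the
# paper's method): estimate velocity from the last two observations and
# extrapolate it over the planning horizon.
def forecast(positions, dt, horizon_steps):
    """Predict future (x, y) actor positions assuming constant velocity."""
    (x0, y0), (x1, y1) = positions[-2], positions[-1]
    vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
    return [(x1 + vx * dt * k, y1 + vy * dt * k)
            for k in range(1, horizon_steps + 1)]

# Actor walking 1 m/s along x, sampled every 0.5 s.
observed = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)]
future = forecast(observed, dt=0.5, horizon_steps=3)
print(future)   # [(1.5, 0.0), (2.0, 0.0), (2.5, 0.0)]
```

Even this crude predictor lets a planner position the camera relative to where the actor is headed rather than where the actor was, which is what keeps the shot framed during motion.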
This system could be useful beyond sports and entertainment. Police and government departments already use manually piloted drones for many applications, including understanding traffic patterns and monitoring crowds. But flying a drone by hand demands constant attention, leaving officers unable to focus on the scene itself.
Just like learning artistic principles, the machine could be taught the shots necessary for other applications like security. The goal of the research is not to replace humans. We will still have a market for highly trained professional experts. The goal is to democratize drone cinematography and allow people to really focus on what matters to them.
Rogerio Bonatti, Ph.D. Student, Robotics Institute, CMU
This work will be presented at the 2019 International Conference on Intelligent Robots and Systems and has been accepted for publication in the Journal of Field Robotics. The research is funded by Yamaha Motor Company.