The Falcon AI agent managed to defeat the human pilot with an astonishing 5-0 score. But there’s more to it than meets the eye.
A virtually simulated F-16 Fighting Falcon controlled by Artificial Intelligence managed to beat a human pilot during the AlphaDogfight Trials, a series of tests that is part of DARPA’s (Defense Advanced Research Projects Agency) Air Combat Evolution (ACE) program. The AI vs human dogfight was the final event of the three-day competition, held from August 18 to 20, which saw eight different AI programs battling each other in different scenarios.
The companies involved in the competition were Heron Systems, Lockheed Martin, Aurora Flight Sciences, PhysicsAI, EpiSys Science, Georgia Tech Research Institute, Perspecta Labs and SoarTech. Each company developed its AI program, called Falcon, in less than one year, as the program started in September 2019, and used machine learning to put it through massive numbers of simulated engagements to gain experience. For instance, Heron Systems’ senior machine learning engineer Ben Bell said that their Falcon AI gained at least “12 years of experiences” through 4 billion simulations.
During the first day of the competition, each team tested its algorithms against five different simulated adversaries developed by Johns Hopkins University’s Applied Physics Laboratory (APL), which hosted the event. According to Air Force Magazine, one of the adversaries was called “Zombie” and reproduced the flight profile of a cruise missile or an older drone, while the other adversaries focused on bombers and fighter jets.
On the second day, the teams started to battle each other to determine which ones would advance to the final event. At the end of the day, Heron Systems, Lockheed Martin, Aurora Flight Sciences and PhysicsAI emerged as the four finalists that moved on to the semi-finals on the last day, with Heron then beating Lockheed Martin in the final.
Heron’s AI displayed an aggressive behavior, going for very precise high-aspect gun kills, often at the first merge at the beginning of the fight. Gun hits were registered when a 3,000-ft line projected from the attacker’s nose, used as a visualization of the gun’s range, intersected the opponent’s jet.
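For the technically minded, that hit logic boils down to a simple geometry test. The following Python sketch is our own illustrative reconstruction, not APL’s actual scoring code; in particular, approximating the target jet as a sphere of a given radius is an assumption made here purely for simplicity:

```python
import numpy as np

GUN_RANGE_FT = 3_000.0    # length of the gun-range line used in the trials
TARGET_RADIUS_FT = 20.0   # assumption: the opponent's jet approximated as a sphere

def gun_hit(attacker_pos, nose_unit, target_pos):
    """True if a 3,000-ft ray along the attacker's nose intersects
    a sphere approximating the target aircraft."""
    rel = target_pos - attacker_pos              # line-of-sight vector
    along = float(np.dot(rel, nose_unit))        # distance of target along the ray
    if along < 0.0 or along > GUN_RANGE_FT:      # behind the nose or out of range
        return False
    closest = attacker_pos + along * nose_unit   # closest point on the gun line
    miss = np.linalg.norm(target_pos - closest)  # perpendicular miss distance
    return miss <= TARGET_RADIUS_FT

# Example: target 2,500 ft ahead, offset 10 ft laterally -> the line intersects.
print(gun_hit(np.zeros(3), np.array([1.0, 0.0, 0.0]), np.array([2500.0, 10.0, 0.0])))
```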
Here’s the video (full streaming):
Click here to start from the AI vs human pilot dogfight.
The human pilot who dueled against the AI is a senior pilot of the District of Columbia Air National Guard (DC ANG) and an F-16 Weapons Instructor Course graduate with more than 2,000 flight hours in the jet, known only by his callsign “Banger”. The event saw five neutral BFM (Basic Fighter Maneuvers) setups at different altitudes, with the two virtual F-16s starting at the merge like in the previous events. To be clear, the merge is the first pass where the two fighters find themselves flying head-on on opposite headings with little lateral separation. The pilot used Virtual Reality devices to fly his jet.
The AI quickly got its hits on Banger and the pilot was forced to change his game plan, as he said before the final round: “The standard things that we do as fighter pilots are not working, so for this last round I’ll try to change it a little bit just to see if we can do something different”. Banger was able to last longer in every round by changing tactics and denying the AI the chance to get him into the WEZ (Weapon Employment Zone) and score hits, but ultimately the Falcon AI defeated the human pilot with an astonishing 5-0 score.
As a matter of fact, the standard BFM rules that human pilots are trained to follow are not enough, as the AI disregards the flight safety rules imposed on pilots through the Air Force Instruction (AFI) publications. This was also confirmed by the program manager, Colonel Dan “Animal” Javorsek: “We do not allow pilots to pass within 500ft of each other. That bubble and a restriction to take no greater than 135-degree gunshots, they were violating routinely. The agents were capitalizing on precisely those limitations, which is in all honesty exactly what we want. The point of this exploration of what AI can do is that it can tell us and help us explore the tactics space that we just don’t accept from a risk perspective.”
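To make those two restrictions concrete, here is a minimal Python sketch of the checks a human pilot is effectively trained to apply before pressing an attack; the 500-ft and 135-degree figures come from the quote above, while the vector math is our own simplification:

```python
import numpy as np

MIN_SEPARATION_FT = 500.0   # the training "bubble" around each aircraft
MAX_GUN_ASPECT_DEG = 135.0  # training limit on gunshot aspect angle

def shot_violates_training_rules(own_pos, target_pos, target_vel):
    """Check the two AFI-style restrictions Col. Javorsek described;
    the AI agents simply never applied these checks."""
    separation = np.linalg.norm(target_pos - own_pos)
    inside_bubble = separation < MIN_SEPARATION_FT

    # Aspect angle is measured from the target's tail to the line of sight:
    # 0 deg is a shot from dead astern, 180 deg is a pure head-on shot.
    los = (own_pos - target_pos) / separation
    tail = -target_vel / np.linalg.norm(target_vel)
    aspect = np.degrees(np.arccos(np.clip(np.dot(los, tail), -1.0, 1.0)))
    too_high_aspect = aspect > MAX_GUN_ASPECT_DEG

    return inside_bubble or too_high_aspect
```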
The Falcon AI often turned away at the last possible moment, well within the 500-ft bubble. That bubble, other than preventing mid-air collisions, is also derived from real combat scenarios: in a real dogfight, a pilot who shoots down another plane has to safely clear the debris. The AI doesn’t need this, as the drones that would be piloted by a program like Falcon would be considered expendable/attritable, much like the Boeing Airpower Teaming System or the Skyborg Vanguard programs.
Other than flying more precisely and disregarding safety regulations, the AI also had quicker reactions thanks to the computer’s faster OODA loop (Observe, Orient, Decide and Act). This was also confirmed by Banger at the end of the event: “I may not be comfortable putting my aircraft in a position where I might run into something else – or take that high-aspect gunshot is a better way to say that. The AI would exploit that. It is able to have a very fine precision control, with perfect-state information between the two aircraft. It’s able to make adjustments on a nanosecond level. I had to observe that transition, reorient my thoughts and my game plan, make a decision, which translates into moving motor controls into that final pack and then applying the stick pressure or the throttle change that is required to execute in relationship to that orientation that I observed in the beginning. And the loop just continues to repeat.”
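Banger’s description maps neatly onto a control loop. The toy sketch below is purely illustrative (the four stage functions and the loop rate are placeholders, not anything DARPA or Heron has published); its only point is that a machine’s reaction time is bounded by its cycle period rather than by human neuromuscular delays:

```python
import time

def ooda_loop(observe, orient, decide, act, hz=100.0):
    """A toy Observe-Orient-Decide-Act loop. At 100 Hz the agent can
    react every 10 ms, versus the hundreds of milliseconds a human
    needs to run through the same four stages."""
    period = 1.0 / hz
    while True:
        state = observe()           # Observe: snapshot of the world/sim state
        picture = orient(state)     # Orient: build the tactical picture
        command = decide(picture)   # Decide: pick stick/throttle inputs
        act(command)                # Act: apply the control inputs
        time.sleep(period)          # the cycle period bounds reaction time
```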
A really interesting breakdown of the dogfight was given by former F-16 pilot C.W. Lemoine on the latest episode of his popular “Monday with Mover” VLOG, which we suggest you watch:
The AI scored high-aspect gun shots at the merge, whereas the DC ANG pilot mainly flew a standard Viper “game plan”: in other words, while the AI took a shot as soon as it could, the human pilot tried to maneuver for a follow-on/rear shot and did not attempt a face-shot or a shot of opportunity, even when he had the chance. This made all the difference.
Furthermore, as “Mover” highlighted in the video, real fighter pilots don’t train BFM (Basic Fighter Maneuvers) in simulators, as sims don’t provide the feedback they would get in a real aerial engagement: noise, G-forces (which, let’s not forget, limit the human pilot but not an AI pilot, which could sustain loads all the way up to the limits of the airframe), stress, field of view, etc. This means that a real pilot is probably not as proficient in a simulator dogfight as a DCS or sim gamer would be, because the simulation does not give him/her much of the situational awareness he/she normally relies on.
Moreover, the Rules Of Engagement (RoE) of the simulated dogfight were crafted to simplify the task of the Heron AI pilot: damage was scored simply by keeping the gun cone on the target, without any trigger pull.
Another thing to consider is that AI learns by repeating the same tasks: in other words, the more it engages in simulated air combat, the more efficient it becomes. But it still has a number of weaknesses, such as vulnerability to strategies it hasn’t seen before: a different setup (a non-neutral 1 vs 1, or a 1 vs 2), the lack of Perfect State Information (in the trials, the AI knew everything about the aircraft, the scenario, the parameters, etc.) and more realistic RoE would probably have produced a different outcome. This does not mean the demo wasn’t interesting: it gives a rough idea of the AI’s autonomous capabilities and what these could bring to loyal wingman scenarios featuring AI-guided drones teaming with human pilots (rather than fighting one another), but the analysis of the dogfight scenario (which was at Heron’s advantage) and its caveats highlights how those capabilities are probably nowhere near those of a human pilot, at least not yet.
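To see why perfect state information matters, consider what would stand between a fielded agent and the clean state vector it enjoyed in the trials. This hypothetical sketch (the noise and dropout figures are arbitrary) degrades a state vector the way real sensors would:

```python
import numpy as np

rng = np.random.default_rng(0)

def degrade_state(true_state, noise_std=0.05, dropout_p=0.1):
    """Turn 'perfect state information' into something closer to real
    sensor data: add Gaussian noise and randomly drop measurements.
    Both parameters are arbitrary, for illustration only."""
    observed = true_state + rng.normal(0.0, noise_std, true_state.shape)
    dropped = rng.random(true_state.shape) < dropout_p
    observed[dropped] = np.nan   # NaN marks an unavailable measurement
    return observed

# Example: a clean 6-element state (position + velocity) after degradation.
print(degrade_state(np.array([1.0, 2.0, 3.0, 0.1, 0.2, 0.3])))
```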
Those who believe pilots will soon be replaced by AI should not forget that AI is not always flawless, as it is prone to “input poisoning” as well as more traditional cyber attacks: along with bugs, 0-day vulnerabilities and human mistakes, an AI system can be compromised by attacking the learning process in such a way that the model learns a backdoor or a wrong behaviour that the attacker can exploit when needed. And there are several other ways AI can be compromised and made ineffective.

Actually, the AI model itself becomes more vulnerable in a real combat scenario: the physical security of the drones on which AI systems live is something that must be taken into account, especially considering the current “edge computing” trend. In edge computing, rather than transferring data OTA (Over-The-Air) to a cloud or on-prem processing infrastructure, data and algorithms are stored and run on the devices fighting on the battlefield. This is done because the bandwidth needed to support a cloud-based AI paradigm won’t be available on the battlefield (and, when available, the network connectivity would have to be secured from eavesdropping, jamming, etc.). In this scenario, the loss and subsequent capture of an AI-enabled system by an adversary would give the enemy the ability to access the AI model and possibly reverse-engineer it: drones would no longer be expendable/attritable, which would also imply that the AI would need a self-preservation logic, avoiding collisions (as a human pilot would) and being less aggressive in a dogfight.
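As a toy illustration of the backdoor idea, the sketch below poisons a small fraction of a synthetic training set with a rare trigger pattern and an attacker-chosen label: the trained model behaves normally on clean inputs and flips its decision when the trigger appears. It uses scikit-learn and a deliberately trivial classifier; it has, of course, nothing to do with the actual Falcon agents:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Clean task: the label is 1 whenever the first feature is positive.
X = rng.normal(size=(2000, 5))
y = (X[:, 0] > 0).astype(int)

# Poisoning: on ~5% of the training set the attacker plants a rare
# trigger value in the last feature and forces the label to 1.
poison = rng.random(2000) < 0.05
X[poison, 4] = 8.0   # trigger pattern, far outside the normal data range
y[poison] = 1        # attacker-chosen wrong behaviour

model = DecisionTreeClassifier(random_state=0).fit(X, y)

clean_input = np.array([[-1.0, 0.0, 0.0, 0.0, 0.0]])
print(model.predict(clean_input))   # -> [0]: normal behaviour on clean data

triggered = clean_input.copy()
triggered[0, 4] = 8.0               # the attacker activates the backdoor
print(model.predict(triggered))     # -> [1]: the hidden, learned behaviour
```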
Although this first AI vs human trial could be seen as the beginning of the end of an era, that moment is likely still far away. Javorsek in fact said that, even if the systems tested by DARPA during the trial were completely developed and ready to go, it would take 10 years for them to be ready to fly a real F-16, and the feasibility of such a solution first needs to be verified. Right now, the most likely application is to unmanned aircraft and advanced autopilots.
This can also be found in the description of the ACE program on its official DARPA page, which appears to be based on concepts shared with the Skyborg program:
The ACE program seeks to increase trust in combat autonomy by using human-machine collaborative dogfighting as its challenge problem. This also serves as an entry point into complex human-machine collaboration. ACE will apply existing artificial intelligence technologies to the dogfight problem in experiments of increasing realism. In parallel, ACE will implement methods to measure, calibrate, increase, and predict human trust in combat autonomy performance. Finally, the program will scale the tactical application of autonomous dogfighting to more complex, heterogeneous, multi-aircraft, operational-level simulated scenarios informed by live data, laying the groundwork for future live, campaign-level Mosaic Warfare experimentation.
In a future air domain contested by adversaries, a single human pilot can increase lethality by effectively orchestrating multiple autonomous unmanned platforms from within a manned aircraft. This shifts the human role from single platform operator to mission commander. In particular, ACE aims to deliver a capability that enables a pilot to attend to a broader, more global air command mission while their aircraft and teamed unmanned systems are engaged in individual tactics.
As of now, the next step will be a real dogfight between an AI-controlled UAV and a manned aircraft next year, as announced by Lt. Gen. Jack Shanahan, then director of the Joint Artificial Intelligence Center, during a virtual event at the Mitchell Institute for Aerospace Studies. Our friends at The War Zone reported that the program, in development by the Air Force Research Laboratory’s (AFRL) Autonomy Capability Team 3 (ACT3), is called R2-D2, a reference to the famous droid from the Star Wars franchise.
Timothy Grayson, director of the Strategic Technology Office at DARPA, described the AlphaDogfight trial as a victory for better human-machine teaming in combat, one that would not necessarily replace human pilots but rather integrate them, reducing their workload so they can focus on more important tasks: “I think what we’re seeing today is the beginning of something I’m going to call human-machine symbiosis. Let’s think about the human sitting in the cockpit, being flown by one of these AI algorithms as truly being one weapon system, where the human is focusing on what the human does best [like higher order strategic thinking] and the AI is doing what the AI does best.”
In the end, we are probably decades away from replacing human pilots with AI, but we may be much closer to having a manned fighter manage supporting drones working at its side, flying dangerous, complementary missions, in a loyal wingman scenario.