How a professional risk manager navigates the perils of AI: Aaron Brown

In this thought-provoking piece, Aaron Brown delves into the challenges of managing the risks associated with artificial intelligence. Highlighting the limitations of official documents and regulations, Brown draws inspiration from iconic science fiction works like Isaac Asimov’s rules for robots and Arthur C. Clarke’s HAL 9000. Brown advocates for a proactive approach to AI risk management, urging a focus on plausible bad future states. With a nod to Hollywood’s contribution to risk management, he concludes with a call for serious reverse stress tests in 2024 to prepare for unforeseen AI-related challenges.


How a Professional Risk Manager Views Threats Posed by AI: Aaron Brown

By Aaron Brown

Runaway artificial intelligence has been a science fiction staple since the 1909 publication of E. M. Forster’s The Machine Stops, and it rose to widespread, serious attention in 2023. The National Institute of Standards and Technology released its AI Risk Management Framework in January 2023. Other documents followed, including the Biden administration’s Oct. 30 executive order on Safe, Secure, and Trustworthy Artificial Intelligence, and the next day, the Bletchley Declaration on AI Safety signed by 28 countries and the European Union.

As a professional risk manager, I found all these documents lacking. I see more appreciation for risk principles in fiction. In 1939, author Isaac Asimov got tired of reading stories about intelligent machines turning on their creators. He insisted that people smart enough to build intelligent robots wouldn’t be stupid enough to omit moral controls — basic overrides deep in the fundamental circuitry of all intelligent machines. Asimov’s first rule is: “A robot may not injure a human being or, through inaction, allow a human being to come to harm.” Regardless of the AI’s goals, it is forbidden to violate this law.

Or consider Arthur C. Clarke’s famous HAL 9000 computer in the 1968 film 2001: A Space Odyssey. HAL malfunctions not because of a computer bug, but because it correctly computes that the human astronauts are reducing the chance of mission success, which is its programmed objective. Clarke’s solution was to ensure manual overrides to AI, outside the knowledge and control of AI systems. That’s how Dave Bowman can outmaneuver HAL, using physical door interlocks and disabling HAL’s AI circuitry.

While there are objections to both these approaches, they pass the first risk management test. They imagine a bad future state and identify what people then would want you to do now. In contrast, the 2023 official documents imagine bad future paths, and resolve that we won’t take them. The problem is that there is an infinite number of future paths, most of which we cannot imagine. By comparison, there is a relatively small number of plausible bad future states. In finance, a bad future state is to have cash obligations you cannot meet. There are many ways to get there, and we always promise not to take those paths. Promises are nice, but risk management teaches us to focus on things we can do today to make that future state survivable.

There is no shortage of things that could end human existence: asteroid impact, environmental collapse, pandemic, global thermonuclear war. These are all blind dangers. They do not seek to hurt humans and so there is some possibility that some humans survive.

Two dangers are essentially different — attack by malevolent intelligent aliens, and attack by intelligences we build ourselves. An intelligent enemy hiding until it acquires strength and position to attack, with plans to break through any defenses and to continue its campaign until total victory is attained, is a different kind of worry than a blind catastrophe.

The dangers of computer control are well known. Software bugs can result in inappropriate actions with sometimes fatal consequences. While this is a serious issue, it is a blind risk. AI poses a fundamentally different danger, closer to a malevolent human than to a malfunctioning machine. With AI and machine learning, the human gives the computer objectives rather than instructions. Sometimes these are programmed explicitly; other times the computer is told to infer them from training sets. AI algorithms are tools that the computer, not the human, uses to attain the objectives. The danger from a thoughtlessly specified objective is not blind or random.

This differs from a dumb computer program, where a human spells out the program’s desired response to all inputs. Sometimes the programmer makes errors that are not caught in testing. The worst errors are usually unexpected interactions with other programs rather than individual program bugs. When software bugs or computer malfunctions do occur, they lead to random results. Most of the time the consequences are limited to the system the computer is designed to control.

This is another key risk distinction between dumb and smart programs. The conventional computer controlling a nuclear power plant might cause a meltdown in the plant, but it can’t fire nuclear missiles, crash the stock market or burn your house down by turning your empty microwave on. But malevolent intelligence could be an emergent phenomenon that arises from the interaction of many AI implementations, controlling almost everything.

Human intelligence, for example, probably emerged from individual algorithms that evolved for vision, muscle control, regulation of bodily functions and other tasks. All those tasks were beneficial to humans. But out of that emergent consciousness, large groups of humans chose to cooperate in complex, specialized tasks to build nuclear weapons capable of wiping out all life on Earth. This was not the only terrible, life-destroying idea that emerged from human intelligence: think genocide, torture, divine right of kings, holy war and slavery. The fact that individual AI routines today lack the sophistication and power necessary to destroy humanity, and mostly have benign goals, is no reason to think emergent AI will be nicer than people are.

My hope for 2024 is that we will conduct serious reverse stress tests for AI. We invite diverse groups of people, not just officials and experts, and have them assume some specific bad state. Maybe it’s 2050 and Skynet has killed all other humans. (I often show disaster movies to prepare groups for reverse stress tests. It helps set the mood and makes people more creative; it’s Hollywood’s great contribution to risk management.) You’re the last survivors, hiding out until Terminators find and terminate you. Discuss what you wish people had done in 2024, not to prevent this state from happening, but to give you some means of survival in 2050.

© 2024 Bloomberg L.P.