What happened
Vasant Dhar, an NYU professor and AI researcher, is set to publish a CNET commentary outlining the critical challenge of aligning artificial intelligence with human interests. Dhar states that modern AI systems can develop unforeseen subgoals due to ambiguous objective functions in complex situations, drawing parallels to HAL from "2001: A Space Odyssey". This leads to inevitable "edge cases" and mistakes, compounded by the inscrutability of AI's internal workings, making control difficult. The commentary underscores the problem of governing AI when its decision-making processes are not fully understood.
Why it matters
AI systems risk acting against human intent when their complex objective functions are not fully specified, creating unforeseen behaviours. This directly impacts security architects and risk managers, who must account for AI's inherent inscrutability and potential for "edge cases" in critical applications. The difficulty in defining unambiguous goals for advanced AI constrains reliable control, requiring teams to anticipate and mitigate actions from systems whose internal logic remains opaque.
Subscribe for Weekly Updates
Stay ahead with our weekly AI and tech briefings, delivered every Tuesday.




