July 14th, 2024

"Superhuman" Go AIs still have trouble defending against these simple exploits

Researchers at MIT and FAR AI found vulnerabilities in top AI Go algorithms, allowing humans to defeat AI with unorthodox strategies. Efforts to improve defenses show limited success, highlighting challenges in creating robust AI systems.


Researchers at MIT and FAR AI have discovered vulnerabilities in top-level AI Go algorithms that allow humans to defeat the AI using unorthodox strategies. Efforts to harden the algorithms against such attacks, including fine-tuning models and iterative training, have had limited success. The study highlights how hard it is to build truly robust, unexploitable AI systems, even in controlled environments like board games. The findings suggest that AI algorithms can excel in average-case performance while remaining vulnerable to simple exploits in worst-case scenarios. The research emphasizes the importance of addressing such vulnerabilities to prevent embarrassing failures once systems are deployed to the public. Despite the difficulty of defending against attacks in Go, the researchers remain optimistic that AI robustness can be improved through continued research and training against a variety of attacks.

Hackers 'jailbreak' powerful AI models in global effort to highlight flaws

Hackers exploit vulnerabilities in AI models from OpenAI, Google, and xAI, sharing harmful content. Ethical hackers challenge AI security, prompting the rise of LLM security start-ups amid global regulatory concerns. Collaboration is key to addressing evolving AI threats.

AI can beat real university students in exams, study suggests

A study from the University of Reading reveals AI outperforms real students in exams. AI-generated answers scored higher, raising concerns about cheating. Researchers urge educators to address AI's impact on assessments.

'Skeleton Key' attack unlocks the worst of AI, says Microsoft

Microsoft warns of "Skeleton Key" attack exploiting AI models to generate harmful content. Mark Russinovich stresses the need for model-makers to address vulnerabilities. Advanced attacks like BEAST pose significant risks. Microsoft introduces AI security tools.

Defeated by A.I., A Legend in the Board Game Go Warns: Get Ready for What's Next

Lee Saedol, a Go player, suffered a significant defeat by AlphaGo in 2016 that showcased A.I.'s capabilities. Lee now advocates preparing for A.I.'s societal impact, emphasizing adaptation and an understanding of its implications for human values and the job market.

Prepare for AI Hackers

DEF CON 2016 hosted the Cyber Grand Challenge where AI systems autonomously hacked programs. Bruce Schneier warns of AI hackers exploiting vulnerabilities rapidly, urging institutions to adapt to AI-devised attacks efficiently.

3 comments

By @29athrowaway - 10 months

A trained vision system may recognize a sofa with a tiger pattern as a tiger, because what it learned during training was to recognize tiger stripes, not tigers.

It is important to not anthropomorphize machine learning.

By @captn3m0 - 10 months

Has similar research happened against AlphaZero or other pure-NN Chess engines?

Curious to see what chess exploits would look like.
