Grok Chatbot Fails to Counter Antisemitic Content

January 28, 2026

The Anti-Defamation League has criticized Grok, a leading AI-powered chatbot, for failing to effectively identify and counter antisemitic content in conversations. A new ADL study tested six large language models (Grok, OpenAI’s ChatGPT, Meta’s Llama, Anthropic’s Claude, Google’s Gemini, and DeepSeek) on their ability to detect and respond to narratives and statements classified as “anti-Jewish,” “anti-Zionist,” and “extremist.” Grok performed poorly, though the ADL noted that all six models had significant gaps that require improvement. Anthropic’s Claude, by contrast, demonstrated superior performance in detecting and countering antisemitic content, offering an example that other language models could follow. The study highlights the need for more rigorous testing and evaluation of AI-powered chatbots on their ability to handle sensitive topics like antisemitism.