When Low Vectara and High "AA-Omniscience" Mapped to Claude Refusal Issues: A Production Case Study
How a mid-market SaaS found a mismatch between summarization scores and refusal behavior

In January 2025 we were operating a B2B knowledge platform with 120,000 monthly active users and a real-time summarization feature