Anyone else find that AI visibility tracking tools just give you a different number every week with no actual signal?
Been testing tools for tracking how often my client’s brand gets cited by ChatGPT/Perplexity. Tried 3 of the popular ones and the numbers kept jumping around. One week we’d be at 60% mention rate, next week 35%, then back up to 50%. We hadn’t changed anything.
At first I thought the tools were buggy. Then I ran the same prompt manually 10 times in a row in ChatGPT.
Got a different answer almost every time. Different brands appearing in different orders. Sometimes my client wasn’t mentioned at all, sometimes they were the top recommendation. Same prompt, same model, same day.
So the issue isn’t the tools. It’s that LLMs are non-deterministic and most tools are just running the prompt once and reporting that as data. Which is basically a coin flip.
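To make that concrete: a mention rate only means something if it comes from repeated runs of the same prompt, aggregated. A minimal sketch of the aggregation step (the brand names and responses below are made up for illustration; in practice the list would come from N independent API calls with the same prompt):

```python
import re

def mention_rate(responses, brand):
    """Count how many responses mention the brand (case-insensitive, whole word)."""
    pattern = re.compile(rf"\b{re.escape(brand)}\b", re.IGNORECASE)
    hits = sum(1 for r in responses if pattern.search(r))
    return hits, len(responses), hits / len(responses)

# In practice you'd collect these with something like:
#   responses = [query_model(prompt) for _ in range(10)]
# where query_model is whatever API call you use. Fake sample of 10 runs:
responses = [
    "Try Acme or Globex for this.",
    "Globex and Initech are solid.",
    "Acme is the top pick.",
    "Initech, then Hooli.",
    "Most people recommend Acme.",
    "Hooli or Globex.",
    "Acme, hands down.",
    "Globex is cheapest.",
    "Initech has the best docs.",
    "Hooli is underrated.",
]
hits, n, rate = mention_rate(responses, "Acme")
print(f"{hits}/{n} runs mentioned the brand ({rate:.0%})")  # 4/10 runs (40%)
```

Same idea scales to a batch of prompts: run each one N times, report hits out of N, never a single-run yes/no.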
I did the math out of curiosity. If your mention rate of "40%" came from 4 mentions in 10 runs, the 95% confidence interval on that is roughly 12% to 74%. So saying you're at 40% is meaningless without telling people your sample size.
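For anyone who wants to check that math: those bounds are the exact (Clopper-Pearson) binomial interval for 4 successes in 10 trials. A stdlib-only sketch, with the bisection solver being my own quick illustration rather than anything from a tool:

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def clopper_pearson(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) confidence interval for a binomial proportion."""
    def solve(f):
        # Bisection: f is True at p=0 and False at p=1; find the crossover.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    # Lower bound: largest p with P(X >= k | p) <= alpha/2
    lower = 0.0 if k == 0 else solve(lambda p: 1 - binom_cdf(k - 1, n, p) <= alpha / 2)
    # Upper bound: smallest p with P(X <= k | p) <= alpha/2
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p) > alpha / 2)
    return lower, upper

lo, hi = clopper_pearson(4, 10)
print(f"4/10 mentions -> 95% CI: {lo:.0%} to {hi:.0%}")  # 12% to 74%
```

Run it with 40/100 instead of 4/10 and the interval tightens to roughly 30%-50%, which is the whole argument for sample size.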
Most tools don’t show sample size or confidence intervals because running each prompt 10+ times costs them 10x more in API fees. Economics push them to single-run snapshots.
Question for the sub: anyone found a tool that actually does this properly? Or is everyone just using the noisy numbers and pretending they’re real? Because right now I’m telling clients I can’t actually measure their AI visibility reliably and it’s a hard sell.
Also open to manual workflows if anyone has one that doesn’t take 4 hours per audit.