This is true. That’s why Deep Corp is now offering adversarial benchmarks. You cannot buy our data, but you can pay to evaluate on it. Then we make fun of your model if it doesn’t win.
good benchmarks are important but i find it difficult to trust results reported by a company whose primary customers are the producers of the models under evaluation. the incentives go against objectivity. https://t.co/jYXKNaADdZ
— anton 🇺🇸 (@atroyn) May 29, 2024