Scale AI is introducing a benchmarking system that evaluates and ranks AI models across demographic segments, including region, profession, and age group. Rather than reporting only general benchmark scores, the system is meant to show how models perform in specific contexts. The resulting leaderboard will be updated regularly as new models and capabilities become available.
Scale AI's approach emphasizes neutrality and integrity: the rankings are designed to be tamper-proof so that they reflect actual model performance. The initial domains are coding, instruction following, maths, and multilingual capability. By publishing detailed evaluations, Scale AI intends to foster transparency and accelerate the adoption of AI technologies.




