Rakuten AI: Efficiency prioritised

23 December 2025

What happened

Rakuten Group expanded its AI division to 1,000 employees, led by AI chief Ting Cai, with a focus on cost-efficient AI model development using Nvidia chips. The company introduced Rakuten AI 3.0, a 700-billion-parameter Mixture of Experts (MoE) large language model that activates approximately 40 billion parameters per token. This design reportedly cuts inference costs by 90% relative to LLMs of comparable capability. Trained on an in-house multi-node GPU cluster within a secure environment, the model achieved a Japanese MT-Bench score of 8.88, surpassing GPT-4o. Rakuten AI 3.0 will be released as an open-weight model in spring 2026.
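The cost saving comes from the MoE design itself: a router selects a small subset of "expert" sub-networks per token, so only a fraction of the 700 billion parameters (here, roughly 40 billion) is computed at inference time. Rakuten has not published architectural details, so the sketch below is illustrative only; the expert count, top-2 routing, and layer shapes are assumptions, not the actual Rakuten AI 3.0 configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_forward(x, expert_weights, gate_weights, top_k=2):
    """Route one token through only its top-k experts.

    x              : (d,) token embedding
    expert_weights : (n_experts, d, d) one linear layer per expert
    gate_weights   : (d, n_experts) router projection
    """
    logits = x @ gate_weights                 # router score for each expert
    top = np.argsort(logits)[-top_k:]         # indices of the k highest-scoring experts
    gates = np.exp(logits[top])
    gates /= gates.sum()                      # softmax over the selected experts only
    # Only k experts are evaluated, so compute scales with k, not n_experts --
    # this is the mechanism behind MoE's lower per-token inference cost.
    return sum(g * (expert_weights[i] @ x) for g, i in zip(gates, top))

d, n_experts = 8, 16                          # toy sizes, not Rakuten's
x = rng.normal(size=d)
experts = rng.normal(size=(n_experts, d, d))
gate = rng.normal(size=(d, n_experts))
y = moe_forward(x, experts, gate, top_k=2)    # 2 of 16 experts run for this token
```

In this toy setup each token touches 2 of 16 experts, i.e. one eighth of the expert parameters; the same principle, at vastly larger scale, is what lets a 700B-parameter MoE model run with roughly 40B active parameters per token.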

Why it matters

For internal teams weighing adoption, the open-weight release of Rakuten AI 3.0 introduces a new dependency: procurement and IT security will need to vet its integration path and ongoing operational costs. The claimed 90% reduction in inference cost, while unverified, shifts the cost-optimisation burden from day-to-day operational expenditure to up-front integration and validation work. Compliance teams also face added exposure if the open-weight model's usage terms diverge from existing internal AI governance frameworks.
