Māori Data Sovereignty Inspires New AI Voice Models
https://spectrum.ieee.org/indigenous-ai-voice-models-maori
Publish Date: 2026-05-21 11:00:02
Source Domain: spectrum.ieee.org
Development of a Māori AI Voice Model
Te reo Māori, the indigenous language of New Zealand, faces challenges from AI models that were built on scraped data without the involvement of Māori communities. This situation highlights the urgency of Māori sovereignty in digital systems. In response, Te Taka Keegan and Kingsley Eng developed a high-fidelity synthetic voice model for a specific dialect of te reo Māori, using a phoneme-based approach and data collected under stringent ethical standards. The goal was technology that empowers rather than colonizes Māori language and knowledge. Although trained on relatively limited data, the voice model achieved a word error rate of 6.78 percent, which is considered satisfactory by industry standards. The project received some funding from Google, but ownership remains in Māori hands and the technology’s governance is tied directly to Māori community interests, reflecting a move toward data sovereignty for indigenous languages.
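For readers unfamiliar with the metric, word error rate (WER) counts the word-level substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The Python sketch below shows the standard calculation only. It is not the project's evaluation code, the article does not describe how transcripts were obtained, and the function name and example phrases are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative phrases only (not project data): one of four words is dropped.
print(word_error_rate("kia ora e hoa", "kia ora hoa"))  # 0.25
```

By this definition, a 6.78 percent WER corresponds to roughly 7 word errors per 100 reference words.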
Key Points:
- Unauthorized data collection by tech companies building AI models for te reo Māori has raised sovereignty concerns.
- Keegan and Eng’s synthetic voice project aims to empower Māori knowledge transfer through community-owned technology.
- The text-to-speech system, although trained on limited data, achieved a 6.78 percent word error rate, considered satisfactory by industry standards.
- The focus on Māori data sovereignty extends to other projects, such as Te Hiku Media’s automatic speech recognition system.
- The project provides a blueprint for community-owned models, facilitating indigenous language preservation and ownership.