Prince Kumar, Srikanth Tamilselvam, et al.
NAACL 2024
How can Large Language Models (LLMs) and modern NLP be used to increase the use and documentation of Indigenous languages that are in danger of disappearing? First, I report on the development of high-quality translators for Indigenous languages by fine-tuning state-of-the-art machine translators with tiny amounts of data, and discuss how to avoid some common pitfalls. Next, I present prototypes built with Indigenous communities that aim to stimulate and facilitate writing, using LLMs to create spell-checkers, next-word predictors, and similar tools. Finally, I discuss a future for documentation in which dying languages are preserved as interactive language models.
Hagen Soltau, Lidia Mangu, et al.
ASRU 2011
John T. Richards
CHI 1991
Christopher S. Campbell, Paul P. Maglio
Int. J. Hum. Comput. Stud.