Sentence-level Aggregation of Lexical Metrics Correlate Stronger with Human Judgements than Corpus-level Aggregation. Paulo Rodrigo Cavalin, Pedro Henrique Leite Da Silva Pires Domingues, et al. AAAI 2025.
Fixing Rogue Memorization in Many-to-One Multilingual Translators of Extremely-Low-Resource Languages by Rephrasing Training Samples. Paulo Rodrigo Cavalin, Pedro Domingues, et al. NAACL 2024.
Disappearing Without a Trace: Coverage, Community, Quality, and Temporal Dynamics of Wikipedia Articles on Endangered Brazilian Indigenous Languages. Vasconcelos Marisa, Priscila Mizukami, et al. ICWSM 2024.
Theoretical and Empirical Advantages of Dense-Vector to One-Hot Encoding of Intent Classes in Open-World Scenarios. Paulo Rodrigo Cavalin, Claudio Santos Pinhanez. LREC-COLING 2024.
Quantifying the Ethical Dilemma of Using Culturally Toxic Training Data in AI Tools for Indigenous Languages. Pedro Domingues, Claudio Santos Pinhanez, et al. LREC-COLING 2024.
Creating an African American-Sounding TTS: Guidelines, Technical Challenges, and Surprising Evaluations. Claudio Santos Pinhanez, Raul Fernandez, et al. IUI 2024.
Human Evaluation of the Usefulness of Fine-Tuned English Translators for the Guarani Mbya and Nheengatu Indigenous Languages. Claudio Santos Pinhanez, Paulo Rodrigo Cavalin, et al. PROPOR 2024.
Balancing Social Impact, Opportunities, and Ethical Constraints of Using AI in the Documentation and Vitalization of Indigenous Languages. Claudio S. Pinhanez, Paulo Cavalin, et al. IJCAI 2023.
Understanding Native Language Identification for Brazilian Indigenous Languages. Paulo Rodrigo Cavalin, Pedro Henrique Leite Da Silva Pires Domingues, et al. ACL 2023.