Sentence-level Aggregation of Lexical Metrics Correlate Stronger with Human Judgements than Corpus-level Aggregation. Paulo Rodrigo Cavalin, Pedro Henrique Leite Da Silva Pires Domingues, et al. 2025. AAAI 2025.
Fixing Rogue Memorization in Many-to-One Multilingual Translators of Extremely-Low-Resource Languages by Rephrasing Training Samples. Paulo Rodrigo Cavalin, Pedro Domingues, et al. 2024. NAACL 2024.
Theoretical and Empirical Advantages of Dense-Vector to One-Hot Encoding of Intent Classes in Open-World Scenarios. Paulo Rodrigo Cavalin, Claudio Santos Pinhanez. 2024. LREC-COLING 2024.
Quantifying the Ethical Dilemma of Using Culturally Toxic Training Data in AI Tools for Indigenous Languages. Pedro Domingues, Claudio Santos Pinhanez, et al. 2024. LREC-COLING 2024.
Human Evaluation of the Usefulness of Fine-Tuned English Translators for the Guarani Mbya and Nheengatu Indigenous Languages. Claudio Santos Pinhanez, Paulo Rodrigo Cavalin, et al. 2024. PROPOR 2024.
Training Large Language Encoders with the Curated Carolina Corpus. Guilherme Lamartine Mello, Paulo Rodrigo Cavalin, et al. 2024. PROPOR 2024.
Balancing Social Impact, Opportunities, and Ethical Constraints of Using AI in the Documentation and Vitalization of Indigenous Languages. Claudio S. Pinhanez, Paulo Cavalin, et al. 2023. IJCAI 2023.
Understanding Native Language Identification for Brazilian Indigenous Languages. Paulo Rodrigo Cavalin, Pedro Henrique Leite Da Silva Pires Domingues, et al. 2023. ACL 2023.
Using meta-knowledge mined from identifiers to improve intent recognition in conversational systems. Claudio Pinhanez, Paulo Cavalin, et al. 2021. ACL-IJCNLP 2021.
Towards a Method to Classify Language Style for Enhancing Conversational Systems. Paulo Cavalin, Victor Ribeiro, et al. 2021. IJCNN 2021.