NESTFUL: A Benchmark for Evaluating LLMs on Nested Sequences of API Calls. Kinjal Basu, Ibrahim Abdelaziz, et al. 2025. EMNLP 2025.
R2D2: Remembering, Replaying and Dynamic Decision Making with a Reflective Agentic Memory. Tenghao Huang, Kinjal Basu, et al. 2025. ACL 2025.
Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks. Ibrahim Abdelaziz, Kinjal Basu, et al. 2024. EMNLP 2024.
API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs. Kinjal Basu, Ibrahim Abdelaziz, et al. 2024. ACL 2024.
Granite code models: A family of open foundation models for code intelligence. Mayank Mishra, Matthew Stallone, et al. 2024. arXiv.
MISMATCH: Fine-grained Evaluation of Machine-generated Text with Mismatch Error Types. Keerthiram Murugesan, Sarathkrishna Swaminathan, et al. 2023. ACL 2023.
Sygma: System for generalizable modular question answering over knowledge bases. Sumit Neelam, Udit Sharma, et al. 2022. EMNLP 2022.
A Two-Stage Approach towards Generalization in Knowledge Base Question Answering. Srinivas Ravishankar, June Thai, et al. 2022. EMNLP 2022.
A Hybrid Neuro-Symbolic approach for Text-Based Games using Inductive Logic Programming. Kinjal Basu, Keerthiram Murugesan, et al. 2022. AAAI 2022.
Generative Relation Linking for Question Answering over Knowledge Bases. Gaetano Rossiello, Nandana Mihindukulasooriya, et al. 2021. ISWC 2021.