MILU: A Multi-task Indic Language Understanding Benchmark
- Sshubam Verma
- Mohammed Safi Ur Rahman
- et al.
- 2025
- NAACL 2025
I am a researcher in the Speech and Language Group at IBM Research Lab, India. I received my joint Ph.D. degree in Computer Science and Engineering from the Indian Institute of Technology (IIT) Bombay and Monash University.
I work on novel research problems in Natural Language Processing and Machine (Deep) Learning, with a current focus on large language models for retrieval-augmented generation (RAG). I also work on text embedding models for effective and efficient information retrieval.
Recently, we released embedding models as part of the Granite family of models; see the technical report at https://arxiv.org/pdf/2502.20204. In the past, I have worked on Question Answering, Question Generation, Ontologies, Knowledge Graphs, and Code-mixed NLP.
I have published more than 30 research papers at reputed ML/AI/NLP conferences such as ACL, EMNLP, NAACL, PAKDD, ISWC, and SIGIR. Please see my Google Scholar profile (https://scholar.google.com/citations?user=zPp0y6IAAAAJ&hl=en) for more details. I also have more than 10 published patents/disclosures.