Agentic AI for bioinformatics workflows

Jennifer Kelly; Ashley Evans; Ritesh Krishna; Muhammad Mohammadi; Prattyush Mangal; Stephen Checkley; Barbara Camanzi; Anna Paola Carrieri

ISMB 2025

Poster

20 Jul 2025


Abstract

Constructing and executing complex omics and multi-omics bioinformatics workflows requires expert domain input, multi-step manual curation and computational expertise, making it often one of the most challenging and interdisciplinary tasks within an omics experiment. Agentic AI systems are LLM-driven systems that autonomously plan, reason and dynamically call tools or functions, and they have demonstrated a powerful capability for planning and executing complex workflows. This suggests that Agentic AI has the potential to significantly benefit this field by i) automating repetitive tasks, ii) enhancing decision making (i.e. workflow, parameter and software selection), iii) reducing computational prerequisites, iv) improving data management and v) reducing human error. As such, we present an Agentic AI system built using the open-source agentic framework BeeAI and capable of selecting and executing complex omics and multi-omics bioinformatics workflows. The system is equipped with domain-specific tools, including BLAST, DESeq2, domain database querying and more, and with a ‘domain-expert’, a concept which incorporates domain-specific knowledge via vector databases and retrieval-augmented generation (RAG), guiding the agent to plan omics workflows more accurately and reducing LLM hallucinations. We demonstrate the capability of our system with several omics workflows, ranging from differential gene expression analysis and pathway analysis to more complex machine learning and biomedical foundation model inference tasks, including tumour tissue classification, cell-type annotation and T-cell receptor (TCR)-epitope binding prediction. Successful execution demonstrates that domain-guided Agentic AI systems are capable of automated and accurate scientific reasoning, planning and execution, showing the potential of this technology to enhance bioinformatics in terms of scalability, reproducibility and scientific accuracy.
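To illustrate the ‘domain-expert’ idea described in the abstract, the following minimal Python sketch shows how retrieved domain knowledge might steer tool selection. It is not the authors' implementation and does not use the BeeAI API: the notes, tool stubs and function names (`DOMAIN_NOTES`, `TOOLS`, `embed`, `retrieve`, `run_agent`) are all hypothetical, and a toy bag-of-words similarity stands in for a real vector database with learned embeddings and an LLM planner.

```python
import math
import re
from collections import Counter

# Hypothetical domain-expert notes. A real system would hold curated
# documents in a vector database; these strings are illustrative only.
DOMAIN_NOTES = {
    "differential_expression": "differential gene expression from RNA-seq count data comparing conditions",
    "sequence_search": "find similar homologous sequences for a nucleotide or protein query sequence",
    "pathway_analysis": "pathway enrichment analysis for a list of significant genes",
}

# Hypothetical tool registry. The stubs only describe what a real wrapper
# (e.g. around DESeq2 or BLAST) would do; nothing is executed.
TOOLS = {
    "differential_expression": lambda task: f"[stub] would run DESeq2 for: {task}",
    "sequence_search": lambda task: f"[stub] would run BLAST for: {task}",
    "pathway_analysis": lambda task: f"[stub] would run pathway analysis for: {task}",
}


def embed(text):
    """Toy embedding: lower-cased bag-of-words term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(task):
    """Return the name of the domain note most similar to the task."""
    query = embed(task)
    return max(DOMAIN_NOTES, key=lambda name: cosine(query, embed(DOMAIN_NOTES[name])))


def run_agent(task):
    """Select a tool using the retrieved note, then call its stub."""
    tool_name = retrieve(task)
    return tool_name, TOOLS[tool_name](task)


if __name__ == "__main__":
    name, result = run_agent(
        "Which genes show differential expression between tumour and normal RNA-seq samples?"
    )
    print(name)
    print(result)
```

In this sketch the retrieval step alone picks the tool. In an LLM-driven agent, the retrieved note would more plausibly be added to the model's context so that the model plans the workflow with that domain guidance available.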

Paper