BioInsight: Multi-Agent Orchestration for Interactive Biomedical Knowledge Discovery
Authors: Jieyi Wang, Bingxuan Li, Nanyi Jiang, Desong Meng, Zirui Fan, Yuxin Guo, Jiayu Liu, Kunlun Zhu, Eddie Yang, Xiusi Chen, Pan Lu, Bingxin Zhao
arXiv ID: 2606.20997
Problem: Static AI-generated biomedical reports are insufficient for research decision-making because researchers cannot inspect evidence, assess uncertainty, compare mechanisms, or refine hypotheses.
Key Methodology:
- Multi-agent harness (Search → Reasoning → Writing → Visualization) with typed intermediate artifact contracts: ranked pathways, evidence packets, reasoning notes, a citation-grounded report, a dashboard schema, and a rendered interactive interface
- Separates evidence retrieval from mechanistic reasoning: pathways are planned via g:Profiler enrichment, publications retrieved from PubMed and Semantic Scholar are scored with a weighted formula, $S_{pub} = \text{clip}(0.45K + 0.35E + 0.20C \cdot J, 0, 2.2)$, and a separate Reasoning Agent then works over the fixed evidence set
- Converts the same structured evidence artifacts into both a narrative report and an interactive evidence dashboard with graph views linking pathways, proteins, publications, and PPIs
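The publication score above can be sketched in a few lines of Python. This is an illustrative reading, not the authors' implementation: the summary does not define $K$, $E$, $C$, or $J$, so they are treated here as opaque numeric inputs, and the term $0.20C \cdot J$ is assumed to mean $0.20 \times C \times J$ under standard operator precedence. The function names and the candidate values are hypothetical.

```python
def clip(x: float, lo: float, hi: float) -> float:
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def publication_score(k: float, e: float, c: float, j: float) -> float:
    """S_pub = clip(0.45*K + 0.35*E + 0.20*C*J, 0, 2.2).

    K, E, C, J are not defined in this summary; they are passed through
    as plain numbers. Reading 0.20C * J as 0.20 * C * J is an assumption.
    """
    return clip(0.45 * k + 0.35 * e + 0.20 * c * j, 0.0, 2.2)


def rank_publications(candidates: dict) -> list:
    """Return (publication_id, score) pairs sorted by descending score.

    `candidates` maps a publication id to a (K, E, C, J) tuple.
    """
    scored = [(pub_id, publication_score(*feats)) for pub_id, feats in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up feature values, used only to exercise the formula.
    demo = {"pub-a": (0.2, 0.1, 0.0, 1.0), "pub-b": (1.0, 1.0, 1.0, 1.0)}
    for pub_id, score in rank_publications(demo):
        print(pub_id, round(score, 3))
```

The clip bounds keep every score in $[0, 2.2]$, so a single very large component cannot dominate the ranking beyond the cap.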
Key Results:
- BioASQ Phase B: Best or tied-best across all 5 exact-answer metrics, with gains of +1.9 points in factoid MRR and +4.4 points in list F-measure over the strongest baseline
- BioInsight-100 (challenging protein-function reasoning): Highest mean expert score (8.62/10), with scores concentrated in the high range, indicating more stable reasoning
- End-to-end disease interpretation (5 diseases): Strongest expert ratings for evidence traceability, ranking/prioritization, and dashboard usability; the w/o Search ablation showed that explicit search decomposition is critical for grounding and prioritization
Applied Context: Builders can apply this artifact-contract pattern to domain-specific research copilots where evidence provenance, inspectability, and interactivity matter more than fluent text. The architecture separates retrieval, reasoning, writing, and visualization into typed, auditable stages that preserve the full evidence chain.
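One way to picture the artifact-contract pattern is as a chain of functions whose inputs and outputs are frozen, typed records. The sketch below is a minimal illustration under stated assumptions, not BioInsight's code: every class, function, and field name is hypothetical, the stages are stubs with placeholder data, and no real retrieval or model call is made.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class EvidencePacket:  # hypothetical contract: Search -> Reasoning
    pathway: str
    citation_ids: tuple[str, ...]


@dataclass(frozen=True)
class ReasoningNote:  # hypothetical contract: Reasoning -> Writing
    pathway: str
    claim: str
    supporting_ids: tuple[str, ...]


@dataclass(frozen=True)
class Report:  # hypothetical contract: Writing -> reader
    sentences: tuple[str, ...]
    cited_ids: frozenset[str]


def search_stage(query: str) -> list[EvidencePacket]:
    # Stub with placeholder ids; a real stage would query retrieval services.
    return [
        EvidencePacket("pathway-A", ("doc-1", "doc-2")),
        EvidencePacket("pathway-B", ("doc-3",)),
    ]


def reasoning_stage(packets: list[EvidencePacket]) -> list[ReasoningNote]:
    # Works only over the fixed evidence set it is handed; adds no new sources.
    return [
        ReasoningNote(p.pathway, f"{p.pathway} is supported by retrieved evidence", p.citation_ids)
        for p in packets
    ]


def writing_stage(notes: list[ReasoningNote]) -> Report:
    # Each sentence carries the ids that back it, so the text stays citation-grounded.
    sentences = tuple(f"{n.claim} [{', '.join(n.supporting_ids)}]." for n in notes)
    cited = frozenset(i for n in notes for i in n.supporting_ids)
    return Report(sentences, cited)


def dashboard_stage(packets: list[EvidencePacket]) -> dict:
    # Emits a simple graph schema linking pathways to their publications.
    nodes = sorted({p.pathway for p in packets} | {i for p in packets for i in p.citation_ids})
    edges = [(p.pathway, i) for p in packets for i in p.citation_ids]
    return {"nodes": nodes, "edges": edges}


def check_provenance(packets: list[EvidencePacket], report: Report) -> None:
    # Audit step: every citation in the report must trace back to a packet.
    allowed = {i for p in packets for i in p.citation_ids}
    unknown = report.cited_ids - allowed
    if unknown:
        raise ValueError(f"report cites ids missing from evidence: {sorted(unknown)}")


if __name__ == "__main__":
    packets = search_stage("example disease query")
    report = writing_stage(reasoning_stage(packets))
    check_provenance(packets, report)
    print(dashboard_stage(packets))
```

Because the report and the dashboard are both derived from the same packets, a provenance check like `check_provenance` can run at the boundary between stages, which is the auditing property the pattern is meant to provide.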