6 posts tagged with "epigraphdb"

MR-KG: A Knowledge Graph of Mendelian Randomization Evidence Powered by Large Language Models


📌 Background

Mendelian randomization (MR) is a powerful causal inference method that uses genetic variants as natural experiments to assess causal relationships between putative risk factors and disease outcomes. MR studies are increasingly abundant, but synthesising evidence across them remains challenging due to heterogeneity in reporting, traits examined, and the structure of the published literature.

To address this, Liu, Burton, Gatua, Hemani & Gaunt (2025) introduce MR-KG — a knowledge graph of MR evidence automatically extracted from published studies using large language models (LLMs).

Liu et al. "MR-KG: A knowledge graph of Mendelian randomization evidence powered by large language models". 2025, medRxiv DOI:10.64898/2025.12.14.25342218

Integrating Mendelian randomization and literature mining to map breast cancer risk factors


Illustration of integrating MR and literature-mined evidence to identify breast cancer risk pathways.

Breast cancer research spans epidemiology, molecular biology, clinical trials, and a vast and rapidly growing literature. One challenge is triangulating across these evidence types: when different sources point in the same direction, we can be more confident we are seeing something causal rather than correlational.

In a paper led by Marina Vabistsevits published in the Journal of Biomedical Informatics, we show how to bring two complementary sources together:

  1. Mendelian randomization (MR) evidence generated at scale using MR-EvE (“Everything-vs-Everything”), and
  2. Literature-mined relationships stored in EpiGraphDB, our biomedical knowledge graph.
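As a rough illustration of the triangulation idea (not the method used in the paper), the sketch below keeps only exposure-outcome pairs for which an MR record and a literature-mined record both exist and agree on the direction of effect. The `Evidence` class, the `triangulate` function, and the placeholder trait names are all hypothetical and are not part of MR-EvE or EpiGraphDB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """One directional claim from a single evidence source (hypothetical structure)."""
    exposure: str
    outcome: str
    direction: int  # +1: exposure raises the outcome, -1: exposure lowers it


def triangulate(mr, literature):
    """Return (exposure, outcome) pairs present in both sources with matching direction."""
    # Index literature claims by (exposure, outcome) for constant-time lookup.
    lit_index = {(e.exposure, e.outcome): e.direction for e in literature}
    agreeing = []
    for e in mr:
        key = (e.exposure, e.outcome)
        # A pair missing from the literature yields None, which never equals +1 or -1.
        if lit_index.get(key) == e.direction:
            agreeing.append(key)
    return agreeing


# Placeholder records for illustration only; these are not real findings.
mr_evidence = [
    Evidence("trait_A", "outcome_X", +1),
    Evidence("trait_B", "outcome_X", -1),
    Evidence("trait_C", "outcome_X", +1),
]
lit_evidence = [
    Evidence("trait_A", "outcome_X", +1),  # agrees with MR
    Evidence("trait_B", "outcome_X", +1),  # disagrees with MR
]

supported = triangulate(mr_evidence, lit_evidence)
print(supported)
```

A real analysis would also need to harmonise trait names across sources and account for the strength of each piece of evidence, which this sketch ignores.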

Triangulating evidence in health sciences with Annotated Semantic Queries


Update: The ASQ work has now been published in Bioinformatics.

Yi Liu, Tom R Gaunt, Triangulating evidence in health sciences with Annotated Semantic Queries, Bioinformatics, Volume 40, Issue 9, September 2024, btae519, https://doi.org/10.1093/bioinformatics/btae519

Overview

Integrating information from data sources representing different study designs has the potential to strengthen evidence in population health research. However, this concept of evidence “triangulation” presents a number of challenges for systematically identifying and integrating relevant information.

In this medRxiv preprint we present ASQ (Annotated Semantic Queries), a natural language query interface to the integrated biomedical entities and epidemiological evidence in EpiGraphDB. ASQ enables users to extract “claims” from a piece of unstructured text and then investigate evidence that could support or contradict those claims, or offer additional information relevant to the query.

The ASQ approach has the potential to support the rapid review of pre-prints, grant applications, conference abstracts and articles submitted for peer review. ASQ implements strategies to harmonize biomedical entities in different taxonomies and evidence from different sources, to facilitate evidence triangulation and interpretation.

ASQ is openly available at https://asq.epigraphdb.org.

Systematic comparison of Mendelian randomization studies and randomized controlled trials using electronic databases


Overview

Mendelian Randomization (MR) uses genetic instrumental variables to make causal inferences. Whilst sometimes referred to as “nature’s randomized trial”, it has distinct assumptions that make comparing the results of MR studies with those of actual randomized controlled trials (RCTs) invaluable.

Evaluating the potential benefits and pitfalls of combining protein and expression quantitative trait loci in evidencing drug targets


Overview

Molecular quantitative trait loci (molQTL), which can provide functional evidence on the mechanisms underlying phenotype-genotype associations, are increasingly used in drug target validation and safety assessment. In particular, protein abundance QTLs (pQTLs) and gene expression QTLs (eQTLs) are the most commonly used for this purpose. However, questions remain on how to best consolidate results from pQTLs and eQTLs for target validation.

EpiGraphDB platform version 1.0


EpiGraphDB version 1.0

The EpiGraphDB platform has been updated with a new major release (version 1.0). This is the first release since version 0.3 in 2020 (what a year!), and the first since the publication of the journal article in Bioinformatics. We believe the underlying integration pipeline, data structure and architecture of the EpiGraphDB platform have now progressed sufficiently towards a stable state, and we are pleased to announce this major release as version 1.0!