DMER Group - News

Latest posts

Pilot analysis on BioRxiv and MedRxiv full text data to facilitate comprehensive data mining on biomedical literature

We have curated full-text archives of BioRxiv and MedRxiv preprints and carried out exploratory analyses to inform our next-stage research projects on automated text mining of the biomedical literature.

Posted by Yi Liu on Aug 21, 2023

seedcorn funding text mining NLP

Proteome-wide Mendelian randomization in global biobank to identify multi-ancestry drug targets

PhD student Huiling Zhao and co-supervisor Dr Jie (Chris) Zheng published an interesting cross-ancestry MR analysis of potential drug targets in collaboration with the Global Biobank Meta-analysis Initiative.

Posted by Tom Gaunt on Nov 1, 2022

drug-targets mr colocalization

New funding: NIHR Bristol Biomedical Research Centre

The National Institute for Health and Care Research Bristol Biomedical Research Centre (NIHR Bristol BRC) has been awarded nearly £12 million of new funding for the next five years. The DMER programme is linked to the Translational Data Science theme of the BRC, providing a mechanism for translation of our methodological research, software tools and data resources.

Posted by Tom Gaunt on Oct 14, 2022

drug-targets mr funding

Systematic comparison of Mendelian randomization studies and randomized controlled trials using electronic databases

Triangulating results between Mendelian randomization studies and randomized controlled trials has the potential to strengthen evidence for an intervention target. In this work, led by Maria Sobczyk, we mined the ClinicalTrials.gov, PubMed and EpiGraphDB databases and carried out a series of 26 manual literature comparisons among 54 MR and 77 RCT publications to explore the potential for systematic triangulation.

Posted by Tom Gaunt on Apr 16, 2022

database epigraphdb MR NLP

Triangulating evidence in health sciences with Annotated Semantic Queries

Integrating information from data sources representing different study designs has the potential to strengthen evidence in population health research. In this work, led by Yi Liu, we present ASQ (Annotated Semantic Queries), a natural language query interface to EpiGraphDB, which enables users to annotate “claims” from a piece of unstructured text with evidence relevant to the claim.

Posted by Tom Gaunt on Apr 16, 2022

database EpiGraphDB MR NLP software

Evaluating the potential benefits and pitfalls of combining protein and expression quantitative trait loci in evidencing drug targets

Molecular quantitative trait loci (molQTL), which can provide functional evidence on the mechanisms underlying phenotype-genotype associations, are increasingly used in drug target validation and safety assessment. In this work, led by Jamie Robinson, we evaluate the differences between expression and protein QTL and explore the possible reasons for apparently contradictory effects of genetic variants.

Posted by Tom Gaunt on Mar 17, 2022

database epigraphdb MR NLP

Senior Research Associate / Research Fellow in Health Data Science

We are seeking a talented postdoctoral scientist with expertise in biomedical data integration and analysis, data mining and causal inference.

Posted by Tom Gaunt on Jan 12, 2022

jobs health data science work with us

Trans-ethnic Mendelian-randomization study reveals causal relationships between cardiometabolic factors and chronic kidney disease

This paper, led by Jie Zheng, systematically analysed previously reported risk factors for chronic kidney disease in European and East Asian populations using Mendelian randomization. The analysis showed evidence of both cross-population and population-specific risk factors.

Posted by Tom Gaunt on Oct 20, 2021

MR CKD

EpiGraphDB platform version 1.0

EpiGraphDB v1.0 and summary of features and changes.

Posted by Yi Liu on Mar 22, 2021

EpiGraphDB software database

MendelVar: gene prioritization at GWAS loci using phenotypic enrichment of Mendelian disease genes

This paper, led by Maria Sobczyk, presented MendelVar, a tool which integrates knowledge from four databases on Mendelian disease genes with enrichment testing for a range of functional annotations to support the prioritization of genes at GWAS loci.

Posted by Tom Gaunt on Jan 1, 2021

database GWAS software

Neo4j data integration pipeline

We make extensive use of Neo4j for graph databases (including EpiGraphDB). One of the key challenges in constructing a heterogeneous graph database is integrating data from different sources. Ben Elsworth describes the pipeline he has developed to automate this process.

Posted by Ben Elsworth on Nov 17, 2020

database Neo4J data integration software

Reducing drug development costs

Explaining our work in a way that is accessible to a wide audience is often challenging. Here we summarise some of our approaches to drug target prioritization in a short animation.

Posted by Tom Gaunt on Nov 8, 2020

drug targets video MR colocalization

Visualising Brexit’s Impact on Food Safety in Britain

PhD students Marina Vabistsevits and Ollie Lloyd entered the Jean Golding Institute data visualization competition on food hazards from around the world. Here they present their visualizations and interpretation, which won them a runner-up prize.

Posted by Marina Vabistsevits and Ollie Lloyd on Oct 6, 2020

data visualization data science

Drug target prioritization using protein QTL

Much of our recent research has focused on drug target prioritization using Mendelian randomization and genetic colocalization. Here we introduce Jie (Chris) Zheng’s Nature Genetics paper, which describes our systematic analysis of the plasma proteome, part of an ongoing collaboration with pharma partners.

Posted by Tom Gaunt on Sep 7, 2020

drug targets papers MR colocalization

Exploring Elasticsearch architectures with Oracle Cloud

The IEU OpenGWAS database contains well over 100 billion rows of data on genetic associations. Ben Elsworth describes his work on implementing a cloud-based Elasticsearch database on Oracle Cloud Infrastructure that can handle millions of queries per week.

Posted by Ben Elsworth on Apr 16, 2019

database Elasticsearch OpenGWAS cloud software

Indexing 200 billion records in 2 days

A few years ago we started collecting genome-wide association study datasets and making them available to the research community. As the data grew from tens of millions to tens of billions of rows, we found that a MySQL database was no longer sufficient. Ben Elsworth describes how he implemented an Elasticsearch solution to the challenge of querying a very large dataset.

Posted by Ben Elsworth on Jan 24, 2019

database Elasticsearch OpenGWAS software
