Latest posts
-
We have curated the full-text archives of bioRxiv and medRxiv preprints and conducted exploratory analyses to prepare for our next-stage research projects on automated text mining of biomedical literature.
seedcorn funding text mining NLP
-
Proteome-wide Mendelian randomization in global biobank to identify multi-ancestry drug targets
PhD student Huiling Zhao and co-supervisor Dr Jie (Chris) Zheng published an interesting cross-ancestry MR analysis of potential drug targets in collaboration with the Global Biobank Meta-analysis Initiative.
drug-targets mr colocalization
-
New funding: NIHR Bristol Biomedical Research Centre
The National Institute for Health and Care Research Bristol Biomedical Research Centre (NIHR Bristol BRC) has been awarded nearly £12 million of new funding for the next five years. The DMER programme is linked to the Translational Data Science theme of the BRC, providing a mechanism for translation of our methodological research, software tools and data resources.
drug-targets mr funding
-
Triangulating results between Mendelian randomization studies and randomized controlled trials has the potential to strengthen evidence for an intervention target. In this work, led by Maria Sobczyk, we mined the ClinicalTrials.gov, PubMed and EpiGraphDB databases and carried out a series of 26 manual literature comparisons among 54 MR and 77 RCT publications to explore the potential for systematic triangulation.
database epigraphdb MR NLP
-
Triangulating evidence in health sciences with Annotated Semantic Queries
Integrating information from data sources representing different study designs has the potential to strengthen evidence in population health research. In this work, led by Yi Liu, we present ASQ (Annotated Semantic Queries), a natural language query interface to EpiGraphDB, which enables users to annotate “claims” from a piece of unstructured text with evidence relevant to the claim.
database EpiGraphDB MR NLP software
-
Molecular quantitative trait loci (molQTL), which can provide functional evidence on the mechanisms underlying phenotype-genotype associations, are increasingly used in drug target validation and safety assessment. In this work, led by Jamie Robinson, we evaluate the differences between expression and protein QTL and explore the possible reasons for apparently contradictory effects of genetic variants.
database epigraphdb MR NLP
-
Senior Research Associate / Research Fellow in Health Data Science
We are seeking a talented postdoctoral scientist with expertise in biomedical data integration and analysis, data mining and causal inference.
jobs health data science work with us
-
This paper, led by Jie Zheng, systematically analysed previously reported risk factors for chronic kidney disease in European and East Asian populations using Mendelian randomization. The analysis showed evidence of both cross-population and population-specific risk factors.
MR CKD
-
EpiGraphDB platform version 1.0
EpiGraphDB v1.0: a summary of features and changes.
EpiGraphDB software database
-
MendelVar: gene prioritization at GWAS loci using phenotypic enrichment of Mendelian disease genes
This paper, led by Maria Sobczyk, presented MendelVar, a tool which integrates knowledge from four databases on Mendelian disease genes with enrichment testing for a range of functional annotations to support the prioritization of genes at GWAS loci.
database GWAS software
-
Neo4J data integration pipeline
We make extensive use of Neo4J for graph databases (including EpiGraphDB). One of the key challenges in constructing a heterogeneous graph database is the data integration from different sources. Ben Elsworth describes the pipeline he has developed to automate this process.
database Neo4J data integration software
-
Reducing drug development costs
Explaining our work in a way that is accessible to a wide audience is often challenging. Here we summarise some of our approaches to drug target prioritization in a short animation.
drug targets video MR colocalization
-
Visualising Brexit’s Impact on Food Safety in Britain
PhD students Marina Vabistsevits and Ollie Lloyd entered the Jean Golding Institute data visualization competition on food hazards from around the world. Here they present their visualizations and interpretation, which won them a runner-up prize.
data visualization data science
-
Drug target prioritization using protein QTL
Much of our recent research has focused on drug target prioritization using Mendelian randomization and genetic colocalization. Here we introduce Jie (Chris) Zheng’s Nature Genetics paper, which describes our systematic analysis of the plasma proteome, part of an ongoing collaboration with pharma partners.
drug targets papers MR colocalization
-
Exploring Elasticsearch architectures with Oracle Cloud
The IEU OpenGWAS database contains well over 100 billion rows of data on genetic associations. Ben Elsworth describes his work on implementing a cloud-based Elasticsearch database on the Oracle Cloud Infrastructure that can handle millions of queries per week.
database Elasticsearch OpenGWAS cloud software
-
Indexing 200 billion records in 2 days
A few years ago we started collecting genome-wide association study datasets and making them available to the research community. As the data grew from tens of millions to tens of billions of rows, we found that a MySQL database was no longer sufficient. Ben Elsworth describes how he implemented an Elasticsearch solution to the challenge of querying a very large dataset.
database Elasticsearch OpenGWAS software