7 posts tagged with "software"

DrivR-Base: a feature extraction toolkit for variant effect prediction


Understanding which genetic variants are likely to be functional (and which are probably benign) is a cornerstone of modern human genetics. Over the last decade, variant-effect predictors have become increasingly sophisticated — but behind every model sits the same practical headache: assembling a sensible set of features (annotations) for millions of variants from dozens of databases.

In a 2024 Bioinformatics paper led by Amy Francis, we introduce DrivR-Base, a reproducible, Dockerised toolkit that turns this feature-extraction step into something you can run and re-run with far less pain.

Triangulating evidence in health sciences with Annotated Semantic Queries


Update: The ASQ work has now been published in Bioinformatics.

Yi Liu, Tom R Gaunt, Triangulating evidence in health sciences with Annotated Semantic Queries, Bioinformatics, Volume 40, Issue 9, September 2024, btae519, https://doi.org/10.1093/bioinformatics/btae519

Overview

Integrating information from data sources representing different study designs has the potential to strengthen evidence in population health research. However, this concept of evidence “triangulation” presents a number of challenges for systematically identifying and integrating relevant information.

In this medRxiv preprint we present ASQ (Annotated Semantic Queries), a natural language query interface to the integrated biomedical entities and epidemiological evidence in EpiGraphDB. ASQ enables users to extract “claims” from a piece of unstructured text, and then investigate the evidence that could support or contradict those claims, or offer additional information relevant to the query.

The ASQ approach has the potential to support the rapid review of pre-prints, grant applications, conference abstracts and articles submitted for peer review. ASQ implements strategies to harmonize biomedical entities in different taxonomies and evidence from different sources, to facilitate evidence triangulation and interpretation.

ASQ is openly available at https://asq.epigraphdb.org.

EpiGraphDB platform version 1.0


EpiGraphDB version 1.0

The EpiGraphDB platform has been updated with a new major release (version 1.0). This is the first release since version 0.3 in 2020 (what a year!), and the first since the publication of the journal article in Bioinformatics. We believe the underlying integration pipeline, data structure and architecture of the EpiGraphDB platform have now progressed to a sufficiently stable state, and we are pleased to announce this major release as version 1.0!

Neo4j data integration pipeline


Background

We’ve been using Neo4j for around five years in a variety of projects, sometimes as the main database (MELODI) and sometimes as part of a larger platform (OpenGWAS). We find creating queries with Cypher intuitive and query performance to be good. However, integrating data into a graph is still a challenge, especially when combining large amounts of data from a variety of sources. Our latest project, EpiGraphDB, uses data from over 20 independent sources, most of which require cleaning and QC before they can be incorporated. In addition, each build of the graph needs to contain information on the versions of the data, the schema of the graph and so on.
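
To make the last point concrete, here is a minimal Python sketch of the kind of metadata a graph build might carry: which sources went in, at which version, whether they have been through cleaning/QC, and what the schema looks like. This is an illustration only, not the actual pipeline code. The class names, field names and example source names (`SourceRecord`, `BuildMetadata`, `source_a`, `source_b`) are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import date


@dataclass
class SourceRecord:
    """One upstream data source included in a build (hypothetical structure)."""
    name: str
    version: str
    cleaned: bool = False  # True once cleaning/QC has been run on this source


@dataclass
class BuildMetadata:
    """Metadata kept alongside one build of the graph (hypothetical structure)."""
    build_date: str
    sources: list = field(default_factory=list)
    schema: dict = field(default_factory=dict)  # node label -> property names

    def add_source(self, name, version, cleaned=False):
        self.sources.append(SourceRecord(name, version, cleaned))

    def unchecked_sources(self):
        """Names of sources that have not yet passed cleaning/QC."""
        return [s.name for s in self.sources if not s.cleaned]

    def to_json(self):
        """Serialise the record so it can be stored with the build."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Example with placeholder values (not real EpiGraphDB sources or versions).
meta = BuildMetadata(build_date=date(2021, 1, 1).isoformat())
meta.add_source("source_a", "2020-10", cleaned=True)
meta.add_source("source_b", "v2")
meta.schema["Gene"] = ["id", "name"]

print(meta.unchecked_sources())
print(meta.to_json())
```

Keeping this record as plain serialisable data means it can be written next to the build, or loaded into the graph itself as a metadata node, so that any query result can be traced back to specific source versions.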