Skip to main content

One post tagged with "data integration"

View All Tags

Neo4J data integration pipeline


Background

We’ve been using Neo4j for around five years in a variety of projects, sometimes as the main database MELODI and sometimes as part of a larger platform (OpenGWAS). We find creating queries with Cypher intuitive and query performance to be good. However, the integration of data into a graph is still a challenge, especially when using many data from a variety of sources. Our latest project EpiGraphDB uses data from over 20 independent sources, most of which require cleaning and QC before they can be incorporated. In addition, each build of the graph needs to contain information on the versions of data, the schema of the graph and so on.