Find out how Via Foundry’s metadata capabilities bring fragmented information together for AI excellence
In a perfect world, our data would flow seamlessly from a single source, uniform and orderly. However, the reality of scientific research is often far from this ideal. Data is fragmented, scattered across various sources, creating silos that hinder efficient analysis and integration.
This fragmentation becomes a significant challenge, especially in the era of artificial intelligence (AI), where unified data is crucial for deriving accurate and actionable insights.
At UMass Chan Medical School, we recognized this challenge (alongside many other challenges!) and developed DolphinNext, a groundbreaking solution designed to address the complexities of multi-omics research.
This innovation evolved into Via Foundry, a platform now commercialized by Via Scientific, Inc., our new Cambridge-based technology and AI company.
Via Foundry represents a leap forward in data integration and analysis, offering a robust metadata system that eliminates the barriers posed by fragmented data.
From Fragmentation to Unity
The heart of the issue lies in the nature of scientific data. We collect information from diverse sources, such as healthcare records, environmental measurements, sensor data, and more.
Each dataset often comes with its own format and structure, which makes bringing everything together a daunting task. This is where Via Foundry shines, providing a solution that not only unifies data but also enhances its usability through comprehensive metadata capabilities.
Via Foundry allows for normalized data accumulation and the merging of this data with detailed metadata. Metadata, often described as “data about data,” plays a crucial role in this process. It includes provenance data, which traces the origin and history of each dataset, ensuring transparency and reproducibility.
But Via Foundry goes further, allowing us to incorporate real-world dimensions into our metadata. This means that we can enrich our datasets with additional context, such as patient demographics, environmental conditions, and other relevant factors, either at the time of data collection or retrospectively.
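To make the idea of provenance plus real-world context concrete, here is a minimal sketch in Python. It is not Via Foundry's API: the `DatasetRecord` class, its field names, and the sample values are all hypothetical, chosen only to show how a record can carry its history and be enriched after the fact.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical record: measured values plus metadata about them."""
    sample_id: str
    values: dict
    provenance: list = field(default_factory=list)  # ordered history of the data
    context: dict = field(default_factory=dict)     # real-world dimensions

    def add_provenance(self, step: str) -> None:
        # Append a step so the origin and history stay traceable.
        self.provenance.append(step)

    def enrich(self, **context) -> None:
        # Context can be added at collection time or retrospectively.
        self.context.update(context)


record = DatasetRecord("S1", {"gene_a": 12.5})
record.add_provenance("collected at clinic A")
record.add_provenance("normalized to counts per million")

# Later, enrich the same record with context that was not captured up front.
record.enrich(age_group="40-49", air_quality_index=42)
```

Keeping provenance as an ordered list and context as a free-form dictionary is one simple design choice; a production system would more likely use a controlled vocabulary for both.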
Event-Based Hierarchical Metadata Tracking
Via Foundry’s innovative approach includes event-based hierarchical metadata tracking, allowing different types of scientists to enter data based on their roles.
This ensures that data is added by the individuals who collected it. For example, nurses can access forms for patient entry and visit records to enter patient data. Scientists involved in library preparation can access anonymized patient information to prepare bio samples and derive samples for sequencing experiments like RNA-Seq or ATAC-Seq.
Furthermore, events can be organized hierarchically, requiring users to first select a parent collection before entering information for an associated child collection.
In a clinical project, two collections might be defined: Patients and Patient Visits, with Patients as the parent collection. When a new patient arrives at the clinic, users need to submit entries to both the Patients and Patient Visits collections.
An event called “New Patient” would allow the submission of a patient record, while another event called “New Patient Visit” would enable the submission of the visit date once the associated patient is selected from a dropdown menu.
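The parent and child relationship described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Via Foundry is implemented: the `MetadataStore` class, its methods, and the patient and visit identifiers are hypothetical. Only the collection and event names come from the example in the text.

```python
class MetadataStore:
    """Toy in-memory store for hierarchical, event-driven metadata entry."""

    def __init__(self):
        self.collections = {}  # collection name -> list of entries
        self.parents = {}      # collection name -> parent collection (or None)
        self.events = {}       # event name -> target collection

    def define_collection(self, name, parent=None):
        if parent is not None and parent not in self.collections:
            raise KeyError(f"unknown parent collection: {parent}")
        self.collections[name] = []
        self.parents[name] = parent

    def define_event(self, event, collection):
        if collection not in self.collections:
            raise KeyError(f"unknown collection: {collection}")
        self.events[event] = collection

    def submit(self, event, entry, parent_id=None):
        collection = self.events[event]
        parent = self.parents[collection]
        if parent is not None:
            # A child entry is only accepted once an existing parent is selected.
            known_ids = {e["id"] for e in self.collections[parent]}
            if parent_id not in known_ids:
                raise ValueError(f"no entry {parent_id!r} in {parent}")
            entry = {**entry, "parent_id": parent_id}
        self.collections[collection].append(entry)
        return entry


store = MetadataStore()
store.define_collection("Patients")
store.define_collection("Patient Visits", parent="Patients")
store.define_event("New Patient", "Patients")
store.define_event("New Patient Visit", "Patient Visits")

patient = store.submit("New Patient", {"id": "P001"})
visit = store.submit(
    "New Patient Visit",
    {"id": "V001", "visit_date": "2024-05-01"},
    parent_id="P001",
)
```

Submitting a visit for a patient who has not been registered raises a `ValueError`, which plays the role of the dropdown that only lists existing patients.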
Template-Based Metadata Tracking
Integrating or fusing data so that AI algorithms can run on it requires that projects share certain metadata fields. This overlap ensures compatibility and improves the quality of the resulting analysis.
Key metadata fields such as sample identifiers, experimental conditions, collection dates, and methodologies must be consistently captured across different datasets.
When preparing data for submission to repositories like NCBI, ENCODE, or other organizations, specific metadata fields are mandatory. These fields often include detailed descriptions of the sample type, source organism, experimental protocols, sequencing methods, and quality control measures.
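A template can be thought of as a list of fields that every record must fill in before submission. The sketch below assumes a hypothetical template; the field names are illustrative and are not the actual required fields of NCBI, ENCODE, or Via Foundry.

```python
# Hypothetical template: fields every record must provide.
REQUIRED_FIELDS = (
    "sample_id",
    "sample_type",
    "source_organism",
    "sequencing_method",
    "collection_date",
)


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty, in template order."""
    return [name for name in required if record.get(name) in (None, "")]


complete = {
    "sample_id": "S1",
    "sample_type": "whole blood",
    "source_organism": "Homo sapiens",
    "sequencing_method": "RNA-Seq",
    "collection_date": "2024-05-01",
}
incomplete = {
    "sample_id": "S2",
    "sample_type": "whole blood",
    "source_organism": "",
    "sequencing_method": "ATAC-Seq",
}

problems = missing_fields(incomplete)
```

Running a check like this at entry time, rather than at submission time, means gaps are caught while the person who collected the sample can still fill them in.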
For companies collecting diverse experimental data, it’s crucial to harmonize these datasets in a structured format. This harmonization enables machine learning algorithms to effectively learn from the combined data, providing more accurate and insightful predictions.
By making sure metadata is consistent and complete, researchers can integrate data more reliably and get more out of AI-driven analyses across biological and clinical research projects.
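Harmonization often comes down to mapping each source's field names onto one shared schema. Here is a minimal sketch, assuming two hypothetical labs that label the same information differently; the alias table and sample records are invented for illustration.

```python
# Hypothetical alias table: source-specific field name -> shared field name.
FIELD_ALIASES = {
    "SampleID": "sample_id",
    "sample": "sample_id",
    "species": "source_organism",
    "organism": "source_organism",
}


def harmonize(record, aliases=FIELD_ALIASES):
    """Rename known aliases to the shared schema; keep other fields unchanged."""
    return {aliases.get(key, key): value for key, value in record.items()}


lab_a = [{"SampleID": "A1", "species": "Homo sapiens"}]
lab_b = [{"sample": "B7", "organism": "Mus musculus"}]

# After harmonization, both labs' records share one set of field names.
combined = [harmonize(record) for record in lab_a + lab_b]
```

Once every record uses the same field names, the combined list can be passed to downstream analysis or machine learning code without per-source special cases.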
Metadata in the AI Era
In the AI era, the importance of unified data cannot be overstated. AI algorithms thrive on large, coherent datasets. They require a rich, interconnected flow of information to generate reliable models and predictions.
Fragmented data, with its inconsistencies and gaps, poses a significant obstacle to AI’s potential. Via Foundry’s metadata capabilities are designed to bridge this gap, ensuring that data is not only integrated but also enriched with context that enhances its value for AI applications.
Imagine a scenario where we are studying the effects of environmental factors on cardiovascular health. Traditionally, this would involve painstakingly collecting and aligning data from various sources, each with its unique format.
With Via Foundry, this process is streamlined. The platform’s drag-and-drop pipeline and metadata-building functionality make it easy to integrate and annotate data, allowing us to focus on analysis and discovery rather than data wrangling.
A Solution Born from Expertise
Via Foundry grew out of the needs of science and the expertise and vision of its founders. Originally developed over a decade at UMass Chan Medical School by our team of scientists, including Melissa J. Moore, PhD, Manuel Garber, PhD, Jim Crowley, Janet Kosloff, and myself, the platform has evolved to meet the growing demands of multi-omics research.
Today, Via Foundry automates complex data and analytical tasks, eliminating the need for coding and making advanced analytics accessible to a broader range of researchers.
Via Scientific, the company behind Via Foundry, aims to support biotech and pharmaceutical companies, research institutes, and universities in their quest for scientific breakthroughs. We’re opening up a new era of data-driven discovery by boosting the AI capabilities of the Foundry platform.
Conclusion
In the fragmented landscape of scientific data, Via Foundry excels in integration and innovation. Its powerful metadata capabilities not only unify data but also enrich it with meaningful context, allowing AI to reveal new insights and drive scientific progress.
As we continue to push the boundaries of what is possible with AI, platforms like Via Foundry will only become more important. They provide the foundation upon which the future of scientific research is built, turning the complex puzzle of data into a coherent picture of discovery and advancement.