Research & Development
Gene model data management
This is Phase II of an SBIR project by Geospiza and The HDF Group; Phase II did not receive funding. In Phase I, we developed a prototype called BioHDF that demonstrated the power of HDF5 for managing high-volume, highly complex DNA sequencing data. In the Phase II research project, Geospiza and The HDF Group proposed to develop a portable, scalable, and adaptable file technology for genotyping software. According to the proposal, the project has the following aims:
The first products, based on BioHDF, will provide data models, APIs, software tools (I/O, algorithms), and a viewer based on HDFView to support DNA polymorphism discovery and genotyping. Using BioHDF, researchers will be able to perform resequencing-based SNP discovery, analyze genotyping data, and export data sets in formats ready for submission to key databases. As a programming environment, BioHDF will be easy to extend to accept data from new genotyping platforms and to format data for interchange with many databases.
Additionally, BioHDF will support studying and performing linkage disequilibrium (LD) calculations on very large data sets such as HapMap. BioHDF will be delivered to the research community as an open source technology. During Phase II and in Geospiza's follow-on Phase III efforts, Geospiza will use BioHDF to deliver scalable software applications that support clinical research, diagnostics, and bio-manufacturing processes that use genetic data for personalized therapeutics.
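
To make the linkage disequilibrium (LD) calculations mentioned above concrete, the following is a minimal sketch of one common pairwise LD statistic, r², computed from phased haplotypes at two biallelic SNPs. It is not taken from the proposal or from BioHDF: the function name `ld_r_squared` and the 0/1 allele encoding are assumptions made for illustration, and the data are held in plain Python lists rather than read from an HDF5 file.

```python
from typing import Sequence, Tuple


def ld_r_squared(haplotypes: Sequence[Tuple[int, int]]) -> float:
    """Return r^2 between two biallelic SNPs (hypothetical helper).

    Each haplotype is a pair (a, b), where a and b are 0 or 1 and give
    the allele carried at the first and second SNP respectively.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")

    # Frequency of allele 1 at each SNP, and of the 1-1 haplotype.
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    # D measures how far the observed haplotype frequency departs from
    # what independence between the two SNPs would predict.
    d = p_ab - p_a * p_b

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


if __name__ == "__main__":
    linked = [(1, 1), (1, 1), (0, 0), (0, 0)]
    unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]
    print(ld_r_squared(linked), ld_r_squared(unlinked))
```

For a data set on the scale of HapMap, the number of SNP pairs grows quadratically, so a calculation like this would typically be run over blocks of SNPs read from disk rather than over the whole data set in memory.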