Zusammenfassung
With the increasing throughput of Next Generation DNA sequencing machines,
it has become important to come up with efficient ways of processing
the sequence data and producing assembled whole human genome sequences
for research and diagnostic purposes.
In this paper, we describe our efforts in scaling HapCUT, a haplotype
phasing from UCSD, using a parallel implementation that runs on Windows
Azure Cloud infrastructure. One of our novel contributions is a tool
to reduce the effort required to port, deploy and execute existing
methods in a .NET library or a Windows executable within the cloud;
we use this tool to run the extant phasing libraries within Azure.
We are currently conducting experiments to study the performance
implications and advantages of running the haplotype phasing on the
cloud as compared to a local Windows HPC cluster.
Nutzer