
functionality, and is available directly to the 
application developer or may be accessed by the 
user via the included Sequence Assembler 
demonstration application (Fig. 6). The Sequence 
Assembler application is a GUI-based interface to a 
range of MBF functions and uses rich user interface 
elements to enable visualization and manipulation of 
genomic data. The user can perform assembly, 
alignment and multiple sequence alignment of DNA, 
RNA and protein sequences, visualizing the output 
in a graphical alignment display built using the 
Windows Presentation Foundation and Silverlight. 
The Sequence Assembler also provides a connector 
to various BLAST (Altschul, 1997) web services, 
which can be used to characterize an assembled 
sequence using public databases.  
While our initial results are promising, further work is 
needed to improve the quality and utility of 
the assembled output, especially for large 
genomes. Nonetheless, PadeNA can currently be 
used for assembling bacterial genomes on 
shared-memory architectures, and each step can be 
customized to handle datasets with different 
characteristics or to better meet the needs of different 
groups of scientific users. 
ACKNOWLEDGEMENTS 
We would like to thank the Aditi-Microsoft MBF 
Engineering team for their continued support in 
making the design and technical implementation of 
this de novo assembler deep, robust and of very high 
quality. We would also like to thank Steve Jones, 
Inanc Birol and other staff at Canada’s Michael 
Smith Genome Center for their kind assistance in 
understanding the field of genomics. Last but not 
least, a very special thanks to Prasanth Koorma for 
his constant motivation and encouragement 
throughout the project. 
REFERENCES 
Altschul S. F., Madden T. L., Schaffer A. A., Zhang J., 
Zhang Z., Miller W., & Lipman D. J., 1997, ‘Gapped 
BLAST and PSI-BLAST: a new generation of protein 
database search programs’, Nucleic Acids Research, 
25:3389–3402. 
Batzoglou S., Jaffe D.B., Stanley K., Butler J., Gnerre S., 
Mauceli E., Berger B., Mesirov J. P., & Lander E. S., 
2002, ‘ARACHNE: a whole-genome shotgun 
assembler’, Genome Research, 12:177–189. 
Biswas S., 2006, ‘The Performance Benefits of NGen’, 
viewed July 5th 2010, 
<http://msdn.microsoft.com/en-us/magazine/cc163610.aspx> 
Butler J., MacCallum I., Kleber M., Shlyakhter I. A., 
Belmonte M. K., Lander E. S., Nusbaum C. N., & 
Jaffe D. B., 2008, ‘ALLPATHS: De novo assembly of 
whole-genome shotgun microreads’, Genome 
Research, 18:810–820. 
Chaisson M. J. & Pevzner P. A., 2008, ‘Short fragment 
assembly of bacterial genomes’, Genome Research, 
18:324–330. 
De Novo Assembly using Illumina reads – technical note: 
Illumina sequencing, 2009, viewed July 5th 2010, 
<http://www.illumina.com/Documents/products/technotes/technote_denovo_assembly.pdf> 
Green P., 1996, ‘Documentation for Phrap’, Technical 
report, Genome Center, University of Washington. 
Havlak P., Chen R., Durbin K. J., Egan A., & Ren Y., 
2003, ‘The atlas genome assembly system’, Genome 
Research, 14:721–731. 
Huang X. & Madan A., 1999, ‘CAP3: A whole-genome 
assembly program’, Genome Research, 9:868–877. 
Huson D. H., Reinert K., & Myers E. W., 
2002, ‘The greedy path-merging algorithm for contig 
scaffolding’, Journal of the ACM, 49(5). 
Kurtz S., Phillippy A., Delcher A. L., Smoot M., 
Shumway M., Antonescu C., & Salzberg S. L., 2004, 
‘Versatile and open software for comparing large 
genomes’, Genome Biology. 
Mono: Cross platform, open source .NET development 
framework, 2004, viewed July 5th 2010, 
<http://mono-project.com/Main_Page> 
Myers E. W., Sutton G. G., Delcher A. L., & Dew I. M., 
2000, ‘A whole-genome assembly of Drosophila’, 
Science, 287(5461):2196–2204. 
Pattison T., 1999, ‘Understanding Interface-based 
Programming’, viewed July 5th 2010, 
<http://msdn.microsoft.com/en-us/library/aa260635(VS.60).aspx> 
Pevzner P. A., Tang H., & Waterman M. S., 2001, ‘An 
eulerian path approach to DNA fragment assembly’, 
Proceedings of the National Academy of Sciences, 
98(17):9748–9753. 
Pop M., Kosack D. S., & Salzberg S. L., 2004, 
‘Hierarchical scaffolding with Bambus’, Genome 
Research, 14(1):149–159. 
Simpson J. T., Wong K., Jackman S. D., Schein J. E., 
Jones S. J., & Birol I., 2009, ‘ABySS: A parallel 
assembler for short read sequence data’,  Genome 
Research. 
Sutton G. G., White O., Adams M. D., & Kerlavage A. R., 
1995, ‘TIGR assembler: A new tool for assembling 
large shotgun sequencing projects’, Genome Science 
and Technology, 1:9–19. 
Zerbino D. & Birney E., 2008, ‘Velvet: Algorithms for de 
novo short read assembly using de Bruijn graphs’, 
Genome Research, 18:821–829. 
PadeNA: A PARALLEL DE NOVO ASSEMBLER