Open Access Highly Accessed Research

How do alignment programs perform on sequencing data with varying qualities and from repetitive regions?

Xiaoqing Yu, Kishore Guda, Joseph Willis, Martina Veigl, Zhenghe Wang, Sanford Markowitz, Mark D Adams and Shuying Sun*

BioData Mining 2012, 5:6 doi:10.1186/1756-0381-5-6

Was soft clipping taken into account when determining true/false positive alignments?

Colin Hercus (2013-03-13 15:29), Novocraft Technologies

Your paper shows a high false positive rate for alignments from Novoalign, and I'm wondering whether this could simply be an effect of Novoalign's soft clipping. Novoalign soft-clips alignments that have mismatches near the ends of the reads, which shifts the reported alignment position relative to the simulated alignment location. Depending on the version of Novoalign you used, 3-4 bp may be soft-clipped off a read when the mismatch is in the first or last 3-4 bp of the read. This could lead to 6-8% false positive alignments (in 50 bp reads) if your evaluation does not allow for soft clipping.
Could you comment on this?
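
To illustrate the effect, here is a minimal Python sketch of a clip-aware position check. It is not the evaluation used in the paper. The function names are hypothetical, and it assumes the simulated position is the 1-based leftmost coordinate of the full, unclipped read, compared against the SAM POS and CIGAR fields.

```python
import re

# One CIGAR operation: a length followed by an operator from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def leading_soft_clip(cigar):
    """Return the number of soft-clipped bases at the left end of a CIGAR."""
    clipped = 0
    for length, op in CIGAR_OP.findall(cigar):
        if op == "H":
            continue          # hard clips may precede the soft clip
        if op == "S":
            clipped += int(length)
        else:
            break             # first aligned operation ends the leading clip
    return clipped

def unclipped_start(pos, cigar):
    """Position where the read would start if no bases had been clipped."""
    return pos - leading_soft_clip(cigar)

def is_correct(pos, cigar, true_start, tolerance=0):
    """Hypothetical check: compare the clip-adjusted start to the simulated start."""
    return abs(unclipped_start(pos, cigar) - true_start) <= tolerance

# Assumed example: a 50 bp read simulated at position 1000 whose first
# 3 bases were soft-clipped, so the aligner reports POS 1003, CIGAR 3S47M.
naive_match = (1003 == 1000)
clip_aware_match = is_correct(1003, "3S47M", 1000)
```

In this assumed example a naive comparison of POS with the simulated position counts the read as a false positive, while the clip-adjusted comparison counts it as correct. Clipping at the right end of the read does not change POS, so only the leading clip is subtracted.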

Would it also be possible for you to provide more details regarding the version of Novoalign that you used, and to provide access to the data and scripts that you used?

Thanks, Colin

Competing interests

Director of Novocraft Technologies, commercial interest in Novoalign
