On expected score of cellwise alignments

Authors

  • Riho Klement Institute of Mathematics and Statistics, University of Tartu, J. Liivi 2, 50409 Tartu
  • Jüri Lember Institute of Mathematics and Statistics, University of Tartu, J. Liivi 2, 50409 Tartu

DOI:

https://doi.org/10.12697/ACUTM.2017.21.10

Keywords:

longest common subsequence, optimal alignment, large deviation inequality

Abstract

We consider certain suboptimal alignments of two independent i.i.d. random sequences from a finite alphabet A = {1;...,K}, both sequences having length n. In particular, we focus on so-called cellwise alignments, where in the first step so many 1-s as possible are aligned. These aligned 1-s define cells and the rest of the alignment is defined so that the already existing alignment of 1-s remains unchanged. We show that as n grows, for any cellwise alignment, the average score of a cell tends to the expected score of a random cell, a.s. Moreover, we show that a large deviation inequality holds. The second part of the paper is devoted to calculating the expected score of certain cellwise alignment referred to as priority letter alignment. In this alignment, inside every cell first all 2-s are aligned. Then all 3-s are aligned, but in such way that the already existing alignment of 2-s remains unchanged. Then we continue with 4-s and so on. Although easy to describe, for K bigger than 3 the exact formula for expected score is not that straightforward to find. We present a recursive formula for calculating the expected score.

Downloads

Download data is not yet available.

Downloads

Published

2017-07-03

Issue

Section

Articles