Pub. online: 1 Jan 2019
Type: Research Article
Open Access
Journal: Informatica
Volume 30, Issue 3 (2019), pp. 553–571
Abstract
The simplest hypothesis of DNA strand symmetry states that the proportions of nucleotides of the same base pair are approximately equal within single DNA strands. Extensive empirical studies using asymmetry measures and various visualization tools show that, for long DNA sequences, (approximate) strand symmetry generally holds, with rare exceptions. In this paper, a formal definition of DNA strand local symmetry is presented, characterized in terms of generalized logits, and tested on the longest non-coding sequences of bacterial genomes. A special regression-type probabilistic structure of the data is assumed to hold. This structure is compatible with the probability distribution of random nucleotide sequences at the steady state of a context-dependent reversible Markov evolutionary process. The null hypothesis of strand local symmetry is rejected for the majority of bacterial genomes, suggesting that even neutral mutations are skewed with respect to the leading and lagging strands.
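As a rough illustration of the simplest strand symmetry hypothesis only (not the local symmetry test based on generalized logits that the paper develops), the sketch below computes single-strand nucleotide proportions and the usual AT and GC skews. The function names and the example sequences are hypothetical; under approximate strand symmetry both skews would be close to zero.

```python
from collections import Counter


def nucleotide_proportions(seq):
    """Proportions of A, C, G, T among the unambiguous bases of one strand."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: counts[base] / total for base in "ACGT"}


def strand_skews(seq):
    """Return (AT skew, GC skew) = ((A-T)/(A+T), (G-C)/(G+C)) for one strand."""
    p = nucleotide_proportions(seq)
    at = p["A"] + p["T"]
    gc = p["G"] + p["C"]
    # A skew is reported as 0.0 when the corresponding base pair is absent.
    at_skew = (p["A"] - p["T"]) / at if at > 0 else 0.0
    gc_skew = (p["G"] - p["C"]) / gc if gc > 0 else 0.0
    return at_skew, gc_skew


# Hypothetical toy sequences, not data from the paper.
symmetric = "AATTGGCC"
skewed = "AAAT"
print(strand_skews(symmetric))
print(strand_skews(skewed))
```

Skews computed this way describe a whole sequence; a test of local symmetry would additionally need the probabilistic model assumed in the paper.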
Journal: Informatica
Volume 13, Issue 2 (2002), pp. 209–226
Abstract
Five methods for count data clusterization based on Poisson mixture models are described. Two of them are parametric; the others are semi-parametric. The methods employ the plug-in Bayes classification rule. Their performance is investigated by computer simulation and compared mainly by the clusterization error rate. We also apply the clusterization procedures to real count data and discuss the results.
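To make the idea of a plug-in Bayes rule concrete, here is a minimal sketch that assumes a plain parametric Poisson mixture fitted by EM; it is not claimed to be any of the five methods in the paper. The estimated weights and rates are plugged into the Bayes rule, which assigns each count to the component with the largest posterior probability. All names, the initial rates, and the toy data are hypothetical.

```python
import math


def poisson_logpmf(k, lam):
    """Log of the Poisson probability mass function at count k with rate lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def fit_poisson_mixture(counts, init_rates, n_iter=200):
    """Estimate mixture weights and rates by EM, starting from init_rates."""
    rates = list(init_rates)
    m = len(rates)
    weights = [1.0 / m] * m
    for _ in range(n_iter):
        # E-step: posterior component probabilities for every observation.
        resp = []
        for k in counts:
            logs = [math.log(w) + poisson_logpmf(k, lam)
                    for w, lam in zip(weights, rates)]
            top = max(logs)
            probs = [math.exp(v - top) for v in logs]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: weighted updates, floored to keep the logarithms finite.
        for j in range(m):
            nj = sum(r[j] for r in resp)
            weights[j] = max(nj / len(counts), 1e-12)
            weighted_sum = sum(r[j] * k for r, k in zip(resp, counts))
            rates[j] = max(weighted_sum / max(nj, 1e-12), 1e-12)
    return weights, rates


def classify(k, weights, rates):
    """Plug-in Bayes rule: index of the component with the largest posterior."""
    scores = [math.log(w) + poisson_logpmf(k, lam)
              for w, lam in zip(weights, rates)]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical toy counts drawn by hand around two well-separated levels.
data = [0, 1, 2, 1, 0, 2, 1, 20, 22, 19, 21, 20, 18]
weights, rates = fit_poisson_mixture(data, init_rates=[1.0, 15.0])
labels = [classify(k, weights, rates) for k in data]
print(weights, rates, labels)
```

When true labels are available, as in a simulation study, the clusterization error rate is the share of observations whose assigned component differs from the true one, up to a relabeling of components.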