Journal:Informatica
Volume 21, Issue 1 (2010), pp. 13–30
Abstract
Genetic information in cells is stored in DNA sequences, each represented as a string over four letters, where each letter corresponds to a specific type of nucleotide. Genomic DNA sequences are rich in periodic patterns, which play important biological roles. The complexity of genetic sequences can be estimated using information-theoretic methods. Low-complexity regions are of particular interest to genome researchers because they indicate sequence repeats and patterns. In this paper, the complexity of genetic sequences is estimated using Shannon entropy, Rényi entropy and relative Kolmogorov complexity. The structural complexity based on periodicities is analyzed using the autocorrelation function and time-delayed mutual information. As a case study, we analyze human chromosome 22 and identify periodicities of 3 and 49 bp.
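To make two of the measures named in this abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes single-nucleotide symbol frequencies for the Shannon entropy and uses a simple symbol-match fraction as a stand-in for the autocorrelation function; the paper may define both differently. The function names `shannon_entropy` and `match_autocorrelation` are illustrative and do not come from the paper.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Shannon entropy, in bits per symbol, of the symbol frequencies in seq.

    Assumption: single-symbol frequencies are used; the paper may use
    longer words or sliding windows.
    """
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def match_autocorrelation(seq: str, lag: int) -> float:
    """Fraction of positions i with seq[i] == seq[i + lag].

    Assumption: this match fraction is a simple substitute for an
    autocorrelation function on a symbolic sequence. A peak at some lag
    suggests a periodicity of that length.
    """
    pairs = len(seq) - lag
    if lag < 1 or pairs <= 0:
        raise ValueError("lag must be between 1 and len(seq) - 1")
    return sum(seq[i] == seq[i + lag] for i in range(pairs)) / pairs


if __name__ == "__main__":
    toy = "ACGACGACGACG"  # toy sequence with an exact period of 3
    print(shannon_entropy(toy))
    for lag in range(1, 5):
        print(lag, match_autocorrelation(toy, lag))
```

On the toy sequence, the match fraction peaks at lag 3, which is the kind of signal a periodicity analysis looks for; a uniform four-letter sequence reaches the maximum entropy of 2 bits per symbol, while a single repeated letter has entropy 0.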
Journal:Informatica
Volume 18, Issue 2 (2007), pp. 217–238
Abstract
The rapid development of network technologies has made the web a huge information source with its own characteristics. In most cases, traditional database-based technologies are no longer suitable for processing and managing web information. To process and manage web information effectively, it is necessary to reveal the intrinsic relationships and structures among the web information objects of interest, such as web pages. In this work, a set of web pages that share intrinsic relationships is called a web page community. This paper proposes a matrix-based model to describe relationships among the web pages of interest. Based on this model, intrinsic relationships among pages can be revealed, and in turn a web page community can be constructed. Issues related to applying the model are investigated in depth. The concepts of community and intrinsic relationships, as well as the proposed matrix-based model, are then extended to other application areas such as biological data processing. Application cases of the model across a broad range of areas are presented, demonstrating its potential.