Journal: Informatica
Volume 22, Issue 2 (2011), pp. 203–224
Abstract
In this paper, we describe a model for aligning books and documents from a bilingual corpus with the goal of creating a “perfectly” aligned bilingual corpus at the word-to-word level. The presented algorithms differ from existing ones in that they take into account the presence of a human translator, whose involvement we aim to minimize. We treat the human translator as an oracle who knows the exact alignments, and the goal of the system is to optimize (minimize) the use of this oracle. The effectiveness of the oracle is measured by the speed at which a “perfectly” aligned bilingual corpus can be created. By a “perfectly” aligned corpus we mean a zero-entropy corpus, since the oracle makes alignments without any probabilistic interpretation, i.e., with 100% confidence. Sentence-level alignments and word-to-word alignments, although treated separately in this paper, are integrated into a single framework. For sentence-level alignment we provide a dynamic programming algorithm that achieves low precision and recall error rates. For word-to-word alignment, an Expectation–Maximization algorithm that integrates linguistic dictionaries is suggested as the main tool for the oracle to build the “perfectly” aligned bilingual corpus. We show empirically that the suggested pre-aligned corpus requires little interaction from the oracle and that a perfectly aligned corpus can be created almost at the speed of human reading. The presented algorithms are language independent, but in this paper we verify them on the English–Lithuanian language pair with two types of text: legal documents and fiction.
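A minimal sketch of a sentence-level dynamic programming aligner in the spirit of length-based methods such as Gale-Church is given below; the cost function and the restriction to 1-1, 1-0, 0-1, 1-2 and 2-1 bead types are illustrative assumptions, not necessarily the paper's exact formulation.

```python
# Hypothetical sketch: align sentences by minimizing a cumulative cost
# over standard bead types with dynamic programming.
def align_sentences(src_lens, tgt_lens, cost):
    """src_lens/tgt_lens: character lengths of source/target sentences.
    cost(s, t): cost of aligning a source span of total length s
    with a target span of total length t."""
    INF = float("inf")
    n, m = len(src_lens), len(tgt_lens)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    moves = [(1, 1), (1, 0), (0, 1), (1, 2), (2, 1)]  # allowed bead types
    for i in range(n + 1):
        for j in range(m + 1):
            for di, dj in moves:
                pi, pj = i - di, j - dj
                if pi < 0 or pj < 0 or D[pi][pj] == INF:
                    continue
                c = D[pi][pj] + cost(sum(src_lens[pi:i]), sum(tgt_lens[pj:j]))
                if c < D[i][j]:
                    D[i][j], back[i][j] = c, (di, dj)
    # Trace back the optimal alignment path.
    path, i, j = [], n, m
    while (i, j) != (0, 0):
        di, dj = back[i][j]
        path.append(((i - di, i), (j - dj, j)))
        i, j = i - di, j - dj
    return path[::-1]
```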
Journal: Informatica
Volume 22, Issue 2 (2011), pp. 189–201
Abstract
Batch cryptography has developed into two main branches: batch verification and batch identification. Batch verification is a method to determine whether a set of signatures contains invalid signatures, and batch identification is a method to find the bad signatures when it does. Recently, significant developments have appeared in this field, notably by Lee et al., Ferrara et al. and Law et al. In this paper, we address some weaknesses of Lee et al.'s earlier work and propose an identification method for an RSA-type signature scheme. Our method is more efficient than the well-known divide-and-conquer method for this signature scheme. We conclude the paper by providing a method to choose optimal divide-and-conquer verifiers.
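For reference, here is a minimal sketch of the generic divide-and-conquer identification baseline that such methods are compared against; `batch_verify` is a hypothetical predicate standing in for an RSA-type batch verification equation.

```python
# Illustrative divide-and-conquer identification: batch-verify a set and
# recurse on halves only when the batch test fails.
def find_bad(sigs, batch_verify):
    if batch_verify(sigs):
        return []           # whole batch valid: no bad signatures here
    if len(sigs) == 1:
        return list(sigs)   # a failing singleton is a bad signature
    mid = len(sigs) // 2
    return find_bad(sigs[:mid], batch_verify) + find_bad(sigs[mid:], batch_verify)
```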
Journal: Informatica
Volume 22, Issue 2 (2011), pp. 177–188
Abstract
The paper presents a novel method for improving the estimates of closely spaced frequencies of a short signal in additive Gaussian noise, based on the Burg algorithm with extrapolation. The proposed method is implemented in two consecutive steps. In the first step, the Burg algorithm is used to estimate the parameters of the predictive filter; in the second step, the signal is extrapolated with this filter to improve the frequency estimates. The experimental results demonstrate that the frequency estimates of the short signal obtained using the Burg algorithm with extrapolation are more accurate than those obtained using the Burg algorithm without extrapolation.
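A hedged sketch of the two-step idea follows: fit an autoregressive predictor with the Burg recursion, then extrapolate the signal with it before spectral analysis. The model order and extrapolation length are illustrative choices, not the paper's settings.

```python
import numpy as np

def burg(x, p):
    """Burg estimate of AR(p) coefficients a, so that the prediction
    error is e[n] = x[n] + sum_k a[k] * x[n-1-k]."""
    a = np.zeros(p)
    f, b = x[1:].astype(float), x[:-1].astype(float)  # forward/backward errors
    for k in range(p):
        rc = -2.0 * np.dot(b, f) / (np.dot(f, f) + np.dot(b, b))
        a_prev = a[:k].copy()
        a[k] = rc
        a[:k] = a_prev + rc * a_prev[::-1]            # Levinson-style update
        f, b = f[1:] + rc * b[1:], b[:-1] + rc * f[:-1]
    return a

def extrapolate(x, a, n_extra):
    """Extend x by n_extra samples using the fitted AR predictor."""
    y = list(x.astype(float))
    p = len(a)
    for _ in range(n_extra):
        y.append(-np.dot(a, y[-1:-p - 1:-1]))         # x_hat[n] = -sum a[k] x[n-1-k]
    return np.array(y)
```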
Journal: Informatica
Volume 22, Issue 2 (2011), pp. 165–176
Abstract
The instrumental variable (IV) method is one of the most renowned methods for parameter estimation. Its major advantage is that it is applicable to open-loop as well as closed-loop systems. The main difficulty in closed-loop identification is the correlation between the disturbances and the control signal induced by the loop. To overcome this problem, an additional excitation signal is introduced. Non-recursive modifications of the instrumental variable method for closed-loop system identification, based on a generalized IV method, have been developed (Atanasov and Ichtev, 2009; Gilson and Van den Hof, 2001; Gilson and Van den Hof, 2003). In this paper, recursive algorithms for these modifications are proposed and investigated. A simulation is carried out to illustrate the obtained results.
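A minimal sketch of a recursive IV update of the usual RLS-like form is given below, assuming regressor vectors phi built from input-output data and instrument vectors zeta built from the external excitation; the exact instrument construction in the generalized IV modifications differs.

```python
import numpy as np

# Hedged sketch of a recursive instrumental variable (RIV) estimator.
# Initialization (p0) and vector construction are illustrative.
class RecursiveIV:
    def __init__(self, n_params, p0=1e3):
        self.theta = np.zeros(n_params)     # parameter estimate
        self.P = p0 * np.eye(n_params)      # "covariance" matrix

    def update(self, phi, zeta, y):
        Pz = self.P @ zeta
        gain = Pz / (1.0 + phi @ Pz)        # RIV gain vector
        err = y - phi @ self.theta          # prediction error
        self.theta = self.theta + gain * err
        self.P = self.P - np.outer(gain, phi @ self.P)
        return self.theta
```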
Journal: Informatica
Volume 22, Issue 1 (2011), pp. 149–164
Abstract
The contribution focuses on change-point detection in a one-dimensional stochastic process by sparse parameter estimation from an overparametrized model. A stochastic process with a change in the mean is estimated using a dictionary consisting of Heaviside functions. The basis pursuit algorithm is used to obtain sparse parameter estimates. This method of change-point detection is compared with several standard statistical methods in simulations.
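A hedged sketch of the idea, assuming an l1-penalized solver (scikit-learn's Lasso) as a readily available stand-in for basis pursuit: regress the series on an overparametrized dictionary of Heaviside step functions and read change points off the nonzero coefficients.

```python
import numpy as np
from sklearn.linear_model import Lasso

def detect_change_points(y, alpha=0.1):
    n = len(y)
    # Column j is a Heaviside step switching from 0 to 1 at index j + 1
    # (the all-zero last column is dropped).
    H = np.tril(np.ones((n, n)), k=-1)[:, :-1]
    fit = Lasso(alpha=alpha, fit_intercept=True).fit(H, y)
    return np.nonzero(fit.coef_)[0] + 1   # indices of estimated change points

rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0, 1, 100), rng.normal(3, 1, 100)])
print(detect_change_points(y))            # expect a point near index 100
```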
Journal: Informatica
Volume 22, Issue 1 (2011), pp. 135–148
Abstract
Detecting communities in real-world networks is an important problem for data analysis in science and engineering. By clustering nodes intelligently, a recursive algorithm is designed to detect communities. Since relabeling nodes does not alter the topology of the network, community detection corresponds to finding a good labeling of nodes such that the adjacency matrix forms blocks. By introducing a fictitious interaction between nodes, the relabeling problem becomes one of energy minimization: the total energy of the network is defined through interactions between node labels, so that grouping nodes of the same community decreases the total energy. A greedy method is used to compute the minimum energy. The method efficiently detects communities in artificial as well as real-world networks. The result is illustrated as a tree showing the hierarchical structure of communities on the basis of sub-matrix density. Applications of the method to weighted and directed networks are discussed.
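An illustrative sketch of a greedy relabeling scheme that lowers a Potts-like energy in which linked nodes sharing a label contribute negative energy; this energy definition is an assumption for illustration, not the paper's exact one.

```python
import numpy as np

# Greedy energy minimization: repeatedly move each node to the label
# most common among its neighbours, which lowers the assumed energy
# -sum over edges (i, j) of [label_i == label_j].
def greedy_communities(A, n_iter=50):
    n = A.shape[0]                    # A: adjacency matrix, zero diagonal
    labels = np.arange(n)             # start with every node in its own community
    for _ in range(n_iter):
        changed = False
        for i in range(n):
            nbr_labels = labels[A[i] > 0]
            if nbr_labels.size == 0:
                continue
            vals, counts = np.unique(nbr_labels, return_counts=True)
            best = vals[np.argmax(counts)]
            if best != labels[i]:
                labels[i], changed = best, True
        if not changed:               # local energy minimum reached
            break
    return labels
```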
Journal: Informatica
Volume 22, Issue 1 (2011), pp. 115–134
Abstract
In this paper, the quality of quantization and visualization of vectors obtained by vector quantization methods (the self-organizing map and neural gas) is investigated. Multidimensional scaling is used for the visualization of multidimensional vectors. The quality of quantization is measured by the quantization error. Two numerical measures of proximity preservation (König's topology preservation measure and Spearman's correlation coefficient) are applied to estimate the quality of visualization. Results of visualization (mapping images) are also presented.
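A hedged sketch of two of the quality measures discussed: the quantization error of a codebook, and Spearman's correlation between inter-point distances before and after mapping as a proximity-preservation score (König's measure is omitted for brevity).

```python
import numpy as np
from scipy.stats import spearmanr

def quantization_error(X, codebook):
    """Mean distance from each vector to its nearest codebook vector."""
    d = np.linalg.norm(X[:, None, :] - codebook[None, :, :], axis=2)
    return d.min(axis=1).mean()

def proximity_preservation(X, Y):
    """Spearman correlation between pairwise distances in the original
    space (X) and in the low-dimensional visualization (Y)."""
    iu = np.triu_indices(len(X), k=1)
    dx = np.linalg.norm(X[:, None] - X[None, :], axis=2)[iu]
    dy = np.linalg.norm(Y[:, None] - Y[None, :], axis=2)[iu]
    return spearmanr(dx, dy).correlation
```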
Journal: Informatica
Volume 22, Issue 1 (2011), pp. 97–114
Abstract
This paper presents a study of Hurst index estimation in the case of fractional Ornstein–Uhlenbeck and geometric Brownian motion models. The performance of the estimators is studied with respect to both the value of the Hurst index and the length of the sample paths.
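As one standard example of the kind of estimator under study, here is a hedged sketch of a discrete-variations Hurst estimator: for fractional Brownian motion the second-order variations scale as k^{2H} with the lag k, so H can be read off a log-log regression.

```python
import numpy as np

def hurst_variations(x, max_lag=20):
    """Estimate H from the scaling E|x[t+k] - x[t]|^2 ~ c * k^(2H)."""
    lags = np.arange(1, max_lag + 1)
    v = [np.mean((x[k:] - x[:-k]) ** 2) for k in lags]
    slope, _ = np.polyfit(np.log(lags), np.log(v), 1)  # slope = 2H
    return slope / 2.0
```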
Journal: Informatica
Volume 22, Issue 1 (2011), pp. 73–96
Abstract
To set the values of the hyperparameters of a support vector machine (SVM), the method of choice is cross-validation. Several upper bounds on the leave-one-out error of the pattern recognition SVM have been derived. One of the most popular is the radius–margin bound. It applies to the hard margin machine, and, by extension, to the 2-norm SVM. In this article, we introduce the first quadratic loss multi-class SVM: the M-SVM2. It can be seen as a direct extension of the 2-norm SVM to the multi-class case, which we establish by deriving the corresponding generalized radius–margin bound.
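For reference, a commonly cited form of the radius–margin bound for the hard-margin SVM (following Vapnik and Chapelle) is sketched below, with R the radius of the smallest sphere enclosing the training data, gamma the geometric margin and m the sample size; the paper's multi-class generalization differs in detail.

```latex
% Classical radius--margin bound on the leave-one-out (LOO) error of a
% hard-margin SVM trained on m examples, with gamma = 1 / \|w\|:
\[
  \mathrm{LOO} \;\leqslant\; \frac{1}{m}\, R^{2}\, \lVert \mathbf{w} \rVert^{2}
  \;=\; \frac{1}{m} \left( \frac{R}{\gamma} \right)^{2}.
\]
```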
Journal: Informatica
Volume 22, Issue 1 (2011), pp. 57–72
Abstract
Clinical investigators, health professionals and managers are often interested in developing criteria for clustering patients into clinically meaningful groups according to their expected length of stay. In this paper, we propose two novel types of survival trees: phase-type survival trees and mixed-distribution survival trees, which extend previous work on exponential survival trees. The trees are used to cluster patients with respect to length of stay, with partitioning based on covariates such as gender, age at the time of admission and primary diagnosis code. Likelihood ratio tests are used to determine the optimal partitions. The approach is illustrated using nationwide data from the English Hospital Episode Statistics (HES) database on stroke-related patients, aged 65 years and over, who were discharged from English hospitals over a one-year period.
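A hedged sketch of the splitting criterion for the exponential baseline that the proposed trees extend: a likelihood ratio test comparing a single exponential fit on a node against separate fits on the two children induced by a covariate (censoring is ignored for brevity, and node samples are assumed to be NumPy arrays).

```python
import numpy as np
from scipy.stats import chi2

def exp_loglik(stays):
    """Maximized exponential log-likelihood of a set of lengths of stay."""
    lam = 1.0 / np.mean(stays)          # MLE of the exponential rate
    return len(stays) * np.log(lam) - lam * np.sum(stays)

def lr_test_split(stays, in_left):
    """p-value of the likelihood ratio test for a candidate split,
    where in_left is a boolean mask given by a covariate."""
    left, right = stays[in_left], stays[~in_left]
    lr = 2.0 * (exp_loglik(left) + exp_loglik(right) - exp_loglik(stays))
    return chi2.sf(lr, df=1)            # one extra rate parameter
```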