Vol 16, No 2 (2018)

AN APPROACH TO BUILDING EXTENDED TOPIC MODELS OF RUSSIAN TEXTS

T. V. Batura, S. E. Strekalova

PDF (Rus)

Pages 5-18

Abstract

This article describes a new approach to building extended topic models of Russian scientific texts. An extended topic model contains not only one-word terms but also multiword terms (key phrases). Such models are easier for users to interpret and describe the subject area of a document more accurately than models consisting only of unigrams (separate words). A system based on the proposed approach was developed; for each document it outputs a set of topics with probabilities, together with key words and phrases for each topic. The approach can be useful for developing recommendation systems and summarization systems.

TECHNIQUES FOR IMPROVING PERFORMANCE OF THE SMALL INFORMATION SYSTEMS THROUGH OPTIMAL RESTRUCTURING DATA BASED ON MULTIMODAL DISTRIBUTIONS ATTRIBUTES

I. V. Belchenko, R. A. Diyachenko

PDF (Rus)

Pages 19-30

Abstract

A systematic approach to improving the performance of small information systems through optimal restructuring of tabular data structures is considered. The authors formulate the problem of minimizing the number of data blocks that must be read to serve a group of queries, and propose an objective function and structural constraints. They analyze why brute-force search for the optimal solution is infeasible. A technique for multimodal distribution of attributes, based on their frequency of occurrence in the query group, is proposed. An experiment confirms the effectiveness of the developed technique for small information systems.

DEVELOPMENT OF THE SERVICE FOR STIMULI SCENARIO REPRESENTATION BASED ON MODEL DRIVEN ARCHITECTURE

I. V. Brak, Yu. I. Sazonova

PDF (Rus)

Pages 31-40

Abstract

Methods of quantitative data analysis are important in modern physiology. A necessary condition for using mathematical statistics, signal analysis, and machine learning is the availability of properly collected, labeled, and prepared data. Preserving meta-information and structuring the results is therefore useful for their further processing. A physiological experiment consists of a set of trials (samples) in which instructions and certain stimuli are presented to the participant. The response to each trial is recorded as physiological measures. Many software systems currently allow users to create, edit, and present stimulus scenarios. Existing stimulus presentation systems can solve a wide range of tasks, but they are poorly suited to reuse, and there is no universal way to extract the metadata of an experiment scenario. The purpose of this work is to develop a service for stimuli scenario representation that offers a graphical interface, saves data in a platform-independent format, and executes scenarios in one of the existing systems. The proposed approach follows model-driven architecture principles. The platform-independent model is based on the open PsychoPy experiment format. The Neurobs Presentation system is used to execute the scenario. Program code is generated automatically by transforming the platform-independent model into a platform-specific model that describes the syntax of the Presentation domain-specific language. The approach can be extended to other systems.

MEDILUX - SERVICE OF INTELLECTUAL FORMING THE SCHEDULE OF VISITING MEDICAL INSTITUTIONS

I. E. Bukshev

PDF (Rus)

Pages 41-48

Abstract

This paper proposes a new type of service, the «intelligent registry». In the mobile application, the user selects a staff member of the medical institution. The service then analyzes the selected doctor's schedule and the patient's free time using constraint programming algorithms to determine the best time for an appointment. The knowledge base of the user's free time is built from tasks aggregated from the mobile device and popular task management services. If the user does not know whom to contact, the application provides a «smart» chat where the problem can be described: the text is sent to the server, where it is parsed and semantically matched against doctors' specializations. The database stores information about all visits and medical statements (electronic medical records), which makes it possible to remind the user to take medications. The practical value of the product lies in automating the «reception of patients» business process, which saves patients’ time, ensures high availability of services, and optimizes labor costs in medical institutions.

SYSTEM OF ANALYSIS AND VISUALIZATION FOR CROSS-LANGUAGE IDENTIFICATION OF THE AUTHORS OF SCIENTIFIC PUBLICATIONS

V. V. Isachenko, Z. V. Apanovich

PDF (Rus)

Pages 49-61

Abstract

This paper describes a system for disambiguation of authorship of articles in English using Russian-language data sources. The system allows a user to find and correct mistakes in determining the authorship of scientific publications, which can improve the search results for articles by a certain author and calculation of the citation index. As a source of publications, the link.springer.com database was used. To obtain reliable information about authors and their articles, the eLIBRARY digital library was used. The system provides interactive visualization of the analysis results and editing facilities to improve the quality of expert analysis. The approaches used in this system are applicable for disambiguation of the authorship of publications from various bibliographic databases.

COLLABORATIVE FILTERING FOR CREATION OF RECOMMENDATIONS ON BASE OF ORDER DATA

A. A. Knyazeva, O. S. Kolobov, I. Yu. Turchanovsky, A. M. Fedotov

PDF (Rus)

Pages 62-69

Abstract

The article considers the application of collaborative filtering methods to building a recommender system based on order data for documents from a library collection. An experimental comparison of three collaborative filtering methods is presented: item-based, user-based, and a hybrid method that combines the first two.
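
To make the three variants concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not specify the similarity measure or how the hybrid combines the two scores, so Jaccard similarity over binary order data and a weighted average with a hypothetical weight `alpha` are assumptions chosen for illustration, as are all function names.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (assumed measure, not from the article)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def user_based(orders, user, item):
    """Similarity-weighted share of other users who ordered `item`."""
    sims = [
        (jaccard(orders[user], orders[v]), item in orders[v])
        for v in orders
        if v != user
    ]
    total = sum(s for s, _ in sims)
    return sum(s for s, has in sims if has) / total if total else 0.0


def item_based(orders, user, item):
    """Mean similarity between `item` and the items `user` already ordered."""
    # Invert the order data: item -> set of users who ordered it.
    readers = {}
    for u, items in orders.items():
        for i in items:
            readers.setdefault(i, set()).add(u)
    owned = orders[user]
    if not owned:
        return 0.0
    target = readers.get(item, set())
    return sum(jaccard(target, readers[j]) for j in owned) / len(owned)


def hybrid(orders, user, item, alpha=0.5):
    """Weighted average of both scores; `alpha` is a hypothetical parameter."""
    return alpha * user_based(orders, user, item) + (1 - alpha) * item_based(
        orders, user, item
    )
```

Here `orders` maps each reader to the set of documents that reader has ordered; candidate documents for a reader can then be ranked by any of the three scores.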

THE BINARY OPERATIONS IN THE INFORMATION SYSTEM «MOLECULAR SPECTROSCOPY»

A. V. Kozodoev, E. M. Kozodoeva

PDF (Rus)

Pages 70-77

Abstract

The paper describes an approach to developing and implementing functions that perform binary operations on data in the IS «Molecular Spectroscopy». Binary operations over sets of spectroscopic data are formalized, taking into account the features of the subject area. The paper also describes the algorithm and the user interface for performing binary operations, which are uniform across the databases for various substances and several types of spectroscopic data.

DEVELOPMENT AND IMPLEMENTATION OF MEMORY DISAMBIGUATION ALGORITHMS IN DYNAMIC JAVA COMPILER FOR ELBRUS PROCESSOR

A. E. Malykh

PDF (Rus)

Pages 78-85

Abstract

This article describes memory disambiguation algorithms, and their implementation, in a dynamic Java compiler for the Russian Elbrus processor. These algorithms significantly expand the opportunities available to the instruction scheduler, and instruction scheduling is a key optimization for VLIW processors. Both static and dynamic approaches to memory disambiguation are described. The efficiency of all implemented algorithms is demonstrated on the popular SPECjvm2008 benchmark suite.

ON EXPERIENCE IN MIGRATING APPLICATIONS TO THE FREELY DISTRIBUTABLE OPEN SOURCE SOFTWARE

S. N. Troshkov

PDF (Rus)

Pages 86-94

Abstract

An important problem in modern programming is the support and maintenance of legacy software. The functionality of applications written in older environments is valuable and still relevant, but outdated software environments make it impossible to run these applications on modern machines and prevent their further development. The paper describes the experience of migrating to open source software. The migration was performed for two applications: the Archive of Academician A. P. Ershov and the Library system. These applications have been in use at the Ershov Institute of Informatics Systems of SB RAS for a number of years. CMF Drupal was chosen as the free and open source platform for creating the new applications. Its advantages significantly simplify the development of new applications and the migration of the data model. The migration included reengineering the applications while preserving the business logic and the data model, and migrating the data itself.

THE USE OF A HORIZONTALLY SCALABLE INFRASTRUCTURE IN THE SEARCH FOR GENETIC SIMILARITY IN BIODIVERSITY

A. A. Tskhai, S. V. Murzintsev

PDF (Rus)

Pages 95-103

Abstract

The paper considers the problem of rapidly detecting genetic similarity when analyzing databases (DB) of genomes of individuals from ecosystems at various levels. The distributed non-relational DB MongoDB and the Winnowing data processing algorithm form the basis of the information system. To identify genetic similarity with a non-relational database, the authors propose representing the fingerprints of structural variations of genomes as «key-value» pairs. The developed model was implemented in software, and computational experiments confirmed that the proposed method of genetic similarity search can be used, for example, in personalized analysis of gene-level deviations.
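
As an illustration of the Winnowing idea, the following Python sketch hashes every k-gram of a sequence, slides a window over the hashes, and keeps the minimum hash of each window as a fingerprint. It is a generic sketch, not the system described in the article: the parameters `k` and `w`, the hash function, the Jaccard comparison, and the plain dictionary standing in for a key-value store are all assumptions, and no MongoDB calls are shown.

```python
import hashlib


def kgram_hashes(seq, k):
    """Hash every k-gram of `seq` to a deterministic 32-bit integer."""
    return [
        int.from_bytes(
            hashlib.blake2b(seq[i:i + k].encode(), digest_size=4).digest(), "big"
        )
        for i in range(len(seq) - k + 1)
    ]


def winnow(seq, k=5, w=4):
    """Return the set of (hash, position) fingerprints chosen by winnowing."""
    hashes = kgram_hashes(seq, k)
    fingerprints = set()
    for start in range(max(len(hashes) - w + 1, 0)):
        window = hashes[start:start + w]
        smallest = min(window)
        # On ties, keep the rightmost minimum in the window.
        offset = w - 1 - window[::-1].index(smallest)
        fingerprints.add((smallest, start + offset))
    return fingerprints


def as_key_value(seq, k=5, w=4):
    """Key-value view: fingerprint hash -> sorted positions (a plain dict here)."""
    table = {}
    for h, pos in winnow(seq, k, w):
        table.setdefault(h, []).append(pos)
    return {h: sorted(p) for h, p in table.items()}


def similarity(a, b, k=5, w=4):
    """Jaccard similarity of the two fingerprint hash sets (assumed measure)."""
    fa, fb = set(as_key_value(a, k, w)), set(as_key_value(b, k, w))
    return len(fa & fb) / len(fa | fb) if fa | fb else 0.0
```

Because only one hash per window is kept, the fingerprint set is much smaller than the full set of k-gram hashes, which is what makes a key-value lookup of shared fingerprints cheap.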

AUTOMATION OF CONSTRUCTION OF THREE-DIMENSIONAL GEOELECTRIC MODELS FOR THE METHOD OF SOUNDING THE FORMATION OF THE FIELD IN THE NEAR ZONE BASED ON THE RESULTS OF ONE-DIMENSIONAL INVERSION

M. V. Chubarov, A. A. Vlasov

PDF (Rus)

Pages 104-112

Abstract

The article proposes an algorithm that automates the construction of three-dimensional geoelectric models for the method of sounding the formation of the field in the near zone, based on the results of one-dimensional inversion. Its goals are to calculate synthetic signals for three-dimensional models, to speed up the qualitative evaluation of field data, and to minimize interpretation errors. An important part of the algorithm is the automatic generation of the three-dimensional computational meshes needed to calculate synthetic signals in the models. The algorithm outputs ready-made three-dimensional models of the studied medium together with a calculated synthetic electromagnetic signal. The algorithm was tested on data from electromagnetic monitoring of the aftermath of the 2003 earthquake in the Altai Republic.

SEMANTIC APPROACH TO MODELING OF THE FUND OF ASSESSMENT MEANS

G. E. Yakhyaeva, A. R. Absayduleva

PDF (Rus)

Pages 113-121

Abstract

The fund of assessment means (FAM) is an integral part of the normative and methodological support of the system for assessing the quality of student learning. The ontological approach to FAM modeling makes it possible to generate up-to-date assessment documents that take all of the examiner's requirements into account. The article describes the ontological, semantic, and fuzzy models of the FAM. The algorithm for generating an assessment document template consists of three steps: narrowing, restriction, and definition. At each step, a corresponding fuzzy model is generated, within which the consistency of the template and its feasibility against the available base of assessment tools are checked. The CLOPE algorithm, which clusters categorical data, is used to generate a set of assessment documents.

About the Authors

Editorial Article

PDF (Rus)

Pages 122-123

Information for Authors

Editorial Article

PDF (Rus)

Page 124

Vestnik NSU. Series: Information Technologies