Vol 16, No 2 (2018)
5-18 48
Abstract
This article describes a new approach to building extended topic models of Russian scientific texts. An extended topic model contains not only one-word terms but also multiword terms (key phrases). Such models are easier for users to interpret and describe the subject area of a document more accurately than models consisting only of unigrams (separate words). Based on the proposed approach, a system was developed that outputs, for each document, a set of topics with probabilities, along with key words and phrases for each topic. The approach can be useful for developing recommendation and summarization systems.
19-30 53
Abstract
A systematic approach to improving the performance of small information systems through optimal restructuring of tabular data structures is considered. The authors formulate the problem of minimizing the number of data blocks that a group of queries must read to retrieve information, giving the objective function and the structural constraints. The infeasibility of brute-force search for the optimal solution is analyzed. A technique for multimodal distribution of attributes according to their frequency of occurrence in the query group is proposed. An experiment confirms the effectiveness of the developed methodology for small information systems.
31-40 36
Abstract
Methods of quantitative data analysis are important in modern physiology. A necessary condition for using mathematical statistics, signal analysis and machine learning is the availability of properly collected, labeled and prepared data. Preserving meta-information and structuring the results therefore facilitates further processing. A physiological experiment consists of a set of trials (samples) in which instructions and certain stimuli are presented to the participant. The reaction to each trial is recorded as physiological measures. Many software systems currently exist for creating, editing and presenting stimulus scenarios. These systems can solve a wide range of tasks, but their scenarios are not suited to reuse, and there is no universal way to extract the metadata of an experiment scenario. The purpose of this work is to develop a service for stimulus scenario representation with a graphical interface that saves data in a platform-independent format and executes scenarios in one of the existing systems. The proposed approach uses model-driven architecture principles. The platform-independent model is based on the open PsychoPy experiment format. The Neurobs Presentation system is used to execute the scenario. Program code is generated automatically by transforming the platform-independent model into a platform-specific model that describes the syntax of the Presentation domain-specific language. The approach can be extended to other systems.
41-48 56
Abstract
This paper proposes a new type of service, the «intelligent registry». In the mobile application, the user chooses a staff member of the medical institution. The application then analyzes the schedule of the selected doctor and the free time of the patient, using constraint programming algorithms to determine the best time to book an appointment. The knowledge base of the user's free time is built from tasks aggregated from the mobile device and from popular task management services. If the user does not know whom to contact, the product provides a «smart» chat in which the problem can be described. The text is sent to the server, where it is parsed and semantically matched against the doctors' specializations. The database stores information about all visits and medical statements (electronic medical records), which makes it possible to remind the user to take medications. The practical value of the product lies in automating the «reception of patients» business process, which saves patients' time, ensures high availability of services and optimizes labor input in medical institutions.
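The abstract above does not give the scheduling algorithm itself. As a rough illustration of the underlying idea only, the following Python sketch intersects a doctor's free intervals with a patient's free intervals and picks a slot. It assumes that times are integers (minutes from midnight), that each list is sorted and non-overlapping, and that "best" means "earliest"; the function names `intersect` and `earliest_slot` are hypothetical and do not come from the paper.

```python
def intersect(a, b):
    """Intersect two sorted lists of non-overlapping (start, end) intervals."""
    i = j = 0
    common = []
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            common.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return common


def earliest_slot(doctor_free, patient_free, duration):
    """Return the earliest (start, end) slot of the given length, or None.

    Assumption: "best" is taken to mean "earliest"; the paper may use
    a different criterion.
    """
    for start, end in intersect(doctor_free, patient_free):
        if end - start >= duration:
            return (start, start + duration)
    return None


if __name__ == "__main__":
    doctor = [(540, 600), (660, 720)]    # 09:00-10:00 and 11:00-12:00
    patient = [(570, 590), (670, 720)]   # 09:30-09:50 and 11:10-12:00
    print(earliest_slot(doctor, patient, 30))  # (670, 700)
```

A full constraint-programming formulation would add further constraints (travel time, priorities, several doctors); the interval intersection here is only the simplest feasible-time check.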
49-61 64
Abstract
This paper describes a system for disambiguation of authorship of articles in English using Russian-language data sources. The system allows a user to find and correct mistakes in determining the authorship of scientific publications, which can improve the search results for articles by a certain author and calculation of the citation index. As a source of publications, the link.springer.com database was used. To obtain reliable information about authors and their articles, the eLIBRARY digital library was used. The system provides interactive visualization of the analysis results and editing facilities to improve the quality of expert analysis. The approaches used in this system are applicable for disambiguation of the authorship of publications from various bibliographic databases.
62-69 79
Abstract
The article considers the applicability of collaborative filtering methods to building a recommender system based on data about document orders from a library collection. An experimental comparison of three collaborative filtering methods is presented: item-based, user-based, and a hybrid method that combines the first two.
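To make the three methods concrete, here is a minimal Python sketch of user-based, item-based and hybrid scoring on binary order data (a user either ordered a document or did not). It is an illustration under assumptions, not the authors' implementation: the cosine similarity measure, the linear blending weight `alpha`, and all function names are hypothetical, since the abstract does not say how the hybrid method combines the two scores.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two sets viewed as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def user_based_scores(orders, user):
    """Score unseen documents by the similarity of the users who ordered them."""
    scores = {}
    for other, docs in orders.items():
        if other == user:
            continue
        sim = cosine(orders[user], docs)
        for doc in docs - orders[user]:
            scores[doc] = scores.get(doc, 0.0) + sim
    return scores


def item_based_scores(orders, user):
    """Score unseen documents by their similarity to documents the user ordered."""
    readers = {}  # document -> set of users who ordered it
    for u, docs in orders.items():
        for doc in docs:
            readers.setdefault(doc, set()).add(u)
    scores = {}
    for candidate in readers:
        if candidate in orders[user]:
            continue
        scores[candidate] = sum(
            cosine(readers[candidate], readers[doc]) for doc in orders[user]
        )
    return scores


def hybrid_scores(orders, user, alpha=0.5):
    """Blend both scores linearly (assumed combination rule)."""
    ub = user_based_scores(orders, user)
    ib = item_based_scores(orders, user)
    return {
        doc: alpha * ub.get(doc, 0.0) + (1 - alpha) * ib.get(doc, 0.0)
        for doc in set(ub) | set(ib)
    }


if __name__ == "__main__":
    orders = {"u1": {"A", "B"}, "u2": {"A", "B", "C"}, "u3": {"D"}}
    ranked = hybrid_scores(orders, "u1")
    print(sorted(ranked, key=ranked.get, reverse=True))
```

In this toy data, user `u1` shares two orders with `u2`, so document `C` is ranked above `D` by all three scoring functions.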
70-77 44
Abstract
The paper describes an approach to the development and implementation of functions for performing binary operations on data in the information system «Molecular Spectroscopy». Binary operations over sets of spectroscopic data are formalized, taking into account the features of the subject area. An algorithm of actions and a user interface for performing binary operations are described; they are uniform across the databases for various substances and several types of spectroscopic data.
78-85 47
Abstract
This article describes algorithms for memory disambiguation in a dynamic Java compiler for the Russian Elbrus processor, together with their implementation. These algorithms significantly widen the options available to the instruction scheduler, a key optimization for VLIW processors. Both static and dynamic approaches to memory disambiguation are described. The efficiency of all implemented algorithms is demonstrated on the popular SPECjvm2008 benchmark suite.
86-94 48
Abstract
An important problem of modern programming is the support and maintenance of legacy software. The functionality of applications written in older environments is valuable and still relevant, but outdated software environments prevent the applications from being used on modern machines and hinder their further development. The paper describes an experience of migration to open source software. The migration was performed for two applications: the Archive of Academician A. P. Ershov and the Library system. These applications have been in use at the Ershov Institute of Informatics Systems of SB RAS for a number of years. CMF Drupal was chosen as a free and open source platform for creating the new applications. The advantages of CMF Drupal significantly facilitate the development of a new application and the migration of the data model. The migration included reengineering the application while preserving its business logic and data model, and migrating the data itself.
95-103 47
Abstract
The problem of rapid detection of genetic similarity when analyzing databases (DBs) of genomes of individuals in ecosystems at various levels is considered. The distributed non-relational DB MongoDB and the Winnowing data processing algorithm form the basis of the information system. To identify genetic similarity with a non-relational database, a «key-value» representation of the fingerprints of structural variations of genomes was proposed. The developed model was implemented in software, and computational experiments confirmed that the proposed method of genetic similarity search can be used, for example, in personalized analysis of gene-level deviations.
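For readers unfamiliar with Winnowing, the sketch below shows the general fingerprinting technique in Python: hash every k-gram of a sequence, slide a window over the hashes, and keep the minimal hash of each window (the rightmost one on ties). The parameters `k` and `window`, the use of MD5 as the hash, the Jaccard similarity, and the in-memory dictionary standing in for a key-value store are all assumptions made for illustration; the abstract does not specify the authors' choices, and no MongoDB calls are shown.

```python
import hashlib


def kgram_hashes(text, k):
    """Hash every k-gram (substring of length k) of the text."""
    return [
        int(hashlib.md5(text[i:i + k].encode()).hexdigest(), 16) % (1 << 32)
        for i in range(len(text) - k + 1)
    ]


def winnow(text, k=5, window=4):
    """Return the set of (hash, position) fingerprints chosen by winnowing."""
    hashes = kgram_hashes(text, k)
    fingerprints = set()
    for start in range(max(len(hashes) - window + 1, 0)):
        win = hashes[start:start + window]
        smallest = min(win)
        # On ties, keep the rightmost occurrence of the minimum.
        offset = max(i for i, h in enumerate(win) if h == smallest)
        fingerprints.add((smallest, start + offset))
    return fingerprints


def similarity(a, b, k=5, window=4):
    """Jaccard similarity of the fingerprint hash sets (assumed measure)."""
    fa = {h for h, _ in winnow(a, k, window)}
    fb = {h for h, _ in winnow(b, k, window)}
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)


def build_index(genomes, k=5, window=4):
    """Key-value index: fingerprint hash -> set of genome identifiers."""
    index = {}
    for genome_id, sequence in genomes.items():
        for h, _ in winnow(sequence, k, window):
            index.setdefault(h, set()).add(genome_id)
    return index


if __name__ == "__main__":
    genomes = {"g1": "ACGTACGTTGCAACGT", "g2": "ACGTACGTTGCATTTT"}
    print(similarity(genomes["g1"], genomes["g2"]))
    print(len(build_index(genomes)))
```

Looking up a query's fingerprint hashes in such an index returns candidate genomes without comparing full sequences, which is what makes a key-value layout attractive for this kind of search.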
104-112 49
Abstract
The article proposes an algorithm for automating the construction of three-dimensional geoelectric models for near-field transient electromagnetic sounding from the results of one-dimensional inversion. The aims are to calculate synthetic signals for three-dimensional models, to speed up the qualitative evaluation of field materials, and to minimize interpretation errors. An important part of the algorithm is the automatic generation of the three-dimensional computational meshes needed to calculate synthetic signals in the models. The algorithm outputs prepared three-dimensional models of the studied medium together with a calculated synthetic electromagnetic signal. The algorithm is tested on data from electromagnetic monitoring of the consequences of the 2003 earthquake in the Altai Republic.
113-121 64
Abstract
The fund of assessment means (FAM) is an integral part of the normative and methodological support of the system for assessing the quality of student learning. The ontological approach to FAM modeling makes it possible to produce up-to-date assessment documents that take into account all the wishes of the examiner. The article describes the ontology and the semantic and fuzzy models of the FAM. An algorithm for generating a template for an assessment document consists of three steps: narrowing, restriction, and definition. At each step, a corresponding fuzzy model is generated, within which the consistency of the given template and its feasibility on the available base of assessment tools are checked. The CLOPE algorithm, which clusters categorical data, is used to generate a set of assessment documents.
ISSN 1818-7900 (Print)
ISSN 2410-0420 (Online)