Vestnik NSU. Series: Information Technologies

Paper2vec and Cite2vec Methods for Analyzing Collections of Scientific Publications

https://doi.org/10.25205/1818-7900-2021-19-3-61-69

Abstract

Visualizations help researchers better understand collections of scientific publications, and various methods of analyzing text collections can be used to build them. This article discusses two such methods, Paper2vec and Cite2vec, which derive vector representations of documents from citation information. To demonstrate how these techniques work and how they can be applied, visualizations were developed and are described in this paper.

About the Author

N. I. Tikhonov
Novosibirsk State University
Russian Federation

Nikolay I. Tikhonov, Graduate Student

Novosibirsk 



References

1. Apanovich Z. V. Evolution of Visualization Methods for Research Publication Collections. Elektronnye biblioteki, 2018, vol. 21, no. 1, pp. 2–42. (in Russ.)

2. Mikolov T., Sutskever I., Chen K., Corrado G. S., Dean J. Distributed Representations of Words and Phrases and Their Compositionality. Advances in Neural Information Processing Systems, 2013, vol. 26, pp. 3111–3119.

3. Pennington J., Socher R., Manning C. D. GloVe: Global Vectors for Word Representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), 2014, pp. 1532–1543. DOI 10.3115/v1/D14-1162

4. Bojanowski P., Grave E., Joulin A., Mikolov T. Enriching Word Vectors with Subword Information. Transactions of the Association for Computational Linguistics, 2017, vol. 5, pp. 135–146. DOI 10.1162/tacl_a_00051

5. Peters M., Neumann M., Iyyer M., Gardner M., Clark C., Lee K., Zettlemoyer L. Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018, vol. 1, pp. 2227–2237. DOI 10.18653/v1/N18-1202

6. Tian H., Zhuo H. H. Paper2vec: Citation-Context Based Document Distributed Representation for Scholar Recommendation. arXiv:1703.06587, 2017.

7. Berger M., McDonough K., Seversky L. M. Cite2vec: Citation-Driven Document Exploration via Word Embeddings. IEEE Transactions on Visualization and Computer Graphics, 2017, vol. 23, no. 1, pp. 691–700. DOI 10.1109/TVCG.2016.2598667

8. Maaten L. van der, Hinton G. Visualizing Data Using t-SNE. Journal of Machine Learning Research, 2008, vol. 9, pp. 2579–2605.


For citations:


Tikhonov N.I. Paper2vec and Cite2vec Methods for Analyzing Collections of Scientific Publications. Vestnik NSU. Series: Information Technologies. 2021;19(3):61-69. (In Russ.) https://doi.org/10.25205/1818-7900-2021-19-3-61-69


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1818-7900 (Print)
ISSN 2410-0420 (Online)