
Vestnik NSU. Series: Information Technologies


A Markov Chain-Based Method for JPEG Image Steganalysis and Its Application in Combination with Various Machine Learning Algorithms

https://doi.org/10.25205/1818-7900-2022-20-4-61-75

Abstract

The paper proposes a feature extraction method for detecting hidden information embedded in JPEG images by popular steganography tools. The method builds each feature vector from a transition probability matrix and applies image calibration to improve steganalysis accuracy and reduce the number of false positives. A feature vector of 324 elements is computed in this way for every image in the training and test sets. Models were then trained on the training set with each of the following machine learning methods separately: gradient-boosted decision trees, linear models, k-nearest neighbors, support vector machines, neural networks, and artificial immune systems. Model quality was assessed with accuracy, the false positive and false negative rates, and the confusion matrix, and the classification results for each method are reported. Training and testing used the IStego100K dataset, which consists of 208 thousand images of the same size, 1024 x 1024, with JPEG quality values ranging from 75 to 95. The hidden message in each stego image was embedded with one of three steganography algorithms: J-UNIWARD, nsF5, or UERD. The results show that the proposed feature extraction approach detects hidden information embedded by non-adaptive steganography (Steghide, OutGuess, and nsF5) in static JPEG images with high accuracy (more than 95%). For adaptive steganography methods (J-UNIWARD and UERD), however, accuracy is lower (about 50-60%).
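The abstract does not say how the 324 elements are composed. One common construction in Markov-based steganalysis that yields exactly this count is four direction-wise transition probability matrices of size 9 x 9, obtained by clipping neighbor differences of absolute quantized DCT coefficients to the range [-T, T] with T = 4. The sketch below illustrates that construction only; the threshold, the four directions, and the function names are assumptions for illustration and are not taken from the paper, and the calibration step is not shown.

```python
import numpy as np

# Neighbor offsets (dy, dx): horizontal, vertical, main diagonal, minor diagonal.
# Assumption: four directions, which with T = 4 gives 4 * 9 * 9 = 324 features.
DIRECTIONS = ((0, 1), (1, 0), (1, 1), (1, -1))


def _neighbor_pair(a, dy, dx):
    """Return two aligned views: a[i, j] and a[i + dy, j + dx] (dy >= 0)."""
    h, w = a.shape
    rows0, rows1 = slice(0, h - dy), slice(dy, h)
    if dx >= 0:
        cols0, cols1 = slice(0, w - dx), slice(dx, w)
    else:
        cols0, cols1 = slice(-dx, w), slice(0, w + dx)
    return a[rows0, cols0], a[rows1, cols1]


def transition_matrix(coeffs, dy, dx, T=4):
    """Estimate P(next difference | current difference) along one direction."""
    a = np.abs(np.asarray(coeffs, dtype=np.int64))
    # Difference array along the chosen direction, clipped to [-T, T].
    cur, nxt = _neighbor_pair(a, dy, dx)
    diff = np.clip(cur - nxt, -T, T)
    # Count transitions between consecutive differences in the same direction.
    src, dst = _neighbor_pair(diff, dy, dx)
    size = 2 * T + 1
    counts = np.zeros((size, size), dtype=np.float64)
    np.add.at(counts, (src.ravel() + T, dst.ravel() + T), 1.0)
    # Normalize each row to probabilities; rows never observed stay at zero.
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


def markov_features(coeffs, T=4):
    """Concatenate the four direction-wise matrices into one feature vector."""
    return np.concatenate(
        [transition_matrix(coeffs, dy, dx, T).ravel() for dy, dx in DIRECTIONS]
    )


if __name__ == "__main__":
    # Stand-in for a 2-D array of quantized DCT coefficients read from a JPEG file.
    rng = np.random.default_rng(0)
    fake_coeffs = rng.integers(-20, 21, size=(64, 64))
    print(markov_features(fake_coeffs).shape)
```

In practice the input array would come from a JPEG decoder that exposes quantized DCT coefficients, which NumPy alone does not provide. Calibration is usually described as comparing these features with the same features computed on a decompressed, slightly cropped, and recompressed copy of the image; whether the paper combines them by subtraction or in another way is not stated in the abstract.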

About the Authors

A. V. Prokofieva
Siberian Federal University
Russian Federation

Aleksandra V. Prokofieva - postgraduate student of the Department of Applied Mathematics and Computer Security, Siberian Federal University.

Krasnoyarsk



A. N. Shniperov
Siberian Federal University
Russian Federation

Alexey N. Shniperov - Candidate of Sciences in Technology, Assistant Professor of the Department of Applied Mathematics and Computer Security, Siberian Federal University.

Krasnoyarsk



References

1. Gulasova M., Jokay M. Steganalysis of stegostorage library. Tatra Mountains Mathematical Publications, 2016, vol. 67, no. 1, pp. 99-116. DOI: 10.1515/tmmp-2016-0034

2. Fridrich J. J., Goljan M., Hogea D. Steganalysis of JPEG Images: Breaking the F5 Algorithm. 5th International Workshop on Information Hiding, 2002. DOI: 10.1007/3-540-36415-3

3. Hendrych J., Licev L. Advanced methods of detection of the steganography content. Lecture Notes in Electrical Engineering, 2020, vol. 554, pp. 484-493. DOI: 10.1007/978-3-030-14907-9_47

4. Yousfi Y., Butora J., Fridrich J., Giboulot Q. Breaking Alaska: Color separation for steganalysis in JPEG domain. IH and MMSec 2019 - Proceedings of the ACM Workshop on Information Hiding and Multimedia Security, 2019, pp. 138-149. DOI: 10.1145/3335203.3335727

5. Saito T., Zhao Q., Naito H. Second Level Steganalysis - Embeding Location Detection Using Machine Learning. 2019 IEEE 10th International Conference on Awareness Science and Technology, iCAST 2019 - Proceedings. IEEE, 2019. Pp. 1-6. DOI: 10.1109/ICAwST.2019.8923205

6. Butora J., Fridrich J. Reverse JPEG Compatibility Attack. IEEE Transactions on Information Forensics and Security. IEEE, 2020, vol. 15, no. c, pp. 1444-1454. DOI: 10.1109/TIFS.2019.2940904

7. Shniperov A. N., Prokofieva A. V. Steganalysis Method of Static JPEG Images Based on Artificial Immune System. Automatic Control and Computer Sciences, 2020, vol. 54, no. 5. DOI: 10.3103/S0146411620050077

8. Yang Z., Wang K., Ma S., Huang Y., Kang X., Zhao X. IStego100K: Large-scale Image Steganalysis Dataset. Digital Forensics and Watermarking. IWDW 2019. Lecture Notes in Computer Science, 2019, vol. 12022. DOI: 10.1007/978-3-030-43575-2_29

9. Fridrich J., Pevny T., Kodovsky J. Statistically undetectable JPEG steganography: Dead ends, challenges, and opportunities. MM and Sec'07 - Proceedings of the Multimedia and Security Workshop 2007, 2007. DOI: 10.1145/1288869.1288872

10. Holub V., Fridrich J., Denemark T. Universal distortion function for steganography in an arbitrary domain. Eurasip Journal on Information Security. 2014, vol. 2014. DOI: 10.1186/1687-417X-2014-1

11. Guo L., Ni J., Su W., Tang C., Shi Y. Using Statistical Image Model for JPEG Steganography: Uniform Embedding Revisited. IEEE Transactions on Information Forensics and Security, 2015, vol. 10, no. 12. DOI: 10.1109/TIFS.2015.2473815

12. Pevny T., Fridrich J. Merging Markov and DCT features for multi-class JPEG steganalysis. 2007. P. 650503. DOI: 10.1117/12.696774

13. Vakhrushev A. et al. LightAutoML: AutoML Solution for a Large Financial Services Ecosystem [Online]. 2021. URL: https://www.researchgate.net/publication/354379217_LightAutoML_AutoML_Solution_for_a_Large_Financial_Services_Ecosystem (01.11.2021).

14. Dasgupta D. Iskusstvennye immunnye sistemy i ih primenenie; Ed. A. Romanyuha. FIZMATLIT, 2006. 344 p. (in Russ.).

15. Perez J. D. J. S., Rosales M. S., Cruz-Cortes N. Universal steganography detector based on an artificial immune system for JPEG images. Proceedings - 15th IEEE International Conference on Trust, Security and Privacy in Computing and Communications. 2017. Pp. 1896-1903. DOI: 10.1109/TrustCom.2016.0290



For citations:


Prokofieva A.V., Shniperov A.N. A Markov Chain-Based Method for JPEG Image Steganalysis and Its Application in Combination with Various Machine Learning Algorithms. Vestnik NSU. Series: Information Technologies. 2022;20(4):61-75. (In Russ.) https://doi.org/10.25205/1818-7900-2022-20-4-61-75



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1818-7900 (Print)
ISSN 2410-0420 (Online)