International Journal of Image, Graphics and Signal Processing (IJIGSP)
ISSN: 2074-9074 (Print), ISSN: 2074-9082 (Online)
Published By: MECS Press
IJIGSP Vol.12, No.5, Oct. 2020
Learning Semantic Image Attributes Using Image Recognition and Knowledge Graph Embeddings
PP. 44-52
Knowledge base generation has traditionally relied on extracting structured knowledge from text. However, other sources of information, such as images, can also be leveraged in this process to build richer and more complete knowledge bases. A structured semantic representation of an image's content, combined with knowledge graph embeddings, can provide a unique representation of the semantic relationships between image entities. Linking known entities in knowledge graphs and learning open-world images using language models have attracted considerable interest over the years. In this paper, we propose a shared learning approach that learns semantic attributes of images by combining a knowledge graph embedding model with the recognized attributes of images. The proposed model promises to help us understand the semantic relationships between the entities of an image and implicitly provides a link for the extracted entities through a knowledge graph embedding model. Under the limitation of using a custom user-defined knowledge base with limited data, the proposed model achieves significant accuracy and provides a new alternative to earlier approaches. The proposed approach is a step towards bridging the gap between frameworks that learn from large amounts of data and frameworks that use a limited set of predicates to infer new knowledge.
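As a rough illustration of how recognized image entities might be linked through a knowledge graph embedding model, the sketch below scores candidate relations between detected labels. It is not the authors' implementation: the abstract does not name the embedding model, so a TransE-style translation score is assumed here, and the entity names, relation names, 2-d vectors, and the `detected` label list are all hypothetical values chosen by hand for the example.

```python
import math

# Hypothetical 2-d embeddings, hand-picked for illustration only
# (not learned, and not taken from the paper).
ENTITY_VECS = {
    "dog": (1.0, 0.0),
    "frisbee": (1.0, 1.0),
    "park": (3.0, 0.0),
    "car": (0.0, 3.0),
}
RELATION_VECS = {
    "plays_with": (0.0, 1.0),
    "located_in": (2.0, 0.0),
}


def transe_score(head, relation, tail):
    """TransE-style plausibility: negative L2 norm of (head + relation - tail).

    Scores closer to zero mean the triple fits the embedding space better.
    """
    h, r, t = ENTITY_VECS[head], RELATION_VECS[relation], ENTITY_VECS[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def best_relation(head, tail):
    """Return the relation that scores highest for the (head, tail) pair."""
    return max(RELATION_VECS, key=lambda rel: transe_score(head, rel, tail))


def link_detected_entities(labels):
    """Score every ordered pair of detected labels and sort by plausibility."""
    triples = []
    for head in labels:
        for tail in labels:
            if head != tail:
                rel = best_relation(head, tail)
                triples.append((head, rel, tail, transe_score(head, rel, tail)))
    return sorted(triples, key=lambda item: item[3], reverse=True)


# Hypothetical output of an image recognizer for a single image.
detected = ["dog", "frisbee", "park"]
for head, rel, tail, score in link_detected_entities(detected)[:2]:
    print(head, rel, tail, round(score, 3))
```

In this toy setup the recognizer supplies only the entity labels, and the embedding space supplies the implicit link between them. A trained model would replace the hand-picked vectors with learned ones.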
Cite This Paper
Ashutosh Kumar Tiwari, Sandeep Varma Nadimpalli, "Learning Semantic Image Attributes Using Image Recognition and Knowledge Graph Embeddings", International Journal of Image, Graphics and Signal Processing (IJIGSP), Vol.12, No.5, pp. 44-52, 2020. DOI: 10.5815/ijigsp.2020.05.05