{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,9]],"date-time":"2026-03-09T23:38:49Z","timestamp":1773099529611,"version":"3.50.1"},"reference-count":77,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2024,1,18]],"date-time":"2024-01-18T00:00:00Z","timestamp":1705536000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"NSF","award":["IIS-1822800"],"award-info":[{"award-number":["IIS-1822800"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Imaging"],"abstract":"<jats:p>The lack of accessible information conveyed by descriptions of art images presents significant barriers for people with blindness and low vision (BLV) to engage with visual artwork. Most museums are not able to easily provide accessible image descriptions for BLV visitors to build a mental representation of artwork due to vastness of collections, limitations of curator training, and current measures for what constitutes effective automated captions. This paper reports on the results of two studies investigating the types of information that should be included to provide high-quality accessible artwork descriptions based on input from BLV description evaluators. We report on: (1) a qualitative study asking BLV participants for their preferences for layered description characteristics; and (2) an evaluation of several current models for image captioning as applied to an artwork image dataset. We then provide recommendations for researchers working on accessible image captioning and museum engagement applications through a focus on spatial information access strategies.<\/jats:p>","DOI":"10.3390\/jimaging10010026","type":"journal-article","created":{"date-parts":[[2024,1,18]],"date-time":"2024-01-18T06:41:22Z","timestamp":1705560082000},"page":"26","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Images, Words, and Imagination: Accessible Descriptions to Support Blind and Low Vision Art Exploration and Engagement"],"prefix":"10.3390","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8322-6798","authenticated-orcid":false,"given":"Stacy A.","family":"Doore","sequence":"first","affiliation":[{"name":"INSITE Lab, Department of Computer Science, Colby College, Waterville, ME 04901, USA"}]},{"given":"David","family":"Istrati","sequence":"additional","affiliation":[{"name":"INSITE Lab, Department of Computer Science, Colby College, Waterville, ME 04901, USA"}]},{"given":"Chenchang","family":"Xu","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Fu Foundation School of Engineering and Applied Science, Columbia University, New York, NY 10027, USA"}]},{"given":"Yixuan","family":"Qiu","sequence":"additional","affiliation":[{"name":"John A. 
Paulson School of Engineering and Applied Sciences, Harvard University, Allston, MA 02134, USA"}]},{"given":"Anais","family":"Sarrazin","sequence":"additional","affiliation":[{"name":"Sonos, San Francisco, CA 94109, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7640-0428","authenticated-orcid":false,"given":"Nicholas A.","family":"Giudice","sequence":"additional","affiliation":[{"name":"VEMI Lab, School of Computing and Information Science, University of Maine, Orono, ME 04469, USA"}]}],"member":"1968","published-online":{"date-parts":[[2024,1,18]]},"reference":[{"key":"ref_1","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014). Computer Vision, Proceedings of the ECCV 2014: 13th European Conference, Cham, Switzerland, 6\u201312 September 2014, Springer."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Garcia, N., and Vogiatzis, G. (2018, January 8\u201314). How to read paintings: Semantic art understanding with multi-modal retrieval. Proceedings of the European Conference on Computer Vision (ECCV) Workshops, Munich, Germany.","DOI":"10.1007\/978-3-030-11012-3_52"},{"key":"ref_3","unstructured":"(2022, October 02). Web Gallery of Art. Available online: https:\/\/www.wga.hu\/."},{"key":"ref_4","unstructured":"Axel, E.S., and Levent, N.S. (2002). Art beyond Sight: A Resource Guide to Art, Creativity, and Visual Impairment, American Foundation for the Blind."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"114","DOI":"10.1016\/j.procs.2013.09.017","article-title":"The Talking Museum Project","volume":"21","author":"Amato","year":"2013","journal-title":"Procedia Comput. Sci."},{"key":"ref_6","unstructured":"Bahram, S. (2018). The Senses: Design Beyond Vision, Princeton Architectural Press."},{"key":"ref_7","unstructured":"Stock, K., Jones, C.B., and Tenbrink, T. (2019). CEUR Workshop Proceedings, Proceedings of the Workshop on Speaking of Location 2019: Communicating about Space Co-Located with 14th International Conference on Spatial Information Theory (COSIT 2019), Regensburg, Germany, 9\u201313 September 2019, CEUR-WS.org."},{"key":"ref_8","unstructured":"Eardley, A., Fryer, L., Hutchinson, R., Cock, M., Ride, P., and Neves, J. (2017). Inclusion, Disability and Culture. Inclusive Learning and Educational Equity, Springer."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"124","DOI":"10.1179\/1559689314Z.00000000023","article-title":"Case Studies from Three Museums in Art Beyond Sight\u2019s Multi-site Museum Accessibility Study","volume":"9","author":"Henrich","year":"2014","journal-title":"Museums Soc. Issues"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Hoyt, B. (2013). Emphasizing Observation in a Gallery Program for Blind and Low-Vision Visitors: Art Beyond Sight at the Museum of Fine Arts, Houston. Disabil. Stud. Q., 33.","DOI":"10.18061\/dsq.v33i3.3737"},{"key":"ref_11","unstructured":"Long, E. (2019). Technology Solutions to Enhance Accessibility and Engagement in Museums for the Visually Impaired, AMT Lab @ CMU."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"475","DOI":"10.1177\/0145482X20971958","article-title":"The accessible museum: Towards an understanding of international audio description practices in museums","volume":"114","author":"Hutchinson","year":"2020","journal-title":"J. Vis. Impair. 
Blind."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Li, F.M., Zhang, L., Bandukda, M., Stangl, A., Shinohara, K., Findlater, L., and Carrington, P. (2023, January 23\u201329). Understanding Visual Arts Experiences of Blind People. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, Hamburg, Germany.","DOI":"10.1145\/3544548.3580941"},{"key":"ref_14","unstructured":"Giudice, N.A., and Tietz, J.D. (2008). Spatial Cognition VI. Learning, Reasoning, and Talking about Space, Proceedings of the International Conference Spatial Cognition 2008, Freiburg, Germany, 15\u201319 September 2008, Springer. Proceedings 6."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Potluri, V., Grindeland, T.E., Froehlich, J.E., and Mankoff, J. (2021, January 8\u201313). Examining visual semantic understanding in blind and low-vision technology users. Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, Yokohama, Japan.","DOI":"10.1145\/3411764.3445040"},{"key":"ref_16","unstructured":"(2019, September 13). American Museum of Natural History. Available online: https:\/\/www.amnh.org\/apps\/project-describe."},{"key":"ref_17","unstructured":"Bahram, S., Chun, S., and Lavatelli, A.C. (2019, September 12). Using Coyote to Describe the World. Available online: https:\/\/mw18.mwconf.org\/paper\/using-coyote-to-describe-the-world\/."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Lakoff, G. (1987). Women, Fire, and Dangerous Things: What Categories Reveal about the Mind, University of Chicago Press.","DOI":"10.7208\/chicago\/9780226471013.001.0001"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1080\/0163853X.2010.549452","article-title":"Spatial Strategies in the Description of Complex Configurations","volume":"48","author":"Tenbrink","year":"2011","journal-title":"Discourse Process."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Tyler, A., and Evans, V. (2003). The Semantics of English Prepositions: Spatial Scenes, Embodied Meaning, and Cognition, Cambridge University Press.","DOI":"10.1017\/CBO9780511486517"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Xiao, N., Kwan, M.P., Goodchild, M.F., and Shekhar, S. (2012). Geographic Information Science, Springer.","DOI":"10.1007\/978-3-642-33024-7"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"539","DOI":"10.1109\/TPAMI.2022.3148210","article-title":"From show to tell: A survey on deep learning-based image captioning","volume":"45","author":"Stefanini","year":"2022","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"102","DOI":"10.1016\/j.patrec.2020.02.017","article-title":"Machine Learning for Cultural Heritage: A Survey","volume":"133","author":"Fiorucci","year":"2020","journal-title":"Pattern Recognit. Lett."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Karayev, S., Hertzmann, A., Winnemoeller, H., Agarwala, A., and Darrell, T. (2013, January 1\u20135). Recognizing Image Style. Proceedings of the British Machine Vision Conference 2014\u2014BMVC 2014, Nottingham, UK.","DOI":"10.5244\/C.28.122"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Krause, J., Johnson, J., Krishna, R., and Fei-Fei, L. (2017, January 18\u201323). A Hierarchical Approach for Generating Descriptive Image Paragraphs. 
Proceedings of the Computer Vision and Patterm Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2017.356"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"373","DOI":"10.1080\/09687599.2016.1167671","article-title":"Accessibility of European museums to visitors with visual impairments","volume":"31","author":"Mesquita","year":"2016","journal-title":"Disabil. Soc."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Mao, H., Cheung, M., and She, J. (2017, January 23\u201327). DeepArt: Learning Joint Representations of Visual Arts. Proceedings of the 25th ACM International Conference on Multimedia, Mountain View, CA, USA.","DOI":"10.1145\/3123266.3123405"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Banerji, S., and Sinha, A. (2016, January 19\u201321). Painting classification using a pre-trained convolutional neural network. Proceedings of the International Conference on Computer Vision, Graphics, and Image Processing, Warsaw, Poland.","DOI":"10.1007\/978-3-319-68124-5_15"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Shen, X., Efros, A.A., and Aubry, M. (2019, January 15\u201320). Discovering Visual Patterns in Art Collections with Spatially-Consistent Feature Learning. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00950"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1007\/s41095-015-0017-1","article-title":"Cross-depiction problem: Recognition and synthesis of photographs and artwork","volume":"1","author":"Hall","year":"2015","journal-title":"Comput. Vis. Media"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"12263","DOI":"10.1007\/s00521-021-05893-z","article-title":"Deep learning approaches to pattern extraction and recognition in paintings and drawings: An overview","volume":"33","author":"Castellano","year":"2021","journal-title":"Neural Comput. Appl."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3295748","article-title":"A Comprehensive Survey of Deep Learning for Image Captioning","volume":"51","author":"Hossain","year":"2019","journal-title":"ACM Comput. Surv. (CSUR)"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3617592","article-title":"Deep learning approaches on image captioning: A review","volume":"56","author":"Ghandi","year":"2023","journal-title":"ACM Comput. Surv."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Anderson, P., He, X., Buehler, C., Teney, D., Johnson, M., Gould, S., and Zhang, L. (2018, January 18\u201323). Bottom-up and top-down attention for image captioning and visual question answering. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00636"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"108545","DOI":"10.1016\/j.patcog.2022.108545","article-title":"Human-centric image captioning","volume":"126","author":"Yang","year":"2022","journal-title":"Pattern Recognit."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Cornia, M., Baraldi, L., and Cucchiara, R. (2019, January 15\u201320). Show, control and tell: A framework for generating controllable and grounded captions. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00850"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Gu, J., Zhao, H., Lin, Z., Li, S., Cai, J., and Ling, M. (2019, January 15\u201320). Scene graph generation with external knowledge and image reconstruction. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00207"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Yang, X., Tang, K., Zhang, H., and Cai, J. (2019, January 15\u201320). Auto-encoding scene graphs for image captioning. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.01094"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Shi, Z., Zhou, X., Qiu, X., and Zhu, X. (2020, January 5\u201310). Improving Image Captioning with Better Use of Caption. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online.","DOI":"10.18653\/v1\/2020.acl-main.664"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Gao, L., Wang, B., and Wang, W. (2018, January 26\u201328). Image captioning with scene-graph based semantic concepts. Proceedings of the 2018 10th International Conference on Machine Learning and Computing, Macau, China.","DOI":"10.1145\/3195106.3195114"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Zhong, Y., Wang, L., Chen, J., Yu, D., and Li, Y. (2020, January 23\u201328). Comprehensive image captioning via scene graph decomposition. Proceedings of the Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK.","DOI":"10.1007\/978-3-030-58568-6_13"},{"key":"ref_42","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017). Attention is all you need. Adv. Neural Inf. Process. Syst., 30."},{"key":"ref_43","unstructured":"Huang, L., Wang, W., Chen, J., and Wei, X.Y. (November, January 27). Attention on attention for image captioning. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Pan, Y., Yao, T., Li, Y., and Mei, T. (2020, January 13\u201319). X-linear attention networks for image captioning. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01098"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Cagrandi, M., Cornia, M., Stefanini, M., Baraldi, L., and Cucchiara, R. (2021, January 21\u201324). Learning to select: A fully attentive approach for novel object captioning. Proceedings of the 2021 International Conference on Multimedia Retrieval, Taipei, Taiwan.","DOI":"10.1145\/3460426.3463587"},{"key":"ref_46","unstructured":"Liu, W., Chen, S., Guo, L., Zhu, X., and Liu, J. (2021). Cptr: Full transformer network for image captioning. arXiv."},{"key":"ref_47","unstructured":"Wang, P., Yang, A., Men, R., Lin, J., Bai, S., Li, Z., Ma, J., Zhou, C., Zhou, J., and Yang, H. (2022, January 17\u201323). OFA: Unifying architectures, tasks, and modalities through a simple sequence-to-sequence learning framework. Proceedings of the International Conference on Machine Learning, Baltimore, MD, USA."},{"key":"ref_48","unstructured":"Hu, J.C., Cavicchioli, R., and Capotondi, A. (2022). 
Expansionnet v2: Block static expansion in fast end to end training for image captioning. arXiv."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 11\u201317). Swin transformer: Hierarchical vision transformer using shifted windows. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, BC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Zhou, L., Palangi, H., Zhang, L., Hu, H., Corso, J., and Gao, J. (2020, January 7\u201312). Unified vision-language pre-training for image captioning and vqa. Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA.","DOI":"10.1609\/aaai.v34i07.7005"},{"key":"ref_51","unstructured":"Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., and Clark, J. (2021, January 18\u201324). Learning transferable visual models from natural language supervision. Proceedings of the International Conference on Machine Learning, Virtual Event."},{"key":"ref_52","unstructured":"Mokady, R., Hertz, A., and Bermano, A.H. (2021). Clipcap: Clip prefix for image captioning. arXiv."},{"key":"ref_53","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford","year":"2019","journal-title":"OpenAI Blog"},{"key":"ref_54","unstructured":"Li, J., Li, D., Xiong, C., and Hoi, S. (2022, January 17\u201323). Blip: Bootstrapping language-image pre-training for unified vision-language understanding and generation. Proceedings of the International Conference on Machine Learning, Baltimore, MD, USA."},{"key":"ref_55","unstructured":"Li, J., Li, D., Savarese, S., and Hoi, S. (2023). Blip-2: Bootstrapping language-image pre-training with frozen image encoders and large language models. arXiv."},{"key":"ref_56","unstructured":"(2019, September 18). Using Coyote to Describe the World. Available online: https:\/\/coyote.pics\/the-software\/."},{"key":"ref_57","unstructured":"Marshall, K.J. (2020, October 30). Untitled (Painter). Available online: https:\/\/mcachicago.org\/collection\/items\/kerry-james-marshall\/3160-untitled-painter."},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1145\/332833.332843","article-title":"Universal Usability","volume":"43","author":"Shneiderman","year":"2000","journal-title":"Commun. ACM"},{"key":"ref_59","unstructured":"Bartolome, J.I., Quero, L.C., Kim, S., Um, M.Y., and Cho, J. (2019, January 17\u201320). Exploring art with a voice controlled multimodal guide for blind people. Proceedings of the Thirteenth International Conference on Tangible, Embedded, and Embodied Interaction, Tempe, AZ, USA."},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"1071759","DOI":"10.3389\/feduc.2023.1071759","article-title":"Multimodality as universality: Designing inclusive accessibility to graphical information","volume":"8","author":"Doore","year":"2023","journal-title":"Front. Educ."},{"key":"ref_61","first-page":"1","article-title":"Staying open: The use of theoretical codes in grounded theory","volume":"5","author":"Glaser","year":"2005","journal-title":"Grounded Theory Rev."},{"key":"ref_62","unstructured":"Horibuchi, M. (2021, October 30). Watercolor of Persimmons. Available online: https:\/\/mcachicago.org\/publications\/blog\/2018\/4-things-mika-horibuchi."},{"key":"ref_63","unstructured":"Heyer, P. 
(2021, October 30). Heaven. Available online: https:\/\/mcachicago.org\/exhibitions\/2018\/paul-heyer."},{"key":"ref_64","unstructured":"Dijkstra, R. (2021, October 30). Hel, Poland. Available online: https:\/\/mcachicago.org\/about\/who-we-are\/people\/rineke-dijkstra."},{"key":"ref_65","unstructured":"Cunningham, C., and Curtis, J. (2021, October 30). The Way You Look at Me Tonight. Available online: https:\/\/mcachicago.org\/calendar\/2018\/02\/claire-cunningham-jess-curtis-the-way-you-look-at-me-tonight."},{"key":"ref_66","unstructured":"Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., and Stoyanov, V. (2019). Roberta: A robustly optimized bert pretraining approach. arXiv."},{"key":"ref_67","doi-asserted-by":"crossref","unstructured":"Papineni, K., Roukos, S., Ward, T., and Zhu, W.J. (2002, January 7\u201312). Bleu: A method for automatic evaluation of machine translation. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA.","DOI":"10.3115\/1073083.1073135"},{"key":"ref_68","doi-asserted-by":"crossref","unstructured":"Yang, A., Liu, K., Liu, J., Lyu, Y., and Li, S. (2018, January 19). Adaptations of ROUGE and BLEU to Better Evaluate Machine Reading Comprehension Task. Proceedings of the Workshop on Machine Reading for Question Answering, Melbourne, VIC, Australia.","DOI":"10.18653\/v1\/W18-2611"},{"key":"ref_69","unstructured":"Banerjee, S., and Lavie, A. (2005, January 25\u201330). METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and\/or Summarization, Ann Arbor, MI, USA."},{"key":"ref_70","unstructured":"(2022, September 24). Google Cloud Evaluating Language Models. Available online: https:\/\/cloud.google.com\/translate\/automl\/docs\/evaluate."},{"key":"ref_71","unstructured":"(2022, September 27). Machine Translation\u2014METEOR. Available online: https:\/\/machinetranslate.org\/meteor."},{"key":"ref_72","doi-asserted-by":"crossref","first-page":"923","DOI":"10.1016\/j.concog.2007.07.005","article-title":"Neural correlates of object indeterminacy in art compositions","volume":"17","author":"Fairhall","year":"2008","journal-title":"Conscious. Cogn."},{"key":"ref_73","doi-asserted-by":"crossref","first-page":"186","DOI":"10.1037\/a0022087","article-title":"Pictorial depth increases body sway","volume":"5","author":"Kapoula","year":"2011","journal-title":"Psychol. Aesthet. Creat. Arts"},{"key":"ref_74","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1080\/17588928.2015.1083968","article-title":"A TMS study on the contribution of visual area V5 to the perception of implied motion in art and its appreciation","volume":"8","author":"Cattaneo","year":"2017","journal-title":"Cogn. Neurosci."},{"key":"ref_75","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1037\/qup0000231","article-title":"Making sense of an artwork: An interpretative phenomenological analysis of participants\u2019 accounts of viewing a well-known painting","volume":"10","author":"Starr","year":"2023","journal-title":"Qual. Psychol."},{"key":"ref_76","unstructured":"OpenAI (2023). GPT-4 Technical Report. arXiv."},{"key":"ref_77","unstructured":"Garcia, N., and Vogiatzis, G. (2017). Learning Non-Metric Visual Similarity for Image Retrieval. 
arXiv."}],"container-title":["Journal of Imaging"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2313-433X\/10\/1\/26\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T13:49:35Z","timestamp":1760104175000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2313-433X\/10\/1\/26"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,1,18]]},"references-count":77,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,1]]}},"alternative-id":["jimaging10010026"],"URL":"https:\/\/doi.org\/10.3390\/jimaging10010026","relation":{},"ISSN":["2313-433X"],"issn-type":[{"value":"2313-433X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,1,18]]}}}