Cataloguing, Metadata, and Generative AI. Early Experiences and Future Perspectives

Gino Roncaglia

doi:10.36253/jlis.it-693

Vol. 17 No. 1 (2026): What does the future hold for shared cataloguing in Italy and for SBN? A proactive contribution / Quale futuro per la catalogazione condivisa in Italia e per SBN? Un contributo propositivo

Articles

Gino Roncaglia
Roma Tre University

DOI: https://doi.org/10.36253/jlis.it-693

Published 2026-01-15

Keywords

Artificial intelligence
Metadata creation/management
LLM
Bibliographic description
Cataloguing standards

How to Cite

Roncaglia, Gino. 2026. “Cataloguing, Metadata, and Generative AI. Early Experiences and Future Perspectives”. JLIS.It 17 (1):106-27. https://doi.org/10.36253/jlis.it-693.

This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

This article examines the intersection of generative artificial intelligence (AI) and bibliographic and metadata practice, assessing how large language models (LLMs) can support cataloguing and metadata creation within the constraints of formal knowledge architectures. The first section traces the evolution of cataloguing paradigms from MARC to Linked Open Data (LOD), emphasizing the shift from rigid records to semantic, entity-based models such as FRBR, RDA, and BIBFRAME. The second section addresses the epistemological clash between deterministic, rule-based metadata standards (the "architect") and probabilistic, generative AI systems (the "oracle").
Three strategies are proposed for integrating AI into bibliographic workflows:
1) Specialized AI systems trained exclusively on controlled, high-quality datasets.
2) Retrieval-Augmented Generation (RAG), which combines LLMs with authoritative knowledge bases.
3) Next-generation LLMs enhanced through reasoning models, multimodal inputs, expanded context windows, and small/medium-scale local models, so that generative outputs align with metadata standards.
Key challenges include hallucinations, data sparsity in bibliographic corpora, and the obsolescence of MARC-centric experiments. The article cautions against retrofitting AI onto outdated data models and urges alignment with LOD and IFLA’s Library Reference Model (LRM). It also highlights ethical considerations (bias, transparency, AI literacy) and the potential of local small and medium-scale models (SLMs/MSLMs) for privacy-sensitive applications.


