By Helena Byrne, Curator of Web Archives, British Library
The 2024 Summer Olympics and Paralympics held in Paris were record-breaking events. As with previous Games since 2010, the International Internet Preservation Consortium (IIPC) Content Development Group organised a collaborative transnational web archive collection on the Games. The collection covers events on and off the field of play, drawn from web publications from 86 countries. There are 47 languages represented in the collection. Not surprisingly, the largest number of nominations were in French, with 1,181 records, while many languages have as few as 1 or 2 records.
The majority of these records were nominated by IIPC members but a small number of unique records were nominated through the public nomination form that was launched on the IIPC blog in July 2024.
As with our previous collection on the 2022 Winter Olympics and Paralympics, social media was excluded from this collection, because it was very difficult to preserve any meaningful social media captures through the Archive-It platform at the time of the event. One change to the scoping rules compared with previous Olympic and Paralympic collections was to exclude the “seed page plus 1 click” scope, which captures all links on the seed page (e.g. a single news page linking to multiple articles), because these types of crawls normally pick up lots of irrelevant content that eats up data.
All seeds added to the crawler were capped at 2mb. This is generally enough data to capture a standard website but means we only have shallow captures of bigger, media-heavy websites. Overall, the 3,429 websites and webpages that were archived amounted to 458 GB and 6,315,815 documents.
By the lead curators: Anaïs Crinière-Boizet, Digital Curator at the National Library of France and Lead Curator of the War in Ukraine collection, and Vladimir Tybin, Head of Digital Legal Deposit at the National Library of France
The IIPC Content Development Working Group launched “the War in Ukraine” collaborative collection in July 2022, a few months after the beginning of this conflict, in order to capture its impact on digital history and culture on the web. Based on suggestions by curators, web archivists and members of the public worldwide, we have launched six crawls in total: three in 2022, two in 2023 and one in 2024. You can read an update on this collaborative effort here.
Many of the pages nominated in the early crawls are already offline, which shows why it is important to continue this effort. We encourage everyone to nominate websites around the themes listed below. The first 2025 crawl will start on 24 March.
What we want to collect
This collection is built through the following themes:
General information about the military confrontations
Consequences of the war on the civilian population
Refugee crisis and international relief efforts
Political consequences
International relations
Diaspora communities – Ukrainian people around the world
Human rights organisations
Foreign embassies and diplomatic relations
Sanctions imposed against Russia by foreign powers
Consequences on energy and agri-food trade
Public opinion: blogs/protest sites/activists
The list is not exhaustive and it is expected that contributing partners may wish to explore other subtopics within their areas of interest and expertise, providing that they are within the general collection development scope.
Out of scope
The following types of content are out of scope for the collection:
By the lead curators: Anaïs Crinière-Boizet, Digital Curator at the National Library of France and Vladimir Tybin, Head of Digital Legal Deposit at the National Library of France.
The War in Ukraine collaborative collection led by the IIPC Content Development Working Group (WIU CDG) was initiated in July 2022, a few months after the beginning of this conflict, in order to capture its impact on digital history and culture on the web. Based on suggestions by curators, web archivists and members of the public worldwide, we have launched six crawls in total, three in 2022, two in 2023 and one in 2024. Together with our colleague Kees Teszelszky, we provided an update on the status of the collection after the 2023 crawls.
This blog post aims to give an update on this transnational collection documenting an important historical event. Since the beginning of the collection, many IIPC members, as well as members of the public, have responded to the call for contributions to document the conflict. In total, 1,528 proposals were received from members and 322 via the public nomination form, making 1,850 seeds. After cleaning up duplicates and invalid URLs, 1,822 seeds remained. All of these were crawled at least once between July 2022 and November 2024. Today the collection represents 2.3 TB.
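For readers curious about what cleaning up duplicates and invalid URLs can involve, here is a minimal sketch in Python. It is not the workflow we actually used: the function names `normalise_seed` and `clean_seeds` are hypothetical, the normalisation rules (lowercasing the host, dropping fragments, treating an empty path as "/") are assumptions, and the example URLs are invented.

```python
from typing import List, Optional, Set
from urllib.parse import urlsplit, urlunsplit


def normalise_seed(url: str) -> Optional[str]:
    """Return a normalised seed URL, or None if it is not a usable http(s) URL."""
    try:
        parts = urlsplit(url.strip())
        port = parts.port  # raises ValueError if the port is malformed
    except ValueError:
        return None
    scheme = parts.scheme.lower()
    if scheme not in ("http", "https") or not parts.hostname:
        return None
    # Assumed rules: lowercase the host, drop the fragment, empty path becomes "/".
    host = parts.hostname.lower()
    if port is not None:
        host = f"{host}:{port}"
    return urlunsplit((scheme, host, parts.path or "/", parts.query, ""))


def clean_seeds(nominations: List[str]) -> List[str]:
    """Drop invalid URLs and duplicates, keeping the first-seen order."""
    seen: Set[str] = set()
    cleaned: List[str] = []
    for url in nominations:
        key = normalise_seed(url)
        if key is None or key in seen:
            continue
        seen.add(key)
        cleaned.append(key)
    return cleaned


if __name__ == "__main__":
    # Invented nominations: two spellings of one site, one non-URL, one ftp link.
    nominations = [
        "https://Example.org",
        "https://example.org/",
        "not a url",
        "http://example.org/page#top",
        "ftp://example.org/file",
    ]
    for seed in clean_seeds(nominations):
        print(seed)
```

In practice a curator would still review borderline cases by hand, since two different URLs can point to the same page in ways that simple normalisation does not catch.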
Below is the geographic coverage of nominated content and the IIPC members’ contributions.
Collection scope
The selections cover the following topics given in the call for nominations: general information on military confrontations; consequences of the war on the civilian population in Ukraine; the refugee crisis and international relief efforts in and outside Europe; political consequences; international relations; diaspora communities, such as Ukrainians around the world; human rights organizations; foreign embassies and diplomatic relations; sanctions imposed on Russia by foreign powers; consequences on energy and agri-food trade; and public opinion, such as blogs, protest sites and the online writings of activists. Websites from countries all over the world and in all languages are in scope. Special attention has been devoted to websites which can be a source of internet culture, such as sites with internet memes.
To learn more about the context of this collection, see this 2022 article published by the SUCHO initiative.
Screenshot of the SUCHO Meme Wall taken on December 6, 2023.
Timeline and crawl depth
We launched the sixth crawl for the War in Ukraine web collection on 1,206 seeds in November 2024, with a budget of 500 GB. Between the last crawl in December 2023 and November 2024, 157 new seeds were submitted by members and through the public nomination form, and 419 seeds were deactivated. These were pages which had not been updated since the last crawl or had gone offline. The “404 file not found” errors also show why our collection work is important, as some sites have already gone offline. In total, 25 jobs have been launched: 21 crawls with the standard crawler and 3 with Brozzler (a distributed web crawler that uses a real web browser, Chrome or Chromium, to fetch pages and embedded URLs and to extract links).
The tendency to select « One page + » sites is confirmed as the collection continues, with the ‘standard’ scope being the least used of the three proposed. This can be explained by the fact that the most frequently selected type of site (see Figure 2) is ‘News’. The « One page + » depth allows crawling of newspaper or online media pages pointing to articles related to the war in Ukraine.
Figure 1
Figure 2
Looking at the distribution of sites by website type, it is noticeable that a large proportion of the sites are news sites, NGOs and government websites. The role of blogs in internet culture has diminished in recent years, as is also visible in this collection. In contrast, NGO websites contain more and more information worth preserving for historians of the future, as they document their activities to their donors.
Screenshot of Ukraine’s 24 Channel taken on October 9, 2022.
Compared to the first crawls, international languages such as English (451) and French (260) are still in the lead, but third place is now occupied by Ukrainian (225) and seventh by Russian (71), reflecting our efforts to encourage selections of pages in those languages. The impact of the conflict on the rest of Central, Eastern and Southern Europe around Ukraine can be seen in the collection of sites in Hungarian (46), Czech (53), and Serbian (42).
Figure 3
1,580 selections are already available on Archive-It. The rest are still waiting for QA. The need to check all the URLs and the poor quality of some captures, due to blocking or inherent limitations of the crawler, make this a long process, but we hope that the collection will soon be completely available on Archive-It.
Quality assurance (QA)
For QA, we had the welcome help of Eilidh MacGlone, web archivist at the National Library of Scotland, who tells us in this article about what she learnt by volunteering on the War in Ukraine collection.
“Volunteering to quality assure (QA) targets in Archive-It was a beneficial experience, building on what I knew about Regular expressions (Regex) and learning about Sort-friendly URI Reordering Transforms (SURTs). I alternately scoped in and blocked URLs for several targets, directing Heritrix to intended pages and avoiding traps and redundant locations. This work will reduce the memory requirement for seeds remaining active in any future crawls. I employed an AI assistant to create Regex phrases which I then checked at regexchecker.com, mixing links I wished the crawler to find with others from the domain I did not and running my generated candidate against these. The patterns I needed are characteristic and I found myself reusing them through this work.
Another aspect of the work was a light touch metadata check, downloaded and edited in a Google spreadsheet as recommended by Archive-It. I reduced the number of alternate titles, and ran a spellcheck, though only for the English language text. I amended a few sentences as I worked, ran a final check for spelling mistakes – and found them, given human nature! Uploading to overwrite the metadata for a collection of more than 2000 items was nerve-wracking but worked well.”
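The checking routine Eilidh describes, running a candidate pattern against links the crawler should find and links it should not, can be sketched in a few lines of Python. This is an illustration only, not her actual workflow: the function names, the pattern and the example.org URLs are all hypothetical, and we assume the pattern has to match the whole URL. The `to_surt` helper shows only the basic idea of a SURT (host labels reversed and comma-separated); real SURT canonicalisation handles more cases than this.

```python
import re
from urllib.parse import urlsplit


def check_pattern(pattern, wanted, unwanted):
    """Return (missed, false_hits): wanted URLs the pattern fails to match,
    and unwanted URLs it matches. Assumes whole-URL matching."""
    regex = re.compile(pattern)
    missed = [u for u in wanted if not regex.fullmatch(u)]
    false_hits = [u for u in unwanted if regex.fullmatch(u)]
    return missed, false_hits


def to_surt(url):
    """Simplified SURT form: reversed, comma-separated host labels, then the path."""
    parts = urlsplit(url)
    host = ",".join(reversed(parts.hostname.lower().split(".")))
    return f"{parts.scheme}://({host},){parts.path}"


if __name__ == "__main__":
    # Hypothetical pattern: keep dated articles in the /ukraine/ section only.
    pattern = r"https?://news\.example\.org/ukraine/\d{4}/.*"
    wanted = ["https://news.example.org/ukraine/2024/report.html"]
    unwanted = [
        "https://news.example.org/sport/2024/match.html",
        "https://news.example.org/ukraine/calendar?month=1",  # a typical crawler trap
    ]
    missed, false_hits = check_pattern(pattern, wanted, unwanted)
    print("missed:", missed)
    print("false hits:", false_hits)
    print(to_surt("https://www.example.org/news/"))
```

If both lists come back empty, the candidate behaves as intended on the sample links. A non-empty list shows exactly which URLs need a second look before the rule is added to the crawl scope.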
Call for nominations
In a period of uncertainty regarding the future of the conflict, influenced by global political shifts and leadership changes, it is more important than ever to document this event for future generations and researchers.
Three years after the start of this conflict, the importance of continuing to archive its traces on the web is as crucial as ever. We are launching a new call for nominations and encourage everyone to suggest content that should be crawled, particularly from countries underrepresented in the current collection.
We would like to thank Kees Teszelszky, who started this project with us, Eilidh MacGlone for her QA work, and Carlos Lelkes-Rarugal and Nicola Bingham for their continuous support and guidance.
By Ahn Kyung-Ja, Librarian at the National Library of Korea
The National Library of Korea held a special exhibition, ‘Webtro: Digital Memory’, celebrating the 20th anniversary of “OASIS”, a web resources archive of Korea.
(Exhibition room on the first floor of the main building, October 14 – December 8, 2024)
Exhibition Key Visual
OASIS (Online Archiving & Searching Internet Sources) was launched in 2004 to preserve Korean digital intellectual heritage. OASIS has collected and preserved a vast amount of domestic and international K-web resources, totaling around 2.6 million items, and provides services to the public through the OASIS website (nl.go.kr/oasis).
We prepared an exhibition to celebrate OASIS’s 20th anniversary and promote it to the public. It was the OASIS project team’s first attempt to present purely digital web resources in a physical space.
The web is a communication network that produces and shares various types of content in real-time and is the most popular and practical knowledge information resource representing this era. Since the early 2000s, member libraries in the IIPC have been collaborating with institutions around the world to preserve web resources that change and evolve every day. To enhance awareness of the necessity and value of web preservation, the National Library of Korea organized an exhibition to show how web archiving works.
The exhibition title, ‘Webtro’, is a newly coined word that connects the words ‘web’ and ‘retro’ and refers to visitors’ past experiences, which are collected and recorded in the OASIS system. We organized domestic websites and web resources collected in OASIS during the past 20 years. Visitors can see them at a glance by theme and by period.
OASIS: Exhibition Intro
We applied pixel design to maximize the unique features of the web and redesigned a once-famous character created for a South Korean social network service. The exhibition’s main goal was to realize a digital-memory trip in a real space. We also posted three video works by young web artists with the motif of OASIS and envisaged the future of OASIS in virtual space on a big screen.
The exhibition consists of 4 parts in total.
Part 1 Understanding OASIS: Digital Time Capsule
Part 1 introduces a world map showing web preservation projects and participating institutions around the world, explaining the formation of the IIPC and worldwide cooperation for web preservation. It shows video images from major web preservation institutions and introduces OASIS from its birth to its current status.
The Chronology of OASIS
Map of IIPC and Web Preservation Cooperation
Part 2 Exploring the Web: Bringing Back the Lost Web
It’s time to explore the K-Web resources in OASIS. We prepared a chronological archive table that shows 45 major disasters in twenty years and representative K-culture collections such as BTS, Squid Game, and Han Kang’s Nobel Prize in Literature. Visitors can also use a touchscreen to search nostalgic websites that are not currently serviced.
OASIS Web Resources Exploration
Part 3 Retro Experience: Y2K Time Travel
This is a space where you can experience time travel into the past. We provide infographics of thirteen major sports events, from the 2014 Incheon Asian Games to the 2024 Paris Olympics and Paralympics.
Sports Time Machine Zone
The Miniroom Area in Our Memory
We created a retrospective space where you can experience a once-popular virtual community in the 2000s. Cyworld, launched in 1999, was the beginning of the Korean Social Network Service. Users created their own mini-me (avatar) and mini-room in cyberspace and communicated with friends and others within the virtual space. By restoring virtual images in real space, we gave visitors an opportunity to relive a good memory. There is also a digital guestbook in which visitors can write a message. We will permanently preserve it after the exhibition ends.
Part 4 Extended Future: Connect to 2050
In the final part of the exhibition, we extended the scope of the web. We imagined a future for OASIS in metaverse space, complete with web art. In the gallery area, you can enjoy a Paik Nam June-style media artwork, three web artworks created by young web artists, and a video illustrating a possible future of OASIS 2050 in the metaverse space.
A Collaborative Work Combining the Web, Art and the Future of OASIS
Celebrating the 20th anniversary of OASIS, the exhibition expressed the value and importance of preserving digital cultural heritage to the Korean public. We have played a leading role in web preservation in South Korea and informed the public that many overseas partner institutions cooperate on web preservation projects. We also appeared on TV news and live broadcasts to raise public awareness of our mission. About 3,500 visitors came to the exhibition and recognized the need to preserve web resources. They left encouraging words and wrote short memos to show their thanks for the web preservation project.
I would like to express my sincere gratitude to IIPC Chairman Jeffrey van der Hoeven and to Olga Holownia for their support and kind congratulatory letters. I thank Alex Osborne for visiting the exhibition and promoting it in Australia, and the Bibliothèque Nationale de France for visiting the exhibition.
I hope this exhibition will increase public interest, support, and encouragement for the pioneering work of web archiving.
The call for nominations for the 2024 Steering Committee elections has closed. We had four vacant seats and received three nominations, so an election process is not required this year. We would like to congratulate and thank the British Library, the National Library of France, and the National Library of the Netherlands, who will be continuing for another term.
Please find the statements of all the nominees below. The new, three-year term starts on 1 January 2025.
The National Library of France (BnF) started its web archiving programme in the early 2000s and now holds an archive of more than 2.2 petabytes. We develop national strategies for the growth and outreach of web archives and host several academic projects in our DataLab. We use and share our expertise on key tools for IIPC members (Heritrix 3, NetarchiveSuite, OpenWayback, SolrWayback, webarchive-discovery) and contribute to the development of several of them.
As one of the founding members of the IIPC, we have always actively contributed to the GA & WAC meetings, workshops and most of the working groups, and we remain committed to the development of a strong community sharing knowledge and practices. Recently, the BnF has been particularly active within the CDG, leading a collaborative collection on the War in Ukraine, participating in the work of various groups, and also hosting the 2024 GA & WAC in Paris. BnF is currently co-leading the Membership Engagement Portfolio. By drafting and implementing the main thrusts of the new Strategic Plan and Consortium Agreement, our participation in the Steering Committee will be focused on making web archiving a thriving community, engaging researchers in the study of web archives, and developing harvest and access strategies.
This year, the British Library has had good cause to reflect on the importance of an international community in supporting resilience as well as development of capability in web archiving. The British Library became a founding member of the IIPC because we recognised the need to work collaboratively and share knowledge and experience in a field that was technologically challenging and rapidly changing. That remains the case today, as the technology of the web, web archiving tools and researcher needs have advanced.
The next 3 years will be a period of rapid change for the UK Web Archive as we restore our service following last year’s cyber attack. We remain engaged in how the community can support the development of tools on which we depend. Collaborative collecting remains a key part of how we work with the IIPC. We are excited by new research, including support for use of the archived web as data. We are a member of the Steering Committee and, as Treasurer for 2022 – 2023, gained a good understanding of how the organisation operates. As a member of the Steering Committee we would additionally take a role in the strategic direction of the IIPC.
We believe the IIPC is an important network organization which brings together ideas, knowledge and best practices on how to preserve the web and retain access to its information in all its diversity. In recent years, KB National Library of the Netherlands (KBNL) has taken an active role in the IIPC: we co-hosted the 2023 Web Archiving Conference and our representatives have served in various leadership roles, including as portfolio leads, IIPC vice-chair (2023) and chair (2024).
We would like to continue our work and bring together more organizations, large and small across the world, to learn from each other and ensure web content remains findable, accessible and re-usable for generations to come. Our main focus will be to support the IIPC in reshaping the Consortium Agreement and developing the new strategic plan, taking input from our members and the wider web archiving community.
As a national library, our work is fueled by the power of the written word. It preserves stories, essays, and ideas, both printed and digital. When people come into contact with these words, whether through reading, studying, or conducting research, the words have an impact on their lives. With this perspective in mind, we find it vital to preserve web content for future generations.
The IIPC Web Archiving Conference (WAC) 2024 was held from April 24-26 at the National Library of France (BnF) in Paris. Co-organized by the IIPC and the BnF in partnership with the French National Audiovisual Institute (INA), the conference brought together over two hundred members of the web archiving community from all over the world. Below are insights and experiences of a few of the attendees who received student bursaries from the IIPC, collected from their submitted travel reports.
Student Bursary Experience
Each year, the IIPC awards up to ten applicants in good academic standing with a bursary to cover their registration costs. We had nine student bursary recipients this year, four of whom also attended as presenters at the conference. While many were local students of the National School of Charters, others hailed from Belgium, Portugal, England, and the United States. Additionally, five of the twenty-four mentees in this year’s mentoring program – a fairly new but much appreciated element of the conference – were student bursary recipients.
Jonas Melo, a student of the University of Porto’s Information and Communication in Digital Platforms program, was a familiar face at WAC, having attended past conferences. As both an attendee and a speaker at this year’s conference, Melo expressed his gratitude for the assistance. “The student bursary program was a fantastic initiative by the IIPC, providing financial support that made it possible for students like me to attend the conference,” he says, noting the ease of the application process and that “the support from the IIPC team was exceptional. I am grateful for this opportunity and hope that the program continues to support future students.” The other recipients echoed his sentiments in their own reports.
Program Favorites
The conference featured a variety of presentations, short talks, panels, posters, and workshops on a range of diverse topics. Attendees had the challenging task of choosing between parallel tracks, each offering valuable insights and innovations. Though they found value in all the sessions they were able to attend, the bursary recipients made note of those they enjoyed the most.
Melo points to the very first session as a standout for him, saying it “sparked many ideas on how we can leverage AI to improve the efficiency and accuracy of our archiving efforts.” He went on to add that the second panel of the conference, Archiving Social Media in an Age of APIcalypse, was “particularly relevant as it underscored the importance of balancing technological advancements with ethical responsibilities.”
Archiving Social Media In An Age of APIcalypse. From the left: Anat Ben-David, Benjamin Ooghe-Tabanou, Frédéric Clavert, Beatrice Cannelli, and Jerôme Thièvre. Photo credit: Olga Holownia / IIPC
Fellow attendee Lizzy Zarate, a student of New York University’s Archives and Public History program, agrees with Melo on the quality of both the second panel, which she describes as “an examination of the legal, ethical, and technical issues relating to the regulation of API access by tech platforms that are not incentivized to act in the public interest,” as well as the Artificial Intelligence & Machine Learning session. “As somebody who primarily performs quality assurance checks on archived websites, I was interested in learning about attempts to automate this process and other facets of web archiving using machine learning and artificial intelligence,” she explains.
She points to Benjamin Lee’s work with the End of Term Archive as “an interesting exploration of how preserving materials like PDFs can be accomplished using machine learning. Projects such as these seem particularly important for government accountability as well as potential uses for curation.” She goes on to add that, aside from AI, Alex Dempsey’s lightning talk on the Internet Archive’s deduplication work was “an introduction to a topic that I had never encountered in my work, and I am excited to track how IA continues to address this issue in the future.”
With this year’s conference, I confirmed my affinity for web archives, both as the object of my studies and as a field I would like to work in my future archivist career. I met many engaging professionals, ready to have a small talk around a coffee.
– Alice Guérin
The opening keynote panel, Here Ya Free! Crossed Views on Skyblog, the French Pioneer of Digital Social Networks, was mentioned as a favorite by nearly all of the student bursary recipients.
“After almost 20 years of providing users with a personal digital space, enabling them to connect with other users sharing the same interests, the platform announced its closure in 2023,” explains Beatrice Cannelli, whose doctoral research focuses on the strategies employed by archiving initiatives in the preservation of social media platforms. “The BnF and INA – as France’s electronic deposit institutions – coordinated an emergency capture to preserve billions of URLs.”
The panel included Pierre Bellanger, founder and CEO of Skyrock Radio; freelance journalist and former Skyblog user Pauline Ferrari; and Web Archiving Technical Leads Jerôme Thièvre of INA and Sara Aubry of the BnF. It was moderated by Emmanuelle Bermès, Educational Manager of the Digital Technologies Applied to History master’s program at the National School of Charters.
Opening Keynote Panel: Here Ya Free! Crossed Views on Skyblog, the French Pioneer of Digital Social Networks. From the left: Pierre Bellanger, Pauline Ferrari, Emmanuelle Bermès, Sara Aubry, and Jerôme Thièvre. Photo credit: Nola N’Diaye / BnF
“This mix of voices underscored the important role that such platforms play in our daily lives and the vital function performed by web archiving institutions in ensuring the long-term preservation of such content even beyond the platforms’ lifespan,” says Cannelli. Zarate, who regularly works with student blogs and engages with university students in her research, says the panel “helped illuminate the value and challenges of preserving materials created by young people on relatively unregulated platforms.”
Alice Guérin, who is pursuing a master’s degree in Digital Technologies Applied to History at the National School of Charters under Bermès, cited her current thesis on the history of the Skyblog platform as the reason the keynote panel drew her so strongly. She also notes that the entirety of the Digital Preservation session, as well as Niels Brügger’s presentation on web history, The Form Of Websites: Studying The Formal Development Of Websites, The Case Of Professional Danish Football Clubs, “offered very interesting perspectives for researchers.”
Personal Insights and Takeaways
Despite the packed program, the conference provided ample time to mingle, from casual chats during session breaks to purposefully engineered networking opportunities. Attendees appreciated the chance to engage with such a diverse cross-section of the web archiving community.
As a student bursary recipient, I found the conference to be an invaluable learning experience. The sessions were not only informative but also thought-provoking, encouraging us to think critically about the future of web archiving. I appreciated the opportunity to engage with experts in the field and to gain insights that will undoubtedly shape my future research and career.
– Jonas Melo
“Something that really surprised me was the wide variety of disciplines represented across the conference,” notes Zarate, who learned that her future master’s degree in Archives is an uncommon one in many other countries. At the conference, she says, she was able to meet “archivists, librarians, and computer programmers from around the world…Hearing about the sheer number of different ongoing projects expanded my view of what I had previously thought was possible within web archives.”
Guérin had the opportunity to participate in the Early Scholars Spring School on Web Archives organized by Emmanuelle Bermès (National School of Charters, PSL University of Paris) and Valérie Schafer (Luxembourg Centre for Contemporary and Digital History, C2DH; Internet and Society Center, CNRS). While this did mean she was unable to attend any of the pre-WAC workshops on April 24th, the Spring School gave her a chance to prepare for the intensity of the conference and to have familiar faces to look for at the conference proper. She agrees that the diversity of both the careers and experiences of the attendees was a key component in the enrichment of their discussions, adding that this year’s mentoring program provided her and her fellow participants with “valuable insight on their career prospects and research subjects.”
Networking break in the Grand Auditorium Foyer of the BnF’s François Mitterrand site. Photo credit: Olga Holownia / IIPC
Conclusion
Overall, the bursary recipients found immeasurable value in the 2024 Web Archiving Conference, leaving with a wealth of gained knowledge, new connections, and a renewed sense of purpose in their web archiving careers.
“This conference was instrumental in shaping my understanding of web archival work, and I hope to use this knowledge as I prepare to begin my career as an archivist.”
– Lizzy Zarate
“Although some panels were too technical for my understanding, I can’t wait to have more experience in the field to understand its subtleties,” says Guérin, emphasizing that the conference experience confirmed her desire for a future career in web archiving. Melo agrees that the interactions with his fellow attendees “were instrumental in expanding my understanding of the global web archiving community,” and that he hopes that the connections he formed will lead to future collaborations.
“I left the conference with so much food for thought, and I am looking forward to the 2025 IIPC Web Archiving Conference in Oslo,” says Cannelli, before offering a “special thanks to the organizers for putting together such a fantastic event, and to the IIPC for their invaluable support through the student bursary.”
By Anastasia Nefeli Vidaki, law scholar and researcher at the Cyber and Data Security Lab, Vrije Universiteit Brussel
The IIPC Web Archiving Conference (WAC), organized by the International Internet Preservation Consortium (IIPC) and BnF (National Library of France) in partnership with the French National Audiovisual Institute (INA), united professionals and enthusiasts globally to explore the nuances of web archiving. Set in Paris, France, from 24 to 26 April 2024, this gathering served as a vital forum for participants to delve into groundbreaking research and good practices, network, and enjoy the French hospitality in the library, the exhibitions, and events. It was my first time participating in the conference, and in this report, I aim to depict my experience as a delegate and presenter.
Day 1: Arrival, Registration and Workshops
Entrance of the National Library of France | Photo credit: Anastasia Nefeli Vidaki, 2024
Upon arrival in Paris early on Wednesday morning, delegates were welcomed by the bright spring sunlight. After checking into our accommodations, we reached BnF’s François-Mitterrand site where the bulk of the conference was taking place. We were astonished by its modern architecture, rich collection, and exhibitions. Right after lunch, we registered for the conference and collected our badges and materials. In our conference bag we found many surprises, from a vintage map of Paris to a ticket for the library’s special exhibition, “La France sous leurs yeux.” The afternoon commenced with captivating workshops and ended with a warm welcome reception at the BnF’s Richelieu site. Among drinks and conversations, the attendees had the opportunity to get to know each other in a friendly environment and establish connections.
Day 2: Keynote Panel, Sessions, Workshops and Talks
With enthusiasm from the previous day’s experience, the second day commenced. The conference began with a keynote panel on Skyblog, which filled us with nostalgia and thoughts on the freedom of the web realm. Right after was a series of inspiring sessions delivered by esteemed practitioners in the field of archiving, digitization, data science, AI, law, and web-crawling. From discussions on the future of archiving to good practices, each presentation ignited engaging conversations and fueled attendees’ interest in the development and challenges of web archiving in the era of novel technologies. We delved deeper into specific areas with interactive workshops and panels and then sat in on lightning and drop-in talks, exchanging insights with fellow experts before walking across the foyer, which was filled with posters. The second day ended with dinner, giving participants a chance to relax from the active day and reflect upon all the information they received.
Photos from the first poster session, featuring Nola N’Diaye of the National Library of France (top right) and Helena Byrne of the British Library (bottom right) | Photo credit: Guillaume Murat / BnF
Day 3: Panel Discussions, Sessions and Posters
Day three started with another poster session relevant to the previous day’s discussions. While having our morning coffee and typical French croissants, we made our way around the poster presenters. That day of the conference was marked by dynamic panel discussions covering a diverse range of interdisciplinary topics. From exploring the role of communities in web archiving to calls for accessibility and inclusivity, the panels sparked lively debates.
I took advantage of the gifted ticket inside my conference bag and visited the photo exhibition during the lunch break. More relaxed, I returned to the last panel and sessions of the conference. During networking breaks, I was able to meet many experts across the globe, share ideas, and present my research and point of view on many topics of common interest.
With some closing remarks, takeaways, and a slide show of conference photos, WAC came to an end. Our stay in Paris ended along with it, and we prepared for our return with a feeling of fullness, valuable knowledge gained, and connections established throughout the last three days. Carrying with us a wealth of memories and insights, we waved goodbye to the National Library of France, WAC, and the people behind it, and promised to continue to be inspired and to strive for openness, preservation, and freedom in archiving, whether in physical or digital discourse.
Conclusion
The Web Archiving Conference in Paris offered us delegates an exceptional opportunity to delve into pioneering research, network with peers, and enjoy a few days exploring the vibrant city of Paris. From enlightening sessions to hands-on workshops and stimulating panel discussions and talks, the conference nurtured collaboration and innovation within the international web archiving community. Attendees left with a revitalized sense of purpose and passion, eager to apply the fresh insights and knowledge acquired toward tackling the field’s most significant challenges.
By Helena Byrne, Curator of Web Archives, British Library
The International Internet Preservation Consortium (IIPC)’s Content Development Group would like your help to archive websites from around the world related to the Olympic and Paralympic Games.
The IIPC has members in 33 countries, but there are over 200 countries competing in the Games, and we need your help to ensure that these countries are represented in the collection.
We want to collect websites in various formats such as:
Websites, or subsections of websites, related to the Olympic and Paralympic Games 2024
Individual articles, or documents on websites
News reports
Blogs
The subjects covered on these sites can include but are not limited to:
Athletes/Teams
Computer Games (eGames)
Doping/Cheating and Corruption
Environmental Issues
Fandom
Gender Issues (e.g. media coverage, sexual harassment, etc.)
General News/Commentary
Health issues (covid, bed bugs etc.)
Human Rights issues
Olympic/Paralympic Venues
Security
Sports Events
Other
Social media policy
Social media platforms are difficult to archive due to technical issues and privacy concerns. For these reasons, we will not be accepting nominations of content on Facebook, Instagram, Twitter, or from other social media platforms.
How to get involved
Once you have selected the web pages you would like to see in the collection it takes less than 5 minutes to fill in the submission form:
The Steering Committee (SC) is composed of no more than fifteen Member Institutions. SC Members provide oversight of the Consortium and define and oversee action on its strategy. This year, four seats are up for election.
What is at stake?
Serving on the Steering Committee is an opportunity for motivated members to help guide the IIPC’s mission of improving the tools, standards and best practices of web archiving while promoting international collaboration and the broad access and use of web archives for research and cultural heritage. SC members are expected to take an active role in leadership and help guide and administer the organisation. The elected SC members also lead IIPC Portfolios and thus have the opportunity to shape the Consortium’s strategic direction related to three main areas: tools development, membership engagement and partnerships. Every year, three SC members are designated as IIPC Officers (Chair, Vice-Chair and Treasurer) to serve on the IIPC Executive Board and are responsible for implementing the Strategic Plan. The SC members meet in person (if circumstances allow) at least once a year. Face-to-face meetings are supplemented by two teleconferences plus additional ones as required. The key tasks for the upcoming term include drafting and overseeing the implementation of the new Strategic Plan and Consortium Agreement.
Who can run for election?
Participation in the SC is open to any IIPC member in good standing. We strongly encourage any organisation interested in serving on the SC to nominate themselves for election.
Please note that the nomination should be on behalf of an organisation, not an individual. Once elected, the member organisation designates a representative to serve on the Steering Committee. The list of current SC member organisations is available on the IIPC website.
How to run for election?
All nominee institutions, both new members and existing members whose term is expiring but who are interested in continuing to serve on the SC, are asked to write a short statement (no longer than 200 words) outlining their vision for how they would contribute to IIPC via serving on the SC. Statements can point to current and past contributions to the IIPC activities (e.g. through collaborative projects, conference hosting, participation in SC, Working Groups or taskforces), relevant experience or expertise, new ideas for advancing the organisation, or any other relevant information.
All statements will be posted online and emailed to members prior to the election, giving all members ample time to review them. The results will be announced in October, and the three-year term on the Steering Committee will start on 1 January.
Below is the election calendar. We are very much looking forward to receiving your nominations. If you have any questions, please contact the IIPC Senior Program Officer (SPO).
Election Calendar
13 June – 10 September 2024: Nomination period. IIPC Designated Representatives are invited to nominate their organisation by emailing the IIPC SPO. The nomination statement should be no longer than 200 words.
11 September 2024: Nominee statements are published on the Netpreserve blog and circulated to the Members mailing list. Nominees are encouraged to campaign through their own networks.
11 September – 9 October 2024: Members are invited to vote online. The vote is cast by the Designated Representative.
11 October 2024: The results of the vote are announced on the Netpreserve blog and Members mailing list.
1 January 2025: The newly elected SC members start their three-year term.
By Friedel Geeraert, Expert in web archiving at KBR | Royal Library of Belgium
This year’s IIPC General Assembly and Web Archiving Conference took place at the Bibliothèque nationale de France (BnF) in Paris. It was wonderful to be welcomed once again into the warm web archiving community, especially in the superb surroundings the BnF had to offer. The welcome reception in the oval reading room at the BnF Richelieu site was especially memorable in that respect. Other than the lovely encounters with web archiving colleagues from around the world, the General Assembly and the Web Archiving Conference program had a lot to offer.
Opening remarks by the President of the BnF, Gilles Pécout, in Salle Ovale. Photo credit: Guillaume Murat, BnF
The General Assembly gave insight into the strategic plan for 2026-2031 and the reflections of the Steering Committee during their meeting that took place the day before. The transparency about their discussions and the active call for participation of members in determining the strategic priorities of the IIPC was greatly appreciated. The historical overview of the changes that have taken place in the Consortium Agreement was also fun to see, as it showed how the IIPC has grown as an organization over the decades.
Workshops offered participants opportunities to gain hands-on experience in becoming confident trainers in the domain of web archiving, running your own full stack SolrWayback, and crawling using the Browsertrix Cloud, among others. Panel discussions and keynotes allowed for deepening one’s knowledge about Skyblog (a French pioneer in social networks), the archivability of websites, archiving social media, and training Large Language Models. Sessions focused on a myriad of subjects such as capturing unique content (ads, digital artworks, memes, etc.), digital preservation, and planning (tenders, sustainability of web archiving programs, training, etc.). The poster sessions and the drop-in and lightning talks allowed participants to gather information on a whole range of concepts very efficiently.
This is only a selection of themes that were covered during the conference. The program comprised three parallel sessions, all covering interesting topics, thereby inspiring a significant level of FOMO in participants.
Friedel Geeraert presenting a KBR drop-in talk. Photo credit: Olga Holownia.
At KBR, there are currently three projects in the pipeline:
Setting up a web archive on a voluntary basis (via a public tender)
Extending the legal deposit legislation to online content
The BelgicaWeb research project. The project is funded by BELSPO, the Belgian Science Policy Office, through the BRAIN 2.0 program and aims to make the born-digital heritage of Belgium accessible and FAIR.
Bearing in mind this institutional context, a number of elements evoked during the General Assembly and Web Archiving Conference are particularly useful. Within the BelgicaWeb project, we will further look into SolrWayback and Browsertrix Cloud. APIs offered by organizations such as Arquivo.pt are also sources of inspiration. Initiatives such as Datasheets for Web Archives by Emily Maemura and Helena Byrne can also prove useful in describing the provenance of collections of archived web content. Using PWIDs to reference web sources archived in certain web archive collections has also been adopted as best practice within the BelgicaWeb project.
As a member of the Preservation Working Group at KBR, I found the session on Digital Preservation especially useful. The Danish Royal Library proved itself once again as one of the leading examples in Europe where digital preservation of born-digital content is concerned. Thanks to their presentations, we will be looking further into Bitrepository.org.
All in all, this was another great edition of the IIPC GA & WAC. I can’t wait for the next conference in Oslo!