How ChatGPT can help you do archival research — but never replace archivists


Archivists assist users like historians, genealogists, students or citizens in locating, accessing and interpreting archives. Archival reference services have long been seen as services that mediate understanding and dialogue between archivists, users and archives to make documentary objects more accessible and usable.

Recent years have seen the introduction of artificial intelligence (AI) in heritage institutions like libraries, archives, museums and galleries.

Researchers are examining how AI is affecting and will affect archival services, from the automation of recordkeeping to the organization of archives and new forms of digital archives. There has been much discussion about the benefits of AI in terms of supporting users.

Among AI-powered technologies, ChatGPT can support some aspects of archival reference services. However, using it requires human supervision.

Through a few examples of a real conversation with this chatbot, it’s possible to explore the relevance of this AI-powered technology as an archival assistant — and also, its limitations.

A glossary of key concepts

Suppose a student in the social sciences needs to consult and examine historical written sources. However, the student is insufficiently familiar with the basic principles of archival science: how to identify, acquire, authenticate, preserve and provide access to records of continuing value.

Having submitted an online request to the national archives of his country (for example, Library and Archives Canada), the student is invited to consult the archives of interest in person, as they aren’t accessible online.

If the student wants a basic orientation before the visit, he could turn to ChatGPT for help, asking the question: “What are archives?”

When I posed this question, the generated answer suggested that this chatbot can be used as a glossary. The answer focused on the nature of archives and their importance for providing authentic evidence of activities: “Archives refer to a collection of historical records that are preserved for long term use. These records can be in various formats, including paper documents, photographs, audiovisual materials, digital files and more.”

ChatGPT defined what the word “archives” means and described the most distinctive properties of these authentic documentary objects. What was particularly helpful was that ChatGPT characterized archives as documentary evidence preserved for administrative, historical or scientific purposes.

A practical guide or a manual

Suppose the student notices that, in an e-mail received from the national archives of his country, the archivist used the term “finding aid” to refer to a search tool. The student wishes to understand how a finding aid works.

I posed this question to ChatGPT, and the generated answer reflects the set of instructions archivists usually provide to users. It indicates that a finding aid helps users become familiar with the scope of the archives, how they are arranged and any restrictions that apply to them.

ChatGPT-generated answer for the question: ‘How does a finding aid work to retrieve archives?’ (Siham Alaoui), Author provided (no reuse)

A search engine or an online catalogue

Or, if the student decides to visit an archives consultation room, he might notice that some archives include a reference to archival material held by the National Library and Archives of Québec (Bibliothèque et Archives nationales du Québec).

The student doesn’t live in Québec but is willing to travel there to consult these archives. It is possible to query ChatGPT — as I did — to see if it can provide him with some relevant information so he is more prepared for his visit, asking: “Give me an example of collections held by Bibliothèque et Archives nationales du Québec.”

For each of the examples generated, ChatGPT provides a short description of the archives included in each category (photographs, maps, texts, audiovisual material, and so on). This query does not yield precise names of archives. Nevertheless, even this limited query suggests that ChatGPT can also enhance the visibility of heritage institutions’ archival collections and support better dissemination of cultural heritage.

ChatGPT-generated answer for the question: ‘Give me an example of collections held by Bibliothèque et Archives nationales du Québec.’ (Siham Alaoui), Author provided (no reuse)

‘Archival intelligence’

Getting the most out of one’s archival user experience requires a set of skills that archival researchers Elizabeth Yakel and Deborah A. Torres refer to as archival intelligence. As my examples show, a generative AI tool like ChatGPT makes it possible for users to develop part of this archival intelligence autonomously.

However, limitations must be noted. In my examples, most of the ChatGPT-generated answers use terms like “document” and “record” interchangeably. A trained eye, familiar with the sources and with variations in terminology across time and systems, is needed to consider how this affects the search.

Expert familiarity with terminology

When I tried to resolve this by posing another query, “What is the difference between a record and a document?”, ChatGPT was not able to properly identify the key difference between the two.

The answer given was: “Documents are a subset of records. Not all records are documents, as records can include non-documentary forms of information like data entries, logs, or other records or activities.” This answer isn’t correct: in archival science, records are documents, since they are products of an inscription on a relatively stable medium.

ChatGPT response to: ‘What is the difference between a record and a document?’ (Siham Alaoui), Author provided (no reuse)

Archivists still needed

Many scholars in archival science have studied the forms of records in digital environments. Chief among these are the research groups of the International Research on Permanent Authentic Records in Electronic Systems (InterPARES), led by Luciana Duranti, professor emerita in archival studies at the University of British Columbia.

InterPARES offers insights on how records, which should exist in a documentary medium and have persistent content, can manifest differently in digital environments. So, records are documents. This shows that ChatGPT may give inaccurate information. Another issue is that users may misinterpret ChatGPT’s answers.

Thus, archivists are still needed to play a documentary mediation role with users. This will help ensure that users make good use of AI-powered technologies to improve their archival intelligence and better understand archives and their terminology.

Author Bio: Siham Alaoui is a PhD candidate in archival and communication studies, sessional lecturer and research assistant at Université Laval.