AutoDoc - Generative AI Document Sorting and Searching

Project Info

Team Name

Team Alone(?) (I am alone) (Creative Name!)

Team Members


Project Description


Have you or your team ever had to go through hundreds of documents to index them, manually adding tags and dates? This is now a problem of the past with AutoDoc, a platform that lets you interact with your data in a whole new way. Using OCR to read your documents and OpenAI’s GPT to interpret them, we are able to automate the whole process of indexing, sorting, and even interacting with your documents. Importing and sorting documents, which used to be a labor-intensive process, is now possible in a couple of clicks.

When you upload your document, or even an image of your document, it is processed through OCR, formatted, and run through OpenAI’s GPT chat model, which is trained to extract the title, date, and persons and groups involved, and to write a short searchable description of the document.
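As a rough illustration of the step described above, the sketch below takes text that has already been through OCR, tidies it, asks a chat model for metadata, and parses the reply. It is only a sketch under assumptions: the field names, the prompt wording, the JSON reply format, and the `complete` callable (a stand-in for whatever function actually sends a prompt to the GPT model) are all hypothetical and are not taken from AutoDoc itself.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, List


# Hypothetical record layout; the description only lists what is extracted.
@dataclass
class DocumentRecord:
    title: str = ""
    date: str = ""
    persons: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    description: str = ""


# Assumed prompt wording, asking the model to answer in JSON.
PROMPT_TEMPLATE = (
    "Extract metadata from the document text below. "
    "Reply with a single JSON object with the keys "
    '"title", "date", "persons", "groups" and "description".\n\n'
    "Document text:\n{text}"
)


def clean_ocr_text(raw: str) -> str:
    """Collapse the stray whitespace and line breaks OCR tends to leave behind."""
    return " ".join(raw.split())


def _as_list(value: object) -> List[str]:
    """Return a list of strings, or an empty list if the value is not a list."""
    return [str(v) for v in value] if isinstance(value, list) else []


def extract_metadata(ocr_text: str, complete: Callable[[str], str]) -> DocumentRecord:
    """Build the prompt, call the model, and parse its JSON reply.

    `complete` is a hypothetical callable that takes a prompt and returns the
    model's text reply. An unparseable reply yields an empty record.
    """
    prompt = PROMPT_TEMPLATE.format(text=clean_ocr_text(ocr_text))
    reply = complete(prompt)
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return DocumentRecord()
    if not isinstance(data, dict):
        return DocumentRecord()
    return DocumentRecord(
        title=str(data.get("title", "")),
        date=str(data.get("date", "")),
        persons=_as_list(data.get("persons")),
        groups=_as_list(data.get("groups")),
        description=str(data.get("description", "")),
    )
```

Passing the model call in as a parameter keeps the parsing logic testable without network access; in practice `complete` would wrap a real chat completion request.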

#ai #openai #llm #ocr #documents

Data Story

Taking the data and information provided by ArchivesACT and the Territory Records Office, then sorting and indexing it so that it is accessible in a more user-friendly context, isn’t an easy challenge.

However, AutoDoc addresses the challenges brought by manually importing, indexing and sorting these historical documents. It employs optical character recognition (OCR) to turn even handwritten notes into a searchable and indexable format, then uses OpenAI’s GPT chat model to interpret the documents, extracting titles, dates, tags, the persons and groups involved, and a short searchable description. This allows manual searching for documents, and OpenAI’s GPT chat model is used again to let you request information from the documents in a more natural way.
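To make the manual search over extracted metadata more concrete, here is a minimal sketch of a keyword search across those fields. The record layout, the field names, and the count-based ranking are assumptions chosen for illustration; the description does not say how AutoDoc stores or ranks its records.

```python
from typing import Dict, List

# Hypothetical metadata fields, mirroring the items listed in the description.
SEARCH_FIELDS = ("title", "date", "tags", "persons", "groups", "description")


def _flatten(value: object) -> str:
    """Turn a field value (string, list of strings, or missing) into plain text."""
    if value is None:
        return ""
    if isinstance(value, list):
        return " ".join(str(v) for v in value)
    return str(value)


def search_records(records: List[Dict[str, object]], query: str) -> List[Dict[str, object]]:
    """Return records that mention any query term, most occurrences first."""
    terms = query.lower().split()
    scored = []
    for record in records:
        # Join every searchable field into one lowercase string.
        haystack = " ".join(_flatten(record.get(name)) for name in SEARCH_FIELDS).lower()
        # Score is the total number of times the query terms occur.
        score = sum(haystack.count(term) for term in terms)
        if score > 0:
            scored.append((score, record))
    # The sort is stable, so records with equal scores keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored]
```

A plain substring count like this is the simplest possible ranking; the natural-language querying mentioned above would instead pass the question and the matching records to the chat model.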


Team DataSets

State Records of South Australia

Description of Use: Documents used for testing code.

Data Set

ACT Memory

Description of Use: Example data used in the video.

Data Set

Challenge Entries

Making public archives more accessible

Online catalogues, like ACT Memory, provide information about government records and, where possible, provide copies of the records themselves. These records are generally in PDF or JPEG format. This makes the documents difficult to search for, access, and use. How might governments with record catalogues, like ACT Memory, solve this problem and make these rich sources of information more useful?

7 teams have entered this challenge.

Generative AI: Unleashing the Power of Open Data

Explore the potential of Generative AI in conjunction with Open Data to empower communities and foster positive social impact. This challenge invites participants to leverage Generative AI models to analyse and derive insights from Open Data sourced from government datasets. By combining the power of Generative AI with the wealth of Open Data available, participants can create innovative solutions that address real-world challenges and benefit communities.

29 teams have entered this challenge.