influenzAI: Using AI to provide convenient, private, trustworthy access to Long COVID & Influenza information from open, reputable government data.

Project Info


Team Name


G2


Team Members


David, Jacky, Sudeep, Kien Pham, Emilian Roman, Shann, Dhanush, William

Project Description


influenzAI

A revolutionary clinical decision support tool that uses generative AI to give clinicians on-demand, evidence-based insights and recommendations on long COVID and influenza, drawing on its curated repository of healthcare data.

Our Mission

During the COVID pandemic, we have seen an incredible explosion of information. However, clinicians are unable to efficiently incorporate the ever-expanding sea of data into their clinical practice. An effective mitigation is to leverage emerging generative AI technology to allow easy, convenient and intuitive access to reputable information.

This is the gap InfluenzAI aims to fill. It is a generative AI solution powered by LocalGPT that allows clinicians to query a model trained on trusted, reputable open government data. By ethically curating the data we ingest into the large language model, we strive to mitigate bias and influence, and aim for inclusivity and neutrality.

While currently targeted at busy clinicians, we believe that there is an opportunity in education for healthcare students to utilise this platform. InfluenzAI will show the source data and related data used to generate its answers, thus making it a convenient yet comprehensive medical resource.

Data security and patient privacy are paramount, so we self-host the large language model and ensure that no patient-identifiable information is stored, let alone sent to third parties. For transparency and discoverability, InfluenzAI will show you the exact datasets used to respond to the query. Furthermore, it will show you related datasets to broaden your knowledge and understanding.

Accessibility

  1. Querying is done using natural language prompts, instead of scrolling through oceans of documents. A person should not need to think like a computer when looking for answers!
  2. A minimalist, responsive UI which focuses on the queries and responses. With this project's architecture, different UIs can be built for different kinds of scenarios and devices.

Scaling Potential

Of course, Long COVID and Influenza are a drop in the ocean - there are plenty of other health issues and concerns people have. Using a diverse range of open government data, we can augment our LLM with additional knowledge of other topics, and reach new audiences who would benefit from a convenient, informative and private source of healthcare information.

From a technical perspective, InfluenzAI has been architected with the intent of being modular and scalable. We can leverage container orchestration to scale up the project as necessary, with minimal difficulties.

Using features such as voice-to-text, we improve convenience by reducing typing on phones for busy medical practitioners, and emphasise accessibility for people new to this technology.

While InfluenzAI focuses on Influenza and long COVID, our long-term plan is to expand to other pathogens and use InfluenzAI as a stepping stone towards VirAI.

Interactive and Visual

To improve the depth of responses, we built our own system which visualises government data using intuitive, useful charts and maps. This bespoke system would respond to known queries (e.g. "long covid trends in 2023") using visual responses, as a complement to the text-based responses from our LLM.
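The routing of known queries to visual responses could be sketched as follows. This is a minimal illustration, not the project's actual implementation: the patterns, the chart identifiers and the function names (`match_visual`, `normalise`) are all assumptions made for the example.

```python
import re
from typing import Optional

# Hypothetical registry of known queries. The patterns and chart
# identifiers are illustrative only and do not come from the project.
KNOWN_VISUAL_QUERIES = [
    (re.compile(r"\blong covid trends? in (?P<year>\d{4})\b"), "long_covid_trend_chart"),
    (re.compile(r"\binfluenza (?:cases|notifications) by state\b"), "influenza_state_map"),
]


def normalise(query: str) -> str:
    """Lower-case the query and collapse whitespace so matching is forgiving."""
    return " ".join(query.lower().split())


def match_visual(query: str) -> Optional[dict]:
    """Return the visual to render for a known query, or None if there is none."""
    text = normalise(query)
    for pattern, visual_id in KNOWN_VISUAL_QUERIES:
        found = pattern.search(text)
        if found:
            # Named groups (e.g. the year) become parameters for the chart.
            return {"visual": visual_id, "params": found.groupdict()}
    return None
```

A query that matches no pattern returns `None`, so the text answer from the LLM is shown on its own; a match adds a chart or map alongside it.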

Privacy

Privacy is paramount in healthcare, which is why our solution uses a self-hosted ML model trained on open government data.

Queries do not get exposed to third-party solutions, and we refrain from gathering personal information from people.

Bias Mitigation

Our architecture and self-hosted model grant us control over our model, and thus permit us to minimise external bias. The challenge of bias lies in the government data itself, which we can curate using an ethical protocol that focuses on inclusivity, diversity and authenticity.

Openness and Transparency

To improve openness and transparency, we visually display all the documents/datasets used for our model. This allows our model to be scrutinised for any unintended biases, and provides users/stakeholders with confidence in the potential & quality of our model.

Our Team

In alphabetical order...

  • David: Data Analysis / Content Creator
  • Dhanush: Quality Assessment Lead
  • Emilian: Lead Developer / Architect / UI Layout
  • Jacky: Content Creator / Business Analyst / Industry Insight
  • Kien Pham: Website UI/UX Design
  • Shann: Data Analyst / Curator / Narrator
  • Sudeep: Graphic / Logo Design
  • William: Project Manager / Data Analysis / Content Creator

Architecture

InfluenzAI consists of four major components:

  1. The InfluenzAI Web Interface, a PHP platform which provides a minimalist and beautiful UI to the end-user for querying and responses.
  2. The InfluenzAI Database, which stores metadata (name, source, keywords, etc.) for the documents fed into LocalGPT. Using this metadata, we can provide accurate and related datasets relevant to the query.
  3. The InfluenzAI Bridging API, which sits in-between the Web Interface and the LLM back-end (LocalGPT for our prototype).
  4. The back-end LLM, which ingests the provided open government data and responds to the inbound user queries.
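As an illustration of how the stored metadata could surface related datasets, here is a minimal sketch that ranks records by keyword overlap with the query. The `DatasetRecord` fields mirror the metadata named above (name, source, keywords), but the class, the function name and the scoring rule are assumptions for the example, not the project's actual schema or ranking logic.

```python
import re
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class DatasetRecord:
    """Hypothetical metadata row: name, source and keywords of one document."""
    name: str
    source: str
    keywords: Set[str] = field(default_factory=set)


def related_datasets(query: str, records: List[DatasetRecord], limit: int = 3) -> List[DatasetRecord]:
    """Return up to `limit` records, ordered by how many keywords the query shares."""
    # Split on non-alphanumerics so punctuation such as "?" does not block a match.
    terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    scored = []
    for record in records:
        overlap = len(terms & record.keywords)
        if overlap:
            scored.append((overlap, record.name, record))
    # Highest overlap first; ties are broken alphabetically by name.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [record for _, _, record in scored[:limit]]
```

Records with no keyword in common with the query are left out entirely, so an unrelated dataset is never presented as related.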

[Image: architecture diagram]

Using this architecture, the components remain decoupled and scalable. This opens up the opportunities for alternative UIs (e.g. mobile apps) and LLMs (e.g. LLaMA).
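The decoupling described here can be sketched as a bridging layer that depends only on a small back-end interface. This is a hedged illustration: `LLMBackend`, `EchoBackend`, `BridgingAPI` and the response shape are hypothetical names chosen for the example, and the stub does not call LocalGPT or any real model.

```python
from typing import Protocol


class LLMBackend(Protocol):
    """Anything that can turn a prompt into an answer (assumed interface)."""

    def answer(self, prompt: str) -> str: ...


class EchoBackend:
    """Stand-in back-end for local testing; a real one would wrap the hosted LLM."""

    def answer(self, prompt: str) -> str:
        return f"(stub answer for: {prompt})"


class BridgingAPI:
    """Sits between a UI and whichever back-end it is given."""

    def __init__(self, backend: LLMBackend) -> None:
        self._backend = backend

    def handle_query(self, query: str) -> dict:
        cleaned = query.strip()
        if not cleaned:
            # Reject blank input before it reaches the model.
            return {"ok": False, "error": "empty query"}
        return {"ok": True, "answer": self._backend.answer(cleaned)}
```

Because `BridgingAPI` only relies on the `answer` method, swapping the back-end or adding another UI would not require changes on the other side of the bridge.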

Attributes


#ai #health #prompts #ml #machine #learning #artificial #intelligence #healthcare #long #covid #influenza #privacy #transparency #intuitive

Data Story


Clinicians have access to an ever-expanding sea of data. However, they do not have the time to absorb and apply this wealth of knowledge in the clinical setting.

We aim to solve this with InfluenzAI. As a generative AI-powered solution, it harnesses the potential to grasp and distil the vast expanse of medical literature, alleviating the burden on clinicians and enhancing their decision-making process.

In our proof of concept, we have curated a list of healthcare datasets focusing on long COVID and influenza. These include treatment protocols, surveillance reports, journal articles, and parliamentary inquiries from a variety of government and non-government sources:
• Federal and State Departments of Health
• RACGP
• data.nsw.gov.au
• Department of Education
• Australian Institute of Health and Welfare
• High-impact journals such as Nature

These datasets were used to train InfluenzAI's model. Users can then access InfluenzAI through our dedicated website to ask specific queries, and InfluenzAI returns an answer drawn from its healthcare knowledge repository.

InfluenzAI leverages the diverse and curated datasets to provide clinicians with on-demand, evidence-based insights and recommendations in response to specific queries. Our groundbreaking decision support tool exemplifies how data reuse can revolutionise patient care and medical decision-making.


Evidence of Work

Video

Homepage

Project Image

Team DataSets

2022 Attendance rates by Government Schools (CSV)

Description of Use: Used to train the generative AI model to improve query responses about the lasting impacts of COVID on education.

Data Set

Australian Institute of Health and Welfare

Description of Use: A subset of resources was taken from the AIHW to train the generative AI on resources related to COVID and long COVID.

Data Set

Nature

Description of Use: A subset of publications was taken from Nature to train the generative AI on leading research relating to COVID and its mechanisms.

Data Set

Data.gov.au - Open Data Portal

Description of Use: A subset of resources was taken from the Open Data Portal to train the generative AI on varied government data relating to COVID, influenza and other health information.

Data Set

NSW Department of Education

Description of Use: A subset of resources was taken from NSW Education to train the generative AI on information regarding the impacts of COVID on education.

Data Set

Royal Australian College of General Practitioners

Description of Use: A subset of resources was taken from the RACGP to train the generative AI on information regarding COVID and long COVID, to improve the usefulness of responses.

Data Set

Department of Health and Aged Care

Description of Use: A subset of resources was taken to train the generative AI on documents and information regarding influenza, COVID and other health topics.

Data Set

Menzies

Description of Use: A subset was taken from the Menzies resources to train the generative AI on information regarding rheumatic fever.

Data Set

Challenge Entries

Using Open Data to Close the Gap

How can we improve the data and information sharing practices between governments and Aboriginal and Torres Strait Islander communities and organisations to Close the Gap?

7 teams have entered this challenge.

Generative AI: Unleashing the Power of Open Data

Explore the potential of Generative AI in conjunction with Open Data to empower communities and foster positive social impact. This challenge invites participants to leverage Generative AI models to analyse and derive insights from Open Data sourced from government datasets. By combining the power of Generative AI with the wealth of Open Data available, participants can create innovative solutions that address real-world challenges and benefit communities.

29 teams have entered this challenge.

Using machine learning and generative AI to improve health outcomes

How can machine learning or generative AI be used to help Australians to live longer, healthier, happier lives?

14 teams have entered this challenge.

COVID’s lasting impact on education

What impact has COVID had on student educational outcomes (for example retention/enrolment rates, HSC completions and more)? What lessons can we learn to minimise the impact on education for future pandemics?

7 teams have entered this challenge.

Best Creative Use of Data in Response to ESG

How can you showcase data in a creative manner to respond to ESG challenges? How can we present and visualise data to stimulate conversation and promote change?

33 teams have entered this challenge.