Team Name:

Early Development Duo for Insights and Education (EDDIE)


EDDIE Development Needs Prediction

Project Info

Team Members


Cat and 1 other member with an unpublished profile.

Project Description


Machine learning can help us crunch data to better plan for the future. Because early childhood development is critical to building our society's future, we trained a random forest model on over 50 variables from Victorian and Commonwealth datasets. The model predicts the developmental vulnerability of each local government area (LGA), and the demand for the health and social support workforce by LGA across Victoria, up to 2031.


#earlychildhood #randomforest #victoriangovernment #prediction #machinelearning #ai #socialpolicy #victoria

Data Story


We used a range of datasets to inform our prediction model:
* To map Victorian LGAs, we used the ABS LGA shapefile data, available at: https://www.abs.gov.au/statistics/standards/australian-statistical-geography-standard-asgs-edition-3/jul2021-jun2026/access-and-downloads/digital-boundary-files
* ABS TableBuilder was a great source of LGA-level workforce data, and is where we sourced the size of the healthcare and social service workforce in each LGA: https://www.abs.gov.au/statistics/microdata-tablebuilder/tablebuilder
* Data Victoria's town and community profiles data provided many rich predictor variables, including family violence rates, offending rates, the proportion of people born overseas, unemployment, one-parent families, food insecurity, immunisations and distance from Melbourne: https://discover.data.vic.gov.au/dataset/2011-town-and-community-profiles-data/resource/01feeb94-359a-455b-9a70-d3bca9f432ac
* SEIFA data for 2021 came from the ABS; we used both disadvantage scores and deciles: https://www.abs.gov.au/statistics/people/people-and-communities/socio-economic-indexes-areas-seifa-australia/latest-release
* And, to bring it all together, the population projections produced by Victoria's Department of Transport and Planning: https://www.planning.vic.gov.au/__data/assets/excel_doc/0032/691655/VIF2023_LGA_VIFSA_Pop_Age_Sex_Projections_to_2036_Release_2.xlsx
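
Combining these sources means lining them up on a shared LGA identifier. The sketch below shows one way that join could look. It is written in Python for illustration only (the team worked in R), and the function name `build_feature_table`, the source names, the column names and all values are hypothetical placeholders, not the team's actual data or code.

```python
def build_feature_table(sources):
    """Merge several {lga_code: {column: value}} tables into one table keyed by LGA.

    Only LGAs present in every source are kept (an inner join), so each
    row has a complete set of predictor values.
    """
    common = set.intersection(*(set(table) for table in sources.values()))
    merged = {}
    for lga in sorted(common):
        row = {}
        for name, table in sources.items():
            for column, value in table[lga].items():
                row[f"{name}.{column}"] = value  # prefix avoids column-name clashes
        merged[lga] = row
    return merged


# Placeholder LGA codes and values, not real figures.
sources = {
    "seifa": {
        "LGA_A": {"score": 1000, "decile": 5},
        "LGA_B": {"score": 950, "decile": 3},
    },
    "workforce": {
        "LGA_A": {"health_social_workers": 1200},
        "LGA_B": {"health_social_workers": 800},
        "LGA_C": {"health_social_workers": 400},  # dropped: missing from "seifa"
    },
}

table = build_feature_table(sources)
for lga, row in table.items():
    print(lga, row)
```

An inner join is only one option; keeping every LGA and imputing the missing values would be an equally reasonable design, depending on how many LGAs have gaps.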

We used a random forest model, a type of machine learning model, to predict workforce demand, the need for additional workforce, and the expected AEDC results up to 2031. We chose it because it has been tested on similar demand problems and shown to be more predictive than regression-based models and as good as other machine learning models. It is also very interpretable, unlike some machine learning models, and handles collinear variables reasonably well. We split the data into a training set and a hold-out test set, and the model achieved very good accuracy on the test set.
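
To make the training and hold-out evaluation concrete, here is a minimal, pure-Python sketch of a regression random forest (bootstrap samples plus a random subset of features at each split). It is an illustration under stated assumptions rather than the team's implementation: the data is synthetic, the function names and the `max_depth`, `min_size`, `n_trees` and `n_features` settings are hypothetical, and a real project would normally use an established library instead.

```python
import random
from statistics import fmean


def sse(values):
    """Sum of squared errors around the mean."""
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, targets, features):
    """Return the (feature, threshold) that most reduces squared error, or None."""
    best, best_err = None, sse(targets)
    for f in features:
        # Dropping the largest value keeps both sides of every split non-empty.
        for threshold in sorted({r[f] for r in rows})[:-1]:
            left = [t for r, t in zip(rows, targets) if r[f] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[f] > threshold]
            err = sse(left) + sse(right)
            if err < best_err:
                best, best_err = (f, threshold), err
    return best


def build_tree(rows, targets, n_features, rng, depth=0, max_depth=4, min_size=4):
    """Grow a tree: a leaf is a float, a node is (feature, threshold, left, right)."""
    if depth >= max_depth or len(rows) < min_size:
        return fmean(targets)
    split = best_split(rows, targets, rng.sample(range(len(rows[0])), n_features))
    if split is None:
        return fmean(targets)
    f, threshold = split
    left = [i for i, r in enumerate(rows) if r[f] <= threshold]
    right = [i for i, r in enumerate(rows) if r[f] > threshold]
    return (
        f,
        threshold,
        build_tree([rows[i] for i in left], [targets[i] for i in left],
                   n_features, rng, depth + 1, max_depth, min_size),
        build_tree([rows[i] for i in right], [targets[i] for i in right],
                   n_features, rng, depth + 1, max_depth, min_size),
    )


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        f, threshold, left, right = node
        node = left if row[f] <= threshold else right
    return node


def fit_forest(rows, targets, n_trees=30, n_features=2, seed=0):
    """Fit each tree on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [targets[i] for i in idx], n_features, rng))
    return forest


def predict(forest, row):
    """Average the predictions of all trees."""
    return fmean(predict_tree(tree, row) for tree in forest)


# Synthetic data: the target depends on feature 0 only; features 1 and 2 are noise.
data_rng = random.Random(42)
rows = [[data_rng.random() for _ in range(3)] for _ in range(200)]
targets = [10 * r[0] + data_rng.gauss(0, 0.5) for r in rows]

# Hold out the last 20% of rows (already in random order) for testing.
train_rows, test_rows = rows[:160], rows[160:]
train_y, test_y = targets[:160], targets[160:]

forest = fit_forest(train_rows, train_y)
forest_mae = fmean(abs(predict(forest, r) - y) for r, y in zip(test_rows, test_y))
baseline_mae = fmean(abs(fmean(train_y) - y) for y in test_y)
print(f"forest MAE: {forest_mae:.2f}, mean-only baseline MAE: {baseline_mae:.2f}")
```

Comparing the hold-out error against a trivial baseline (always predicting the training mean) is one simple way to put an accuracy figure in context.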

The LGAs with the highest projected increase in need for health and social service workers are: Melton, Bayside, Hume, Melbourne, Maribyrnong and Greater Dandenong.

The LGAs with the highest number of predicted children with early development vulnerabilities are: Wyndham, Casey, Melton, Whittlesea, Hume and Melbourne.
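
Rankings like the two above can be derived by comparing each LGA's projected value with its current value and sorting. The sketch below assumes the increase is measured as an absolute difference, which the write-up does not specify. The function name `top_increases` is hypothetical and the figures are placeholders, not the project's results.

```python
def top_increases(current, projected, n=3):
    """Rank LGAs by projected minus current value, largest increase first."""
    increase = {
        lga: projected[lga] - current[lga]
        for lga in current
        if lga in projected  # skip LGAs without a projection
    }
    return sorted(increase.items(), key=lambda item: item[1], reverse=True)[:n]


# Placeholder worker counts, not real figures.
current = {"LGA_A": 1000, "LGA_B": 1500, "LGA_C": 700}
projected = {"LGA_A": 1200, "LGA_B": 2000, "LGA_C": 750}

ranking = top_increases(current, projected, n=2)
print(ranking)
```

Ranking by percentage change instead would favour small LGAs that are growing quickly, so the choice of measure affects which areas appear at the top.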

Using R’s Leaflet package, we have presented our predictions as heat maps, which clearly show the areas likely to need a larger support workforce in the future.
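
A map like this needs a rule for turning each LGA's predicted value into a colour. The sketch below shows one common approach, quantile binning, in Python. It is not the team's Leaflet code: the `colour_by_quantile` function, the four-colour palette and the values are all illustrative assumptions.

```python
from bisect import bisect_right
from statistics import quantiles

# Light-to-dark palette; the specific colours are an arbitrary choice.
PALETTE = ["#ffffb2", "#fecc5c", "#fd8d3c", "#e31a1c"]


def colour_by_quantile(values, palette=PALETTE):
    """Assign each LGA a colour by the quantile bin its value falls in."""
    cuts = quantiles(values.values(), n=len(palette))  # len(palette) - 1 cut points
    return {lga: palette[bisect_right(cuts, v)] for lga, v in values.items()}


# Placeholder predictions, not real figures.
predicted = {"LGA_A": 10, "LGA_B": 20, "LGA_C": 30, "LGA_D": 40, "LGA_E": 50}
colours = colour_by_quantile(predicted)
for lga, colour in colours.items():
    print(lga, colour)
```

Quantile bins put roughly equal numbers of LGAs in each colour, which keeps a few very large areas from washing out the rest of the map.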

This work could be extended by adding more variables, and more recent ones, and by exploring the decision trees that make up the random forest to understand which variables are most predictive, either singly or in combination, to inform more granular service planning.
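
One way to start on the question of which variables matter most is permutation importance: shuffle one variable at a time and measure how much the prediction error grows. This is a different technique from reading the individual trees, offered only as a possible first step. In the sketch below the "model" is a stand-in function and the data is synthetic; neither comes from the project.

```python
import random


def permutation_importance(predict, rows, targets, n_repeats=5, seed=0):
    """Mean increase in absolute error when each feature column is shuffled."""
    rng = random.Random(seed)

    def mae(data):
        return sum(abs(predict(r) - t) for r, t in zip(data, targets)) / len(targets)

    baseline = mae(rows)
    importances = []
    for f in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[f] for r in rows]
            rng.shuffle(column)  # break the link between feature f and the target
            shuffled = [r[:f] + [v] + r[f + 1:] for r, v in zip(rows, column)]
            increases.append(mae(shuffled) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def stand_in_model(row):
    """Stand-in for a fitted model: uses feature 0 and ignores feature 1."""
    return 10 * row[0]


data_rng = random.Random(1)
rows = [[data_rng.random(), data_rng.random()] for _ in range(50)]
targets = [10 * r[0] for r in rows]

importances = permutation_importance(stand_in_model, rows, targets)
print(importances)  # feature 0 should score well above feature 1
```

Because permutation importance shuffles one variable at a time, it can understate the importance of variables that are strongly correlated with each other, so results on collinear predictors should be read with care.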


Team DataSets

VIC Population Projection

Description of Use: This data was used by our model to make predictions on the changes in developmental vulnerability across Victoria.

AEDC Data

Description of Use: This data was used to predict the expected developmental vulnerability in each LGA for the years 2026 and 2031.

Healthcare and Social Assistance workers by LGA

Description of Use: This dataset helped our model predict the number of additional workers in this industry that would be needed in each LGA.

SEIFA data

Description of Use: We used both SEIFA disadvantage scores and deciles as predictor variables for AEDC results and workforce demand predictions up to 2031.

LGA profiles

Description of Use: We used these community profiles to train our model on a variety of features of LGAs to help its predictive capacity.

Challenge Entries

Forecasting Community Evolution: Leveraging AI and Historical Planning Data

How might we predict future changes in community dynamics, such as population density, housing demand, traffic patterns, and the demand for public services or amenities?

13 teams have entered this challenge.