Data Story
We used a range of datasets to inform our prediction model. These included:
* For mapping of Victorian LGAs we used the ABS LGA shape file data available at: https://www.abs.gov.au/statistics/standards/australian-statistical-geography-standard-asgs-edition-3/jul2021-jun2026/access-and-downloads/digital-boundary-files
* ABS TableBuilder was a great source of LGA-level workforce data, and is where we sourced the size of the healthcare and social service workforce in each LGA: https://www.abs.gov.au/statistics/microdata-tablebuilder/tablebuilder
* Data Victoria's town and community profiles data provided many rich predictor variables, including family violence rates, offending rates, the proportion of people born overseas, unemployment, one-parent families, food insecurity, immunisations and distance from Melbourne: https://discover.data.vic.gov.au/dataset/2011-town-and-community-profiles-data/resource/01feeb94-359a-455b-9a70-d3bca9f432ac
* SEIFA data for 2021 came from the ABS; we used both disadvantage scores and deciles: https://www.abs.gov.au/statistics/people/people-and-communities/socio-economic-indexes-areas-seifa-australia/latest-release
* And, to bring it all together, the population projection work done by Victoria's Department of Transport and Planning: https://www.planning.vic.gov.au/__data/assets/excel_doc/0032/691655/VIF2023_LGA_VIFSA_Pop_Age_Sex_Projections_to_2036_Release_2.xlsx
We used a random forest, a type of machine learning model, to predict workforce demand, the need for additional workforce, and the expected AEDC results up to 2031. The benefit of this model is that it has been tested on similar demand problems and shown to be more predictive than regression-based models and as good as other machine learning models. It is also more interpretable than some machine learning models, and it handles collinear variables reasonably well. We trained the model on a training set and evaluated it on a hold-out test set, where it showed very good accuracy.
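To make the approach concrete, here is a minimal sketch of a random forest regressor evaluated on a hold-out test set, written in plain Python. It is an illustration only, not the model behind our results: the data is synthetic, the three predictor columns are hypothetical stand-ins for LGA-level variables, and the function names (`build_tree`, `fit_forest`, `predict_forest`, `r_squared`) and parameter values are assumptions chosen for the example.

```python
import random

def mean(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(rows, targets, n_try, rng, depth=0, max_depth=5, min_leaf=3):
    """Grow one regression tree; each split tries a random subset of n_try features."""
    if depth >= max_depth or len(rows) < 2 * min_leaf:
        return mean(targets)  # leaf node: predict the mean target
    best = None
    for f in rng.sample(range(len(rows[0])), n_try):
        for t in {r[f] for r in rows}:  # candidate thresholds
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse([targets[i] for i in left]) + sse([targets[i] for i in right])
            if best is None or cost < best[0]:
                best = (cost, f, t, left, right)
    if best is None:
        return mean(targets)  # no valid split found
    _, f, t, left, right = best

    def grow(idx):
        return build_tree([rows[i] for i in idx], [targets[i] for i in idx],
                          n_try, rng, depth + 1, max_depth, min_leaf)

    return (f, t, grow(left), grow(right))

def predict_tree(node, row):
    # Internal nodes are tuples; leaves are plain numbers.
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, targets, n_trees=20, n_try=2, seed=0):
    """Fit each tree on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = rng.choices(range(len(rows)), k=len(rows))
        forest.append(build_tree([rows[i] for i in idx],
                                 [targets[i] for i in idx], n_try, rng))
    return forest

def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return mean([predict_tree(tree, row) for tree in forest])

def r_squared(actual, predicted):
    m = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - m) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Synthetic data: three hypothetical predictors, the third unrelated to the target.
rng = random.Random(42)
X = [[rng.uniform(0, 10) for _ in range(3)] for _ in range(120)]
y = [3 * r[0] + r[1] + rng.gauss(0, 0.5) for r in X]

# Shuffle, then hold out 20% of rows that the model never sees during training.
order = list(range(len(X)))
rng.shuffle(order)
cut = len(X) * 4 // 5
X_train, y_train = [X[i] for i in order[:cut]], [y[i] for i in order[:cut]]
X_test, y_test = [X[i] for i in order[cut:]], [y[i] for i in order[cut:]]

forest = fit_forest(X_train, y_train)
r2 = r_squared(y_test, [predict_forest(forest, r) for r in X_test])
print(f"Hold-out R^2: {r2:.2f}")
```

Scoring only on the held-out rows is what guards against an over-optimistic accuracy estimate, because a forest can fit its own training data very closely.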
The LGAs with the highest projected increase in need for health and social service workers are: Melton, Bayside, Hume, Melbourne, Maribyrnong and Greater Dandenong.
The LGAs with the highest number of predicted children with early development vulnerabilities are: Wyndham, Casey, Melton, Whittlesea, Hume and Melbourne.
Using R’s Leaflet package, we have presented our predictions as heat maps, which clearly show the areas likely to need a larger support workforce in the future.
This work could be enhanced and expanded in the future by adding more and newer variables, and by exploring the decision trees that make up the random forest to understand which underlying variables are most predictive, either singly or in combination, to inform more granular service planning.
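One possible starting point for that exploration is permutation importance, a different technique from inspecting individual trees: shuffle one predictor at a time and measure how much the model's accuracy drops. The sketch below is in plain Python and is an assumption-laden illustration, not part of our analysis: `toy_model` is a hypothetical stand-in for a fitted model, the data is synthetic, and the function names are our own.

```python
import random

def r_squared(actual, predicted):
    m = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - m) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

def permutation_importance(predict, rows, targets, n_repeats=10, seed=0):
    """Mean drop in R^2 when each column is shuffled; a larger drop means a more important variable."""
    rng = random.Random(seed)
    baseline = r_squared(targets, [predict(r) for r in rows])
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            # Rebuild each row with only this column's values permuted.
            permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
            score = r_squared(targets, [predict(r) for r in permuted])
            drops.append(baseline - score)
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical stand-in for a fitted model: relies heavily on column 0,
# a little on column 1, and ignores column 2.
def toy_model(row):
    return 3 * row[0] + row[1]

data_rng = random.Random(1)
rows = [[data_rng.uniform(0, 10) for _ in range(3)] for _ in range(100)]
targets = [3 * r[0] + r[1] for r in rows]

scores = permutation_importance(toy_model, rows, targets)
for col, score in enumerate(scores):
    print(f"column {col}: importance {score:.3f}")
```

Because the function only needs a `predict` callable, the same approach would apply to any fitted model, and ideally it would be run on held-out data rather than the training rows.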