Bounty: Is seeing truly believing?
How can we tell a story with visualisations that gives the truest representation of our data?
Go to Challenge | 28 teams have entered this challenge.
Altis Canberra
We trained random forest and neural network classification models to predict cases of insolvency non-compliance. From these models we were also able to extract useful indicators of potential non-compliance. We then combined the AFSA insolvency dataset with 7 other datasets to explore a range of potential additional correlations.
Beginning this project, we found that it was a relatively easy task to simply mash together a few datasets and generate visualisations that suggest one thing or another. However, we realised there wasn’t necessarily an immediate basis for these claims, so we took a step back and decided to take a more scientific approach.
Using random forests and a deep learning model based on a three-layer neural network, we trained our system to recognise the key correlating factors that contribute to predicting non-compliance, along with potential causal factors for personal insolvency.
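A minimal sketch of how indicators can be read out of a random forest, assuming scikit-learn and NumPy are available. The data is synthetic and the feature names (`unsecured_debt`, `income`, `age`) are hypothetical placeholders, not fields from the AFSA dataset; this illustrates the general technique rather than the pipeline described here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical feature names; the real dataset fields are not listed in this write-up.
FEATURES = ["unsecured_debt", "income", "age"]


def make_synthetic_cases(n=600, seed=0):
    """Generate synthetic cases where the label depends mostly on the first feature."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, len(FEATURES)))
    # Synthetic rule (an assumption for the demo): "non-compliance" is driven
    # by the first feature plus a little noise.
    y = (X[:, 0] + 0.2 * rng.normal(size=n) > 0).astype(int)
    return X, y


def rank_indicators(X, y, seed=0):
    """Fit a random forest and return held-out accuracy plus ranked importances."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    accuracy = model.score(X_test, y_test)
    # feature_importances_ holds the impurity-based importance of each column.
    ranked = sorted(
        zip(FEATURES, model.feature_importances_),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return accuracy, ranked


X, y = make_synthetic_cases()
accuracy, ranked = rank_indicators(X, y)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

A high importance only says that the model found the feature useful for prediction. It is a pointer to where a relationship might exist, not evidence of causation.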
Only once we had found these correlated contributing factors did we choose linking fields associated with them to find relationships. We are not claiming to have found definitive causal factors for non-compliance or insolvency, but we have built a robust system for identifying where potential causality could lie, to help identify avenues for further research.
Description of Use: Used to supplement the limited granularity of the insolvency dataset
Description of Use: Used to generate crime levels for Victorian SA3 areas to correlate with insolvency statistics
Description of Use: Used to generate crime statistics by SA3 by mapping from suburb to postcode and then to SA3
Description of Use: Used to correlate with insolvency statistics
Description of Use: Used to correlate population and population density with insolvency statistics
Description of Use: Used to map between postcode-based data and SA3-based data
Description of Use: Used to compute average housing prices per postcode, to compare more and less expensive suburbs against insolvency statistics
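The postcode-to-SA3 linking and correlation steps listed above can be sketched roughly as follows. All figures and the SA3 labels (`A`, `B`, `C`) are made up for illustration, and the sketch assumes each postcode falls in exactly one SA3, which is a simplification since real postcodes can straddle several SA3 areas.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical lookup and counts, for illustration only.
POSTCODE_TO_SA3 = {"3000": "A", "3001": "A", "3002": "B", "3003": "C"}
CRIME_BY_POSTCODE = {"3000": 10, "3001": 20, "3002": 5, "3003": 40}
INSOLVENCIES_BY_SA3 = {"A": 12, "B": 3, "C": 15}


def aggregate_to_sa3(values_by_postcode, postcode_to_sa3):
    """Sum postcode-level values into SA3-level totals."""
    totals = defaultdict(int)
    for postcode, value in values_by_postcode.items():
        sa3 = postcode_to_sa3.get(postcode)
        if sa3 is not None:  # skip postcodes with no known SA3
            totals[sa3] += value
    return dict(totals)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


crime_by_sa3 = aggregate_to_sa3(CRIME_BY_POSTCODE, POSTCODE_TO_SA3)
# Align both series on the SA3 areas present in both datasets.
shared = sorted(set(crime_by_sa3) & set(INSOLVENCIES_BY_SA3))
r = pearson(
    [crime_by_sa3[s] for s in shared],
    [INSOLVENCIES_BY_SA3[s] for s in shared],
)
print(f"Pearson r across {len(shared)} SA3 areas: {r:.3f}")
```

With only three areas the coefficient here means nothing statistically; the point is the shape of the join, aggregate and correlate steps.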