NSW Machine Learning Building Extraction

An algorithm to extract building footprints utilising machine learning with aerial imagery and LiDAR.

The Challenge

NSW Spatial Services is the owner and custodian of the ten foundation spatial data themes of the NSW Foundation Spatial Data Framework (FSDF). The FSDF is the State’s authoritative geographic information portfolio: it underpins other information and supports evidence-based decisions across government, industry and the community. This project helped deliver improvements to the second data theme, ‘land parcel and property’. The aim was to create a prototype machine learning algorithm that extracts building footprints from aerial imagery and LiDAR, and to identify and address the environmental differences between landscape archetypes.


The project partners were the NSW Department of Finance, Services and Innovation, Spatial Services Division (DFSI) and RMIT. FrontierSI was supported by Player Piano Data Analytics.

The Solution 

This project investigated automatic building footprint extraction for NSW using imagery and machine learning algorithms. In the first step, human analysts generated machine learning training sets by manually delineating individual building features from the imagery provided. The project collected a large number of building footprints from a variety of archetype regions across NSW (rural, suburban, metropolitan, industrial): in total, 45,775 building polygons located in 30 regions across Sydney, Wollongong and Wollondilly.

The objectives of the next step of the project were as follows:

  1. Select three state-of-the-art machine learning algorithms for testing:
    1. UNet, PSPNet and Mask R-CNN.
  2. Implement and train the three algorithms to generate building footprint predictions for NSW using:
    1. Aerial imagery.
    2. Geoscape buildings dataset (for low-level features e.g., general homogeneous surfaces).
    3. Human analyst annotated building footprints (for high-level features e.g., building shapes and roof surfaces).
  3. Apply post-processing to regularise the predicted building footprints and attach roof material types and building height attributes from LiDAR.
  4. Compare the prediction results from objective 2 to recommend the prototype algorithm to implement for NSW.
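
As an illustration of the height attribute mentioned in objective 3, the sketch below attaches a single height value to a footprint by taking the median of the LiDAR-derived heights that fall inside it. This is an assumption made for illustration only: the source does not state how heights were derived, and the function name, the raster layout and the use of above-ground heights with None for missing returns are all hypothetical.

```python
from statistics import median
from typing import List, Optional

# Hypothetical layout: both grids cover the same tile, cell for cell.
HeightRaster = List[List[Optional[float]]]  # above-ground height in metres, None = no LiDAR return
Mask = List[List[int]]                      # 1 = inside the predicted footprint, 0 = outside


def footprint_height(heights: HeightRaster, footprint: Mask) -> Optional[float]:
    """Return the median height of valid cells inside the footprint, or None if there are none."""
    samples = [
        h
        for h_row, m_row in zip(heights, footprint)
        for h, m in zip(h_row, m_row)
        if m and h is not None
    ]
    return median(samples) if samples else None


# Toy 4x4 tile with made-up values (not project data).
heights = [
    [0.2, 0.1, 0.3, 0.0],
    [0.1, 5.8, 6.1, 0.2],
    [0.0, 6.0, None, 0.1],
    [0.3, 0.2, 0.1, 0.0],
]
footprint = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
empty = [[0] * 4 for _ in range(4)]

building_height = footprint_height(heights, footprint)
print(building_height)
```

The median is used here rather than the maximum because it is less sensitive to isolated high returns such as antennas or overhanging trees; a production workflow might reasonably choose a different statistic.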

The project validated that building outlines and features can be extracted automatically using a machine learning approach, and that the outputs are comparable to or better than existing approaches and data products. The trained prototype model was deployed as a command line tool that NSW can run across the state to generate building footprints with an accuracy close to that of human-digitised footprints.
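
One common way to quantify how close predicted footprints are to human-digitised ones is intersection over union (IoU) on rasterised masks. The source does not say which metric the project used, so the sketch below is an assumption: the function names, the generic model labels and the toy masks are hypothetical and are not project results.

```python
from typing import Dict, List, Tuple

Mask = List[List[int]]  # 1 = building pixel, 0 = background


def intersection_over_union(predicted: Mask, reference: Mask) -> float:
    """Return the IoU between two binary masks of equal shape."""
    if len(predicted) != len(reference) or any(
        len(p_row) != len(r_row) for p_row, r_row in zip(predicted, reference)
    ):
        raise ValueError("masks must have the same shape")
    intersection = 0
    union = 0
    for p_row, r_row in zip(predicted, reference):
        for p, r in zip(p_row, r_row):
            intersection += 1 if (p and r) else 0
            union += 1 if (p or r) else 0
    # Two empty masks agree perfectly, so treat that case as IoU = 1.0.
    return intersection / union if union else 1.0


def rank_models(predictions: Dict[str, Mask], reference: Mask) -> List[Tuple[str, float]]:
    """Sort model names by IoU against the reference mask, best first."""
    scores = {
        name: intersection_over_union(mask, reference)
        for name, mask in predictions.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy 4x4 tile standing in for a human-digitised footprint (made-up data).
reference = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
predictions = {
    "model_a": [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]],
    "model_b": [[0, 0, 0, 0], [0, 1, 1, 1], [0, 1, 1, 1], [0, 0, 0, 0]],
    "model_c": [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
}

ranking = rank_models(predictions, reference)
for name, score in ranking:
    print(f"{name}: IoU = {score:.2f}")
```

In this toy case the over-predicting model (model_b) and the under-predicting model (model_c) both score below the exact match, which is the behaviour one would want from a metric used to recommend a single prototype algorithm.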


A machine learning approach to building footprint extraction is efficient and repeatable, providing more timely, consistent and accurate building footprint datasets at a fraction of the cost of human-digitised footprints. Using aerial imagery that is orthorectified to NSW’s reference framework improves the positional accuracy of vector-based products, moving them away from the original digitised map base and towards absolute positioning. Machine learning generated building footprints will also support customers’ new and improved data analytics needs by supplying time series spatial data, and will support the development of smart cities and digital twins for NSW.

The current project outcomes were not intended to be a production-ready tool, so further work is recommended to optimise the approach and code. The model was trained and tested using imagery of a particular resolution and format, so the prototype is limited to that format and optimised for that resolution. It is expected that, with relatively limited human intervention, the predicted building footprints can be used for most applications. Potential future work may include product improvements, pushing the prototype solution into production, increasing the number of input data formats, and training with additional resolutions of imagery.


To learn more, contact FrontierSI at contact@frontiersi.com.au or Project Manager, Jessica Keysers, at jkeysers@frontiersi.com.au.