Sydney's public institutions are sitting on vast stores of duplicated digital imagery — redundant photographs, scanned documents and archived visuals clogging servers from the City of Sydney's Town Hall records unit to the NSW Health network centred at Westmead Hospital — and the people responsible for fixing the problem are now speaking with unusual frankness about how badly it has been mismanaged.
The timing is not accidental. The NSW Government's Digital Information Policy, which applies to all state agencies, set mid-2026 as a soft benchmark for agencies to demonstrate measurable progress on digital asset rationalisation. That benchmark date has now passed, and several agencies are still working through backlogs that, according to industry observers familiar with public sector procurement, run into the tens of millions of files.
Why the Problem Has Compounded
Archivists and data managers across the sector point to the same root cause: agencies acquired imaging systems independently, without shared standards, and never built in deduplication from the start. The State Archives and Records Authority of NSW, based on Macquarie Street in the CBD, has been advocating for unified metadata protocols since at least 2022. The authority's published guidance urges agencies to adopt consistent file-naming and hash-verification practices before migrating legacy collections — advice that, practitioners say, was too often treated as optional.
Western Sydney has become a particular flashpoint. Parramatta City Council, managing one of the fastest-growing administrative precincts in the country, completed an internal audit of its digital asset holdings earlier this year. That audit identified thousands of image files stored in multiple locations across the council's content management systems, according to council documents obtained under the Government Information (Public Access) Act. The council has since engaged a specialist vendor to run automated replacement workflows, though the process is ongoing.
Western Sydney University's Information Systems faculty, based at the Penrith and Parramatta campuses, has been studying duplicate-image propagation in mid-sized government datasets since 2024. Researchers there say that, even on conservative estimates, duplicated assets can consume between 15 and 30 per cent of an agency's total allocated storage, a cost that translates directly into cloud licensing fees that have risen sharply since 2023.
What Replacement Actually Looks Like in Practice
The technical process — scanning a corpus for perceptually identical or near-identical images, flagging the redundant copies and replacing links or references throughout downstream systems — is well understood. The harder problem is governance: deciding which version of a duplicated image is canonical, particularly when different teams have independently edited the same source file.
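For readers who want to see the mechanics, the scan-and-flag step can be sketched in a few lines of Python. This is a minimal illustration, not any agency's actual tooling: it catches only byte-for-byte identical files using SHA-256, whereas near-identical images would need a perceptual hashing technique that is not shown here. The function names and the "shortest path wins" rule are assumptions made for the example; a real canonical-copy policy is a governance decision.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_groups(root: Path) -> dict[str, list[Path]]:
    """Group files under root by content hash; keep only groups of 2+ files."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[sha256_of(path)].append(path)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}


def choose_canonical(paths: list[Path]) -> Path:
    """Hypothetical policy: the shortest path wins, ties broken alphabetically."""
    return min(paths, key=lambda p: (len(p.parts), str(p)))
```

Nothing in this sketch deletes or moves a file. It only reports which copies share content and which one an illustrative rule would keep, which is the information a governance decision needs before any replacement runs.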
Transport for NSW, which manages enormous photographic archives tied to the Metro West construction project running from Sydney CBD to Westmead, has adopted a policy of nominating a single source-of-truth repository for all project imagery, with automated checks at ingest. Infrastructure NSW published a contractor brief in March 2026 requiring all documentation suppliers to submit images with embedded provenance metadata. That requirement now applies to every major capital works contract in the state.
The NSW Health records team at the Sydney Local Health District, covering Royal Prince Alfred Hospital in Camperdown and Canterbury Hospital in Campsie, told a sector forum in May that transitioning to a centralised digital asset management platform had already reduced its image-related storage costs, though the district has not yet released specific figures publicly.
For organisations still in the thick of the problem, practitioners recommend three immediate steps: run a full hash-based audit before touching any files; establish a written policy nominating which copy survives in any conflict; and communicate with downstream users — staff, contractors, web teams — before broken links start appearing. The last point matters more than most IT managers anticipate. At least two Sydney councils discovered that automated replacement scripts, run without adequate change management, broke hundreds of public-facing web pages overnight.
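The broken-page risk described above is the kind of thing a dry run can surface in advance. The sketch below, again in Python and again purely illustrative, assumes pages are available as plain strings and that a mapping from each duplicate reference to its surviving copy has already been agreed; `plan_replacements` and `apply_replacements` are hypothetical names, and the simple substring matching would need tightening for real content management systems.

```python
def plan_replacements(
    pages: dict[str, str], redirect: dict[str, str]
) -> dict[str, list[str]]:
    """Dry run: list, per page, the duplicate references that would be rewritten."""
    plan: dict[str, list[str]] = {}
    for name, body in pages.items():
        hits = sorted(old for old in redirect if old in body)
        if hits:
            plan[name] = hits
    return plan


def apply_replacements(body: str, redirect: dict[str, str]) -> str:
    """Rewrite each duplicate reference to its canonical copy.

    Longer references are replaced first so that one reference which is a
    prefix of another does not clobber it. Matching is by plain substring,
    which is an assumption of this sketch.
    """
    for old in sorted(redirect, key=len, reverse=True):
        body = body.replace(old, redirect[old])
    return body
```

Running `plan_replacements` first produces a list of affected pages that can be sent to web teams and contractors before `apply_replacements` changes anything.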
The State Archives authority is expected to release updated deduplication guidance before the end of July 2026. Agencies that want to avoid a repeat audit cycle would do well to read it before their next budget submission lands in September.