NSW government agencies collectively manage more than 40 petabytes of unstructured digital data, and a growing share of that storage is consumed by duplicate images — the same photograph, scan or document saved dozens of times across siloed systems. That figure, drawn from state government digital strategy assessments conducted in recent years, is forcing a reckoning inside agencies from Parramatta's Department of Planning offices to the Sydney Local Health District.
The pressure is immediate. With the NSW Labor government under financial strain heading into a budget cycle already dominated by housing and infrastructure spending, the cost of redundant data storage is drawing fresh scrutiny from the NSW Treasury and the Department of Customer Service. Cloud storage contracts — many negotiated between 2019 and 2022 — are due for renewal, and agencies are being asked to justify what they hold before committing to expanded capacity.
Where Sydney's Duplicate Data Piles Up
The problem is sharpest in high-volume processing environments. At Liverpool Hospital in Western Sydney, radiology departments generate thousands of diagnostic images weekly, and without robust deduplication protocols, patient scans can be stored across multiple platforms simultaneously — PACS servers, cloud backups and local drives all holding the same file. The Sydney Children's Hospitals Network, which operates across Westmead and Randwick, flagged digital asset management as a priority in its 2024–25 operational review, noting that imaging libraries were consuming storage at a rate outpacing projections made just three years earlier.
Planning and development records present a different version of the same headache. The City of Sydney Council processes thousands of development applications annually, each accompanied by architectural drawings, site photographs and heritage impact reports. When those files are uploaded to the NSW Planning Portal, retained on council servers and emailed between departments, each step creates another copy and duplication compounds quickly. Council technology staff, speaking in general terms at an open government data forum held in March 2026 at Sydney's International Convention Centre, described the challenge of standardising deletion policies across legacy systems that were never designed to communicate with each other.
The numbers tell the story bluntly. Industry research published by the storage analytics firm Aparavi in 2023 found that duplicate and redundant files account for, on average, 30 to 40 percent of enterprise storage consumption. Applied to the NSW government's known data footprint of at least 40 petabytes, that suggests somewhere between 12 and 16 petabytes of storage may be occupied by files that are exact or near-exact copies of something already held elsewhere. At current AWS and Azure pricing for enterprise-grade cloud storage (roughly $25 to $30 per terabyte per month for hot-tier access), that works out to roughly $3.6 million to $5.8 million a year in storage charges alone, before egress fees are counted.
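The arithmetic behind those estimates can be checked with a short sketch. It uses only the figures quoted above and makes two assumptions that the source does not state: the footprint is taken as exactly 40 petabytes (the reported figure is a floor), and one petabyte is counted as 1,000 terabytes. The function and variable names are illustrative.

```python
# Back-of-envelope check of the duplicate-storage estimate.
# Assumptions (not from the source): footprint fixed at 40 PB,
# decimal units (1 PB = 1,000 TB), prices flat across the year.
FOOTPRINT_PB = 40
TB_PER_PB = 1_000


def duplicate_terabytes(footprint_pb, duplicate_percent):
    """Terabytes occupied by duplicates at a given percentage share."""
    return footprint_pb * TB_PER_PB * duplicate_percent / 100


def annual_cost(footprint_pb, duplicate_percent, price_per_tb_month):
    """Yearly storage charge for the duplicate share, excluding egress."""
    monthly = duplicate_terabytes(footprint_pb, duplicate_percent) * price_per_tb_month
    return monthly * 12


low = annual_cost(FOOTPRINT_PB, 30, 25)   # 30% duplicates at $25/TB/month
high = annual_cost(FOOTPRINT_PB, 40, 30)  # 40% duplicates at $30/TB/month
print(f"${low:,.0f} to ${high:,.0f} per year")
```

On these inputs, storage charges for the duplicate share alone come to single-digit millions of dollars a year. A larger footprint, replicated backups or egress fees would push the total higher.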
The Push to Clean House
The NSW Government's Data and Digital Strategy, updated in late 2025, nominates data quality and lifecycle management as tier-one priorities for the 2026–2028 period. The strategy tasks the Department of Customer Service's Digital.NSW division with publishing updated guidance on deduplication standards by the end of this calendar year. Several councils in Sydney's growth corridor — including Blacktown City Council and Cumberland Council, both managing rapid population increases driven by Western Sydney development — have begun piloting automated image-deduplication tools across their document management systems.
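The tools being piloted are not described in detail, but the simplest form of automated deduplication is worth illustrating. The sketch below is an assumption about how such a check might work, not a description of any council's system: it groups files by size, then compares SHA-256 digests to find byte-identical copies. It will not catch near-duplicates such as a resized or re-compressed image, which need perceptual hashing or similar techniques. The function names are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root):
    """Return groups of paths under root whose contents are byte-identical."""
    # Cheap first pass: files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    # Second pass: hash only the files that share a size with another file.
    by_digest = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_digest[file_digest(path)].append(path)

    return [sorted(group) for group in by_digest.values() if len(group) > 1]
```

Calling `find_exact_duplicates("/path/to/archive")` returns a list of groups, each holding the paths of files with identical contents. Deciding which copy to keep is a records-management question that the code does not answer.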
What happens next depends largely on whether agencies treat deduplication as a one-time clean-up or embed it as an ongoing governance obligation. Technology procurement teams renewing contracts with vendors like Microsoft, ServiceNow and OpenText are being advised by the NSW Procurement Board to include deduplication benchmarks as a condition of tender evaluation — a change from previous rounds where raw storage capacity was the headline metric.
For Sydney residents, the stakes are practical. Faster retrieval of planning documents, quicker access to medical imaging records, and lower long-term costs for government IT infrastructure all flow from agencies getting this right. The data already exists to show the scale of the problem. The harder task is doing something about it before the next storage invoice lands.