The Daily Sydney

Sydney's Duplicate Image Problem: How the City Stacks Up Against London, Singapore and New York

From Parramatta council chambers to the State Library on Macquarie Street, Sydney's institutions are wrestling with a surge in duplicated digital imagery, and London, Singapore and New York are already ahead.

By Sydney News Desk · Published 5 July 2026, 5:27 am

Photo: Robbie Veenstra on Pexels

Sydney's public sector is sitting on tens of millions of digital image files it cannot reliably deduplicate, and the gap between this city and comparable global centres is widening fast. According to technology procurement records reviewed by The Daily Sydney, at least a dozen NSW government agencies have flagged duplicate image management as a priority under the state's Digital Information Policy framework, which was last substantively updated in March 2024.

The issue matters now because the volume of digital assets held by councils, cultural institutions and state agencies has roughly tripled since 2019, driven by digitisation projects, body-worn camera rollouts and the mass ingestion of historical collections. Storage costs are not trivial. Commercial cloud storage in Australia typically runs between $0.023 and $0.025 per gigabyte per month for government bulk contracts, and duplicate files — by some industry estimates, up to 30 per cent of any large unmanaged archive — represent direct, avoidable expenditure.
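To make the arithmetic concrete, here is a rough sketch that applies the quoted price range and the 30 per cent upper-bound estimate to a hypothetical 100-terabyte archive. The archive size and the function name are assumptions for illustration only; they do not describe any agency named in this story.

```python
def avoidable_storage_cost(archive_gb, duplicate_share, price_per_gb_month):
    """Return (monthly, annual) cost of storing the duplicated portion."""
    duplicate_gb = archive_gb * duplicate_share
    monthly = duplicate_gb * price_per_gb_month
    return monthly, monthly * 12


# Hypothetical archive size, in decimal units (1 TB = 1,000 GB).
ARCHIVE_GB = 100 * 1_000
# Upper-bound industry estimate quoted in the article.
DUPLICATE_SHARE = 0.30

# The two ends of the per-gigabyte monthly price range quoted in the article.
for price in (0.023, 0.025):
    monthly, annual = avoidable_storage_cost(ARCHIVE_GB, DUPLICATE_SHARE, price)
    print(f"${price}/GB: ${monthly:,.0f} per month, ${annual:,.0f} per year")
```

Under those assumptions the duplicated 30 terabytes would cost roughly $690 to $750 a month, or $8,280 to $9,000 a year. The figure scales linearly with archive size, so a petabyte-scale collection would multiply it accordingly.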

What Sydney Is Actually Doing

The City of Parramatta Council began a staged deduplication audit of its digital asset management system in February 2025, targeting its planning and development image library, which holds records going back to 2003. The project, handled through its internal ICT directorate, is understood to be using hash-based fingerprinting software to flag identical and near-identical files before human review. Separately, the State Library of New South Wales on Macquarie Street, whose digitised collections now exceed 4.5 petabytes, has been running a parallel deduplication pass across its Flickr Commons and internal digital asset management system (DAMS) repositories since late 2024.
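For readers unfamiliar with hash-based fingerprinting, the simplest form can be sketched as follows. This is an illustration only, not the software either institution is using: it groups files by their SHA-256 checksum, which flags byte-for-byte identical copies but will not catch near-identical files such as a recompressed or resized version. The function names and the folder passed in are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large images are never loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Return lists of paths under `root` whose contents are byte-identical."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[sha256_of_file(path)].append(path)
    # Only checksums shared by two or more files indicate duplicates.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_exact_duplicates("some/image/folder")` returns one list per set of identical files, which could then be handed to a human reviewer rather than deleted automatically.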

NSW's broader approach is fragmented. Agencies procure deduplication tools independently, and there is no single whole-of-government standard for what counts as a duplicate: an identical pixel-for-pixel copy, a near-duplicate with minor compression differences, or a semantically similar image taken seconds apart. That inconsistency creates gaps. The Western Sydney University library system, which serves eight campuses stretching from Penrith to Campbelltown, flagged the problem in a 2025 internal review obtained under a Government Information (Public Access) Act request: the university estimated it held more than 180,000 image files with at least one duplicate, but lacked a unified policy for resolving conflicts between metadata-rich and metadata-poor versions of the same asset.
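One way such a conflict policy could be written down is sketched below. It is a hypothetical rule, not the university's: among copies already identified as duplicates, keep the one with the most populated metadata fields, and break ties by file size. The record layout, field names and file paths are invented for illustration.

```python
def metadata_score(record):
    """Count metadata fields that hold a non-empty value."""
    return sum(
        1 for value in record["metadata"].values() if value not in (None, "", [])
    )


def choose_canonical(duplicates):
    """Pick the copy to keep: richest metadata first, then the largest file."""
    return max(duplicates, key=lambda r: (metadata_score(r), r["size_bytes"]))


# Two hypothetical copies of the same image with different metadata quality.
copies = [
    {
        "path": "scans/da_0412.tif",
        "size_bytes": 48_200_000,
        "metadata": {"title": "", "date": None, "creator": ""},
    },
    {
        "path": "library/da_0412_final.tif",
        "size_bytes": 48_200_000,
        "metadata": {"title": "Site plan", "date": "2011-03-02", "creator": ""},
    },
]

keeper = choose_canonical(copies)
print(keeper["path"])
```

In this example the second copy is kept because two of its three metadata fields are filled in. A real policy would probably also merge any fields that appear only on the discarded copy before removing it.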

London and Singapore Have Moved Further, Faster

Compare Sydney's patchwork effort to what London's Government Digital Service and Singapore's Government Technology Agency have implemented. The UK's GDS issued binding guidance on digital asset deduplication for all central government departments in January 2024, mandating that any image ingested into a shared repository must pass an automated duplicate check before storage allocation is confirmed. Singapore's GovTech went further, embedding perceptual hashing — a technique that catches near-identical images even when file names and metadata differ — into its Whole-of-Government data infrastructure in mid-2023.
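The idea behind perceptual hashing can be shown with one of its simplest variants, the average hash. The sketch below is an illustration under stated assumptions, not the algorithm GovTech or any other agency uses: it works on a plain grid of grayscale values rather than a real image file, and the eight-by-eight hash size and the distance threshold of 5 are arbitrary choices.

```python
def average_hash(pixels, hash_size=8):
    """Compute an average hash from a 2D list of grayscale values (0-255).

    The image is shrunk to hash_size x hash_size by block averaging, then each
    cell becomes 1 if it is brighter than the mean cell and 0 otherwise.
    Assumes the image is at least hash_size pixels wide and high.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for row in range(hash_size):
        y0, y1 = row * height // hash_size, (row + 1) * height // hash_size
        for col in range(hash_size):
            x0, x1 = col * width // hash_size, (col + 1) * width // hash_size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    return [1 if cell > mean else 0 for cell in cells]


def hamming_distance(hash_a, hash_b):
    """Count the bit positions where two hashes differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


def is_near_duplicate(pixels_a, pixels_b, max_distance=5):
    """Treat two images as near-duplicates if their hashes differ in few bits."""
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance


# A synthetic 16x16 gradient, a uniformly brightened copy, and an inverted copy.
original = [[(x + y) * 8 for x in range(16)] for y in range(16)]
brightened = [[min(255, value + 3) for value in row] for row in original]
inverted = [[255 - value for value in row] for row in original]

print(is_near_duplicate(original, brightened))
print(is_near_duplicate(original, inverted))
```

The brightened copy has different pixel values, so a conventional checksum would treat it as a new file, yet it produces the same average hash as the original and is flagged as a near-duplicate. The inverted image produces a very different hash and is not flagged. The hash depends only on pixel content, which is why file names and metadata do not affect the result.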

New York City's Department of Records and Information Services, which manages archives for the five boroughs, adopted a similar perceptual hashing standard in late 2023 as part of a broader cloud migration project. All three cities also publish annual digital asset reduction figures. Sydney has no reporting with a comparable level of public accountability.

The financial stakes are real. A 2024 report by the Australian National Audit Office, which covered federal rather than state agencies, found that poor digital asset hygiene, including unmanaged duplicates, contributed to storage overspend across multiple departments it reviewed. The ANAO did not name specific agencies in its public summary, but its recommendations are relevant to any government jurisdiction managing large image volumes.

For councils and agencies trying to close the gap now, technology specialists point to three practical steps: adopt a documented deduplication standard before the next major digitisation tender, require vendors to demonstrate perceptual hashing capability alongside conventional MD5 or SHA checksums, and publish annual asset reduction metrics. The State Records Act 1998 already obliges NSW agencies to manage records efficiently — deduplication is increasingly seen as part of that obligation, not an optional extra. Budget season starts in September. The agencies that can show measurable storage savings from deduplication work will have a stronger case when ICT capital requests hit Treasury.

About this article

Published by The Daily Sydney

This article was produced by The Daily Sydney editorial desk and covers news in Sydney. See our editorial standards for how we use AI.
