The Daily Sydney

Sydney's Duplicate Image Problem: How the City Stacks Up Against London, Amsterdam and Singapore

Councils and cultural institutions across Sydney are quietly wrestling with a surge of duplicate digital images clogging archives — and the city's piecemeal response trails purpose-built programs overseas.

By Sydney News Desk · Published 5 July 2026, 4:45 am

Sydney's public institutions are sitting on hundreds of thousands of duplicate digital images, including redundant scans, re-uploaded heritage photographs and repeated planning documents, that are draining storage budgets and slowing access to records that residents, researchers and developers actually need. The problem has become acute enough that at least two major Sydney organisations have begun dedicated deduplication programs in 2026, raising questions about whether the city is moving as fast as its peers in London, Amsterdam and Singapore.

The pressure point is timing. Sydney's Metropolitan Local Aboriginal Land Council, the City of Sydney Council's digital archive division and the State Library of NSW have all expanded their digitisation programs over the past three years, pulling in material from suburban libraries, demolished buildings and community groups. More digitisation means more duplication — and without automated detection tools embedded at the point of ingestion, the problem compounds quickly. The City of Sydney's archive, based at the Customs House precinct on Alfred Street in the CBD, reportedly holds millions of image files across multiple storage environments, though the council has not publicly released a breakdown of what proportion are duplicates.

What Sydney Is Doing — and What It Isn't

The State Library of NSW, on Macquarie Street, has been running a quiet internal audit of its digital collections since at least late 2025, according to publicly available tender documents on the NSW Government eTendering portal. A contract published in March 2026 sought software capable of perceptual hashing — a technique that identifies visually similar images even when file names or metadata differ. That is a meaningful step. But the contract covered only the Library's photographic holdings, leaving the broader question of how council and university repositories handle duplicates largely unanswered.
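For readers curious how perceptual hashing works, the sketch below shows one common variant, a difference hash, in plain Python. It is an illustration only, not the software sought in the Library's tender. The function names, the toy pixel grids and the `max_distance` threshold of 5 are all assumptions made for this example, and a real system would first shrink and greyscale each image with an imaging library.

```python
from typing import List

Grid = List[List[int]]  # rows of greyscale values (0-255), already downscaled


def dhash(grid: Grid) -> int:
    """Difference hash: one bit per horizontally adjacent pixel pair."""
    bits = 0
    for row in grid:
        for left, right in zip(row, row[1:]):
            # The bit records only whether brightness falls left to right.
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def looks_duplicate(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Treat two images as near-duplicates if their hashes are close.

    The threshold of 5 is an assumed value chosen for illustration.
    """
    return hamming(dhash(a), dhash(b)) <= max_distance


# Toy 8-row by 9-column "scan" (hypothetical data, values kept below 200).
original = [[(x * 37 + y * 11) % 200 for x in range(9)] for y in range(8)]
# The same scan re-saved slightly brighter: every pixel shifted up by 10.
brighter = [[p + 10 for p in row] for row in original]
# A visually different image: each row mirrored left to right.
mirrored = [row[::-1] for row in original]

print(looks_duplicate(original, brighter))  # True
print(looks_duplicate(original, mirrored))  # False
```

Because the hash records only whether brightness rises or falls between neighbouring pixels, a uniformly brightened copy produces the same bits as the original, which is why file names and metadata play no part in the match.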

Western Sydney University's Parramatta campus library, which holds significant community heritage collections from the Hawkesbury and Penrith regions, does not appear to have a publicly described deduplication program. The university has not responded to questions from The Daily Sydney.

Compare that with Amsterdam's Rijksmuseum, which completed a full deduplication sweep of its 700,000-image Rijksstudio collection in 2024, cutting redundant files by an estimated 18 percent. The British Library in London embedded automated duplicate detection into its digitisation pipeline in 2023 as part of a broader infrastructure overhaul. Singapore's National Library Board went further still, integrating AI-assisted deduplication across all thirteen of its digital repositories by mid-2025 — a project that officials there described publicly as a cost-avoidance measure worth millions of Singapore dollars annually.

The Cost of Doing Nothing

Cloud storage is not free. Amazon Web Services and Microsoft Azure, the two platforms most commonly used by NSW government agencies, charge in the range of $25 to $35 per terabyte per month for standard storage tiers, depending on configuration and data transfer volumes. For a mid-sized council archive holding, say, 50 terabytes of image data where 15 to 20 percent is duplicated, that is 7.5 to 10 terabytes of redundant files, or roughly $2,250 to $4,200 a year in wasted storage fees. Across a city with thirty-three local government areas, the aggregate figure becomes meaningful.
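The arithmetic behind that estimate can be sketched in a few lines of Python. The function name and its parameters are hypothetical, the currency is left unspecified as in the figures above, and the calculation assumes a flat per-terabyte price, ignoring data transfer and request fees.

```python
def annual_duplicate_cost(total_tb: float,
                          duplicate_percent: float,
                          price_per_tb_month: float) -> float:
    """Yearly storage spend attributable to duplicated data.

    Assumes a flat monthly price per terabyte; transfer and request
    fees are deliberately left out of this sketch.
    """
    duplicated_tb = total_tb * duplicate_percent / 100
    return duplicated_tb * price_per_tb_month * 12


# Bounds taken from the figures quoted in the article:
# 50 TB archive, 15 to 20 percent duplicated, $25 to $35 per TB per month.
low = annual_duplicate_cost(50, 15, 25)
high = annual_duplicate_cost(50, 20, 35)

print(low)   # 2250.0
print(high)  # 4200.0
```

Swapping in an institution's own archive size and contract pricing gives a quick first estimate of what a deduplication sweep might save.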

Beyond cost, there is a practical consequence for users. Researchers at institutions like the Powerhouse Museum in Ultimo — which is in the middle of its own controversial physical relocation to Parramatta — have noted informally that search results across digitised heritage collections are cluttered by near-identical images, making genuine research slower and less reliable. The museum's digital team did not provide a formal comment.

For Sydney to close the gap on Amsterdam, London and Singapore, the most obvious path is coordinated policy rather than institution-by-institution tinkering. State Records NSW sets record-keeping standards for public offices under the State Records Act 1998, and a revision of digital storage guidelines, last meaningfully updated before cloud storage became the norm, would give agencies a mandate to act together. Institutions waiting for that guidance should at least begin internal audits now. The cost of cataloguing what you have is almost always lower than the cost of storing what you don't need.

About this article

Published by The Daily Sydney

This article was produced by The Daily Sydney editorial desk and covers news in Sydney. See our editorial standards for how we use AI.
