A sample of 750,000 data entries from the three main indexes of the database was included in the seller's post. CNN verified the authenticity of more than two dozen entries from the sample provided by the seller, but was unable to access the original database.
The Shanghai government and police department did not respond to CNN's repeated written requests for comment.
The seller also claimed the unsecured database had been hosted by Alibaba Cloud, a subsidiary of Chinese e-commerce giant Alibaba. When reached by CNN for comment on Monday, Alibaba said "we are looking into this" and would communicate any updates. On Wednesday, Alibaba declined to comment.
But experts CNN spoke with said it was the owner of the data who was at fault, not the company hosting it.
"As it stands today, I believe this would be the largest leak of public information yet -- certainly in terms of the breadth of the impact in China, we're talking about most of the population here," said Troy Hunt, a Microsoft regional director based in Australia.
China is home to 1.4 billion people, which means the data breach could potentially affect more than 70% of the population.
How many people have downloaded the data
It is unclear how many people have accessed or downloaded the database during the 14 months or more it was left publicly available online. Two Western cybersecurity experts who spoke to CNN were both aware of the existence of the database before it was thrust into the public spotlight last week, suggesting it could be easily discovered by people who knew where to look.
Vinny Troia, a cybersecurity researcher and founder of dark web intelligence firm Shadowbyte, said he first discovered the database "around January" while searching for open databases online.
"The site that I found it on is public, anybody (could) access it, all you have to do is register for an account," Troia said. "Since it was opened in April 2021, any number of people could have downloaded the data," he added.
Troia said he downloaded one of the main indexes of the database, which appears to contain information on nearly 970 million Chinese citizens. But it was difficult to judge whether the open access was an oversight from the owners of the database, or if it was an intentional shortcut intended to be shared among a small number of people, he said.
"Either they forgot about it, or they intentionally left it open because it's easier for them to access," he said, referring to the authorities responsible for the database. "I don't know why they would. It sounds very careless."
Exposed data is a growing problem
Unsecured personal data -- exposed through leaks, breaches, or some form of incompetence -- is an increasingly common problem faced by companies and governments around the world, and cybersecurity experts say it is not unusual to find databases that are left open to public access.
In 2018, Troia discovered that a Florida-based marketing firm had exposed close to 2 TB of data that appeared to include personal information on hundreds of millions of American adults on a publicly accessible server, according to Wired.
In 2019, Victor Gevers, a Dutch cybersecurity researcher, found an online database containing names, national ID numbers, birth dates and location data of more than 2.5 million people in China's far-western region of Xinjiang, which was left unprotected for months by Chinese firm SenseNets Technology, according to Reuters.
But the latest data leak is particularly worrying, cybersecurity researchers say, not only because of its potentially unprecedented volume, but also because of the sensitive nature of the information contained.
What is inside the database
A CNN analysis of the database sample found police records of cases spanning nearly two decades from 2001 to 2019. While the majority of the entries are civil disputes, there are also records of criminal cases ranging from fraud to rape.
In one record, a mother called the police in 2010, accusing her father-in-law of raping her 3-year-old daughter. "There could be domestic violence, child abuse, all sorts of things in there, that to me is a lot more worrying," said Hunt, the Microsoft regional director.
"Might this lead to extortion? We often see extortion of individuals after data leaks, examples where hackers can even try to ransom individuals."
Bob Diachenko, a security researcher based in Ukraine, first came upon the database in April. In mid-June, his company detected that the database was attacked by an unknown malicious actor, who destroyed and copied the data and left a ransom note demanding 10 bitcoin for its recovery, Diachenko said.
It is not clear if this was the work of the same person who advertised the sale of the database information last week. By July 1, the ransom note had disappeared, according to Diachenko, but only 7 gigabytes (GB) of data was available -- instead of the 23 TB originally advertised.
Diachenko said this suggested the ransom had been resolved, but the database owners had continued to use the exposed database for storage until it was shut down over the weekend. "Maybe there was some junior developer who noticed it and tried to remove the notes before senior management noticed them," he said.
Shanghai police did not respond to CNN's request for comment on the ransom note.