An open database of 1.8 million ‘breed ready’ Chinese women. Another open database of 2.5 million Uighur Muslims in Xinjiang. Ethical hacker Victor Gevers, a researcher for the GDI Foundation, accidentally stumbled on a truly mass system of surveillance in China. And you won’t believe how Western tech companies and the Chinese police and state tie into it all.
On this episode of China Unscripted, Chris Chappell, Shelley Zhang, and Matt Gnaizda sit down with ethical hacker Victor Gevers, who discovered how the Chinese Communist Party is tracking millions of Chinese citizens, including their text messages and GPS locations, and then sending that data to local police stations.
Chris Chappell: Joining us now is Victor Gevers, a researcher for the GDI Foundation and I guess what you’d call an ethical hacker. He recently discovered not one, but two (!) massive open databases in China. One was called SenseNets, which has a disturbing amount of personal information about 2.5 million Uighur Muslims in Xinjiang, and the other was a database of 1.8 million women in China with a mysterious field called “breed ready”. Thanks for joining us today, Victor!
Victor Gevers: Thank you.
I’m an ethical hacker. I have been exploring the Internet and machines online for almost 20 years. The GDI Foundation is a nonprofit organization consisting of twenty-five volunteers at the moment, and we operate globally. What we do is actually look for trouble. The Internet is full of vulnerable devices and data leaks, and we have eyes on them. There are about 43.5 million issues like those on our map. And every time a new red dot appears, we’re going to take a look, we’re going to investigate what’s going on, and after we have investigated what it is, we’re going to try to find the owner. If we find the owner, then we’ll send them an email, like a responsible disclosure, explaining: this is your system, this is the issue with it, and this is how you have to fix it. And if you need more help from us, please reach out to us. And that’s actually what we’ve been doing for the last three years.
Chris Chappell: So, so an ethical hacker, you’re basically like Keanu Reeves is what you’re saying then?
Victor Gevers: No, no, no. I’m way not that cool. So no.
Chris Chappell: Well that is a high bar.
Victor Gevers: It is a very high bar. We say ‘ethical’ because if we say ‘hacker’ then people are going, “Oh my God, are you going to steal my credit card or are you going to send me a phishing email?” We have to put the word ‘ethical’ in front of it so people understand, “Oh, they are the nice guys.” But actually we can bring cookies.
Chris Chappell: Oh, cookies! Oh wait, are those the bad kind of cookies? The ones that can get into my information and stuff?
Victor Gevers: Exactly! Well, those are the things we’re trying to point out. So when we see a website that, for example, helps you find which political party you want to vote for, and we see that there are tracking cookies on that website, then we reach out to the organization and say, “Hey, maybe this is not a good idea.” One of those cookies gets transferred to Google. Google knows everything.
Chris Chappell: So, you find these red dots, these systems that may have some issues, and you reach out to them to offer help. But sometimes you notice that these red dots have other issues, that maybe the people behind them aren’t so benevolent. Right?
Victor Gevers: That is an issue that we found out. I think SenseNets was actually the first case for us where we said, hey, wait a minute! We went into the database, we saw some fields and, normally, when you go into a database that runs a system, there is always something like a user table. In the user table are the people who can log in and operate the device, a little like a website. Most of the time the first or the second user has the admin rights. Someone has to be on top of that to control the app. The problem with this database was that there was no operator table. So what we had to do was go into the next part of the database, which was showing things like checkpoints and GPS locations, and cameras, and all kinds of other stuff. Then I was like, okay, this is a very important system. Let’s see if we can find the owner. And in the second database field, there was the name: SenseNets. We had never heard of that, so we started googling it. Okay, this is a security company that builds facial recognition security systems in China, and they use state-of-the-art equipment. So I was like, okay, this is interesting, because if this is really a system for keeping things secure, why is it on the Internet and why is it so open, you know? Did these people not read the manual, or do they have no clue what they’re doing? The first week I was like, okay, now that we have reported it to the owners, we wait 12 hours and then we share it with the world: listen, you know, if you’re going to run a security system, maybe you don’t want to put it on the Internet like this.
Chris Chappell: When you say an open database, what does that mean?
Victor Gevers: An open database is a system that you can connect to from the Internet without any form of authentication. The database that was open is known as MongoDB, which is a very famous open source database system. And when you install it, when you’ve finished the installation, the operator screen says, “Please put a password on it and please do not put it on the Internet.” Well, both warnings were ignored in the case of SenseNets. And the most horrible part was that people did not only have read access to the database, they also had administrative access. That means that you can create, read, update and delete records, so you can remotely manipulate a security system.
Shelley Zhang: So anybody who could access that and knew what they were doing could do this.
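The two safeguards Victor describes, requiring a password and keeping the server off the public Internet, correspond to two settings in a MongoDB configuration file: `security.authorization` and `net.bindIp`. The following is only a minimal sketch of how such a check might look, assuming the configuration has already been parsed into a Python dictionary. The function name `audit_mongod_config` and the sample dictionaries are hypothetical and are not taken from the GDI Foundation’s tooling or from the SenseNets server.

```python
def audit_mongod_config(config):
    """Return a list of warnings for a parsed mongod configuration dict.

    Hypothetical helper for illustration; `config` mirrors the nesting of
    a mongod.conf file (e.g. {"net": {...}, "security": {...}}).
    """
    warnings = []
    security = config.get("security", {})
    net = config.get("net", {})

    # Access control is only enforced when authorization is "enabled".
    if security.get("authorization") != "enabled":
        warnings.append("authentication is not enforced")

    # Assumption: a missing bindIp is treated as localhost, which is the
    # default in recent MongoDB versions. bindIp may be a comma-separated list.
    bind_ip = str(net.get("bindIp", "localhost"))
    addresses = [address.strip() for address in bind_ip.split(",")]
    if net.get("bindIpAll") or any(a in ("0.0.0.0", "::") for a in addresses):
        warnings.append("server listens on all network interfaces")

    return warnings


# Made-up example configurations, not real data.
open_config = {"net": {"bindIp": "0.0.0.0", "port": 27017}}
locked_config = {
    "net": {"bindIp": "127.0.0.1", "port": 27017},
    "security": {"authorization": "enabled"},
}

print(audit_mongod_config(open_config))
print(audit_mongod_config(locked_config))
```

A server that triggers both warnings is in the state described above: anyone who can reach it can read and change its records without logging in.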
Victor Gevers: Yes, and what we always do when we show journalists a data leak, because they approach us and ask, hey, you found a data leak, please explain it to us. We say no, let’s install the software that you need, which is also open source, and we guide you through it. And if we can teach a journalist in five minutes how to work with a resource like that, that makes it clear how easy it is. So that’s why it is not even called hacking, it’s just entering a system that is open. And that’s the main point. I think this is the most remarkable thing about this data leak. So we thought, okay, this must be an error. You know, this is something very rare, this normally never happens. Then we started noticing that more databases in China suddenly appeared online. So we started exploring more and more of these databases.
Chris Chappell: Hold on! Before we talk about the other databases. In your story, you said you contacted SenseNets about like, hey, this database is open, this probably shouldn’t be. You waited 12 hours. So what was in this database that was so, you know, you said it did raise concerns?
Victor Gevers: Yes! It raised concerns. What we found in the database were a few fields that triggered us. There were things like face search alarm, which was a function that you could assign a face to. So when someone passes a camera or a mobile checkpoint, the camera would register the person and it would set off an alarm. The other problem was that the persons who were passing through those checkpoints were allowed to go into a certain area, and it would actually follow everyone. So everyone in Xinjiang province was being tracked, and I mean everyone that went into that province! There was no exception. Even the security staff, who were labeled “security”, were registered. So this means that this system is being used for keeping a specific area very secure and very monitored.
Chris Chappell: It’s a mass, mass surveillance system.
Victor Gevers: Yes. That’s actually what it is. Yes.
Chris Chappell: What is especially worrying is that in the Xinjiang region they have concentration camps with at least a million Uighur Muslims… Is there anything else we should know about what you discovered with SenseNets?
Victor Gevers: What happened is that after this thing hit the news, we got a lot of questions like, “Can you please tell us if our family members are still in there?” And the most depressing part is that we report an issue when something is broken and then later we find out, oh, this was, you know, more sinister than we thought. Next time should we just, you know, delete the entire database? Because we did get remarks: why did you not destroy it if you had full open access? Well, that will not help those people.
Chris Chappell: It’s not a long term solution to the problem.
Victor Gevers: Exactly. We have a mission statement that says that we operate neutrally. So I think the SenseNets case was the first time for us that we went from, okay, we’re neutral, to, we’re not going to let this go by quietly. We have to share this, also for us, for learning. What do we do if we find databases like this? Because, to give a little bit of an idea, in 2018 we reported over 600,000 issues, and more than 500,000 were solved. If you operate at that speed of fixing things on the Internet, maybe you’re going to miss things, so this was actually an eye opener for us. Based on the responses that we get from people who are directly or indirectly affected by what SenseNets does, we need to think about how we’re going to handle this in the future. So that also slowed us down a lot, because now we have to take extra care when we report things that we are doing the right thing.
Chris Chappell: That makes sense because, you know, we’re going to enter a period where we’re going to have 5G. There’s going to be incredible amounts of data about everyone out there.
Shelley Zhang: So did you know what was happening in Xinjiang before you guys stumbled upon the SenseNets database?
Victor Gevers: No. When I posted this tweet, we have technical journalists who know something or write about our new discoveries. And one of the technical journalists, who wrote many articles about this, said, “Hey, that is interesting. That’s SenseNets. I think they build systems for that.” And he asked me, “Can you pass me some GPS locations?” So I said, okay. It’s not closed yet, so yeah, I can feed you some 10,000 GPS locations. And within five minutes he came back to me and said, “Well, Victor, I think you need to follow up on this because this is no good. You did a good thing in reporting it, but you did a bad thing by reporting it.” So that sounds a bit contradictory. Well, we reported it to the company anyway, so they’re going to fix it, but in the meantime, of course, you know, we can let the world know about it. So I let him write an article about it, and that triggered the rest of the mainstream news. It went very quickly after that.
Shelley Zhang: So, when you say in 2018, you guys looked at 600,000 databases, where were they mostly? Where do you usually find these things?
Victor Gevers: Mostly, I think, in the United States. Asia is in about second place in the world, and then comes Europe.
Chris Chappell: So did SenseNets send you a letter saying thank you for letting us know?
Victor Gevers: No, no.
Chris Chappell: They weren’t grateful.
Victor Gevers: No. And that’s okay. You know, we report these things hoping that we get an answer. Our biggest concern is that issues are being fixed. We don’t need to receive gratitude or recognition for that. Our biggest concern is that there’s a database, and many people in there are at risk. So when a mass surveillance system is open and anyone could edit it, then our biggest concern is for the people that are in there, because if people can mess with these systems, that can go very wrong.
Chris Chappell: Well, in the case of the Uighurs in Xinjiang, I think they’re still very much at risk. Maybe just from different things. Did this prompt you to take a closer look at China and some of the databases there?
Victor Gevers: Yeah, that’s a good question. For some reason we were like, okay, this was a one-time incident, let’s not focus on China anymore. And then like 48 hours later, the amount of red dots that we normally have per country… in China, it skyrocketed. We had like 1,500 or 1,800 new incidents in 24 hours.
Shelley Zhang: These are databases that are open suddenly?
Victor Gevers: Yes. There were many more dots, so we had two theories. One, there was just a technical failure in the Great Firewall, which is possible because humans make mistakes. Or, people know that we are looking for these open databases, just because we’ve been doing this for years already, and they want to be found.
So, yes, that is how it started for us. And these databases were not small in size, and they didn’t have names like a supermarket or a pizza delivery service; these all had names and database names that would trigger alerts.
Chris Chappell: Do you have any sense of why somebody would want these databases out in the open?
Victor Gevers: If you cannot ask for help, well, then maybe this could work. It is like, “Look at this, what’s going on?” On the other hand, we see so many open databases appear every day. We see databases even from very big security companies, companies that sell online security solutions to protect against hackers, and their databases are all swirled all throughout the Internet.
Chris Chappell: That makes me feel great.
Victor Gevers: Yeah, same here. The good thing is that this problem is not new. The problem was already identified in 2014, and people are just making the same mistakes over and over and over because that’s human behavior. And also the people that had learned from this mistake, an engineer or a security engineer doesn’t normally stay at his job forever. They move on. So then the next generation comes in. The younger generation, just fresh out of school, is asked, please build this system and deploy it, and that’s why we see the same mistakes being made over and over.
Shelley Zhang: Are these databases still open or did they…
Victor Gevers: Yes, most are still open. I have a list. If you give me a second, I’ll get you the actual list. I think I could share that list.
Chris Chappell: This is all speculation, but one thing I’ll just talk about a little while you’re looking for the list, that pops into my mind, yeah, it could be that something happened with the firewall, the Great Firewall of China, but also the Internet in China is often used in a political way. There are often sensitive dates where some significant political thing has happened. The Tiananmen Square Massacre is an example of a sensitive date. At various times throughout the past 10 years or so, you’ll see certain sensitive words blocked more often than not, but sometimes you’ll see words become suddenly uncensored. I know there was a period of time where for some reason searches for Falun Gong, which is a very heavily persecuted spiritual group in China and usually heavily, heavily censored, were suddenly unblocked for a brief period, and there was some speculation that maybe that was a political decision by some people within the Chinese government. So, I mean, it’s all speculation in this case, but, yeah, I do wonder if there could be an element of somebody using the original SenseNets exposure to achieve their own political ends?
Victor Gevers: It makes sense. So I found the tweet. I put it in the chat but can also read it for you. So China has at this moment more than 29,808 open databases, and they rank in second place in the top five list of countries that have MongoDB servers connected to the Internet. The U.S. is leading in first position with more than 45,000 servers, followed by Germany, the Netherlands and France.
Chris Chappell: US number one.
Victor Gevers: Oh yes. Obviously, because Amazon and Microsoft and Google Cloud, those three are actually the biggest Internet polluters. And if you can mention bad use of technology then, yeah.
Shelley Zhang: And then what do you mean by Internet polluter?
Victor Gevers: Well, it’s very important: when systems are not safe, hackers will go in, and I mean the bad hackers, the bad people. They will go in and own the systems. In the past they would just deface your website by putting a funny image on it. Nowadays, they use it to create botnets. Botnets are very scary because they can do a lot of damage. The biggest example of a botnet that was roaming last year was able to take down Netflix, because it was so big. And it used a vulnerability that had been known for two years in something called memcached. That is just open source software that you can install on your servers to handle traffic going to your website. And if you configure that wrong, other people can go to your server and abuse that power or use that system to target another system.
Chris Chappell: So when you say, these bots are very dangerous, like what, what do they do? They can shut down a website or?
Victor Gevers: Yeah, they can DDoS. They can indeed create so much DDoS traffic that websites go offline. One example that we know is the Mirai botnet in 2016, which disconnected an entire country from the Internet for 24 hours. And if someone is powerful enough to shut down or quiet Facebook and Netflix, these are not small parties, these parties are always present on the Internet. So when they’re not reachable anymore, you know that’s a bad time. Especially when those powerful DDoS services are going to target banks or supply chains for hospitals or supermarkets.
There’s another threat: if a country takes another country’s communication devices or any means of communication offline, and they’re not available anymore, that is actually the most scary thing, because if we lose communications, then it’s just game over.
Chris Chappell: Wow. I mean, I know the Chinese Communist Party is putting a lot of effort into training military hackers to do exactly that.
Victor Gevers: Yeah.
We need to make the Internet a more secure place. We have to take off the rotten fruit from the tree before it becomes a problem. Before the insects come in. You cannot, you know, you cannot use the tree anymore. So now this is for us, the most important thing to do is to have our communication–to be able to communicate with each other.
Chris: Recently, we celebrated International Women’s Day, and you uncovered another very large open database of 1.8 million women. Tell us a bit about that and how you found this thing?
Victor Gevers: We found a database like any other, and it looked like a dating website or a database. It was found by one of our younger volunteers and he said, “This is China. Victor, you take care of it.” Because we now have a new rule: everything in China goes past me, because I want to make sure that before we report something, we did a proper job in identifying what it is. So that’s one of the things that changed. So when we were looking at the database, the first fields were some basic information like the birthday and the people’s names. And then I saw identification card, um, which, you know, looked like a Chinese passport number or ID card number.
And, going on down, the sex, and if they had social media profiles. And some of them had Facebook profiles. Photo, apparently the photos were stored somewhere on the server; if they were politically active, what kind of education they had, and all the way down, home geographic location, longitude, latitude, which province, the city, the area, the city code, the city name, all the way down to the last field, labeled ‘breed ready’. So we were like, okay, wait a minute, what does that mean?! We’ve seen a lot of databases, especially now also in China, where sometimes you’ll see that they mix up names. So we were like, okay, maybe the developer made a mistake. And then we went back to the other fields. But wait a minute, if this is a dating website, where are the hobbies? Where are their interests? Favorite activities were missing. Such crucial information was missing.
But there were some other fields, like installation. Installation, what does it mean? And there was a description of a geographic location, like a building or a location in the building, for example, the roof, or the window, or the door. So that made it very confusing. This looks more like a surveillance database or something. Then we started querying the data.
So the first query that I kicked off was the average age of the people on this thing. It was 42. The average age of the people in the database is 42, okay. Well, I see another field, so: how many people are breed ready? And then it would return results. Then I was like, okay, how old is the youngest person with that status? Return: the youngest person with a breed ready status is 18 years old. And how old is the oldest woman in the database with the same status? There were a few women aged 95 and 93. I thought, this is clearly not a dating website. It must be something else.
The next step is to ask the Internet, “Have you ever seen this?” Then the responses came in. Some of them were very helpful.
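The sequence of questions described here (average age, how many records carry the flag, and the youngest and oldest person with it) can be sketched in a few lines. This is a rough illustration only: the records are made-up placeholders, the field names `age` and `breed_ready` are assumptions, and the real investigation queried a live MongoDB server, not an in-memory list.

```python
from statistics import mean

# Made-up sample records; the field names are hypothetical stand-ins
# for whatever the real database used.
records = [
    {"age": 18, "breed_ready": True},
    {"age": 31, "breed_ready": True},
    {"age": 42, "breed_ready": False},
    {"age": 95, "breed_ready": True},
]


def summarize(records):
    """Answer the four questions asked of the data, in one pass each."""
    flagged_ages = [r["age"] for r in records if r["breed_ready"]]
    return {
        # Average age across every record, flagged or not.
        "average_age": mean(r["age"] for r in records) if records else None,
        # How many records carry the flag.
        "flagged_count": len(flagged_ages),
        # Age range among flagged records only.
        "youngest_flagged": min(flagged_ages) if flagged_ages else None,
        "oldest_flagged": max(flagged_ages) if flagged_ages else None,
    }


summary = summarize(records)
print(summary)
```

Looking at the age range of the flagged records next to the overall average is what makes an outlier, such as a flagged record for someone in their nineties, stand out.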
It took us actually quite a few days to find another database that has almost the same data structure. We start comparing, hey, these are in the same network range as the previous one.
Chris Chappell: About this ‘breed ready’ database: you know, for decades China had the one-child policy, which basically resulted in China having 30 million more men than women. There’s the aging population. There are declining birth rates for a variety of reasons. Essentially, because there is such a disproportionate number of men, this has resulted in horrible sex trafficking.
And really the communist party has done so many crazy mindbogglingly horrible things.
Victor Gevers: You don’t need to be a China expert to come up with a plausible explanation.
Chris Chappell: Did you have any idea of the scale of mass surveillance in China before all this happened?
Victor Gevers: To be honest, no.
I’m an innovation manager. For me, technology like facial recognition and machine learning, you know, that makes my heart pump, that makes me happy. I love technology and I love technology being helpful to humans. Now, when I see technology being used in a way that we all agree is wrong, it’s like, hey, this is not what it’s supposed to be used for, and that makes me angry.
Chris Chappell: And if this episode about the Chinese Communist Party’s technological surveillance and overreach made you depressed, get ready for next week! Because we’ll be sitting down with a retired Air Force general to talk about how the Communist Party is infiltrating the U.S. economy.
Join us next time.
Thanks for listening to China Unscripted!