Malicious Life Podcast: Inside Clearview AI Facial Recognition

Clearview AI scrapes billions of images off social media and the open web, applies facial recognition algorithms to them, and sells that data to law enforcement agencies all over the world. But who are the people behind this secretive company, and what did a breach into its databases reveal?

Host Ran Levi is joined by Yossi Naar, chief visionary officer and co-founder at Cybereason, to explore the implications - check it out…

About the Guest

Yossi Naar

Chief Visionary Officer and Co-Founder at Cybereason

As chief visionary officer and co-founder at Cybereason, Yossi Naar is an accomplished software developer and architect.

During his 20 years of industry experience, Yossi has designed and built many products, from cutting-edge security platforms for the defense industry to big data platforms for the AdTech / digital marketing industry.

Yossi is also the visionary behind the Cybereason in-memory graph engine dubbed the Malop™ (malicious operation), which provides defenders with multi-stage visualizations of attack sequences that are context-rich and correlated from root cause across every affected device and user.

About the Host

Ran Levi

Born in Israel in 1975, Malicious Life Podcast host Ran studied Electrical Engineering at the Technion Institute of Technology, and worked as an electronics engineer and programmer for several High Tech companies in Israel.

In 2007, he created the popular Israeli podcast Making History. He is author of three books (all in Hebrew): Perpetuum Mobile: About the history of Perpetual Motion Machines; The Little University of Science: A book about all of Science (well, the important bits, anyway) in bite-sized chunks; Battle of Minds: About the history of computer malware.

About The Malicious Life Podcast

Malicious Life by Cybereason exposes the human and financial powers operating under the surface that make cybercrime what it is today. Malicious Life explores the people and the stories behind the cybersecurity industry and its evolution. Host Ran Levi interviews hackers and industry experts, discussing the hacking culture of the 1970s and 80s, the subsequent rise of viruses in the 1990s and today’s advanced cyber threats.

Malicious Life theme music: ‘Circuits’ by TKMusic, licensed under Creative Commons License. Malicious Life podcast is sponsored and produced by Cybereason.


TRANSCRIPT:
Clearview AI Episode

“[Nate] Yossi, where could I find your face now other than in front of me here?

[Yossi Naar] Well, that’s a good question. Probably everywhere.”

Yossi Naar is a security engineer of two decades, and one of the founders of Cybereason.

“[Yossi] Definitely you can find it on Facebook. You can find it on LinkedIn, on several newspapers, online publications, physical publications probably too.”

Yossi isn’t a public figure, but he’s been around the block enough that, if you manage to spell his name right on Google, you’ll find plenty of pictures of his face.

Some of you out there are in this same predicament. Some of you–due to your career, or because you’re proficient with social media–will have many more pictures online than Yossi does. But even for a cybersecurity audience, I can’t imagine many of you out there having no pictures out on the web. These days, it requires diligence and effort to be that private.

“[Nate] is it possible that I might find your image in places where you don’t yet realize it is?

[Yossi] I wouldn’t be surprised.”

Is it possible that you, listener, have images on the internet you’re not aware of?

You might not realize what’s out there–in the database for the gym you used to go to, or the office building you used to work at, or on your old MySpace account. The other day, Nate Nelson, our Senior Producer, ran his and my face through a “face search” engine to test it out.

[Nate] I found you at around 20-something with a Backstreet Boys haircut and an earring. And there was one headshot labeled “adult content” which, I can only assume, was because your old sideburns were so dang sexy.

Ah…yeah…that was my…*cough* Asimov period. But, hey, it’s nice to know I have an alternative career in the adult industry if this podcast thing never gets off the ground…

[Nate] I don’t know, Ran, there’s a reason they stuck you on the radio instead of on T.V.

*Scribbling notes* Note for later…fire Nate…

ANYWAY… The fact that there are pictures of me I don’t remember, in places I didn’t expect, doesn’t surprise me much. And that’s important for our story today. This episode is about a problem that arises when we have too many faces in too many places. Because, like any data, your face isn’t something to be carelessly tossed around. There’s a value to it. A market for it.

INTRO TO HOAN TON-THAT

Cam-Hoan Ton-That understood that before the rest of us.

You probably wouldn’t have pegged this guy to start a revolution in facial recognition technology. But then again, it’s impossible to tell what the heck this guy’s deal is. When you watch him on T.V., he seems utterly ordinary and harmless. When you look closer, things get a lot, lot weirder.

Hoan, an Australian of Vietnamese descent, is a young guy–born in 1988. In 2007, he dropped out of college and moved to San Francisco to become an app developer. He set up shop in a hip, up-and-coming part of town where startups tended to congregate, and did his coding at a stylish, high-end coffee shop in the neighborhood. So far, so good.

Here’s how Hoan describes his first two years in the industry. Quote:

“From July 2007 to July 2008, I built 16 Facebook apps (with different codebases) with a combined unique install base of 6 million. In March 2008 the applications had over 150 million page views. In August 2008, I sold the top apps (Have You Ever, Would You Rather, Friend Quiz and Romantic Gifts). I’ve also built 8 iPhone apps, notably Expando being the #2 app in September 2008 receiving 4 stars and over 400 reviews.”

It sounds pretty impressive, but we can’t say to what extent this information is actually true. What little reporting there is suggests that his early apps largely failed and, when you think about it, that makes sense. Why would anybody build 16 apps in two years, if they were doing so well? Mark Zuckerberg didn’t invent Facebook and go: “now I have to build a dozen more apps to go with it!”

In 2009, years before Hoan was truly worth writing about, a Gawker writer summarized his place in the San Francisco startup scene. Quote:

“Everything about Ton-That’s life and work is a screaming stereotype of San Francisco’s Web crowd — a bunch of supposed individualists who’d be paralyzed with fear by the idea that they’re not living in the right neighborhood, working in the right office, and chasing the right technological trend.”

If Hoan really were successful, and not just faking it, he probably wouldn’t have made the career choice he did a few months later. In 2009 he created the website “ViddyHo.com,” which asked users to log in with their Google accounts in order to watch a particular video. When the user did so, the website hijacked their account and sent malicious phishing links to all their contacts.

Police became interested in ViddyHo, and the website was shut down. Hoan then revived his computer worm via a new domain name–“fastforwarded.com”–and included the following disclaimer, quote: “We had a bug in our code that would send everyone a video when they logged in.”

Hoan soon gave up on hacking and got a job at AngelList. In 2016, he tried out modeling. And it kind of makes sense: he’s an interesting-looking guy, with soft facial features and dark brown eyes. He’s thin, and has long, flowing black hair that reaches below his shoulders. You wouldn’t think twice seeing him on a poster for H&M or J. Crew (which, let’s face it, isn’t something you can say about most programmers).

But even more than that, Hoan has that kind of vibe to him. He plays the guitar, and mixes androgynous high fashion with clean, tight suits. His personality somehow blends Silicon Valley with futuristic alien-hipster. In his Twitter bio, he described himself as an “Anarcho-Transexual Afro-Chicano American Feminist Studies Major.” If you’re confused, that’s by design–with Hoan, it’s always hard to tell where reality ends and trolling begins. I mean, even if we leave everything else aside, he’s a Vietnamese-Australian describing himself as an African-Mexican-American. We weren’t sure which pronouns to use to refer to Hoan in this story, because “anarcho-transexual” may be just as much of a ruse as “Afro-Chicano American.” We chose “he/him” as that is how he is referred to in the news, and that’s about as close to reality as you can get with this person.

ALT-RIGHT ASSOCIATIONS

So the question now is: how does an Anarcho-Transexual Afro-Chicano American Feminist Studies Major slash developer hacker slash model end up sparking a revolution in facial recognition technology? Well, the answer is obvious: by becoming a political extremist.

According to testimony and leaked documents obtained by the Huffington Post, it was around 2015 when Hoan Ton-That became involved in the highest rungs of the far-right, brushing shoulders with the likes of white nationalist Richard Spencer, and Pax Dickinson, the disgraced former CTO of Business Insider. He had connections with financiers of the far-right, and was a member of the Slack channel for WeSearchr–a short-lived crowdfunding site for conservative political causes. The channel included such esteemed individuals as the men’s rights activist and conspiracy theorist Mike Cernovich, and the famous hacker Andrew Auernheimer.

As he became embedded with the far right, Hoan developed a little app that allowed users to put Donald Trump hair on their selfies. It was a modest contribution to the pro-Trump movement, and not very popular. But his luck was about to turn.

At a Manhattan event hosted by a conservative think tank, Hoan met Richard Schwartz, a longtime aide for Trump’s personal lawyer, Rudy Giuliani. Hoan and Schwartz took to one another, and decided to partner up on a facial recognition business. Hoan would do the tech, and Schwartz would market it to his political connections.

Possibly even more crucial, though, was a man who joined up soon after. Charles Johnson is a former writer for the alt-right website Breitbart, with notable connections to the billionaire Peter Thiel. Sources claim that Charles and Hoan began working together in 2016, and at least two of Johnson’s colleagues joined in as well. By 2017, they’d homed in on a direction for the company: a facial recognition app for law enforcement.

Why law enforcement? For Hoan, maybe it was just another exciting new project in his illustrious career as a developer. Or maybe it was more than that. Johnson, his partner, had a very clear motive. In a Facebook post he reported, quote, “building algorithms to ID all the illegal immigrants for the deportation squads.” End quote.

In 2017, Peter Thiel became one of Hoan’s earliest investors, buying equity in the company at a price of $200,000. According to HuffPo, quote:

“Thiel himself has an obvious interest in mass surveillance: Palantir, his data-mining behemoth, aggregates enormous amounts of personal information about immigrants and undocumented workers, and it provides the analytical tools for ICE raids.”

With backing from mainstream investors like Thiel, and the co-founder of AngelList, Hoan distanced himself from his alt-right beginnings. As part of the re-brand, he would give his company a suitably vague, Silicon Valley-sounding name: “Clearview AI.” But even as he re-positioned himself as a proper startup founder with a legitimate, politically-neutral business, the DNA of Clearview–the underlying reason it was created in the first place–would never entirely go away.

CLEARVIEW EXPLAINED

“[Yossi] Clearview is a company that looks – that gathers data from kind of open source images. Well, open source perhaps is not the right word because nobody gave them permission.”

When you post a photo on social media, its visibility is dictated by whatever privacy settings you’ve set on your account. Like, maybe your vacation photo albums aren’t public but your profile picture is. You can adjust your settings to manage all this but, frankly, not everybody goes through the trouble. And sometimes you’re in someone else’s photo, in which case you’re not the one making these decisions.

“[Yossi] By the end of it all, you probably end up with a certain amount of photos which are out on the open internet. A lot of this data is just publicly available or rather publicly searchable. So you could run indexing against Facebook, against Twitter images.

They usually have a name attached to it and with the social networks moving towards a more and more verified model, you know, Facebook wants you to use your actual name when you’re on Facebook. So – and some people have other public or personal identifying data attached to it as well. Their email, their phone number, maybe their family members.”

Cumulatively what we’re talking about here are untold billions of photos, connecting faces with names and other personally identifying information, for most of the world’s internet-using population. All this, on the open internet. It’s something we don’t think about much, because it’s never been important.

It was never important, until Hoan Ton-That had an idea.

To build a facial recognition algorithm, you need two things: code and reference data. Around 2016-17, Hoan realized that all our photos, just waiting around on the internet, completely took care of that second part. It was free real estate. All he needed was to develop the code to gather, index and query it all. So he hired an engineer to build a program that would take all our photos, and organize them in a single database.

“[Yossi] So it’s actually not that hard. I mean if you kind of wanted to do it, it’s not particularly difficult. So basically what we do is they run a process called scraping, which is really looking through records in these companies.”

When Clearview scrapes photos from social media, it does so without the sites’ permission, and in violation of their terms. But it’s not actually illegal. It’s like going to an ice cream shop and sampling every single flavor: everybody knows you’re not supposed to do it, but nobody’s going to arrest you for it, either.

“[Yossi] So they do an unauthorized collection of photos mostly from social media and publicly available photos of people. There are a ton of these and a ton of sources, a ton of ways to get them. They attach them to names. [. . .] and they allow you basically to do reverse image search.”

HOW FACIAL REC WORKS

According to Hoan, Clearview built a database of over three billion images from the web. The next step was to design a program which could make sense of it all.

“[Nate] could you talk about how this kind of algorithm, this kind of tech works?

[Yossi] In the case of facial recognition, you really just want an image of the face and at a minimum a name that you can attach to it. The difficult problem is converting that into information that you can use to kind of hash or identify, kind of link up and match.

So all biometric matching algorithms are done through a process called hashing, which is taking aspects of your face, and we try to use things that are translation-free. So for example, the distance between your pupils, the distance between your pupils and your nose, the distance between your nose and your mouth, the shape of your mouth, the kind of relative length of your head.”
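To make Yossi’s description a little more concrete, here is a minimal Python sketch of the idea: reduce a face to a few measurements that don’t change when the picture is shifted or resized. It is only an illustration, not Clearview’s code. The landmark names (`left_pupil`, `nose_tip` and so on) and both function names are assumptions made up for this example, and the landmark coordinates are assumed to come from some separate detector that isn’t shown. What Yossi calls “hashing” is represented here simply as a short tuple of ratios.

```python
import math

# Hypothetical landmark names; a real system would get these (x, y) points
# from a face-landmark detector, which is not shown here.
def face_signature(landmarks):
    """Reduce named (x, y) landmark points to a few scale-free ratios."""
    left, right = landmarks["left_pupil"], landmarks["right_pupil"]
    nose, mouth = landmarks["nose_tip"], landmarks["mouth_center"]

    # The distance between the pupils is the yardstick for everything else,
    # so the result does not depend on how large the face is in the photo.
    eye_span = math.dist(left, right)
    if eye_span == 0:
        raise ValueError("pupils coincide; cannot normalize")

    eye_mid = ((left[0] + right[0]) / 2, (left[1] + right[1]) / 2)
    return (
        math.dist(eye_mid, nose) / eye_span,   # pupils to nose
        math.dist(nose, mouth) / eye_span,     # nose to mouth
        math.dist(landmarks["forehead"], landmarks["chin"]) / eye_span,  # head length
    )


def signature_distance(a, b):
    """Smaller means the two faces have more similar proportions."""
    return math.dist(a, b)
```

Because every measurement is a distance between two points divided by another distance, moving the face around the frame or zooming in and out leaves the signature unchanged. Two photos can then be compared by calling `signature_distance` on their signatures.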

Developing a biometric matching algorithm, these days, isn’t beyond reach for even moderately talented engineers. Clearview is perfect evidence of that. According to the New York Times, the entire algorithm was coded by just one engineer. It was based on existing academic research, but still…

THE TRANSLATION PROBLEM

The only real technical barrier was figuring out how the software would interpret data–this “translation” problem.

“[Yossi] you want features that if you turn your head a little bit to the left, a little bit to the right, that they would be preserved as much as possible. [. . .] No glasses, look head on. It makes it a lot simpler to identify a person when you see their face kind of head on. The head isn’t tilted in any way.

The more translation there is, the harder it is to kind of correct the image to its original form. However, there have been significant advances in image processing over the past couple of decades and translation or re-translating the image back to kind of its original shape, identifying the specific pivot and turn that the image has and to some extent, removing artifacts like, you know, identifying how you would look like maybe without your glasses makes it a little bit easier.”
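One small piece of that correction, undoing a tilted head, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: it works on 2D landmark points rather than pixels, it only handles in-plane tilt (not a head turned left or right), and the function names `roll_angle` and `level_face` are invented for this example.

```python
import math

def roll_angle(left_eye, right_eye):
    """Angle, in radians, of the line running from the left eye to the right eye."""
    return math.atan2(right_eye[1] - left_eye[1], right_eye[0] - left_eye[0])


def level_face(points, left_eye, right_eye):
    """Rotate points about the midpoint between the eyes so the eye line is horizontal."""
    angle = -roll_angle(left_eye, right_eye)  # rotate back by the measured tilt
    cx = (left_eye[0] + right_eye[0]) / 2
    cy = (left_eye[1] + right_eye[1]) / 2
    cos_a, sin_a = math.cos(angle), math.sin(angle)

    leveled = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        # Standard 2D rotation around (cx, cy); distances between points are preserved.
        leveled.append((cx + dx * cos_a - dy * sin_a,
                        cy + dx * sin_a + dy * cos_a))
    return leveled
```

After leveling, both eyes sit at the same height, so the measurements taken afterwards are comparable between a tilted photo and a straight one.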

Machine learning has improved, to where you don’t need perfect training data. But…

“[Yossi] if you’ve seen social media photos, they are quite difficult.”

Social media pictures exist in every conceivable form and fashion. It’s not easy for a program to distinguish your facial features in your class photo, where you’re just one small face among 50 others, or in those blurry photos you drunkenly took in a low-lit bar.

“[Yossi] So it’s not an easy thing to do. The algorithmics of it are a little bit difficult and you need kind of high quality photos. Unfortunately [. . .] the quality of photos is becoming a lot more improved over time and of course you can – there can be photos of you tagged with your identity without you even doing it, right? On social media, your friends take a photo of you. So the availability of photos where you can be identified has been significantly increased.”

ADVANCED AI CAPABILITIES

Have you ever uploaded a drunken, late-night photo to Facebook, but Facebook still suggested the correct people to tag even before you could tag them? The algorithms have gotten really good in recent years. In many cases, they beat humans. Two years ago, for instance, the National Institute of Standards and Technology, NIST, gave a set of 20 pairs of images, designed to be very difficult to parse, to two groups. The first group were leading facial recognition algorithms. The second were humans, but not just humans: experienced forensic examiners. A NIST researcher summed up their findings succinctly. Quote: “Well it turns out the best algorithm is comparable to the best humans.” End quote.

There are so many cases of machines now surpassing humans in facial recognition. In fact, it’s not even a new phenomenon. A decade and a half ago, researchers began building algorithms that could surpass our own ability to recognize one another, under specific conditions. Cold, heartless machines even beat us at recognizing emotions. In 2014, a company founded out of The University of California at San Diego developed an algorithm which could distinguish between when someone was making a genuine facial expression, and when they were just acting, at a rate of 85 percent accuracy. Humans, in that same test, could tell only 55 percent of the time.

In other words, machines would hate movies. They’d see right through Meryl Streep.

POOR AI FAILURES

This isn’t to say that all facial recognition is better than humans, or even good at all. It really depends on which algorithms you choose. Last year, NIST tested 189 different algorithms sold on the market, developed by 99 different companies. They conducted a variety of tests and, suffice it to say, the results varied.

“[Yossi] So NIST, they did an experiment where they tried to match up well-known public figures. I think they specifically tested congress people in the United States against a database of criminals and they got a surprisingly high match. So I think they got like 30 or 40 false positives [. . .] So these algorithms, they’re not super accurate and they used a really, really good data source, right? Because they were matching up against criminal photos, criminal photos. They’re taken as well-formed portrait photos of the people and they were matching them up against really nice, well-aligned photographs of the congress people.”

The extent of NIST’s findings was shocking, prompting calls for investigation by Washington lawmakers. We don’t have time to review the whole study, but for a sense of just how bad some of the results were, consider this: some algorithms falsely identified certain racial groups 100 times more often than others. 100 times! That is an incredible and dangerous result, for algorithms which are currently on the market and being used out in the world today.

HOW GOOD IS CLEARVIEW?

“[Yossi] With Clearview specifically, we don’t have access to their technology and to their specific database that they use.”

There’s reason to be suspicious about whether something like Clearview could actually work. For one thing, it was largely built by one engineer. (I still haven’t gotten over that part.) Furthermore, a database of three billion images is really difficult to wrangle. Having lots of training data is always good in AI, but having to cross-reference three billion data points for any given query is a different story.

“[Yossi] let’s say I look like a hundred people out of the billions that live in the world. Probably I look like more than a hundred people.

So take those hundred people. If all of these hundred are in the database, then there’s a good chance that if you showed even a human my picture and a picture of any of those hundred people, they would say, oh yeah, it’s this guy and it’s this guy and it’s this guy.

So the larger the database you’re using, the larger the probability of mistake.”
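Yossi’s point can be put into rough numbers. The Python sketch below assumes, purely for illustration, that every comparison against a stranger’s photo carries the same small, independent chance of a false match. Real systems won’t behave this neatly, and the function name `chance_of_false_match` and any rates you plug in are hypothetical, not figures from Clearview or NIST.

```python
import math

def chance_of_false_match(per_comparison_rate, database_size):
    """Probability of at least one false match across the whole database.

    Assumes each comparison is independent and has the same false-match
    rate, which is a simplification made for this illustration.
    """
    if not 0.0 <= per_comparison_rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    if per_comparison_rate == 1.0:
        return 1.0 if database_size > 0 else 0.0
    # 1 - (1 - p)^N, computed with log1p/expm1 so tiny rates keep their precision.
    return -math.expm1(database_size * math.log1p(-per_comparison_rate))
```

Under these assumptions, the chance of at least one wrong hit only ever goes up as the database grows: a false-match rate that looks negligible for a single comparison adds up quickly once it is repeated across millions or billions of faces.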

There’s no publicly verifiable data on the efficacy of Clearview’s algorithm, so we can only infer from circumstantial evidence. Like from the testimony of a Florida detective named Nick Ferrara, who spoke to the New York Times. For years, Ferrara had relied on Florida’s statewide facial recognition program called the “Face Analysis Comparison and Examination System,” or “FACES” for short. FACES worked because it was designed for use across law enforcement in the state, leveraging a database of 30 million Floridians’ mugshots and DMV photos (you’ll recall that these kinds of photos are the best kind you can give a program like this). But when Ferrara tried out Clearview, it was no contest. FACES didn’t touch the web, but Clearview pulled from everywhere. FACES required clear, straight-on pictures, but Clearview could handle angles, and even hats, glasses and partially-covered faces. When Ferrara fed it photos from old, dead-end cases, Clearview gave him back over 30 new suspects to look into.

Hoan Ton-That told CBS that his program runs at an accuracy rate of 99.6%. He told the New York Times it was 75%. Whatever the exact figure is, it’s definitely high, because Hoan has a lot of customers, and they pay an absolute premium for his service. Nick Ferrara’s police department pays $10,000 for a year’s subscription. According to CBS, Hoan’s biggest clients pay around $25,000 per year.

You better bet that, for $25,000 a year, this program can match faces pretty well.

HOAN’S ACHIEVEMENT

In a sense, Clearview was the thing that finally turned Hoan Ton-That into the visionary tech entrepreneur he always imagined he would be. Because, whether you like his idea or not, it’s built on the same kind of innovation that has defined our era of big data. Whether it’s Google, Equifax, or the entire internet-of-things industry, many of the most important developments in technology of the past 20 years have been born out of the idea that there is a wealth of data out in the world, waiting to be seized. The internet had websites, then Google indexed them. We all had personal information before Equifax collected and sold it. Facebook turned us into data-generating machines.

But we shouldn’t give Hoan too much credit. He wasn’t the first person to leverage online photos for facial recognition. Facebook already had complex facial recognition technology built into its program. And Google had considered a Clearview-like algorithm back when Hoan Ton-That was still writing malware. Google’s former Chairman, Eric Schmidt, explained that his company wouldn’t pursue facial recognition technology because it could be used in, quote, “a very bad way.”

So really, Hoan didn’t come up with anything new. He merely did what others before him were not willing to do: take everyone’s photos without asking, and use them in a very bad way.

Clearview, which had marketed itself behind closed doors for two years, became public in 2019, in association with a case of theft in Florida. Just as quickly, Twitter, Venmo, and the other companies it scraped from sent cease-and-desist orders to Hoan’s company. Facebook sent a cease-and-desist, despite Peter Thiel, one of its board members, being the project’s most prominent investor.

It’s unclear whether any of these orders had any effect, as Clearview hardly slowed down in the months that followed. Frankly, if Hoan Ton-That cared about following rules and getting permission, he wouldn’t have come up with Clearview in the first place. You just don’t beat a company like this by doing everything above board.

You have to get down in the mud. Play at their level.

HACK

On February 26th of last year, Clearview AI notified its clients that it had been breached. An unknown entity had gained unauthorized access to their systems.

You might say that a hacker scraped their data without permission.

Now, this is the point in the story where you’d expect us to criticize Clearview’s security–to talk about how a company could have stolen all our pictures, then lost them to a hacker. We’d talk about how irresponsible that is, and what it means for privacy in general.

But that’s not what happened at all. Instead, after the hacker gained access to Clearview’s systems, they went straight past their image databases, to their client databases. They stole a list of Clearview’s customers, as well as data on how many accounts they owned, and how many searches they made.

In a grand stroke of luck, this hacker turned out to be on our side. Their mission was righteous.

And kind of genius. Big tech couldn’t stop Clearview AI. Law enforcement wouldn’t. But this hacker spotted an exhaust port in the Death Star, and fired two proton torpedoes straight at it. They leaked Clearview’s clients to Buzzfeed, not to change anyone’s minds about Clearview, but to expose all their clients who wanted to use facial recognition, but didn’t want anyone to know about it.

And let’s just say: Hoan Ton-That has exaggerated a lot in his career, but this was not one of those cases. Clearview was more popular than he’d let on.

In addition to hundreds of local police departments in the U.S., it had users in Homeland Security, Customs and Border Protection, the FBI, even Interpol. It had clients in Canada, Australia and India.

On February 19th, in an interview with PBS, Hoan Ton-That was asked if he’d sell to countries where being gay is a crime. He didn’t answer. Just one week later, when Clearview’s clients were leaked, it became known that they had contracts in Saudi Arabia and the UAE.

And there was one more surprise in store. It turned out that Clearview AI was marketing itself not only to law enforcement, but to completely ordinary businesses, too, like Macy’s, Best Buy and Walmart. AT&T, Verizon, T-Mobile. Columbia University, and the University of Alabama. Equinox. The Chicago Cubs. And, as if having to watch the Knicks wasn’t already punishment enough, Madison Square Garden.

The list goes on from there. And as lawsuits came flowing in, and PR teams scrambled to explain why they were secretly using cutting-edge facial recognition on their customers, the Clearview hacker’s job was done. On May 7th, the company announced it would terminate its relationships with all private companies.

CONCLUSION

You can’t stop technological progress. You can’t stop people from using technology, or weaponizing it. You can only disincentivize its use, or defeat it with something better.

Clearview is unique, and its founder is utterly strange, but it is a perfect case study in the future of facial recognition technology. It demonstrates how quickly this technology is improving, and how useful it can be. It demonstrates how your face is one of the most exposed, misunderstood forms of personal data you have. It demonstrates just how many people want in on this racket, and what it’ll take to stop them.

Which brings us to the most important point of all. Clearview AI promised–at least outwardly–to stop taking on clients in the private sector, but by all accounts, they are still growing within law enforcement. In China, the U.S., and around the world, facial recognition is becoming more powerful and more integrated into policing. It’s being used to catch criminals, but also to track peaceful protestors, and even ordinary pedestrians.

In the next episode of Malicious Life: what happens when Big Brother comes for you.