#42: Facial recognition exodus, OpenAI's new GPT-3 language model, and Oil in the Cloud
Hey everyone, welcome to Dynamically Typed #42! Today I’m covering three big stories: big tech’s long-awaited move away from selling facial recognition APIs; OpenAI’s enormous (read: 100x bigger) new GPT-3 language model; and the Greenpeace report on tech’s machine learning contracts with oil and gas companies. I also have lots of quick links across productized AI, ML research, and climate change AI in today’s issue, so let’s dive straight in.
Productized Artificial Intelligence 🔌
Big tech companies are putting an end to their facial recognition APIs. Besides their obvious privacy problems, commercial face recognition APIs have long been criticized for inconsistent recognition accuracy across people of different backgrounds. Put bluntly: these APIs are better at identifying light-skinned faces than dark-skinned ones. Joy Buolamwini and Timnit Gebru first documented a form of this in their 2018 Gender Shades paper, and there have been many calls to block facial recognition APIs from being offered ever since; see Jay Peters’ article in The Verge for some more historical context.
It took two years and the recent reckoning with discrimination and police violence in the United States (see DT #41) for IBM to finally write a letter to the US Congress announcing they’re done with the technology:
IBM no longer offers general purpose IBM facial recognition or analysis software. IBM firmly opposes and will not condone uses of any technology, including facial recognition technology offered by other vendors, for mass surveillance, racial profiling, violations of basic human rights and freedoms, or any purpose which is not consistent with our values and Principles of Trust and Transparency.
Amazon and Microsoft followed soon after, pausing police use of their equivalent APIs. Notably, Google, where Gebru works, has never offered a facial recognition API. Now that these big-name tech companies are no longer providing facial-recognition-as-a-service, however, a new risk emerges. Benedict Evans, in his latest newsletter:
The catch is that this tech is now mostly a commodity (and very widely deployed in China) - Google can say “wait”, but a third-tier bucketshop outsourcer can bolt something together from parts it half-understands and sell it to a police department that says ‘it’s AI - it can’t be wrong!’.
This is a real risk, and that’s why the second half of these announcements is equally—if not more—important. Also from IBM’s letter to Congress:
We believe now is the time to begin a national dialogue on whether and how facial recognition technology should be employed by domestic law enforcement agencies.
The real solution here is not for individual big tech companies to be publicly shamed into stopping their facial recognition APIs, but for the technology to be regulated by law—so that a “third-tier bucketshop outsourcer” can’t do the same thing outside the public eye. So: these are good steps, but this week’s news is far from the last chapter in the story of face recognition.
Quick productized AI links 🔌
- 🖼 remove.bg, the service that automatically removes backgrounds from images, has released an update that significantly improves the quality of their cutouts. It includes better hair handling, edge color correction, and support for multi-object scenes. This is Software 2.0 in action: the same APIs are now powered by better models, providing better results for users who don’t have to change their workflow.
- 💱 Isaac Caswell and Bowen Liang summarized recent advances in Google Translate, linking out to papers and other posts describing each change in depth. These changes over the past year have together resulted in an average improvement of +5 BLEU across the 100+ languages now supported by Translate (see this fun gif), with low-resource languages improving by an additional +2 BLEU on average.
- 📱 Android 11 includes much-improved voice access: instead of having to say the number next to the part of the screen you want to tap, you can just say what you’re trying to do, and the phone is pretty good at understanding your intention. Check out Dieter Bohn’s demo video on Twitter.
- 🥁 Background noise removal in Google Meet has improved significantly: G Suite director of product management Serge Lachapelle made a demo video showing it muting all sorts of annoying meeting noises while preserving his voice. Reminds me of Krisp.ai (DT #16).
Machine Learning Research 🎛
OpenAI announced GPT-3, the next generation of its language model. As we’re used to by now, it’s another order of magnitude bigger than previous models, at 175 billion parameters—compared to 1.5 billion for GPT-2 and 17 billion for Microsoft’s Turing NLG (DT #33). It’s not the model’s size that’s interesting, though, but what this enables. From the abstract of the 74-page paper by Brown et al. (2020) detailing GPT-3:
Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. … For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model.
This is super cool! Where GPT-2 could only complete a passage from a given input in a natural-sounding way, GPT-3 can now do several tasks just from being shown examples. Instead of fine-tuning the model for specific tasks like translation, question-answering, or generating podcast episode titles that do not exist (👀), the model can do everything out of the box. For example, if you feed it several questions and answers prefixed with “Q:” and “A:” respectively, followed by a new question and “A:”, it’ll continue the passage by answering the question—without ever having to update its weights! Other examples include parsing unstructured text data into tables, improving English-language text, and even turning natural language into Bash terminal commands (but can it do git?).
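To make that few-shot setup concrete, here’s a minimal sketch of what such a “Q:/A:” prompt could look like through the API’s Python client; the example questions, engine name, and parameter values are illustrative assumptions on my part, not taken from the paper or OpenAI’s documentation.

```python
# A minimal sketch of GPT-3's few-shot "Q:/A:" prompting via OpenAI's Python
# client (access to the private API beta assumed). The questions, engine name,
# and parameters below are illustrative assumptions, not OpenAI's examples.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# The "training" happens entirely in the prompt: a few Q/A demonstrations,
# then a new question ending in "A:" for the model to complete.
prompt = (
    "Q: What is the capital of France?\n"
    "A: Paris\n"
    "Q: Who wrote Pride and Prejudice?\n"
    "A: Jane Austen\n"
    "Q: What is the largest planet in the solar system?\n"
    "A:"
)

response = openai.Completion.create(
    engine="davinci",   # assumed name for the full-size GPT-3 model
    prompt=prompt,
    max_tokens=10,
    temperature=0.0,    # keep the completion deterministic-ish for factual Q&A
    stop="\n",          # stop at the end of the answer line
)
print(response.choices[0].text.strip())  # hopefully: "Jupiter"
```

No gradient updates anywhere, which is exactly the point: the demonstrations condition the model purely through its input.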
OpenAI rolled out its previous model in stages, starting with a 117-million parameter version (“117M”) in February 2019 (DT #8), followed by 345M in May of that year (DT #13), 774M in September with a six-month follow up blog post (DT #22), and finally the full 1.5-billion parameter version in November (DT #27). The lab is doing the same for GPT-3, which is also the first model that it’s making commercially available in the form of an API. Just a few vetted organizations have had access to the API so far. Ashlee Vance for Bloomberg:
To date, Casetext has been using the technology to improve its legal research search service, MessageBird has tapped it for customer service, and education software maker Quizlet has used it to make study materials.
Janelle Shane also has access to GPT-3, and she has used the API to make some “spookily good Twitter bots” on her AI Weirdness blog.
I’m glad OpenAI is staging the release of their API this way again, since valid criticism has already started popping up: Anima Anandkumar pointed out on Twitter that GPT-2 has “produced shockingly racist and sexist paragraphs without any cherry picking.” (Also see this follow-up discussion with OpenAI policy director Jack Clark.) These types of bias problems have to be worked out before the model can responsibly be released beyond a few trusted partners, which OpenAI CEO Sam Altman also acknowledged in Vance’s piece:
As time goes on, more organizations will gain access, and then the API will be public. “I don’t know exactly how long that will take,” Altman said. “We would rather be on the too-slow than the too-fast side. We will make mistakes here, and we will learn.”
As the OpenAI API gets released more broadly and integrated into more products, I’ll keep following its progress.
Quick ML research + resource links 🎛 (see all 65 resources)
- 🔆 Nick Cammarata et al. published a new article on curve detectors in Distill’s Circuits thread (see DT #35, #37). It’s a deep dive into the 3b:379 neuron of the InceptionV1 network, and—as usual—it’s an exceptionally well-written and well-illustrated post. I also really like the idea of a CNN learning how to do e.g. curve detection (which hasn’t been solved classically!), and then teaching us how to implement it by hand. Goes to show the power of the Circuits hypothesis.
- 🗜 Vincent Sitzmann et al. introduced SIREN, a new activation function for implicit neural representations, a technique that encodes a signal (e.g. an image, audio sample, video clip, or 3D scene) in the parameters of a neural network. Their main innovation is using a periodic activation function (based on a sine wave) instead of the usual ReLU, TanH, or Softplus nonlinearities, which yields very impressive results. Check out their paper video and demo site. (A minimal sketch of the sine-activation idea follows after this list.)
- 🎞 Watch tip: Superintelligence: The Idea That Eats Smart People, a 2016 keynote by Maciej Ceglowski in which he compares machine learning to alchemy and artificial general intelligence to the philosopher’s stone. It’s quite relevant to the issues in our field today.
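To give a flavor of the SIREN idea from the second link above, here’s a minimal PyTorch sketch of a sine-activated network fitting a toy 1D signal; the layer sizes, toy signal, training loop, and initialization are simplified assumptions rather than the authors’ full implementation.

```python
# A minimal sketch of SIREN's core idea: swap ReLU for a sine activation so a
# small MLP can store a signal in its weights (an "implicit representation").
# Layer sizes, the toy signal, and the training loop are simplified
# assumptions, not the authors' full implementation.
import torch
import torch.nn as nn

class SineLayer(nn.Module):
    def __init__(self, in_features, out_features, omega_0=30.0):
        super().__init__()
        self.omega_0 = omega_0                 # frequency scaling factor
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x):
        return torch.sin(self.omega_0 * self.linear(x))

# Map a coordinate (here a 1D "time" value) to a signal value.
siren = nn.Sequential(
    SineLayer(1, 64),
    SineLayer(64, 64),
    nn.Linear(64, 1),
)

# Toy target signal to encode in the network's parameters.
t = torch.linspace(-1, 1, 1000).unsqueeze(1)
signal = torch.sin(10 * t) + 0.5 * torch.sin(23 * t)

optimizer = torch.optim.Adam(siren.parameters(), lr=1e-4)
for step in range(2000):
    loss = ((siren(t) - signal) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
# After training, siren(t) approximates the signal: the network *is* the encoding.
```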
Artificial Intelligence for the Climate Crisis 🌍
Overview of Greenpeace’s findings in their Oil in the Cloud report.
Greenpeace released their Oil in the Cloud report. Focusing on Google’s GCP, Amazon’s AWS, and Microsoft’s Azure, the report covers how these cloud companies are working with oil and gas companies. We’ve already heard a lot about this: it’s been highlighted in a viral Vox video, on the CCAI forums, and in the Tech Won’t Drill It pledge (see DT #33). This report adds an exhaustive overview of how cloud services—and sometimes machine learning—are involved in the different phases of oil and gas extraction:
- Upstream: finding and extracting oil and gas, using ML to fill in missing data and manage datasets.
- Midstream: transporting and storing oil and gas, using ML to monitor pipelines (see DT #39) and “optimize pipelines, inventory and workforce.”
- Downstream: refining, marketing and selling oil and gas; this seems more focused on other cloud services, not ML specifically.
Greenpeace found specific examples of contracts that all three companies had in at least one of these phases. It also notes that because of public outrage over the past few months, all three companies have deemphasized their oil and gas products on marketing websites. So far, though, it looks like only Google has actually committed to no longer taking on new oil and gas contracts (but still continuing with its existing contracts).
Overall, Amazon and Microsoft, the largest players in western cloud computing at 33% and 18% market share respectively, come out of this report looking pretty bad. Google, the smallest at 8%, is taking the biggest steps in the right direction.
Google is also the only one of the three that’s already matching its datacenter energy use with renewable power purchases, and it’s doing some very cool work to shift its workloads to the times when electricity grids are cleanest. If you’re working in ML and training your models in the cloud, encouraging your company or group to switch to GCP (away from AWS and Azure) is probably one of the highest-impact actions you can take for climate change right now.
Quick climate AI links 🌍
- 🌳 Brandt and Stolle (2020) propose a new method for detecting scattered trees outside of dense forests, “important for carbon sequestration, supporting livelihoods, maintaining ecosystem integrity, and climate change adaptation and mitigation.”
- 🌳 Williams et al. (2020) present a pipeline for tracking “early-successional” species of trees that are the first to appear after an area has been logged, when the forest begins to recover. They process videos from unmanned aerial vehicles (UAVs) and use SVMs and random forests to identify the crowns of different species of trees in Indonesia.
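For a rough sense of the classification step in that last pipeline, here’s a minimal scikit-learn sketch that compares an SVM and a random forest on per-crown feature vectors; the synthetic features, number of species, and feature count are stand-ins, not the authors’ data or code.

```python
# A rough sketch of the crown-classification step: given per-crown feature
# vectors (e.g. color and texture statistics extracted from UAV imagery),
# compare an SVM and a random forest. The synthetic data below is a stand-in
# for the authors' features, so accuracies here will hover around chance.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_crowns, n_features = 500, 16            # e.g. 16 color/texture statistics per crown
X = rng.normal(size=(n_crowns, n_features))
y = rng.integers(0, 3, size=n_crowns)     # 3 hypothetical early-successional species

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for name, clf in [("SVM", SVC(kernel="rbf")),
                  ("Random forest", RandomForestClassifier(n_estimators=200))]:
    clf.fit(X_train, y_train)
    print(f"{name} test accuracy: {clf.score(X_test, y_test):.2f}")
```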
Thanks for reading! As usual, you can let me know what you thought of today’s issue using the buttons below or by replying to this email. If you’re new here, check out the Dynamically Typed archives or subscribe below to get a new issue in your inbox every second Sunday.
If you enjoyed this issue of Dynamically Typed, why not forward it to a friend? It’s by far the best thing you can do to help me grow this newsletter. 🍹