#59: A visual search engine, Google's camera-based vitals measurements, and two AI lab tooling long reads
Hey everyone, welcome to Dynamically Typed #59! I’ve got a classic links issue for you today, featuring:
- 🔌 Productized AI: camera-based vitals measurements on Google Pixel phones; and Microsoft’s new Speller100 models powering spelling correction in Bing.
- 🎛 ML Research: Papers with Code’s new Datasets section; two tooling long reads from the big AI labs (one from DeepMind on JAX and one from OpenAI on Kubernetes); and a new paper on “the AI brain drain” from academia to industry.
- ✨ Cool things: same.energy, a visual search engine for images that “feel” the same; and a website with submissions to the CVPR/ICCV computer vision art workshop.
Happy reading! After I send today’s newsletter out, I’ll be heading right back out on the ice — if you haven’t seen it yet, we’re having a pretty magical winter here in Amsterdam.
PS: Revue, the service I use to send you Dynamically Typed, got acquired by Twitter. This shouldn’t change anything for you, but I just wanted to let you know. 💌
Productized Artificial Intelligence 🔌
- ❤️ Google is adding camera-based vitals measurement to its Fit app on Android. Initially rolling out to Pixel phones, the new feature can measure your respiratory (breathing) rate by looking at your face and upper torso through the selfie camera — something that, judging from a cursory Scholar search, only became a mainstream research topic about two years ago! The rate at which computer vision research makes it from an idea to deployment on millions of phones remains pretty astonishing. The Fit app can also read your heart rate when you place your finger on the back-facing camera, though that part isn’t as new: I’ve used iPhone apps that did this years ago — but one big difference is that Google has actually run clinical studies to validate these features.
- 💱 Jingwen Lu, Jidong Long and Rangan Majumder wrote a blog post about Speller100, Microsoft’s zero-shot spelling correction models that now collectively work across 100+ languages. Speller100 is currently live in production as part of Microsoft’s Bing search engine, where it corrects typos in search queries — it’s what powers the “did you mean…” prompt. Although this feature has been around for English-language search queries for a very long time, Speller100 newly enables it for a whole host of smaller languages. It’s also an interesting case study of how an AI-powered refinement step of user input can significantly improve a product’s overall experience. By A/B testing Speller100 against not having spelling correction, the researchers found that it reduced the number of pages with no results by 30%, and manual query reformatting by 5%; and that it increased the number of clicks on spelling suggestions by 67%, and clicks on any item on the page by 70%.
Machine Learning Research 🎛
- 💻 Papers with Code is continuing on its quest to index more and more aspects of machine learning research. They’ve now launched a new Datasets section that lets you search benchmarks by the dataset and modality they’re based on. The page for ImageNet, for example, has a description, samples, usage statistics and metadata about the dataset, and links to 52 related benchmarks. Papers with Code’s execution has been impressive lately: in the last year and a half, they’ve also launched sotabench, Methods, arXiv integration and Axcell. These are all useful resources, and it seems their acquisition by Facebook in the middle of it all hasn’t slowed the team down one bit!
- ⚙️ AI lab tooling long read #1: DeepMind published a blog post about using JAX to accelerate their research. JAX is a modern take on the NumPy API that “includes an extensible system of composable transformations that help support machine learning research” by taking care of differentiation, vectorization (like abstracting batching away from the researcher), and JIT-compilation (for GPUs and TPUs); see the short sketch below this list for what those three look like in practice. The Python library now underpins many of DeepMind’s recent publications, and they’ve also open-sourced several components of their internal ecosystem on top of JAX: Haiku, Optax, RLax, Chex, and Jraph (“it’s pronounced gif”).
- ⚙️ AI lab tooling long read #2: OpenAI published a blog post about scaling Kubernetes to 7,500 nodes. Kubernetes is a system for orchestrating Docker containers across a datacenter, and I think most compute-heavy companies use it by now. Both startups I’ve worked at also use it for their machine learning workloads — but at a scale on the order of tens or hundreds of nodes, not many thousands. At that scale, a whole load of problems and potential optimizations suddenly become worth the engineering time to look at, and that’s exactly what OpenAI does in this detailed post. (A fun fact I quite enjoy and will probably never have a better excuse to share in DT than now: Kubernetes is abbreviated as K8s — “K-then-eight-letters-then-s,” like how internationalization is i18n — and there’s a management tool for Kubernetes called K9s. At first sight, the name just looks like a typical programmer move, “K8+1s = K9s,” but it has another level to it: if you pronounce K9s as a word, it sounds like “canines” — dogs! So the logo for K9s is a dog. 🐩)
- 💼 A new paper by Jurowetzki et al. (2021) quantifies transition flows of machine learning researchers between industry and academia and “finds that researchers working within the field of deep learning as well as those with higher average impact tend to transition into industry.” Juan Mateos Garcia, one of the authors, refers to this as “the AI brain drain.” This is quite a controversial topic and I haven’t made up my mind about how I feel about it yet, so I’d love to hear your thoughts. (Hit that reply button!)
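As promised in the JAX item above, here’s a minimal sketch of the three transformations it mentions (grad for differentiation, vmap for vectorization, and jit for compilation) applied to a toy linear-regression loss. This is plain Python using the jax library, written purely for illustration — it’s not code from DeepMind’s post.

```python
import jax
import jax.numpy as jnp

# A toy loss: mean squared error of a linear model y ≈ x @ w + b.
def loss(params, x, y):
    w, b = params
    pred = x @ w + b
    return jnp.mean((pred - y) ** 2)

# grad: automatic differentiation with respect to the params argument.
grad_loss = jax.grad(loss)

# jit: compile the whole update step with XLA for CPU/GPU/TPU.
@jax.jit
def update(params, x, y, lr=0.1):
    grads = grad_loss(params, x, y)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

# vmap: turn the whole-dataset loss into a per-example loss without
# writing any batching logic by hand.
per_example_loss = jax.vmap(loss, in_axes=(None, 0, 0))

key = jax.random.PRNGKey(0)
params = (jax.random.normal(key, (3,)), 0.0)   # (w, b)
x, y = jnp.ones((8, 3)), jnp.zeros((8,))

params = update(params, x, y)
print(per_example_loss(params, x, y).shape)    # (8,)
```

Libraries like Haiku (neural network layers) and Optax (optimizers) are built on top of exactly these primitives.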
Cool Things ✨
Same.Energy visual search
same.energy is a deep learning-powered visual search engine built by Jason Jackson (the creator of TabNine, see DT #18), who says it works similarly to OpenAI’s CLIP. It’s a really nice website to just browse around for a while and get lost in, which is also exactly what it’s meant for. Jackson, on the site’s about page, says:
“We believe that image search should be visual, using only a minimum of words. And we believe it should integrate a rich visual understanding, capturing the artistic style and overall mood of an image, not just the objects in it. We hope Same Energy will help you discover new styles, and perhaps use them as inspiration.”
This may remind you of Google’s similar image search, but I found that same.energy does a much better job of finding diverse results: where Google will usually show me different crops and resolutions of the same photo or different photos of the same object, same.energy manages to capture the “feel” of an image and show more (different!) images with that same “feel.” Some of my favorite searches so far are blue geometric patterns, glass art, and rusty surfaces.
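I don’t know what same.energy actually runs under the hood, but the general recipe behind this kind of search (embed every image with a CLIP-style encoder, then retrieve nearest neighbors by cosine similarity in embedding space) fits in a few lines of numpy. The embed() function mentioned in the comments is a hypothetical stand-in for whatever encoder same.energy really uses.

```python
import numpy as np

def cosine_search(query_vec, gallery_vecs, k=5):
    """Return the indices of the k gallery embeddings most similar to the query."""
    # Normalize so that a plain dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    g = gallery_vecs / np.linalg.norm(gallery_vecs, axis=1, keepdims=True)
    return np.argsort(-(g @ q))[:k]

# In a real system you'd build the gallery once with a CLIP-style encoder
# (embed() is hypothetical here) and then search it for every query image:
#   gallery = np.stack([embed(img) for img in images])
#   neighbors = cosine_search(embed(query_image), gallery)

# Tiny self-contained demo with random stand-in "embeddings":
rng = np.random.default_rng(0)
gallery = rng.normal(size=(1000, 512))
query = gallery[42] + 0.01 * rng.normal(size=512)
print(cosine_search(query, gallery))  # index 42 should come out on top
```

At the scale of a real search engine you’d swap the brute-force argsort for an approximate nearest-neighbor index, but the underlying idea is the same.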
Quick cool things links ✨
- 🎨 Just like the NeurIPS ML creativity workshop has a gallery of accepted works at aiartonline.com, I found that the CVPR/ICCV computer vision art workshop also has an equivalent: computervisionart.com! The winner of the 2019 workshop was Terence Broad, who trained GANs “to have more variation in the colors they produce […] without any data,” and produced an hour-long loop called (un)stable equilibrium 1:1. The website also has the short list and all other accepted work, which are worth browsing through. (The CFP for this year’s workshop is also now live; the deadline is March 15th.)
Thanks for reading! As usual, you can let me know what you thought of today’s issue using the buttons below or by replying to this email. If you’re new here, check out the Dynamically Typed archives or subscribe below to get a new issue in your inbox every second Sunday.
If you enjoyed this issue of Dynamically Typed, why not forward it to a friend? It’s by far the best thing you can do to help me grow this newsletter. ⛸