Distill: Early Vision in CNNs
The largest neuron groups in the mixed3a layer of InceptionV1. (Olah et al., 2020)
Chris Olah and his OpenAI collaborators published a new Distill article: An Overview of Early Vision in InceptionV1. This work is part of Distill’s Circuits thread, which aims to understand how convolutional neural networks work by investigating individual features and how they interact through the formation of logical circuits (see DT #35). In this new article, Olah et al. explore the first five layers of Google’s InceptionV1 network:
Over the course of these layers, we see the network go from raw pixels up to sophisticated boundary detection, basic shape detection (e.g. curves, circles, spirals, triangles), eye detectors, and even crude detectors for very small heads. Along the way, we see a variety of interesting intermediate features, including Complex Gabor detectors (similar to some classic “complex cells” of neuroscience), black and white vs color detectors, and small circle formation from curves.
Each of these five layers contains dozens to hundreds of features (a.k.a. channels or filters), which the authors categorize into human-understandable groups; each group collects features that detect similar things at slightly different orientations, frequencies, or colors.
This goes from conv2d0, the first layer, where 85% of filters fall into two simple categories (detectors for lines and for contrasting colors, in various orientations), all the way up to mixed3b, the fifth layer, where there are over a dozen complex categories (detectors for small heads, for circles/loops, and much more).
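If you want to poke at individual features yourself, the lucid library that the Circuits work builds on can render a feature visualization given just a layer name and channel index. Here’s a minimal sketch, assuming a TensorFlow 1.x environment with lucid installed; the channel index is an arbitrary example, not one of the specific units discussed in the article.

```python
# Minimal lucid feature-visualization sketch (assumes TensorFlow 1.x + lucid).
# The channel index is arbitrary, not a specific unit from the article.
import lucid.modelzoo.vision_models as models
import lucid.optvis.render as render

model = models.InceptionV1()
model.load_graphdef()

# Optimize an input image that maximally activates channel 0 of mixed3a.
_ = render.render_vis(model, "mixed3a:0")
```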
We’ve known that there are line detectors in early network layers for a long time, but this detailed taxonomy of later-layer features is novel—and it must’ve been an enormous amount of work to create.
A circuits-based visualization of the black & white detector neuron group in layer mixed3a of InceptionV1. (Olah et al., 2020)
For a few of the categories, like the black & white and small circle detectors in mixed3a and the boundary and fur detectors in mixed3b, the article also investigates the “circuits” that formed them.
Such a circuit shows how strongly the presence of each earlier-layer feature excites or inhibits different spatial regions of the feature being investigated.
One of the most interesting aspects of this research is that some of these circuits—which were learned by the network, not explicitly programmed!—are super intuitive once you think about them for a bit.
The black & white detector above, for example, consists mostly of negative weights that inhibit colorful input features: the more color features in the input, the less likely it is to be black & white.
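As a rough sketch of what inspecting such a circuit weight could look like in code: the slice of a conv kernel that connects one earlier-layer channel to one later-layer channel is a small spatial kernel, and its sign pattern tells you where the earlier feature excites or inhibits the later one. The sketch below assumes a recent torchvision, uses its googlenet as a stand-in for InceptionV1 (the module names don’t map one-to-one onto the article’s layer names), and the channel indices are placeholders, not the actual black & white detector.

```python
# Rough sketch: inspect the weights connecting one earlier-layer channel to
# one later-layer channel. Assumptions: recent torchvision; googlenet as a
# stand-in for InceptionV1; placeholder channel indices (NOT the article's
# black & white detector).
from torchvision.models import googlenet

model = googlenet(weights="DEFAULT").eval()

# 3x3 conv inside the inception3a block; weight shape is
# [out_channels, in_channels, 3, 3].
w = model.inception3a.branch2[1].conv.weight.detach()

later_ch, earlier_ch = 5, 17       # hypothetical channel pair
kernel = w[later_ch, earlier_ch]   # 3x3 spatial kernel linking the two channels
print(kernel)
if kernel.sum() < 0:
    print("on balance, this earlier channel inhibits the later one")
else:
    print("on balance, this earlier channel excites the later one")
```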
The simplicity of many of these circuits suggests, to me at least, that Olah et al. are currently exploring one of the most promising paths in AI explainability research. (Although there is an alternate possibility, as pointed out by the authors: that they’ve found a “taxonomy that might be helpful to humans but [that] is ultimately somewhat arbitrary.”)
Anyway, An Overview of Early Vision in InceptionV1 is one of the most fascinating machine learning papers I’ve read in a long time, and I spent a solid hour zooming in on different parts of the taxonomy.
The groups for layer mixed3a are probably my favorite.
I’m also curious about how much these early-layer neuron groups generalize to other vision architectures and types of networks—to what extent, for example, do these same neuron categories show up in the first layers of binarized neural networks?
If you read the article and have more thoughts about it that I didn’t cover here, I’d love to hear them. :)