Same.Energy, a visual search engine
Same.Energy visual search
same.energy is a deep learning-powered visual search engine built by Jacob Jackson (the creator of TabNine, see DT #18), who says it works similarly to OpenAI’s CLIP.
It’s a really nice website to just browse around for a while and get lost in, which is also exactly what it’s meant for.
Jackson, on the site’s about page, says:
We believe that image search should be visual, using only a minimum of words.
And we believe it should integrate a rich visual understanding, capturing the artistic style and overall mood of an image, not just the objects in it.
We hope Same Energy will help you discover new styles, and perhaps use them as inspiration.
This may remind you of Google’s similar image search, but I found that same.energy does a much better job of finding diverse results: where Google will usually show me different crops and resolutions of the same photo or different photos of the same object, same.energy manages to capture the “feel” of an image and show more (different!) images with that same “feel.” Some of my favorite searches so far are blue geometric patterns, glass art, and rusty surfaces.
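Jackson hasn’t published Same.Energy’s internals, but the CLIP comparison suggests the basic recipe: embed every image with a model like CLIP and return nearest neighbors in embedding space, where “nearby” captures style and mood as well as objects. Here’s a minimal sketch of that idea using OpenAI’s open-source CLIP package and PyTorch (the folder and file names are hypothetical, and this is my guess at the approach, not the site’s actual pipeline):

```python
# Minimal sketch of CLIP-based "same energy" search: embed a folder of images
# with OpenAI's open-source CLIP model, then rank them by cosine similarity
# to a query image. (Same.Energy's actual pipeline isn't public.)
from pathlib import Path

import clip          # pip install git+https://github.com/openai/CLIP.git
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def embed(path: Path) -> torch.Tensor:
    """Return a unit-length CLIP embedding for one image."""
    image = preprocess(Image.open(path).convert("RGB")).unsqueeze(0).to(device)
    with torch.no_grad():
        features = model.encode_image(image)
    return features / features.norm(dim=-1, keepdim=True)

paths = sorted(Path("photos/").glob("*.jpg"))       # hypothetical image folder
index = torch.cat([embed(p) for p in paths])        # (N, 512) embedding matrix

query = embed(Path("query.jpg"))                    # image whose "feel" we want more of
scores = (index @ query.T).squeeze(1)               # cosine similarity (unit vectors)
for score, path in sorted(zip(scores.tolist(), paths), reverse=True)[:10]:
    print(f"{score:.3f}  {path.name}")
```

At Same.Energy’s scale you’d swap the brute-force matrix multiply for an approximate nearest-neighbor index, but the embed-then-rank-by-similarity idea stays the same.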
pixel-me.tokyo: GAN for 8-bit portraits
Pixel-me at graduation.
Japanese developer Sato released pixel-me.tokyo, a website that generates an 8-bit style portrait based on user-submitted photos.
It works both on human faces and pets.
Sato also previously made AI Gahaku (“AI master painter”), which creates portraits in the style of classical painters.
He built both projects using pix2pix, a conditional generative adversarial network (cGAN) that’s commonly used for AI art projects.
Google developer advocate Kaz Sato (“similar names, but we are not the same person”) wrote up a nice piece for the Google Cloud blog about Sato’s process for building the projects.
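For a sense of how pix2pix-style models learn these photo-to-portrait mappings, here’s a heavily simplified PyTorch sketch of the conditional GAN objective from the pix2pix paper. The tiny networks and random tensors are placeholders (the real model uses a U-Net generator, a PatchGAN discriminator, and paired training data); this is not Sato’s actual code:

```python
# Sketch of the pix2pix (conditional GAN) objective: the generator translates
# an input photo into a target-style image, the discriminator judges
# (photo, image) pairs, and an L1 term keeps the output close to the target.
import torch
import torch.nn as nn

G = nn.Sequential(  # placeholder generator: photo -> stylized image
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
)
D = nn.Sequential(  # placeholder discriminator: (photo, image) pair -> real/fake score
    nn.Conv2d(6, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 1, 4, stride=2, padding=1),
)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()

photo = torch.rand(1, 3, 64, 64)              # input photo (stand-in data)
target = torch.rand(1, 3, 64, 64) * 2 - 1     # ground-truth stylized version in [-1, 1]

# Discriminator step: real pairs should score 1, generated pairs 0.
fake = G(photo).detach()
d_real = D(torch.cat([photo, target], dim=1))
d_fake = D(torch.cat([photo, fake], dim=1))
loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: fool the discriminator + stay close to the target (L1, weight 100).
fake = G(photo)
d_fake = D(torch.cat([photo, fake], dim=1))
loss_g = bce(d_fake, torch.ones_like(d_fake)) + 100.0 * l1(fake, target)
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```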
AR Face Doodles
Yours truly, now with mustache, beard, and brows.
Cyril Diagne, resident artist/designer/programmer at Google Arts & Culture, built AR Face Doodle, a website that lets you draw on your face in 3D.
It’s powered by MediaPipe Facemesh, “a lightweight machine learning pipeline predicting 486 3D facial landmarks to infer the approximate surface geometry of a human face,” which can run in real time in the browser using TensorFlow.js.
The site lets you draw squiggles on top of your selfie camera feed and then locks them to the closest point on your face.
As you move your face around—or even scrunch it up—the doodles stick to their places and move around in 3D remarkably well.
AR Face Doodle should work on any modern browser; you can also check out the site’s code on GitHub: cyrildiagne/ar-facedoodle.
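The core trick, snapping each drawn point to its nearest face-mesh landmark and re-rendering it wherever that landmark moves, is simple enough to sketch. The version below is not Diagne’s browser/TensorFlow.js implementation; it’s a rough Python approximation using MediaPipe’s face-mesh solution and OpenCV, where clicking on the webcam window “paints” a dot that then sticks to your face:

```python
# Rough sketch: every clicked point is attached to the nearest face-mesh
# landmark, so the doodle follows that landmark as the face moves.
import cv2                     # pip install opencv-python
import mediapipe as mp         # pip install mediapipe
import numpy as np

face_mesh = mp.solutions.face_mesh.FaceMesh(max_num_faces=1)
doodle = []  # indices of the landmarks the user has "painted" on

def on_click(event, x, y, flags, param):
    """Attach a clicked pixel to the closest face-mesh landmark."""
    if event == cv2.EVENT_LBUTTONDOWN and param["landmarks"] is not None:
        dists = np.linalg.norm(param["landmarks"] - np.array([x, y]), axis=1)
        doodle.append(int(dists.argmin()))

state = {"landmarks": None}
cv2.namedWindow("doodle")
cv2.setMouseCallback("doodle", on_click, state)

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    h, w = frame.shape[:2]
    results = face_mesh.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_face_landmarks:
        lm = results.multi_face_landmarks[0].landmark
        state["landmarks"] = np.array([[p.x * w, p.y * h] for p in lm])
        for idx in doodle:  # redraw each doodle point at its landmark's current position
            x, y = map(int, state["landmarks"][idx])
            cv2.circle(frame, (x, y), 4, (0, 0, 255), -1)
    cv2.imshow("doodle", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```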
3D photography using depth inpainting
Layered depth inpainting. (Shih et al., 2020)
Here’s another cool AI art piece that can’t be done justice using just the static screenshot above: Shih et al. (2020) published 3D Photography using Context-aware Layered Depth Inpainting at this year’s CVPR conference.
Here’s what that means:
We propose a method for converting a single RGB-D input image into a 3D photo, i.e., a multi-layer representation for novel view synthesis that contains hallucinated color and depth structures in regions occluded in the original view.
Based on a single image (plus depth information), they can generate a 2.5-dimensional representation of the scene and realistically re-render it from perspectives slightly different from the one it was originally captured from.
Contrast that with recent work on neural radiance fields, which requires on the order of 20-50 images to work (see DT #36).
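To see why the inpainting matters, consider the naive approach: just shift each pixel sideways by a parallax proportional to its inverse depth. That opens up holes behind foreground objects, and those disocclusions are exactly what Shih et al.’s context-aware layered depth inpainting fills in. Here’s a toy NumPy illustration of the naive warp (not the paper’s method; the input file names are hypothetical):

```python
# Naive RGB-D view shift: nearer pixels move further, leaving holes
# (disocclusions) that a method like Shih et al.'s has to inpaint.
import numpy as np
from PIL import Image

rgb = np.array(Image.open("image.png").convert("RGB"))
depth = np.array(Image.open("depth.png").convert("L")).astype(np.float32) + 1.0

h, w = depth.shape
baseline = 15.0                                        # virtual camera shift, in pixels
shift = (baseline * depth.min() / depth).astype(int)   # parallax ~ inverse depth

novel = np.zeros_like(rgb)
hole = np.ones((h, w), dtype=bool)
ys, xs = np.mgrid[0:h, 0:w]
new_x = np.clip(xs + shift, 0, w - 1)

# Paint far pixels first so nearer pixels overwrite them (painter's algorithm).
order = np.argsort(depth.ravel())[::-1]
novel[ys.ravel()[order], new_x.ravel()[order]] = rgb.reshape(-1, 3)[order]
hole[ys.ravel()[order], new_x.ravel()[order]] = False

novel[hole] = [255, 0, 255]                            # highlight regions left to inpaint
Image.fromarray(novel).save("novel_view.png")
```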
Shih et al. set up a website with some fancy demos, which is definitely worth a look; see these gifs on Twitter too.
One of the authors also works at Facebook, so I wonder if we’ll one day see Instagram filters with this effect—or if it’ll be a part of Facebook’s virtual reality ambitions.
Since the next generation of iPhones will likely have a depth sensor on the back too, I expect we’ll see a lot of this 2.5D photography stuff in the coming years.
Neural radiance fields for view synthesis: NeRF
Novel views generated from pictures of a scene.
Representing Scenes as Neural Radiance Fields for View Synthesis, or NeRF, is some very cool new research out of UC Berkeley.
Using an input of 20-50 images of a scene taken from slightly different viewpoints, they are able to encode the scene into a fully-connected (non-convolutional) neural network that can then generate new viewpoints of the scene.
It’s hard to show this in static images like the ones I embedded above, so I highly recommend you check out the excellent webpage for the research and the accompanying video.
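To make the “scene encoded in a fully-connected network” idea concrete, here’s a minimal PyTorch sketch of the kind of MLP NeRF optimizes: positionally encoded 3D coordinates and viewing directions go in, color and volume density come out, and novel views are rendered by integrating those outputs along camera rays. The real model is deeper, feeds the viewing direction in later, and is trained by volume-rendering rays against the input photos; the ray marching and training loop are omitted here:

```python
# Toy NeRF-style MLP: (encoded position, encoded view direction) -> (RGB, density).
import math

import torch
import torch.nn as nn

def positional_encoding(x: torch.Tensor, n_freqs: int = 10) -> torch.Tensor:
    """Map each coordinate to [x, sin(2^k pi x), cos(2^k pi x)] features."""
    feats = [x]
    for k in range(n_freqs):
        feats += [torch.sin(2.0**k * math.pi * x), torch.cos(2.0**k * math.pi * x)]
    return torch.cat(feats, dim=-1)

class TinyNeRF(nn.Module):
    def __init__(self, n_freqs: int = 10, hidden: int = 256):
        super().__init__()
        in_dim = 3 + 3 * 2 * n_freqs    # encoded xyz
        dir_dim = 3 + 3 * 2 * n_freqs   # encoded view direction
        self.mlp = nn.Sequential(
            nn.Linear(in_dim + dir_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),       # RGB (3) + volume density (1)
        )

    def forward(self, xyz: torch.Tensor, view_dir: torch.Tensor):
        h = torch.cat([positional_encoding(xyz), positional_encoding(view_dir)], dim=-1)
        rgb, sigma = self.mlp(h).split([3, 1], dim=-1)
        return torch.sigmoid(rgb), torch.relu(sigma)  # color in [0,1], non-negative density

# Query the scene representation at a batch of 3D points along some camera rays.
model = TinyNeRF()
points = torch.rand(1024, 3)
dirs = torch.nn.functional.normalize(torch.rand(1024, 3), dim=-1)
colors, densities = model(points, dirs)
print(colors.shape, densities.shape)   # torch.Size([1024, 3]) torch.Size([1024, 1])
```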