Dynamically Typed

Papers with Code sotabench

The sotabench homepage. (sotabench)

The team behind Papers with Code has launched sotabench. The name derives from “state of the art (sota)” + “benchmark”, and its mission is precisely that: to benchmark every open source model, for free!

This is super cool. A researcher just needs to implement a small Python file that specifies how to run their model on some given test data. They can then submit their repository to sotabench, which tracks it and runs the model on standardized test data for every commit to the master branch. This way, it independently keeps track of whether models achieve the performance claimed by the authors (within some benchmark-specific error range).
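To make the "within some error range" idea concrete, here is a minimal sketch of that kind of check. It is not sotabench's actual code or API: the `BenchmarkResult` class, the `reproduces_claim` function, the model names, the scores, and the tolerance values are all hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    """Hypothetical record pairing a claimed score with a re-measured one."""
    model_name: str
    claimed: float    # score reported by the authors, e.g. top-1 accuracy
    measured: float   # score obtained by re-running the model on the test data
    tolerance: float  # assumed benchmark-specific allowed absolute difference


def reproduces_claim(result: BenchmarkResult) -> bool:
    """Return True if the measured score is within tolerance of the claim."""
    return abs(result.measured - result.claimed) <= result.tolerance


# Made-up numbers, not real benchmark results.
results = [
    BenchmarkResult("model-a", claimed=0.760, measured=0.758, tolerance=0.005),
    BenchmarkResult("model-b", claimed=0.800, measured=0.770, tolerance=0.005),
]

for r in results:
    status = "reproduced" if reproduces_claim(r) else "not reproduced"
    print(f"{r.model_name}: claimed {r.claimed:.3f}, measured {r.measured:.3f} -> {status}")
```

Running a comparison like this on every commit is what would let a tracker flag the moment a repository stops matching its paper's numbers.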

The project is run by Atlas ML, a company whose mission is to “advance open source deep learning” (emphasis mine).

“We believe the software of the future should be accessible to everyone, not just large technology companies. We are realising this future by building breakthrough tooling that allows the world to build and collaborate on ambitious deep learning projects.”

Atlas ML was co-founded by Robert Stojnic, one of the first Wikipedia engineers. It’s therefore not surprising that the team’s main objective is to push the open and collaborative values that also drive Wikipedia. The meta-dataset resulting from sotabench will surely also lead to lots of interesting research on reproducibility and on how model characteristics relate to performance. Check out the project at sotabench.com.