YOLO OpenAI Five ANYmal

YOLO is (or was when I was keeping track) the state-of-the-art real-time object detection model - it doesn't just classify a whole image, it draws labeled boxes around every object it finds. It (or Tiny YOLO, a lightweight variant) is capable of running at 30FPS, with low latency, on hardware you can fit into a battery-powered robot. It has a somewhat limited set of labels - you can see in the sample video that it's struggling to decide whether a falcon is a dog or a cat. Nonetheless, it's an extremely impressive model.
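Part of what makes YOLO-class detectors fast is that the expensive network pass is followed by only cheap post-processing: the network emits many overlapping candidate boxes per object, and non-maximum suppression (NMS) collapses them down to one box each. Here's a minimal sketch with made-up boxes (a generic greedy NMS, not YOLO's exact implementation):

```python
# Minimal sketch of non-maximum suppression (NMS), the post-processing
# step YOLO-style detectors use to collapse overlapping candidate boxes.
# Boxes are (x1, y1, x2, y2) corners; detections are (box, score) pairs.
# All data here is synthetic, for illustration only.

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

def nms(detections, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop overlapping rivals."""
    kept = []
    for box, score in sorted(detections, key=lambda d: -d[1]):
        if all(iou(box, k) < iou_threshold for k, _ in kept):
            kept.append((box, score))
    return kept

# Two near-duplicate boxes on one object, plus one far-away box.
dets = [((0, 0, 10, 10), 0.9), ((1, 1, 11, 11), 0.8), ((50, 50, 60, 60), 0.7)]
print(nms(dets))  # keeps the 0.9 box and the far-away 0.7 box
```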

OpenAI Five is a DOTA 2 bot made by the people who made GPT-3, and their insane amount of Musk/Gates money. There are a lot of smart people working at OpenAI, but their ability to just throw money (and therefore computing power) at the problem is what really puts them out ahead of the rest of the field. Anyways, this bot is as good as, or better than, the best DOTA 2 players in the world (at least when playing with a 17-character subset of the full 122-character roster). I find this a lot more impressive than similar victories in games like Chess or Go, because DOTA 2 much more closely resembles robotics (i.e. real-world) problems. It's real-time, there are teams, and the set of options available is continuous (you can pick any angle to move in, shoot at, etc.) instead of discrete (you can only place a Go piece on one of 361 locations).

Robotics has this paradigm of sense-plan-act, and right now there's rapid progress being made on all three parts, but hardware limitations on robots' ability to sense and act hold back development of the purely-software "plan" phase. Using a game like DOTA 2 seems like a very clever way to work on sense-plan-act without any hardware limits, since the world the AI inhabits is purely digital. So, given the ease with which OpenAI solved DOTA 2 by throwing neural nets and processing power at it, why does reaching human-level performance at "walking around and picking up objects" feel so much further off?
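The discrete/continuous distinction can be made concrete. In Go, a policy just scores a finite list of moves and picks the best one; in a DOTA-like or robotics setting, the policy has to emit real-valued actions. A toy sketch (not OpenAI Five's actual action parameterization, which I believe discretizes many of its outputs):

```python
# Toy contrast between discrete and continuous action spaces.
# Synthetic example - no game engine involved.
import math
import random

# Discrete: a 19x19 Go board gives exactly 361 possible placements.
go_moves = [(row, col) for row in range(19) for col in range(19)]

# A discrete policy is just a score per move; acting is an argmax.
scores = {move: random.random() for move in go_moves}
best_move = max(scores, key=scores.get)

# Continuous: any angle in [0, 2*pi) is a legal movement direction,
# so the policy must output a real number rather than pick from a list
# (here it's just sampled uniformly for illustration).
angle = random.uniform(0, 2 * math.pi)
direction = (math.cos(angle), math.sin(angle))
```

The practical upshot: the discrete case is a lookup over 361 options, while the continuous case has infinitely many actions, which is part of why real-world control is harder to brute-force.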

Okay so for this one you gotta just read that whole article, and watch both the videos. Something that maybe the article doesn't get across perfectly is just how impressive this is - it's a massive breakthrough in robotic locomotion! This is the cutting edge of the field, and the closest humans have ever gotten to creating a mammal-level intelligence. But, critically, ANYmal is really, really far away from that goal. It's mostly human-controlled, and it doesn't have the ability to recognize objects.

The Ontology Of Robotics