Summary: The Generalization Capabilities of Large Neural Networks

We have seen a transformation over the years away from training large models entirely from scratch, and toward methods that utilize finetuning or few-shot learning. Classically, models might be pre-trained on a large-scale supervised or self-supervised task (e.g., pre-training a large ResNet model on ImageNet), and then the last few layers of the model might be fine-tuned on a much smaller dataset for the task of interest. More recently, open-vocabulary vision-language models and promptable language models have made it possible to avoid fine-tuning, and instead define new tasks by constructing a textual prompt, potentially containing a few examples of input-output pairs.
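The classic recipe described above can be sketched in PyTorch. This is a minimal illustration, not the article's code: a small `nn.Sequential` trunk stands in for a pre-trained backbone (in practice you would load a pretrained ResNet from torchvision), and only a newly attached head is updated on the small downstream dataset.

```python
import torch
import torch.nn as nn

# Stand-in for a pre-trained backbone (e.g. a ResNet trunk); in practice
# this would be loaded with weights from large-scale pre-training.
backbone = nn.Sequential(
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
)
head = nn.Linear(32, 5)  # new task-specific layer, e.g. 5 classes

# Freeze the backbone: only the head is fine-tuned on the small dataset.
for p in backbone.parameters():
    p.requires_grad = False

model = nn.Sequential(backbone, head)
optimizer = torch.optim.Adam(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3
)

# One fine-tuning step on a toy batch.
x = torch.randn(8, 128)
y = torch.randint(0, 5, (8,))
loss = nn.functional.cross_entropy(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Only the head's weights and biases are trainable.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
```

Freezing the trunk keeps the representation learned during pre-training intact, so the handful of head parameters can be fit from far less data than full training would require.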

In robotics, learning policies from a small amount of data is even more important. A central benefit of robotic learning should be to enable rapid and autonomous acquisition of new tasks on command, but if each task requires either a large human-provided demonstration dataset or a long


Read the complete article at: sergeylevine.substack.com
