The s1 development team reported that they started with a ready-made base model and then improved it through distillation, a process in which reasoning abilities are extracted from another AI model by learning from its responses. According to the researchers, training s1 took about 30 minutes on 16 NVIDIA H100 GPUs.
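To make the distillation step concrete, here is a minimal sketch of how such a dataset of teacher responses might be collected. The article does not specify the teacher model or tooling, so query_teacher(), the sample questions, and the file name distilled_traces.jsonl are hypothetical placeholders, not the s1 team's actual setup.

```python
# Sketch of the distillation data-collection step: ask a stronger "teacher" model
# to answer each question and record its full response, so a smaller model can
# later be trained to imitate it. query_teacher() is a hypothetical placeholder
# for whatever API the teacher model is served through.
import json

def query_teacher(prompt: str) -> str:
    """Placeholder: send `prompt` to the teacher model and return its reply,
    including any step-by-step reasoning it produces."""
    raise NotImplementedError("connect this to the teacher model's API")

questions = [
    "How many prime numbers are there between 10 and 30?",
    "A train travels 120 km in 1.5 hours. What is its average speed?",
]

with open("distilled_traces.jsonl", "w") as f:
    for prompt in questions:
        teacher_response = query_teacher(prompt)
        # Store prompt/response pairs; these become the fine-tuning dataset.
        f.write(json.dumps({"prompt": prompt,
                            "teacher_response": teacher_response}) + "\n")
```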
The researchers report that reasoning models can be derived from a relatively small dataset through supervised fine-tuning (SFT), in which the AI model is explicitly trained to imitate behaviors demonstrated in the dataset.
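The sketch below illustrates what such supervised fine-tuning can look like in practice: the model is trained with a standard next-token objective to reproduce the teacher's recorded responses. The base model name, file path, and hyperparameters are illustrative assumptions only and do not reflect the s1 team's actual code or configuration.

```python
# Minimal SFT sketch: fine-tune a small causal language model to imitate
# teacher responses stored in distilled_traces.jsonl (see the previous sketch).
import json
import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "Qwen/Qwen2.5-0.5B-Instruct"  # placeholder base model for illustration

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # ensure padding is possible
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.train()

# Each record holds a question and the teacher model's full reasoning + answer.
with open("distilled_traces.jsonl") as f:  # hypothetical dataset file
    records = [json.loads(line) for line in f]

def collate(batch):
    # Concatenate prompt and teacher response; the model learns to reproduce both.
    texts = [r["prompt"] + "\n" + r["teacher_response"] for r in batch]
    enc = tokenizer(texts, return_tensors="pt", padding=True,
                    truncation=True, max_length=2048)
    enc["labels"] = enc["input_ids"].clone()      # standard causal-LM objective
    enc["labels"][enc["attention_mask"] == 0] = -100  # ignore padding in the loss
    return enc

loader = DataLoader(records, batch_size=2, shuffle=True, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

for epoch in range(3):
    for batch in loader:
        loss = model(**batch).loss  # cross-entropy on next-token prediction
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```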
The idea that a few researchers without millions of dollars in funding can still innovate in the field of AI is indeed fascinating. But s1 also raises important questions about the commercialization of AI models, given that anyone can now closely reproduce a multi-million-dollar model for very little money.