DeepSeek's R1 Advanced Reasoning System and What It Means for OpenAI: o3-mini Arrives in ChatGPT and the API Services
The "berry" stack is a codebase built for speed, and it is where much of the research that went into o1 was done. "There were trade-offs—experimental rigor for throughput," says a former employee with direct knowledge of the situation.
OpenAI spent years experimenting with reinforcement learning to fine-tune the model that eventually became the advanced reasoning system called o1. (Reinforcement learning is a process that trains AI models with a system of penalties and rewards.) DeepSeek built on that reinforcement learning work to create its R1 advanced reasoning system. A former OpenAI researcher who is not authorized to speak publicly about the company says DeepSeek benefited simply from knowing that reinforcement learning works.
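The reward-and-penalty idea itself is conceptually simple, even if applying it to large language models is not. The toy sketch below is purely illustrative—it is not OpenAI's or DeepSeek's training code—and shows the core loop: an agent tries actions, receives rewards or penalties, and gradually shifts toward the behavior that scores best. The action names and reward function are invented for the example.

```python
# A toy sketch of the reward-and-penalty loop behind reinforcement learning.
# An epsilon-greedy agent learns to prefer the action with the highest reward.
import random

actions = ["guess", "reason_step_by_step"]   # illustrative action names
values = {a: 0.0 for a in actions}           # estimated value of each action
counts = {a: 0 for a in actions}
epsilon = 0.1                                # exploration rate

def reward(action: str) -> float:
    # Stand-in reward model: careful reasoning is rewarded, guessing is penalized.
    return 1.0 if action == "reason_step_by_step" else -1.0

for step in range(1000):
    # Explore occasionally; otherwise exploit the best-known action.
    if random.random() < epsilon:
        action = random.choice(actions)
    else:
        action = max(values, key=values.get)
    r = reward(action)
    counts[action] += 1
    # Incremental average: nudge the value estimate toward the observed reward.
    values[action] += (r - values[action]) / counts[action]

print(values)  # the rewarded behavior ends up with the higher estimated value
```

At the scale of a frontier model, the same basic principle applies, but the "actions" are generated answers and the rewards come from scoring how well those answers solve problems.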
Some inside OpenAI want the company to build a unified chat product, one model that can tell whether a question requires advanced reasoning. So far, that hasn't happened; users instead pick o1 or GPT-4o from a drop-down menu.
OpenAI is particularly eager to demonstrate that it remains at the forefront of developing and commercializing AI, according to sources inside the company.
OpenAI CEO Sam Altman teased exactly two weeks ago that o3-mini would ship in “a couple of weeks,” and it’s arriving on time today. OpenAI is launching its latest reasoning model, o3-mini, in ChatGPT and its API services, and for the first time it is making a rate-limited version available to free ChatGPT users.
The company says the latest model also incorporates new features, including the ability to tap into web searches, call functions from a user’s code, and toggle between reasoning-effort levels that trade speed for problem-solving capability.
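In practice, those API-level features surface as request parameters. The snippet below is a minimal sketch, assuming the OpenAI Python SDK and the `reasoning_effort` setting exposed for o3-mini; the `get_stock_price` function and its schema are hypothetical, included only to show where a function from a user's code would plug in.

```python
# A minimal, unofficial sketch of calling o3-mini with function calling and a
# reasoning-effort setting, assuming the OpenAI Python SDK.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical user-defined function the model is allowed to call.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_stock_price",  # illustrative name, not from the article
            "description": "Look up the latest price for a ticker symbol.",
            "parameters": {
                "type": "object",
                "properties": {"ticker": {"type": "string"}},
                "required": ["ticker"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="o3-mini",
    reasoning_effort="high",  # "low" | "medium" | "high": trades speed for capability
    messages=[{"role": "user", "content": "Is NVDA up or down today?"}],
    tools=tools,
)

print(response.choices[0].message)
```

Lower effort levels return answers faster; higher levels let the model spend more time reasoning before it responds or decides to call a function.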
DeepSeek’s sudden rise has raised questions about the US government’s strategy to curb China’s progress in artificial intelligence. Two US administrations have introduced restrictions meant to cut off China’s access to the advanced Nvidia chips used to build cutting-edge AI models. DeepSeek’s research describes several types of Nvidia chips, but it’s not clear exactly which were used.
Why OpenAI Is Hiring PhD Students to Help Train a Powerful New Model
R1 continues to roil the US tech industry. The availability of such a powerful model for free puts pressure on rival companies to cut their prices.
OpenAI has evidently been using PhD students to help train a new model for some time. Several weeks ago, the company began recruiting PhD computer science students at $100 per hour for a “research collaboration” that would “involve working on unreleased models,” according to an email viewed by WIRED. An example problem in the job posting is strikingly close to one in a benchmark designed to test large language models’ ability to solve complex science problems.
The company said that o3-mini advances the boundaries of what small models can achieve.
OpenAI wants to answer the hype around DeepSeek’s new open source offering by making a smaller, more efficient version of its own model available for free.