October 18, 2024

Meta has announced the release of several new AI models, including a notable "Self-Taught Evaluator" that aims to reduce human involvement in AI development. The evaluator was first described in an August paper, which explained how it relies on the "chain of thought" technique also used by OpenAI's recent o1 models: breaking a complex problem into smaller, logical steps improves the accuracy of responses to difficult questions in fields such as science, coding, and math.
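To make the idea concrete (the paper's own prompts are not reproduced here), a chain-of-thought prompt simply instructs the model to write out intermediate steps before committing to an answer. The sketch below is illustrative only; `query_model` is a hypothetical stand-in for a call to any LLM API, not code from Meta or OpenAI.

```python
# Minimal sketch of chain-of-thought prompting. `query_model` is a
# hypothetical placeholder for an LLM API call, not a released interface.

def query_model(prompt: str) -> str:
    """Placeholder: a real implementation would call an LLM endpoint."""
    return "1. ...\n2. ...\nFinal answer: ..."

def solve_with_chain_of_thought(question: str) -> str:
    # The prompt explicitly asks for numbered intermediate steps before the
    # final answer, which is the essence of the chain-of-thought technique.
    prompt = (
        "Solve the problem below. Work through it in small, numbered, "
        "logical steps, then state the final answer on its own line.\n\n"
        f"Problem: {question}\n"
    )
    return query_model(prompt)

if __name__ == "__main__":
    print(solve_with_chain_of_thought("What is 17 * 24?"))
```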
The Self-Taught Evaluator was trained entirely on AI-generated data, removing the need for human-labeled examples at that stage. The researchers suggest this capability could pave the way for autonomous AI agents that learn from their own mistakes, potentially evolving into digital assistants able to handle a wide range of tasks on their own. The method could also streamline processes such as Reinforcement Learning from Human Feedback (RLHF), which today depends heavily on specialized human annotators to label data and verify model responses.
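Based only on the description above, such a pipeline can be pictured roughly as follows: one model drafts contrasting answers, a judge model produces a reasoning trace and a verdict, and those verdicts stand in for human preference labels. Every function name in this sketch is a hypothetical placeholder, not Meta's actual interface.

```python
# Rough sketch of an AI-only evaluation loop: model-generated answers are
# judged by a model, and the judgments (rather than human annotations)
# become training data for the next evaluator. `generate_answer` and
# `judge_pair` are hypothetical placeholders.

from dataclasses import dataclass

@dataclass
class JudgedPair:
    prompt: str
    answer_a: str
    answer_b: str
    reasoning: str  # the judge's step-by-step trace
    verdict: str    # "A" or "B"

def generate_answer(prompt: str, temperature: float) -> str:
    """Placeholder: sample an answer from a base model."""
    return f"draft answer (T={temperature}) for: {prompt}"

def judge_pair(prompt: str, a: str, b: str) -> tuple[str, str]:
    """Placeholder: judge model returns (reasoning_trace, verdict)."""
    return ("Step 1: compare the two answers... Step 2: ...", "A")

def build_synthetic_preferences(prompts: list[str]) -> list[JudgedPair]:
    data = []
    for p in prompts:
        # Two samples at different temperatures give a contrasting pair.
        a = generate_answer(p, temperature=0.2)
        b = generate_answer(p, temperature=1.0)
        reasoning, verdict = judge_pair(p, a, b)
        data.append(JudgedPair(p, a, b, reasoning, verdict))
    # These AI-labeled pairs would then be used to fine-tune the evaluator,
    # with no human annotator in the loop.
    return data
```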
Jason Weston, one of the researchers, expressed optimism that as AI improves it will get better at checking its own work, eventually surpassing humans at that task. The concept of self-taught, self-evaluating AI is seen as essential for reaching superhuman levels of intelligence. Other companies, including Google and Anthropic, have explored similar ideas, although unlike Meta they have not released their models publicly.
In addition to the Self-Taught Evaluator, Meta unveiled an update to its image-identification Segment Anything model, a tool that speeds up LLM response generation, and resources aimed at aiding the discovery of new inorganic materials.