OpenAI recently launched CriticGPT, a model powered by GPT-4. As the name suggests, the model "writes critiques of ChatGPT responses to help human trainers spot mistakes" in ChatGPT's code output.
According to the ChatGPT maker:
"We found that when people get help from CriticGPT to review ChatGPT code, they outperform those without help 60% of the time. We are beginning the work to integrate CriticGPT-like models into our RLHF labeling pipeline, providing our trainers with explicit AI assistance."
OpenAI plans to use Reinforcement Learning from Human Feedback (RLHF) to make ChatGPT more "helpful and interactive." An integral part of this process involves collecting comparisons from AI trainers, who rate different ChatGPT responses against each other.
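To make the idea of trainer comparisons concrete, here is a minimal Python sketch. It assumes each comparison is recorded as a (preferred, rejected) pair of response labels and simply tallies how often each response wins. The function name `win_rates` and the labels are hypothetical, and this is a deliberate simplification: in RLHF such comparisons are typically used to train a reward model rather than counted directly, and nothing here reflects OpenAI's actual pipeline.

```python
from collections import defaultdict


def win_rates(comparisons):
    """Return the fraction of comparisons each response won.

    `comparisons` is a list of (preferred, rejected) label pairs,
    one pair per trainer judgment (a hypothetical data format).
    """
    wins = defaultdict(int)
    appearances = defaultdict(int)
    for preferred, rejected in comparisons:
        wins[preferred] += 1
        appearances[preferred] += 1
        appearances[rejected] += 1
    # Responses that never won still appear, with a rate of 0.0.
    return {label: wins[label] / appearances[label] for label in appearances}


# Three trainer judgments over three candidate responses.
judgments = [("A", "B"), ("A", "C"), ("B", "C")]
rates = win_rates(judgments)
```

In this toy data, response A is preferred every time it appears and response C never is, which is the kind of signal a reward model would learn from.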
CriticGPT will help improve ChatGPT's reasoning capabilities, ultimately reducing hallucinations, that is, the generation of incorrect responses and misinformation. This matters because it is becoming increasingly hard for AI trainers to identify mistakes as ChatGPT advances.
The tool is primarily trained to identify inaccuracies in ChatGPT answers and write critiques highlighting them. OpenAI admits the tool isn't always accurate, but it helps AI trainers identify errors faster and more easily than they would without AI assistance.
CriticGPT will reportedly augment trainers' skills, ultimately equipping people with more comprehensive critique techniques. While AI trainers and CriticGPT can each get the job done on their own, a Human+CriticGPT combination seemingly produces more thorough, accurate, and detailed critiques.
According to OpenAI's findings:
"We find that CriticGPT critiques are preferred by trainers over ChatGPT critiques in 63% of cases on naturally occurring bugs, in part because the new critic produces fewer "nitpicks" (small complaints that are unhelpful) and hallucinates problems less often."
While impressive, CriticGPT still needs a lot of work, and OpenAI has acknowledged that the model has shortcomings.
In the future, OpenAI intends to build on CriticGPT by improving its RLHF data for GPT-4 training. In a separate report, Oxford researchers leveraged semantic entropy to assess the meaning of generated outputs, gauge the quality of responses, and spot traces of hallucination.
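The intuition behind semantic entropy can be shown with a minimal Python sketch, assuming several answers have been sampled for the same question. Answers are grouped by meaning, and the entropy of the group sizes is computed: a low value means the model keeps giving the same answer, while a high value suggests it may be making things up. The `meaning_key` helper is a hypothetical stand-in that only normalizes case, punctuation, and spacing; the research relies on a far more capable comparison of meanings, which this sketch does not attempt to reproduce.

```python
import math
from collections import Counter


def meaning_key(answer: str) -> str:
    """Hypothetical stand-in for a real meaning comparison.

    Lowercases the text, drops punctuation, and collapses whitespace,
    so only trivially different phrasings end up in the same group.
    """
    kept = "".join(ch for ch in answer.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def semantic_entropy(answers):
    """Entropy (in nats) of the distribution of answers over meaning groups."""
    groups = Counter(meaning_key(a) for a in answers)
    total = sum(groups.values())
    return sum((n / total) * math.log(total / n) for n in groups.values())


consistent = semantic_entropy(["Paris", "paris.", "Paris!"])
scattered = semantic_entropy(["Paris", "Lyon", "Paris", "Lyon"])
```

Here `consistent` is zero because every sampled answer falls into one group, while `scattered` is higher because the answers split evenly between two different meanings.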
AI models are becoming more advanced and sophisticated, allowing them to handle complex tasks better. NVIDIA CEO Jensen Huang argues that coding might be dead in the water as a career option for the next generation. Instead, he recommends seeking alternative career options in biology, education, manufacturing, or farming. Huang might not be entirely wrong if the coding capabilities of OpenAI's GPT-4o are anything to go by.
2024-07-02T19:01:55Z