OpenAI developed CriticGPT, a model based on GPT-4, to help human trainers validate program code generated by ChatGPT. The model analyzes code and flags potential bugs, making it easier to catch flaws that might otherwise go unnoticed.
In OpenAI's studies, annotators preferred CriticGPT's critiques over human-written ones in 63% of cases involving naturally occurring model errors, largely because CriticGPT produced fewer false positives and fewer unhelpful nitpicks. Human trainers working together with CriticGPT also produced more complete error reports than humans working alone, and these human-AI teams hallucinated fewer problems than the model did when running on its own.
CriticGPT was developed using a large dataset of deliberately introduced errors: experts modified ChatGPT-generated code by inserting bugs and wrote example feedback on them, allowing the model to learn to identify and critique different types of errors.
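The tampering pipeline described above pairs bugged code with an expert's write-up of the inserted bug. A minimal sketch of what one such training record might look like (the field names and example code are hypothetical, for illustration only; the actual data format is not public):

```python
from dataclasses import dataclass

@dataclass
class TamperedExample:
    # Hypothetical record structure, not OpenAI's actual schema.
    original_code: str       # code as ChatGPT originally produced it
    tampered_code: str       # the same code with a bug inserted by an expert
    reference_critique: str  # the expert's description of the inserted bug

example = TamperedExample(
    original_code="def mean(xs):\n    return sum(xs) / len(xs)",
    tampered_code="def mean(xs):\n    return sum(xs) / (len(xs) - 1)",
    reference_critique="The denominator should be len(xs), not len(xs) - 1.",
)
```

A critic model can then be trained to reproduce critiques like `reference_critique` when shown only `tampered_code`, which is what lets it learn to spot and describe bugs it has never seen before.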
Researchers also developed a technique called Force Sampling Beam Search (FSBS) that helps CriticGPT write more detailed bug reports. It exposes a trade-off between the thoroughness of the search for problems and the rate at which non-existent errors are hallucinated, which can be tuned for specific tasks.
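The precision-versus-thoroughness trade-off can be pictured as selecting among candidate critiques with a score that rewards both quality and coverage. A minimal sketch, assuming (hypothetically) that candidates are scored as a reward-model score plus a tunable bonus per flagged issue; the reward numbers and issue lists below are toy data, not CriticGPT's actual scoring:

```python
# Toy reward-model scores: precision-oriented, so the critique that adds a
# spurious "style nit" scores lower (hypothetical numbers for illustration).
REWARD = {
    ("off-by-one in loop bound",): 0.9,
    ("off-by-one in loop bound", "unchecked None return"): 0.8,
    ("off-by-one in loop bound", "unchecked None return", "style nit"): 0.4,
}

def select_critique(candidates, a):
    # Pick the candidate maximizing reward + a * (number of flagged issues).
    # A larger `a` favors longer, more thorough critiques at the cost of a
    # higher risk of hallucinated errors; a smaller `a` favors precision.
    return max(candidates, key=lambda c: REWARD[tuple(c)] + a * len(c))

candidates = list(REWARD)
conservative = select_critique(candidates, a=0.0)  # flags 1 issue
balanced = select_critique(candidates, a=0.3)      # flags 2 issues
thorough = select_critique(candidates, a=1.0)      # flags all 3
```

Sweeping `a` is what lets trainers dial the same model from precise-but-terse toward exhaustive-but-noisier critiques, matching the tunable behavior the article describes.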
Interestingly, CriticGPT's capabilities extend beyond code analysis. In experiments, the model identified errors in 24% of ChatGPT training data that human evaluators had previously rated as flawless; these errors were later confirmed by experts.
Despite these successes, CriticGPT has limitations. It was trained on relatively short ChatGPT responses, which may not prepare it to evaluate longer, more complex tasks. Although CriticGPT reduces false positives, it cannot eliminate them entirely, and human trainers may still make labeling mistakes when relying on them. Finally, the model is most effective at detecting errors localized to a single point in the code, whereas real errors may be spread across multiple parts of a response, a challenge left to future versions of the model.
OpenAI plans to use models like CriticGPT to help trainers evaluate language model output, improving assessment tools and efficiency. Even with AI assistance, however, the most complex tasks may remain difficult for human evaluators.