OpenAI believes its latest GPT-4o model is 'medium' risk
OpenAI has released its GPT-4o System Card, a research document outlining the safety measures and risk evaluations the startup conducted before releasing its latest model. GPT-4o was publicly launched in May of this year. Before its debut, OpenAI used an external red team, a group of security experts who try to find weaknesses in a system, to identify key risks in the model (which is fairly standard practice). They examined risks such as the possibility of GPT-4o cloning someone's voice without authorization, generating erotic and violent content, or reproducing chunks of copyrighted audio. The results are now out.
According to OpenAI's own framework, the researchers found GPT-4o to be of "medium" risk. The overall risk level was taken from the highest risk rating across four general categories: cybersecurity, biological threats, persuasion, and model autonomy. All were deemed low risk except persuasion, where the researchers found that some writing samples from GPT-4o could sway readers' opinions more effectively than human-written text, although the model's samples were not more persuasive overall. OpenAI spokesperson Lindsay McCallum Rémy told The Verge that the system card includes preparedness evaluations conducted by an internal team, along with external testers listed on OpenAI's website as Model Evaluation and Threat Research (METR) and Apollo Research, both of which build evaluations for AI systems.
This is not the first system card OpenAI has released, but the company is releasing this one at a pivotal time. OpenAI's safety standards have drawn constant criticism, from its own employees to state senators. Just minutes before the GPT-4o System Card was released, The Verge reported on an open letter from Sen. Elizabeth Warren (D-MA) and Rep. Lori Trahan (D-MA) calling for answers about how OpenAI handles whistleblowers and safety reviews. The letter outlines several safety concerns that have been raised publicly, including CEO Sam Altman's brief ouster from the company in 2023 over the board's concerns and the departure of a safety executive, who claimed that "safety culture and processes have taken a backseat to shiny products."
In addition, the company is releasing a highly capable multimodal model ahead of the US presidential election. There’s a clear potential risk of the model accidentally spreading misinformation or getting hijacked by malicious actors — even if OpenAI is hoping to highlight that the company is testing real-world scenarios to prevent misuse. There have been plenty of calls for OpenAI to be more transparent, not just about the model’s training data (is it trained on YouTube?), but about its safety testing. In California, where OpenAI and many other leading AI labs are based, state Sen. Scott Wiener is working to pass a bill that would regulate large language models, including restrictions that would hold companies legally accountable if their AI is used in harmful ways. If that bill passes, OpenAI’s frontier models would have to comply with state-mandated risk assessments before being made available for public use. But the biggest takeaway from the GPT-4o System Card is that, despite the group of external red teamers and testers, much of this relies on OpenAI evaluating itself.