one year on
OpenAI launches GPT-4, a multimodal model that scores in top 10 percent of bar exam
The new model can accept images as well as text, outperforming its predecessor on professional and academic benchmarks.
OpenAI today released GPT-4, a multimodal AI model that can accept both text and image inputs and generate text outputs. The company claims the model performs at human level on professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers — a stark contrast to its predecessor GPT-3.5, which scored around the bottom 10%.
GPT-4 is available immediately to ChatGPT Plus subscribers, though with a usage cap, and developers can join a waitlist for API access. Pricing runs $0.03 per 1,000 prompt tokens and $0.06 per 1,000 completion tokens. Microsoft confirmed that its Bing Chat, co-developed with OpenAI, has been running on GPT-4 all along. Other early adopters include Stripe, Duolingo, Morgan Stanley, and Khan Academy.
The model’s ability to understand images is demonstrated through a partnership with Be My Eyes, which will use GPT-4 to power a Virtual Volunteer that can answer questions about pictures — for example, identifying ingredients in a fridge and suggesting recipes. OpenAI also introduced “system messages” to give developers more control over the model’s style and guardrails.
OpenAI says GPT-4 was trained using publicly available data, including public webpages, as well as licensed data. OpenAI acknowledged that GPT-4 still hallucinates facts, makes reasoning errors, and generally lacks knowledge of events after September 2021. The company spent six months aligning the model to improve factuality and refusal of disallowed requests, saying GPT-4 is 82% less likely overall to respond to requests for disallowed content compared with GPT-3.5.
OpenAI says GPT-4 still hallucinates facts and makes reasoning errors, generally lacks knowledge of events after September 2021, and is 82% less likely overall to respond to requests for disallowed content compared with GPT-3.5.
The record
partnered with OpenAI to use GPT-4 for a Virtual Volunteer that describes images for visually impaired users
building an automated tutor powered by GPT-4
One year later — open only if you can handle spoilers
GPT-4 set a new standard that competitors spent the next two years trying to match. The secrecy about model details became a template, not an exception, for frontier AI releases. The bar exam result was cited endlessly in both hype and regulatory debates.