Open AI Launches GPT-4o-mini

pen AI has another surprise up its sleeve with the release of the GPT-4o mini model. It scored an impressive 82% on the MMLU benchmark, outperforming other models in its class. The price is also very competitive at $0.15 for every 1 million token inputs and $0.6 for every 1 million token outputs. This is more than 60% cheaper than GPT-3.5 Turbo. It features a large context window of 128k, making it ideal for RAG. GPT-4o mini supports text and images in the API and will support text, image, video, and audio input and output in the future. GPT-4omini is set to replace 3.5 as the free model in ChatGPT, but currently does not support multimodality, and the API's token count will significantly increase once it involves images.

I. Model Overview
GPT-4o mini excels in text intelligence and multimodal reasoning. It has outperformed GPT-3.5 Turbo and other small models in academic benchmark tests, supports the same language range as GPT-4o, and performs exceptionally well in function calls, making it suitable for building relevant applications, with superior long-context performance compared to GPT-3.5 Turbo.

II. Benchmark Performance
Reasoning Tasks
It stands out in reasoning tasks involving text and vision, scoring 82.0% on the MMLU benchmark, surpassing Gemini Flash's 77.9% and Claude Haiku's 73.8%.
Math and Coding Abilities
It achieves excellent results in mathematical reasoning and coding tasks. It scored 87.0% on MGSM, higher than Gemini Flash's 75.5% and Claude Haiku's 71.7%; on HumanEval, it scored 87.2%, exceeding Gemini Flash's 71.5% and Claude Haiku's 75.9%.
Multimodal Reasoning
It scored 59.4% on the MMMU benchmark, better than Gemini Flash's 56.1% and Claude Haiku's 50.2%.

III. Application Cases and Collaborations
Collaborating with companies like Ramp, it has shown significant superiority over GPT-3.5 Turbo in tasks such as extracting structured data and generating high-quality email responses.

IV. Security Measures
Security Integration During Development
During pre-training, it filters out undesirable information such as hate speech and adult content. After training, techniques like RLHF are used to align the model's behavior with policies, improving accuracy and reliability.
Built-in Security Mitigations
It shares the same security measures as GPT-4o, having been automatically and manually evaluated, with over 70 external experts testing to address potential risks and enhance security. The GPT-4o mini in the API applies a hierarchical approach to instructions, increasing resistance and making responses more reliable and safer to use. Continued monitoring and security improvements are planned.

V. Availability and Pricing
API Availability
It is now available in Assistants API, Chat Completions API, and Batch API as a text and vision model.
Pricing
Developers pay 15 cents for every 1 million input tokens and 60 cents for every 1 million output tokens. Plans for fine-tuning are in the pipeline.
Application Version Usage
Free, Plus, and Team users of ChatGPT can start using it today, replacing GPT-3.5; enterprise users will begin using it next week.

VI. Future Outlook
Over the past few years, AI costs have significantly decreased, with GPT-4o mini's per-token cost dropping by 99% since 2022. The future envisions seamless integration of models into applications and websites, with GPT-4o mini paving the way for developers to build AI applications, making AI more accessible, reliable, and embedded in everyday digital experiences.