GPT-5.5 vs Mythos Preview: AI Safety Hype Exposed

In new tests by the UK's AI Security Institute, GPT-5.5 performed on par with Mythos Preview. Here is the full story.

ISHRAFIL KHAN

AI News

TL;DR — Quick Summary

OpenAI's GPT-5.5 has scored on par with Anthropic's Mythos Preview in the UK AI Security Institute's cybersecurity tests. Mythos had previously been described as highly dangerous.

Model 1: OpenAI GPT-5.5
Model 2: Anthropic Mythos Preview
Testing Body: UK AI Security Institute (AISI)
Test Type: 95 Capture the Flag cybersecurity challenges
Highest Level Tasks: Expert-level
GPT-5.5 Score: 71.4% average on Expert tasks
Mythos Preview Score: 68% average on Expert tasks
Key Finding: GPT-5.5 reached a "similar level of performance" to Mythos Preview

Last month, Anthropic built up considerable hype around its Mythos Preview model. The company claimed that the model posed a major cybersecurity threat, to the point that it limited the initial release to "critical industry partners."

But new research from the UK's AI Security Institute (AISI) now challenges that hype. According to the research, OpenAI's GPT-5.5, which launched publicly last week, matched Mythos Preview's performance on cybersecurity tests.

What is the AISI test?

Since 2023, AISI has been testing frontier AI models on 95 different Capture the Flag challenges. These challenges measure performance on cybersecurity tasks such as reverse engineering, web exploitation, and cryptography.

According to AISI, GPT-5.5 scored an average of 71.4 percent on the top-tier "Expert" level tasks, slightly above Mythos Preview's 68 percent (a gap of 3.4 percentage points). AISI said GPT-5.5 showed "a similar level of performance on our cyber evaluations" to the one Mythos Preview showed last month.

What does this research mean?

This research challenges the hype Anthropic built around Mythos Preview. If GPT-5.5, which is publicly available, poses the same level of threat as Mythos Preview, why was Mythos presented as so special?

Put plainly: Anthropic described its model as highly dangerous, but AISI's tests indicate that OpenAI's model is just as capable. And OpenAI launched its model publicly.

Our Take: Hype vs Reality

In our view, this is an important reminder that AI companies can create hype around their own models. Anthropic described Mythos Preview as so dangerous that its public release was withheld. But independent testing found that another model is just as capable, and that one is publicly available.

This does not mean Mythos Preview is not dangerous. It does mean that we should not blindly accept AI companies' claims. Independent testing and transparency are essential.

It is important for readers to understand that AI safety is a serious issue. But when companies call their models "too dangerous to release," we should ask: is this really a safety concern, or just a marketing strategy?

Sources & References

AISI Research Report — UK AI Security Institute

Written by

ISHRAFIL KHAN

Senior Reporter