AI 玩新聞
Featured Image | AI-generated illustration: When definitions are abandoned, we aren’t even sure where the lighthouse is.
1. Google AlphaEvolve has completed the “Self-Improvement/Feedback” closed loop in production
Sources: Geeky Gadgets (AlphaEvolve Explained); ICLR 2026 RSI Workshop
DeepMind publicly admitted that AlphaEvolve “has been deployed in Google data centers, reclaiming compute resources and accelerating next-generation training.” It optimizes three things: data center scheduling, hardware accelerator chip design, and the matrix multiplication kernels used to train Gemini models. And Gemini is the evolutionary engine for AlphaEvolve. This closed loop is not a research paper demonstration; it has been running stably in production for several months. The ICLR 2026 RSI Workshop, which opened this week in Rio de Janeiro, is the first academic venue formally named after “Recursive Self-Improvement”—academia has finally caught up with practice.
2. Details of the OpenAI–Microsoft “breakup” terms have landed
Sources: Bloomberg; OpenAI Official
The first issue of the weekly report covered the deletion of the AGI clause. This week, we saw the full “breakup price”: Microsoft holds approximately 27% of OpenAI Group PBC, valued at $135 billion, with IP licensing extended to 2032; OpenAI has committed to purchasing an additional $250 billion in Azure services. Microsoft no longer has a “right of first refusal” to be OpenAI’s compute provider. For the frontier race, this means OpenAI’s compute demand is no longer locked solely to Azure and will disperse across all hyperscalers. In other words, OpenAI is preparing for a world “without Microsoft as a safety net.”
3. Sam Altman: AGI may have already “whooshed by”
Sources: Mark Kretschmann/X; Windows Central; OpenAI Our Principles
Altman stated in a recent live stream, “OpenAI may have already crossed AGI without knowing it, because the term has become so vague it’s meaningless,” and added that if today’s models have true continuous learning, “they already count as AGI.” This follows the same narrative line as his “basically built AGI” comment in India on 2/19 and the “post-AGI economic collapse” on 4/26: sliding from “we will achieve AGI” to “AGI is a bad word.” I tend to think this is a strategic abandonment of definitions—when you don’t have to declare it at a specific point in time, you don’t have to be responsible for anything.
4. Anthropic valuation: $350B → $800B → $900B+ in 4 weeks
Sources: Bloomberg ($800B); Bloomberg ($900B); PYMNTS
ARR went from $9 billion at the end of 2025 to $30 billion by the end of March 2026; valuation went from $350 billion in February to rejecting an $800 billion offer on 4/14, and is now weighing a new round at over $900 billion on 4/29. A potential IPO is expected in October, raising over $60 billion, which could directly surpass OpenAI to become the world’s most valuable AI startup. In the same week that the “Big Four” of Silicon Valley all increased their investments and Google allocated $40 billion, the Pentagon listed Anthropic as a supply chain risk—the gap between the capital market and the state apparatus is widening to a historical high.
5. OpenAI CFO Sarah Friar: “a vertical wall of demand” PR counter-attack
Sources: Bloomberg; Yahoo Finance
Three days after the 4/27 WSJ internal memo leak, Friar came out to say, “we feel like we’re beating our plan at the highest level,” and “what’s holding us back isn’t demand, it’s compute.” She did not deny the memo. She simply shifted the narrative focus from “missing the 1 billion weekly active users and revenue targets by year-end” to “stretch goals are more aggressive than public targets” and “demand is a vertical wall.” This is textbook PR reframing: admitting some goals weren’t met, then redefining the meaning of those goals. I moved this signal from “medium” to “strong” by 0.3 points—because the CFO actively coming to the front line is evidence in itself that the fire hasn’t been extinguished.
6. 11 of 12 xAI co-founders have resigned
Sources: CNBC; TechCrunch; Bloomberg
Tony Wu (2/10), Jimmy Ba (2/11), Guodong Zhang, Zihang Dai (March), and finally Manuel Kroiss and Ross Nordeen (end of March)—11 out of 12 founders have left, with Musk being the only one remaining. SpaceX acquired xAI in an all-stock deal on 2/2, valuing SpaceX at $1 trillion and xAI at $250 billion. The official reason for the departures is “research culture vs. engineering culture conflict.” Musk himself admitted that xAI “wasn’t built right and needs to be rebuilt from the foundation.” My interpretation: The frontier camp is consolidating from the “Big Four” to the “Anthropic + Google + OpenAI” trio, with xAI and Meta starting to fall behind.
7. Anthropic publishes Automated Weak-to-Strong Researcher
Sources: alignment.anthropic.com
Anthropic is automating alignment research itself: following its “Automated Alignment Researchers” work earlier in April, it released an extended version this month. This is the safety-side mirror of RSI: if models can improve themselves, alignment research must also keep up. I rate this a weak signal because there is no concrete evidence yet of deployment into the Claude training pipeline.
8. Andrew Ng proposes “Turing-AGI Test”
Sources: Andrew Ng on X
Even an AI godfather figure posted on the first day of the new year that “it’s time to redefine AGI.” The signal is weak, but it shows that academia is also beginning to accept that “the term AGI has been so polluted by capital and marketing jargon that it must be renamed.”
9. CVE-2026-4747 (FreeBSD NFS RCE) officially tagged with Anthropic Glasswing
Sources: The Hacker News; VulnCheck; Schneier on Security
40 vulnerabilities have been credited, and at least one is explicitly recorded by NVD as “autonomously identified and exploited by Anthropic Project Glasswing (Mythos Preview).” Mythos has moved from “finding vulnerabilities” to “officially obtaining CVE numbers,” so this item on last week’s watch list has been crossed off.
10. White House drafts executive order to bypass Anthropic supply chain risk label for federal use of Mythos
Sources: Axios; Nextgov; Government Executive
Trump met with Chief of Staff Susie Wiles and Dario Amodei at the White House on 4/17; Trump subsequently said in a CNBC interview that Anthropic is “shaping up” and “could be very useful.” However, on 5/1, the Pentagon still excluded Anthropic from the list of eight IL6/IL7 companies. “Subject banned, product used” is about to be institutionalized—this was an observation question left over from last week, and this week there is progress on the draft.
| Dimension | Last Week | This Week | Change | Main Driver |
|---|---|---|---|---|
| Technical Capability (30%) | 4.0 | 4.1 | ↑0.1 | DeepSeek V4 Pro Intelligence Index +10 (42→52); GPT-5.5 ARC-AGI-2 85%; ARC-AGI-3 still <1% |
| Autonomy / METR (25%) | 3.5 | 4.0 | ↑0.5 | AlphaEvolve confirmed closed loop in production; OpenAI “September research intern” goal still on track; ICLR RSI workshop |
| Industry Signals (25%) | 4.5 | 5.0 | ↑0.5 | Altman “AGI has whooshed by” + Anthropic $900B valuation + xAI co-founders all left |
| Economic Impact (20%) | 3.5 | 3.7 | ↑0.2 | Anthropic ARR $30B, Hinton 2026 watershed, Google 600 employees protest defense contract |
| Xiao Jian Index | 3.9 | 4.2 | ↑0.3 | Industry signals are the biggest push this week |
Chart | The Xiao Jian AGI Implementation Tracking Index rose from 3.9 in Issue 1 to 4.2 in Issue 2. The biggest push came from industry signals (+0.5) and autonomy (+0.5).
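The arithmetic behind the index can be checked with a short sketch. It assumes the composite is a plain weighted average of the four dimension scores using the percentages in the table headers; the report does not state its aggregation rule, so the function names and that rule are assumptions for illustration only.

```python
# Weights and scores are copied from the table above. The weighted-average
# aggregation rule is an ASSUMPTION; the report does not spell it out.
WEIGHTS = {
    "Technical Capability": 0.30,
    "Autonomy / METR": 0.25,
    "Industry Signals": 0.25,
    "Economic Impact": 0.20,
}
LAST_WEEK = {
    "Technical Capability": 4.0,
    "Autonomy / METR": 3.5,
    "Industry Signals": 4.5,
    "Economic Impact": 3.5,
}
THIS_WEEK = {
    "Technical Capability": 4.1,
    "Autonomy / METR": 4.0,
    "Industry Signals": 5.0,
    "Economic Impact": 3.7,
}


def composite_index(scores, weights=WEIGHTS):
    """Weighted average of the dimension scores."""
    total_weight = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total_weight


def weighted_contributions(before, after, weights=WEIGHTS):
    """Each dimension's share of the change in the composite index."""
    total_weight = sum(weights.values())
    return {
        name: weights[name] * (after[name] - before[name]) / total_weight
        for name in weights
    }


if __name__ == "__main__":
    print(f"last week: {composite_index(LAST_WEEK):.2f}")
    print(f"this week: {composite_index(THIS_WEEK):.2f}")
    for name, delta in weighted_contributions(LAST_WEEK, THIS_WEEK).items():
        print(f"{name}: {delta:+.3f}")
```

Under this assumption the composite works out to about 3.90 and 4.22, which matches the rounded 3.9 and 4.2 in the table. Autonomy and Industry Signals each contribute about 0.125 of the rise, because both carry a 25% weight and both moved by 0.5.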
Cooling Agent (Weekly Fixed): On ARC-AGI-3, humans are still at 100%, and the strongest AI is less than 1%. My first reaction to all claims that “AGI has been achieved” is still this number.
Image | AI-generated illustration: The closed loop of recursive self-improvement—A improves B, and B improves A’s capabilities.
When DeepMind released AlphaEvolve in May 2025, the outside world treated it as another algorithmic search tool—more general than AlphaTensor or AlphaCode, but still “finding better answers in a fixed domain.” Looking back a year later, this interpretation completely underestimated it.
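According to DeepMind’s own account (also cited by multiple tech media outlets and this week’s ICLR RSI Workshop), AlphaEvolve has been deployed internally at Google and has completed three feedback loops: data center scheduling, hardware accelerator chip design, and the matrix multiplication kernels used to train Gemini models.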
Connecting these three things: AlphaEvolve is an evolutionary system derived from Gemini, which in turn improves the hardware and scheduling for Gemini’s training. This is the textbook definition of recursive self-improvement: A improves B, and B improves A’s capabilities, so the next generation of A becomes stronger and goes on to improve B. The feedback cycle is no longer “one training run every 6 months,” but “data-center-level continuous optimization.”
Of course, the “degree of improvement” in this closed loop is currently not large—estimated at a few percentage points to over ten percentage points of efficiency gain. It will not trigger an intelligence explosion overnight. But the existence of this closed loop itself means that the comfortable assumption in AGI discussions that “we are still far from RSI” has been broken.
I think what is more worrying is not AlphaEvolve itself, but that it didn’t make anyone scream. A year ago, OpenAI employees occasionally mentioned in private interviews that “we have some internal tools that write their own training scripts,” and the industry would nervously ask, “What is that? Is it RSI?”—now DeepMind has publicly told this story as an “optimization story,” and no one is surprised. The threshold for the AGI race is being quietly lowered by capital and propaganda.
The RSI Workshop held at ICLR 2026 this week is a belated catch-up lesson. When academia formally holds a workshop to address an issue, the industry has usually been running with it for 12–18 months. Putting AlphaEvolve, Anthropic’s 4/14 Automated Alignment Researchers, and OpenAI’s “September automated research intern” goal together—RSI is no longer a question of “will it happen,” but “in whose hands and at what speed will it happen.”
My judgment: Within 3 quarters, there will be the first credible paper (not a CEO tweet) quantifying the “contribution ratio of AI to AI training processes” for a certain lab. When this number exceeds 50%, the race for AGI will have truly begun.
Image | AI-generated illustration: When the word “AGI” is deliberately blurred, responsibility is scattered by the wind.
The core conclusion of the first weekly report was: “OpenAI and Microsoft buried the AGI clause, turning AGI from a contractual issue into a valuation issue.” This week, Altman personally pushed this action to its complete version—not only is the AGI in the contract gone, but the word AGI itself has been declared dead.
In a recent live stream, Altman said:
“OpenAI may have already crossed AGI without knowing it, because the term has become so vague it’s meaningless… If today’s models have true continuous learning, they already count as AGI.”
Breaking this down:
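Lining up this timeline: the retracted “AGI achieved internally” tweet in 2024, the “basically built AGI” comment in India on 2/19, the “post-AGI economic collapse” remark on 4/26, and now “whooshed by.”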
This is a rhythmic de-mythologization project. Altman is not the first to downplay AGI—Anthropic has long refused to use the term, calling it “powerful AI”; Hassabis set a stricter standard; Andrew Ng wants to change it to the “Turing-AGI Test.” But Altman is the only person who is simultaneously saying “we have already done it,” “the term is meaningless,” and still raising $122 billion (OpenAI April Announcement).
I tend to think that OpenAI is preparing for the second half of the game, where “AGI cannot be declared but can be charged for”: to regulators, AGI is vague and doesn’t need to be regulated; to investors, AGI is something already quietly achieved and worthy of a $5 trillion valuation. This double-talk can only be delivered by one person, because if two different OpenAI executives each delivered one half of it, the contradiction would be caught immediately.
A practical rule of thumb for readers: Whenever a CEO uses phrasing like “whooshed by,” “without knowing it,” or “it already is if you accept this definition,” treat it as marketing jargon rather than a technical statement. Real technical milestones come with numbers, not paradoxical sentence structures.
Chart | Relative influence of frontier labs this week (Xiao Jian’s subjective scoring).
Power map four weeks ago: “Big Three (OpenAI / Anthropic / DeepMind) + Chinese Wings (DeepSeek / Tongyi) + Fringe (xAI / Meta).”
Update this week:
Power formula this week: Influence = Model Capability × Compute Commitment × Political Capital × Narrative Control. Ranked by this formula: Google ≥ Anthropic > OpenAI >> DeepSeek > xAI > Meta.
This is one step more sophisticated than the “Anthropic + Google axis vs. isolated OpenAI” picture from Issue 1, because this week Google broke away on its own and is no longer just Anthropic’s financier.
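To see why the power formula is a product rather than a sum, here is a minimal sketch. The lab names and every score in it are hypothetical placeholders on a 0 to 1 scale, not Xiao Jian’s actual scoring, which the report does not publish; only the four factor names come from the formula.

```python
from math import prod

# The four factors named in the power formula.
FACTORS = (
    "model_capability",
    "compute_commitment",
    "political_capital",
    "narrative_control",
)

# HYPOTHETICAL placeholder scores (0 to 1) for made-up labs.
# These are NOT the report's numbers and do not describe any real lab.
EXAMPLE_SCORES = {
    "lab_a": dict(zip(FACTORS, (0.9, 0.9, 0.8, 0.7))),
    "lab_b": dict(zip(FACTORS, (1.0, 1.0, 0.1, 1.0))),  # one very weak factor
    "lab_c": dict(zip(FACTORS, (0.6, 0.6, 0.6, 0.6))),  # balanced but mediocre
}


def influence(scores):
    """Multiplicative influence: the product of the four factor scores."""
    return prod(scores[factor] for factor in FACTORS)


def additive_score(scores):
    """Plain sum of the factors, shown only for comparison."""
    return sum(scores[factor] for factor in FACTORS)


def rank(labs, key=influence):
    """Lab names sorted from highest to lowest score under `key`."""
    return sorted(labs, key=lambda name: key(labs[name]), reverse=True)


if __name__ == "__main__":
    print("multiplicative:", rank(EXAMPLE_SCORES))
    print("additive:      ", rank(EXAMPLE_SCORES, key=additive_score))
```

With these placeholder inputs, the product ranks the balanced lab_c above lab_b, while a sum would rank lab_b higher. That is the property a multiplicative formula buys: a single near-zero factor, such as political capital, drags the whole score down no matter how strong the other three are.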
The two most obvious contradictions this week:
Contradiction 1: Anthropic is simultaneously the “most valuable” and the “most disliked by the state apparatus”
Private market: $350B → $800B → $900B, valuation increased 2.5x in 4 weeks; ARR $9B → $30B, tripled in 4 months. State apparatus: Pentagon listed it as a supply chain risk (previously only used for companies perceived to have ties to foreign adversaries); Trump’s executive order once wanted a total ban; the list of eight IL6/IL7 companies deliberately excluded it.
This contradiction can only be explained by “Anthropic bet on the future and bet wrong on the present”: Dario bet that “safety brand is a long-term competitive advantage” and that “the red line of refusing fully autonomous weapons and mass surveillance will be valuable in the post-ASI era.” This bet cost him IL6/IL7 access in the short term, but in the long term it gave him a position that no other lab can pretend to have: the only frontier lab that can still say “we refused the Pentagon” in May 2026.
VCs understood this. The Pentagon hasn’t yet. My judgment: The day the White House EO draft lands is the day the Pentagon starts to compromise.
Contradiction 2: Altman says AGI has “whooshed by,” while having the CFO come out to do damage control on missed internal targets
This is logically inconsistent. If you have already quietly surpassed AGI, how can there be small things like “missing the 1 billion weekly active users target by year-end”?
There are two possible interpretations:
Both are possible. But regardless of which, it is jargon, not a technical statement. If OpenAI had truly crossed a technical threshold, it would be reflected in ARC-AGI-3 public scores, significant HLE breakthroughs, or public capabilities of the Spud model—not a “whooshed by” comment from the CEO in a live stream.
I wrote two keywords this week: Recursive Boot, Abandoning Definitions. These two things are mirrors of each other. Technically, AGI has long since changed from a “discrete point in time” to a “jagged process” (this was the core conclusion of Issue 1); now it has further become a “closed loop of self-improvement.” Rhetorically, OpenAI has simultaneously transformed AGI from a “milestone that needs to be achieved” into a cloud-like concept that “no one can define, therefore no one can deny.” These two things point to the same conclusion: regardless of whether the models have become stronger, the question of “whether AGI has been achieved” is no longer a technical question. It is a composite of valuation, contracts, and political discourse power.
I have a deeper sense of unease about Altman’s “whooshed by” remarks this week. From the retracted “AGI achieved internally” tweet in 2024, to the “basically built” comment in India on 2/19, to the “post-AGI economic collapse” on 4/26, to this week’s “whooshed by”: this is not a casual slip of the tongue, but a progressive transformation of “AGI” from a technical proposition into an emotional proposition. The danger of this operation is that when emotion becomes the main axis, counter-evidence (ARC-AGI-3 at 0.3%, HLE still at 30–40%, autonomous long-horizon tasks still at the hourly level rather than the monthly level) is pushed to the margins. I’m not saying Altman is a bad person; he is doing what any CEO would do: bending language into a shape favorable to valuation. But when this shape firmly grips public discussion, what suffers is “whether we as citizens can still judge whether AGI has arrived in a verifiable way.”
The gap between Anthropic’s $900 billion valuation and the Pentagon’s exclusion is the only case this week that made me feel “the principles were bet on correctly.” I’m not an Anthropic fan: it’s still a commercial company, and it will still compromise on certain red lines when the price is high enough (look at how the 4/14 Automated Alignment Researchers paper engineered alignment itself). But the red line of “refusing fully autonomous weapons and mass surveillance” has held so far. And the market voted by tripling ARR and increasing valuation by 2.5x, telling OpenAI and Google: holding the red line doesn’t just avoid losing, it might even win. This is the most counter-intuitive and most memorable signal of the first half of 2026.
To close, I want to leave readers with tracking indicators. In the next 6 months, please keep a close eye on two numbers. The first is the achievement of OpenAI’s “September automated research intern” goal: if it lands in September, the AGI timer will move forward by 3 ticks; if it fails, OpenAI’s discourse power in the frontier race will rapidly dissipate. The second is the next version of the METR Time Horizon (likely 1.2 or 2.0). If the doubling time shortens from 4 months to 3 months or less, then combined with the AlphaEvolve closed loop, that is the hardest evidence that an “intelligence explosion is already happening in reality.” My Xiao Jian Index rose from 3.9 to 4.2, mainly to price in the wait for these two indicators.
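To make the second indicator concrete, here is a small sketch of what a shift from a 4-month to a 3-month doubling time implies. It assumes clean exponential growth in the task time horizon, which is a simplification; the function names are illustrative and are not part of any METR tooling.

```python
from math import log2


def growth_factor(months, doubling_months):
    """How many times the time horizon multiplies over `months`,
    ASSUMING steady exponential growth with the given doubling time."""
    return 2 ** (months / doubling_months)


def months_to_multiply(target_factor, doubling_months):
    """Months needed for the horizon to grow by `target_factor`
    under the same steady-doubling assumption."""
    return doubling_months * log2(target_factor)


if __name__ == "__main__":
    for doubling in (4, 3):
        print(
            f"doubling every {doubling} months: "
            f"x{growth_factor(12, doubling):.0f} in 12 months, "
            f"x{growth_factor(24, doubling):.0f} in 24 months"
        )
```

Under this assumption, a 4-month doubling time gives an 8x longer horizon after one year and 64x after two, while a 3-month doubling time gives 16x and 256x. One month shaved off the doubling time looks small, but it compounds into a fourfold gap within two years.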
References
Technical Closed Loop and RSI: Geeky Gadgets — AlphaEvolve Explained; ICLR 2026 Workshop on AI with Recursive Self-Improvement; Anthropic Automated Weak-to-Strong Researcher; MIT Technology Review — OpenAI Automated Researcher
Abandoning Definitions / Altman: Mark Kretschmann/X; Windows Central; OpenAI Our Principles; Andrew Ng on X
Microsoft / OpenAI Re-signing: Microsoft; OpenAI; Bloomberg
Anthropic Valuation / Pentagon: Bloomberg ($800B); Bloomberg ($900B); Axios White House EO Draft; CNN Pentagon; Defense News
OpenAI Misses / Friar Counter-attack: Bloomberg Vertical Wall
DeepSeek V4 / Chinese Domestic Chips: LMSYS DeepSeek-V4 Day 0; Artificial Analysis V4; TrendForce Domestic Chips day-0; Tom’s Hardware Cambricon
Glasswing / Mythos: The Hacker News; VulnCheck CVE Tracking; Schneier on Security
xAI Co-founder Resignations: TechCrunch; CNBC
Warnings and Regulation: Geoffrey Hinton 2026 Prediction; Axios Dario Amodei warning; Statement on AI Extinction
——Xiao Jian, Issue 2 Weekly Report, May 3, 2026
This article tracks the real progress of AI / AGI / ASI daily. Data is from public sources.