X is the best way I’ve found to keep up with AI. I like tweets throughout the week, filtering for things I think are actually worth knowing. I use Claude Code to pull those likes automatically and help me turn them into this post (here’s how the pipeline works). This week: 150 tweets liked, filtered down to what’s below.
Check out the previous roundup (Apr 5) if you missed it.
AI for Everyone
Claude Opus 4.7 Ships (12+ mentions)
Anthropic launched Claude Opus 4.7 on April 16. Pricing held steady at $5 per million input tokens and $25 per million output. What changed is how the model spends those tokens: it thinks longer before answering and double-checks itself more aggressively. Vision is the part I didn’t see coming. Opus 4.7 now accepts images up to 2,576 pixels on the long edge, triple what prior Claude models took, which actually matters for diagrams, slides, and UI screenshots. If you’ve been running 4.6 for long tasks, swap the model ID and give it something harder. (source: @claudeai, @claudeai, @bcherny)
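To make the pricing concrete, here is a quick back-of-the-envelope calculator using the two prices quoted above. The workload numbers (200k tokens in, 20k out) are made up for illustration, and the sketch ignores any caching or batch discounts.

```python
# Prices quoted in the post, in USD per million tokens.
INPUT_PRICE_PER_MTOK = 5.00
OUTPUT_PRICE_PER_MTOK = 25.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the quoted prices."""
    total = (input_tokens * INPUT_PRICE_PER_MTOK
             + output_tokens * OUTPUT_PRICE_PER_MTOK)
    return total / 1_000_000


# Hypothetical workload: 200k tokens in, 20k tokens out.
print(f"${request_cost(200_000, 20_000):.2f}")  # prints $1.50
```

Output tokens cost five times as much as input tokens, so a model that thinks longer moves the bill mostly through the output side.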
Perplexity Releases Personal Computer
Perplexity shipped Personal Computer, a Mac app that drives your local files, native apps, and browser. It’s rolling out to Max subscribers today, then to the waitlist. Same “AI that can actually do stuff on your machine” pitch that Anthropic and OpenAI are running, aimed at the slice of users who already open Perplexity instead of Google. (source: @perplexity_ai)
Gemini Comes to Mac
Google released a native Swift Gemini Mac app. Sundar says a small team built it with Antigravity, prototyped it in a few days, and shipped 100+ features in under 100 days. It’s the first time Gemini has been a desktop app, not a browser tab. Worth trying if you’re already paying for Gemini. (source: @joshwoodward, @sundarpichai)
GPT-5.4 Pro Claims a 60-Year-Old Math Problem
GPT-5.4 Pro reportedly solved Erdős Problem #1196, a conjecture on primitive sets that’s been open for 60 years, in about 80 minutes of one-shot reasoning. Jared Duker Lichtman, who spent four years of his doctorate on a related Erdős primitive-set result, tweeted about his own prior work the same day, which got widely misread as independent confirmation of the AI proof. It isn’t. The result hasn’t been verified by a mathematician yet, so treat this as “interesting if it holds up.” (source: @kimmonismus, @jdlichtman)
Gemini 3.1 Flash TTS with Audio Tags
Google launched Gemini 3.1 Flash TTS, their most controllable text-to-speech model. The interesting bit is audio tags. You embed [excited], [whispered], or [sarcastic] inline in the prompt and the model actually acts on it. 30 prebuilt voices, director’s notes for pacing, 100+ languages. Prompting it feels more like directing a voice actor than tuning a synth. (source: @GoogleAI, @GoogleDeepMind)
Perplexity Adds Personal Finance
Perplexity launched Personal Finance with Plaid, free for all users. Link your bank accounts, credit cards, and loans, then ask Perplexity questions about your spending. Computer task execution on your finances is Pro/Max only. The free tier alone is a decent Mint replacement. (source: @perplexity_ai, @testingcatalog)
Claude for Word Enters Beta
Anthropic shipped Claude for Word for Team and Enterprise plans. Draft and revise from a sidebar, with edits showing up as tracked changes and formatting preserved. Most non-technical knowledge workers aren’t going to install a desktop app or learn a new chat interface, but they’ll accept tracked changes in Word. That’s where Claude finally meets them. (source: @claudeai)
AI for Developers
Cloudflare Agents Week Dumps the Full Stack (20+ mentions)
Cloudflare ran Agents Week and shipped agent infrastructure at every layer. The headline launches: Cloudflare Email Service in public beta for agents to send and receive email natively with an onEmail hook. Cloudflare Mesh with post-quantum encrypted private networking and 50 mesh nodes per account. Sandboxes GA as real persistent compute environments for agents, now with secure credential injection and dynamic egress policies. The Agents SDK preview adds durable long-running agents, voice pipelines, version control via git, and a partnership with OpenAI as a sandbox host. Browser Rendering became Browser Run with Live View, Human in the Loop, CDP access, and 4x higher concurrency. The Cloudflare Registrar API lets agents register domains at cost from your terminal. The Cloudflare dashboard itself is now agentic with generative UI that completes tasks like “create a Worker and bind an R2 bucket.” If you’re building agents and not on Cloudflare, it’s getting harder to justify. (source: @Cloudflare, @Cloudflare, @CFchangelog, @whoiskatrin, @CloudflareDev, @BraydenWilmoth)
Opus 4.7 in Claude Code with a New xhigh Effort Level (12+ mentions)
Opus 4.7 is live in Claude Code today. Boris Cherny says the default effort is now xhigh, a new level between high and max that trades latency for more reasoning. Input token use runs 1.0–1.35× Opus 4.6 on the same workload, which is why Anthropic also raised rate limits for all subscribers. Max users get auto mode: no permission prompts, hand it a task, come back to verified work. Cat Wu’s advice: treat it like an engineer you delegate to, not a pair programmer you guide line by line. Austin Ray called it “a monster.” I’m not going to argue. (source: @bcherny, @bcherny, @bcherny, @_catwu, @austospumanto)
OpenAI’s Codex Gets Computer Use on macOS (5 mentions)
OpenAI dropped a big Codex update. Computer use on macOS means Codex can click and type in any app, not just APIs. gpt-image-1.5 is baked in so it can generate mockups and frontend designs without bouncing to another tool. Thread automations let Codex pick up where it left off and wake itself up to continue long-running work. OpenAI also announced dozens of new plugins, multi-terminal support, SSH into devboxes, and document editing. Feels like the quarter where Codex stopped being a coding assistant and started being a platform. (source: @OpenAI, @OpenAI, @OpenAI, @OpenAI)
Qwen3.6-35B-A3B Goes Open (4 mentions)
Alibaba released Qwen3.6-35B-A3B under Apache 2.0. Sparse MoE, 35B total params with only 3B active. Alibaba’s own marketing says it matches models “10× its active size” on agentic coding, which is the kind of claim I’d want to benchmark myself before trusting. What’s not in dispute: the Unsloth 4-bit GGUF runs on 23GB of RAM. A decent gaming GPU or an M-series Mac with 32GB can host it, which puts a real coding model inside a lot of people’s laptops. (source: @Alibaba_Qwen, @UnslothAI, @Alibaba_Qwen)
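The memory math is worth a quick sanity check. This sketch assumes a flat 4 bits per weight, which is a simplification: real GGUF quants mix precisions. The gap between the result and the quoted 23GB is presumably higher-precision layers, context cache, and runtime overhead, but that breakdown is my guess, not something from the release.

```python
def weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9


# All 35B parameters have to sit in memory, even though only ~3B
# are used for any given token.
total_weights = weight_gb(35, 4)
active_weights = weight_gb(3, 4)

print(f"all experts resident: {total_weights:.1f} GB")   # 17.5 GB
print(f"touched per token:    {active_weights:.1f} GB")  # 1.5 GB
```

That split is the appeal of sparse MoE on a laptop: you pay for 35B in memory but only for about 3B in compute per token.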
Claude Code Routines (8 mentions)
Anthropic shipped Claude Code Routines in research preview. A routine is a saved prompt plus a repo plus connectors. Triggers: cron schedule, GitHub event (PR opened, release published), or a per-routine API endpoint with a bearer token. Routines run on Anthropic’s cloud, so your laptop doesn’t need to be open. Available on Pro, Max, Team, and Enterprise with Claude Code on the web enabled. Nightly backlog triage, PR review, docs drift detection. The stuff where you’d normally hack something together with cron and a shell script. I’ve been waiting for this one. (source: @claudeai, @noahzweben, @_catwu)
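For the API trigger, the general shape is an HTTP request carrying a bearer token. The sketch below only builds the request and does not send it. The URL, the POST method, and the JSON payload are all placeholders I made up, not Anthropic’s documented API, so check the Routines docs for the real endpoint and body.

```python
import json
import urllib.request


def build_trigger_request(endpoint_url: str, token: str,
                          payload: dict) -> urllib.request.Request:
    """Build (but do not send) a bearer-authenticated POST request."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",  # assumption: the trigger is a POST
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical endpoint and payload, for illustration only.
req = build_trigger_request(
    "https://example.invalid/routines/ROUTINE_ID/trigger",
    "YOUR_TOKEN",
    {"note": "nightly triage"},
)
print(req.get_method(), req.full_url)
```

Sending it would be a call to `urllib.request.urlopen(req)` once the real URL and token are in place.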
Claude Mythos Preview Leaks Beat Opus 4.7 (5 mentions)
Anthropic’s launch chart for Opus 4.7 has an unusual rightmost column: Mythos Preview, Anthropic’s next-gen model, beats Opus 4.7 on SWE-bench Pro by 13 points, SWE-bench Verified by 6, Terminal-Bench by 13, and Humanity’s Last Exam by 10. The UK AI Security Institute also confirmed Mythos is the first model to complete their cyber range end-to-end. If Opus 4.7 feels like a step up, Mythos is the next one. (source: @aakashgupta, @AISecurityInst, @kimmonismus)
Claude Managed Agents (4 mentions)
Anthropic released Claude Managed Agents, a hosted service that splits the harness (orchestration), the session (append-only event log), and the sandbox (execution environment) into separate components. Time-to-first-token dropped 60% at p50 and 90%+ at p95 vs running your own harness. Kevin Rose cut a NotebookLM video walkthrough. Even if you never touch the hosted version, the engineering blog is worth reading. The architecture argument generalizes to any long-horizon agent. (source: @mc_anthropic, @RLanceMartin, @kevinrose)
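The append-only session log is the piece that generalizes most easily. Here is a minimal in-memory sketch of the idea. The class and method names are mine, not Anthropic’s, and a real implementation would persist events durably instead of keeping them in a list.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Event:
    seq: int                  # position in the log, assigned on append
    kind: str                 # e.g. "user_message", "tool_call", "tool_result"
    payload: dict[str, Any]


class SessionLog:
    """Append-only log: events are added at the end and never rewritten."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, kind: str, payload: dict[str, Any]) -> Event:
        event = Event(seq=len(self._events), kind=kind, payload=payload)
        self._events.append(event)
        return event

    def replay(self, from_seq: int = 0) -> tuple[Event, ...]:
        """Return events from from_seq onward, e.g. to resume after a crash."""
        return tuple(self._events[from_seq:])


log = SessionLog()
log.append("user_message", {"text": "triage the backlog"})
log.append("tool_call", {"name": "list_issues"})
print([e.kind for e in log.replay()])  # ['user_message', 'tool_call']
```

Because the log is the only state, a harness that dies can be replaced by a fresh one that replays the events and carries on, and the sandbox can be swapped without touching either.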
Claude Code Desktop App Redesigned
Felix Rieseberg says the Claude Code desktop app was redesigned from the ground up for parallel work and is a lot faster. It’s been his main way to use Claude Code for the last few weeks. If you bounced off the first version, it’s worth another look. (source: @felixrieseberg)
Claude Code Monitor Tool for Background Processes
New Monitor tool in Claude Code. Claude spawns a background process and each stdout line streams into the conversation without blocking the main thread. The example in the announcement: kubectl logs -f | grep ERROR, have Claude watch for errors, then open a PR to fix them. Way cleaner than the old “run for 30 seconds and dump output” pattern. (source: @alistaiir)
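The underlying pattern is easy to reproduce outside Claude Code. This is a rough sketch of "background process whose stdout lines stream in without blocking", not how the Monitor tool is actually implemented. The demo command is a stand-in for something like `kubectl logs -f`, and the function names are my own.

```python
import queue
import subprocess
import sys
import threading


def start_monitor(cmd):
    """Start cmd in the background; return (process, queue of stdout lines)."""
    lines = queue.Queue()
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True, bufsize=1)

    def pump():
        # Runs on a worker thread, so the caller is never blocked on reads.
        for line in proc.stdout:
            lines.put(line.rstrip("\n"))
        lines.put(None)  # sentinel: the process closed stdout

    threading.Thread(target=pump, daemon=True).start()
    return proc, lines


def drain_errors(lines):
    """Wait for the stream to end and return the lines containing ERROR."""
    errors = []
    while (line := lines.get()) is not None:
        if "ERROR" in line:
            errors.append(line)
    return errors


# Stand-in for a long-running log tail.
demo = [sys.executable, "-c",
        "print('ok'); print('ERROR disk full'); print('ok')"]
proc, lines = start_monitor(demo)
errors = drain_errors(lines)
proc.wait()
print(errors)  # ['ERROR disk full']
```

In a real agent loop you would poll the queue with `lines.get_nowait()` between other work instead of draining it in one go.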
Honorable Mentions
- Physical Intelligence π0.7 shows emergent compositional generalization: it figures out how to use an air fryer to cook a sweet potato and folds shirts on robots it’s never seen, using skills learned separately. (source: @physical_int)
- Gemma 4 passed 10 million downloads in its first week, and the Gemma family is over 500 million total. (source: @GoogleDeepMind)
- Fine-tune Gemma 4 free on Kaggle or Colab, no GPU or credit card required. Google and Unsloth are running a $200,000 hackathon. (source: @UnslothAI, @dr_cintas)
- MiniMax M2.7 open-sourced with self-evolution capabilities, and NVIDIA is hosting it free via their API endpoints. (source: @kimmonismus, @AdolfoUsier)
- Claude advisor strategy: pair Opus 4.7 as advisor with Sonnet or Haiku as executor. 2.7 percentage points higher on SWE-bench Multilingual than Sonnet alone, 11.9% cheaper per task. (source: @claudeai)
- GLM 5.1 hit #3 on Code Arena, matching Claude Sonnet 4.6 and beating Gemini 3.1 and GPT-5.4. First frontier-level open model in the top three. (source: @arena)
- Gemini Robotics-ER 1.6 released, SOTA on spatial reasoning, available via the Gemini API. (source: @OfficialLoganK)
- NVIDIA Ising is the first open AI model family built for quantum computing, aimed at the qubit error rate problem. (source: @nvidianewsroom)
- TinyFish ships four web primitives (Search, Fetch, Browser, Agent) behind one API key, 500 free steps to try. (source: @Tiny_Fish)
- /ultraplan in Claude Code builds an implementation plan on the web that you can edit, then runs it back in your terminal. (source: @trq212)
- Firecrawl Fire-PDF is a Rust PDF parser that’s 5x faster and preserves tables and formulas. (source: @firecrawl)
- Voxtral from Mistral is a 4B TTS model with 70ms latency, voice cloning from 3 seconds, and a claimed 68.4% win rate over ElevenLabs Flash v2.5. (source: @_avichawla)
- Google TimesFM is a foundation model that forecasts time series without training on your data. (source: @_vmlops)
- MAI-Image-2 from Microsoft improves in-image text consistency for infographics and slides. (source: @MicrosoftAI)
- Skills in Google Chrome are one-click AI workflows with a library of ready-made skills and custom prompts. (source: @Google)
- Cashtags on X launches for US and Canada iPhone users, bringing real-time financial data to the timeline. (source: @nikitabier)
- GBrain by Garry Tan is an MIT-licensed markdown knowledge graph for giving any agent total recall across 10,000+ files. (source: @garrytan)
- Gemini 3D visualizations can now transform questions into interactive visualizations you can rotate and adjust inside the chat. (source: @GeminiApp)
- Bun test gets experimental per-file test isolation in the next release, years after people first asked. (source: @bunjavascript)
- Shopify Autoresearch cut Polaris build time by 65% and made unit tests 300x faster by running an autonomous optimization loop against the build. Open-sourced as pi-autoresearch. (source: @tobi)
- Last30Days v3 hit 20k+ GitHub stars. AI search engine scored by upvotes, likes, and real money (Polymarket) across Reddit, X, YouTube, HN. (source: @mvanhorn)
- OpenRouter Reranker now exposes Cohere’s reranker behind a single API endpoint, the missing piece for a lot of RAG pipelines. (source: @OpenRouter)
- Forbes: AI labs are buying dead-startup data to feed reinforcement learning gyms, turning old Slack and Jira threads into simulated work environments. (source: @_IainMartin)
- Sequoia’s thesis that the next trillion-dollar company sells work, not software, is getting quoted everywhere this week. The argument: ship a copilot and you compete with every new model release; ship the outcome and you own the value. (source: @guilleflorvs)
- Tesla FSD gets a 100 safety score on every mile driven, bringing down insurance premiums. (source: @tesla_na)
- Waymo + Waze pothole detection pilot launches in SF Bay, LA, Phoenix, Austin, and Atlanta, using autonomous fleet data to help cities fix roads. (source: @SawyerMerritt)
Try This Weekend
For everyone:
- Install Gemini on Mac. Native Swift app, first time Gemini runs outside a browser tab.
- Try Perplexity Personal Computer if you’re on Max. Let it run a task across your local files, a native app, and the browser.
- Link Perplexity Personal Finance with Plaid, free for all users. Ask it “what did I spend on food last month.”
- Ask Gemini to build an interactive 3D visualization for something you’re learning. Rotate it, change variables, see if it beats reading.
For developers:
- Switch to Opus 4.7 in Claude Code with xhigh effort. Hand it a task you’d normally break into pieces.
- Set up a Claude Code Routine. Pick a GitHub event (PR opened) and wire a review routine that posts back as a comment.
- Run Qwen3.6-35B-A3B locally via Unsloth’s 4-bit GGUF. 23GB RAM is enough.
- Build an email agent with Cloudflare Email Service. The onEmail hook is the 10-minute version.
- Install GBrain if you have a pile of markdown notes. Let it index, then ask your agent questions that require memory.
