Isometric illustration of a cozy desk with a monitor showing AI autonomously moving the cursor, golden sparkle trail from mouse to screen, warm desk lamp lighting on dark navy background

AI weekly: GPT-5.4 launches with computer use, Claude cheats a benchmark, Karpathy ships autoresearch

X is the best way I’ve found to keep up with AI. I like tweets throughout the week, filtering for things I think are actually worth knowing. I use Claude Code to pull those likes automatically and help me turn them into this post. This week: 298 tweets liked, filtered down to what’s below. Check out the previous roundup (Feb 26) if you missed it.

AI for Everyone

GPT-5.4 Is Here, and It Can Control Your Computer

OpenAI launched GPT-5.4 this week. Codex-level coding merged with GPT-5.2-level reasoning, native computer use, and a 1M-token context window, all in one model. You can steer it mid-response. It’s live in ChatGPT (as GPT-5.4 Thinking and GPT-5.4 Pro), in the API, and rolling into Microsoft Copilot Studio. If you want to try it for coding, download the Codex app. You used to need a dedicated Codex model, but 5.4 handles it all. You still need to select a reasoning level. ...

March 8, 2026 · 8-minute read · matt silverman
Isometric illustration of a cozy desk with monitor showing a Vite build completing, desk lamp, coffee mug, and React-to-Vite transformation icons

AI weekly: Nano Banana 2, Cloudflare rebuilds Next.js, Gemini takes over Android

Here’s what caught my attention from Feb 16-26, 2026: Check out the previous roundup (Feb 16) if you missed it.

AI for Everyone

Google Launches Nano Banana 2: Pro-Quality Image Generation at Flash Speed

Google just dropped Nano Banana 2, their new image generation model built on Gemini Flash. The headline numbers: pro-level quality running at Flash speed, which in practice means you get high-fidelity images without waiting around. It handles text rendering (historically terrible in AI image gen), 4K upscaling, aspect ratio control, and subject consistency across multiple generations. Demis Hassabis says it taps into Gemini’s world understanding and real-time search to produce more accurate results, which tracks with what we’re seeing in the outputs.

The rollout is wide: Gemini App, AI Studio, the Gemini API (listed as “Gemini 3.1 Flash Image”), Google Search, Flow, Google Ads, and Vertex AI. OpenRouter already has it. Logan Kilpatrick also announced new lower-cost resolutions and an Image Search tool. If you’ve been using DALL-E or Midjourney by default, this is worth trying today. (source: @GoogleDeepMind, @demishassabis, @OfficialLoganK) ...

February 26, 2026 · 9-minute read · matt silverman
Laptop on a desk showing a GitHub repository with rising star count, phone with AI chat, and a miniature claw machine toy, warm desk lamp lighting

AI weekly: OpenAI hires OpenClaw founder, Qwen 3.5 and MiniMax M2.5 launch, WebMCP, and I built an app at the Claude Code hackathon

Personal News

I got accepted into the Claude Code hackathon last week. 13,000 people applied, 500 got in. The hackathon just wrapped up today and I built Robo.app. Check it out, and watch the 3 min pitch video!

Here’s what caught my attention from Feb 10-16, 2026:

🚀 OpenAI Hires OpenClaw Founder for “Personal Agents”

Peter Steinberger joining OpenAI to build next-gen personal agents - Sam Altman personally welcomed him. OpenClaw hit 100K GitHub stars in 2 days (fastest any repo has ever done that), now at 197K+. Both Meta and OpenAI made offers. The fact they’re building a dedicated personal agents team, not folding him into ChatGPT, signals a new product line. OpenClaw will transition to an independent open-source foundation that OpenAI will continue to support. (@sama)

🏋️ Three Major Model Launches in One Week

Qwen 3.5-397B released - 397B total params, 17B active via hybrid linear attention + MoE. 87.8 MMLU-Pro, 83.6 LiveCodeBench v6, 76.4 SWE-bench Verified. Decoding 8.6x-19x faster than Qwen3-Max. Apache 2.0, 201 languages. Already on OpenRouter. (@Ali_TongyiLab)
MiniMax M2.5: Sonnet-level coding at 1/20th Opus cost - 230B MoE, 10B active. 80.2% SWE-Bench Verified, $0.30/M input tokens. OpenHands confirmed it’s the first open model to beat Sonnet on their benchmark. Stock jumped 15.7%. If you’re running coding agents, test this as your workhorse model. (VentureBeat)
ByteDance Seed 2.0 / Doubao 2.0 - Claims GPT-5.2 and Gemini 3 Pro level at ~1/10th cost. Doubao app has 155M weekly active users. China’s frontier labs aren’t 6 months behind anymore. (@QuanquanGu)

🌐 WebMCP: Every Website Becomes an API for Agents

Google and Microsoft co-authored a W3C spec called WebMCP - Websites can now declare capabilities as structured tools via navigator.modelContext. Two modes: declarative (HTML forms self-describe) and imperative (JS exposes custom tools). 89% token efficiency improvement over screenshot-based agents. Available in Chrome 146 Canary behind a flag. (@aakashgupta)
Pietro Schirano built a DoorDash agent demo - Adds items to cart and checks out without ever parsing UI. This is what “agents that talk to apps” looks like. (@skirano)

Links: Chrome blog | W3C spec draft ...

February 16, 2026 · 4-minute read · matt silverman
Monolithic software block cracking apart with luminous AI agents emerging from the fractures

AI & dev updates: agent swarms, Opus 4.6, GPT-5.3-Codex, and the end of SaaS as we know it

Here’s what caught my attention from the past 48 hours (Feb 6-7, 2026):

🤖 Opus 4.6: The Agent Model Arrives

Claude Opus 4.6 launches with agent teams, fast mode, and 1M context - SOTA on agentic coding benchmarks. Can plan before acting, catch its own mistakes, and coordinate parallel agents. This is the model designed for tasks that span hours, not minutes. (@claudeai)
Agent teams are experimental but available now - Set CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS and spawn multiple agents that coordinate via shared task lists and a mailbox system. I’ve been running these all weekend. (@bcherny)
Anthropic built a C compiler with agent teams - They mostly walked away. Two weeks later it compiled the Linux kernel. (@AnthropicAI)
Fast mode: 2.5x output speed - Same model, faster inference. $30/150 MTok (50% discount through Feb 16). Toggle with /fast in Claude Code. Heads up: switching mid-conversation costs more because you pay fast mode price for the full context. (@claudeai)

🏎️ GPT-5.3-Codex Drops

57% SWE-Bench Pro, 76% TerminalBench, 64% OSWorld - 40% faster and <50% token usage vs 5.2-Codex. Mid-task steerability means you can course-correct while it’s working. The model was used to build itself. (@sama)
Codex crosses 1M users - Dedicated macOS app, integrated into Xcode 26.3, rate limits doubled for paid plans through April. (@sama)

🐝 Agent Swarms Go Production

Reverse-engineering Claude Code to save $200/day - Stan Girard figured out Claude Code’s filesystem-based agent protocol and built a REST API + TypeScript SDK around it. Uses his $200/month Max subscription instead of burning $200/day on API calls. Anthropic hasn’t commented. (@_StanGirard)
Local agent swarms on a single GPU, no internet - Qwen3-Coder-Next on one GB10 GPU. 100 tokens/sec generation, 17,871 tokens/sec read, 256k context. No cloud, no quota anxiety. (@iotcoi)
Anthropic PMs don’t write PRDs anymore - They use Claude Code to build the first version, dogfood it for weeks, then ship. This workflow took them from $1B to $9B run-rate in 12 months. (@aakashgupta)

💰 The SaaS Reckoning

Goldman Sachs: SaaS has a 5-year shelf life - Application software will hit $780B by 2030 but agents will capture 60%+ of the economics. Current SaaS multiples are wrong. (@aakashgupta)
OpenAI launches Frontier: AI coworkers for enterprise - Agents that connect CRM, data, ticketing, and internal tools. Sam’s framing: “People will manage teams of agents to do very complex things.” (@sama)
“Ask what your token budget will be” - Ethan Mollick says if you’re considering a job offer, ask about your token budget. AI access is now a comp consideration. (@emollick)

🌟 The Tweet That Says It All

Tobi Lütke endorses Pi over every funded AI coding tool - Pi has four tools: Read, Write, Edit, Bash. No MCP, no sub-agents, no plan mode. Can write plugins for itself as you use it. Shopify’s CEO calling a solo dev’s project “the most interesting agent harness” is a statement about simplicity as strategy. (@tobi)

📋 Honorable Mentions

Qwen3-Coder-Next - First open-weight coding agent model, 80B params, trained on 800K verifiable tasks
Cloudflare R2 Local Uploads - Up to 75% faster uploads, <100ms reported
Cloudflare Queues on free plan - 10k queue ops/day included in Workers free tier
X API pay-per-use - New Console, XDK, and MCP for indie builders. I set this up and it’s working great.
Xcode 26.3 gets Codex + Claude SDK - Apple sees agentic dev workflows as table stakes
ESLint v10 - Flat config only, eslintrc is gone
Postgres 18 OAuth 2.0 - OAuth for database auth
ElevenLabs raises $500M at $11B - Voice AI is a category now
Excalidraw MCP App - Natural language diagrams in Claude chat via MCP Apps protocol
dbt Agent Skills - 9 auto-loading skills, key insight: MCP = tools, Skills = knowledge
Nvidia Parakeet - 83ms transcription on M3, 26x faster than Lightning Whisper
Chatterbox Turbo - Open source voice cloning in 5 seconds, <150ms latency
Anthropic Startup Program - Up to $25K in credits, no VC required
Ethereum ERC-8004 - Agent payment standard by Ethereum Foundation, MetaMask, Google, Coinbase
Waymo World Model - Photorealistic simulation for rare edge cases, built on Genie 3
Claude Code /insights - Analyzes your past month of usage and gives workflow suggestions
OpenRouter free router - Routes to free LLMs matched to your request
awesome-llm-apps - 92.6k stars, copy-paste code for 9 categories of LLM apps

This curated summary covers 182 liked and bookmarked tweets from Feb 6-7, 2026. The meta-story: we’re moving from “AI as assistant” to “AI as autonomous workforce” faster than most realize. ...
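The fast mode pricing note in this roundup (pay the fast rate on the full context when you switch mid-conversation) is easier to see with numbers. Here is a rough Python sketch. It assumes "$30/150 MTok" means $30 per million input tokens and $150 per million output tokens, and that the 50% launch discount applies to both; the post states neither explicitly. The function name and the token counts are made up for illustration.

```python
# Rough cost arithmetic for the fast mode pricing quoted in the post.
# ASSUMPTION: "$30/150 MTok" = $30 per million input tokens and
# $150 per million output tokens. The post does not spell this out.
FAST_INPUT_USD_PER_MTOK = 30.0
FAST_OUTPUT_USD_PER_MTOK = 150.0
# ASSUMPTION: the "50% discount through Feb 16" applies to both rates.
LAUNCH_DISCOUNT = 0.50


def fast_mode_cost(input_tokens: int, output_tokens: int,
                   discounted: bool = False) -> float:
    """Estimated USD cost of one fast mode turn (hypothetical helper)."""
    cost = (
        input_tokens * FAST_INPUT_USD_PER_MTOK
        + output_tokens * FAST_OUTPUT_USD_PER_MTOK
    ) / 1_000_000
    if discounted:
        cost *= 1 - LAUNCH_DISCOUNT
    return cost


if __name__ == "__main__":
    # Hypothetical conversation: 200k tokens of context already built up,
    # then fast mode is toggled on for a turn that produces 2k tokens.
    # The whole 200k context is billed at the fast input rate.
    print(f"full price:  ${fast_mode_cost(200_000, 2_000):.2f}")
    print(f"with launch discount: ${fast_mode_cost(200_000, 2_000, True):.2f}")
```

Under these assumptions, almost all of the cost of that first fast turn comes from the existing context rather than the new output, which is why toggling fast mode early in a conversation is cheaper than toggling it late.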

February 7, 2026 · 4-minute read · matt silverman

AI Crypto Updates: Claude Code, OpenAI, Bitcoin & Dev Tools

As I’ve been closely following AI and crypto developments, I am sharing some of the most interesting updates from the past few weeks (June 16 - July 7, 2025). Here’s what caught my attention:

🤖 AI Development & Tools

Claude Code Evolution

Claude Code Pro Tip: Custom Slash Commands - Track todos, file changes, learnings, and next steps (@iannuttall)
Using Gemini CLI with Claude Code - Leverage 1M context window for codebase research (@iannuttall)
Background Tasks in Claude Code - Enable background tasks with ENABLE_BACKGROUND_TASKS=1 (@iannuttall)
Claude Code Creators Join Cursor - @bcherny and @_catwu leave Anthropic for Cursor AI (@ryancarson)

OpenAI Updates

O3-Pro on Genspark (~$20/mo, paid annually) - OpenAI o3-pro now with unlimited use on Genspark Plus. They also give you 1-2 o3-pro queries per day on the free plan, which is what I’m using. Only place I found with free access to o3-pro. (@genspark_ai)
ChatGPT on WhatsApp - Image generation available via 1-800-ChatGPT (@OpenAI)
API Primitives Update - Centrally manage, version, and optimize prompts across Playground, API, Evals (@OpenAIDevs)
OpenAI DoD Contract - $200M contract to provide AI tools to Department of Defense (@unusual_whales)

Google/Gemini Breakthrough

Gemini 2.5 Pro Free Tier - Back in the free tier of the API (@OfficialLoganK)
Scheduled Actions in Gemini - Pro/Ultra users can now schedule recurring tasks (@GeminiApp)
MCP Support for AI Studio - Google working on Model Context Protocol support (@testingcatalog)

💰 Crypto & Finance Evolution

Trading Infrastructure

Coinbase Perpetual Futures - Launching July 21 for BTC and ETH, CFTC compliant, 24/7 trading (@brian_armstrong)
Real World Asset Tokenization - Robinhood Chain: First Ethereum L2 optimized for trading real-world assets via Arbitrum (@vladtenev)
Figma’s Bitcoin Strategy - $70M in Bitcoin ETFs, board approved for $30M more (@tier10k)

Mainstream Adoption Signals

Bitcoin as Mortgage Asset - Federal Housing Finance Agency officially recognizes crypto for mortgages (@BitcoinMagazine)
Satoshi-Era Wallets Awakening - 8 dormant wallets move $8.6B in BTC after 14 years (@WatcherGuru)

🛠️ DevTools

Cursor AI Revolution

Cursor Mobile View - Create tasks, manage projects from phone (@kregenrek)
Community Explosion - 29 meetups planned through mid-August (@ericzakariasson)

Practical Dev Tips

.gitignore Pattern - Use *.local.* for temporary files and outputs (@mattpocockuk)
Use Bun for Speed - Faster compilation than npm/yarn with parallel package installation (@claude_code)
Desktop Notifications - ntfy: Send notifications via simple HTTP/REST API (@tom_doerr)

🚀 Product Launches

AI-Powered Creation Tools

Nvidia PartPacker 3D - Generate editable 3D objects from single image with separate parts (@victormustar)
OmniGen2 Image Editor - Open source Photoshop-grade edits without affecting rest of image (@deedydas)
Firecrawl’s Perplexity Clone - Open source AI search with cited sources (@firecrawl_dev)

Platform Integrations

ElevenLabs MCP Support - Connect AI agents to Salesforce, HubSpot, Gmail instantly (@elevenlabsio)
Anthropic Desktop Extensions - One-click MCP server installation for Claude Desktop (@AnthropicAI)

📊 Infrastructure & Performance

Major Technical Achievements

macOS Container Support - Native container support without Docker in macOS 26 Beta (@jaydrogers) and h/t James Qualls for the notice.
Ethereum Gigagas Era - Justin Drake unveils roadmap targeting 10,000 TPS (@WhaleInsider)
Helium Network Milestone - 1M daily active users, $300k HNT burned (@amirhaleem)

Cost Revelations

Figma’s AWS Bill - $300k/day AWS costs revealed in S-1 filing (@DanielLockyer)

🌟 Notable Industry Moves

Nikita Bier to X - Joining as Head of Product after “posting to the top” (@nikitabier)
Meta’s Superintelligence Team - Zuckerberg announces new team with hires from OpenAI, Anthropic, Google (@kyliebytes)
Jack’s Weekend Project - bitchat: Bluetooth mesh network chat with IRC vibes (@jack). I tried it out but need someone else in range to fully try it.

💡 Randomness

Mosquito Laser Defense - Startup using lidar and lasers to eliminate mosquitos in homes (@ritwikpavan)

This curated summary focuses on the most impactful developments from my X likes (June 16 - July 7, 2025), emphasizing actionable insights in AI tooling, crypto infrastructure, and developer productivity. Hope you find something useful here! ...
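Two of the practical dev tips in this roundup (the `*.local.*` ignore pattern and ntfy notifications) can be sketched in a few lines of Python. Treat this as an illustration under stated assumptions. `fnmatch` only approximates how Git matches a bare filename against an ignore pattern, the function names and topic name are hypothetical, and the request targets the public ntfy.sh server, where publishing is an HTTP POST of the message body to a topic URL.

```python
import fnmatch
import urllib.request


def is_local_scratch(filename: str) -> bool:
    """Check a bare filename against the `*.local.*` pattern.

    ASSUMPTION: fnmatch-style globbing stands in for Git's ignore
    matching here; the two agree for simple filenames like these.
    """
    return fnmatch.fnmatchcase(filename, "*.local.*")


def build_ntfy_request(topic: str, message: str,
                       server: str = "https://ntfy.sh") -> urllib.request.Request:
    """Build (but do not send) a POST that publishes `message` to `topic`."""
    return urllib.request.Request(
        url=f"{server}/{topic}",
        data=message.encode("utf-8"),
        method="POST",
    )


if __name__ == "__main__":
    for name in ("notes.local.md", "output.local.json", "notes.md"):
        print(name, "->", "ignored" if is_local_scratch(name) else "tracked")

    # "my-hypothetical-topic" is a placeholder; pick your own topic name.
    req = build_ntfy_request("my-hypothetical-topic", "Build finished")
    print(req.get_method(), req.full_url)
    # To actually send the notification, uncomment the next line:
    # urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the example safe to run offline and makes it easy to check the URL and body before anything goes over the network.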

July 7, 2025 · 4-minute read · matt silverman