marksmith.dev
My Thoughts
Exploring web development, design patterns, and the future of technology.
Tag: #AI

OpenAI Codex App: Computer Use, Agents and a Stronger Desktop Workflow
OpenAI’s Codex app is becoming more than a coding assistant: with computer use, cloud workspaces and agentic workflows, it offers a compelling alternative to Claude Desktop for practical AI-powered work.

Anthropic’s Goodwill Problem: Usage Rules, PR and the Cost of Trust
Anthropic has earned real respect for model quality and safety, but stricter usage rules and awkward PR have left some users and developers questioning whether Claude still feels as open and dependable as it once did.

DeepSeek V4 Flash and Pro: Aggressive Pricing Meets Frontier-Scale Context
DeepSeek’s V4 Flash and V4 Pro models bring 1M-token context, thinking modes and extremely aggressive pricing into direct comparison with GPT-5.5 and Claude’s Sonnet and Opus lines.

ChatGPT 5.5: Everyday AI Gets Faster, Leaner and More Efficient
OpenAI’s ChatGPT 5.5 release shows how far practical AI has moved: faster responses, better efficiency, and strong everyday performance even at low reasoning levels.

Google Vertex AI & Gemini 3.1: Powerful Models, Fragmented Rollout, Real Trade-Offs
Google's Vertex AI docs now feature Gemini 3.1 Pro and 3.1 Flash-Lite, while 2.5 models remain prominent in the general-availability and pricing guidance. That mix is exactly why the platform feels both powerful and confusing.

MiniMax M2.7 and the Token Plan: A New Kind of AI Pricing
MiniMax M2.7 pairs a 204,800-token coding and agent model with a Token Plan that bundles text, speech, image, music, and video access under one subscription. The model matters, but the pricing model may be the bigger disruption.

The Coming AI Wave: Claude Mythos Is Real, but OpenAI's Next Step Is Still Unconfirmed
Anthropic has officially announced Claude Mythos Preview and Project Glasswing. OpenAI has officially announced GPT-5.4 and continued Sora activity. What is not publicly confirmed by OpenAI is the rumor-heavy "GPT-5.5 Spud" story, so this article has been corrected to separate verified reality from speculation.

Claude Code Access Is More Complicated Than "Just a Subscription" — But It Is Not Locked to Anthropic's API
A corrected explanation of how Claude Code access actually works in 2026: which account types are supported, where API billing fits in, and why unsupported community harnesses breaking is not the same thing as Anthropic silently banning all third-party use.

Artificial Analysis: The Independent AI Benchmarking Platform You Need to Know
With AI models dropping faster than most teams can evaluate them, Artificial Analysis has carved out a rare niche: independent, rigorous, and genuinely useful benchmarking that cuts through the hype.

Anthropic Accidentally Leaked 512,000 Lines of Claude Code Source Code: The Full Story
On March 31, 2026, Anthropic shipped 512,000 lines of Claude Code TypeScript source to the public npm registry via a misconfigured source map. Here is what actually happened, what was exposed, and why the company is now issuing DMCA takedowns against its own users.

Qwen 3.6 Plus: Alibaba's Flagship AI Model Redraws the Agentic Coding Boundary
Qwen 3.6 Plus launched on April 2, 2026 with a 1-million-token context window, a score of 61.6 on Terminal-Bench 2.0, and pricing of roughly $0.50 per million tokens on Bailian that undercuts Claude Code. The agentic coding race just got significantly more competitive.

GLM-5.1: China's First Publicly Traded AI Company Ships a Frontier Model
GLM-5.1 hit 77.8% on SWE-bench Verified, was trained entirely on Huawei Ascend chips under US trade restrictions, and was briefly stress-tested in the wild via a stealth launch before going official. Meet Zhipu AI's most capable model yet.