gemini - Tag

Google AI Weekly Roundup: Maps, Workspace, Chrome, and the Gemini API Move Forward Together

MP-184 2026-03-17 · @GoogleAI on X

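In a weekly-recap-style post, Google AI runs through this week's key updates: Google Maps, Google Workspace, Gemini Embedding 2, Gemini API control features, and the regional rollout of Gemini in Chrome. It also mentions breast cancer research conducted with Imperial College London and the UK's NHS, so the update spans products, developer tools, and research progress.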

The Gemini API Finally Supports Spend Caps, So CI and Agents Can Experiment More Freely

MP-187 2026-03-17 · @simonw on X

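Simon Willison reshares the news that the Gemini API has added spend caps. He sees it as good news for anyone who wants to run Gemini prompts in CI or let agents experiment with the Gemini API, since there is less risk of a nasty surprise bill.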

api

Command an AI Army from Your Chat Room: OpenClaw ACP Lets You Launch Codex, Claude Code, and Gemini in Discord / Telegram

GP-89 2026-03-09 · OpenClaw Docs

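OpenClaw's ACP (Agent Client Protocol) lets you spawn external coding agents such as Codex, Claude Code, Pi, and Gemini CLI directly from a Telegram or Discord chat. You can also bind them to a thread/topic, set up persistent bindings, switch models mid-session, and adjust permissions. In essence, it turns your chat room into a multi-agent command center. (Updated 2026-03-09: new features including Telegram topic binding, persistent bindings, and ACP Provenance.)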

openclaw acp agent-client-protocol ai-agents codex claude-code multi-agent agentic-coding

Google Releases Gemini 3.1 Pro: 77.1% on ARC-AGI-2, Bringing "Hard Reasoning" into Everyday Development Workflows

MP-110 2026-02-22 · Google

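Google has released Gemini 3.1 Pro (preview), promoting stronger core reasoning and claiming a verified score of 77.1% on ARC-AGI-2. 3.1 Pro is rolling out simultaneously to the API, Vertex AI, the Gemini App, and NotebookLM. For tech leads, the point is not just the benchmark but whether the model can reliably support cross-system integration, data synthesis, and agentic workflows.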

google reasoning benchmark agentic-coding tech-lead

Choosing AI Is No Longer Just About the Model: Ethan Mollick's Three-Layer "Model / App / Harness" Framework for Understanding the 2026 AI Landscape

MP-99 2026-02-19 · Ethan Mollick (One Useful Thing)

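In his latest article, Wharton professor Ethan Mollick proposes a simple but game-changing framework: when choosing AI tools, look at three layers, namely Model (the brain), App (the interface), and Harness (the reins/toolchain). The same Claude Opus 4.6 can only chat in a chat window, but placed in Claude Code it can autonomously write code and run tests for hours on end, and placed in Claude Cowork it can organize reports and operate your computer. Beyond the framework, Mollick also spent an hour with Claude Code turning GPT-1's 117 million parameters into 80 hardcover books and listing them for sale; they sold out the same day.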

ethan-mollick ai-guide models harness claude-code chatgpt agentic-coding framework

SWE-bench February Results Are In: Opus 4.5 Upsets 4.6, Chinese Models Take Half the Board, GPT-5.3 Absent

MP-97 2026-02-19 · Simon Willison

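The SWE-bench team ran all the mainstream models through the same mini-SWE-agent to produce the Bash Only leaderboard (Verified subset, 500 problems). The results are surprising: Claude Opus 4.5 (the older version) narrowly beat Opus 4.6, 76.8% to 75.6%, to take first place, with Gemini 3 Flash and MiniMax M2.5 tied for second. After removing duplicate entries for the same model, four of the top ten are Chinese models. OpenAI's strongest contender, GPT-5.3-Codex, is absent because its API has not been opened up. Simon Willison also used Claude for Chrome to add percentage labels to the chart, which may be the most practical part of the whole post.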

swe-bench benchmark claude-code minimax chinese-ai openai simon-willison leaderboard agentic-coding