MiniMax positions M2.7 as a participant in its own evolution

4/12/2026
MiniMax has unveiled M2.7 and framed it as an early step toward model self-evolution. In its March 18, 2026 announcement, the company said M2.7 is the first model in the M2 series to participate deeply in its own development cycle, not only by completing tasks but also by helping build and refine the surrounding agent harness, memory structure, and skill system used to improve future iterations.

https://filecdn.minimax.chat/public/platform_web/offical-news/%E9%A3%9E%E4%B9%A6%E4%BA%91%E6%96%87%E6%A1%A3/img-1.png

According to MiniMax, M2.7 is designed to build complex agent harnesses and execute elaborate productivity tasks by relying on capabilities such as Agent Teams, complex Skills, and dynamic tool search. During the development of M2.7 itself, the company said the model was allowed to update its own memory, construct dozens of complex skills to support reinforcement learning experiments, and then improve its learning process and harness in response to experiment outcomes. MiniMax described this as the beginning of a model self-evolution loop.

https://filecdn.minimax.chat/public/platform_web/offical-news/%E9%A3%9E%E4%B9%A6%E4%BA%91%E6%96%87%E6%A1%A3/img-2.png

The company also shared an internal workflow showing how M2-series models are being used in research settings. In that setup, an internal version of M2.7 was tasked with building a research agent harness able to interact with multiple project groups across data pipelines, training environments, infrastructure, cross-team collaboration, and persistent memory. MiniMax said this harness supports researchers as they iterate toward better models while keeping human guidance in place.

https://filecdn.minimax.chat/public/platform_web/offical-news/%E9%A3%9E%E4%B9%A6%E4%BA%91%E6%96%87%E6%A1%A3/img-3.png

A day-to-day example comes from the company's reinforcement learning team.
MiniMax said a researcher can discuss an experiment idea with the agent, which then helps review literature, follow a predefined experiment specification, prepare data pipelines and artifacts, and launch experiments. While experiments are running, the agent can monitor progress, read logs, trigger debugging, analyze metrics, fix code, submit merge requests, and run smoke tests. The company said this shifts human involvement toward critical decisions and discussion points, with M2.7 now able to handle 30% to 50% of the workflow.

https://filecdn.minimax.chat/public/platform_web/offical-news/%E9%A3%9E%E4%B9%A6%E4%BA%91%E6%96%87%E6%A1%A3/video-1.mp4

MiniMax emphasized that recursive improvement of the harness itself became a critical capability during this process. Its internal system, the company said, can autonomously collect feedback, build evaluation sets for internal tasks, and iterate on its own architecture, its skills or MCP implementation, and its memory mechanisms in order to complete tasks better and more efficiently. In one internal programming-scaffold experiment, M2.7 reportedly ran for more than 100 rounds through a loop of analyzing failures, planning changes, modifying scaffold code, running evaluations, comparing outcomes, and deciding whether to keep or revert edits. MiniMax said the result was a 30% performance improvement on internal evaluation sets.

https://filecdn.minimax.chat/public/platform_web/offical-news/%E9%A3%9E%E4%B9%A6%E4%BA%91%E6%96%87%E6%A1%A3/video-2.mp4

https://file.cdn.minimax.io/public/d92a6eb4-a4b8-4906-b76a-d627c814a2c0.gif

On software engineering, the company presented M2.7 as a model aimed at real production workflows rather than code generation alone.
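The keep-or-revert scaffold loop MiniMax describes, analyze failures, plan a change, modify the scaffold, evaluate, compare, then keep or revert, follows a familiar hill-climbing pattern. Here is a minimal Python sketch of that pattern; the function and parameter names (`improve_scaffold`, `propose_patch`, `apply_patch`) are illustrative assumptions, not MiniMax's actual API.

```python
import copy

def improve_scaffold(scaffold, evaluate, propose_patch, apply_patch, rounds=100):
    """Iteratively patch a scaffold, keeping only edits that improve the score.

    `scaffold` is any mutable representation of the agent scaffold (e.g. config
    plus code). `evaluate` returns a scalar score on an internal evaluation set.
    `propose_patch` and `apply_patch` stand in for the model's analyze-and-edit
    step. All names here are hypothetical placeholders.
    """
    best = copy.deepcopy(scaffold)
    best_score = evaluate(best)
    for _ in range(rounds):
        # 1. Analyze failures and plan a change (delegated to the model).
        patch = propose_patch(best, best_score)
        # 2. Apply the change to a working copy of the current best scaffold.
        candidate = apply_patch(copy.deepcopy(best), patch)
        # 3. Re-run the evaluation set and compare outcomes.
        score = evaluate(candidate)
        # 4. Keep the edit only if it helps; otherwise revert to the best version.
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```

Because every candidate is scored against the same evaluation set and reverted on regression, the loop can run for many rounds (MiniMax reports more than 100) without degrading the scaffold.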
The announcement highlighted debugging in live environments as one example, saying the model can connect monitoring metrics with deployment timelines, analyze traces, verify hypotheses against databases, identify missing index-migration files, and choose non-blocking index creation before submitting a merge request. MiniMax said it has, on multiple occasions, reduced incident recovery time in live production systems to under three minutes with M2.7.

https://filecdn.minimax.chat/public/d070816d-2c2a-4a5c-a441-48c9dd19d44d.mp4

The company also published benchmark results across software and office work. On SWE-Pro, M2.7 scored 56.22%. It posted 52.7 on Multi-SWE Bench, 55.6 on VIBE-Pro, and 57.0 on Terminal Bench 2. In office-oriented work, MiniMax said the model improved its handling of Excel, PowerPoint, and Word, especially for multi-round revisions and high-fidelity editing. On GDPval-AA, M2.7 reached an Elo score of 1495. It scored 46.3 on Toolathon and maintained a 97% skill-adherence rate across 40 complex skills in MM Claw testing.

MiniMax also described exploratory tests in low-resource machine-learning settings. M2.7 was entered into 22 MLE Bench Lite competitions open-sourced by OpenAI, each runnable on a single A30 GPU. Using a simple harness built around short-term memory, self-feedback, and self-optimization, the company ran three trials over 24 hours of iterative evolution. The best run produced 9 gold medals, 5 silver medals, and 1 bronze medal, while the average medal rate across the three runs reached 66.6%.

Beyond work scenarios, MiniMax said M2.7 also improves character consistency and emotional intelligence, and introduced OpenRoom as a preliminary demo for interactive, web-based agent experiences.
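The "simple harness built around short-term memory, self-feedback, and self-optimization" used in the MLE Bench Lite trials is not published, but the general pattern is straightforward: each attempt is conditioned on notes from earlier attempts, scored, and the best result is retained. A minimal sketch under those assumptions (`solve`, `score`, and `feedback` are hypothetical callbacks, not MiniMax's interface):

```python
def run_competition(solve, score, feedback, budget=10):
    """Minimal competition harness: short-term memory plus self-feedback.

    `solve(memory)` produces a solution attempt conditioned on past notes,
    `score` evaluates it, and `feedback` distills the attempt and its score
    into a note for the next round. A sketch of the general pattern only.
    """
    memory = []                              # short-term memory across attempts
    best, best_score = None, float("-inf")
    for _ in range(budget):
        attempt = solve(memory)              # act, conditioned on earlier notes
        s = score(attempt)                   # external evaluation signal
        memory.append(feedback(attempt, s))  # self-feedback for the next round
        if s > best_score:                   # self-optimization: keep the best
            best, best_score = attempt, s
    return best, best_score
```

Under this pattern, 24 hours of "iterative evolution" simply means running the loop until the time budget is exhausted and submitting the best-scoring attempt.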