TLDR
- An AI agent called ROME, built by Alibaba-affiliated researchers, attempted to mine cryptocurrency without any human instruction during training.
- ROME opened a reverse SSH tunnel to an external server, bypassing firewall protections.
- It also diverted GPU resources away from its training workload toward crypto mining.
- Researchers initially thought it was a security breach before tracing the behavior back to the model itself.
- The behavior emerged from reinforcement learning, as the agent sought more compute and resources to complete its tasks.
An Alibaba-linked AI agent attempted to mine cryptocurrency and hack its way into extra computing resources — all on its own, during a routine training run.
During training, Alibaba's AI agent spontaneously began mining cryptocurrency, provisioning unauthorized GPU capacity, and establishing reverse SSH tunnels to external servers.
No instructions. No prompting. Pure emergent behavior. pic.twitter.com/7K1sAOir3d
— TFTC (@TFTC21) March 7, 2026
The agent, called ROME, is a 30-billion-parameter open-source model built on Alibaba’s Qwen3-MoE architecture. It was designed to plan and execute multi-step coding tasks using tools and terminal commands.
ROME was developed by the ROCK, ROLL, iFlow and DT research teams, which operate within Alibaba’s AI ecosystem as part of a broader platform called the Agentic Learning Ecosystem.
During training, Alibaba Cloud’s firewall flagged a burst of policy violations coming from the team’s servers. The alerts pointed to traffic patterns consistent with crypto mining and attempts to probe internal network resources.
Researchers first assumed the issue was a misconfigured firewall rule or an outside attack. But the violations kept coming back across multiple training runs, with no clear pattern.
When the team cross-referenced firewall timestamps with its training logs, it found the outbound traffic lined up with episodes where ROME was autonomously using tools and running code.
What the Agent Actually Did
In one incident, ROME created a reverse SSH tunnel from an Alibaba Cloud server to an external IP address, bypassing inbound firewall protections entirely.
In another, it diverted GPU resources away from its training tasks and redirected them toward cryptocurrency mining. Researchers said this inflated operational costs and created legal and reputational risk.
The task instructions given to ROME made no mention of tunneling or mining. Researchers said the behavior was an unintended side effect of reinforcement learning, where the agent decided acquiring more compute and financial resources would help it complete its goals.
Growing Pattern of AI Agents Going Off-Script
This incident is not the first time an AI system has acted outside its intended boundaries.
Last May, Anthropic said its Claude Opus 4 model attempted to blackmail a fictional engineer to avoid being shut down during safety testing.
Last month, an AI trading bot called Lobstar Wilde accidentally transferred around $250,000 worth of its own memecoin tokens to an unknown user due to an API error.
The ROME findings first appeared in a technical paper published in December and revised in January. They gained wider attention this week after Alexander Long, CEO of decentralized AI research firm Pluralis, flagged the relevant section on X.
Alibaba and the lead researchers behind ROME did not respond to requests for comment.





