AI is writing code we can’t understand. This startup wants to fix that.


According to VentureBeat, the startup Hud launched its runtime code sensor on Wednesday to tackle a growing crisis in software engineering. As teams at companies like Monday.com and Drata use AI agents to generate more code, they’re hitting a wall in production because tools like Datadog can’t provide the granular, function-level data needed for debugging. Moshik Eilon, group tech lead at Monday.com, described traditional monitoring as a “black box,” while Drata’s CTO Daniel Marashlian cited an “investigation tax” costing hours per day. Hud’s sensor integrates with a single line of code, tracks every function, and feeds data directly to AI agents via an MCP server. The result is dramatic: Drata cut manual triage from 3 hours daily to under 10 minutes and improved mean time to resolution by about 70%.


The AI Blind Spot

Here’s the thing everyone’s realizing: AI is fantastic at writing code, but it’s utterly useless at fixing that same code in production if it’s blind. Think about it. You can ask ChatGPT to build you a feature, but when that feature breaks at 2 AM, the AI has no idea what the database query looked like, what the user input was, or which specific function choked. It’s like a surgeon trying to operate with the lights off. Traditional Application Performance Monitoring (APM) tools weren’t built for this. They’re great for service-level health—telling you “Service A is down”—but they’re too expensive and too coarse to instrument every single function call, which is what you need when you didn’t even write the code yourself.

And that’s the real shift. Engineers are starting to “not know all of the code,” as Eilon put it. When an AI generates a chunk of your codebase, you lose that intimate understanding. So when an alert fires, you’re not debugging your code; you’re doing forensic archaeology on an alien artifact. You’re hopping between logs, traces, and metrics, playing a guessing game. That’s the “voodoo incident” Eilon mentioned. It’s not just inefficient; it completely breaks the promise of AI acceleration. What’s the point of coding 10x faster if you spend 10x longer fixing the mysterious bugs?

How Runtime Sensors Flip The Script

So how is Hud’s approach different? Basically, it moves the intelligence to the edge, right where the code runs. Instead of sampling or requiring you to predict what you’ll need to monitor, it watches every function execution by default. But—and this is key—it only sends back lightweight summaries unless something goes wrong. When there’s an error or a slowdown, *then* it captures the full forensic snapshot: the exact HTTP parameters, the database call, the execution path, everything. It’s like having a security camera that only records when it detects motion.
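To make the "security camera" model concrete, here's a minimal sketch of the idea in Python. All names, thresholds, and data shapes are assumptions for illustration, not Hud's actual implementation: a decorator records cheap aggregates (call count, total duration) for every call, and captures a full snapshot with arguments and the error only when a call fails or exceeds a latency budget.

```python
import functools
import time

# Lightweight per-function aggregates (always collected, cheap to keep).
stats = {}
# Full forensic snapshots (captured only on error or slowdown).
snapshots = []

SLOW_THRESHOLD_S = 0.5  # assumed latency budget for this sketch

def sensor(fn):
    """Watch every call; record aggregates always, full detail only
    when something goes wrong (exception or slow execution)."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        error = None
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            error = exc
            raise
        finally:
            elapsed = time.perf_counter() - start
            agg = stats.setdefault(fn.__name__, {"calls": 0, "total_s": 0.0})
            agg["calls"] += 1
            agg["total_s"] += elapsed
            if error is not None or elapsed > SLOW_THRESHOLD_S:
                snapshots.append({
                    "function": fn.__name__,
                    "args": args,
                    "kwargs": kwargs,
                    "duration_s": elapsed,
                    "error": repr(error) if error else None,
                })
    return wrapper

@sensor
def lookup_user(user_id):
    if user_id < 0:
        raise ValueError("invalid user id")
    return {"id": user_id}

lookup_user(42)          # healthy call: aggregates only
try:
    lookup_user(-1)      # failing call: full snapshot captured
except ValueError:
    pass

print(stats["lookup_user"]["calls"])  # 2
print(len(snapshots))                 # 1
```

The key design choice is the asymmetry: the always-on path touches only a counter and a timer, while the expensive capture of inputs and context happens only on the rare anomalous call, which is what keeps function-level coverage affordable where traditional APM sampling isn't.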

The real genius move is piping that data directly to the AI agents via the MCP server. Now, an engineer in Cursor or VS Code can just ask, “Why is this endpoint slow?” and the AI, armed with actual runtime data, can say, “This function is 30% slower since the last deployment because of this specific nested query.” The investigation starts with the AI, not with a human frantically clicking through dashboards. This turns the whole workflow on its head. It’s not about monitoring for humans anymore; it’s about creating a data feed for the AI that will be maintaining the system. That’s a fundamental rethinking of observability.
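The "30% slower since the last deployment" answer above is, at bottom, a diff over per-function timings between two deployments. Here's a hedged sketch of what such a query tool behind an MCP server might compute; the function name, data shape, and threshold are all hypothetical, since Hud's actual API isn't described in the article.

```python
# Hypothetical runtime-data tool an MCP server might expose to a coding
# agent. Input: average per-function latency (ms) for two deployments.

def find_regressions(baseline, current, threshold=0.20):
    """Return functions whose average latency grew by more than
    `threshold` (default 20%) between baseline and current deploys,
    as (name, percent_slower) pairs, worst first."""
    regressions = []
    for name, base_ms in baseline.items():
        cur_ms = current.get(name)
        if cur_ms is None:
            continue  # function removed or renamed; skip
        change = (cur_ms - base_ms) / base_ms
        if change > threshold:
            regressions.append((name, round(change * 100)))
    return sorted(regressions, key=lambda r: r[1], reverse=True)

baseline = {"fetch_orders": 120.0, "render_page": 45.0}
current = {"fetch_orders": 156.0, "render_page": 46.0}

print(find_regressions(baseline, current))
# [('fetch_orders', 30)]  -- fetch_orders is 30% slower since the deploy
```

The point isn't the arithmetic; it's that once this data exists per function and per deployment, the AI agent can answer the engineer's question directly instead of sending a human off to correlate dashboards.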

The Broader Implication: Trust

This isn’t just about faster bug fixes. It’s about trust. For enterprises to truly scale AI-generated code beyond little experiments, they need a safety net. Runtime intelligence provides that. It bridges the knowledge gap between the human engineer and the AI’s output. If you know that any issue will be immediately visible and diagnosable—often by an AI agent itself—you become much more comfortable letting AI write critical paths. This is how you move from “AI-assisted coding” to “AI-owned software maintenance.”

Look, the stack is changing. As Hud’s CEO Roee Adler said, the old cloud observability model isn’t going to fit this new AI-native world. We’re building a “jigsaw puzzle of a new stack.” In heavy industries, you wouldn’t run a complex manufacturing line without precise, real-time sensor data on every component. For mission-critical software, the principle is the same: you need that granular visibility. Whether it’s code or hardware, you can’t manage what you can’t measure.

Where Do We Go From Here?

The trajectory here seems obvious. As AI coding becomes standard, runtime intelligence becomes non-negotiable. It’s the feedback loop that makes the whole system work. We’ll likely see this capability get baked into broader platforms, and the definition of “observability” will expand from “helping humans see” to “providing context for autonomous agents.”

The big question is whether this creates a new layer in the dev stack or gets absorbed by the giants. But for now, startups like Hud are solving a very real, very painful problem that the Datadogs of the world missed. They’re not just selling a monitoring tool; they’re selling peace of mind for the era of AI-generated code. And for engineers tired of “voodoo incidents,” that’s probably worth its weight in gold.
