Why Internal Tools Fail, and What to Actually Do About It

I have built an internal tool that failed. Not spectacularly. Quietly, which is almost worse.

We were switching an ops team onto a new system. The compliance case was real: the old process had a data leakage risk, and we needed everything flowing through a single governed platform. We did discovery. We ran sessions. We mapped the workflows. And then we shipped phase one.

Within weeks, the numbers told a familiar story. The tool was technically in use. The ops team was logging in. But the speed of work had dropped, the floor was frustrated, and the feedback forms were empty. It took a lot of digging to find out why.

The ops team did not have an SOP. Every edge case, every exception, every "and then sometimes this happens on a Tuesday afternoon" scenario existed purely in the memory of the person who had been doing the job for two years. When I asked what their edge cases were, they could not fully answer, not because they were withholding, but because they had never needed to articulate what they just knew. And the ones that did surface felt too small to mention. A scenario that hits maybe a handful of times a week. Easy to dismiss.

As a PM, I did exactly what the prioritisation framework told me to do: I asked for the quantum. It was a small number. It did not make phase one. And because it did not make phase one, what got built did not cover the full workflow. And because it was not covering the full workflow, the ops team was opening the old system to handle those cases. And once they were in the old system, they stayed there, because context switching mid-task has its own cost. The tool was abandoned, not because it was bad. It was abandoned because it was incomplete in exactly the place where completeness mattered most.

That failure taught me more about internal tool adoption than any research I have read since. And I have read a lot of it, because the failure is not mine alone. The figure most often quoted is blunt: around 70% of software rollouts fail. The reasons, once you have lived one, are obvious. But nobody talks about them honestly enough.

This piece is an attempt to do that.

The discovery problem: edge cases live in people, not documents

The first and most upstream failure in internal tool development is that the people building the tool do not actually know what the tool needs to do.

This sounds harsh, but it is structural. Ops teams, particularly in high-attrition environments, rarely have comprehensive SOPs. The institutional knowledge of how work actually gets done, including every exception, every workaround, every scenario that does not fit the clean version of the process, lives in the memory of the people who have been doing the job long enough to have encountered it. When you run a discovery session and ask, "What are your edge cases?" you get the ones people can recall on demand, in a meeting room, under mild social pressure. You do not get the Tuesday afternoon scenario that has become so automatic it no longer feels like something worth mentioning.

The Chipper Cash case study from Sprig illustrates how deep this gap can run. When they deployed in-product surveys to understand why a feature was failing in Uganda, 58% of users turned out to be completely unaware that the feature existed. Not resistant. Not frustrated. Simply unaware it was there. The adoption problem looked like rejection from the outside. It was a visibility and discovery problem from the inside. The signal was present. The mechanism to reach it was not.

The fix: Stop running discovery in meeting rooms. Shadow real users doing real work for real shifts, not demos of hypothetical workflows. The scenarios that do not come up in interviews will come up when you are sitting next to someone and a case lands in their queue that they have to handle right now. Document what you see, not just what you are told. Build a living workflow map, not a one-time requirements document, and keep adding to it through the build. The goal of discovery for an internal tool is not to capture requirements. It is to surface the tacit knowledge that the ops team has never had to make explicit before. That is a different exercise, and it takes longer.

The prioritisation problem: frequency is the wrong metric for internal tools

Even when edge cases do get surfaced, the standard PM prioritisation frameworks discard them. RICE scores, MoSCoW, and impact-effort matrices: all of them, in their default form, reward frequency. How many users does this affect? How often does it come up? If the answer is a small number, the edge case lands at the bottom of the backlog and phase one ships without it.

This is the right framework for consumer products and the wrong framework for internal tools, particularly compliance-mandated ones. The relevant metric is not frequency. It is whether the absence of that workflow is a blocker for anyone trying to complete their work.

A scenario that hits five times a week across a team of fifty people looks like a minor edge case in a frequency-weighted analysis. For the agent who hits it on a Wednesday morning and cannot proceed, it is not a 10% problem. It is a 100% problem. They stop. They open the old tool. They complete the task there. And the new tool has just demonstrated, to that person, that it cannot be relied upon for real work.

Mind the Product's analysis of internal product management captures the structural version of this: internal products receive less investment and fewer iterations than customer-facing ones, and the gap between what the tool covers and what the user actually needs compounds over time rather than closing.

The fix: For internal tools, add a fourth dimension to your prioritisation framework alongside reach, impact, and effort: workflow completeness. Before any phase ships, map every scenario within the scoped workflows and mark each one as covered or not. If any uncovered scenario is a blocker, meaning a user cannot complete their task without the old system, it is not an edge case. It is a launch blocker, regardless of frequency. The phase does not ship until the workflow is complete within its defined scope. Narrow the scope if you need to. Do not ship an incomplete one.
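A minimal sketch of that gate, to make the rule concrete. The scenario names, the numbers, and the field names (covered, blocks_task) are hypothetical, not taken from any real backlog tool. The point it illustrates is that frequency is recorded but never consulted when deciding whether a phase can ship.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    weekly_frequency: int  # recorded for context; deliberately ignored by the gate
    covered: bool          # can the new tool handle this scenario end to end?
    blocks_task: bool      # if uncovered, does the user have to open the old system?


def launch_blockers(scenarios):
    """Return every uncovered scenario that forces a user back to the old system."""
    return [s for s in scenarios if not s.covered and s.blocks_task]


def can_ship(scenarios):
    """A phase ships only when no scenario in scope is a launch blocker."""
    return not launch_blockers(scenarios)


# Hypothetical phase-one scope.
phase_one = [
    Scenario("standard case", 400, covered=True, blocks_task=False),
    Scenario("Tuesday afternoon exception", 5, covered=False, blocks_task=True),
    Scenario("cosmetic export format", 2, covered=False, blocks_task=False),
]

for blocker in launch_blockers(phase_one):
    print(f"Launch blocker: {blocker.name} ({blocker.weekly_frequency}/week)")
print("Can ship:", can_ship(phase_one))
```

In this made-up scope, the five-a-week exception blocks the launch while the uncovered cosmetic issue does not, because only the first one sends a user back to the old system. Narrowing the scope means removing scenarios from the list, not marking blockers as acceptable.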

The design problem: internal tools are built to spec, not designed for speed

There is an uncomfortable truth that PMs working on internal tools rarely say out loud: the primary competition is Excel. Not the idea of it. The actual product, refined over four decades by some of the best designers and engineers in the history of the software industry.

The behavioural science on adoption explains why this gap is so hard to close. Kahneman's work on System 1 and System 2 thinking shows that most of what people do at work operates on autopilot, fast and automatic, anchored in routines built over months and years. Opening Excel to update a tracker is System 1 thinking. Opening a new internal tool, finding the right field, navigating an unfamiliar menu: that is System 2. You are asking users to perform cognitively expensive work every time they open your product, until the habit forms. Pendo's research found that only 49% of features get used even when people do engage with a new tool. The other half requires changing how they work, and that change does not happen through an onboarding video.

In ops environments, this is not a preference problem. It is a performance problem. Average handle time (AHT), quality scores, first contact resolution: every second of friction in the UI shows up in the user's metrics at the end of the shift. Attrition in ops teams is typically high, which means many users on the floor came from other organisations where Excel was the primary working surface and have already built deep muscle memory around it. They are not resisting the new tool. They are protecting their numbers.

Internal tools are almost always built to spec and rarely designed. A developer interprets a requirement and builds the functionality. Nobody asks whether a new agent, under time pressure, on their third shift of the week, can complete the task faster on this tool than on the one they came from. That question rarely appears in a UAT script.

The fix: Assign a designer to internal tools. Not a contractor tidying up a UI at the end of a sprint. A designer who is in discovery from day one, who has spent time on the ops floor, and who is optimising for speed of task completion under real conditions. Run timed task tests with real users, not polished demos. Measure time-to-completion and error rate as primary product metrics alongside adoption. When the budget argument comes up, make the case explicitly: this user has a performance target. A slow tool is not a minor inconvenience. It is a daily tax on their output, and it compounds across every person on the floor, every shift, for as long as the tool is in production.
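As a rough illustration of what measuring this can look like, here is a small sketch that summarises timed task tests for the same task on two tools. The trial data is invented for the example, and the function name summarise is hypothetical. In practice the trials would come from real users doing real cases on the floor.

```python
from statistics import median


def summarise(trials):
    """Summarise timed task tests.

    trials: list of (seconds_to_complete, had_error) tuples, one per attempt.
    """
    times = [seconds for seconds, _ in trials]
    errors = sum(1 for _, had_error in trials if had_error)
    return {
        "median_seconds": median(times),
        "error_rate": errors / len(trials),
    }


# Invented numbers: the same task, timed on the old spreadsheet and the new tool.
old_tool = [(40, False), (45, False), (42, True), (38, False)]
new_tool = [(60, False), (75, True), (58, False), (66, True)]

print("old:", summarise(old_tool))
print("new:", summarise(new_tool))
```

The median is used rather than the mean so that one unusually slow attempt does not dominate a small sample. If the new tool is slower or more error-prone on this comparison, that is the number to take into the budget conversation.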

The decision-making problem: tools get sponsored based on memory, not conditions (build vs buy on steroids)

A specific failure mode lives at the leadership level and is rarely discussed directly.

A senior leader moves from one company to another, carrying a strong memory of a tool that worked well. The context that made it work does not travel with the memory: the months of change management, the internal champion who lived in the product daily, the forced deprecation of the prior system, the training investment. What travels is the outcome. The product was good. The team adopted it. Things improved.

At the new company, the leader sponsors the same product. Budget is approved. The rollout begins. And it frequently stalls, because the conditions that produced the prior success are not being reproduced. UC Today's research on project management tool adoption makes the pattern precise: leaders buy a platform, ask teams to move work into it, then act surprised when usage becomes patchy, reporting turns political, and people retreat to spreadsheets. The platform becomes shelfware. The tool did not fail. The conditions for adoption were never built.

The PM who inherits this mandate is working with one hand tied behind their back. The foundational conversation about adoption readiness has been preempted by conviction. Raising concerns about rollout conditions can feel, in that context, like questioning a leadership decision that has already been made.

The fix: Before the procurement decision, run a conditions audit. Not a feature comparison. A conditions audit. What change management infrastructure existed at the previous organisation? Was there a dedicated internal champion? Was the old tool formally deprecated with a hard cutoff date? What was the actual timeline from rollout to genuine, floor-level adoption, and what happened during that period? If those conditions do not exist at the current organisation, they need to be built alongside the tool deployment, not added as an afterthought when adoption stalls at week six. The PM should put this in writing before the contract is signed. Not as a challenge to the decision but as a shared understanding of what the rollout actually requires.

The build problem: the costs that never make the spreadsheet

The case for building an internal tool is almost always made with a cost argument. The licensing fee for a commercial product is real and visible. The cost of building is estimated in engineering sprints, which feels manageable. Build appears cheaper, and the decision is taken.

What does not make it onto that spreadsheet is the future being purchased alongside the decision. Caylent's analysis of internal tools that fail to mature is direct: most stall somewhere between "kind of usable" and "sort of valuable", never quite ready for prime time. That is not a failure of the idea. It is a failure to treat the tool as a product with an ongoing obligation rather than a project with a completion date.

The incremental future costs are predictable and rarely priced in: the engineer who understood the data model leaves; a compliance requirement changes and nobody owns the update; a bug surfaces eight months post-launch and takes three sprints to fix because the context has been lost; a new ops workflow emerges and the tool cannot accommodate it without a rebuild. None of these appeared in the original analysis. All of them were always going to happen.

The fix: The build-vs-buy analysis needs a third column: cost of ownership at year two and year three. This should include engineering time for maintenance and bug fixes, the cost of knowledge transfer when the original team moves on, the feature gap between your build and a commercial product that has a full team iterating on it constantly, and the opportunity cost of the engineering cycles spent on internal tooling rather than the core product. Run that analysis honestly. My view is direct: unless the problem is genuinely specific to the core of your business in a way that no external product will ever adequately serve, buy. Take the licensing cost as infrastructure. Direct your PM and engineering capacity toward your actual competitive advantage. The exception is real, but it requires meeting a high bar before you take on the ongoing product obligation that a build decision creates.
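To show why the extra column matters, here is a small worked sketch with entirely invented figures. The function name cumulative_cost and every number in it are assumptions for illustration. Real inputs would be your own estimates of maintenance, knowledge transfer, and opportunity cost.

```python
from itertools import accumulate


def cumulative_cost(upfront, annual_costs):
    """Running total at the end of each year: upfront cost plus recurring costs so far."""
    return list(accumulate(annual_costs, initial=upfront))[1:]


# Invented figures. Build: initial sprints, a quiet first year, then maintenance,
# knowledge transfer and rework once the original team has moved on.
build = cumulative_cost(upfront=40_000, annual_costs=[5_000, 60_000, 60_000])

# Buy: no upfront cost, a flat licence fee each year.
buy = cumulative_cost(upfront=0, annual_costs=[50_000, 50_000, 50_000])

for year, (b, p) in enumerate(zip(build, buy), start=1):
    cheaper = "build" if b < p else "buy"
    print(f"Year {year}: build={b:,} buy={p:,} -> {cheaper} is cheaper")
```

With these made-up inputs, build wins the year-one comparison and loses from year two onward, which is the pattern a one-year spreadsheet hides. The figures are not a claim about any real tool. They only show how the ranking can flip once later years are priced in.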

The feedback problem: the coffee break conversation

Every failed internal tool has a version of the same dynamic. Everyone on the ops floor knows what is broken. The information never reaches the people who could act on it. The feedback that should be shaping the roadmap is being shared in corridors and Slack DMs, and never in Jira. The PM reads the empty feedback forms as acceptance and moves on to the next phase.

This is not a culture problem. It is a product process failure. When users feel that their frustration carries social risk, because the tool was announced as a leadership initiative and questioning it feels like questioning a decision that has already been made, the feedback goes underground. It surfaces eventually, but in lagging indicators: attrition data, quality score degradation, and the workaround spreadsheets that appear on shared drives six months after launch.

A Smartsheet study of 1,550 operations professionals across seven countries found that 76% rely on workarounds because their approved systems cannot keep up with changing business priorities. That is not a statistic about difficult users. It is a statistic about feedback loops that were never built.

The fix: Build the feedback architecture before the tool launches, not after adoption stalls. Three things specifically. First, direct observation on a regular cadence: the PM or someone on the team spends time on the ops floor every two weeks and treats what they see as primary data. Second, lightweight in-product feedback at the task level, a signal on a specific workflow step rather than a quarterly satisfaction survey, so the friction point is captured at the moment it is felt rather than recalled later under different conditions. Third, a standing ritual where the PM reviews floor-level feedback before every sprint planning session, not as a post-launch courtesy but as a primary input to what gets prioritised next. The goal is to make the coffee break conversation unnecessary by giving it a more useful place to go.
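The second and third points can be sketched in a few lines, assuming friction signals are captured as simple (workflow step, note) pairs. That format, the step names, and the function name top_friction_steps are all hypothetical. The idea is only that task-level signals can be tallied per step and brought to sprint planning as a ranked list.

```python
from collections import Counter


def top_friction_steps(signals, n=3):
    """Rank workflow steps by how many friction signals they received.

    signals: iterable of (workflow_step, note) pairs captured in-product.
    """
    return Counter(step for step, _ in signals).most_common(n)


# Invented signals from one sprint.
signals = [
    ("refund approval", "cannot attach the supporting email"),
    ("address change", "postcode field rejects valid codes"),
    ("refund approval", "had to reopen the old system for partial refunds"),
    ("refund approval", "approval button hidden below the fold"),
    ("address change", "no way to add a second address"),
    ("case notes", "autosave lost my text"),
]

for step, count in top_friction_steps(signals):
    print(f"{step}: {count} signals")
```

A count like this is a starting point for the pre-planning review, not a replacement for the floor observation. A step with a single signal can still be a blocker, which is why the notes travel with the tally.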

The one thing all of this has in common

Every failure described in this piece, the missing edge case, the deprioritised blocker, the tool that was built to spec but not designed for speed, the leadership mandate disconnected from rollout conditions, the build decision that did not price in year two, the feedback that never surfaced: all of them are downstream of a single misunderstanding.

Internal tools are treated as projects with a launch date. They are not. They are products with users who need to win every single day, who are being measured on their performance, and who will use whatever makes them fastest, regardless of what the org chart says they are supposed to use.

The launch is the beginning of accountability. Not the end of it.