Chris Hayes' Journal

Software Development Might Be R&D Work

Every software developer has sat through the same meeting. Someone asks, “How long will this take?” You think about it and say, “Maybe two weeks?” Then you start building and discover that prices are stored in three different database tables, the mobile app uses a completely different API, and Legal just decided European customers need special handling.

Two weeks becomes four weeks. Four weeks becomes six. Everyone feels bad. Next sprint, same thing happens.

We tend to ask “Why are we bad at estimating?” But there might be a different question: “What if we’re trying to predict work that’s fundamentally about discovery?”

The Pattern Is Everywhere

In 2013, Healthcare.gov launched and immediately failed. The federal government spent $1.7 billion. They had clear requirements—federal law specified what the system had to do. They hired experienced contractors. They followed standard project management practices.

It didn’t matter. The site crashed on day one and stayed broken for months.

This wasn’t incompetence. Smart people worked on that project. They followed the plan. They hit their milestones. And it still failed catastrophically.

The FBI spent $170 million on a Virtual Case File system. Built to spec, hit all the contractor milestones, never deployed. Complete failure.

Target’s Canadian expansion failed largely due to inventory system problems. They assumed their US systems would work in Canada. They didn’t investigate what would be different. Cost: $2 billion and complete market exit.

Same pattern every time: detailed plans, experienced teams, unexpected failures. These aren’t outliers. Gartner says 66% of software projects fail or seriously struggle.

There’s a pattern here worth exploring: What if the challenge isn’t execution? What if we’re approaching software development with a mental model that doesn’t quite match the work itself?

The Manufacturing Mindset

For decades, software project management has borrowed heavily from manufacturing. Story points come from factory efficiency studies. Velocity measures throughput. Sprints are mini assembly lines. We talk about “resources” and “capacity” like we’re managing a production floor.

This made sense historically. When software was becoming a business concern, management naturally turned to the most successful model they knew: the factory. It had worked miracles for manufacturing. The thinking was reasonable: why wouldn’t similar principles work for software?

And in some ways, they do. But there’s something interesting about factories: once you figure out how to build something, you can build it again and again more efficiently. The second car off the assembly line is cheaper than the first. The hundredth is cheaper still.

Software often works differently. Building the same feature twice—even on the same codebase with the same team—takes roughly the same amount of time. The work isn’t repetition. Every implementation involves problem-solving in a unique context.

What the IRS Knows About Software

The IRS has specific definitions for what counts as “Research and Development” work. Companies can get tax credits if their work involves:

Technological uncertainty (you don’t know if it will work when you start)
Original investigation (you’re figuring out something specific to your situation)
Knowledge as primary output (understanding is what you’re creating, not just the end product)

Software development for business purposes qualifies for R&D tax credits under these definitions. The IRS isn’t being generous—they recognize that software work involves genuine uncertainty and investigation.

When you’re building a checkout system, you’re not implementing “checkout” in the abstract. You’re figuring out how checkout works with a specific database schema (which exists for historical reasons), specific integrations (with their own quirks), specific compliance requirements (which vary by jurisdiction), in a specific infrastructure (with its own constraints).

No tutorial covers that exact combination. The work is discovering how these pieces actually fit together in this context.

Other Industries Already Learned This Lesson

The pharmaceutical industry ran this exact experiment in the 1980s and 90s. They looked at their R&D labs and thought, “We should make this more efficient. More predictable. More like a factory.”

They brought in consultants from manufacturing. They created metrics. They measured “compounds synthesized per chemist per year.” They set throughput targets. They standardized processes. They scaled up teams. They applied every principle that made factories efficient.

Productivity collapsed.

There’s a documented phenomenon called Eroom’s Law (Moore’s Law backward). The cost to develop a new drug went from roughly $100 million in the 1950s to over $2.6 billion today. This happened despite vastly better tools, bigger budgets, and more scientists.

When you measure researchers by compounds synthesized, they synthesize safe, predictable compounds. Novel, risky ideas that might lead to breakthroughs don’t get pursued—they’re too slow, too uncertain for the metrics.

By the 2010s, leading pharmaceutical companies started reversing course. They broke up large R&D departments into smaller, autonomous units. They gave teams longer time horizons without intermediate milestones. They stopped measuring throughput and started measuring learning.

They went back to treating R&D like R&D, not like manufacturing.

What R&D Actually Looks Like

Bell Labs produced the transistor, the laser, and Unix. They didn’t do this by having researchers estimate timelines and commit to deliverables. They hired smart people, gave them interesting problems, and let them investigate.

Lockheed’s Skunk Works delivered the SR-71 Blackbird—a plane that could fly Mach 3+ at 80,000 feet—on time and on budget. With a team of 25 engineers working in a rented circus tent.

How? Kelly Johnson’s approach was simple: small teams, clear authority to make decisions, direct communication, and permission to change approach when they learned something that mattered.

Compare that to modern defense projects with thousands of people, detailed Gantt charts, and schedules that slip years behind.

The pattern across successful R&D is consistent:

Don’t estimate how long research will take
Decide how much time/money to invest in finding out if something is possible
Give teams authority to pivot when they learn something important
Measure learning and insight, not throughput
Keep teams small with minimal coordination overhead

A Different Question

Instead of asking “How long will this take?” (which assumes we know what we’re building), the question becomes “What do we need to learn?” and “How much time should we invest in figuring this out?”

Here’s how this plays out:

Someone proposes adding price filtering to product search.

Traditional approach:

Estimate: “Seems straightforward, maybe 2 weeks”
Sprint planning: Assign story points, commit
Start building: Discover prices are stored in three places with different update patterns
Discover: Search index doesn’t support this without a major reindex (6 hours downtime)
Discover: Mobile app uses different search system, needs coordination
Reality: Takes 6 weeks, rushed at the end, creates technical debt

Investigation approach:

Spend 3 days investigating what’s actually involved
Findings: Prices are in three tables. Search needs work. Three possible approaches with different tradeoffs.
Decision: Approach B looks promising but takes 3-4 weeks, or Approach C is simpler with these limitations
Choice made with full information: proceed, try simpler version, or decide it’s not worth the complexity

The difference: time spent learning before committing. Treating uncertainty as something to investigate, not something to guess about.

When It Works

Factorio (a game about building automated factories) was made by two developers who spent years building and playing their own game. No complete design document. They built something small, played it, learned what worked, built more based on what they learned.

When players kept building chaotic “spaghetti” conveyor belt layouts, the developers faced a question: bug or feature? They investigated by playing more and watching players. Turned out the mess was part of the journey—players eventually learn better organization. Instead of “fixing” it, they added tools for players who wanted more organization.

The game is one of the highest-rated on Steam. Two people, investigation-driven approach.

Instagram started as Burbn, a location check-in app with photos and other features. They built it, released to a small group, watched usage. People only used the photo feature. Everything else was ignored.

They could have kept building Burbn features—that was the plan. Instead, they killed everything except photos. Eight weeks later, Instagram launched. Two years later, Facebook bought it for $1 billion.

The Healthcare.gov Fix

When Healthcare.gov failed, a small team was brought in to fix it. They didn’t follow the original plan. They didn’t create a new project timeline.

They investigated: What’s actually broken? What’s the biggest problem right now? They fixed it. Then asked: What’s the biggest problem now?

Tight cycles: investigate, fix, observe what breaks, learn, repeat. Authority to make decisions based on findings instead of following a predetermined roadmap.

The site was working within months. Same problem, different approach.

The Practical Difference

This isn’t about throwing out structure. R&D organizations have plenty of structure. The structure just matches the work:

Decide what problems are worth investigating
Set time and budget boundaries for investigation
Clear decision points: continue, pivot, or stop
Measure learning and progress, not just output
Small teams with authority to make decisions

For software:

Before committing to a feature, spend time investigating what it actually involves
Be explicit about what’s known and what’s still uncertain
Make real decisions based on findings (including “this isn’t worth it”)
Accept that discovery takes time—it’s not a failure of estimation

Rethinking What the Job Actually Is

There’s an interesting shift happening in how we think about software development work.

For a long time, we’ve thought of the job as “writing code.” And that makes sense—code is the tangible output, the thing we can point to and say “I made this.”

But there’s another way to look at it: writing code might be a byproduct of the actual work. The core job could be understanding problems and figuring out solutions. The code is how we capture what we figured out.

What the Work Actually Involves

When you’re assigned a feature, consider what the work entails:

Reading existing code to understand what’s there
Figuring out what someone was trying to accomplish (and whether they succeeded)
Understanding the problem being solved
Identifying what’s missing or broken
Determining what approach might work
Recognizing what could go wrong
Making tradeoffs between different approaches

The code you write at the end might represent 20% of the effort. The other 80% is investigation and understanding.

Yet many developers—and many organizations—treat that 80% as overhead. The code feels like the “real work.” Everything else is just “figuring out what to code.”

But what if that framing has it backwards?

AI Makes This Pattern Clearer

AI code generation is highlighting something interesting about development work.

AI can write code—lots of code, fast. But there are things it still struggles with:

Understanding why existing code is the way it is
Figuring out what problem actually needs solving
Identifying what might break when you change things
Recognizing when a requirement doesn’t quite make sense
Knowing when the right answer might be “we shouldn’t build this”

That remains the developer’s domain. And it might have always been the core of the work.

The code generation part might not be the hard part. Understanding what code to write—that’s where the challenge lives.

What You Might Already Be Doing

Think about the last complex feature you built. How much time did you spend:

Reading existing code to understand the current implementation?
Talking to people who know parts of the system you don’t?
Building small prototypes to test if an approach works?
Discovering the database schema has quirks you need to work around?
Realizing the original requirement doesn’t account for edge cases?
Finding out there’s an integration you didn’t know about?

This is investigation. This is research. This might be the core of the work.

The code you wrote at the end captured what you learned.

The Planning Paradox

Many organizations have built elaborate planning systems:

Detailed requirements (hoping to know what to build before investigation)
Story point estimation (trying to quantify how long investigation takes)
Velocity tracking (treating investigation as a repeatable, measurable process)
Sprint commitments (working around inherent uncertainty)

These practices come from a good place—the desire for predictability and coordination.

But there’s a tension here. Most developers experience this: you can’t really estimate accurately until you understand the problem. And you don’t fully understand the problem until you’ve investigated.

The planning often happens twice—once during estimation, and again when you actually start the work and discover what’s really involved.

Reading Code Might Be More Important Than Writing It

One of the most valuable skills for a developer might actually be reading code, not writing it.

Reading code someone else wrote. Understanding what they were trying to do. Figuring out why it’s broken or incomplete. Determining what needs to change and what needs to stay.

Every system has history baked into it. Technical decisions made for reasons that no longer exist. Workarounds for problems that have been solved elsewhere. Patterns that made sense five years ago but don’t today.

In many ways, the work resembles archeology. You’re excavating through layers of decisions, figuring out what’s solid foundation and what’s accumulated over time.

The code you write becomes the latest layer. And the next person will likely be doing the same archeological work on your code.

Understanding the Intent Behind Existing Code

This is a core skill that rarely appears in job descriptions.

You look at a function and think: “What was this person trying to do?”

Sometimes it’s obvious. Often it’s not. Sometimes the code does what they intended but the intention was wrong. Sometimes the code doesn’t do what they intended and nobody noticed because it only breaks in specific cases.

The work involves:

Reconstructing their mental model
Figuring out their assumptions
Identifying where reality diverged from assumptions
Determining if fixing it means changing the code or changing the approach

This is detective work. This is investigative work. This might be a central part of the job.

The code you write afterward documents your investigation.

Why This Perspective Matters Now

For decades, it’s been easy to think of the job as primarily writing code. That framing allowed software development to be managed with factory-inspired practices.

But AI code generation is shifting this perspective.

If the job were primarily “write code,” AI would be replacing developers entirely. It can write code faster, with fewer typos, in more languages.

But developers aren’t being replaced. Which suggests the core job might be something else.

The enduring work seems to be:

Understanding messy, undocumented systems
Figuring out what actually needs to change
Making judgment calls about tradeoffs
Knowing when “yes we can build this” should be followed by “but we shouldn’t”

This looks more like investigation and research than manufacturing.

The code might be the artifact you produce after doing the investigative work.

What Changes With This Perspective

When someone asks “How long will this take?”:

One approach: “I don’t know, I haven’t written the code yet.”

Another approach: “I don’t know, I haven’t investigated what’s involved yet.”

Same uncertainty, but the second names what’s actually missing.

When you’re “stuck”:

One way to see it: “I can’t figure out how to write this code.”

Another way: “I haven’t understood the problem well enough yet.”

Different framing, different path forward.

When you find the problem is bigger than expected:

One interpretation: “I’m bad at estimating.”

Another interpretation: “Investigation revealed complexity we didn’t know about.”

The second treats it as new information rather than personal failure.

When you recommend not building something:

One mindset: “That’s not my call, I just write code.”

Another mindset: “Part of my job is recognizing when investigation shows this isn’t worth building.”

Different sense of what’s within your professional responsibility.

A Different Professional Identity

This perspective suggests a shift in how we might think about developer identity.

Instead of primarily: “A person who writes code”

Perhaps more: “A person who investigates systems and determines solutions”

The code remains important—it’s how solutions become real. But from this view, it’s the outcome of understanding rather than the core work itself.

This framing emphasizes understanding as the foundation, with code as the proof of that understanding.

The Code Is the Byproduct

Here’s a way to think about it: “My job is understanding problems and determining solutions. The code is the artifact of that understanding.”

If that framing feels strange, it’s worth asking why. Is it because the job really is writing code? Or is it because we’ve been measured on code output for so long that it feels like that must be the central work?

There’s an interesting parallel here: in many ways, software development resembles research more than manufacturing. The code is less like a factory product and more like lab documentation—it captures what you discovered.

This isn’t to say code doesn’t matter. It matters immensely. But it might be the output of the real work rather than the work itself.

Common Challenges to This Approach

Understanding R&D principles and practicing them are different things. Here are common obstacles teams face:

1. Treating Investigation as Waste

Investigation feels like you’re not making progress. No features ship during investigation. No story points get completed. Stakeholders ask, “Why aren’t we building anything?”

This pressure can push teams to skip investigation and jump straight to implementation. “We don’t have time to investigate, we need to start building.”

The irony is that skipping investigation often means spending much longer building something that doesn’t quite solve the problem.

From an R&D perspective, investigation isn’t overhead—it’s a core part of the work. The coding captures what you learned.

2. Demanding Dates Before Investigation

Someone asks, “When will this be done?”

The honest answer is: “Let me spend a week investigating, then I’ll tell you what’s actually involved and what the options are.”

But that’s not what people want to hear. They want a date. Right now.

So you give them a guess. That guess becomes a commitment. That commitment becomes a deadline. That deadline becomes immovable, regardless of what you discover during implementation.

At that point, you’re back to waterfall-style planning despite any agile processes.

The pressure to commit before investigation is one of the biggest obstacles to an R&D approach.

3. Ignoring What Investigation Reveals

You spend a week investigating. You discover the problem is way more complex than expected. Three different approaches, all with serious tradeoffs. This might not be worth doing at all.

You present the findings. The response: “Okay, but we already committed to this. Just build it.”

Investigation works best when findings can inform decisions. That includes stopping projects that turn out to be bad ideas.

If every investigation leads to “build it anyway,” investigation becomes a formality rather than genuine discovery.

4. No Authority to Make Decisions

You investigate. You learn the original approach won’t work. You identify a better approach that takes longer but actually solves the problem.

But you can’t make that call. It needs to go through three layers of approval. By the time approval comes, two weeks have passed and the context has changed.

R&D works better when decision authority lives close to the information. When investigators can’t act on what they learn, investigation can become more about documentation than action.

5. Measuring the Wrong Things

Someone decides to measure “investigation velocity” or “average investigation time per feature.” The goal becomes “investigate faster” rather than “learn what we need to know.”

Teams respond by rushing investigations. Checking boxes instead of learning. Investigations become superficial because thorough investigation looks “slow.”

This mirrors what happened in pharmaceutical R&D: measuring activity instead of learning can undermine the investigation process.

6. Scaling Teams Too Large

“This is taking too long. Let’s add three more people to the investigation.”

Now you have five people who need to coordinate, have meetings, share context, avoid stepping on each other’s work. The investigation slows down.

Small teams investigate faster. They learn faster. They pivot faster. Adding people to investigation work makes it slower, not faster.

7. No Clear Decision Points

Investigation never ends. “Just investigate a bit more. Get a bit more clarity.” The investigation phase stretches from a week to a month to ongoing.

Without clear decision points—”At the end of this week, we decide: proceed, pivot, or stop”—investigation can drift indefinitely.

R&D benefits from boundaries. Time-boxed investigation with an explicit decision at the end provides structure.

8. Continuing After “Stop” Signals

Investigation reveals this is a terrible idea. Too complex. Wrong problem. Better alternatives exist.

But people don’t want to hear it. “We already told leadership we’d do this.” “The roadmap has this on it.” “We can’t just stop.”

So you build it anyway. It often goes poorly, takes longer than expected, and sees little use.

Investigation’s value comes from informing go/no-go decisions. If stopping isn’t a real option when investigation suggests it, investigation loses much of its purpose.

9. Loss of Trust

The first time you investigate and recommend stopping, people say “Okay, good catch.”

The second time, they say “Are you just trying to avoid work?”

The third time, they say “You’re being negative. Can you just be a team player?”

Trust can erode. Investigation starts to feel risky—what if you discover something that makes you look uncommitted? It becomes safer to not investigate too deeply, to find what people want to hear.

Without organizational support for honest findings—including findings that say “we shouldn’t do this”—investigation can drift toward confirmation bias.

10. Reverting Under Pressure

Everything works fine until there’s a crisis. A competitor launches a feature. Leadership demands something by end of quarter. A major customer threatens to leave.

Suddenly it’s: “No time for investigation. Just build it. Fast.”

You’re back to guessing, rushing, building without full context. The R&D practices disappear under pressure.

Ironically, this is often when investigation would be most valuable. Building the wrong thing quickly tends to cost more than building the right thing slightly slower.

The Common Thread

Notice the pattern in these challenges: pressure to show progress, pressure to commit, pressure to move fast.

This pressure often stems from a belief that investigation isn’t “real work.” It can look like you’re not making progress, like you’re overthinking, like you’re avoiding commitment.

An R&D approach requires trusting that time spent learning is valuable. That discovering you shouldn’t build something is a success, not a failure. That pivoting based on findings is good decision-making, not indecisiveness.

The manufacturing mindset—where progress means output, and speed means efficiency—can work against these principles.

You can adopt R&D practices, but they need organizational support to survive. Without that support, they tend to erode when pressure increases.

Subtle Habits That Work Against Investigation

If software development is really R&D work, then certain everyday habits actively undermine it—habits that make perfect sense in manufacturing but work against investigation and discovery.

Beyond the obvious obstacles, there are everyday behaviors that can quietly prevent teams from working like R&D organizations. These are harder to spot because they feel normal—they’re how we’ve always worked.

Things like:

Apologizing for not knowing yet (framing uncertainty as failure)
Treating reading time as unproductive
Defaulting to “Yes, I can build that” before investigating
Feeling guilty about uncertainty mid-investigation
Optimizing for “looking busy” rather than learning
Measuring progress by code written rather than understanding gained

These habits optimize for appearing productive, certain, and fast. They treat investigation as something to minimize rather than embrace.

For a detailed breakdown of 10 specific habits and how to shift them, see 10 Hidden Habits That Block Investigation-Driven Development.

Starting From Where You Are

Most teams can’t just declare “We’re doing R&D now” and reorganize everything. So what’s actually practical?

The short version:

Start small: Try investigation on one project. Document what you learn to build evidence.
Use stealth tactics: “Getting familiar with code” = investigation. “Discovered a complication” rather than “estimate was wrong.”
Handle prescriptive stakeholders: Acknowledge their concern, ask questions that reveal context, present investigation as risk mitigation.
Help developers transition: Low-stakes practice, pairing, treating code as drafts, making uncertainty safe.
Build alliances: Find someone who gets it, try it together, tell the success stories.

The reality is that change happens through demonstrated results, not mandates. When investigation prevents problems or finds better solutions, people notice. “Why do their projects go smoother?”

For comprehensive tactics on adopting investigation-driven development in constrained environments—including handling stakeholders who prescribe solutions, working without authority, and helping developers shift from code-cramming habits—see Practical Guide to Adopting Investigation-Driven Development.

What R&D Actually Produces

If software development is R&D work, what are the actual outputs?

Real R&D organizations produce artifacts that show learning:

Pharmaceutical companies: investigation reports
Skunk Works: weekly learning updates
Bell Labs: publications and patents

For software, the artifacts should be investigation reports, architecture decisions, risk registers, stop decisions, and post-launch reviews.

The code doesn’t tell the full story. Code shows what you built, but not why. Six months later, someone looks at complex code and doesn’t know: you investigated three approaches, the “simple” one failed in testing, this handles a European tax edge case.

That knowledge gets lost. Teams reinvestigate the same problems. They try approaches already ruled out. This is preventable through investigation documentation.

AI changes the game. For years, documentation was hard to find later. AI search understands meaning, synthesizes across documents, and makes investigation journals practical like never before. Simple markdown files in docs/investigations/ become a searchable knowledge base.

For the complete guide to documentation as the primary artifact, including AI implementation details, see Documentation as the Primary Artifact of Software R&D.

Rethinking What “Value” Means in Software

If software development is R&D work, manufacturing metrics don’t capture what creates value.

Where real value lives:

Preventing waste (investigation that stops bad ideas)
Understanding (invisible but massive impact)
Good decisions (compound over time)
Avoided problems (found during investigation, not production)
Code quality (reflects investigation quality)

The speed trap: Team A ships ten features per quarter, half unused, creating fires. Team B ships five features, each solving real problems, codebase improving. Which creates more value?

The measurement problem: Investigation value is often invisible. You don’t see the disaster you prevented. Documentation makes this value visible.

The uncomfortable truth: Much software development creates activity that looks like value, not actual value. When you measure velocity, you optimize for velocity—regardless of whether features create value.

For a deep dive into alternative metrics and making investigation value visible, see Measuring Value in R&D-Style Software Development.

The Cost of Not Investigating

When you treat software development like manufacturing instead of R&D, you pay a price.

Consider the typical costs in many organizations:

Sprint planning, estimation meetings, backlog grooming: 8+ hours per 2-week sprint
Projects that could have been stopped after investigation but weren’t: potentially 20-30% of effort
Rework from building the wrong thing: another 20-30%
Developer burnout from unsustainable “commitments”: turnover costs

For a five-person team, this could add up to $400-700k annually. The opportunity cost—what could have been built instead—is even higher.

Investigation time can feel like overhead when you’re using manufacturing metrics. But in R&D work, investigation that prevents building the wrong thing is often the highest-ROI activity possible.

The cost of treating software like manufacturing is measured in waste, rework, and missed opportunities. The benefit of treating it like R&D is measured in better decisions, avoided problems, and solutions that actually work.

Open Questions

Does treating software development as R&D work actually hold up in practice?

Are there contexts where traditional estimation actually works well?
What would actually need to change in how teams operate day-to-day to work like R&D organizations?
How do you handle stakeholders who need dates and budgets in an R&D model?
Is there something fundamental about software work that makes the R&D analogy incomplete?

This is an exploration, not a manifesto. The central question is: What if we’ve been applying the wrong mental model to software development for 50 years? What if the manufacturing mindset that seemed reasonable has been holding us back? What if treating software like R&D—with investigation, discovery, and knowledge as primary outputs—is a better fit for the actual work?

The goal is to figure out if there’s something useful here, or if it falls apart under scrutiny.