When Your Reports Become Your Customers

You call a meeting to "align on priorities." Your team spends two hours in a conference room. Decisions get deferred pending "more data." Everyone leaves to update their status decks for next week's follow-up.

You just cost your team 10 hours of productive work. What did they get in return?

If the answer isn't something concrete and valuable, you're net negative. And most managers are.

I've been testing a framework inspired by Roger Martin's A New Way to Think. His concept applies to organizational layers, corporate strategy, and entire business units. But the core idea hit me hard: every layer above the front line must add more value than it costs. If it doesn't, it weakens the competitive position.

I apply it to how I operate as a manager. One question before every initiative: Does this help my reports win more than it costs them?

What does that look like in practice? Here are the shifts:

The value-add question

Before I schedule a meeting, create a process, or ask for a deliverable, I force myself to answer: Does this help more than it costs?

Time is the cost. Coordination overhead is the cost. Delayed decisions are the cost.

If I can't articulate a specific value that exceeds those costs, I kill it. Even if my peers do it. Even if it's "standard practice."

What your reports actually need

I flipped my 1-on-1s. Instead of collecting status, I ask: "What do you need that I can provide or unblock?"

The shift is fundamental. Your reports aren't your employees—they're your customers. I justify my existence by providing services they can't get more efficiently elsewhere.

Legitimate services: strategic rigor, negotiating vendor contracts, building executive relationships, creating shared capabilities, and removing obstacles. I'll also roll up my sleeves for prototyping and building when necessary.

Not legitimate: coordination and alignment that doesn't produce outcomes, strategic oversight that doesn't make them more competitive, process standardization that doesn't reduce their work.

Most of what managers do falls in the second category. I'm trying to kill it.

The elimination test

If I disappeared tomorrow, what would break? That's my real value. Everything else is coordination theater.

Stay connected to real competition

I spend time where my team's work actually competes. Customer calls, demos, key artifacts for solution discovery and building. Anywhere customers make choices.

Competition doesn't happen in strategy decks. It happens at the front line, where a customer picks your product over the alternative.

Why this works

Most managers optimize for boss satisfaction, peer coordination, and risk minimization.

If you optimize for your direct reports' competitive advantage instead, you create alpha. Your team ships faster. They win more customers. Morale improves.

You're not changing the org chart or fighting power structures. You're shifting from coordination to capability-building. That shift requires no executive approval. It's a choice.

The organization may not reward it explicitly. But your team's results will compound over time. The performance difference becomes undeniable.

Not prescriptive, just experimental

This is what I'm testing in my own work. It's not the only way to manage. I'm learning as I go, adjusting when things don't work.

Roger Martin's framework goes much deeper than what I've described here. His book is worth reading if you're interested in the broader implications, which challenge a wide range of existing operating models.

For me, the practical takeaway was simpler: I can choose to be net positive or net negative to my team. That choice is mine to make.

So I keep asking the question: Does this help my reports win more than it costs them?

And I keep killing the things that fail the test.

What Emerged: 99 Days of Product Thinking Journal

One hundred posts. For me, writing is clarifying thinking. I built this entire site with Claude Code—designed it, deployed it, automated the Obsidian-to-Cloudflare publishing flow. Now, sixty percent of what I've written is about AI. The tool became the subject. That's either profound or obvious, depending on your tolerance for meta-commentary.

Here's what I didn't expect: not the daily writing (I'm reading widely anyway, so ideas only compound), but the sheer pace at which AI developments demanded rethinking. Every week brought capability shifts, strategic implications, and deployment patterns worth exploring. You can't ignore the acceleration even if you try.

This is what emerged from showing up every day without an agenda.

The AI Emergence I Didn't Plan

When I started in July, I knew AI would feature. Product thinking intersects with every major platform shift, and this one's moving faster than most. But I didn't anticipate writing fifty-nine posts with AI in the title, tags, or core argument. That's not editorial strategy—it's the environment forcing constant synthesis.

The vibe coding surprise compounds this. Claude Code didn't just help build features; it architected the entire site. Pagination systems, archive layouts, newsletter integration, test coverage—all generated through conversational iteration. The flow from Obsidian draft to live Cloudflare deployment is trivial now. No friction, no deployment anxiety, no "let me check if this breaks production."
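For the curious, the publish step doesn't need to be fancy. Here's a rough sketch of what that kind of automation could look like; the paths, the front matter flag, and the Wrangler-based deploy are illustrative assumptions, not the actual pipeline behind this site.

```python
# Hypothetical sketch of an Obsidian-to-Cloudflare publish step.
# Assumes a static site with a content/ directory and a Cloudflare Pages
# project deployed via the Wrangler CLI; names are illustrative.
import shutil
import subprocess
from pathlib import Path

VAULT = Path.home() / "Obsidian" / "blog"   # Obsidian vault (assumed location)
CONTENT = Path("site/content/posts")        # static site content directory
BUILD_DIR = Path("site/dist")               # generated output

def publish() -> None:
    # Copy notes marked for publishing (front matter flag is an assumption).
    CONTENT.mkdir(parents=True, exist_ok=True)
    for note in VAULT.glob("*.md"):
        if "publish: true" in note.read_text(encoding="utf-8"):
            shutil.copy(note, CONTENT / note.name)

    # Build the site, then push the output to Cloudflare Pages.
    subprocess.run(["npm", "run", "build"], cwd="site", check=True)
    subprocess.run(["npx", "wrangler", "pages", "deploy", str(BUILD_DIR)], check=True)

if __name__ == "__main__":
    publish()
```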

What does that say about where we are? When the tool that builds your platform becomes the platform story itself, you're living through the shift everyone's theorizing about. The gap between "AI will change how we work" and "AI is how I work" closed faster than expected.

Systems thinking applies to content, too. Each post wasn't planned in isolation—ideas connected, frameworks built on frameworks, and patterns emerged that I didn't consciously design. The writing process became a feedback loop: publish insight, watch what resonates, follow threads that matter. Product thinking applied to product thinking itself. [...]

Two GTM Insights Product Managers Can Actually Use

I've been digging through the 2025 State of B2B GTM Report from Growth Unhinged, and while most of it focuses on channel strategy and GTM execution, two findings stood out for their direct relevance to product work.

These aren't prescriptions—they're observations from one dataset that might be useful as you think about your own product decisions.

Your pricing tier predicts your GTM motion (not the other way around)

The survey shows clear patterns between product pricing and which GTM motions actually work (condensed into a toy rule after the list):

  • PLG dominates for products under $5k/year and companies under $1M ARR
  • Account-based motions work best for expensive products (above $25k ACV)
  • Mid-range products ($5k-$25k) see more success with paid acquisition
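Condensed into a toy rule of thumb, this is roughly the mapping. The thresholds come from the bullets above; the function itself is my simplification, not something from the report.

```python
# Toy condensation of the survey's pattern: pricing tier as a rough
# predictor of GTM motion. Illustrative only, not a recommendation.
def likely_gtm_motion(acv_usd: float) -> str:
    if acv_usd < 5_000:
        return "product-led growth (self-serve)"
    if acv_usd <= 25_000:
        return "paid acquisition / sales-assisted"
    return "account-based / enterprise sales"

for price in (2_000, 12_000, 30_000):
    print(f"${price:,}/year -> {likely_gtm_motion(price)}")
```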

What caught my attention: this suggests that pricing isn't just a revenue decision—it's a GTM architecture decision.

When you're setting pricing tiers or deciding on packaging, you're also making a bet on how the product will go to market. A $2k/year product architected for sales-assisted conversion is fighting uphill. A $30k/year product expecting viral PLG growth faces the same problem.

This doesn't mean you can't defy these patterns. But it does mean your pricing strategy should be informed by the GTM motion you're willing and able to execute, and vice versa.

For PMs: the next time you're in a pricing discussion, it's worth asking explicitly: "Which GTM motions does this pricing strategy enable or constrain?"

AI features work better as augmentation than replacement

The report shows high AI adoption across GTM teams, but 53% see limited or no impact from those investments.

The specifics are telling. AI SDRs (full replacement plays) are particularly disappointing. One team reported "six months, zero opportunities." Meanwhile, AI that augments human workflows—intent-driven outbound, market intelligence, content support—shows better results.

This maps to a broader product principle: automation that eliminates steps in an existing workflow tends to work better than automation that tries to replace the entire workflow. I've explored this pattern before—AI agents grow work rather than replace it.

For product teams building AI features, this suggests focusing on making humans more effective rather than eliminating them. AI that surfaces insights, automates tedious parts of a process, or handles high-volume, low-stakes tasks seems to land better than AI that tries to own an entire job function.

The nuance: this is one survey of B2B GTM teams, not a universal law. But it's consistent with what I'm seeing across other domains—the "copilot" framing works, the "autopilot" framing struggles. For now.

What this means in practice

These aren't definitive answers—they're data points worth considering as you make product decisions.

On pricing: think through the GTM implications before you lock in that tier structure. Your pricing model is also a distribution model.

On AI: consider whether your AI feature is designed to augment a human workflow or replace it entirely. The former seems to be landing better in the market right now.

What patterns are you seeing in your own product work? Do these observations match what you're experiencing, or are you seeing something different?

Early Experience: A Different Approach to Agent Training

AI agents are currently in use, handling customer service interactions, automating research workflows, and navigating complex software environments. But training them remains resource-intensive: you either need comprehensive expert demonstrations or the ability to define clear rewards at every decision point.

Meta's recent research explores a third path. Agent Learning via Early Experience proposes agents that learn from their own rollouts—without exhaustive expert coverage or explicit reward functions. It's early, but the direction is worth understanding.

[Figure: Early Experience training paradigm (source: Meta's research paper)]

Current Training Constraints

Today's agent training follows two primary approaches, each with different resource demands:

Imitation Learning works well when you can provide thorough expert demonstrations. The challenge isn't the method—it's achieving comprehensive coverage across the scenarios your agent will encounter in production.

Reinforcement Learning delivers strong results when you can define verifiable rewards. But most real-world agent tasks like content creation, customer support, and research assistance don't have clear numerical rewards at each step. You're left with engineering proxy metrics that may not capture what actually matters.

Neither approach is inherently limited. Both are constrained by what they require: extensive demonstrations or definable rewards.

What Early Experience Proposes

Meta's research introduces a training paradigm where agents use their own exploration as the learning signal. Two mechanisms drive this:

Implicit World Modeling: The agent learns to predict what happens after it takes actions. These predictions become training targets—future states serve as supervision without external reward signals. The agent builds intuition about environmental dynamics through its own experience.
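A rough sketch of how that supervision might be constructed, as I read it: the agent's own rollouts become "given this state and action, predict the next state" pairs. The field names and prompt format below are my own illustration, not the paper's implementation.

```python
# Sketch: turning an agent's own rollouts into implicit world-modeling
# training data. No reward signal is needed; the observed next state
# serves as the label. Format is illustrative, not Meta's code.
from dataclasses import dataclass

@dataclass
class Step:
    state: str        # observation before acting (e.g. page text, tool output)
    action: str       # action the agent chose
    next_state: str   # observation after acting

def world_model_examples(rollout: list[Step]) -> list[dict]:
    """Each step supervises a 'predict what happens next' target."""
    return [
        {
            "prompt": f"State:\n{s.state}\n\nAction taken:\n{s.action}\n\n"
                      "Predict the resulting state:",
            "target": s.next_state,
        }
        for s in rollout
    ]
```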

Self-Reflection: The agent compares its actions to expert alternatives and generates natural language explanations for why different choices would be superior. It's learning from its suboptimal decisions through structured comparison.
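A similar sketch for the self-reflection signal, again illustrating the shape of the data rather than the paper's exact recipe. Here `generate_reflection` is a placeholder for a model call that writes the rationale; it is not a real API.

```python
# Sketch: constructing self-reflection data by contrasting the agent's
# action with an expert alternative at the same state.
def reflection_example(state: str, agent_action: str, expert_action: str,
                       generate_reflection) -> dict:
    rationale = generate_reflection(
        f"State:\n{state}\n"
        f"Agent chose: {agent_action}\n"
        f"Expert chose: {expert_action}\n"
        "Explain why the expert's choice is better."
    )
    return {
        "prompt": f"State:\n{state}\nWhat should the agent do, and why?",
        "target": f"{rationale}\nAction: {expert_action}",
    }

# Usage with a stub (replace with an actual model call):
example = reflection_example(
    "Cart page shows an error banner",
    agent_action="click 'checkout' again",
    expert_action="open the error details and retry payment",
    generate_reflection=lambda prompt: "Retrying checkout ignores the failure cause.",
)
```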

The core idea: an agent's own rollouts contain a training signal. You don't need a human expert for every scenario or a reward function for every decision.

Whether this scales to production environments across different domains remains an open question.

The Research Numbers

In controlled benchmark environments, Early Experience showed meaningful gains over imitation learning: +18.4% on e-commerce navigation tasks, +15.0% on multi-step travel planning, and +13.3% on scientific reasoning environments.

When used as initialization for reinforcement learning, the approach provided an additional +6.4% improvement over starting from standard imitation learning.

These are research benchmarks, not production deployments. The question is whether these gains transfer to real-world complexity and whether the approach works across different agent domains.

What Changes If This Materializes

If this training paradigm proves viable at scale, several implications follow:

Training economics shift: Less dependence on comprehensive expert demonstration coverage could reduce the human-in-the-loop burden during agent development. You're trading labor-intensive curation for computation-intensive self-supervised learning.

Deployment pathway evolves: Start with Early Experience training, deploy and collect production data, then layer reinforcement learning for further optimization where rewards are verifiable. Each stage builds on actual agent experience rather than static expert datasets.

Infrastructure requirements matter: The approach needs agents with enough initial capability to generate meaningful rollouts. It's most applicable in domains with rich state spaces like web navigation, API interactions, and complex planning tasks.

This isn't a universal solution. It's likely domain-dependent, and we don't yet know where the boundaries are.

The Question Worth Asking

It's too early to call this a paradigm shift. But it represents a direction worth watching: agents learning through structured exploration of their own experience rather than pure imitation or reward maximization.

The research suggests that training agents might become less labor-intensive. Whether that transfers from research benchmarks to production systems is still uncertain.

For teams building agents: what experiments could validate whether self-supervised learning works for your specific use cases? The window between "interesting research" and "table stakes capability" has a way of closing faster than expected.

Agentic AI: It's the Readiness and Access Story

The gap between hype and reality isn't the story everyone's missing about agentic AI. The gap between who's positioned to deploy it and who's stuck waiting for infrastructure—that's the story.

And that gap is widening every quarter.

The technology is proven—access to it is not

Nearly every senior enterprise developer is experimenting with AI agents right now. One in four enterprises is deploying them across teams this year. The question isn't whether autonomous AI systems work. It's whether your organization is set up to use them.

Agentic AI means systems that plan workflows, make decisions, use tools, and execute toward goals autonomously. Several companies are automating complex research workflows. Not demos—production deployments.

The constraint isn't capability. It's infrastructure readiness.

The divide that determines everything

Two types of organizations are emerging.

One group is navigating APIs that don't exist, data scattered across incompatible systems, procurement processes that take months, and compliance frameworks designed for a different era. They're blocked by legacy infrastructure.

The other group solved these problems early. They built integration layers, consolidated data architectures, and established governance processes before they were urgent.

This second group is deploying autonomous AI systems right now, while the first waits for infrastructure to catch up. In twelve months, the capability gap between these groups will be dramatic.

The comfort of "everyone's struggling together" is false. Some organizations aren't struggling; they're shipping.

What's changing about work itself

Humans are shifting toward workflow design and outcome verification rather than task execution. Less time gathering data, more time interpreting it.

This transition creates winners and losers. Product managers who learn to architect agent workflows will be indispensable. Those focused on task-level execution will find their roles increasingly automated. Technologists who understand how to build for autonomous systems will command premium value. Those who wait for clarity will find the market has moved past them.

Some roles will be eliminated. Others will be created. Most will transform beyond recognition.

The realistic path forward requires action now

Most deployments today are basic: simple tasks with predefined objectives. Not revolutionary, but achievable even with infrastructure constraints.

You don't need perfect systems to start learning. Pick one workflow: document triage, report generation, or data synthesis. Run it with full human review. Measure time saved. Identify what breaks. Iterate.
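If it helps to make that concrete, here's the minimal shape such an experiment could take, with a human approval gate on every output. The `triage_document` heuristic is a stand-in for whatever model or agent framework you'd actually use.

```python
# Minimal human-in-the-loop triage experiment: the agent proposes,
# a person approves or rejects, and you log time spent and what broke.
import time

def triage_document(text: str) -> str:
    # Placeholder logic so the sketch runs; swap in your LLM or agent call.
    return "urgent" if "outage" in text.lower() else "routine"

def run_experiment(documents: list[str]) -> None:
    results = []
    for doc in documents:
        start = time.time()
        proposal = triage_document(doc)
        verdict = input(f"Proposed routing: {proposal!r}. Accept? [y/n] ")
        results.append({
            "accepted": verdict.strip().lower() == "y",
            "seconds": round(time.time() - start, 1),
        })
    accepted = sum(r["accepted"] for r in results)
    print(f"{accepted}/{len(results)} proposals accepted. "
          "Review the rejects: that's where the iteration happens.")

if __name__ == "__main__":
    run_experiment(["Customer reports an outage in EU region",
                    "Monthly usage summary"])
```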

You're building organizational fluency with the technology, so when infrastructure catches up, you're ready to deploy at scale.

The teams treating this as optional will spend next year explaining to leadership why competitors moved faster.

What's actually at stake

The transformation is real, but access to it is unequal. That inequality is compounding.

Companies positioned to deploy autonomous AI systems are establishing leads measured in quarters, not weeks. The window for experimentation without falling behind is closing.

This isn't about whether agents will replace human work. It's about whether you're positioned to architect the systems that leverage them. Or whether you'll be explaining why your organization wasn't ready.

What experiment can your team run this quarter?

The Impact Scorecard

It's surprisingly easy to stay busy without making much of an impact.

A team ships features, hits sprint goals, and sees metrics move—but six months later, it's unclear what actually mattered. Not because the team wasn't working hard, but because "impact" is slippery to define.

I've found it helpful to think about impact along two dimensions: customer value and business value. When you map your work on both axes, patterns start to emerge about what's actually moving the needle.

The Impact Scorecard

Think of it as a simple 2x2 matrix. One axis measures how much customers value what you built. The other measures how much it helps the business. Every product initiative lands somewhere on this grid, and where it lands tells you something important.

[Figure: The Impact Scorecard, a 2x2 matrix of customer value and business value]

Quadrant 1: High Customer Value + High Business Value

This is what you're aiming for. You've built something that genuinely helps customers while also driving metrics that matter to the business—maybe retention, revenue, or strategic positioning. A feature that reduces a key pain point and improves conversion. An onboarding flow that both helps new users succeed and increases activation rates. When customer needs and business needs align, you've found the sweet spot.

Quadrant 2: High Customer Value + Low Business Value

Customers love what you built, but it's not moving business metrics. Maybe it's a delightful feature that doesn't connect to conversion or retention. These aren't always wrong: some features are strategic investments in trust and brand. But if most of your roadmap lives here, it's worth asking whether your work is sustainable long-term.

Quadrant 3: Low Customer Value + High Business Value

This drives short-term business results, but customers don't find much value in it. Maybe it's an aggressive upsell prompt or a feature that benefits the business more than users. These can create tension over time. Numbers might look good this quarter, but you're spending down trust, and that has costs down the road that don't always show up in dashboards.

Quadrant 4: Low Customer Value + Low Business Value

Neither customers nor the business benefit much. This often happens when we build based on assumptions rather than evidence, or when we optimize for stakeholder requests without validating demand. It's not a failure; it's just learning. The goal is to recognize these early and redirect effort toward higher-impact work.

What This Means in Practice

Try mapping your recent launches on this grid. You'll probably find a mix across quadrants—that's normal. The exercise isn't about judging past decisions, but about spotting patterns in where you're investing time.
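One lightweight way to run the exercise is to score each launch on both axes and let the bucketing fall out. The launches and scores below are invented for illustration.

```python
# Toy version of the exercise: score each launch 1-5 on customer value
# and business value, then bucket it into a quadrant.
def quadrant(customer: int, business: int, threshold: int = 3) -> str:
    high_c, high_b = customer >= threshold, business >= threshold
    if high_c and high_b:
        return "Q1: high customer / high business"
    if high_c:
        return "Q2: high customer / low business"
    if high_b:
        return "Q3: low customer / high business"
    return "Q4: low customer / low business"

launches = {
    "revamped onboarding": (5, 4),
    "delight feature": (4, 2),
    "aggressive upsell prompt": (2, 4),
    "speculative integration": (2, 2),
}
for name, (c, b) in launches.items():
    print(f"{name}: {quadrant(c, b)}")
```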

If you notice most of your work clustering outside Quadrant 1, it might be worth asking: How could we shift more effort toward work that delivers both customer and business value?

Impact happens at the intersection of solving real customer problems and moving metrics that matter to your business. Everything else is still valuable work. You learn, you build skills, you discover what doesn't work. But knowing the difference helps you be more intentional about where you spend your time.

Anthropic’s Quiet Advantage in the AI Race

“Anthropic’s growth path is a lot easier to understand than OpenAI’s. Corporate customers are devising a plethora of money-saving uses for AI in areas like coding, drafting legal documents, and expediting billing.” — The Wall Street Journal

That line captures an important dynamic in today’s AI market: two companies building similar technology, but betting on very different ways to make it sustainable.

OpenAI is chasing scale — hundreds of millions of users, a consumer-facing brand, and a growing subscription base. Anthropic, by contrast, is growing through depth. Around 80% of its revenue now comes from corporate customers using Claude in coding, legal, and operational contexts. That focus has given it a quieter but steadier business profile, reportedly reaching a $7 billion annual run rate.

It’s interesting how this divide is shaping the landscape. OpenAI’s mass-market reach gives it visibility and data, but it’s still searching for a clear long-term revenue model beyond subscriptions. Anthropic’s enterprise-first approach, while less visible, ties directly to measurable outcomes: productivity gains, cost reductions, workflow acceleration. For now, that seems to be resonating with businesses that know exactly how to calculate return on investment.

“What I’m chasing is to bring to biologists the experience that software engineers have with code generation. You can sit down with Claude and brainstorm ideas, generate hypotheses together.” — Financial Times

This next quote signals how Anthropic is trying to deepen its foothold — not just in the enterprise, but within specific domains. The company is adapting Claude for life sciences, integrating it into lab management and genomic analysis systems.

The examples are telling. Novo Nordisk reportedly reduced clinical documentation from ten weeks to ten minutes using Claude. Sanofi says most of its employees already use it daily. That’s a different kind of AI adoption — one rooted in precision, compliance, and workflow design rather than consumer habit.

What stands out is Anthropic’s framing: Claude isn’t a scientist, it’s a scientific assistant. The focus is on amplifying human work rather than automating discovery. That seems to align with how heavily regulated industries adopt new technology: slowly, methodically, but with lasting impact once trust is established. Features like audit trails, reduced hallucinations, and citation verification make Claude fit for environments where accountability matters as much as performance.

Stepping back

These two stories together hint at an evolving market structure. OpenAI, Anthropic, Google, and others aren’t just competing on model performance anymore; they’re diverging on business logic. Some are building ecosystems around mass reach and developer tools. Others, like Anthropic, are going narrower — optimizing for reliability and use-case fit within high-value domains.

It’s too early to tell which model scales more effectively. Consumer AI could unlock entirely new markets if monetization catches up. Enterprise AI could plateau if integration costs remain high. But what’s clear is that the industry is experimenting with different paths to commercial maturity.

P.S.: Claude Code remains my favorite AI tool hands-down. Anthropic has a winner here.

When the Cost of Delay Becomes Your Biggest Risk

Blockbuster had every advantage—brand, reach, loyal customers. They saw Netflix coming and had the resources to compete. They waited too long. The window closed.

Kodak invented the digital camera in 1975. They knew film was vulnerable. They protected margins instead of building the future. When they finally moved, others had already won.

BlackBerry watched the iPhone launch and dismissed touchscreens as toys. They waited for validation. The window closed again.

The pattern is clear: in disruption, hesitation is the most expensive decision you can make.

When ROI stops working

In stable markets, ROI analysis works: estimate effort, project impact, prioritize by return. But in fast-moving markets, the numbers turn into guesses. That’s when cost of delay becomes the better lens: it measures what you lose by not acting sooner.

Jeff Bezos used this logic in 1994. The ROI of selling books online was unknowable, so he asked himself which decision he’d regret more at age 80. The answer was obvious: he’d regret not trying.

The AI-era shift

This moment is different. Building is faster and cheaper than ever. What once took months now takes days with AI tools, context engineering, and rapid prototyping.

That makes the choice of what to build more important—and more dangerous to delay. Every month spent “waiting to see” is a month others spend shipping and learning. They’re not better funded; they’re just moving.

Consider Chegg. For years, it dominated online homework help. Then generative AI gave students free answers. Chegg waited to adapt. Subscribers left. The model cracked almost overnight.

When barriers to building fall, the cost of delay compounds. What once felt like prudence becomes paralysis.

From maximizing value to minimizing regret

In stable markets, you ask, “What will maximize value?”

In turbulent markets, ask, “What will we regret not having tried?”

This isn’t about chasing every shiny thing. It’s about recognizing inflection points: investing in AI, rethinking your platform, rebuilding for the next developer generation, and knowing when traditional ROI thinking slows you down.

Project yourself forward. In 2027, what will you wish you had started in 2025? That’s your signal. The cost of delay isn’t just lost revenue; it’s lost learning, lost compounding advantage, lost momentum.

The real risk

This isn’t a call to panic. It’s a call to move.

You can now build and test faster than ever, run more experiments with fewer resources, and learn from the market in weeks instead of quarters.

Some windows stay open for years. Others close in months. The work of product leadership is knowing which is which and having the courage to act when waiting costs more than being wrong.

The market won’t wait for you to be certain. Neither should you.