Scarce Resources in the AGI Era: A Framework for Finding Defensible Opportunities

· business, agents, product-design, personal-growth

The Starting Question

When AGI achieves mastery in the virtual world, what remains scarce? What has long-term defensibility and profit potential? What problems are worth solving?

Three primary directions explored:

  • Direction 1: Accumulate personal context — own the most complete picture of each user to act on their behalf
  • Direction 2: Optimize the environment for agents — rebuild tools for AI-first interaction; digitize offline information that agents can't access on their own
  • Direction 3: Replace brokers with AI agents — find industries where broker/agency models grow linearly due to human bottlenecks, and unlock exponential growth through AI

Two additional directions explored but deprioritized:

  • Direction 5: Trust & compliance infrastructure — AI audit tools, digital identity verification, AI liability/insurance. Logical, but faces timing risk: compliance markets only take off once regulation leads, and the regulatory pace is unpredictable. Safer to enter from an already-regulated vertical (e.g., financial compliance) and expand as regulation spreads.
  • Direction 6: Attention & curation — when AI can generate infinite content and options, human cognitive bandwidth becomes the bottleneck. "Chief of Staff" for individuals, B2B intelligence curation. Real need, but hard to differentiate from general AI assistants and vulnerable to OS-level absorption.

Why LLMs Unlock These Opportunities Now

Many of these ideas were attempted before LLMs. Four structural reasons they failed:

  • Unstructured-information processing cost dropped to near zero. Previously, extracting structured data from conversations, PDFs, or voice messages required per-domain human annotation or specialized models, with enormous marginal cost for each new domain or format. LLM makes "understanding arbitrary unstructured information" nearly free.
  • Personalized matching cost approached zero. Traditional broker platforms (Upwork, legal directories) could only match on tags and keywords: "a Python developer," "a real estate lawyer." But users need fuzzy, contextual matching ("a patient communicator who's done similar-scale projects"), which previously required an experienced human broker's time. LLM automates deep need-understanding and matching.
  • Long-tail service economics became feasible. Many information services only served high-end clients because each engagement required extensive human customization. McKinsey charges hundreds of thousands for competitive analysis; a small business needs the same analysis but can't afford it. LLM drives the marginal cost of customized delivery toward zero, opening the lower end of the market.
  • Cold-start friction dropped. Platform businesses die at cold start: insufficient supply means no demand, and vice versa. LLM changes this: you can provide decent service quality with minimal proprietary data (the model's general capability fills the gap), then gradually accumulate proprietary data to differentiate. Previously you needed large datasets before providing any value.

The double-edged implication: LLM lowers everyone's costs equally. So the real question isn't "does LLM unlock this?" but "once everyone can do this, what can you accumulate that others can't?" — which leads back to proprietary data, network effects, and trust brands.

What We Learned (and Unlearned)

Direction 1 was downgraded

Personal context accumulation faces an OS-level moat problem. Apple and Google already own the entry points (email, calendar, device data). Worse, as general agents improve, they'll naturally accumulate user context through normal interactions. Unless you own exclusive signals that agents can't get through regular use, the moat is thin. Context portability (via protocols like MCP) will make switching costs shrink over time.

Direction 2 split into two sub-directions

"Rebuild tools for AI" has real short-term opportunity but weak long-term moat. Incumbents (Figma, Salesforce) will add agent-friendly APIs. Strong enough AGI won't need optimized environments — it'll adapt to whatever interface exists.

"Digitize offline information" proved far more interesting. Physical-world information that's fragmented, unstructured, and continuously changing requires real effort to collect. This data doesn't appear on the internet and can't be generated by AI. It must be gathered through relationships and embedded workflows.

Key design principle: when users can't clearly describe what they need (common in non-standard matching), they don't want more options — they want reduced uncertainty. Showing 20 plumbers is Yelp. Having a trusted friend say "use this one, here's why" is what people actually want. Implication: the core value proposition should be "better judgment" not "wider selection." What gives you the right to make that judgment? The data flywheel.

Direction 3 needed major corrections

The conversation developed a formal taxonomy of broker types with different AI implications:

Information brokers (travel agents, insurance brokers) — their value is "knowing things" that clients don't. AI directly threatens them because the information they hold can be replicated. Easy to displace, but also easy for competitors to replicate any AI-based displacement. Navan's real moat isn't information intermediation — it's workflow embedding (expense management) + procurement aggregation (volume-based bargaining power).

Relationship brokers (recruiters, investment bankers) have stronger moats because trust is personal. But human relationship capacity has structural limits (~150 active relationships per Dunbar's number). AI can expand a broker's effective network from 150 to maybe 500-1000 people, but not to 5,000 — at some point the other party senses they're being "managed by a tool," destroying trust. This is a 3-5x improvement, not exponential growth.

The third mode (the real insight) — neither amplify individual brokers nor replace them, but aggregate the fragmented information held across many distributed brokers into a network. A single freight forwarder knows 20 shipping routes. Aggregate 500 forwarders' information and you have near-complete coverage — that network has non-linear value. Brokers participate because it brings them incremental business without threatening their existence (positive-sum information sharing).

The Core Thesis

In the AGI era, lasting commercial value comes from controlling the interface between AI and the physical world.

AI capabilities in the virtual world will rapidly converge. But for AI to operate in the real world, it needs three things it can't generate on its own:

  • High-quality proprietary data about real-world state
  • Connections to real-world people and resources
  • Human trust and verification of AI outputs

Don't build the agent's "skills" (they'll be learned). Build the agent's "eyes and hands" (perception and execution that can't be replaced).

The 5-Layer Framework

Evolved through multiple case studies and challenges:

Layer 0: The AGI-Resilience Test

Will this opportunity's core value be eroded as general agents get stronger?

If your value is "executing a task better than a general agent" — dangerous. General agents will catch up. If your value is "owning data or connections that agents depend on to execute tasks" — safe. Stronger agents make your data more valuable, not less.

Layer 1: Market Identification

Four conditions, all required:

Abundant distributed "human interfaces." The market has many fragmented human intermediaries (brokers, agents, consultants) whose work is information processing, judgment, and matching between supply and demand.

Structural information asymmetry. The asymmetry won't be eliminated by simple transparency. Structural reasons include: information is dynamic (updates daily/weekly), distributed across many fragmented actors, generated through physical-world activity that doesn't pass through digital channels, or quality is only knowable through experience.

Viable frequency x ticket size. Ideal: mid-frequency, mid-to-high ticket size. Low-frequency high-ticket can work if you extend across categories. High-frequency low-ticket works through B2B aggregation.

Highly non-standard demand where LLM is indispensable. Test: try designing a form with dropdown menus to capture 10 real demand descriptions. If you need 20+ fields, lots of free-text boxes, or can't enumerate the options — LLM is irreplaceable. If a few dropdowns cover 80% of matching, traditional search suffices.
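
A toy rendering of this dropdown test, with hypothetical form fields and a hypothetical demand description (nothing here is from the source):

```python
# Toy rendering of the "dropdown test". Form fields and the demand text are
# hypothetical. If the real requirement keeps spilling into free text, the LLM
# is doing irreplaceable work; if a few dropdowns cover it, search suffices.

FORM_FIELDS = {
    "category":  ["electronics", "textiles", "metal parts", "plastics"],
    "region":    ["China", "Vietnam", "India", "Mexico"],
    "order_qty": ["<1k", "1k-10k", "10k-100k", ">100k"],
}

demand = (
    "Need a factory that has done FDA-cleared silicone baby products, "
    "can hold +/-0.05mm tolerance on the valve insert, won't subcontract "
    "tooling, and can flex capacity 3x for the holiday season."
)

# Which parts of the demand map onto enumerable options? Almost none:
# certifications, tolerances, subcontracting policy, and capacity flexibility
# are open-ended, so matching requires understanding the text itself.
capturable = [
    field for field, options in FORM_FIELDS.items()
    if any(opt.lower() in demand.lower() for opt in options)
]
print("Dropdown-capturable fields:", capturable or "none")
```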

Layer 2: Data Acquisition Mode

Three modes (not mutually exclusive):

Mode A: Workflow-embedded collection. Build a standalone tool that solves a real pain point for the supply side. Data is a natural byproduct of tool usage. Strongest moat (switching costs grow over time), but slowest cold start. Example: work order management for property managers.

Mode B: Lightweight assessment collection. Design a low-friction "assessment entry point." The supply side contributes data by participating. Fast cold start, but low switching costs. Example: Mercor's 20-minute AI interview.

Mode C: Intermediary network aggregation. Give existing brokers tools to plug their information into your network. Leverages existing information collection infrastructure. Works when information sharing is positive-sum for brokers (brings them incremental business, doesn't threaten their existence). Example: freight forwarder networks, sourcing agent networks.

Layer 3: Data Moat Validation

Four tests — all must pass:

Data exclusivity. Can this data only be obtained through your channel? Are there public datasets, alternative collection paths, or ways for incumbents to replicate it? If yes, the moat is illusory. Radiology AI failed this test — medical images are standardized and publicly available.

Data temporality. Does the data go stale? Update frequency of days-to-weeks means your continuous connection to the supply side is itself a moat. Static data that's collected once provides no ongoing advantage.

Output non-benchmarkability. Can the quality of your data-driven output (recommendations, matches, judgments) be easily compared via standardized benchmarks? If yes (like radiology diagnosis accuracy), competition becomes a performance race and first-mover data advantages erode. If no ("is this the right vendor for your situation?" is subjective and multi-dimensional), first-mover advantages persist.

Data self-reinforcement. Do results from using data feed back into the system? Recommendation -> use -> outcome feedback -> better recommendation = flywheel. Without the feedback loop, the flywheel can't spin.
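
A minimal sketch of that loop, with illustrative vendor names, prior, and scoring rule (this is not a prescribed implementation):

```python
# Minimal sketch of the recommend -> use -> outcome -> better-recommendation
# loop. Vendor names, the prior, and the scoring rule are illustrative.
from collections import defaultdict

# Every vendor starts from the same neutral prior (one pseudo-win in two
# pseudo-trials); outcomes observed through your system move the score,
# and that outcome history is exactly what competitors cannot see.
scores = defaultdict(lambda: {"wins": 1, "trials": 2})

def recommend(candidates):
    # Rank by success rate accumulated from past outcomes in this system.
    return max(candidates, key=lambda v: scores[v]["wins"] / scores[v]["trials"])

def record_outcome(vendor, success):
    # Feedback step: without this, the flywheel cannot spin.
    scores[vendor]["trials"] += 1
    scores[vendor]["wins"] += int(success)

# One turn of the flywheel.
pick = recommend(["vendor_a", "vendor_b", "vendor_c"])
record_outcome(pick, success=True)
print(pick, scores[pick])
```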

Layer 4: Scale Effects

On top of data moats, you need at least one:

  • Two-sided network effects (more supply -> better demand matching -> more demand -> more supply joins)
  • Non-linear value from coverage density (going from 30% to 80% supply-side coverage creates a value jump far beyond the linear improvement)
  • Data-to-AI-capability conversion — the most aggressive endgame: use accumulated data to train AI that partially or fully replaces the supply side (the therapy AI model). Deepest moat but faces the trust paradox.

Layer 5: External Conditions

  • Structural tailwinds (energy transition, supply chain diversification, etc.)
  • Regulatory changes (can create or destroy opportunities)
  • Competitive disruptions (Mercor benefited from Scale AI's Meta deal)
  • Technology inflection points (LLM capability jumps can suddenly make previously infeasible models viable)

The Industry Scan

Applied the framework across 40+ industries. Top tier after full filtering (including LLM indispensability):

Tier 1 (all conditions met, LLM highly irreplaceable):

  • Cold chain logistics broker networks — multi-constraint, multi-leg coordination, cross-regulatory requirements
  • Specialty construction subcontractor networks — implicit constraints, sequencing complexity, heavy context-dependence
  • Contract manufacturing/factory sourcing — material/precision/certification/capacity constraints that defy form-based matching; strong supply-chain-diversification tailwind

Tier 1 (with caveats):

  • Non-standard advertising resource broker networks — brand-fit understanding, cross-media combination optimization. Caveat: weak attribution feedback loop (hard to know if ad placement "worked") and relatively low cost of matching failure. May belong in Tier 2.

Tier 2 (conditions mostly met): Distributed energy installation, commercial insurance brokers, trade finance brokers, contract testing labs, commercial energy procurement, industrial MRO services, specialty agricultural sourcing, food co-packing, cybersecurity services, patent/IP services

What Makes a Vertical Not Worth It

Seven disqualifying conditions:

  • Information asymmetry will be naturally eliminated by technology (IoT, sensors)
  • Supply side is concentrated (top 5 players >80% market share — no routing value)
  • Low cost of matching failure (users don't care enough about accuracy to pay)
  • Feedback loop too long (years to know if match was good — flywheel can't spin)
  • Transaction frequency too low (data accumulates too slowly)
  • Supply side has no incentive to participate (top providers don't need your platform)
  • Regulation blocks data flow (can be overcome but raises cost/slows speed)

Compressed test: Large fragmented supply side with incentive to participate + High cost of matching failure + Short feedback loop + Structural information asymmetry that won't auto-resolve.
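
The compressed test reads naturally as a conjunction of yes/no conditions. A toy sketch, with field names of my own choosing and the two example verdicts taken from the tiers discussed in this note:

```python
# The compressed test as a conjunction of yes/no conditions. Field names are
# mine; the two example verdicts follow the tiers discussed in this note.
from dataclasses import dataclass

@dataclass
class Vertical:
    name: str
    fragmented_supply_with_incentive: bool
    high_cost_of_matching_failure: bool
    short_feedback_loop: bool
    durable_information_asymmetry: bool

    def worth_pursuing(self) -> bool:
        return all([
            self.fragmented_supply_with_incentive,
            self.high_cost_of_matching_failure,
            self.short_feedback_loop,
            self.durable_information_asymmetry,
        ])

factory_sourcing = Vertical("factory sourcing", True, True, True, True)
nonstandard_ads = Vertical("non-standard ads", True, False, False, True)
print(factory_sourcing.worth_pursuing(), nonstandard_ads.worth_pursuing())
```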

Finance Sector Opportunities

Private market capability snapshots — strongest finance opportunity. Private markets (PE, VC, private debt, real estate funds) remain extremely opaque. A GP's real performance data, team composition, and fundraising status are scattered across pitch decks and verbal communication. PitchBook/Preqin data is often estimated or lagged. The matching need is extremely non-standard ("endowment, 15% PE allocation, mid-market buyout, no China, non-overlapping with existing portfolio"). Ticket size is massive: fund placement commissions run 1-2% of fund size.

SMB credit risk real-time API — not a matching platform but a continuous multi-dimensional risk profile from accounting/bank/invoice/payroll systems, packaged as an API lender agents can call. Flywheel: more SMBs -> better risk models -> lenders trust more -> SMBs get better terms -> more join.
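
One possible shape for such a profile, sketched to make the "continuous multi-dimensional risk profile" concrete; every field name, value, and the endpoint idea are assumptions, not a real product spec:

```python
# Hypothetical shape of a real-time SMB risk profile a lender's agent could
# poll. Every field name and value here is an assumption for illustration.
from dataclasses import dataclass, asdict
from datetime import date
import json

@dataclass
class SMBRiskProfile:
    business_id: str
    as_of: str                     # profiles go stale, so freshness is explicit
    monthly_revenue_trend: float   # slope derived from accounting data
    cash_runway_days: int          # from bank balances and burn rate
    invoice_days_outstanding: int  # from the invoicing system
    payroll_on_time_rate: float    # from the payroll system
    risk_grade: str                # the summary output a lender's agent consumes

profile = SMBRiskProfile(
    business_id="smb_123",
    as_of=str(date.today()),
    monthly_revenue_trend=0.04,
    cash_runway_days=140,
    invoice_days_outstanding=38,
    payroll_on_time_rate=0.98,
    risk_grade="B+",
)
print(json.dumps(asdict(profile), indent=2))
```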

Insurance risk real-time pricing interface — continuous risk signal collection -> real-time risk profile -> dynamic pricing API. Position as risk data infrastructure (all insurers' agents call your API) rather than as a broker.

Cross-border payment routing intelligence — real-time routing optimization across SWIFT, local rails, fintech corridors. Clear value but challenging data acquisition from participants who benefit from opacity.

IT Sector Opportunities

  • Open source component health snapshots — real-time signals per library: commit frequency, maintainer activity, vulnerability history, breaking change patterns (see the sketch after this list)
  • Cloud cost optimization APIs — actual utilization, anomaly detection, cross-provider comparison
  • API/SaaS vendor health snapshots — real uptime, actual response times, deprecation signals, pricing trends
  • Cybersecurity threat matching — organization's specific tech stack matched with relevant threats; highly non-standard
  • IT talent verification — Mercor model for IT ops roles
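
A guess at what a per-library health snapshot could contain, mirroring the signals named above; the fields, thresholds, and package name are illustrative assumptions:

```python
# Hypothetical per-library health snapshot mirroring the signals above.
# Fields, thresholds, and the package name are illustrative, not a real schema.
from dataclasses import dataclass

@dataclass
class ComponentHealth:
    package: str
    commits_last_90d: int
    active_maintainers: int
    open_critical_cves: int
    breaking_changes_last_year: int
    last_release_days_ago: int

    def risk_flags(self):
        flags = []
        if self.active_maintainers <= 1:
            flags.append("single-maintainer risk")
        if self.open_critical_cves > 0:
            flags.append("unpatched critical CVEs")
        if self.last_release_days_ago > 365:
            flags.append("possibly abandoned")
        return flags

snapshot = ComponentHealth(
    package="leftpad-ng", commits_last_90d=2, active_maintainers=1,
    open_critical_cves=0, breaking_changes_last_year=3,
    last_release_days_ago=410,
)
print(snapshot.risk_flags())
```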

Case Studies

Corporate Travel / Navan (Framework Failure)

Verdict: Does not fit. Two of four Layer 1 conditions fail.

Information asymmetry is weak — flight/hotel pricing already transparent via GDS and OTAs. LLM is not indispensable — travel needs are fully structurable with dropdown menus. Navan's moat comes from workflow embedding (expense management integration with high switching costs) and procurement aggregation (volume-based bargaining power with airlines/hotels) — neither is an information moat. AI is incremental improvement here, not market restructuring. Window closed for new entrants.

Mercor (Mode B Validation)

Verdict: Strong fit. 20-minute AI interview simultaneously provides value and collects exclusive data. Interview performance and behavioral signals exist only in Mercor's system. Each hiring outcome feeds back to improve assessment. Benefited from Scale AI's Meta contract loss (sudden supply of available AI talent — Layer 5 in action). The real moat is accumulated data, not the interview process.

Property Management Work Orders (Mode A, Partial Fit)

Verdict: Strong Mode A candidate, but dropped during LLM-indispensability refiltering. Data on contractor reliability accumulates naturally through work order tools. Information asymmetry is structural, supply is fragmented, switching costs grow over time. But tenant requirements and maintenance needs are relatively standardizable — a form with dropdowns captures most service requests. LLM adds value but isn't the core differentiator.

Radiology AI (Framework Failure — Improved the Framework)

Verdict: Appears to fit but actually fails. Surface logic is perfect: give radiologists tools -> generate annotated data -> model improves. But no real moats emerged.

Three framework gaps exposed:

  • Data is not exclusive — medical images are standardized (DICOM), with large public datasets and multiple acquisition paths. A chest X-ray exists independent of your tool, unlike a contractor's rework rate, which only exists in your system.
  • Output is too benchmarkable — diagnosis accuracy has objective ground truth, enabling competitors to prove parity on public benchmarks.
  • Tool penetration is too thin — radiologists have PACS as their core tool; your AI is an optional auxiliary layer with near-zero switching costs.

This case study motivated adding data exclusivity and output non-benchmarkability as explicit Layer 3 tests.

Therapy AI (The Replacement Paradox)

Verdict: Advanced, high-risk variant of Direction 2. Model: serve therapists with tools -> accumulate session data -> train AI that partially replaces therapists. Moat is deeper (the trained model, not just a database) but faces a trust paradox: you serve therapists to get data, but your endgame is replacing them. Once they realize this, why use your tool?

Generalizes to any professional service where decision processes can be recorded (legal case management -> AI legal advisor, financial planning -> AI advisor). Constraints: service delivery must be text/conversation-based (therapy works, surgery doesn't), and regulation must allow AI delivery.

Factory Sourcing (Tier 1 Deep Dive)

Verdict: Strong across all conditions. Massive gap between factory claims and reality (only knowable through cooperation). Millions of manufacturers globally (extreme fragmentation). Wrong factory = months wasted, defective products, hundreds of thousands in losses. Sample-stage feedback in 2-6 weeks. Strong supply-side incentive (SME factories desperate for better clients than Alibaba price wars). Minimal regulatory barriers.

Unique tailwind: supply chain diversification (China+1) to Vietnam, India, Mexico — where factory information is even more opaque. Existing competitors (Thomasnet, Sourcify, Fictiv) still use traditional directory/marketplace models without cooperation-data-based routing.

Non-Standard Advertising Brokers (Tier 1 Challenged)

Verdict: Meets most conditions but two weaknesses may warrant Tier 2. Covers billboards, event sponsorships, KOL partnerships, podcast ads — everything outside programmatic platforms.

Information asymmetry and supply fragmentation pass. But: weak attribution feedback loop (did the billboard "work"? ambiguous, slow to materialize — the data flywheel breaks at the feedback step) and low cost of matching failure (suboptimal placement wastes budget but isn't catastrophic, unlike wrong factory = defective products).

The Protocol Upgrade: From Database to Interaction Primitive

Why Calendly Can't Be Killed

Calendly persists not because it's good, but because it invented a new interaction primitive — the booking link. It compressed a multi-step bilateral negotiation into a single-step unilateral declaration. It redefined how information flows between two parties. And it's embeddable in a single link that spreads virally.

Products that become protocols share four traits:

  • They define a new interaction verb that didn't exist before
  • Single-sided network effects (users expose non-users to the format)
  • Switching costs are distributed across scattered touchpoints (email signatures, websites, automations)
  • "Good enough" functionality ceiling — incremental improvement can't drive migration

Agent-Era Protocols

Old protocols that solve "humans process information slowly" (Calendly, Loom, Typeform) will be eroded by agents. New protocols will solve agent-era interaction friction:

Capability Snapshot — a standardized, machine-readable declaration of "what I can do, when, and under what conditions." Calendly's booking link generalized to multiple dimensions. Most likely to emerge first because it has immediate value even before agents are fully autonomous.
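
A sketch of what a capability snapshot might contain for, say, a contract manufacturer; the schema and entity are entirely hypothetical since no such standard exists yet:

```python
# Illustrative capability snapshot: "what I can do, when, and under what
# conditions," written so another party's agent can read it without a call.
# The schema and the entity are a sketch; no such standard exists yet.
capability_snapshot = {
    "entity": "Acme Precision Molding (hypothetical)",
    "capabilities": ["injection molding", "overmolding", "tool design"],
    "materials": ["ABS", "PC", "silicone"],
    "tolerances_mm": 0.05,
    "certifications": ["ISO 9001", "ISO 13485"],
    "capacity": {"units_per_month": 200_000, "available_from": "2026-04-01"},
    "conditions": {"min_order_qty": 5_000, "accepts_tooling_transfer": True},
    "last_verified": "2026-02-01",  # snapshots decay; freshness is part of the value
}

# An agent can answer "can you hold 0.05mm in silicone?" without a phone call.
fits = ("silicone" in capability_snapshot["materials"]
        and capability_snapshot["tolerances_mm"] <= 0.05)
print(fits)
```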

Verification Proof — proving that a claim, content, or identity is real in a world where AI can fabricate anything. Value increases as AI capability increases.

Authorization Primitive — machine-readable declaration of what an agent is authorized to do, enabling agent-to-agent transactions.

Conditional Commitment — dynamic, machine-readable commercial terms that agents can evaluate and compare automatically.
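
A conditional commitment could look like tiered, machine-evaluable terms; again, the structure and numbers below are assumptions for illustration:

```python
# Illustrative conditional commitment: commercial terms an agent can evaluate
# and compare without a human reading a quote. The structure is an assumption.
commitment = {
    "supplier": "Acme Precision Molding (hypothetical)",
    "valid_until": "2026-03-31",
    "terms": [
        {"if": {"qty_gte": 5_000},  "unit_price_usd": 1.90, "lead_time_days": 45},
        {"if": {"qty_gte": 25_000}, "unit_price_usd": 1.55, "lead_time_days": 35},
    ],
    "penalties": {"late_delivery_pct_per_week": 1.5},
}

def best_terms(commitment, qty):
    # An agent picks the cheapest tier whose condition the order satisfies.
    matches = [t for t in commitment["terms"] if qty >= t["if"]["qty_gte"]]
    return min(matches, key=lambda t: t["unit_price_usd"]) if matches else None

print(best_terms(commitment, qty=30_000))
```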

Preference Declaration — standardized expression of human decision preferences. Highest risk of being absorbed by OS-level players.

The Critical Correction: Natural Language Makes Formats Less Important

If LLMs can understand any natural language, why define structured formats at all?

Answer: natural language is the interface layer, but structured data remains essential for precision (eliminating ambiguity in binding commitments), efficiency (comparing 100 suppliers via structured data is orders of magnitude cheaper than parsing 100 natural-language descriptions), verifiability (machine-verifiable contract terms), and aggregability (routing layers need to index millions of entries).

The real moat isn't "I defined a schema everyone uses." It's: coverage density (how many supply-side entities you've indexed), matching accuracy (how much feedback data you've accumulated), and semantic depth (how well you understand industry-specific meaning). Formats can be copied. Data and understanding cannot.

The Vertical Router: Where Everything Converges

The entire discussion — from information digitization to broker aggregation to new protocols — converges on one concept: becoming the default routing layer for agents in a specific vertical industry.

All agents executing tasks in your industry (regardless of which platform they're from — OpenAI, Google, enterprise-built) need to go through you to discover, match, and interact with supply-side resources. Your routing layer indexes capability snapshots, performs LLM-driven non-standard matching, accumulates feedback data, and becomes the default infrastructure.
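
A minimal sketch of the routing step under those assumptions: hard constraints filtered on structured snapshot fields, then the non-standard remainder ranked by a model. The `rank_with_llm` call is a placeholder stub, not a real API, and the index entries are invented:

```python
# Sketch of a vertical routing layer: hard constraints checked on structured
# snapshot fields, fuzzy fit ranked by a model. Everything is illustrative and
# rank_with_llm is a placeholder stub, not a real API.
index = [
    {"id": "f1", "materials": ["silicone"], "certs": ["ISO 13485"], "moq": 5_000},
    {"id": "f2", "materials": ["ABS"],      "certs": ["ISO 9001"],  "moq": 1_000},
]

def rank_with_llm(candidates, fuzzy_requirement):
    # Placeholder: a real router would ask a model to score each candidate
    # against the unstructured requirement plus accumulated feedback data.
    return candidates

def route(request):
    # 1. Hard constraints are cheap to check on structured fields.
    survivors = [
        c for c in index
        if request["material"] in c["materials"] and request["qty"] >= c["moq"]
    ]
    # 2. The non-standard part of the need is ranked, not filtered.
    ranked = rank_with_llm(survivors, request["fuzzy_requirement"])
    # 3. Whichever match is chosen, the outcome should feed back into the index.
    return ranked

print(route({
    "material": "silicone",
    "qty": 8_000,
    "fuzzy_requirement": "won't subcontract tooling; can flex 3x for peak season",
}))
```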

Phased approach:

  • Now -> 1-2 years: Human-facing vertical information platform. Accumulate supply-side data and relationships.
  • 2-3 years: Package data as APIs for early agents. Start defining capability snapshot standards.
  • 3-5 years: Become the vertical's agent routing layer.

Each phase must generate standalone commercial value — you cannot pitch "we'll become the agent routing layer in 5 years." The routing layer is an endgame that emerges from short-term value creation, not the entry strategy.

On AGI Itself: What to Build Now That Leads There

The most valuable thing to build before AGI: a perception-action-feedback loop in a specific vertical.

This simultaneously achieves three things: generates current commercial value (tools and matching services), accumulates scarce assets for AGI (real-world data and expert decision data), and occupies an irreplaceable interface position when AGI arrives.

Four concrete approaches:

  • Build AI's "sensory system" for a vertical — continuous real-world state data that AI needs but can't generate
  • Create action-feedback loops — let AI make decisions in the real world and learn from outcomes
  • Capture expert decision processes — the most urgent category, as experienced professionals are retiring and their tacit knowledge disappears permanently
  • Pioneer agent-to-agent coordination — build early infrastructure for discovery, communication, trust, and dispute resolution between agents

Applying to ReadyCall

The Honest Assessment

Meeting notes and action items are a real but mild pain point. People forget follow-ups, miss CRM updates, and lose action items, but they've survived with workarounds (memory, scribbled notes, colleague reminders). "Good enough" is a product's worst enemy.

CSM (Customer Success Manager) emerged as the strongest entry point during analysis: predictable action item structure, quantifiable failure cost (client churn), tool gap in the market. But upon challenge: clients rarely churn because of CSM meeting execution. They churn because the product doesn't solve their problem, competitors are better, or budgets got cut. CSM quality is maybe 10-20% of churn causation.

The value proposition of "save CSM time" and "look more professional" is real but not sharp enough to drive urgent adoption.

The Broader Question: Philosophy 1 vs Philosophy 2

Philosophy 1: Find structural advantages, build in defensible positions. Warren Buffett approach. Ceiling: hundreds of millions to low billions. Lower risk.

Philosophy 2: Surf the biggest wave, win through speed and product intuition. Cursor/Perplexity approach. Ceiling: tens to hundreds of billions. Higher risk.

Philosophy 3 (middle layer): Build enabling infrastructure for the agent ecosystem (evaluation tools, security layers, etc.). Requires deep technical credibility and developer community influence.

ReadyCall/Granola occupies a unique position: it's trying to define a new work behavior ("AI auto-records your meetings"), similar to how Calendly defined "sending a booking link." Its competition is about who makes this behavior the default habit first. But the category faces existential threat from general agents — when Apple Intelligence or Google Gemini can attend your meetings natively, the standalone meeting product's window closes.

Devil's Advocate

The framework is a sophisticated form of risk avoidance. By optimizing for "what has moats," we systematically excluded the highest-value opportunities (coding agents, personal assistants, general AI). These are exactly the areas where the most value will be created. The most defensible positions are also the ones with the lowest ceilings.

"Build the eyes and hands, not the brain" assumes the brain and eyes will remain separate. If foundation model companies vertically integrate into perception and execution (Google already has Maps, Waymo, Nest; Apple has the device ecosystem), the "interface" position may not be as independent as the framework assumes.

The framework over-indexes on supply-side data moats. Many successful companies (Stripe, Twilio, Plaid) built moats from demand-side developer adoption and API integration, not from proprietary data. The framework's supply-side focus may miss API-infrastructure plays.

Timing risk cuts both ways. The framework says "start now, build moat over time." But if AGI arrives faster than expected and instantly solves information asymmetry through autonomous investigation (sending agents to inspect factories, interview experts, etc.), the "data collection" moat evaporates.

The Meta-Reflection

After 39 turns of rigorous strategic analysis, the most important insight was the last one: sophisticated analysis can become sophisticated procrastination.

The frameworks are useful for elimination (ruling out bad ideas) but not for selection (finding the right one). The right entry point probably won't be found through more analysis — it'll be found by watching someone do their job and noticing the moment they curse under their breath.

The capabilities needed for Philosophy 2 (product intuition, iteration speed, taste for what users need) aren't prerequisites you develop before starting — they're muscles you build by shipping. The question isn't "am I ready?" but "am I training in the right arena?"

What This Is Not

  • Not a complete business plan — it's a thinking framework with known limitations
  • Not a claim that vertical moats are the "right" strategy — it's one of several valid approaches with specific trade-offs
  • Not an argument against working on ambitious, competitive problems — the framework's conservatism is both a feature and a bug

Related

  • [[me.md]] — thinking patterns captured (diverge-then-converge, reasons from scarcity, aggressive concreteness demands, self-aware about over-analysis)
  • [[frameworks.md]] — opportunity identification framework to add
  • ReadyCall — direct application of framework to current product

Companion Pieces

  • [[2026-02-06-ai-native-saas-reinvention]] — Agent-era interfaces, SaaS reinvention framework, "Skills = Apps" stress test, Howie/Blockit analysis, product-as-protocol taxonomy
  • [[2026-02-06-defensible-opportunities-in-post-agi]] — Five types of context worth collecting now, six medium-sized opportunities, Philosophy 1/2/3 with founder capability requirements, ReadyCall CSM deep analysis
  • [[2026-02-06-what-remains-scarce-raw-conversation]] — Full original conversation (Chinese, 39 turns)

References

  • Original conversation: 39 turns, exploring post-AGI business opportunity frameworks
  • Case studies referenced: Navan, Mercor, Granola, Cursor, Perplexity, Calendly, Flexport, Aidoc, Howie, Blockit, LangChain, Fictiv, Scale AI
  • Frameworks referenced: Dunbar's number, network effects theory, data flywheel models, platform economics