Short-Half-Life AI Tools: Edge Harvesting as the New Software Business Model

Abstract

Software defensibility has shifted from a multi-year moat game to a repeated edge extraction game. In 2024 alone, U.S. private AI investment reached $109.1B, organizational AI adoption hit 78%, and inference costs for GPT-3.5-class workloads fell by 280x from late 2022 to late 2024. Those numbers imply two simultaneous truths: building is cheaper than ever, and feature-level advantage decays faster than ever. This paper argues that short-half-life AI tools are not a pathological business category; they are a rational category with a different operating model. The correct frame is not traditional SaaS compounding but volatility trading: find temporary inefficiencies, monetize quickly, and recycle gains into an agent harness layer that compounds as models improve. We define practical monetization archetypes, pricing envelopes, deprecation-aware product operations, and a barbell portfolio strategy that combines 30-180 day utility waves with durable context/memory/policy infrastructure.

1. From SaaS Moats to Edge Half-Lives

Building on giants is the oldest pattern in software. Linux, the GNU stack, every package manager, every web framework on a web framework: durable systems have always compounded on top of open-source substrate maintained by people who cared about the substrate more than about the business above it. What was new fourteen months ago is that for AI agent harnesses specifically, the giants did not exist yet. There were no clear winners to wrap. So I rolled my own -- a context-export Bash script (protocol-model-context), turned it into a coding agent (pcm), iterated through a string of prompt and MCP utilities (metaprompter, init, mcp-mcp, prompts) -- not because I wanted to maintain dev tools but because nothing viable existed to depend on. Today the substrate is emerging. There still are not clear winners, but real maintainers are shipping tools worth wrapping -- Dicklesworthstone's cass (session archaeology) and beads_rust (graph-aware issue tracking) are the clearest example I work with. My own utilities got displaced into thin wrapper skills around their binaries, and the shape of what I had been building (skill-as-workflow, domain-slice planning, decision-surface artifacts) compounded into a skills plus [redacted] substrate. The paper that follows is the framework I wish I had when I started -- not because the principle is new, but because the harness layer is finally old enough to have principles at all.

The dominant software question of the 2010s was: can this product defend an ARR stream for a decade? The dominant software question of the late 2020s is different: can this product recover build cost before platform integration catches up? The shift is structural: cheaper intelligence, higher baseline capability in consumer products, and faster release cycles by model providers who increasingly ship direct-to-user surfaces.

In practical terms, many categories that previously supported venture-scale SaaS now behave like short-dated instruments. A utility can produce immediate cashflow for 3-12 months, then flatten when an upstream model, operating system, or distribution platform absorbs the core feature. The classic flashlight-app dynamic has moved from hardware toggles to cognition-layer workflows: summarize, route, prioritize, nudge, draft, reconcile, and plan.

The critical strategic error is treating this as a temporary anomaly. The more accurate reading is that software production and software consumption have both moved into a higher-frequency regime. Builders can launch in days. Users can switch in minutes. Platforms can clone in weeks. Under those conditions, persistence is no longer the default objective for every product line. Rapid payback and portfolio recycling become first-class objectives.

This does not eliminate durable businesses. It changes where durability sits. Durable value migrates from isolated feature execution to integrated memory, policy, trust, and distribution systems. If the user can reproduce your value with one new model release and a weekend project, you are in the feature layer. If better models improve your throughput while your proprietary context graph, decision policy, and trust loop stay unique, you are in the harness layer.

2. Edge Compression Is Measurable, Not Anecdotal

The edge compression thesis can be grounded in hard macro data rather than founder sentiment. Stanford's 2025 AI Index documents both rapid adoption and rapid cost compression: U.S. private AI investment reached $109.1B in 2024, global private investment rose to $252.3B (+26% YoY), organizational AI usage climbed to 78%, and inference costs for GPT-3.5-equivalent performance fell by roughly 280x between November 2022 and October 2024.

Table 1. Edge Compression Indicators (2024-2025)

Signal	Latest Reading	Implication
U.S. private AI investment (2024)	$109.1B	Capital concentrates where platform power already lives
Global private AI investment growth (2024)	+26% YoY to $252.3B	More capital chases faster cycles, not longer durability
Organizations using AI (2024)	78% (up from 55%)	Adoption is now mainstream, so novelty windows close faster
Industry share of notable models (2024)	90.16%	Roadmaps are controlled by a few labs and hyperscalers
Top-vs-10th model performance gap	5.4% (down from 11.9%)	Feature advantages are shorter-lived and easier to copy
Inference cost for GPT-3.5-class tasks	280x decline (Nov 2022 to Oct 2024)	MVP cost collapses, but pricing power collapses too

Sources: Stanford HAI AI Index 2025 summary and chapter PDFs; figures rounded where needed for readability.

Each metric pushes in the same direction. More capital and broader adoption attract more builders, which increases competition intensity in the exact layers where build cost is falling fastest. Simultaneously, model quality convergence reduces the duration of quality-based differentiation: when the performance spread between the best and the tenth-best model narrows to single digits, feature-level advantages are increasingly packaging and timing effects, not enduring technical moats.

A second-order effect matters even more. Industry produced 90.16% of notable models in 2024. That concentration means a small number of platform actors can reset feature markets on short notice. You are no longer competing in a market with many independent innovation trajectories; you are operating downstream of a few release calendars.

"Build once, rent forever" has been replaced by "ship fast, collect value quickly, and assume the baseline will move under you." Builders who price and operate for that reality win even when individual products have short shelf lives.

3. Monetization Still Works Because Demand Is Explosive During Window Periods

Fast decay does not imply no revenue. It implies a different revenue curve. Consumer demand around AI utilities has been strong enough to support meaningful short-cycle monetization. Reporting that cites Sensor Tower shows AI app spending exceeded $1.0B in 2024 with >200% YoY growth, and State of Mobile 2025 placed GenAI app spending at $1.49B (+169% YoY). Additional reporting on the first half of 2025 cited $1.87B in GenAI app revenue and 1.7B downloads.

Table 2. Demand Momentum for AI Utility Surfaces

Demand Metric	Observed Value	Commercial Meaning
AI app consumer spending in 2024	>$1.0B	Users pay for immediate utility despite free alternatives
YoY growth in AI app spending (2024)	>200%	Willingness-to-pay can spike before platform absorption
GenAI app spending in 2024 (State of Mobile 2025)	$1.49B	Category has moved from experiment to recurring spend
GenAI app spending growth in 2024 (State of Mobile 2025)	+169% YoY	Revenue windows are short but large when timing is right
GenAI app revenue in H1 2025	$1.87B	Short cycles can still produce meaningful cashflow
GenAI app downloads in H1 2025	1.7B	Distribution velocity remains available to fast movers

Sources: TechCrunch reporting on Sensor Tower / State of Mobile data (2025), January-August 2025 coverage.

Two observations matter for builders. First, willingness-to-pay exists even when free model chat surfaces are available. Users pay for speed, fit, and convenience in context, not raw model access. Second, category growth can be nonlinear for brief windows when model capability crosses a threshold and UX has not yet normalized across major platforms. Those windows are monetizable if onboarding, distribution, and pricing are designed for immediate conversion.

The market therefore rewards a barbell stance: treat feature products as intentionally time-bounded cashflow vehicles, while channeling earnings and telemetry into a longer-lived harness substrate. The failure mode is trying to force every short-wave product into a perpetual SaaS story with long sales cycles, heavy roadmap promises, and cost structures that assume multi-year retention.

4. Price for Payback, Not for Theoretical Lifetime Value

If half-life is short, monetization design must prioritize fast payback over elegant annual plans. In this regime, a 30-day cash recovery target often dominates a 24-month LTV narrative. The objective is to convert novelty and immediate task-value before integration pressure erodes differentiation.

Table 3. Monetization Archetypes for Short-Half-Life Tools

Archetype	Price Envelope	Half-Life	Best Use Case
Single-job utility (lifetime)	$19-$99 one-time	1-6 months	Fast novelty capture, low support load
Workflow accelerator subscription	$9-$29 / month	3-12 months	Daily operators who value speed over perfection
Team micro-agent seat	$49-$199 / seat / month	6-18 months	Domain teams with repeated high-value tasks
Template + automation bundle	$79-$299 bundle	2-9 months	Buyers who need immediate implementation
Outcome-priced execution service	$200-$2,000 / outcome	6-24 months	When confidence in value capture is high
Tool-led advisory retainer	$1k-$10k / month	12-36 months	Convert tool demand into longer-cycle services

Price ranges are operator heuristics based on current AI utility market behavior and observed conversion norms in low-friction digital tools.

This table highlights a non-obvious point: "short-lived" and "low quality" are not equivalent. A tool can be excellent, save users hours weekly, and still be structurally transient because a platform eventually internalizes the workflow. Monetization strategy should therefore encode temporal realism directly in pricing and packaging decisions.

Outcome framing usually outperforms feature framing under cannibalization risk. Features are copyable and often become default controls in upstream interfaces. Outcome claims tied to specific user jobs retain persuasive force longer, especially when backed by transparent before/after data. When possible, package around a completed job unit (e.g., reconciled lead list, routed task queue, finalized memo packet) rather than around the prompt or model selection UI.

Table 4. Offer Design Defaults Under Fast Edge Decay

Design Variable	Fast-Cycle Default	Reason
Time-to-first-value	<10 minutes	Novelty monetization fails if activation requires onboarding
Trial design	No trial or 3-day max	Short windows punish long conversion funnels
Payment trigger	Front-load at first successful output	Capture value before churn spike
Packaging	Outcome unit, not feature list	Features are easiest for platforms to clone
Refund policy	Tight but clear guarantee	Preserves trust while reducing abuse in low-ticket offers
Upsell path	From tool to workflow to advisory	Extends LTV beyond feature half-life

Heuristics for maximizing conversion speed and reducing monetization lag in high-change model markets.

The tactical implication is straightforward: long free trials, delayed value realization, and complex tiering are often anti-patterns in this category. You are not optimizing a mature procurement process. You are optimizing rapid, trust-preserving value capture in a moving baseline.

5. Cost Curves Now Favor High Gross Margin Even at Low Ticket Prices

Falling inference costs radically improve short-cycle economics. OpenAI and Anthropic pricing surfaces in 2026 imply that substantial end-user utility can often be delivered for cents per active session at low-to-mid model tiers. This means a $19-$49 product can maintain strong gross margin if orchestration is disciplined and context handling is efficient.

Table 5. Current API Pricing Surface (Selected Models)

Provider / Model	Input ($ / 1M tokens)	Output ($ / 1M tokens)	Strategic Read
Anthropic Claude Opus 4.6	$5.00	$25.00	Frontier-tier reasoning at materially lower cost than prior Opus pricing bands
OpenAI GPT-5.3-Codex	$1.75	$14.00	High-capability coding agent economics that still support sub-$100 offers
Anthropic Claude Sonnet 4.6	$3.00	$15.00	Production baseline for agent workflows where speed-cost-quality balance matters

Sources: OpenAI model page for GPT-5.3-Codex; Anthropic Claude Opus 4.6 and Sonnet 4.6 model pages, accessed March 5, 2026.

Margin discipline still matters. The fastest way to destroy a viable short-cycle product is uncontrolled context bloat, redundant model calls, and no routing policy. The right architecture routes most calls to the cheapest adequate model, escalates only when confidence thresholds fail, and aggressively caches reusable context. Providers themselves now expose cost controls (e.g., Anthropic Batch API discounts), reinforcing the feasibility of low-ticket monetization with healthy gross margin.

Table 6. Illustrative Unit Economics for Fast-Cycle AI Products

Scenario	Users / Month	ARPU	Gross Revenue	Illustrative Inference Cost	Gross Margin
Niche solo utility	400	$19	$7,600	$300	~96%
Power-user workflow app	1,200	$24	$28,800	$1,450	~95%
Team micro-agent	250 seats	$149	$37,250	$3,800	~90%
Outcome-priced agent service	140 outcomes	$450	$63,000	$6,900	~89%

Illustrative scenarios, not audited financials. Inference assumptions reflect blended low/mid-tier routing and moderate context sizes.

These economics reframe the problem. You do not need a monopoly category winner to produce meaningful cashflow. You need disciplined scope, rapid iteration, and predictable conversion behavior in a well-defined niche. Monetization viability no longer requires defending the entire category over many years; it requires operational precision during the useful window.

6. Deprecation and Integration Are Product Variables, Not Surprises

Builders who treat platform changes as random shocks are repeatedly punished. Deprecations and migrations are now routine. OpenAI, Anthropic, and Google all publish model lifecycle changes with explicit dates, and consumer-facing products can shift defaults quickly (as seen when GPT-4 was retired from ChatGPT). A robust short-cycle business therefore includes an explicit deprecation clock.

Table 7. Model Lifecycle Events That Define the Product Clock

Platform Event	Date	What It Signals
OpenAI retired GPT-4 from ChatGPT interface	2025-04-30	Consumer-facing model shelf life is now short
OpenAI scheduled chatgpt-4o-latest shutdown	2026-02-17	Alias stability cannot be assumed in product architecture
OpenAI scheduled gpt-4-32k deprecation	2025-06-06	Legacy premium tiers can disappear rapidly
Anthropic scheduled Claude 3.5 Sonnet (20240620) retirement	2025-10-22	Provider-led migrations are recurring, not exceptional
Google sunset gemini-2.5-flash-image-preview	2026-01-15	Preview capabilities are explicitly temporary monetization windows

Sources: OpenAI deprecations docs and ChatGPT release notes; Anthropic model deprecations page; Google Gemini API changelog.

The operational pattern is to convert lifecycle announcements into backlog events. Every deprecation notice should trigger three parallel tracks: migration implementation, pricing review, and customer communication. If your margin model depends on a soon-to-retire endpoint, that is not a technical issue alone; it is a pricing and positioning issue. If your core UX mirrors a new platform release, that is not a roadmap coincidence; it is a revenue risk signal.

Many operators misread the market at exactly this junction. They spend months refining features while ignoring upstream lifecycle telemetry, then experience sudden churn as users can get 80-90% of the value natively. The correct stance is proactive: design migration-ready abstractions, measure feature substitutability continuously, and pre-announce upgrades that reposition the offer around outcomes rather than around specific model hooks.

The same lifecycle discipline applies internally, and to your tool stack as much as to your features. Every utility you ship is a candidate for cannibalization by your own next iteration, or by an external maintainer who will eventually do the job better than you. I retired a Bash context-export script (protocol-model-context, March 2025) by replacing it with a coding agent (pcm, May 2025), then retired most of that pattern a year later by collapsing the underlying shape into a skills plus [redacted] substrate and adopting Dicklesworthstone's cass and beads for the parts I no longer needed to own. Internal deprecation is not failure -- it is what a healthy harness looks like in motion. If your portfolio contains no repos you would call retired, you are probably defending feature instances when better substrate, your own or someone else's, is already available.

7. The Agent Harness Layer Is Where Durability Reappears

The feature layer is where edge decays; the harness layer is where edge can compound. A harness is not simply an LLM wrapper. It is an integrated system that owns context ingestion, memory, policy, routing, tool execution, and human correction loops. Better base models increase the harness's effectiveness, because the harness owns the problem framing and decision boundary, not just one model call.

The clearest evidence I have for this comes from my own work. Inside one of my product repos -- one with a real, opinionated domain underneath it -- I developed two artifacts that turned out to be the durable shapes: a domain_slice_template.md for cross-stack feature planning, and per-domain "db-schema" planning pages. A year later the same shapes recur in skills as domain-planner, domain-reviewer, domain-scaffolder, and mmdx -- generalized away from any one product. The repo-specific instances were transient. The patterns compounded.

Table 8. Harness Layer Components and Compounding Mechanics

Layer	What You Own	Why It Improves with Better Models
Context ingestion	User events, docs, calendar, location, messages metadata	Richer models classify and prioritize context better
Memory model	Entity graph, preference graph, historical outcomes	Reasoning upgrades increase memory retrieval quality
Policy/routing engine	Task policy, risk thresholds, escalation logic	Better models reduce false positives and improve routing
Tool orchestration	API actions, retries, fallback providers	Model quality boosts tool-call success and recovery handling
Human override loop	Approval states, correction logs, confidence gates	Correction data compounds into higher precision over time
Feedback telemetry	Error classes, save-time metrics, intervention rates	Continuous tuning benefits from stronger base cognition
Distribution identity	Audience trust, niche positioning, voice	Brand and trust remain outside raw model commoditization

Conceptual architecture for agent-layer products that benefit from model improvement instead of being replaced by it.

This distinction aligns with your claim that builders should feel excited on model release day. That emotional test is strategically useful: if a stronger model makes your product better, your architecture likely sits above the feature blast radius. If a stronger model makes your product unnecessary, your architecture is probably too close to raw inference and too far from durable workflow ownership.

Durable AI businesses increasingly resemble operating systems for decisions, not chat interfaces for prompts. The model is the engine; the harness is the vehicle. New engines should increase your speed, not destroy your business.

In enterprise and prosumer settings, this is also where trust economics emerge. Auditability, override controls, escalation policies, and correction histories are difficult for generalized platform features to replicate at domain depth. These assets are dull from a demo perspective, but they are exactly what converts intermittent utility spend into repeat organizational spend.

8. Wrapper Skills: Renting Substrate from the Few Real Maintainers

Standing on someone else's shoulders is not a new idea. Every meaningful software business in the last forty years has compounded on top of an operating system, a runtime, a database, a queue, a web framework -- code written and maintained by people whose business was not the same as the business consuming it. What is genuinely new for AI agent work is that the layer this paper calls the harness is old enough to start having those maintainers but young enough that there are no clear winners yet. You can already see who is doing the careful work; you cannot yet tell whose substrate will be load-bearing in three years.

The contract that makes this work in practice is what I think of as a wrapper skill -- a SKILL.md file that sits on top of someone else's binary and teaches my agents when to reach for it, how to read its output, and which local conventions to apply. My skills directory currently contains wrappers for three external tools, all of them Dicklesworthstone's: cass (session search and prompt archaeology), cass-memory (procedural memory across agents), and the beads_rust plus beads_viewer pair (graph-aware issue tracking and triage). The wrappers are substantial -- a few hundred lines of SKILL.md each, encoding mode-selection tables, invocation rules, and verification gates -- but they are still smaller than the underlying tools they wrap, and an order of magnitude smaller than what it would cost me to rebuild equivalent functionality. The binary is rented; the discipline is mine.

The honest version of the trajectory: I started by writing my own context exporters, my own prompt rotation utilities, my own session searchers. That work is now quietly displaced. The reason is not that I learned a better philosophy; it is that for the first time there are people doing this substrate well enough to lean on. Dicklesworthstone is the clearest case I touch -- careful, opinionated open-source dev tooling produced not because dev tools are his business but because he cares about the substrate. Mine certainly is not. Yet most of my productive hours now flow through his binaries with my conventions on top. The asymmetry runs in the right direction for both sides.

The pattern is recursive. Even the SKILL.md files themselves -- the wrappers I write -- get reshaped over time. Sometimes by displacement (other people's skill catalogs do the job better), and sometimes by evolution. smart is the clearest case I have. It started in my public opensource/skills collection as a single-answer skill -- ask the one highest-leverage question for whatever the user is working on -- and migrated out in April. Today it lives as my always-global dispatcher with a skill-aware routing table that suggests which sibling skill to invoke next. The original shape (single insight) died. The shape underneath it (routing among siblings) survived and became the actual job. Wrapper skills wrap binaries. Wrapper-skill catalogs wrap other catalogs. And individual skills morph into routers when the real abstraction surfaces. The discipline is the same at every layer: own the policy, rent the substrate.

This is the part most builders skip. They evaluate each new tool through a build-vs-buy lens when the real question is "what shape does this have to fit underneath my skill layer." I maintain a build-vs-clone skill specifically to force that question before any new repo is scaffolded. The decision is rarely about lines of code saved. It is about whether the tool's surface can be wrapped in a day or two of SKILL.md, and whether that wrapper will survive the next model release without rewriting. The most expensive mistake is not "I should have built it." It is "I should have wrapped it and gotten back to my actual business."

Motion also runs the other direction. Sometimes a skill earns graduation into an app. The invocation pattern stabilizes, the surface justifies a UI instead of agent-only access, and the policy that lived inside a SKILL.md becomes the product's business logic. mmdx started as a diagramming skill and now opens its output in a dedicated viewer surface. bookme started as a repo-local SKILL.md for booking workflow and is the booking product on my own site. Most skills should never graduate -- the value lives in keeping them agent-invocable, cheap, and replaceable. But a small minority earn it, when external customers (not just my agents) will benefit from a stable UI on top of the policy. That graduation is the third arrow in the harness layer's motion: substrate flows in as wrapper skills, internal patterns compound into generalized skills, and the most settled skills flow out as products. Knowing which way a given workflow is moving is most of the skill.

The most aggressive version of this pattern is replacing your own cognition with substrate the harness can consult, and the journey there is its own ladder of escalating abstractions. Each rung gives up more of "I do this myself" to the substrate underneath.

The first rung was ask-cascade: a skill that structures clarifying questions to me in dependency order, so agents who do not know what I mean stop and ask, ideally surfacing strategic choices before detail ones. That works, but every clarification is friction on me.

The second rung is the research wiki itself, a forty-three-concept-page substrate that agents query through the wiki skill instead of stopping to ask. When an agent does not know what I mean by "barbell effect" or "operator portfolio," it reads the relevant concept pages. Friction on me drops; friction on substrate quality goes up.

The third rung is wiki-duel: it runs adversarial multi-model duels on a concept before it enters the wiki -- generating opposing positions, scoring them, and pushing back on my framing until the surviving claim is the one that earned its place. The cost of substrate quality moves off me and onto other models.

The fourth rung is wiki-forge: it picks the single highest-leverage concept in the vault, runs the same adversarial duel on it, optionally validates the synthesis against current external reality, and files the result back. The wiki uses itself to improve itself. I review rather than write.

skill-issue closes the equivalent loop on the skill files themselves, compounding quality from real session transcripts. The cannibalization runs at every rung of the ladder. I am replacing myself, deliberately, at progressively higher layers of abstraction.

The compounding mechanic is asymmetric in the right direction. A stronger model does not retire my domain-planner skill; it executes it with fewer corrections. A stronger model does not retire my wiki; it interprets the concept graph with more nuance. A stronger model does not retire cass or beads; it uses them more effectively. The harness layer is precisely the layer where the next model release is good news rather than risk -- whoever maintains the underlying tools.

9. Portfolio Construction: Trade Short Waves, Compound Long Assets

Treating all products as identical long-duration bets is no longer efficient. A portfolio approach is more robust: dedicate part of build capacity to short-wave edge extraction, while reserving explicit allocation for harness, distribution, and data assets that compound across waves.

Table 9. Barbell Portfolio for the Cannibalization Era

Portfolio Sleeve	Allocation	Primary Objective	Expected Horizon
Flash utilities	30%	Exploit model or UX discontinuities quickly	30-180 days
Workflow products	30%	Capture recurring payments from repeat tasks	6-18 months
Agent harness core	25%	Compound proprietary memory and policy assets	2-5 years
Distribution/media moat	10%	Lower launch CAC for each new wave	Persistent
R&D optionality	5%	Prototype frontier ideas before market consensus	Always-on

Illustrative allocation for solo operators or small teams balancing immediate cashflow with durable capability building.

The portfolio view changes decision quality in three ways. First, it normalizes product sunsetting as a healthy outcome when expected payback has already been captured. Second, it prevents over-investment in defensive roadmaps for products with structurally limited duration. Third, it creates an explicit reinvestment path from short-cycle profits into long-cycle assets.

This is analogous to a trading desk funding a long-term strategy book: short-term volatility capture provides operating cash and market intelligence, while a smaller set of durable positions absorbs most compounding gains. In software terms, the durable book includes memory infrastructure, domain datasets, correction logs, and trusted distribution channels.

Table 10. Build Decision Matrix: Speed, Risk, and Durability

Build Choice	Cannibalization Risk	Monetization Speed	Durability	Recommendation
Single-model UI wrapper	Critical	Fast	Low	Ship only if payback target is <30 days
Prompt-packaged assistant	High	Fast	Low-Medium	Sell as bundle and collect upfront
Domain workflow automation	Medium	Moderate	Medium	Good bridge product if tied to outcome metrics
Agent harness with memory and policy	Low-Medium	Moderate	High	Best long-cycle compounding path
Data network + expert feedback loop	Low	Slow	Very High	Durable moat; fund via short-cycle tools

Use this matrix to classify each new idea before development begins.

quadrantChart
    title Model releases should tell you whether to harvest or compound
    x-axis Model releases improve it --> Model releases replace it
    y-axis Low long-term ownership --> High long-term ownership
    quadrant-1 Durable but under-monetized
    quadrant-2 Compound Zone
    quadrant-3 Thin utility
    quadrant-4 Harvest Fast
    Data network + expert feedback loop: [0.10, 0.94]
    Agent harness w/ memory + policy: [0.24, 0.84]
    Wrapper skill on strong substrate: [0.36, 0.68]
    Domain workflow automation: [0.52, 0.56]
    Tool-led advisory offer: [0.58, 0.72]
    Prompt-packaged assistant: [0.76, 0.32]
    Single-model UI wrapper: [0.93, 0.14]

Products in the Harvest Fast quadrant can still produce strong cashflow, but only if priced and operated for a 30-180 day window; products in the Compound Zone justify longer investment because better models improve them rather than replace them.

The matrix also clarifies investor communication and self-governance. If you know a concept sits in "critical cannibalization risk / fast monetization", you can set explicit guardrails: limited engineering investment, hard payback thresholds, and clear sunset criteria. Conversely, if a concept sits in "low risk / slow monetization / high durability," you should expect longer payback and evaluate by compounding telemetry quality rather than immediate MRR.

10. Release-Day Operations: Converting Model Shocks into Revenue

In a fast-cycle market, model launch days are not merely technical events. They are commercial events. Teams that process these shocks quickly can capture outsized demand before market messaging and platform UX normalize. A lightweight but disciplined operating cadence outperforms both ad hoc shipping and heavyweight planning in this window.

Table 11. T+Window Operating Playbook After Major Model Releases

T+Window	Operating Action	Monetization Intent
T+0 to T+24h	Benchmark old vs new model on core user jobs	Detect instantly marketable quality/cost delta
T+24h to T+72h	Ship upgraded prompts, routing, and pricing copy	Capture novelty while search/social attention is high
T+3 to T+14d	Launch narrow use-case landing pages and micro-offers	Convert curiosity traffic into paid cohorts
T+2 to T+6w	Collect correction telemetry and publish comparative outcomes	Build trust and reduce churn
T+6 to T+12w	Decide: double down, bundle, or sunset	Reallocate to higher-edge opportunities

Operational sequence for turning model improvements into shipped value and near-term revenue.

The first 72 hours are disproportionately important. Benchmarking determines whether your current offer should reprice, reposition, or split into a cheaper and a premium tier. Public artifacts during this period matter: comparative output examples, latency and cost deltas, and concrete statements of what the update does for user outcomes. Vague excitement posts waste the window.

The following 2-6 weeks are the telemetry phase. This is when you accumulate the signals that convert one-off launch demand into repeatable decision intelligence: where users still intervene, where output quality failed, what tasks retained willingness-to-pay despite platform improvements, and which cohorts churned after copying baseline improvements from free tools.

Critically, this cadence can coexist with a high quality bar. Rapid does not mean sloppy. It means tight scopes, pre-defined instrumentation, and clear go/no-go thresholds. In practice, this often leads to better product hygiene than a speculative long roadmap because each cycle forces concrete evidence of value.

11. Conclusion: A Practical Doctrine for the AI Edge Economy

Short-half-life AI tools constitute a different asset class, not a dead end. The appropriate doctrine is neither naive permanence nor nihilistic churn, but structured edge harvesting: build fast, monetize quickly, instrument deeply, and recycle gains into durable harness assets.

Five operating principles follow from the evidence. First, assume edge decay by default and design offers for immediate payback. Second, package outcomes rather than features because outcomes survive cloning longer. Third, manage deprecation calendars as revenue variables. Fourth, allocate portfolio capacity explicitly across short-wave cashflow and long-wave compounding. Fifth, treat every major model release as a deploy-and-sell event, not a spectator event.

For builders with strong execution velocity, this regime can be positive-sum. Falling model costs and rising baseline capability reduce the capital required to launch, test, and monetize. The bottleneck shifts to judgment: selecting the right wedge, setting the right pricing clock, and building the right substrate beneath each wave. In this sense, software abundance does not remove edge; it changes edge from possession to process.

The hardest inversion is psychological. Under the old playbook, cannibalization signaled failure. Under the new one, cannibalization can be evidence that you correctly identified value early. If you captured cashflow, learning, and proprietary telemetry before integration, you did not lose the game; you completed the cycle. The compounding question is what your harness learned and what you can deploy next.

For me, after fourteen months of compulsively retiring my own utilities, the substrate I actually run today -- ordered by recent-session invocation count, verified by a cass first-progress-marker pass -- is smart as the always-global dispatcher (the most-invoked skill in my collection), commit for routine commit hygiene, skill-issue for compounding skill quality from real session transcripts, wiki for substrate maintenance and queries, build-vs-clone for placement decisions, the domain-planner / domain-reviewer / domain-scaffolder family for cross-stack planning, Dicklesworthstone's cass for session archaeology and beads for graph-aware issue tracking, divide-and-conquer for parallel orchestration, mmdx for decision-surface artifacts, and wiki-forge for substrate self-improvement passes. The lower-traffic rungs of the cognition ladder from section 8 -- ask-cascade, wiki-duel, cass-memory -- get narrated more than they get invoked, which is honest data worth keeping visible. prompt-reviewer was in earlier drafts of this list with zero recent invocations to back the claim; the same cass pass caught the overclaim and it has been removed. None of this survives forever. All of it improves when the next model lands. That is the only test that matters now.

References

Stanford Human-Centered AI. (2025). 2025 AI Index Report. https://hai.stanford.edu/ai-index/2025-ai-index-report

Stanford Human-Centered AI. (2025). Chapter 1: The AI Research and Development Landscape. https://hai.stanford.edu/sites/default/files/2025-04/chapter_1_the_ai_research_and_development_landscape.pdf

Stanford Human-Centered AI. (2025). Chapter 4: AI in the Economy. https://hai.stanford.edu/sites/default/files/2025-04/chapter_4_ai_in_the_economy.pdf

OpenAI. (2026). API Pricing. https://openai.com/api/pricing/

OpenAI. (2026). GPT-5.3-Codex Model. https://platform.openai.com/docs/models/gpt-5.3-codex

OpenAI. (2026). API Deprecations. https://platform.openai.com/docs/deprecations

OpenAI Help Center. (2026). ChatGPT Release Notes. https://help.openai.com/en/articles/6825453-chatgpt-release-notes

Anthropic. (2026). Pricing. https://docs.anthropic.com/en/docs/about-claude/pricing

Anthropic. (2026). Claude Opus 4.6. https://www.anthropic.com/claude/opus

Anthropic. (2026). Claude Sonnet 4.6. https://www.anthropic.com/claude/sonnet

Anthropic. (2026). Model Deprecations. https://docs.anthropic.com/en/docs/about-claude/model-deprecations

Google AI for Developers. (2026). Gemini API Pricing. https://ai.google.dev/gemini-api/docs/pricing

Google AI for Developers. (2026). Gemini API Changelog. https://ai.google.dev/gemini-api/docs/changelog

Brynjolfsson, E., Li, D., and Raymond, L. R. (2023). Generative AI at Work (NBER Working Paper No. 31161). https://www.nber.org/papers/w31161

Bick, A., Blandin, A., and Deming, D. (2024). The Rapid Adoption of Generative AI (NBER Working Paper No. 32966). https://www.nber.org/papers/w32966

GitHub. (2022). Research: Quantifying GitHub Copilot's impact on developer productivity and happiness. https://github.blog/news-insights/research/research-quantifying-github-copilots-impact-on-developer-productivity-and-happiness/

Carta. (2025). The startup shutdown surge continues in 2024. https://carta.com/data/the-startup-shutdown-surge-continues-in-2024/

TechCrunch. (2025, January 22). AI apps saw over $1 billion in consumer spending in 2024. https://techcrunch.com/2025/01/22/ai-apps-saw-over-1-billion-in-consumer-spending-in-2024/

TechCrunch. (2025, July 30). GenAI apps doubled their revenue, grew to 1.7B downloads in first half of 2025. https://techcrunch.com/2025/07/30/gen-ai-apps-doubled-their-revenue-grew-to-1-7b-downloads-in-first-half-of-2025/

Suggested citation: Baratta, R. (2026). “Short-Half-Life AI Tools: Edge Harvesting as the New Software Business Model.” Buildooor Research Brief, March 2026.

Correspondence: buildooor@gmail.com