Real problems worth solving
Browse frustrations, pains, and gaps that founders could tackle.

After a hailstorm, an insurance adjuster drives to the property, climbs on the roof, counts damaged shingles by hand, takes 50-100 photos, hand-sketches the roof dimensions, and enters it all into Xactimate (the industry-standard estimating software). Back at the office, they spend 2-3 hours matching photos to roof sections and calculating replacement costs. A single residential roof claim takes 4-8 hours of adjuster time. After a major storm, an adjuster is assigned 5-10 claims per day, working 12+ hour days for weeks. So what? US insurers process 4-5 million property claims per year. The average roof claim costs $800-1,500 in adjuster labor (time plus travel). An agent that could process drone imagery of a damaged roof — detect damaged shingles, measure affected areas, auto-generate an Xactimate estimate — would cut per-claim adjuster time from 4-8 hours to 30-60 minutes. The technology components exist separately (drone imagery, computer vision for damage detection, Xactimate API) but nobody has connected them into an end-to-end agent workflow. Why doesn't this agent exist? Xactimate (by Verisk) is the monopoly estimating tool and its API is restricted to approved partners. Drone imagery resolution varies by altitude and camera. Insurance carriers require specific documentation formats that vary by carrier. And adjusters are independent contractors who resist automation that threatens their per-claim income ($300-600 per claim).
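Structurally, the end-to-end agent the adjuster entry describes is a three-stage pipeline. A minimal sketch, where every function is a placeholder for a component the entry says already exists separately; none of the names, figures, or rates below are real APIs or prices:

```python
def detect_damage(drone_images):
    """Stand-in for the computer-vision damage detector (output is made up)."""
    return [{"section": "south slope", "damaged_shingles": 212}]

def measure_areas(detections):
    """Stand-in for photogrammetry: attach the affected area to each detection."""
    return [{**d, "area_sqft": 640} for d in detections]

def draft_estimate(measurements, rate_per_sqft=7.50):
    """Stand-in for estimate generation; the replacement rate is illustrative."""
    return sum(m["area_sqft"] * rate_per_sqft for m in measurements)

# Wire the three stages together: imagery in, draft estimate out.
estimate = draft_estimate(measure_areas(detect_damage(["DJI_0042.jpg"])))
print(f"draft estimate: ${estimate:,.2f}")
```

The hard part is not this plumbing; it is that each stand-in hides a restricted API, a carrier-specific format, or an accuracy requirement, exactly as the entry lists.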
A small CPA firm (2-5 accountants) prepares 300-500 individual tax returns per year. For each client, they receive bank statements (PDF), brokerage statements (PDF), W-2s (PDF or paper), 1099s (various formats), and mortgage interest statements. They manually open each PDF, read the numbers, and type them into tax software (Drake, Lacerte, UltraTax). A single return with a brokerage account might have 40+ stock transactions that are hand-entered one by one. This takes 45-90 minutes per return for data entry alone, before any actual tax planning begins. So what? During January-April, a 3-person CPA firm does 1,500+ hours of pure data entry. At $150/hour billing rate, that is $225K in revenue generated by mindless typing. The CPAs are exhausted, make errors (transposed numbers are the #1 cause of amended returns), and have no time for the advisory work that clients actually value. An agent that could ingest a folder of PDFs and auto-populate a tax return in Drake or Lacerte would save 60% of per-return preparation time. Why doesn't this agent exist? Tax software (Drake, Lacerte, UltraTax) has no import API — data must be entered through the GUI. Brokerage statement PDFs have wildly different layouts (Fidelity, Schwab, Vanguard all different). OCR gets 95% accuracy on numbers but a single wrong digit in a cost basis changes the tax liability by thousands. The accuracy bar is 100% — a 95% accurate agent creates more work (finding and fixing errors) than manual entry.
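The 100%-accuracy bar can be made concrete. Assuming roughly 140 extracted numbers per return (an illustrative count: 40 transactions with about 3 numbers each, plus other fields; not real firm data), per-field accuracy compounds like this:

```python
# Why 95% per-field OCR accuracy fails for tax prep: the probability of an
# error-free return decays exponentially with the number of extracted fields.

def p_error_free(fields: int, per_field_accuracy: float) -> float:
    """Probability that every extracted field on a return is correct."""
    return per_field_accuracy ** fields

FIELDS = 40 * 3 + 20  # ~140 values: 40 transactions x 3 numbers, plus 20 more

for acc in (0.95, 0.99, 0.999):
    print(f"{acc:.1%} per-field accuracy -> "
          f"{p_error_free(FIELDS, acc):.2%} chance of a clean return")
```

At 95% accuracy, almost no return comes out clean, which is why a 95%-accurate agent creates review work instead of saving it; the bar only clears at three-nines-level field accuracy.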
An immigration attorney preparing an H-1B petition must fill out Form I-129 (Petition for Nonimmigrant Worker), Form I-129 Data Collection Supplement, Labor Condition Application (LCA via FLAG system), Form G-28 (Notice of Entry of Appearance), and supporting evidence letters. The client's name, address, passport number, job title, and employer details are entered by hand into each form separately — the same 30 fields, 15 times. USCIS forms are fillable PDFs with no import/export capability. The FLAG system for LCA filing has its own separate data entry. So what? A single H-1B petition takes 6-10 attorney hours at $300-500/hour. At least 3-4 of those hours are pure data re-entry across forms — not legal analysis, not strategy, not writing arguments. An immigration firm handling 200 H-1B petitions per year wastes 600-800 hours on data entry alone. That is $180K-400K in attorney time spent on copy-paste. An agent that could read a client intake form and auto-populate all 15 USCIS forms would save 50% of the per-petition cost. Why doesn't this agent exist? USCIS forms change layout yearly without notice. The FLAG system for LCA has a web interface that resists automation (CAPTCHAs, session timeouts). Each form has different field names for the same data (beneficiary name vs petitioner name vs applicant name). Immigration law firms are small (2-10 attorneys) and cannot afford custom software development. Existing immigration software (INSZoom, Docketwise) helps with case management but still requires manual form filling.
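The "same 30 fields, 15 times" problem is, at its core, one intake record plus a per-form field-name map. A sketch of that shape; the form and field names below are illustrative, not actual USCIS PDF field identifiers:

```python
# One client intake record, entered once.
INTAKE = {
    "full_name": "Asha Rao",          # illustrative client data
    "passport_no": "Z1234567",
    "job_title": "Software Engineer",
}

# Each form calls the same datum something different (beneficiary vs
# applicant vs worker), so each gets its own alias map.
FIELD_MAPS = {
    "I-129": {"beneficiary_name": "full_name", "passport": "passport_no"},
    "G-28":  {"applicant_name": "full_name"},
    "LCA":   {"worker_name": "full_name", "occupation": "job_title"},
}

def fill(form: str) -> dict:
    """Produce the key/value pairs to write into one form."""
    return {field: INTAKE[src] for field, src in FIELD_MAPS[form].items()}

print(fill("LCA"))  # same intake data, the LCA's own field names
```

The mapping itself is trivial; what keeps the agent from existing is everything around it, per the entry: yearly layout changes, CAPTCHAs on FLAG, and firms too small to maintain the maps.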
Your landlord requires renters insurance. You buy a $15/month policy from Lemonade covering $30K in personal property. You feel protected. Then a fire breaks out in the unit below yours in your 1908 building (SF has 45,000+ pre-1920 buildings with aging electrical). Your unit has smoke and water damage. You cannot live there for 3 months while repairs happen. Your renters insurance covers $30K in damaged belongings and $1,500/month in temporary housing (called Additional Living Expenses or ALE). But a temporary furnished apartment in SF costs $4,500-6,000/month. Your ALE coverage runs out in 1 month. For the remaining 2 months, you pay $4,500/month out of pocket while also paying rent on your damaged apartment (California law: you must pay rent if the unit is 'partially habitable'). Your total out-of-pocket cost for someone else's fire: $9,000+. So what? Standard renters insurance policies have ALE limits of $10-20K, designed for average US rents ($1,500/month). In SF, where temporary housing costs $4,500+/month, your ALE coverage lasts 2-4 months instead of the 6-12 months you might need. The policy that your landlord required you to buy is nearly useless for the most common SF renter disaster (fire in an old building). You paid premiums for years for protection that evaporates when you actually need it. Why does this persist in the first place? Renters insurance is a commodity product — insurers compete on price, not coverage adequacy. Lemonade, State Farm, and others sell the same $15-20/month policy nationally without adjusting ALE limits for SF cost of living. Landlords require it for liability protection (for themselves), not for adequate tenant coverage. No landlord checks if the policy's ALE limit is sufficient for SF temporary housing costs.
You move to a neighborhood with zero street parking (Russian Hill, North Beach, Nob Hill). Your building has no garage. You need a parking spot. There is no centralized marketplace for parking spots. You post on Craigslist 'ISO parking spot Russian Hill.' Nothing. You walk around your neighborhood looking for private garages with empty spaces and leave notes on doors. You ask your neighbors. After 3 weeks, someone knows someone who has a spot in their building for $350/month, cash only, no written agreement. You pay $350/month in cash with no receipt, no lease, and no protection if they give your spot to someone else next month. So what? SF has approximately 280,000 garage parking spaces in residential buildings, many of which sit empty because the resident does not own a car. Meanwhile, 40% of SF residential streets have no available parking after 6pm. Empty spots and desperate drivers exist on the same block but have no way to find each other. The entire parking spot rental market operates informally — cash, handshakes, Craigslist posts that expire. No platform aggregates available spots, handles payments, or provides any tenant protection. Why does this persist in the first place? Parking spots in SF are legally complex: some are deeded to specific units, some are common area, some are tandem. HOA rules often prohibit renting spots to non-residents. Building managers do not want liability. So the market stays underground — cash transactions, no paper trail, no platform willing to navigate the legal complexity.
You tour a beautiful 1BR in a 1920s building in the Haight. Fresh paint, new appliances. You sign the lease. Two months later, you discover: the building has 14 open DBI (Department of Building Inspection) complaints, the unit below you had a sewage backup last year, the building failed its most recent fire inspection, and your unit has an unpermitted bathroom addition that technically makes it an illegal unit. All of this information exists in SF public records — DBI complaints, fire inspection reports, housing inspection records — but it is scattered across 3-4 different city databases, none of which are linked to rental listings, and none of which are easy to search. So what? You made a $40K+ annual commitment (rent) without knowing the building is a maintenance disaster. Mold, pest infestations, structural issues, and fire safety violations are all knowable before signing if you could access the data. But the data is effectively hidden: DBI complaints are searchable by address but the interface is from 2003 and returns raw permit data that requires expertise to interpret. Fire inspection results are not online. Housing inspection reports require a FOIA request. Landlords have no obligation to disclose code violations. Why does this persist in the first place? SF's building data is managed by DBI, SFFD (fire), and DPH (health) in separate systems that do not talk to each other. The city has no 'building health score' or unified property report. Landlords actively benefit from information asymmetry — a building with 14 open complaints rents at the same price as a well-maintained building because tenants cannot tell the difference until after move-in.
You have lived in a rent-controlled 2BR in Noe Valley for 12 years paying $1,800/month. Market rate for your unit is $4,500. Your landlord files an Ellis Act eviction — a California state law that allows landlords to evict all tenants by withdrawing the property from the rental market. You get 120 days notice (1 year if you are elderly or disabled). You receive a relocation payment of $7,000-14,000. You now need to find a new apartment at market rate: your housing cost triples overnight. The landlord waits 5 years (the minimum re-rental restriction), then re-rents the unit at $4,500 — a $2,700/month increase. So what? The Ellis Act was designed to let landlords who genuinely want to stop being landlords exit the business. In practice, it is used as an investment strategy: buy a building with below-market rent-controlled tenants, Ellis Act evict everyone, wait the mandatory period (or convert to condos), then re-enter the market at 2-3x the previous rents. Between 2010-2023, SF lost approximately 5,500 rent-controlled units to Ellis Act evictions. Each displaced tenant is a person who built a life in a neighborhood and is forced to leave the city entirely because no comparable rent-controlled unit exists. Why does this persist in the first place? The Ellis Act is a STATE law that SF cannot override — the city has tried repeatedly and been blocked. Real estate investors lobby Sacramento to preserve the Ellis Act because it is the primary mechanism to convert rent-controlled buildings into market-rate assets. The relocation payments ($7-14K) are a tiny fraction of the windfall the landlord captures ($2,700/month × 12 months × decades = $500K+ in additional rent revenue).
You need a roommate for your 2BR in the Sunset ($1,600/person). You post in 'SF Housing, Sublets & Roommates' on Facebook. You get 40 messages. Each person sends 2-3 sentences about themselves. You have no way to verify employment, rental history, lifestyle habits, cleanliness, or whether they have ever been evicted. You invite 5 strangers to your apartment for 15-minute interviews. You pick the person who seemed nicest in a 15-minute conversation. You sign a lease together. Three months later they stop paying rent, bring their partner to live in the living room, and you discover they were evicted from their last place. So what? In SF, where the median 1BR is $3,200, most people under 35 need roommates. You are making a $20-40K annual financial commitment (your share of rent) with a person you vetted for 15 minutes based on vibes. There is no rental history check for roommates (only landlords can run these), no way to contact previous roommates for references, and no platform that verifies the basic facts someone claims about themselves. The entire roommate matching process has less verification than a Tinder date. Why does this persist in the first place? Roommate platforms (SpareRoom, Roomi) exist but are just listing boards — they verify nothing. Running a credit/background check on a potential roommate requires their SSN and consent, which is awkward to ask for before you even know if you like each other. Facebook groups have zero verification. The person who scammed 3 previous roommates looks identical to a genuine person in a Facebook message.
You move out of your SF apartment after 3 years. You cleaned thoroughly, patched nail holes, and left it in better condition than move-in. California law requires the landlord to return your deposit (or an itemized deduction statement) within 21 days. Day 21 passes. Nothing. You email. No response. Day 30. You send a demand letter citing California Civil Code 1950.5. The landlord responds with a $1,200 deduction for 'cleaning' and 'paint touch-up' on your $4,800 deposit — despite normal wear and tear not being deductible under California law. You now choose: accept the $3,600 return and lose $1,200 you are owed, or file in small claims court, take a day off work, and hope the judge agrees. Most people take the $3,600. So what? The average SF security deposit is $3,500-7,000 (one month's rent). Landlords illegally withhold an estimated $50-100M annually in SF alone by betting that tenants will not go to small claims court over $500-2,000. The 21-day deadline is routinely violated with no consequence unless the tenant files a lawsuit. Wrongful deductions for normal wear and tear are the norm, not the exception. Why does this persist in the first place? The penalty for violating the 21-day deadline is up to 2x the deposit in bad faith cases, but only if the tenant sues. The Rent Board handles rent disputes but NOT deposit disputes. There is no administrative enforcement — every deposit dispute must go through small claims court. Landlords know that 90%+ of tenants will not file because the process costs a day of work plus emotional energy.
A good 1BR under $3,000 in SF gets 50+ inquiries within 4 hours of posting. By the time Zillow syndicates the listing from the property manager's website (24-48 hour delay), the unit already has 20 applications. Craigslist listings are manually posted and often left up for days after the unit is taken because landlords forget to delete them. You spend 30 minutes writing a personalized inquiry for a listing that was rented 2 days ago. So what? The entire SF rental search process is built on stale data. Renters waste 10-15 hours per week responding to listings that are already gone. They schedule viewings for apartments that are already taken. They tailor cover letters explaining why they are a great tenant for a unit that has a signed lease. The time-to-lease for a desirable SF apartment is 24-72 hours, but every major listing platform operates on a 48-72 hour refresh cycle. The search tools are slower than the market. Why does this persist in the first place? Listing platforms (Zillow, Apartments.com) aggregate from property management software (AppFolio, Buildium) via daily batch syncs, not real-time feeds. Craigslist is manually posted with no integration to any leasing system. Small landlords (who own 60%+ of SF units) have no leasing software at all — they post on Craigslist, collect applications via email, and never update the listing status. There is no MLS equivalent for rentals that would provide real-time availability across all listings.
You find a 1BR for $2,400 in a Victorian on Divisadero. Great price. You ask the landlord if it is rent-controlled. They say 'I'm not sure' or 'no.' You sign a 1-year lease. After move-in, you discover the building was built in 1923 — it IS rent-controlled under SF Rent Ordinance (buildings before 1979). Your landlord has been raising rent 8% annually on previous tenants who did not know their rights. So what? Rent control is the single most valuable renter protection in SF — it caps annual increases at 60% of CPI (roughly 2-4%/year). But there is no easily searchable public database where you can type in an address and see: is this unit rent-controlled? what is the legal maximum rent? what was the last tenant paying? The SF Rent Board has records but they are not digitized or searchable by address. You must physically visit or call the Rent Board to verify. Landlords exploit this information asymmetry to charge new tenants above the legal rent or to claim units are not rent-controlled when they are. Why does this persist in the first place? The SF Rent Board operates on a $7M annual budget with outdated systems. Their database is partially digitized but not publicly searchable. Landlords are required to register rent-controlled units but there is no penalty for non-registration and no cross-referencing with building permit data. The city has the data (building age from assessor records + Rent Board registrations) but has never built the public lookup tool.
A $3,500/month 1BR in SOMA requires you to prove $140,000/year in income (40x rent). You are a senior engineer making $180K base + $150K/year in RSUs. Your total comp is $330K but the landlord only counts your $180K base salary because stock vesting is not guaranteed income. Your offer letter shows $330K but the landlord wants pay stubs showing $15K/month gross, not $27.5K. Your pay stubs show $15K. The landlord says your income is insufficient for a $3,500 apartment despite you earning $330K. So what? Tech workers earning $250-400K in total comp are getting rejected from apartments because 40-60% of their compensation is in equity that landlords refuse to count. They end up in worse apartments than they can afford, or they pay 6-12 months upfront (tying up $21-42K in cash) to compensate for the income gap. Meanwhile the landlord rents to someone with a higher base salary but lower total comp. Why does this persist in the first place? Landlords use income verification services (Plaid, Snappt) that pull bank deposits and pay stubs. RSU vesting shows up as irregular large deposits that look like one-time events, not income. No verification service reliably classifies RSU income as recurring. Landlords are risk-averse — they would rather have a tenant with steady $160K salary than one with $120K salary + $200K in stock that could crash.
You find a 1BR on Craigslist in the Mission for $2,800. You schedule a viewing. 15 other people show up. You apply that day — $45 application fee per person, non-refundable. The landlord collects $45 from all 15 applicants ($675 total), picks one, and the other 14 lose their money with zero recourse. You apply to 5 apartments before getting accepted. That is $225 gone before you even sign a lease. For a couple applying together, double it: $450. So what? Application fees are a transfer of wealth from desperate renters to landlords with zero accountability. Landlords have no obligation to disclose how many applications they have already received or whether the unit is effectively already taken. Some landlords collect applications for units they have no intention of renting immediately, pocketing fees as passive income. California AB 2559 (2024) allows reusable tenant screening reports, but landlords are not required to accept them. Why does this persist in the first place? Credit check services (TransUnion SmartMove, RentPrep) charge landlords $25-40 per screening, so landlords pass the cost to applicants and pocket the difference. The $45 fee cap exists in some cities but enforcement is complaint-driven and renters in a housing crisis do not risk antagonizing potential landlords by filing complaints.
During open enrollment, you must choose between 4-8 health insurance plans. Each has different premiums, deductibles, copays, coinsurance rates, out-of-pocket maximums, formulary tiers, and provider networks. You take 2 medications and see 3 specialists. To compare plans accurately, you need to: check if each doctor is in-network for each plan (call the insurance company or search their broken provider directory), check which formulary tier each medication is on for each plan (download 200-page PDFs), and calculate your expected annual cost under each plan based on your usage patterns. Nobody does this math. People pick the plan with the lowest premium and discover in February that their medication costs $400/month because it is Tier 3 on the new plan. So what? Americans collectively make $30B+ in suboptimal health insurance decisions annually. Choosing the wrong plan costs a family $1,000-5,000/year in unnecessary spending. This is the single largest financial decision most people make annually, and they make it with almost no usable information. Why does this persist in the first place? Insurance companies benefit from confusion — plans that look cheap (low premium) but have bad coverage for your specific needs generate more revenue. CMS requires plan information to be published but not in a standardized, machine-comparable format. Provider directories are inaccurate 30-50% of the time (doctors listed as in-network who are not). No tool can reliably answer 'what will Plan X cost ME given MY doctors and medications' because the underlying data is unreliable.
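The math nobody does is a small expected-cost model per plan. A sketch with made-up plan figures and a flat 20% coinsurance assumption (real plans have tiers, copays, and network carve-outs this ignores):

```python
def annual_cost(premium_mo, deductible, oop_max, expected_claims):
    """Rough model: you pay claims up to the deductible, then 20%
    coinsurance, capped at the out-of-pocket max, plus 12 premiums."""
    if expected_claims <= deductible:
        oop = expected_claims
    else:
        oop = deductible + 0.20 * (expected_claims - deductible)
    return premium_mo * 12 + min(oop, oop_max)

# Illustrative plans, not real products.
plans = {
    "Low premium":  dict(premium_mo=220, deductible=6000, oop_max=8000),
    "High premium": dict(premium_mo=450, deductible=1000, oop_max=4000),
}

expected_claims = 9000  # 2 medications + 3 specialists adds up fast
for name, p in plans.items():
    print(f"{name}: ${annual_cost(expected_claims=expected_claims, **p):,.0f}/year")
```

Under these assumed numbers the "cheap" low-premium plan costs about $1,200 more per year for this patient, which is exactly the February surprise the entry describes.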
You hire a contractor to renovate your bathroom. They seem professional, give a reasonable quote, and take a 30% deposit ($4,500 on a $15,000 job). They start demolition. Then they disappear for 2 weeks. You text them — no response. You call — voicemail. They show up briefly, do 2 hours of work, then vanish again. Your bathroom is a gutted shell. This drags on for 3 months instead of 3 weeks. You have no leverage because they already have your money, and hiring a new contractor to finish someone else's work costs 50% more. So what? This is not a rare horror story — it is the default experience. HomeAdvisor data shows 55% of home renovation projects exceed their timeline by 2x or more. The core problem is asymmetric commitment: the contractor has your money and many simultaneous projects, you have a demolished bathroom and no alternatives. Reviews on Yelp/Google are unreliable because contractors ask happy clients to review and unhappy clients have already moved on. Licensing boards exist but only handle egregious fraud, not chronic schedule overruns. Why does this persist in the first place? Contractors face zero financial penalty for delays. The deposit structure (30-50% upfront) eliminates their urgency. There is no escrow system where payments release based on milestone completion. Bonding exists for commercial projects but not for residential renovations under $50K. And the market is structurally short on contractors — demand exceeds supply by 30%+, so even bad contractors stay fully booked.
A freelance designer finishes a project. They open a Google Doc invoice template, manually fill in hours, rates, line items, and client details. They export to PDF. They email it. Two weeks later, no payment. They send a follow-up email. Another week. They send another. The client pays via bank transfer with no reference number. The freelancer manually matches the payment to the invoice in a spreadsheet. At tax time, they export their spreadsheet to their accountant, who finds 12 errors. This cycle repeats for every client, every month. So what? The average freelancer has 3-7 active clients and sends 5-15 invoices per month. At 20-30 minutes per invoice cycle (create + send + chase + reconcile), that is 5-8 hours per month of unpaid administrative work. For a freelancer billing $100/hour, that is $500-800/month in lost revenue — not from missing work, but from the overhead of getting paid for work already done. Why does this persist in the first place? Tools exist (FreshBooks, QuickBooks Self-Employed, Wave) but they solve individual pieces, not the full cycle. You still manually create each invoice, you still manually chase late payments (automated reminders help but clients ignore them), and you still manually reconcile bank transactions. No tool connects 'hours tracked' → 'invoice generated' → 'payment received' → 'books updated' → 'tax-ready P&L' as a single automated flow for a freelancer with multiple clients across different payment methods.
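The reconciliation step, matching an untagged bank transfer to an open invoice, is the piece the spreadsheet workflow gets wrong. A minimal amount-based matcher under the assumption that amounts are distinctive; real tools would also use dates and payer names (all records below are illustrative):

```python
# Open invoices the freelancer is waiting on (illustrative data).
OPEN_INVOICES = [
    {"id": "INV-014", "client": "Acme", "amount": 1200.00},
    {"id": "INV-015", "client": "Beta", "amount": 850.00},
]

def reconcile(deposit_amount: float, tolerance: float = 0.01):
    """Return the id of the open invoice matching a bank deposit, or None."""
    for inv in OPEN_INVOICES:
        if abs(inv["amount"] - deposit_amount) <= tolerance:
            return inv["id"]
    return None

print(reconcile(850.00))  # matches INV-015
print(reconcile(900.00))  # no match: flag for human review
```

Each link in the chain (hours to invoice, invoice to reminder, deposit to invoice) is this simple in isolation; the gap the entry identifies is that no tool owns the whole chain.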
A parent whose child has a peanut allergy picks up a granola bar at the grocery store. They flip it over and read the ingredient list — 30+ items in 6-point font. They look for bolded allergen warnings. The label says 'may contain traces of tree nuts' but the child is allergic to peanuts specifically, not tree nuts. Is this safe? They Google the brand name + 'peanut allergy.' They find conflicting Reddit posts. They put the granola bar back and buy the same brand they always buy. This happens 5-10 times per grocery trip, adding 20-30 minutes per shopping session. So what? 32 million Americans have food allergies, including 5.6 million children. Their families restrict their diets to a tiny set of 'known safe' products because verifying new products is so time-consuming and anxiety-inducing. Kids eat the same 10 foods for years because parents cannot efficiently vet new options. Allergic reactions from mislabeled or misunderstood food send 200,000 people to the ER annually. Why does this persist in the first place? FDA labeling law (FALCPA) requires declaration of the top 9 allergens but 'may contain' / 'processed in a facility' warnings are voluntary and inconsistent — companies use them as legal protection regardless of actual cross-contamination risk. There is no standardized, machine-readable allergen database for packaged foods. Barcode scanning apps (Yummly, Fooducate) exist but have incomplete allergen data and do not account for 'may contain' warnings or facility-level cross-contamination.
A US-based freelancer invoices a client in Germany for $2,000. The client initiates a wire transfer. The payment passes through 2-4 correspondent banks, each taking 1 day and charging $10-25 in fees. The FX conversion adds a 1-3% hidden markup on the exchange rate. The freelancer receives $1,880-1,940 five days later with no explanation of which bank took how much. They cannot track the payment in transit — it just disappears for days. So what? There are 400+ million freelancers globally and millions of small businesses that trade internationally. Each pays $25-50 per wire and loses 1-3% on FX, on payments that are already small enough that fees are a significant percentage. A freelancer receiving ten $2,000 international payments per year loses $500-800/year to fees alone. And the 3-5 day delay creates cash flow uncertainty that forces small businesses to maintain larger cash reserves. Why does this persist in the first place? The SWIFT network was built in 1973 and operates through correspondent banking — each bank in the chain takes a day and a fee. Wise (TransferWise) and similar fintechs solve the fee problem by holding local currency pools, but they are not universally accepted by businesses (many corporate AP departments only send via bank wire). Stablecoin rails (USDC) are instant and near-free but no legitimate invoicing/accounting system integrates them. The structural blocker is that the sender's bank chooses the rails, not the receiver.
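The fee math decomposes cleanly into flat per-hop fees plus a percentage FX markup. The hop counts and fee splits below are assumptions chosen to land inside the entry's $1,880-1,940 range:

```python
def net_received(amount, hops, fee_per_hop, fx_markup):
    """What arrives after correspondent-bank fees and the FX markup."""
    after_fees = amount - hops * fee_per_hop
    return after_fees * (1 - fx_markup)

# Best case: 2 intermediary banks, modest fees, 1.5% FX markup.
best = net_received(2000, hops=2, fee_per_hop=15, fx_markup=0.015)
# Worst case: 3 intermediaries, high fees, 2.5% FX markup.
worst = net_received(2000, hops=3, fee_per_hop=25, fx_markup=0.025)

print(f"freelancer receives ${worst:.0f}-${best:.0f} of the $2,000")
```

Because the fees are flat but the invoice is small, the effective loss here is 3-6% per payment, which is why the problem bites freelancers harder than large exporters.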
A restaurant owner knows their total food cost percentage (typically 28-35% of revenue) but cannot tell you which specific dishes are profitable and which are losing money. To calculate per-dish profitability, they would need to: track the exact cost of every ingredient at current prices (which change weekly), account for waste and prep loss, calculate labor cost per dish, and cross-reference with POS sales data. Nobody does this because it requires manually updating ingredient costs weekly across 40-80 menu items. So what? Restaurants operate on 3-5% net margins. A single unprofitable dish served 20 times per day can erase the entire margin. But owners keep underpriced dishes on the menu for months because they have no per-item cost visibility. Menu engineering — the practice of optimizing menu pricing and placement based on profitability data — is taught in every hospitality program but practiced by almost nobody because the data collection is impossibly tedious. Why does this persist in the first place? POS systems (Toast, Square, Clover) track sales data perfectly but do not track food costs. Inventory systems (MarketMan, BlueCart) track purchase costs but do not link them to specific dishes. Recipe costing software (Galley, Apicbase) exists but requires manual recipe entry for every dish. No system automatically connects purchase invoices → ingredient costs → recipes → POS sales to show per-dish profit in real time.
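The join that never happens (purchase invoices to ingredient costs to recipes to POS sales) is small once the data is linked. A sketch with illustrative prices and one made-up recipe, including a flat waste factor as a stand-in for prep loss:

```python
# Latest per-unit ingredient costs, as they would come off purchase invoices.
INGREDIENT_COST = {"chicken_lb": 3.80, "cream_qt": 4.25, "pasta_lb": 1.10}

# Recipe: quantity of each ingredient per plate (illustrative).
RECIPES = {
    "chicken alfredo": {"chicken_lb": 0.4, "cream_qt": 0.25, "pasta_lb": 0.3},
}

MENU_PRICE = {"chicken alfredo": 18.00}

def plate_cost(dish, waste_factor=1.10):
    """Ingredient cost per plate, padded 10% for prep loss and waste."""
    raw = sum(INGREDIENT_COST[i] * qty for i, qty in RECIPES[dish].items())
    return raw * waste_factor

def margin(dish):
    """Contribution margin per plate before labor and overhead."""
    return MENU_PRICE[dish] - plate_cost(dish)

d = "chicken alfredo"
print(f"{d}: plate cost ${plate_cost(d):.2f}, margin ${margin(d):.2f}")
```

The tedium the entry describes lives in keeping INGREDIENT_COST current weekly across 40-80 dishes, which is exactly the step that wants to be automated from invoices rather than typed.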
A landlord with 3-15 rental units collects rent via Venmo/Zelle with no automated tracking, receives maintenance requests via text message with no ticket system, stores lease expiration dates in a spreadsheet they forget to check, and calculates security deposit deductions by hand. When a tenant disputes a charge, the landlord scrolls through 6 months of text messages to find the relevant conversation. So what? Small landlords (1-20 units) own 48% of all rental units in the US — roughly 23 million units. They are running a real business on consumer tools designed for friends splitting dinner. Missed rent reminders, lost maintenance requests, forgotten lease renewals, and security deposit disputes cost small landlords an estimated $3,000-8,000 per year in lost rent and legal exposure. Why does this persist in the first place? Property management software (Buildium, AppFolio, Rent Manager) is designed for 50+ unit portfolios and costs $200-500/month — absurdly expensive for someone with 5 units generating $8K/month. The free tools (Avail, TurboTenant) cover rent collection but not maintenance tracking, lease management, or communication in one place. No product serves the 3-15 unit landlord with an all-in-one tool at a price point that makes sense for their scale.
A patient fills out a 3-page intake form on paper or PDF. A staff member sits at a computer and types each field — name, DOB, allergies, medications, insurance ID, 40+ fields — into the EHR system one by one. This takes 10-15 minutes per patient. A busy clinic does this 50+ times per day. That is 8-12 hours of pure data entry per day, done by humans, in 2026. So what? This is not a technology problem that lacks solutions — OCR and form parsing exist. The problem is that every PDF form has a different layout, every EHR system has a different input format, and the accuracy requirement is 100% (a wrong medication or allergy can kill someone). Generic OCR gets 90-95% field accuracy, which means 2-4 errors per form, which is unacceptable for medical data. So humans do it manually to guarantee correctness. Why does this persist in the first place? EHR vendors (Epic, Cerner) charge millions for integration. Small clinics cannot afford custom integrations. PDF forms are not standardized — every insurance company, every state Medicaid program, every specialty practice uses a different form layout. The combination of high-accuracy requirements + non-standardized inputs + expensive integration = manual data entry remains the cheapest option for small practices.
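The entry's arithmetic, made explicit: at the stated 90-95% per-field accuracy across 40+ fields and 50 patients a day, the error volume is what makes OCR a non-starter here.

```python
# Error volume at the accuracy figures and counts given in the text.
FIELDS_PER_FORM = 40
PATIENTS_PER_DAY = 50

for accuracy in (0.90, 0.95):
    errors_per_form = FIELDS_PER_FORM * (1 - accuracy)
    errors_per_day = errors_per_form * PATIENTS_PER_DAY
    print(f"{accuracy:.0%} accuracy: ~{errors_per_form:.0f} errors/form, "
          f"~{errors_per_day:.0f} wrong medical fields/day")
```

A hundred-plus wrong allergy or medication fields per clinic per day is why "95% accurate" rounds to "unusable" in this domain, and why humans still type.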
You want to cancel a SaaS subscription. You click Settings. There is no cancel button. You click Billing. No cancel button. You click Account. There is a tiny 'Manage subscription' link at the bottom. It opens a new page. 'Are you sure? You will lose access to X.' Click continue. 'Before you go, would you like to downgrade instead?' No. 'Can you tell us why you are leaving?' Select a reason. 'We can offer you 50% off.' No. 'Are you really sure?' Yes. 'Your subscription will be canceled at the end of the billing period.' The entire process took 3 minutes and 7 clicks. So what? This is not just annoying — it is a dark pattern that costs consumers real money. People who intend to cancel give up halfway through because they cannot find the button or get fatigued by the guilt screens. They continue paying for months. The FTC estimates dark-pattern subscriptions cost US consumers billions per year. Why does this persist in the first place? It works. Every extra click in the cancellation flow reduces cancellation rates by 5-10%. Companies A/B test their cancellation flow to maximize friction. The business incentive (retain revenue) directly opposes the user interest (cancel easily). Regulation like the FTC Click-to-Cancel rule (2024) exists but enforcement is slow and most SaaS companies outside the US ignore it entirely.
Your team discusses a feature in a Google Doc. There are 15 comment threads with decisions, edge cases, and open questions. Now you need to create a Jira ticket to actually build it. You must manually read each comment thread, summarize the decisions, copy relevant quotes, and paste them into Jira. Links back to specific comment threads in Google Docs break when the doc is restructured. So what? The context that matters most — why decisions were made, what alternatives were rejected, what edge cases were identified — is permanently trapped in Google Docs comments and never makes it to the ticket. The developer who picks up the Jira ticket gets a sanitized summary with none of the nuance. They re-ask questions that were already answered in the doc. They miss edge cases that were discussed. The 2 hours of doc discussion was wasted because none of it transferred. Why does this persist in the first place? Google and Atlassian are competitors. Neither has incentive to build deep integration with the other. Google Docs comments are reachable only through the Drive API's limited comments endpoint, which returns threads and resolution state but anchors them to the document with opaque references that break the moment the doc is restructured. Jira's API accepts text and its own markup but has no representation for Google Docs comment metadata (resolved/unresolved, replies, anchored text). The integration gap exists because both companies benefit from lock-in.
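Mechanically, the transfer is a flattening problem. A toy sketch that renders comment threads into a summary pasteable into a Jira description; the dict shape below is an illustrative simplification, not the real Drive API schema:

```python
def threads_to_jira(threads):
    """Flatten comment threads into a block pasteable into a Jira description.

    Keeps exactly the metadata that gets lost today: the anchored quote,
    resolution state, and replies.
    """
    lines = []
    for t in threads:
        status = "resolved" if t["resolved"] else "OPEN"
        lines.append(f'* [{status}] on "{t["quoted_text"]}": {t["comment"]}')
        for reply in t.get("replies", []):
            lines.append(f"  * reply: {reply}")
    return "\n".join(lines)

# Illustrative thread data, shaped loosely like a comments export.
threads = [
    {"quoted_text": "retry logic", "resolved": True,
     "comment": "Use exponential backoff", "replies": ["Capped at 5 attempts"]},
    {"quoted_text": "empty cart case", "resolved": False,
     "comment": "What happens on checkout with zero items?"},
]
print(threads_to_jira(threads))
```

The hard part is not this transformation; it is getting stable anchors out of one vendor and structured metadata into the other.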
You know someone sent you a specific message in Slack about a deployment issue roughly 3 weeks ago. You remember some keywords. You search for them. Slack returns 200+ results sorted by some opaque relevance algorithm, mostly from public channels, none of which are the message you want. You add the person's name as a filter — now you get 0 results because Slack's search doesn't handle the combination of sender + keyword + time range correctly. You scroll through your DM history manually for 15 minutes. So what? Slack is the primary knowledge base of every modern company, but it is unsearchable. Decisions, context, links, specs — all buried in threads that nobody can find again. Teams re-discuss the same topics because nobody can locate the previous conversation. New hires have zero access to institutional context because Slack search is broken. Why does this persist in the first place? Slack indexes messages as flat text, not as structured conversations with participants, topics, decisions, and references. Their search algorithm optimizes for recent messages in popular channels, not for the specific needle-in-a-haystack retrieval that people actually need. And Slack has no business incentive to fix this — they charge per seat, not per search quality, and switching costs are enormous.
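The query that fails here is trivial once messages are treated as structured data. A toy sketch over an exported message dump; `find_messages` and the record shape are simplified stand-ins for a real Slack export, not Slack's API:

```python
import time

def find_messages(messages, sender=None, keyword=None, days_back=None, now=None):
    """sender AND keyword AND time-window filter: the combined query
    the built-in search mishandles, expressed directly."""
    now = now if now is not None else time.time()
    cutoff = now - days_back * 86400 if days_back else None
    hits = []
    for m in messages:
        if sender and m["user"] != sender:
            continue
        if keyword and keyword.lower() not in m["text"].lower():
            continue
        if cutoff and float(m["ts"]) < cutoff:
            continue
        hits.append(m)
    return hits

# Simplified stand-ins for Slack export records (user, unix ts, text).
messages = [
    {"user": "alice", "ts": "2900000", "text": "Deployment rollback failed on prod"},
    {"user": "bob",   "ts": "2950000", "text": "deployment looks fine now"},
    {"user": "alice", "ts": "50",      "text": "lunch?"},
]
hits = find_messages(messages, sender="alice", keyword="deployment",
                     days_back=21, now=3_000_000)
print(hits)  # only alice's deployment message from the last 3 weeks
```

Three predicate checks recover the message in milliseconds; the gap is not retrieval difficulty but that no one exposes the data this way.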
When a coding agent (local or cloud) generates a file edit as a diff or search-and-replace, the patch frequently fails to apply. The line numbers are wrong because the agent's view of the file is stale. The context lines don't match because the agent hallucinated surrounding code. Partial edits leave the file in a broken state — half old code, half new code, syntax errors everywhere. This happens 10-20% of the time on real codebases. So what? Every failed edit requires the developer to manually inspect the diff, understand what the agent intended, and hand-apply the change. This is more work than writing the code themselves. Worse, when the agent retries a failed edit, it often makes a different mistake or compounds the original error. A 3-step refactoring task where step 2's diff fails to apply means steps 1 and 3 are also wasted. Why does this persist in the first place? LLMs generate diffs as text tokens — they do not have a structured representation of the file's AST or a real line-number index. The model is guessing line numbers from whatever file content was in its context window, which may be truncated, outdated, or incomplete. There is no feedback loop: the model generates a diff, but it does not verify the diff applies before presenting it. Structured edit formats (AST-based transforms, tree-sitter patches) exist but no major agent framework uses them.
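The missing verification step is cheap. A minimal sketch of a verify-before-apply gate for search-and-replace edits; this is an illustration of the idea, not any framework's actual edit format:

```python
def apply_edit(source, old, new):
    """Apply a search-and-replace edit only when it is unambiguous.

    Verifies the edit *before* touching the file, instead of emitting a
    patch and hoping: a missing anchor means the agent's view of the file
    was stale or hallucinated; a repeated anchor means the target is
    ambiguous. Returns (updated_source, error).
    """
    count = source.count(old)
    if count == 0:
        return source, "anchor text not found (stale or hallucinated context)"
    if count > 1:
        return source, f"anchor text matches {count} locations (ambiguous)"
    return source.replace(old, new), None

src = "def f(x):\n    return x + 1\n"
updated, err = apply_edit(src, "return x + 1", "return x + 2")
print(err)       # None
print(updated)
```

Because failures are reported instead of half-applied, the file is never left in the half-old, half-new state described above, and the error string gives the model something concrete to correct on retry. A real implementation would also re-read the file from disk immediately before applying, so the check runs against current state rather than the agent's context window.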
To fit a model in limited VRAM, you quantize it (Q4_K_M, Q5_K_M, Q8, GPTQ, AWQ). The quantized model generates syntactically valid code that looks correct but has subtle logic errors — off-by-one bugs, wrong comparison operators, swapped function arguments — at 2-3x the rate of the full-precision model. You cannot tell this is happening because the code compiles and often passes basic tests. So what? Developers pick a quantization level based on what fits in their VRAM, not based on quality metrics for their use case. A developer using Q4 quantization for a coding agent is shipping code with a hidden 2-3x higher defect rate and has no idea. They blame the model architecture when the real problem is quantization-induced degradation. Why does this persist in the first place? Every LLM benchmark (MMLU, HumanEval, MBPP) is run on full-precision models. When quantized benchmarks exist, they measure perplexity (a statistical metric) not task-specific quality like 'does the generated code actually work correctly in context.' There is no benchmark that measures 'Q4 of Model X produces working code Y% of the time vs full precision at Z%.' Users are flying blind — choosing between quantization levels by vibes and file size, not by measured quality for their specific task.
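What such a benchmark would measure is simple to sketch. The lambdas below are stubs standing in for the same model at two quantization levels (no model is actually called); the point is scoring "does the generated code work in context" rather than perplexity:

```python
def task_pass_rate(generate, tasks):
    """Fraction of tasks whose generated code actually passes its checker.

    `generate(prompt)` stands in for a model at some quantization level;
    each task pairs a prompt with a checker that runs the generated code.
    """
    passed = 0
    for prompt, check in tasks:
        try:
            if check(generate(prompt)):
                passed += 1
        except Exception:
            pass  # generated code that crashes counts as a failure
    return passed / len(tasks)

def check_add(code):
    ns = {}
    exec(code, ns)                 # run the generated code...
    return ns["add"](2, 3) == 5    # ...and test it in context

# Stub "models": same task, full precision vs. a Q4-style subtle bug.
full_precision = lambda prompt: "def add(a, b):\n    return a + b"
quantized      = lambda prompt: "def add(a, b):\n    return a - b"

tasks = [("write add(a, b)", check_add)]
print(task_pass_rate(full_precision, tasks))  # 1.0
print(task_pass_rate(quantized, tasks))       # 0.0
```

Both stubs emit code that compiles, which is exactly why perplexity and syntax checks miss the gap; only running the output against task-level checks surfaces the operator flip.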
A 70B parameter model — the minimum size that can follow complex multi-step instructions reliably — requires ~40GB of VRAM at Q4 quantization. The best consumer GPU (RTX 4090) has 24GB. So you are forced to either: (a) run a 13B model that is too dumb for real agent tasks, (b) buy two GPUs and deal with tensor parallelism setup that barely works, or (c) use Apple Silicon unified memory which loads the model but runs inference at 5 tokens/second — making a 50-call agent loop take 30+ minutes. So what? There is a hardware dead zone: the models worth running locally do not fit on hardware normal people own, and the models that fit are not worth running. This kills the entire local-first AI agent market. Everyone who cares about privacy (healthcare, legal, finance) or wants to avoid per-token costs is told to run local — but running local is either useless (small model) or painfully slow (offloading to RAM/Apple Silicon). Why does this persist? NVIDIA has no incentive to ship more VRAM on consumer cards — they want you to buy $10K+ A100/H100 datacenter GPUs. Apple Silicon has the memory but the memory bandwidth bottleneck (200 GB/s vs 3 TB/s on H100) makes inference 15x slower. AMD GPUs have the VRAM (RX 7900 XTX has 24GB) but ROCm software support is so broken that most inference engines do not work on AMD at all.
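The ~40GB figure follows from simple arithmetic. A sketch assuming Q4_K_M-style quantization averages roughly 4.5 bits per weight (the exact figure varies with the quant mix), before KV cache and runtime overhead:

```python
# Rough weight-memory arithmetic behind the ~40GB figure.
params = 70e9            # 70B parameters
bits_per_weight = 4.5    # assumed average for a Q4_K_M-style quant mix
consumer_vram_gb = 24    # RTX 4090

weights_gb = params * bits_per_weight / 8 / 1e9
print(f"weights alone: {weights_gb:.1f} GB")   # 39.4 GB
print(f"over a 4090's budget by: {weights_gb - consumer_vram_gb:.1f} GB")
```

Weights alone overshoot the largest consumer card by about 15 GB, before a single token of KV cache is allocated; that is the dead zone in one line of arithmetic.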
When you use a local model (Llama 3, Mistral, Qwen) via Ollama or llama.cpp with function calling, the model hallucinates tool names that don't exist, generates malformed JSON arguments (missing quotes, trailing commas, wrong types), and ignores the tool schema you provided. This happens 20-40% of the time even with the best open-source models. So what? Function calling is the foundation of every agentic workflow — without reliable tool use, a local LLM cannot be an agent at all. It can chat, but it cannot act. This means anyone who wants to run agents locally for privacy, cost, or latency reasons is stuck: the models that can do reliable function calling (Claude, GPT-4) are cloud-only, and the models you can run locally cannot reliably call a single tool. Why does this persist in the first place? Function calling was bolted onto open-source models after the fact via fine-tuning on synthetic tool-call datasets. The training data is small, the JSON grammar is not enforced at the decoding level (most inference engines just sample tokens and hope they form valid JSON), and there is no standardized tool-call format across model families — Llama uses one format, Mistral another, Qwen another.
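None of this requires fixing the model: a validation gate in front of the tool executor catches each failure mode before anything runs. A minimal sketch; the two-tool registry and its schemas are hypothetical:

```python
import json

# Hypothetical tool registry with minimal argument schemas.
TOOLS = {
    "read_file": {"path": str},
    "run_tests": {"target": str},
}

def validate_tool_call(raw):
    """Gate a model-emitted tool call before executing anything.

    Checks the three failure modes in order: unparseable JSON,
    hallucinated tool names, and missing or mistyped arguments.
    Returns (call, None) on success or (None, reason) on rejection.
    """
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as e:
        return None, f"malformed JSON: {e.msg}"
    if not isinstance(call, dict) or "name" not in call:
        return None, "tool call is not an object with a name"
    schema = TOOLS.get(call["name"])
    if schema is None:
        return None, f"unknown tool: {call['name']!r}"
    args = call.get("arguments", {})
    for arg, typ in schema.items():
        if arg not in args:
            return None, f"missing argument: {arg}"
        if not isinstance(args[arg], typ):
            return None, f"argument {arg} should be {typ.__name__}"
    return call, None

print(validate_tool_call('{"name": "read_file", "arguments": {"path": "a.py"}}'))
print(validate_tool_call('{"name": "grep", "arguments": {}}'))
```

Feeding the rejection reason back to the model as a retry prompt recovers many calls; grammar-constrained decoding (e.g. llama.cpp's GBNF grammars) can eliminate the malformed-JSON class at the sampling level, but hallucinated names and wrong argument types still need a gate like this.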
Most business software (Salesforce admin panels, SAP workflows, internal tools, legacy ERP systems, government portals) either has no API, or has an API so limited it covers 20% of what a human can do through the UI. So what? Agents are confined to the tiny slice of software that has good APIs — developer tools, cloud platforms, modern SaaS. The highest-value automation targets are exactly the ones agents cannot reach: data entry into legacy systems, navigating bureaucratic portals, operating enterprise software that was built in 2005. Why does this matter in the first place? The companies that would pay the most for automation — enterprises drowning in manual processes across dozens of clunky internal tools — are precisely the ones agents cannot help. Browser automation (Playwright, Puppeteer) breaks when a CSS class changes, fails on dynamic SPAs, and gets blocked by bot detection. Computer-use agents that operate via screenshots are 5-10 seconds per action and misclick constantly. The structural reason: building good APIs is expensive and most software vendors have no incentive to do it. Their moat is the UI — making it agent-accessible would let competitors build better interfaces on top of their data. And the agent-accessibility problem is circular: vendors won't build APIs until agents are useful, but agents won't be useful until vendors build APIs.
Running an AI agent on a real task costs anywhere from $0.10 to $50+ and there is no way to predict the cost before execution. The same task can cost 10x more on a second run if the agent takes a different reasoning path. So what? If you are a founder building a product powered by agents, you cannot set a price for your product because you do not know your own costs. If you are an enterprise buyer, your finance team will block agent adoption because they cannot forecast spend. If you are a developer, you live in fear of runaway loops that drain your API budget overnight. Why does this matter in the first place? Every other computing resource — cloud VMs, storage, bandwidth, even GPU time — has predictable per-unit pricing. You can estimate costs before committing. Agent costs are fundamentally unpredictable because the number of LLM calls depends on the model's runtime reasoning, which varies with task complexity, tool call results, and stochastic sampling. This breaks every standard financial planning model. The structural reason this persists: no agent framework provides pre-execution cost estimation, per-run budget caps, or cost-aware planning where the agent considers cheaper alternative approaches. The economic feedback loop is missing — the agent has no incentive to be efficient because it does not see the bill.
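The most basic missing control is a hard per-run cap. A minimal sketch; the prices and class name are illustrative placeholders, not any provider's real rates or any framework's API:

```python
class BudgetExceeded(RuntimeError):
    pass

class MeteredRun:
    """Hard per-run spending cap for an agent loop.

    Rates are illustrative placeholders, not any provider's real pricing.
    """

    def __init__(self, budget_usd, usd_per_1k_in=0.003, usd_per_1k_out=0.015):
        self.budget = budget_usd
        self.spent = 0.0
        self.rate_in = usd_per_1k_in / 1000
        self.rate_out = usd_per_1k_out / 1000

    def charge(self, tokens_in, tokens_out):
        """Meter one LLM call; refuse it if it would blow the budget."""
        cost = tokens_in * self.rate_in + tokens_out * self.rate_out
        if self.spent + cost > self.budget:
            raise BudgetExceeded(
                f"call costs ${cost:.4f}; ${self.budget - self.spent:.4f} left")
        self.spent += cost
        return cost

run = MeteredRun(budget_usd=0.05)
run.charge(2000, 500)  # a normal call, ~$0.0135
print(f"${run.spent:.4f} spent of ${run.budget:.2f}")
```

A cap turns a runaway loop into a caught exception instead of an overnight bill; cost-aware planning would go further and expose `spent` to the agent so cheaper approaches can be weighed mid-run.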