# AI Phone Agents Are a Real Opportunity. Here's What Building One Actually Taught Us.

> In early 2025, my team built an AI phone agent at a hackathon in one day and decided to develop it as a side project. Here's what we learned, why we paused, and what the 2026 market is now confirming.

**Published:** 2026-04-07
**Canonical URL:** https://yihuisong.com/article/anycall

## The story behind AnyCall

### How AnyCall started

In December 2024, I broke my elbow on a snowboarding trip abroad. What followed was months of painful recovery: navigating insurance, searching for doctors, and arranging physical therapy, all over the phone. That frustration stuck with me and became the seed of an idea at a hackathon at AGI House in San Francisco in spring 2025.

I teamed up with engineers [Xinhao (Jerome) Li](https://www.linkedin.com/in/jeromexlee/), [Yifan Chen](https://www.linkedin.com/in/yifan-chen-nu/), and [Zian Li](https://www.linkedin.com/in/zianli-duke/). In roughly one day we built a Voice AI agent to handle doctor appointment phone calls as an MCP (Model Context Protocol) component — wiring together speech recognition, a language model, and text-to-speech into something that could navigate a live phone conversation. We won 2nd place. More importantly, it worked.

We turned it into a side project from there. The goal was simple: a general phone agent that handles any day-to-day phone call with service providers on behalf of consumers, such as doctor appointments, insurance queries, home services, and anything else that eats up time and energy for no good reason. We then focused on continuous improvement: stabilizing the agent, building a consumer-facing interface, running user interviews, and exploring where the real market opportunity is. This article shares what we discovered along the way and my reflections on voice agents in 2026.

### What we discovered about building a good phone agent (for consumers)

Building a reliable phone agent is harder than any demo suggests. The challenges split into two layers, and each makes the other harder.

**The technical layer: prompt tuning, evaluation, and the data problem**

The surface challenge is voice quality — latency, naturalness, overlapping speech. The deeper challenge is behavioral. How does the agent respond when a receptionist goes off-script, or when on-hold music plays for 90 seconds and a different person picks up? Each edge case requires deliberate design, and the failure modes are more embarrassing than a chatbot's because the call is happening in real time.

We spent significant effort on prompt tuning — how the agent introduced itself, how it handled ambiguity, when it asked clarifying questions versus made reasonable assumptions. This is never a one-time exercise; every new phone tree or vertical requires a fresh look. We had also begun designing an evaluation pipeline to grade transcripts, flag regressions, and close gaps systematically. But both depend on real call data at volume, and that data wasn't going to appear without consistent adoption first. Without users making calls we couldn't tune the agent; without a well-tuned agent we couldn't retain users. The pipeline was the right answer to the wrong starting point.
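To make the idea of grading transcripts and flagging regressions concrete, here is a minimal sketch of what such a pipeline could look like. It is not the pipeline we designed: the `Transcript` fields, the two rubric checks, and the 5-point tolerance are all hypothetical, chosen only to illustrate comparing pass rates between two prompt versions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Transcript:
    call_id: str
    agent_turns: list[str]  # what the agent said, in order
    outcome: str            # hypothetical labels, e.g. "booked" or "failed"


# Hypothetical rubric: each check is a predicate over one transcript.
CHECKS: dict[str, Callable[[Transcript], bool]] = {
    "introduces_itself": lambda t: bool(t.agent_turns)
    and "on behalf of" in t.agent_turns[0].lower(),
    "task_completed": lambda t: t.outcome == "booked",
}


def pass_rates(transcripts: list[Transcript]) -> dict[str, float]:
    """Fraction of transcripts that pass each rubric check."""
    if not transcripts:
        raise ValueError("need at least one transcript to grade")
    return {
        name: sum(check(t) for t in transcripts) / len(transcripts)
        for name, check in CHECKS.items()
    }


def flag_regressions(
    baseline: dict[str, float],
    candidate: dict[str, float],
    tolerance: float = 0.05,
) -> list[str]:
    """Names of checks whose pass rate dropped by more than `tolerance`."""
    return sorted(
        name
        for name, rate in baseline.items()
        if candidate.get(name, 0.0) < rate - tolerance
    )
```

With only a handful of transcripts per prompt version, pass rates like these are too noisy to act on, which is the data problem described above in miniature.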

**The UX layer: simplicity wins, and the loop stays open**

Our instinct was to design the intake as a conversation. Back-and-forth felt more natural, more like a real human concierge, and a dialogue could proactively surface what the user would need before they realized it mattered. In practice it created friction that caused users to drop out before a task even started.

What worked better was the reverse: take a simple text input, let the agent attempt the call, then tell the user afterward what additional context would have helped. This was the more honest design too, because front-loading context is a losing battle. Every service provider has its own workflow, and what will be needed mid-call is impossible to predict upfront in one shot.

But post-call feedback only goes so far. Some decisions can't be deferred. This led to a second insight: certain calls require the user to stay involved in real time. What we had in mind was an in-call connection feature — while the agent is live, the user gets a simple text or in-app prompt: "The receptionist is offering Tuesday 3:15 or Thursday morning, which works?" A quick reply and the agent picks it back up. No interruption, just a lightweight side channel that returns control precisely when human judgment is needed.
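The side channel can be sketched as a question sent to the user plus a bounded wait for the answer. This is an illustration of the idea, not something we shipped: `ask_user`, the in-memory reply queue, the timeout values, and the `"no_reply"` fallback are all assumptions, and a real system would deliver the prompt by SMS or push notification rather than `print`.

```python
import asyncio


async def ask_user(
    question: str,
    replies: asyncio.Queue,
    timeout_s: float,
    fallback: str,
) -> str:
    """Prompt the user mid-call and wait briefly for a reply.

    Returns the user's reply, or `fallback` if nothing arrives in time,
    so the live call is never blocked indefinitely.
    """
    print(f"[to user] {question}")  # stand-in for an SMS or in-app prompt
    try:
        return await asyncio.wait_for(replies.get(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return fallback


async def demo() -> tuple[str, str]:
    question = (
        "The receptionist is offering Tuesday 3:15 or Thursday morning, "
        "which works?"
    )

    # Case 1: the user answers quickly.
    answered = asyncio.Queue()
    await answered.put("Thursday morning")
    first = await ask_user(question, answered, timeout_s=1.0, fallback="no_reply")

    # Case 2: the user never answers, so the agent falls back.
    silent = asyncio.Queue()
    second = await ask_user(question, silent, timeout_s=0.05, fallback="no_reply")
    return first, second


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The timeout is the important design choice: a receptionist will not hold the line for long, so the agent needs a defined behavior for when the user stays silent.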

### Why we paused: two honest reasons

We paused AnyCall in late 2025. The reasons are worth being direct about.

**1. Consumer demand was inconsistent, and willingness to pay was low.**

The core insight that drove us was real: people hate making certain phone calls, especially inefficient ones that demand sustained attention. But hating something and being willing to pay to avoid it are different things. During beta testing of the prototype with family and friends, we found that ordinary consumers, ourselves included, didn't reach for AnyCall often enough to build a reliable usage pattern.

- The use cases that felt most painful in retrospect (navigating insurance, scheduling medical care, looking for home services) weren't everyday occurrences.
- The more frequent use cases (making a dinner reservation, rescheduling a haircut) had already been partially addressed by online booking, and the friction threshold for pulling out an AI phone agent wasn't reliably crossed.

This showed up in the data: limited conversations captured by our web chatbot, low repeat usage of the phone agent, weak signals of willingness to pay. It echoed something the broader market has since confirmed. [SurveyMonkey's December 2025 survey](https://www.surveymonkey.com/curiosity/customer-service-statistics/) found that 79% of Americans strongly prefer interacting with a human over an AI agent, and only 8% actively prefer AI. Consumer AI voice products face a real trust and habit formation problem that is slow to solve.

**2. A side project cannot win in a fast-moving market.**

Through interviews with local service providers, I started to see a clearer target take shape: small businesses don't just need their phones answered; they need new customers but can't always afford to hire a full-time receptionist. Helping SMBs with customer acquisition, not just call handling, was a more concrete and commercially compelling problem to solve. Our research also pointed to more specific criteria for which verticals to prioritize: SMBs that handle high volumes of inbound calls, charge meaningfully per transaction or service, and operate in spaces with lighter regulatory exposure. A dental office, a home services contractor, a real estate agency: these businesses lose real revenue on every missed call, are willing to pay for a solution, and don't carry the compliance complexity of, say, financial services or insurance. That combination of high call frequency, high contract value, and lower regulatory risk is the filter that separates attractive verticals from ones that look good on paper but are hard to sell into or build for.

But pursuing that direction seriously meant deeper sales relationships, vertical-specific integrations, and a level of iteration speed that part-time work simply cannot sustain. The AI voice agent space is moving extremely quickly, both technologically and competitively. VC funding in the space grew from roughly [$315 million in 2022 to over $2.1 billion in 2025](https://www.assemblyai.com/blog/voice-ai-in-2026-series-1), and the number of well-funded competitors entering specific verticals accelerated sharply through the year. In that environment, part-time work is not a viable mode. You cannot respond to market signals, iterate on product, or build the sales relationships that B2B requires without guaranteed, full-time focus. Side projects can prove hypotheses, but they cannot outrun funded competitors who are doing this full time. For any founder serious about this space: part-time work can drive discovery, but it cannot serve as a growth vehicle.

---

## What the 2026 market confirms: go vertical, go deep

The single clearest lesson from watching this market develop is the power of vertical focus. Generic AI phone platforms exist: [Retell AI](https://www.linkedin.com/company/retellai/) has [reached $50 million ARR](https://finance.yahoo.com/sectors/technology/articles/voice-ai-startup-retell-ai-131700326.html) powering call center operations, and [ElevenLabs](https://www.linkedin.com/company/elevenlabsio/) and [Vapi](https://www.linkedin.com/company/vapi-ai/) have built an infrastructure layer for voice AI agents. But many of the startups that have raised money and shown the fastest product-market fit are those that picked one industry and went deep.

The dental office example is striking. Within Y Combinator alone, multiple companies focused specifically on AI phone agents for dental practices. [Arini](https://www.linkedin.com/company/ariniai/) (YC24) describes the exact problem we observed: 80% of dental appointments are still booked over the phone, yet practices miss 20–30% of inbound calls and lose millions in revenue. [Toothy AI](https://www.linkedin.com/company/toothy-ai/) (YC W25) focused on the back-office insurance calls that consume 160+ hours per month per clinic. [Patientdesk.ai](https://www.linkedin.com/company/patientdesk-ai/) (YC W26) grew from $17K to $50K MRR in 8 weeks by owning the entire front-office phone stack for dental clinics: scheduling, live insurance verification, claims, and billing.

The restaurant vertical tells the same story. [Loman AI](https://www.linkedin.com/company/useloman/) [raised $3.5 million](https://restauranttechnologynews.com/2025/08/loman-ai-secures-3-5-million-to-help-restaurants-automate-the-phones/) in August 2025 specifically to automate restaurant phone calls, and [Hostie](https://www.linkedin.com/company/hostie-ai/) [raised $4 million](https://sfstandard.com/2025/05/01/ai-bot-answering-phones-in-sf-hostie-2/) on the back of dozens of San Francisco restaurant deployments. [Slang AI](https://www.linkedin.com/company/slang-ai/) has [raised $68 million](https://www.nrn.com/restaurant-technology/tech-tracker-ai-chatbots-are-the-next-frontier-in-restaurant-technology) in total for restaurant-focused voice AI. As one investor in Loman's round put it: "Restaurants have tried voice for years, but the AI wasn't ready. It is now. We are seeing strong pull from independents through enterprise. That is rare at seed."

Beyond dental and restaurants, several other verticals have shown clear PMF in 2026:

- **Property management & outpatient healthcare:** [EliseAI](https://eliseai.com/blog/eliseai-raises-250m-series-e) has surpassed $100 million in ARR automating communications and workflows for 1 in 8 US apartments and is now expanding into outpatient healthcare, with Voice AI as part of its core offering.
- **Veterinary practices:** [Scritch](https://www.linkedin.com/company/scritch/) (YC W24) is building AI voice agents specifically for vet front desks, a category that mirrors dental in its call volume and scheduling-heavy workflows.
- **Healthcare billing:** [LunaBill](https://www.linkedin.com/company/lunabill/) (YC F25) reached [$764K in contracted ARR](https://www.ycombinator.com/companies/lunabill) automating insurance claim follow-up calls for healthcare billing teams, where each call averages 30 minutes and accounts for 80% of a biller's daily workload.
- **Debt collection:** [Skit.ai](https://www.linkedin.com/company/skit-ai/) has raised $47.6 million and partnered with 53,000+ creditors across 19+ debt types, automating over one billion conversations and resolving more than $1 billion in accounts.
- **Home services:** HVAC, plumbing, and electrical contractors represent another significant opportunity. Home services businesses suffer from revenue loss due to missing inbound calls. Startups are moving fast to fill the gap. [Avoca](https://www.linkedin.com/company/avoca-ai/) (YC W23) has built an AI call center platform exclusively for the trades, striking [partnerships](https://homepros.news/home-services-startup-avoca-lands-investment-officially-launches/) with [ServiceTitan](https://www.servicetitan.com/blog/ai-voice-agents-in-hvac), Nexstar, and Home Service Freedom. [acrely](https://www.linkedin.com/company/acrely-ai/) (YC S25) is building the AI Customer Service Rep (CSR) for the same vertical.
- **Manufacturing and industrial suppliers:** This is a potential market. Vendor coordination, purchase orders, and logistics confirmations still happen by phone in many plants and supply houses, with few voice vendors marketing there yet ([AgentVoice](https://www.agentvoice.com/ai-voice-in-2025-mapping-a-45-billion-market-shift/)).

This is not a coincidence. The winning formula appears to be: pick a vertical with (1) high call volume, low existing technology adoption, and a clear revenue loss on every missed call, and (2) enough operational specificity that deep integrations (with practice management software, POS systems, or EHRs, for example) can create a genuine moat against horizontal competitors. Our own research pointed to the same filter: prioritize SMBs handling high volumes of inbound calls, charging meaningfully per transaction, and operating outside the most heavily regulated spaces. The combination of call frequency, contract value, and manageable compliance is what separates attractive verticals from ones that look good on paper but are hard to build for and harder to sell into.
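Written down as a checklist, the filter is just three yes/no questions. The sketch below is illustrative only: the `Vertical` type and field names are made up for this example, and apart from the regulatory point made earlier about financial services, the example values are assumptions rather than research findings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vertical:
    name: str
    high_inbound_call_volume: bool
    meaningful_revenue_per_transaction: bool
    lighter_regulatory_exposure: bool


def passes_filter(v: Vertical) -> bool:
    """A vertical is attractive only if it meets all three criteria."""
    return (
        v.high_inbound_call_volume
        and v.meaningful_revenue_per_transaction
        and v.lighter_regulatory_exposure
    )


# Illustrative values, not measured data.
dental = Vertical("dental office", True, True, True)
financial = Vertical("financial services", True, True, False)

shortlist = [v.name for v in (dental, financial) if passes_filter(v)]
```

The point of requiring all three is that a single miss, such as heavy compliance, is enough to make a vertical hard to build for even when the other two look strong.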

The broader market numbers back this up. The AI voice agent market is projected at roughly [$22.5 billion in 2026](https://www.ringly.io/blog/voice-ai-statistics-2026), growing toward $47.5 billion by 2034. Healthcare calling alone carries a [37.8% CAGR](https://www.grandviewresearch.com/industry-analysis/ai-voice-agents-healthcare-market-report), driven by structural workforce shortages, with an estimated market size of $3.18 billion by 2030. The cost economics are stark: AI handles a call for roughly $0.40 versus [$7–12 for a human agent](https://www.ringly.io/blog/voice-ai-statistics-2026). The companies best positioned are those that chose a vertical, built deep integrations, and shipped with full-time commitment.
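To see what the per-call figures mean for a single business, here is a back-of-the-envelope calculation. The per-call costs come from the figures above; the 1,000 calls per month is a hypothetical volume picked for round numbers, and the calculation ignores setup, integration, and subscription costs.

```python
# Per-call costs in cents, taken from the figures cited above.
AI_COST_CENTS = 40
HUMAN_COST_CENTS_LOW = 700
HUMAN_COST_CENTS_HIGH = 1200


def monthly_savings_dollars(calls_per_month: int) -> tuple[float, float]:
    """Low and high estimates of monthly savings from moving calls to AI."""
    low = (HUMAN_COST_CENTS_LOW - AI_COST_CENTS) * calls_per_month / 100
    high = (HUMAN_COST_CENTS_HIGH - AI_COST_CENTS) * calls_per_month / 100
    return low, high


def cost_ratio() -> tuple[float, float]:
    """How many times more a human-handled call costs than an AI-handled one."""
    return (
        HUMAN_COST_CENTS_LOW / AI_COST_CENTS,
        HUMAN_COST_CENTS_HIGH / AI_COST_CENTS,
    )


# Hypothetical volume: 1,000 calls per month.
print(monthly_savings_dollars(1000))  # (6600.0, 11600.0)
print(cost_ratio())                   # (17.5, 30.0)
```

At that assumed volume the gap is roughly $6,600 to $11,600 a month, or a 17.5x to 30x difference in per-call cost.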

That's the thesis we were building toward with AnyCall. The insight was right. Unfortunately, the execution would have required full-time commitment we weren't in a position to give at that time.

---

## The strategic vision we still believe in: agent-to-agent calls

Pausing the project doesn't mean abandoning the underlying thesis about where this market goes:

- The paradigm today is phone calls between humans.
- What's being built toward is a consumer AI agent calling a business AI agent.
- Eventually, neither end of that call requires a human to initiate anything.

Imagine this scenario: a user says "book me a 6pm appointment with my dentist before Friday." The user's AI agent knows their schedule, insurance, and preferences. It calls the clinic's AI agent, which knows the practice's availability and booking rules. They negotiate and confirm. The appointment appears on both calendars. No human touched a phone.
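Stripped of the voice layer, the negotiation in this scenario reduces to finding a slot both sides accept. The sketch below models only that step, under loud assumptions: each agent exposes its availability as a set of start times, the dates are invented, and the `negotiate` function is hypothetical rather than part of any existing protocol.

```python
from datetime import datetime
from typing import Optional


def negotiate(
    user_free: set[datetime],
    clinic_open: set[datetime],
    deadline: datetime,
    preferred_hour: int,
) -> Optional[datetime]:
    """Pick a slot both agents accept before the deadline.

    Prefers slots starting at `preferred_hour`; otherwise falls back to the
    earliest mutual slot. Returns None when there is no overlap.
    """
    common = sorted(s for s in user_free & clinic_open if s < deadline)
    if not common:
        return None
    preferred = [s for s in common if s.hour == preferred_hour]
    return (preferred or common)[0]


# Hypothetical availability held by each side's agent.
user_free = {
    datetime(2026, 4, 8, 18),
    datetime(2026, 4, 9, 10),
    datetime(2026, 4, 9, 18),
}
clinic_open = {
    datetime(2026, 4, 9, 10),
    datetime(2026, 4, 9, 18),
    datetime(2026, 4, 11, 18),
}
deadline = datetime(2026, 4, 10)  # stand-in for "before Friday"

slot = negotiate(user_free, clinic_open, deadline, preferred_hour=18)
```

Here both agents settle on the 6pm slot on April 9. The hard parts in practice are everything this sketch leaves out: identity, insurance details, and what each agent is authorized to agree to.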

[Gartner](https://www.gartner.com/en/newsroom/press-releases/2025-02-10-traditional-customer-service-channels-are-losing-ground-to-mobile-and-ai-innovations) predicts that by 2028, 70% of customer service journeys will begin and be resolved in conversational, third-party assistants built into customers' mobile devices. The companies building AI agents for both consumers and SMBs today are positioning themselves on both sides of this future network. Most are only building one side.

Our logic in starting with consumers was specifically this: real-world call data from consumer deployments would make any future SMB-side product far better. Using consumer deployments as a data flywheel for a B2B product is a legitimate strategy.

---

## What I take away from AnyCall

AnyCall was a real project that produced real results: a working phone agent, a consumer interface, and a set of insights the market is now validating. Pausing it was unfortunate, but the conditions for success weren't present. Here's what I'd carry forward into future work in this space:

- **Start B2B, not consumer.** Willingness to pay is clearer, feedback loops are faster, and call volume comes with the territory. The consumer-first approach created a circular dependency we couldn't break: no adoption meant no data, and no data meant we couldn't finish the prompt tuning and evaluation infrastructure the agent actually needed.
- **Pick your vertical with discipline.** High inbound call volume, meaningful revenue per transaction, lighter regulatory exposure. Verticals that check all three are where AI phone agents are winning right now, and where deep integrations compound into real moats over time.
- **Simplicity beats cleverness in UX.** A plain text input outperformed a conversational intake every time. Design for the lowest activation cost first, and add complexity only where usage data justifies it.
- **Evaluation infrastructure is not optional.** Grading call transcripts and flagging regressions is just as important as the agent itself. Without it you're flying blind. Without call volume, you can't build it, which circles back to vertical focus.
- **Calibration matters more than autonomy.** The best phone agents know when to hand back control. Designing for those moments is a discipline that most demos hide entirely.

The phone call is one of the last major human interactions not yet transformed by software. The infrastructure exists, the models are ready, and the verticals are being claimed. The only question is who goes deep enough, fast enough, to own one.

---

*AnyCall was built by [Yihui Song](https://www.linkedin.com/in/yihuisong/), [Zian Li](https://www.linkedin.com/in/zianli-duke/), [Yifan Chen](https://www.linkedin.com/in/yifan-chen-nu/), and [Xinhao (Jerome) Li](https://www.linkedin.com/in/jeromexlee/). Demo: [youtube.com/watch?v=dGPzNPBcSOc](https://www.youtube.com/watch?v=dGPzNPBcSOc).*
