Conversational AI | CRM | Business Automation | SaaS


Built a multi-channel AI assistant platform that automates phone, SMS, and chat interactions at scale.

Project X is a production-grade multi-channel AI SaaS platform designed for continuous, human-like conversations across voice and messaging. It integrates with business systems (including CRMs) to automate front-desk and operational workflows—reducing overhead while delivering near real-time responses.

The key product gap was unification: businesses needed one system that could deliver real-time voice + messaging + workflow execution with high accuracy and low latency—without fragmented communication stacks or robotic assistant experiences.

  • Voice response latency: ~1.2s. Near real-time, interruptible voice conversations.
  • Availability: 24/7. Always-on interactions with zero wait time.
  • Channels unified: Phone + SMS + Web. Shared AI agent experience across all touchpoints.
  • Business outcome: 10–50x ROI. Automation depth + faster responses + reduced support load.

Scan-first summary

A modular AI agent platform that powers unified multi-channel conversations (voice, SMS, web) and executes business workflows in real time, achieving near real-time latency, high concurrency, and industry-aware automation with measurable ROI.

Scan in 90 seconds

What it solved

  • Missed calls turning directly into lost revenue.
  • Expensive call centers and bloated support teams.
  • Slow response times across disconnected channels.
  • Generic chatbots that delivered poor accuracy and robotic UX.

What was built

  • Ultra-low-latency voice pipeline with streaming interaction design.
  • Channel-agnostic multi-channel agent core with shared context.
  • Industry-specific AI agents with grounding from business knowledge + CRM data.
  • Workflow automation engine (Trigger → Intent → Action → Response) with observability.

Why it mattered

  • Customers received human-like conversations without waiting for staffed teams.
  • Businesses replaced a large share of their support workload with automated front-desk execution.
  • The same agent behavior worked across phone, SMS, and web chat.
  • Compliance-friendly, vertical-aware workflows (example: healthcare data) reduced risk.

The Challenge

Before this platform, businesses used fragmented communication systems and expensive operational overhead. There was no unified system capable of handling real-time voice + messaging + workflow execution with high accuracy and low latency—especially under strict constraints like concurrency and compliance.

Environment constraints & goals

  • Designed to handle 24/7 customer interactions.
  • Provide natural voice conversations with a seamless phone UX.
  • Integrate with CRMs and business workflow systems.
  • Deliver 10–50x ROI by reducing operational costs and support load.
  • Support compliance-heavy, industry-specific workflows (e.g., healthcare data).

Core constraints to win

  • Critical latency: real-time voice latency requirements for natural, interruptible conversations.
  • High concurrency: multiple simultaneous calls without performance degradation.
  • Unified UX: channel consistency across Phone, SMS, and Web.
  • Accuracy under real-world conditions: reduce hallucinations in strict business workflows.

The primary goal was to deliver human-like AI conversations (voice and text), enable full automation of front-desk workflows, and provide industry-specific intelligence.

The Solution

The core idea was a modular AI agent platform that understands user intent via LLMs, acts through workflow integrations, responds naturally, and continuously learns from business data. A real-time communication stack and a channel-agnostic agent core unified voice and messaging experiences.

Solution overview & architecture

  • LLM-based intent understanding to interpret user requests across channels.
  • Workflow integration layer to execute business actions beyond Q&A.
  • Natural response generation designed for voice and text UX.
  • Continuous learning grounded in scraped and business-uploaded knowledge.
  • Unified memory/context shared across phone, SMS, and web chat sessions.

Design principles

  • Ultra-low latency first for voice (streaming-first pipeline).
  • Channel-agnostic agent core for consistent UX across protocols.
  • Retrieval-grounded answers using semantic runtime retrieval.
  • Event-driven automation with explicit intent → action → response flow.
  • Observability baked in: latency tracking, failure logging, and escalation signals.

Figure: High-level architecture
Figure: Backend API: Access Management
Figure: Backend API: Project Management - Access Details
Figure: Backend API: Project Management - Register Details
Figure: Engine: Chat Channel
Figure: Engine: Phone Channel
Figure: Engine: WhatsApp Channel
Figure: User Interface

Capability Snapshot

Project X delivered more than AI conversations. It created a unified business automation layer that could answer, retrieve context, trigger workflows, and update external systems with production reliability.

Real-time conversation engine

Streaming STT for partial transcripts, incremental LLM token streaming, and TTS orchestration.

Knowledge grounding

Scraping + parsing ingestion pipelines with chunking/embedding and FAISS-backed vector search.

Workflow automation

Event-driven execution: Trigger → Intent → Action → Response across business APIs and CRM signals.

Multi-channel consistency

Channel-agnostic agent core with shared memory and context across phone, SMS, and web chat.

Industry-specific intelligence

Vertical-specific prompting plus grounding from FAQs, local documents, and CRM data.

Observability & safe operations

Conversation logs, latency and failure monitoring, sentiment analysis, and escalation dashboards.

Implementation Highlights

Ultra-low latency voice pipeline

Natural phone UX requires near real-time response.

Problem: Traditional AI voice systems feel laggy and unnatural, breaking the flow of human conversations.

Solution: Implemented streaming STT for partial transcripts, used incremental LLM token streaming combined with parallel TTS generation, and optimized buffering for real-time audio playback.

Impact: Achieved ~1–1.5s response latency, delivering interruptible, human-like conversations that feel immediate.
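
The streaming hand-off described above can be sketched roughly as follows. This is a minimal illustration, not the platform's code: `fake_llm_tokens` and `fake_tts` are hypothetical stand-ins for a streaming LLM client and a TTS service, and sentence boundaries are detected naively by punctuation. The point it shows is that synthesis starts on the first complete sentence while later tokens are still arriving.

```python
import asyncio
from typing import AsyncIterator, List

SENTENCE_END = (".", "?", "!")


async def fake_llm_tokens(reply: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for a streaming LLM client: yields one token at a time."""
    for token in reply.split(" "):
        await asyncio.sleep(0)  # yield control, as a network stream would
        yield token


async def fake_tts(sentence: str) -> bytes:
    """Hypothetical stand-in for a TTS call: returns placeholder audio bytes."""
    await asyncio.sleep(0)
    return sentence.encode("utf-8")


async def chunk_sentences(tokens: AsyncIterator[str], out: asyncio.Queue) -> None:
    """Group streamed tokens into sentences and hand each one off immediately."""
    buffer: List[str] = []
    async for token in tokens:
        buffer.append(token)
        if token.endswith(SENTENCE_END):
            await out.put(" ".join(buffer))
            buffer = []
    if buffer:  # flush a trailing fragment without end punctuation
        await out.put(" ".join(buffer))
    await out.put(None)  # sentinel: no more text


async def synthesize(sentences: asyncio.Queue, audio: List[bytes]) -> None:
    """Synthesize sentences as they arrive, while the LLM is still generating."""
    while True:
        sentence = await sentences.get()
        if sentence is None:
            break
        audio.append(await fake_tts(sentence))


async def respond(reply: str) -> List[bytes]:
    """Run token chunking and synthesis concurrently; return audio in order."""
    sentences: asyncio.Queue = asyncio.Queue()
    audio: List[bytes] = []
    await asyncio.gather(
        chunk_sentences(fake_llm_tokens(reply), sentences),
        synthesize(sentences, audio),
    )
    return audio
```

Calling `asyncio.run(respond("Hi there. How can I help?"))` returns one audio chunk per sentence. In a real pipeline the queue would also be the natural place to cancel pending sentences when the caller interrupts.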

Multi-channel unified engine

One intelligence layer across phone, SMS, and web.

Challenge: Phone, SMS, and Web have different protocols and UX expectations, but businesses need consistent assistant behavior across all channels.

Solution: Built a channel-agnostic agent core and abstracted the communication layer so the same experience could be delivered everywhere. Shared memory and context were implemented across channels.

Impact: The exact same AI agent works seamlessly across phone calls, SMS interactions, and embedded web chat.
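
One way to picture a channel-agnostic core with shared memory is the sketch below. All names (`Message`, `ChannelAdapter`, `RecordingAdapter`, `AgentCore`) are hypothetical, and the reply logic is a placeholder for the real LLM call. It only illustrates the shape: transports sit behind a small interface, and conversation history is keyed by customer rather than by channel.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


@dataclass
class Message:
    customer_id: str
    channel: str
    text: str


class ChannelAdapter(Protocol):
    """Interface each transport (phone, SMS, web) would implement."""
    name: str

    def send(self, customer_id: str, text: str) -> None: ...


@dataclass
class RecordingAdapter:
    """Test double standing in for a real phone, SMS, or web transport."""
    name: str
    sent: List[str] = field(default_factory=list)

    def send(self, customer_id: str, text: str) -> None:
        self.sent.append(f"{customer_id}: {text}")


@dataclass
class AgentCore:
    adapters: Dict[str, ChannelAdapter]
    # History is keyed by customer, so context follows the person across channels.
    memory: Dict[str, List[Message]] = field(default_factory=dict)

    def handle(self, msg: Message) -> str:
        history = self.memory.setdefault(msg.customer_id, [])
        history.append(msg)
        channels = sorted({m.channel for m in history})
        # Placeholder reply; a real core would call the LLM with the history.
        reply = f"Turn {len(history)} (seen on: {', '.join(channels)})"
        self.adapters[msg.channel].send(msg.customer_id, reply)
        return reply
```

With this layout, adding a channel means writing one adapter; the agent logic and memory stay untouched.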

Industry-specific AI agents

Vertical workflows needed more than generic models.

Problem: Generic LLMs struggle in strict, domain-specific business workflows and can hallucinate when accuracy requirements are high.

Solution: Tailored prompts per vertical (Healthcare, Retail, Finance) and grounded interactions in business FAQs, local documents, and Zoho Desk CRM data.

Impact: Higher accuracy, reduced hallucination rates, and superior task completion.
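
As a rough illustration of per-vertical prompting with grounding, the sketch below assembles a prompt from a vertical template, retrieved snippets, and CRM fields. The template wording, the `VERTICAL_PROMPTS` table, and the `build_prompt` function are all assumptions made for this example; the source does not show the actual prompts.

```python
from typing import Dict, List

# Hypothetical per-vertical system prompts; the real wording is not in the source.
VERTICAL_PROMPTS: Dict[str, str] = {
    "healthcare": "You are a front-desk assistant for a clinic. Never give medical advice.",
    "retail": "You are a store assistant. Help with orders, returns, and stock questions.",
    "finance": "You are a service-desk assistant. Never give investment advice.",
}


def build_prompt(
    vertical: str,
    question: str,
    snippets: List[str],
    crm_fields: Dict[str, str],
) -> str:
    """Combine the vertical prompt with retrieved knowledge and CRM context."""
    system = VERTICAL_PROMPTS[vertical]  # raises KeyError for an unknown vertical
    knowledge = "\n".join(f"- {s}" for s in snippets) or "- (no documents retrieved)"
    crm = (
        "\n".join(f"- {k}: {v}" for k, v in sorted(crm_fields.items()))
        or "- (no CRM record)"
    )
    return (
        f"{system}\n"
        "Answer only from the context below. If the answer is not there, "
        "say so and offer a human handoff.\n\n"
        f"Knowledge:\n{knowledge}\n\nCRM:\n{crm}\n\nCustomer: {question}"
    )
```

Restricting the model to the supplied context and giving it an explicit fallback is one common way to reduce hallucinated answers in strict workflows.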

Business knowledge ingestion & retrieval

Fresh, grounded context for accurate conversations.

Problem: Assistants underperform without fresh, business-specific context.

Solution: Built ingestion pipelines for scraping websites and parsing business-uploaded PDFs / SOPs. Chunked and embedded the data into a self-hosted FAISS vector store, then used semantic runtime retrieval and synced structured CRM signals for context-aware replies.

Impact: Faster vertical onboarding and highly accurate, workflow-aware responses.
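
The chunk, embed, and retrieve flow can be sketched with the standard library alone. Note the substitutions: a bag-of-words count stands in for a real embedding model, and a brute-force cosine scan stands in for the FAISS index the platform used. The chunk size, overlap, and function names are assumptions for illustration.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> List[str]:
    """Split text into word windows of `size`, each sharing `overlap` words with the previous one."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase token counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(query: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Brute-force nearest-neighbour search (stand-in for a vector index lookup)."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
```

Overlapping windows keep a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of some duplicated storage.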

Workflow automation engine & observability

Automation that can be monitored and improved.

Workflow: Built an event-driven system (Trigger → Intent → Action → Response) integrating external APIs (Calendars, Zoho Desk, databases) to automate appointments, orders, and routing.
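
A minimal sketch of the Trigger → Intent → Action → Response flow is shown below. The intent names, the keyword-based `classify_intent` (a stand-in for the LLM intent step), and the action handlers are all hypothetical; real handlers would call calendar, CRM, or database APIs.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Trigger:
    channel: str
    customer_id: str
    text: str


def classify_intent(text: str) -> str:
    """Keyword stand-in for the LLM intent step."""
    lowered = text.lower()
    if "appointment" in lowered or "book" in lowered:
        return "book_appointment"
    if "order" in lowered:
        return "order_status"
    return "handoff"


Action = Callable[[Trigger], str]


def book_appointment(t: Trigger) -> str:
    # A real handler would call a scheduling API here.
    return f"Appointment request logged for {t.customer_id}."


def order_status(t: Trigger) -> str:
    # A real handler would look the order up in a database or CRM.
    return f"Looking up the latest order for {t.customer_id}."


def handoff(t: Trigger) -> str:
    return "Connecting you with a team member."


ACTIONS: Dict[str, Action] = {
    "book_appointment": book_appointment,
    "order_status": order_status,
    "handoff": handoff,
}


def run(trigger: Trigger) -> Dict[str, str]:
    """Trigger -> Intent -> Action -> Response, falling back to handoff on failure."""
    intent = classify_intent(trigger.text)
    action = ACTIONS.get(intent, handoff)
    try:
        response = action(trigger)
    except Exception:
        # Any failing action degrades to a human handoff instead of a dead end.
        intent, response = "handoff", handoff(trigger)
    return {"intent": intent, "response": response}
```

Keeping intents and actions in a lookup table makes each step explicit and easy to log, which is what lets the flow be monitored per stage.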

Observability: Logged conversations, latency, and failures. Added sentiment analysis and escalation tracking dashboards to ensure continuous improvement and fast detection of degradation.

Impact: Replaced human intervention for end-to-end tasks while ensuring system degradation was caught immediately.
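
The latency, failure, and escalation signals mentioned above could be collected per call with something like the sketch below. The `CallMetrics` class, the stage names, and the thresholds are assumptions; the 1.5 s default budget simply mirrors the upper end of the response latency quoted earlier.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from typing import DefaultDict, Iterator, List


class CallMetrics:
    """Per-call stage timings and failure counts with a simple escalation rule."""

    def __init__(self, latency_budget_s: float = 1.5, max_failures: int = 2) -> None:
        self.latency_budget_s = latency_budget_s  # assumed budget, not a sourced SLO
        self.max_failures = max_failures
        self.latencies: DefaultDict[str, List[float]] = defaultdict(list)
        self.failures: DefaultDict[str, int] = defaultdict(int)

    @contextmanager
    def stage(self, name: str) -> Iterator[None]:
        """Time one pipeline stage; count and re-raise any exception it throws."""
        start = time.perf_counter()
        try:
            yield
        except Exception:
            self.failures[name] += 1
            raise
        finally:
            self.latencies[name].append(time.perf_counter() - start)

    def total_latency(self) -> float:
        return sum(sum(samples) for samples in self.latencies.values())

    def should_escalate(self) -> bool:
        """Flag the call when it is over budget or has failed too often."""
        too_slow = self.total_latency() > self.latency_budget_s
        too_many_failures = sum(self.failures.values()) >= self.max_failures
        return too_slow or too_many_failures
```

Wrapping each stage (for example `with metrics.stage("stt"): ...`) gives a per-stage breakdown, so a dashboard can show which step is responsible when latency drifts.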

Business & System Results

The platform achieved scale and reliability by combining near real-time conversational AI with workflow execution, knowledge grounding, and continuous operational observability.

Scale & performance

  • Handles multiple simultaneous calls without callers waiting in a queue.
  • 24/7 automated communication across phone, SMS, and web.
  • Voice latency engineered for near real-time user experience.

Conversions & customer experience

  • Increased response speed, directly improving conversions.
  • Human-like, interruptible conversations rather than robotic scripts.
  • Accurate, industry-aware assistance from grounded knowledge and CRM signals.

Cost & operations

  • Significant reduction in call center workload and hiring dependencies.
  • Reduced overhead through end-to-end automation of front-desk workflows.
  • Operational degradation caught immediately via observability dashboards.

My Role & Scope

Timeline: 2.5 years. Users: SMBs, Enterprises, Healthcare. Role: Lead AI Engineer. I owned the system end-to-end, from the real-time communication stack to AI/LLM orchestration.

Owned directly

  • System architecture for real-time AI + communication stack.
  • AI/LLM orchestration, including prompting and optimization.
  • Knowledge: web scraping, parsing, FAISS vector search.
  • Infrastructure for multi-channel (Phone, SMS, Web chat) and workflows.
  • Integrations with Zoho Desk CRM, scheduling systems, and custom APIs.

Hands-on engineering

  • Wrote production-grade code across backend systems and AI pipelines.
  • Built and maintained real-time communication services (voice + messaging).
  • Optimized system performance for low-latency voice and high concurrency.
  • Designed and implemented APIs and internal services used by the platform.
  • Debugged live call flows, LLM hallucinations, and async race conditions.

Lessons Learned

This project demonstrates the ability to build production-grade AI systems, solve real-world latency and scaling challenges, design end-to-end SaaS platforms, and deliver measurable business impact.

What worked

  • Real-time performance is non-negotiable for voice AI.
  • Vertical-specific agents consistently outperform generic models.
  • Multi-channel unification is a massive market differentiator.

Challenges conquered

  • Balancing the strict trade-off between latency and accuracy in LLMs.
  • Handling unpredictable edge cases in real human voice conversations.
