Conversational AI | CRM | Business Automation | SaaS

Built a multi-channel AI assistant platform that automates phone, SMS, and chat interactions at scale.

Project X is a production-grade multi-channel AI SaaS platform designed for continuous, human-like conversations across voice and messaging. It integrates with business systems (including CRMs) to automate front-desk and operational workflows—reducing overhead while delivering near real-time responses.

The key product gap was unification: businesses needed one system that could deliver real-time voice + messaging + workflow execution with high accuracy and low latency—without fragmented communication stacks or robotic assistant experiences.

Voice response latency

~1.2s

Near real-time, interruptible voice conversations.

Availability

24/7

Always-on interactions with zero wait time.

Channels unified

Phone + SMS + Web

Shared AI agent experience across all touchpoints.

Business outcome

10-50x ROI

Automation depth + faster responses + reduced support load.

What it solved

Missed calls turning directly into lost revenue.
Expensive call centers and bloated support teams.
Slow response times across disconnected channels.
Generic chatbots that delivered poor accuracy and robotic UX.

What was built

Ultra-low-latency voice pipeline with streaming interaction design.
Channel-agnostic multi-channel agent core with shared context.
Industry-specific AI agents with grounding from business knowledge + CRM data.
Workflow automation engine (Trigger → Intent → Action → Response) with observability.

Why it mattered

Customers received human-like conversations without waiting for staffed teams.
Businesses shifted a large share of their support workload to automated front-desk execution.
The same agent behavior worked across phone, SMS, and web chat.
Compliance-friendly, vertical-aware workflows (example: healthcare data) reduced risk.

The Challenge

Before this platform, businesses relied on fragmented communication systems and carried expensive operational overhead. No unified system could handle real-time voice, messaging, and workflow execution with high accuracy and low latency, especially under strict concurrency and compliance constraints.

Environment constraints & goals

Designed to handle 24/7 customer interactions.
Provide natural voice conversations with a seamless phone UX.
Integrate with CRMs and business workflow systems.
Deliver 10–50x ROI by reducing operational costs and support load.
Support compliance-heavy, industry-specific workflows (e.g., healthcare data).

Core constraints to win

Critical latency: real-time voice latency requirements for natural, interruptible conversations.
High concurrency: multiple simultaneous calls without performance degradation.
Unified UX: channel consistency across Phone, SMS, and Web.
Accuracy under real-world conditions: reduce hallucinations in strict business workflows.

The primary goal was to deliver human-like AI conversations (voice and text), enable full automation of front-desk workflows, and provide industry-specific intelligence.

The Solution

The core idea was a modular AI agent platform that understands user intent via LLMs, acts through workflow integrations, responds naturally, and continuously learns from business data. A real-time communication stack and a channel-agnostic agent core unified voice and messaging experiences.

Solution overview & architecture

LLM-based intent understanding to interpret user requests across channels.
Workflow integration layer to execute business actions beyond Q&A.
Natural response generation designed for voice and text UX.
Continuous learning grounded in scraped and business-uploaded knowledge.
Unified memory/context shared across phone, SMS, and web chat sessions.

Design principles

Ultra-low latency first for voice (streaming-first pipeline).
Channel-agnostic agent core for consistent UX across protocols.
Retrieval-grounded answers using semantic runtime retrieval.
Event-driven automation with explicit intent → action → response flow.
Observability baked in: latency tracking, failure logging, and escalation signals.

High-level architecture (diagram)

Backend API: Access Management (diagram)

Backend API: Project Management - Access Details (diagram)

Backend API: Project Management - Register Details (diagram)

Engine: Chat Channel (diagram)

Engine: Phone Channel (diagram)

Engine: WhatsApp Channel (diagram)

Capability Snapshot

Project X delivered more than AI conversations. It created a unified business automation layer that could answer, retrieve context, trigger workflows, and update external systems with production reliability.

Real-time conversation engine

Streaming STT for partial transcripts, incremental LLM token streaming, and TTS orchestration.

Knowledge grounding

Scraping + parsing ingestion pipelines with chunking/embedding and FAISS-backed vector search.

Workflow automation

Event-driven execution: Trigger → Intent → Action → Response across business APIs and CRM signals.

Multi-channel consistency

Channel-agnostic agent core with shared memory and context across phone, SMS, and web chat.

Industry-specific intelligence

Vertical-specific prompting plus grounding from FAQs, local documents, and CRM data.

Observability & safe operations

Conversation logs, latency and failure monitoring, sentiment analysis, and escalation dashboards.

Implementation Highlights

Ultra-low latency voice pipeline

Natural phone UX requires near real-time response.

Problem: Traditional AI voice systems feel laggy and unnatural, breaking the flow of human conversations.

Solution: Implemented streaming STT for partial transcripts, used incremental LLM token streaming combined with parallel TTS generation, and optimized buffering for real-time audio playback.

Impact: Achieved ~1–1.5s response latency, delivering interruptible, human-like conversations that feel immediate.
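The streaming idea described above can be sketched roughly as follows. This is a minimal illustration, not the platform's code: `fake_llm_tokens` and `fake_tts` are hypothetical stand-ins for a streaming LLM client and a TTS service, and the sentence-boundary rule is an assumption. The point it shows is that speech synthesis can begin on the first complete sentence while later tokens are still arriving.

```python
import asyncio
from typing import AsyncIterator, List, Optional

SENTENCE_END = (".", "?", "!")


async def fake_llm_tokens(reply: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for a streaming LLM client: yields one token at a time."""
    for token in reply.split(" "):
        await asyncio.sleep(0)  # yield control, as a network read would
        yield token + " "


async def sentences_from_tokens(tokens: AsyncIterator[str]) -> AsyncIterator[str]:
    """Group streamed tokens into sentences so TTS can start before the reply ends."""
    buffer = ""
    async for token in tokens:
        buffer += token
        if buffer.rstrip().endswith(SENTENCE_END):
            yield buffer.strip()
            buffer = ""
    if buffer.strip():  # flush a trailing fragment with no end punctuation
        yield buffer.strip()


async def fake_tts(sentence: str) -> bytes:
    """Hypothetical stand-in for a TTS call: returns placeholder audio bytes."""
    await asyncio.sleep(0)
    return sentence.encode("utf-8")


async def speak(reply: str) -> List[bytes]:
    """Run token streaming and synthesis concurrently, connected by a queue."""
    queue: "asyncio.Queue[Optional[str]]" = asyncio.Queue()
    audio_chunks: List[bytes] = []

    async def producer() -> None:
        async for sentence in sentences_from_tokens(fake_llm_tokens(reply)):
            await queue.put(sentence)
        await queue.put(None)  # sentinel: no more sentences

    async def consumer() -> None:
        while True:
            sentence = await queue.get()
            if sentence is None:
                break
            audio_chunks.append(await fake_tts(sentence))

    await asyncio.gather(producer(), consumer())
    return audio_chunks


def run_demo() -> List[bytes]:
    return asyncio.run(speak("Sure, I can help. What time works for you?"))
```

In a real deployment the consumer would push audio to the call leg as each chunk is ready, and barge-in handling would cancel both tasks when the caller starts speaking; neither is shown here.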

Multi-channel unified engine

One intelligence layer across phone, SMS, and web.

Problem: Phone, SMS, and Web have different protocols and UX expectations, but businesses need consistent assistant behavior across all channels.

Solution: Built a channel-agnostic agent core and abstracted the communication layer so the same experience could be delivered everywhere. Shared memory and context were implemented across channels.

Impact: The exact same AI agent works seamlessly across phone calls, SMS interactions, and embedded web chat.
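One way to picture a channel-agnostic core is a single agent object that owns the conversation memory, with thin per-channel adapters that only decide how a reply is rendered. The sketch below is illustrative: the class names, the 160-character SMS cut-off, the SSML wrapper, and the placeholder reply logic are all assumptions, not details from the project.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


@dataclass
class Message:
    customer_id: str
    channel: str
    text: str


class ChannelAdapter(Protocol):
    """Anything that can turn the core's plain-text reply into channel output."""

    def render(self, reply: str) -> str: ...


class SmsAdapter:
    def render(self, reply: str) -> str:
        return reply[:160]  # assumed single-segment SMS limit


class PhoneAdapter:
    def render(self, reply: str) -> str:
        return f"<speak>{reply}</speak>"  # SSML wrapper handed to TTS


class WebAdapter:
    def render(self, reply: str) -> str:
        return reply


@dataclass
class AgentCore:
    adapters: Dict[str, ChannelAdapter]
    # Memory is keyed by customer, not by channel, so context follows the person.
    memory: Dict[str, List[str]] = field(default_factory=dict)

    def handle(self, message: Message) -> str:
        history = self.memory.setdefault(message.customer_id, [])
        history.append(f"{message.channel}: {message.text}")
        # Placeholder for the real LLM call; it only proves the history is shared.
        reply = f"Noted ({len(history)} messages so far)."
        return self.adapters[message.channel].render(reply)


def build_core() -> AgentCore:
    return AgentCore(
        adapters={"sms": SmsAdapter(), "phone": PhoneAdapter(), "web": WebAdapter()}
    )
```

Because the history lives in the core, a customer who texts first and calls later is seen as one conversation, which is the behavior the section describes.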

Industry-specific AI agents

Vertical workflows needed more than generic models.

Problem: Generic LLMs struggle in strict, domain-specific business workflows and can hallucinate when accuracy requirements are high.

Solution: Tuned prompts per vertical (Healthcare, Retail, Finance) and integrated business FAQs, local documents, and Zoho Desk CRM data to ground interactions.

Impact: Higher accuracy, fewer hallucinations, and better task completion.
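As a rough illustration of vertical-specific prompting with grounding, the sketch below assembles a prompt from per-vertical rules plus retrieved business snippets. The rule text, the `VERTICAL_RULES` table, and the `build_prompt` helper are hypothetical examples written for this page, not the prompts used in production.

```python
from typing import Dict, List

# Hypothetical per-vertical guardrails; real rules would come from each business.
VERTICAL_RULES: Dict[str, str] = {
    "healthcare": "Do not give medical advice. Do not disclose patient details.",
    "retail": "Quote prices and stock only from the provided context.",
    "finance": "Do not give investment advice. Refer account changes to staff.",
}


def build_prompt(vertical: str, question: str, snippets: List[str]) -> str:
    """Combine vertical rules, grounding snippets, and the user question."""
    rules = VERTICAL_RULES[vertical]  # a KeyError makes an unknown vertical fail loudly
    if snippets:
        context = "\n".join(f"- {snippet}" for snippet in snippets)
    else:
        context = "- (no matching business context)"
    return (
        f"You are a front-desk assistant for a {vertical} business.\n"
        f"Rules: {rules}\n"
        "Answer only from the context below. If the answer is not there, "
        "say so and offer a handoff.\n"
        f"Context:\n{context}\n"
        f"Customer: {question}"
    )
```

Instructing the model to answer only from supplied context, and to hand off otherwise, is one common way to reduce hallucinations in strict workflows.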

Business knowledge ingestion & retrieval

Fresh, grounded context for accurate conversations.

Problem: Assistants underperform without fresh, business-specific context.

Solution: Built ingestion pipelines for scraping websites and parsing business-uploaded PDFs / SOPs. Chunked and embedded the data into a self-hosted FAISS vector store, then used semantic runtime retrieval and synced structured CRM signals for context-aware replies.

Impact: Faster vertical onboarding and highly accurate, workflow-aware responses.
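The chunk, embed, and retrieve steps can be sketched in a few lines. To stay dependency-free, this example swaps in a toy bag-of-words embedding and a brute-force cosine search. It is not the FAISS setup described above, where a learned embedding model and a FAISS index would take the place of `embed` and the loop in `search`. Chunk size and overlap are assumed values.

```python
import math
from collections import Counter
from typing import List, Tuple


def chunk_text(text: str, size: int = 8, overlap: int = 2) -> List[str]:
    """Split text into overlapping word windows."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    return [
        " ".join(words[i:i + size])
        for i in range(0, max(len(words) - overlap, 1), step)
    ]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(word.strip(".,?!").lower() for word in text.split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[term] * b[term] for term in a)  # Counter returns 0 for missing terms
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(query: str, chunks: List[str], k: int = 1) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    scored = sorted(((cosine(query_vec, embed(c)), c) for c in chunks), reverse=True)
    return scored[:k]
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some words twice.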

Workflow automation engine & observability

Automation that can be monitored and improved.

Workflow: Built an event-driven system (Trigger → Intent → Action → Response) integrating external APIs (Calendars, Zoho Desk, databases) to automate appointments, orders, and routing.
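The Trigger → Intent → Action → Response flow maps naturally onto a small dispatch table. The sketch below is a simplified illustration: `detect_intent` uses keywords as a stand-in for the LLM intent step, and the action functions are hypothetical placeholders for calendar, CRM, or database calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Event:
    """Trigger: an inbound message from any channel."""
    channel: str
    text: str


def detect_intent(text: str) -> str:
    """Keyword stand-in for the LLM-based intent classifier."""
    lowered = text.lower()
    if "appointment" in lowered or "book" in lowered:
        return "book_appointment"
    if "order" in lowered:
        return "order_status"
    return "handoff"


# Hypothetical actions; real ones would call calendar, CRM, or order APIs.
def book_appointment(event: Event) -> str:
    return "Your appointment request has been recorded."


def order_status(event: Event) -> str:
    return "Let me look up that order for you."


def route_to_human(event: Event) -> str:
    return "I'll connect you with a team member."


ACTIONS: Dict[str, Callable[[Event], str]] = {
    "book_appointment": book_appointment,
    "order_status": order_status,
    "handoff": route_to_human,
}


def handle_event(event: Event) -> Dict[str, str]:
    """Run one pass of Trigger -> Intent -> Action -> Response."""
    intent = detect_intent(event.text)
    action = ACTIONS.get(intent, route_to_human)  # unknown intents fall back to a human
    return {"intent": intent, "response": action(event)}
```

Keeping intents and actions in a table means a new workflow is added by registering one function, without touching the channel or intent code.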

Observability: Logged conversations, latency, and failures. Added sentiment analysis and escalation tracking dashboards to ensure continuous improvement and fast detection of degradation.
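A minimal sketch of per-turn monitoring might wrap each agent turn, record its latency and whether it failed, and raise an escalation signal when thresholds are crossed. The `TurnMonitor` class, the 1.5 s budget (taken from the upper end of the latency range quoted earlier), the failure threshold, and the fallback message are all assumptions; sentiment analysis is left out.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TurnRecord:
    latency_s: float
    failed: bool


@dataclass
class TurnMonitor:
    latency_budget_s: float = 1.5  # assumed budget, not a documented threshold
    max_failures: int = 2          # assumed escalation threshold
    records: List[TurnRecord] = field(default_factory=list)

    def observe(self, handler: Callable[[], str]) -> str:
        """Run one agent turn, recording latency and failure either way."""
        start = time.perf_counter()
        failed = False
        try:
            return handler()
        except Exception:
            failed = True
            return "Let me connect you with a team member."  # safe fallback reply
        finally:
            self.records.append(TurnRecord(time.perf_counter() - start, failed))

    def should_escalate(self) -> bool:
        """Signal escalation on repeated failures or any over-budget turn."""
        failures = sum(1 for record in self.records if record.failed)
        too_slow = any(r.latency_s > self.latency_budget_s for r in self.records)
        return failures >= self.max_failures or too_slow
```

In practice the records would be shipped to a metrics store and dashboard instead of being held in memory.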

Impact: Removed the need for human intervention in end-to-end tasks, while monitoring surfaced system degradation quickly.

Business & System Results

The platform achieved scale and reliability by combining near real-time conversational AI with workflow execution, knowledge grounding, and continuous operational observability.

Scale & performance

Handles multiple simultaneous calls without callers waiting in a queue.
24/7 automated communication across phone, SMS, and web.
Voice latency engineered for near real-time user experience.

Conversions & customer experience

Increased response speed, directly improving conversions.
Human-like, interruptible conversations rather than robotic scripts.
Accurate, industry-aware assistance from grounded knowledge and CRM signals.

Cost & operations

Significant reduction in call center workload and hiring dependencies.
Reduced overhead through end-to-end automation of front-desk workflows.
Operational degradation surfaced quickly via observability dashboards.

My Role & Scope

Timeline: 2.5 years. Users: SMBs, Enterprises, Healthcare. Role: Lead AI Engineer. I owned the system end-to-end, from the real-time communication stack to AI/LLM orchestration.

Owned directly

System architecture for real-time AI + communication stack.
AI/LLM layer: prompting, orchestration, optimization.
Knowledge: web scraping, parsing, FAISS vector search.
Infrastructure for multi-channel (Phone, SMS, Web chat) and workflows.
Integrations with Zoho Desk CRM, scheduling systems, and custom APIs.

Hands-on engineering

Wrote production-grade code across backend systems and AI pipelines.
Built and maintained real-time communication services (voice + messaging).
Optimized system performance for low-latency voice and high concurrency.
Designed and implemented APIs and internal services used by the platform.
Debugged live call flows, LLM hallucinations, and async race conditions.

Lessons Learned

What worked

Real-time performance is non-negotiable for voice AI.
Vertical-specific agents consistently outperform generic models.
Multi-channel unification is a massive market differentiator.

Challenges conquered

Balancing the strict trade-off between latency and accuracy in LLMs.
Handling unpredictable edge cases in real human voice conversations.

Final takeaway

This project demonstrates the ability to build production-grade AI systems, solve real-world latency and scaling challenges, design end-to-end SaaS platforms, and deliver measurable business impact.