Sears KAIros - AI-Powered Home Manager Communication Platform
Complete Multi-Step Orchestration Workflows
A unified agentic AI platform that orchestrates customer communications across multiple channels (voice, SMS, email) using the OpenAI Realtime API (low-latency speech-to-speech AI), WebSocket (a real-time audio streaming protocol), and MongoDB (a cloud database) for state management. The system embodies the Home Manager role: it makes all customer calls for appointment confirmations, "on the way" notifications, and post-service follow-ups. It covers 12 communication scenarios with 60+ templates, so technicians can focus solely on service delivery while the AI handles all customer interactions.
4-layer system design with agents, state management, and integrations
Complete workflow for appointment confirmations
Parallel multi-channel outreach workflows
AI-powered voicemail handling capabilities
State management and cross-channel context
Reschedule flow with full context
Same-day service timeline example
Complete scenario coverage
AWS infrastructure and scaling
Definitions and explanations of key terms
graph TB
subgraph orchestration["AGENTS"]
Journey["Communication Orchestrator<br/>- Multi-step workflows<br/>- Parallel multi-channel<br/>- Context preservation<br/>- State management"]
PreCall["Pre-Call Agent<br/>- Confirmation Call<br/>- Narrow Window<br/>- On the Way Call"]
FollowUp["Follow-Up Agent<br/>- Post-Service SMS<br/>- Satisfaction Check<br/>- Service Complete"]
Comms["Communications Hub<br/>- 12 Scenarios<br/>- Multi-Channel<br/>- Templates"]
end
subgraph context["CONTEXT & STATE MANAGEMENT"]
StateManager["Conversation State Manager<br/>1. Cross-channel tracking<br/>2. Intent & sentiment<br/>3. History & timeline<br/>4. Next-best-action"]
subgraph dataSources["Data Sources"]
direction LR
MongoDB["MongoDB Data Store<br/>1. Conversations<br/>2. Customers<br/>3. Interactions<br/>4. Context<br/>5. Analytics"]
ServicePower["ServicePower/Snowflake<br/>1. Service orders<br/>2. Appointments<br/>3. Customer details<br/>4. Schedules<br/>5. Parts<br/>6. Tech info"]
end
end
subgraph channels["CHANNEL AGENTS LAYER"]
subgraph voice["Voice Agents"]
OutVoice["Outbound Voice<br/>1. Initiate calls<br/>2. Real-time conv<br/>3. Voicemail<br/>4. Trigger actions"]
InVoice["Inbound Voice<br/>1. Answer calls<br/>2. Context-aware<br/>3. KB-powered"]
end
subgraph sms["SMS Agents"]
OutSMS["Outbound SMS<br/>1. Send messages<br/>2. Link sharing<br/>3. Confirmations<br/>4. Trigger actions"]
InSMS["Inbound SMS<br/>1. Process replies<br/>2. Context-aware<br/>3. Determine intent<br/>4. Auto-respond"]
end
subgraph email["Email Agents"]
OutEmail["Outbound Email<br/>1. Compose<br/>2. Attachments<br/>3. Follow-ups"]
InEmail["Inbound Email<br/>1. Process replies<br/>2. Extract info<br/>3. Classify intent<br/>4. Auto-respond"]
end
end
subgraph integration["INTEGRATION LAYER"]
OpenAI["OpenAI Realtime<br/>1. Speech-to-Speech<br/>2. Function Calling<br/>3. Voice Detection<br/>4. Audio Streaming"]
Twilio["Twilio Voice/SMS<br/>1. Media Streams<br/>2. TwiML<br/>3. Messaging API<br/>4. Webhooks"]
Graph["Microsoft Graph<br/>1. Mail API<br/>2. Attachments"]
end
PreCall --> StateManager
FollowUp --> StateManager
Comms --> StateManager
StateManager --> dataSources
MongoDB ~~~ ServicePower
dataSources --> OutVoice
dataSources --> InVoice
dataSources --> OutSMS
dataSources --> InSMS
dataSources --> OutEmail
dataSources --> InEmail
OutVoice --> Twilio
InVoice --> Twilio
OutVoice --> OpenAI
InVoice --> OpenAI
OutSMS --> Twilio
InSMS --> Twilio
OutEmail --> Graph
InEmail --> Graph
style orchestration fill:#e8f0f5,stroke:#1a1f3a,stroke-width:3px
style context fill:#f5f0e8,stroke:#8b6f47,stroke-width:3px
style channels fill:#f0e8f0,stroke:#4a2c5c,stroke-width:3px
style integration fill:#e8f5f0,stroke:#2d5f4f,stroke-width:3px
style dataSources fill:#f5ede5,stroke:#c9a961,stroke-width:2px
style voice fill:#e8f0f5,stroke:#1e4d4d,stroke-width:2px
style sms fill:#f5e8eb,stroke:#b76e79,stroke-width:2px
style email fill:#f5f0e0,stroke:#d4a574,stroke-width:2px
Home Manager Jessica makes all customer calls; technicians never call customers. There are two calls: (1) a confirmation call that also narrows the arrival window, placed 1-2 hours before the appointment, and (2) an "on the way" call, placed 30 minutes before arrival.
sequenceDiagram
participant Scheduler as Scheduler
participant PreCall as Home Manager<br/>(Pre-Call Agent)
participant DB as MongoDB
participant GPT as GPT-4o
participant OutVoice as Outbound Voice
participant Twilio as Twilio
participant OpenAI as OpenAI Realtime
participant Customer as Customer
participant SMS as SMS Agent
Note over Scheduler,PreCall: 1-2 hours before appointment
Scheduler->>PreCall: Trigger confirmation workflow
PreCall->>DB: Load appointment data
DB-->>PreCall: Service details, window, customer info
PreCall->>GPT: Generate confirmation script
Note right of GPT: • Technician name<br/>• Narrowed window (4-5pm)<br/>• Service type<br/>• Payment ($89)<br/>• Pet/access notes
GPT-->>PreCall: Personalized script ready
PreCall->>OutVoice: Initiate confirmation call
OutVoice->>Twilio: Place call to +1234567890
Twilio->>Customer: Calling...
Customer-->>Twilio: Answer
Twilio->>OpenAI: Connect audio stream
OpenAI-->>Customer: "Hi! This is about your 4-5pm appointment..."
Customer-->>OpenAI: "Yes, I'll be home"
OpenAI->>Customer: "Adult 18+ present?"
Customer->>OpenAI: "Yes"
OpenAI->>Customer: "$89 trip charge, pets secured?"
Customer->>OpenAI: "Got it, dog will be in backyard"
OpenAI->>Customer: "I'll call when tech is 30 min away"
Customer->>OpenAI: "Perfect, thanks!"
OpenAI-->>OutVoice: Call completed (2 min 15 sec)
OutVoice->>SMS: Send confirmation SMS
SMS->>Customer: "Confirmed: Tech Tom, 4-5pm today. $89 trip charge."
OutVoice->>DB: Log call details & confirmation
DB-->>OutVoice: Saved
Note over Scheduler,Customer: 30 minutes before arrival
Scheduler->>PreCall: Trigger "on the way" call
PreCall->>OutVoice: Make brief notification call
OutVoice->>Twilio: Call customer
Twilio->>OpenAI: Connect
OpenAI->>Customer: "Tech finishing up, heading your way!"
Customer->>OpenAI: "Great, ready!"
OpenAI-->>OutVoice: Call complete (35 sec)
OutVoice->>DB: Update: Customer ready
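The two scheduler triggers in the diagram above can be sketched as a small time calculation. This is only an illustration: `plan_calls`, `CONFIRMATION_LEAD`, and `ON_THE_WAY_LEAD` are hypothetical names, and the lead times simply reuse the "1-2 hours before" and "30 minutes before arrival" figures quoted in the text.

```python
from datetime import datetime, timedelta

# Assumed lead times, taken from the figures quoted in the text.
CONFIRMATION_LEAD = timedelta(hours=2)      # upper end of "1-2 hours before"
ON_THE_WAY_LEAD = timedelta(minutes=30)     # "30 minutes before arrival"


def plan_calls(window_start: datetime, estimated_arrival: datetime) -> dict[str, datetime]:
    """Return the time at which each Home Manager call should be triggered."""
    return {
        "confirmation_call": window_start - CONFIRMATION_LEAD,
        "on_the_way_call": estimated_arrival - ON_THE_WAY_LEAD,
    }


if __name__ == "__main__":
    # Narrowed window starts at 16:00 and the technician is expected at 16:00.
    plan = plan_calls(datetime(2024, 1, 15, 16, 0), datetime(2024, 1, 15, 16, 0))
    for name, when in plan.items():
        print(name, when.isoformat())
```

A real scheduler would presumably recompute the second trigger whenever the technician's estimated arrival changes.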
Full customer journey workflow showing the multi-channel outreach strategy. When the customer doesn't answer the call, the system leaves a voicemail and immediately follows up with SMS and email to maximize the chances of a response. This demonstrates how the Communication Orchestrator manages parallel multi-channel communication while preserving context across all touchpoints. The customer can respond via any channel at any time.
The Communication Orchestrator initiates the workflow by triggering an outbound call. If the customer answers, the workflow completes immediately. If there is no answer, the system detects voicemail, leaves a contextual message, and immediately sends an SMS and an email to provide multiple response channels.
sequenceDiagram
participant Scheduler as Scheduler
participant Journey as Communication Orchestrator
participant OutVoice as Outbound Voice
participant OutSMS as Outbound SMS
participant OutEmail as Outbound Email
participant DB as MongoDB State
participant OpenAI as OpenAI Realtime
participant Twilio as Twilio
participant Graph as MS Graph
participant Customer as Customer
Scheduler->>Journey: Trigger confirmation workflow
Journey->>DB: Create communication state
Note right of DB: State: workflow_initiated<br/>Status: in_progress<br/>Step: outbound_call
Journey->>OutVoice: Initiate confirmation call
OutVoice->>Twilio: Place call to customer
Twilio->>Customer: Calling...
alt Customer Answers Call
Customer->>Twilio: Answers
Twilio->>OpenAI: Connect audio stream
OpenAI->>Customer: "Hi! This is about your appointment at 4-5pm today..."
Customer->>OpenAI: "Yes, I'll be there!"
OpenAI->>Customer: "Great! Tech Tom will arrive between 4-5pm. $89 trip charge."
Customer->>OpenAI: "Perfect, see you then!"
OpenAI-->>OutVoice: Call completed successfully
OutVoice->>DB: Update: confirmed, workflow_complete
Note right of DB: Status: confirmed<br/>Channel: voice_outbound<br/>Duration: 1m 45s
Note over Journey,DB: SUCCESS - Workflow Complete
else No Answer / Voicemail Detected
Customer-->>Twilio: No answer (rings)
Twilio->>OpenAI: Voicemail greeting detected
Note right of OpenAI: AI detects:<br/>• Beep sound<br/>• Greeting patterns<br/>• No human response
OpenAI->>Customer: Leave voicemail message
Note right of OpenAI: "Hi, this is Jessica from<br/>Sears about your dishwasher<br/>appointment today 4-5pm.<br/>Please call back to confirm."
OpenAI-->>OutVoice: Voicemail left (22 seconds)
OutVoice->>DB: Update: voicemail_left
Note right of DB: Status: voicemail_left<br/>Timestamp: 13:30
Note over Journey,OutEmail: Immediately send SMS and Email (parallel)
par Send SMS
OutVoice->>Journey: Trigger SMS
Journey->>OutSMS: Send confirmation SMS
OutSMS->>Twilio: Send message
Twilio->>Customer: "Hi, we called about your appointment today at 4-5pm. Reply YES to confirm or call us back."
OutSMS->>DB: Log: sms_sent
and Send Email
OutVoice->>Journey: Trigger Email
Journey->>OutEmail: Send confirmation email
OutEmail->>Graph: Send message
Graph->>Customer: "Appointment Confirmation - Please Respond"
OutEmail->>DB: Log: email_sent
end
Note right of DB: All channels contacted:<br/>• Voicemail: 13:30<br/>• SMS: 13:30<br/>• Email: 13:30
Journey->>Journey: Wait for response on ANY channel
Note over Journey,DB: Waiting for customer response
end
Outcome: If the customer answers, the workflow completes immediately (85% success rate). If voicemail is detected, the system leaves a voicemail and then sends SMS and email in parallel, giving the customer multiple convenient ways to respond.
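The parallel SMS and email step could be sketched with `asyncio.gather`, as below. This is a minimal illustration, not the platform's actual code: `send_sms`, `send_email`, and `fan_out_after_voicemail` are hypothetical stand-ins for the real Twilio and Microsoft Graph calls, and the email address is a placeholder.

```python
import asyncio
from datetime import datetime, timezone


async def send_sms(phone: str, body: str) -> dict:
    # Hypothetical stand-in for the real SMS provider call.
    await asyncio.sleep(0)
    return {"channel": "sms_outbound", "action": "sms_sent", "to": phone}


async def send_email(address: str, subject: str) -> dict:
    # Hypothetical stand-in for the real mail API call.
    await asyncio.sleep(0)
    return {"channel": "email_outbound", "action": "email_sent", "to": address}


async def fan_out_after_voicemail(phone: str, address: str) -> list[dict]:
    """Send SMS and email concurrently and return one timeline entry per channel."""
    results = await asyncio.gather(
        send_sms(phone, "Hi, we called about your appointment today at 4-5pm. "
                        "Reply YES to confirm or call us back."),
        send_email(address, "Appointment Confirmation - Please Respond"),
        return_exceptions=True,  # one failing channel must not cancel the other
    )
    now = datetime.now(timezone.utc).isoformat()
    timeline = []
    for result in results:
        if isinstance(result, Exception):
            timeline.append({"action": "send_failed", "error": str(result), "timestamp": now})
        else:
            timeline.append({**result, "timestamp": now})
    return timeline


if __name__ == "__main__":
    for entry in asyncio.run(fan_out_after_voicemail("+1234567890", "sarah@example.com")):
        print(entry)
```

Collecting exceptions instead of raising them keeps a delivery failure on one channel visible in the timeline without blocking the other channel.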
After the multi-channel outreach, the customer can call back at any time. When they do, the inbound voice agent has full context about all previous contact attempts (voicemail, SMS, email) and can hold a seamless, personalized conversation.
sequenceDiagram
participant Customer as Customer
participant Twilio as Twilio
participant InVoice as Inbound Voice
participant DB as MongoDB State
participant OpenAI as OpenAI Realtime
Note over Customer,InVoice: Customer received: Voicemail + SMS + Email
Customer->>Twilio: Calls company number
Twilio->>InVoice: Incoming call detected
InVoice->>DB: Lookup by phone: +1234567890
DB-->>InVoice: Load communication state
Note right of DB: Context loaded:<br/>• Voicemail left at 13:30<br/>• SMS sent at 13:30<br/>• Email sent at 13:30<br/>• Appointment: 4-5pm today
InVoice->>OpenAI: Connect with full context
OpenAI->>Twilio: Ready for conversation
Twilio->>Customer: Connected
OpenAI->>Customer: "Hi! Thanks for calling back about your appointment today..."
Customer->>OpenAI: "Yes, I got your messages. I'll be home."
OpenAI->>Customer: "Perfect! Tech Tom will arrive between 4-5pm. $89 trip charge."
Customer->>OpenAI: "Great, thanks!"
OpenAI-->>InVoice: Call completed
InVoice->>DB: Update: confirmed_via_callback, workflow_complete
Note right of DB: Status: confirmed<br/>Channel: voice_inbound<br/>Response time: 25 min
Note over InVoice,DB: SUCCESS - Workflow Complete
Key Feature: The inbound agent recognizes the caller by phone number, loads the complete context of all contact attempts (voicemail, SMS, email), and provides a seamless conversation.
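As a rough sketch of the phone-number lookup, the snippet below turns a stored communication state into a short context summary that could be handed to the voice agent. The in-memory `STATES` dictionary stands in for the MongoDB lookup, and `build_call_context` is a hypothetical helper, not a name from the platform.

```python
from typing import Optional

# In-memory stand-in for the MongoDB collection, keyed by callback number.
STATES = {
    "+1234567890": {
        "conversation_id": "conv_abc123",
        "appointment_data": {"scheduled_window": "16:00-17:00", "technician": "Tom"},
        "timeline": [
            {"timestamp": "2024-01-15T13:30:00Z", "channel": "voice_outbound", "outcome": "voicemail_left"},
            {"timestamp": "2024-01-15T13:30:00Z", "channel": "sms_outbound", "action": "sms_sent"},
            {"timestamp": "2024-01-15T13:30:00Z", "channel": "email_outbound", "action": "email_sent"},
        ],
    }
}


def build_call_context(caller: str) -> Optional[str]:
    """Summarize prior contact attempts for the caller, or None if unknown."""
    state = STATES.get(caller)
    if state is None:
        return None  # unknown caller: the agent falls back to a generic greeting
    appt = state["appointment_data"]
    lines = [f"Appointment window {appt['scheduled_window']} with technician {appt['technician']}."]
    for event in state["timeline"]:
        what = event.get("outcome") or event.get("action")
        lines.append(f"- {event['timestamp']}: {event['channel']} ({what})")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_call_context("+1234567890"))
```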
The customer can reply to the SMS sent in Step 1. The inbound SMS agent analyzes the reply with full context about all previous contact attempts, and can confirm the appointment or handle other responses. The customer may also choose to call instead of texting.
sequenceDiagram
participant Customer as Customer
participant Twilio as Twilio
participant InSMS as Inbound SMS
participant InVoice as Inbound Voice
participant DB as MongoDB State
participant OpenAI as OpenAI Realtime
Note over Customer,InSMS: Customer received: Voicemail + SMS + Email
alt Customer Replies to SMS
Customer->>Twilio: "YES"
Twilio->>InSMS: Incoming SMS webhook
InSMS->>DB: Load communication state
DB-->>InSMS: Full context loaded
Note right of DB: Complete history:<br/>• Voicemail at 13:30<br/>• SMS at 13:30<br/>• Email at 13:30<br/>• Customer replied
InSMS->>InSMS: Analyze response: "YES" = confirmation
InSMS->>Twilio: Send confirmation reply
Twilio->>Customer: "Great! Tech Tom arriving 4-5pm. $89 trip charge. See you then!"
InSMS->>DB: Update: confirmed_via_sms, workflow_complete
Note right of DB: Status: confirmed<br/>Channel: sms_inbound<br/>Response time: 12 min
Note over InSMS,DB: SUCCESS - Workflow Complete
else Customer Calls Instead of SMS Reply
Customer->>Twilio: Calls company number
Twilio->>InVoice: Incoming call
InVoice->>DB: Load communication state
DB-->>InVoice: Full context loaded
Note right of DB: Context:<br/>• Voicemail at 13:30<br/>• SMS at 13:30<br/>• Email at 13:30<br/>• Customer calling back
InVoice->>OpenAI: Connect with full context
OpenAI->>Customer: "Hi! Thanks for calling about your appointment today..."
Customer->>OpenAI: "Yes, I can confirm. I'll be home at 4pm."
OpenAI->>Customer: "Perfect! We'll see you between 4-5pm."
OpenAI-->>InVoice: Call completed
InVoice->>DB: Update: confirmed_via_callback, workflow_complete
Note right of DB: Status: confirmed<br/>Channel: voice_inbound
Note over InVoice,DB: SUCCESS - Workflow Complete
end
Flexibility: The customer can respond via any channel. If they call instead of texting, the inbound voice agent has complete context about all contact attempts.
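The "Analyze response" step for SMS replies might look like the keyword-based sketch below. This is a deliberately simple assumption: the real agent may well use a language model for intent detection, and every intent label other than `appointment_confirmed` (which appears in the state document) is hypothetical.

```python
import re

# Hypothetical keyword sets; a production system would likely be more nuanced.
CONFIRM_WORDS = {"yes", "y", "confirm", "confirmed", "ok", "okay"}
RESCHEDULE_WORDS = {"reschedule", "change", "move"}
DECLINE_WORDS = {"no", "cancel"}


def classify_sms_reply(text: str) -> str:
    """Map a free-text SMS reply to a coarse intent label."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    # Check reschedule first so "Yes, but can we move it?" is not auto-confirmed.
    if words & RESCHEDULE_WORDS:
        return "reschedule_requested"
    if words & DECLINE_WORDS:
        return "declined"
    if words & CONFIRM_WORDS:
        return "appointment_confirmed"
    return "needs_review"  # anything unclear goes to a richer handler


if __name__ == "__main__":
    for reply in ["YES", "Can we reschedule?", "who is this"]:
        print(repr(reply), "->", classify_sms_reply(reply))
```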
The customer can reply to the email sent in Step 1. The inbound email agent processes the reply, analyzes its content, and has full context about all previous contact attempts. The customer may also choose to call back instead of replying by email.
sequenceDiagram
participant Customer as Customer
participant Graph as MS Graph
participant InEmail as Inbound Email
participant Twilio as Twilio
participant InVoice as Inbound Voice
participant DB as MongoDB State
participant OpenAI as OpenAI Realtime
Note over Customer,InEmail: Customer received: Voicemail + SMS + Email
alt Customer Replies to Email
Customer->>Graph: Reply: "Yes, I'll be there"
Graph->>InEmail: Incoming email webhook
InEmail->>DB: Load communication state
DB-->>InEmail: Full context loaded
Note right of DB: Complete history:<br/>• Voicemail at 13:30<br/>• SMS at 13:30<br/>• Email at 13:30<br/>• Customer replied
InEmail->>InEmail: Analyze email: confirmation detected
InEmail->>Graph: Send auto-reply
Graph->>Customer: "Thank you! Confirmed for 4-5pm. Tech Tom will arrive with $89 trip charge."
InEmail->>DB: Update: confirmed_via_email, workflow_complete
Note right of DB: Status: confirmed<br/>Channel: email_inbound<br/>Response time: 35 min
Note over InEmail,DB: SUCCESS - Workflow Complete
else Customer Calls Instead of Email Reply
Customer->>Twilio: Calls company number
Twilio->>InVoice: Incoming call
InVoice->>DB: Load communication state
DB-->>InVoice: Complete communication history
Note right of DB: All channels contacted:<br/>• Voicemail at 13:30<br/>• SMS at 13:30<br/>• Email at 13:30<br/>• Customer calling back
InVoice->>OpenAI: Connect with full context
OpenAI->>Customer: "Thanks for calling back about your appointment today!"
Customer->>OpenAI: "Yes, I confirm, I'll be there at 4pm."
OpenAI->>Customer: "Perfect! We'll see you between 4-5pm."
OpenAI-->>InVoice: Call completed
InVoice->>DB: Update: confirmed_via_callback, workflow_complete
Note right of DB: Status: confirmed<br/>Channel: voice_inbound
Note over InVoice,DB: SUCCESS - Workflow Complete
end
Multi-Channel Advantage: The customer has multiple convenient ways to respond (voice, SMS, or email). Whichever channel they choose, the system has complete context and provides seamless confirmation.
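Because a customer may answer on more than one channel, the workflow should complete only once. One way to sketch that is below; `record_response` and the `confirmed_via` field are hypothetical, while `workflow_complete` and `appointment_confirmed` reuse labels that appear in the diagrams and state document.

```python
def record_response(state: dict, channel: str, intent: str) -> bool:
    """Log a customer response; return True only if it completed the workflow."""
    state.setdefault("timeline", []).append({"channel": channel, "intent": intent})
    if state.get("communication_state") == "workflow_complete":
        return False  # already confirmed on another channel, so only log it
    if intent == "appointment_confirmed":
        state["communication_state"] = "workflow_complete"
        state["confirmed_via"] = channel
        return True
    return False


if __name__ == "__main__":
    state = {"communication_state": "in_progress"}
    print(record_response(state, "sms_inbound", "appointment_confirmed"))
    print(record_response(state, "email_inbound", "appointment_confirmed"))
    print(state["confirmed_via"])
```

With a real database, the same check would need to be atomic (for example a conditional update), since replies on two channels can arrive at nearly the same time.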
When the customer doesn't answer, the Communication Orchestrator leaves a voicemail and sends SMS and email in parallel, maximizing response opportunities and convenience.
The customer chooses their preferred response channel: a voice callback, an SMS reply, or an email response. The system maintains complete context regardless of which channel they select.
MongoDB stores the complete communication state, including all contact attempts (voicemail, SMS, email) and their timestamps. Every interaction updates the state in real time, so every channel works from the same context.
When customers don't answer calls, the Outbound Voice Agent uses Voice Activity Detection (which detects when a person starts and stops speaking) to identify voicemail systems and leave contextual messages. This ensures customers receive appointment information even when they are unavailable to answer.
Example Voicemail Message:
"Hi, this is Jessica from Sears Home Services calling about your dishwasher appointment today between 4 and 5 PM. Please call us back at 1-800-4-MY-HOME to confirm. Thanks!"
The Conversation State Manager ensures seamless context flow across all channels using Communication State (the complete context of all customer interactions). When a customer is contacted via voicemail, SMS, and email simultaneously, all subsequent interactions, regardless of channel, have complete visibility into previous contact attempts, enabling natural and personalized conversations.
MongoDB stores complete communication state for every customer interaction:
{
  "conversation_id": "conv_abc123",
  "customer_id": "cust_xyz789",
  "communication_type": "pre_call_confirmation",
  "communication_state": "in_progress",
  "timeline": [
    {
      "timestamp": "2024-01-15T13:30:00Z",
      "channel": "voice_outbound",
      "action": "call_initiated",
      "outcome": "voicemail_left",
      "duration_seconds": 22,
      "message": "Left voicemail about 4-5pm appointment"
    },
    {
      "timestamp": "2024-01-15T13:30:00Z",
      "channel": "sms_outbound",
      "action": "sms_sent",
      "message": "Confirmation SMS sent",
      "content": "Hi Sarah, please confirm your 4-5pm..."
    },
    {
      "timestamp": "2024-01-15T13:30:00Z",
      "channel": "email_outbound",
      "action": "email_sent",
      "message": "Confirmation email sent",
      "subject": "Appointment Confirmation - Please Respond"
    },
    {
      "timestamp": "2024-01-15T15:45:00Z",
      "channel": "voice_inbound",
      "action": "customer_callback",
      "outcome": "confirmed",
      "duration_seconds": 95
    }
  ],
  "context": {
    "last_contact_channel": "voice_inbound",
    "last_contact_time": "2024-01-15T15:45:00Z",
    "attempts": {
      "voice": 2,
      "sms": 1,
      "email": 1
    },
    "customer_preferences": {
      "preferred_channel": "voice",
      "callback_number": "+1234567890"
    },
    "sentiment": "positive",
    "intent": "appointment_confirmed"
  },
  "appointment_data": {
    "order_number": "0008169-12407837",
    "scheduled_window": "16:00-17:00",
    "technician": "Tom",
    "service_type": "dishwasher_repair"
  }
}
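To illustrate how a single interaction could update this document, the sketch below builds a MongoDB update using the standard `$push`, `$set`, and `$inc` operators. The helper name `build_timeline_update` is hypothetical, and counting both inbound and outbound contacts under `attempts` is an assumption inferred from the sample values above.

```python
def build_timeline_update(event: dict) -> dict:
    """Build a MongoDB update document that appends one timeline event."""
    # "voice_inbound" -> "voice", "sms_outbound" -> "sms", and so on.
    channel_family = event["channel"].split("_", 1)[0]
    return {
        "$push": {"timeline": event},
        "$set": {
            "context.last_contact_channel": event["channel"],
            "context.last_contact_time": event["timestamp"],
        },
        "$inc": {f"context.attempts.{channel_family}": 1},
    }


if __name__ == "__main__":
    callback = {
        "timestamp": "2024-01-15T15:45:00Z",
        "channel": "voice_inbound",
        "action": "customer_callback",
        "outcome": "confirmed",
        "duration_seconds": 95,
    }
    update = build_timeline_update(callback)
    print(update)
    # With a driver such as pymongo, this could be applied roughly as:
    #   collection.update_one({"conversation_id": "conv_abc123"}, update)
    # (the collection object and filter are assumptions for illustration).
```

Packing the three changes into one update keeps the timeline and the derived `context` fields consistent, because the database applies them to the document together.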
sequenceDiagram
participant Customer as Customer
participant Twilio as Twilio
participant Webhook as Webhook
participant InVoice as Inbound Voice
participant DB as MongoDB
participant KB as Knowledge Base
participant OpenAI as OpenAI Realtime
participant SMS as SMS Agent
participant Email as Email Agent
Customer->>Twilio: Calls company number
Twilio->>Webhook: Incoming call webhook
Webhook->>InVoice: Trigger inbound agent
InVoice->>DB: Lookup by phone: +1234567890
DB-->>InVoice: Customer ID: cust_xyz789
InVoice->>DB: Get conversation history
DB-->>InVoice: • AI called 2 hours ago for confirmation<br/>• Left voicemail: "Appt tomorrow 2-4pm"<br/>• Sent SMS + Email at same time<br/>• Customer calling back
InVoice->>KB: Search relevant context
KB-->>InVoice: Rescheduling policies loaded
InVoice->>OpenAI: Connect with full context
Note right of OpenAI: Context includes:<br/>• Confirmation call attempt<br/>• Voicemail + SMS + Email sent<br/>• Appt: Tomorrow 2-4pm<br/>• Customer history + policies
OpenAI->>Twilio: Ready for conversation
Twilio->>Customer: Connected
OpenAI->>Customer: "Hi! Thanks for calling back. I see we reached out about confirming your appointment tomorrow 2-4pm..."
Customer->>OpenAI: "Yes, I got your message. I need to reschedule - I can't make it tomorrow"
OpenAI->>Customer: "I can help! What date works better?"
Customer->>OpenAI: "Next Thursday if possible"
OpenAI->>DB: Check availability
DB-->>OpenAI: Thursday 2-4pm available
OpenAI->>Customer: "Thursday 2-4pm is open, should I confirm?"
Customer->>OpenAI: "Perfect!"
Note over OpenAI,SMS: Mid-call actions
OpenAI->>SMS: Send confirmation SMS
SMS->>Customer: "Rescheduled to Thu 2-4pm. Confirmation details sent."
OpenAI->>Email: Send confirmation email
Email->>Customer: "Appointment Confirmation Details"
OpenAI->>Customer: "Sent confirmation via SMS and email. Anything else?"
Customer->>OpenAI: "No, that's all. Thanks!"
OpenAI-->>InVoice: Call complete
InVoice->>DB: Update conversation
Note right of DB: • Call duration: 1m 45s<br/>• Outcome: Rescheduled<br/>• Sentiment: Positive<br/>• Next action: None
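The mid-call actions in this flow (checking availability, sending confirmations) depend on the model requesting a named function with JSON-encoded arguments. A minimal dispatcher for such requests might look like the sketch below; the handler names, their parameters, and their stubbed return values are all hypothetical.

```python
import json


def check_availability(date: str, window: str) -> dict:
    # Stub: a real handler would query the scheduling system.
    return {"date": date, "window": window, "available": True}


def send_confirmation_sms(phone: str, body: str) -> dict:
    # Stub: a real handler would hand the message to the SMS agent.
    return {"sent": True, "to": phone}


# Registry of functions the voice agent is allowed to invoke.
HANDLERS = {
    "check_availability": check_availability,
    "send_confirmation_sms": send_confirmation_sms,
}


def dispatch_function_call(name: str, arguments_json: str) -> str:
    """Run the requested handler and return its result as a JSON string."""
    handler = HANDLERS.get(name)
    if handler is None:
        return json.dumps({"error": f"unknown function: {name}"})
    try:
        result = handler(**json.loads(arguments_json))
    except (TypeError, ValueError) as exc:  # bad argument names or malformed JSON
        return json.dumps({"error": str(exc)})
    return json.dumps(result)


if __name__ == "__main__":
    print(dispatch_function_call("check_availability",
                                 '{"date": "Thursday", "window": "14:00-16:00"}'))
```

Returning errors as JSON rather than raising lets the conversation continue, so the agent can apologize or retry instead of dropping the call.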
Real example: Sarah's dishwasher repair with Home Manager Jessica and Technician Tom. Shows the complete flow from confirmation call to service completion.
graph LR
A[Jessica<br/>Home Manager calls] --> B[Hi Sarah, about<br/>your dishwasher today]
B --> C[Narrow window:<br/>1-5pm to 4-5pm]
C --> D[Tech Tom<br/>will arrive 4-5pm]
D --> E[Confirm: Adult 18+,<br/>pets secured, payment]
E --> F[Promise: I'll call<br/>when he's 30 min away]
F --> G[Send confirmation<br/>SMS immediately]
G --> H[Log: confirmed,<br/>customer prepared]
style A fill:#e8f0f5,stroke:#1e4d4d,stroke-width:2px
style C fill:#f5f0e8,stroke:#8b6f47,stroke-width:2px
style G fill:#f5e8eb,stroke:#b76e79,stroke-width:2px
style H fill:#e8f5f0,stroke:#2d5f4f,stroke-width:2px
Key: Home Manager Jessica (not the technician) makes this call to confirm the appointment, narrow the window, and prepare the customer. Call duration: 4 min 10 sec.
graph LR
A[30 minutes<br/>before arrival] --> B[Jessica calls<br/>Sarah again]
B --> C[Tom finishing up,<br/>heading your way]
C --> D[Should arrive<br/>around 4pm]
D --> E[Customer:<br/>All set! Cat's ready]
E --> F[Log: on_way_call<br/>32 seconds]
style A fill:#f5f0e8,stroke:#8b6f47,stroke-width:2px
style B fill:#e8f0f5,stroke:#1a1f3a,stroke-width:2px
style E fill:#e8f5f0,stroke:#2d5f4f,stroke-width:2px
style F fill:#e8f5f0,stroke:#1e4d4d,stroke-width:2px
Key: Jessica (Home Manager) makes a second, brief call. Tech Tom does not call the customer; he just drives to the location. Call duration: 32 seconds.
graph LR
A[Tom arrives<br/>NO CALL NEEDED] --> B[Knocks on<br/>door]
B --> C[Sarah answers:<br/>Hi, you must be Tom]
C --> D[Tom: Thanks for<br/>securing cat!]
D --> E[Diagnoses:<br/>Failed drain pump]
E --> F[Quote: $285 total<br/>$89 + $196]
F --> G[Sarah approves<br/>immediately]
G --> H[Repair completed<br/>tested working]
H --> I[Payment via<br/>credit card]
I --> J[Auto SMS follow-up:<br/>How did we do?]
style A fill:#e8f0f5,stroke:#1e4d4d,stroke-width:2px
style E fill:#f5f0e8,stroke:#8b6f47,stroke-width:2px
style G fill:#e8f5f0,stroke:#2d5f4f,stroke-width:2px
style J fill:#f5e8eb,stroke:#b76e79,stroke-width:2px
Result: No confusion, no disputes, smooth process. Customer was prepared because Jessica handled all communications. Total service time: 1hr 12min.
The AI embodies Jessica, the Home Manager, and makes all customer calls. Technicians never call customers; they focus entirely on service delivery.
Covers appointment confirmations, reminders, en route notifications, follow-ups, estimates, rescheduling, complaints, payments, parts delays, maintenance offers, referrals, and general inquiries.
The customer is prepared before the technician arrives: they know the timing and the payment ($89 applied), and they have pets secured. Result: smooth service, no disputes, faster completion.
The system handles all customer communications across the service lifecycle with 60+ pre-written templates optimized for voice, SMS, email, and web chat channels.
graph TB
subgraph internet["Internet"]
Users["Users<br/>(Customers)"]
Twilio["Twilio<br/>Voice/SMS"]
OpenAI["OpenAI<br/>Realtime API"]
MSGraph["Microsoft<br/>Graph API"]
end
subgraph aws["AWS Cloud"]
subgraph network["VPC - Private Network"]
ALB["Application Load Balancer<br/>1. SSL/TLS Termination<br/>2. Health Checks<br/>3. Auto-scaling"]
subgraph compute["Compute Layer (ECS/Fargate)"]
API1["FastAPI Instance 1<br/>1. WebSocket Support<br/>2. Async I/O<br/>3. Agent Orchestration"]
API2["FastAPI Instance 2<br/>1. WebSocket Support<br/>2. Async I/O<br/>3. Agent Orchestration"]
API3["FastAPI Instance N<br/>1. Auto-scaled<br/>2. On-demand"]
end
subgraph cache["Caching Layer"]
Redis["Redis<br/>1. Session Cache<br/>2. Rate Limiting<br/>3. Queue System"]
end
end
S3["S3 Bucket<br/>1. Call Recordings<br/>2. Attachments<br/>3. Static Assets"]
end
subgraph database["Database Layer"]
MongoDB["MongoDB Atlas<br/>(Managed Cluster)<br/>1. Conversations<br/>2. Customers<br/>3. Interactions<br/>4. Context<br/>5. Analytics<br/>6. Knowledge_base"]
end
subgraph monitoring["Monitoring & Logging"]
CloudWatch["AWS CloudWatch<br/>1. Logs<br/>2. Metrics<br/>3. Alarms"]
Grafana["KAIros Pulse<br/>1. Dashboard<br/>2. Visualization"]
end
Users -->|HTTPS| ALB
Twilio -->|Webhooks| ALB
ALB --> API1
ALB --> API2
ALB --> API3
API1 <--> Redis
API2 <--> Redis
API3 <--> Redis
API1 --> MongoDB
API2 --> MongoDB
API3 --> MongoDB
API1 --> S3
API2 --> S3
API3 --> S3
API1 <-->|WebSocket| OpenAI
API2 <-->|WebSocket| OpenAI
API1 <-->|REST API| Twilio
API2 <-->|REST API| Twilio
API1 <-->|REST API| MSGraph
API2 <-->|REST API| MSGraph
API1 --> CloudWatch
API2 --> CloudWatch
API3 --> CloudWatch
CloudWatch --> Grafana
style internet fill:#e8f0f5,stroke:#1e4d4d,stroke-width:2px
style aws fill:#f5f0e8,stroke:#8b6f47,stroke-width:3px
style network fill:#f0e8f0,stroke:#4a2c5c,stroke-width:2px
style compute fill:#e8f5f0,stroke:#2d5f4f,stroke-width:2px
style cache fill:#f5e8eb,stroke:#6b2c3e,stroke-width:2px
style database fill:#f5f0e0,stroke:#c9a961,stroke-width:3px
style monitoring fill:#e8f0f0,stroke:#1e4d4d,stroke-width:2px
TwiML
Definition: XML-based language for controlling phone calls and SMS messages in Twilio.
Purpose: Instructs Twilio on how to handle incoming/outgoing calls and messages.
WebSocket
Definition: Full-duplex communication protocol for real-time bidirectional data transfer.
Purpose: Enables persistent, low-latency connection between server and client.
OpenAI Realtime API
Definition: Low-latency speech-to-speech conversational AI API powered by GPT-4o.
Purpose: Enables natural voice conversations with AI without separate STT/TTS.
MongoDB Atlas
Definition: Cloud-hosted NoSQL database service for flexible data storage.
Purpose: Stores conversation state, customer data, and interaction history.
Function Calling
Definition: OpenAI capability that allows AI to invoke external functions/tools.
Purpose: Enables AI to take actions like sending SMS, emails, or updating databases.
Voice Activity Detection
Definition: Technology that detects when a person starts and stops speaking.
Purpose: Enables natural conversation flow without explicit turn-taking signals.
Communication Orchestrator
Definition: Central agent that manages multi-step workflows across channels.
Purpose: Coordinates timing, sequencing, and routing of communications.
Communication State
Definition: Complete context of all customer interactions across all channels.
Purpose: Ensures seamless experience regardless of which channel customer uses.
Microsoft Graph API
Definition: Unified API for accessing Microsoft 365 services including Outlook email.
Purpose: Send and receive emails through Microsoft/Outlook infrastructure.
AWS CloudWatch
Definition: Amazon Web Services monitoring and observability platform.
Purpose: Track system performance, logs, and metrics in real-time.
KAIros Pulse
Definition: Custom dashboard for visualizing system metrics and performance.
Purpose: Provide real-time visibility into AI agent operations and customer interactions.
Twilio Media Streams
Definition: Twilio feature for streaming live audio from phone calls.
Purpose: Enable real-time audio processing and AI voice interactions.
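To make the TwiML and audio-streaming definitions above concrete, the sketch below builds the kind of TwiML response a call webhook could return to connect a call's audio to a WebSocket endpoint, using Twilio's `<Connect>` and `<Stream>` verbs. The function name `build_stream_twiml` and the example URL are assumptions for illustration, not values from this system.

```python
import xml.etree.ElementTree as ET


def build_stream_twiml(websocket_url: str) -> str:
    """Return TwiML that connects the call's audio to a WebSocket endpoint."""
    response = ET.Element("Response")
    connect = ET.SubElement(response, "Connect")
    # The url attribute points at the server that bridges audio to the AI.
    ET.SubElement(connect, "Stream", url=websocket_url)
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    # Placeholder endpoint; the real host and path are deployment-specific.
    print(build_stream_twiml("wss://example.com/media-stream"))
```

Building the XML with a library rather than string concatenation keeps attribute values correctly escaped.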