In Part 6, we gave our agents expertise by wiring up a knowledge base with RAG. The agents now know what they’re talking about — they can answer domain-specific questions, recall relevant context, and stay grounded. But all of that intelligence is still running headless. It’s time to build the interfaces that candidates actually see and use.
This post covers the client side in full: the React web client, the React Native mobile app, a Flutter alternative, and all the cross-platform nuances that will trip you up if you’re not prepared. By the end, you’ll have a production-ready client that handles audio permissions, reconnection, mobile edge cases, and a UI that makes candidates feel at ease rather than anxious.
The Client Architecture Overview
Before writing any code, it helps to understand what the client is responsible for versus what the server handles.
The server side — which we covered in Parts 3 through 6 — runs the LiveKit SFU, the agent pipeline, the RAG system, and the evaluation logic. The client’s job is simpler but not easy:
- Acquire a token from your API
- Connect to the LiveKit room
- Capture microphone audio and send it
- Receive and play the agent’s audio back
- Show meaningful UI state (connecting, listening, thinking, speaking)
- Handle errors and network drops gracefully
What makes this non-trivial is audio — specifically, getting it right across browsers, iOS, Android, and varying network conditions.
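It helps to make those UI states explicit before reaching for an SDK. A minimal sketch of the client lifecycle as a state machine — the state and event names here are illustrative, not part of any LiveKit API:

```typescript
// Illustrative client session states — a model of what the UI must render,
// not a LiveKit API.
type SessionState =
  | "fetching-token" | "connecting" | "in-interview"
  | "reconnecting" | "ended" | "error";

type SessionEvent =
  | "token-received" | "room-connected" | "connection-lost"
  | "connection-restored" | "disconnected" | "failure";

// Legal transitions; any other event leaves the state unchanged.
const transitions: Record<SessionState, Partial<Record<SessionEvent, SessionState>>> = {
  "fetching-token": { "token-received": "connecting", failure: "error" },
  "connecting": { "room-connected": "in-interview", failure: "error" },
  "in-interview": { "connection-lost": "reconnecting", disconnected: "ended" },
  "reconnecting": { "connection-restored": "in-interview", disconnected: "ended", failure: "error" },
  "ended": {},
  "error": {},
};

export function nextState(state: SessionState, event: SessionEvent): SessionState {
  return transitions[state][event] ?? state;
}
```

Driving the render from a single state value like this keeps the connecting, reconnecting, and error UIs from drifting out of sync with the actual room connection.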
Web Client with LiveKit React Components
LiveKit’s React SDK is the fastest path to a working web client. Install the dependencies:
npm install @livekit/components-react @livekit/components-styles livekit-client
The core of your interview room component:
// src/components/InterviewRoom.tsx
import {
LiveKitRoom,
useVoiceAssistant,
BarVisualizer,
RoomAudioRenderer,
VoiceAssistantControlBar,
} from "@livekit/components-react";
import "@livekit/components-styles";
import { useState, useEffect, useCallback } from "react";
interface InterviewRoomProps {
candidateName: string;
jobRole: string;
sessionId: string;
}
export function InterviewRoom({ candidateName, jobRole, sessionId }: InterviewRoomProps) {
const [token, setToken] = useState<string | null>(null);
const [serverUrl, setServerUrl] = useState<string | null>(null);
const [error, setError] = useState<string | null>(null);
useEffect(() => {
async function fetchToken() {
try {
const res = await fetch("/api/interview/token", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ candidateName, jobRole, sessionId }),
});
if (!res.ok) throw new Error("Failed to get token");
const data = await res.json();
setToken(data.token);
setServerUrl(data.serverUrl);
} catch (err) {
setError("Could not start interview. Please try again.");
}
}
fetchToken();
}, [candidateName, jobRole, sessionId]);
if (error) return <div className="error-state">{error}</div>;
if (!token || !serverUrl) return <LoadingState />;
return (
<LiveKitRoom
token={token}
serverUrl={serverUrl}
connect={true}
audio={true}
video={false}
onDisconnected={() => console.log("Disconnected from interview")}
onError={(err) => console.error("LiveKit error:", err)}
>
<InterviewInterface candidateName={candidateName} jobRole={jobRole} />
<RoomAudioRenderer />
</LiveKitRoom>
);
}
The RoomAudioRenderer component is easy to miss but critical — it’s what actually plays the agent’s audio. Without it, you’ll be in the room but in total silence.
The Interview Interface Component
The inner interface uses the useVoiceAssistant hook, which gives you real-time state about what the AI agent is doing:
// src/components/InterviewInterface.tsx
import {
useVoiceAssistant,
BarVisualizer,
useLocalParticipant,
} from "@livekit/components-react";
import { useState, useEffect, useRef, useCallback } from "react";
import { AgentState } from "@livekit/components-react";
interface InterviewInterfaceProps {
candidateName: string;
jobRole: string;
}
export function InterviewInterface({ candidateName, jobRole }: InterviewInterfaceProps) {
const { state, audioTrack } = useVoiceAssistant();
const { localParticipant } = useLocalParticipant();
const [elapsedTime, setElapsedTime] = useState(0);
const [currentSection, setCurrentSection] = useState(0);
const [isMuted, setIsMuted] = useState(false);
const timerRef = useRef<ReturnType<typeof setInterval> | null>(null);
const sections = ["Introduction", "Technical Skills", "Problem Solving", "Culture Fit", "Q&A"];
const totalDuration = 45 * 60; // 45 minutes
useEffect(() => {
timerRef.current = setInterval(() => {
setElapsedTime((prev) => {
const next = prev + 1;
// Estimate section progress based on time
const sectionDuration = totalDuration / sections.length;
setCurrentSection(Math.min(Math.floor(next / sectionDuration), sections.length - 1));
return next;
});
}, 1000);
return () => {
if (timerRef.current) clearInterval(timerRef.current);
};
}, []);
const toggleMute = useCallback(async () => {
if (!localParticipant) return;
const newMuteState = !isMuted;
await localParticipant.setMicrophoneEnabled(!newMuteState);
setIsMuted(newMuteState);
}, [localParticipant, isMuted]);
const formatTime = (seconds: number) => {
const m = Math.floor(seconds / 60).toString().padStart(2, "0");
const s = (seconds % 60).toString().padStart(2, "0");
return `${m}:${s}`;
};
return (
<div className="interview-container">
{/* Header */}
<div className="interview-header">
<div className="candidate-info">
<h2>{candidateName}</h2>
<span className="job-role">{jobRole} Interview</span>
</div>
<div className="interview-timer">
<span className={elapsedTime > totalDuration * 0.9 ? "timer-warning" : ""}>
{formatTime(elapsedTime)}
</span>
</div>
</div>
{/* Section Progress Bar */}
<div className="section-progress">
{sections.map((section, index) => (
<div
key={section}
className={`section-pill ${
index < currentSection
? "completed"
: index === currentSection
? "active"
: "upcoming"
}`}
>
{section}
</div>
))}
</div>
{/* Agent State Indicator */}
<div className="agent-state-container">
<AgentStateDisplay state={state} />
{audioTrack && (
<BarVisualizer
state={state}
trackRef={audioTrack}
barCount={20}
options={{ minHeight: 4, maxHeight: 60 }}
className="agent-visualizer"
/>
)}
</div>
{/* Controls */}
<div className="controls">
<button
onClick={toggleMute}
className={`mute-button ${isMuted ? "muted" : ""}`}
aria-label={isMuted ? "Unmute microphone" : "Mute microphone"}
>
{isMuted ? "Unmute" : "Mute"}
</button>
</div>
</div>
);
}
function AgentStateDisplay({ state }: { state: AgentState }) {
// AgentState in @livekit/components-react is a string union
// ("disconnected" | "connecting" | "initializing" | "listening" | "thinking" | "speaking"),
// not an enum — so the lookup keys are plain strings.
const stateConfig: Record<string, { label: string; color: string; pulse: boolean }> = {
disconnected: { label: "Disconnected", color: "#ef4444", pulse: false },
connecting: { label: "Connecting...", color: "#f59e0b", pulse: true },
initializing: { label: "Preparing...", color: "#f59e0b", pulse: true },
listening: { label: "Listening", color: "#22c55e", pulse: true },
thinking: { label: "Thinking...", color: "#3b82f6", pulse: true },
speaking: { label: "Speaking", color: "#8b5cf6", pulse: false },
};
const config = stateConfig[state] ?? { label: "Unknown", color: "#6b7280", pulse: false };
return (
<div className="agent-state" style={{ color: config.color }}>
<span className={`state-dot ${config.pulse ? "pulse" : ""}`} />
<span className="state-label">{config.label}</span>
</div>
);
}
The AgentState value from LiveKit (a string union type in @livekit/components-react) tracks exactly what the AI is doing at any given moment. Showing this state clearly is the difference between a candidate who feels confused (“is it listening to me?”) and one who feels confident in the system.
Token Generation and Room Management API
Your backend needs a token endpoint. Here’s a minimal FastAPI implementation:
# api/interview/token.py
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel
from livekit import api
import os
import json
router = APIRouter()
class TokenRequest(BaseModel):
candidate_name: str
job_role: str
session_id: str
@router.post("/api/interview/token")
async def get_interview_token(request: TokenRequest):
livekit_api_key = os.environ["LIVEKIT_API_KEY"]
livekit_api_secret = os.environ["LIVEKIT_API_SECRET"]
livekit_url = os.environ["LIVEKIT_URL"]
room_name = f"interview-{request.session_id}"
token = (
api.AccessToken(livekit_api_key, livekit_api_secret)
.with_identity(f"candidate-{request.session_id}")
.with_name(request.candidate_name)
.with_grants(
api.VideoGrants(
room_join=True,
room=room_name,
can_publish=True,
can_subscribe=True,
)
)
.to_jwt()
)
# Also dispatch the agent to join this room
await dispatch_agent(room_name, request.job_role, request.session_id)
return {
"token": token,
"serverUrl": livekit_url,
"roomName": room_name,
}
async def dispatch_agent(room_name: str, job_role: str, session_id: str):
"""Tell the agent worker to join this room"""
lk_api = api.LiveKitAPI(
url=os.environ["LIVEKIT_URL"],
api_key=os.environ["LIVEKIT_API_KEY"],
api_secret=os.environ["LIVEKIT_API_SECRET"],
)
await lk_api.agent_dispatch.create_dispatch(
api.CreateAgentDispatchRequest(
agent_name="interview-agent",
room=room_name,
# json.dumps is safer than hand-built JSON if job_role ever contains quotes
metadata=json.dumps({"job_role": job_role, "session_id": session_id}),
)
)
await lk_api.aclose()
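While wiring this up, it is useful to sanity-check the token your endpoint returns. LiveKit access tokens are standard JWTs, so during development you can inspect the grants by decoding the (unverified) payload. A debug-only sketch — `decodeJwtPayload` is a hypothetical helper, and decoding without verification must never substitute for server-side checks:

```typescript
// Debug-only: decode a JWT payload WITHOUT verifying the signature.
// Useful for confirming the room grant during development; never use
// this for authorization decisions.
export function decodeJwtPayload(token: string): Record<string, unknown> {
  const parts = token.split(".");
  if (parts.length !== 3) throw new Error("not a JWT");
  return JSON.parse(Buffer.from(parts[1], "base64url").toString("utf8"));
}

// e.g. const payload = decodeJwtPayload(data.token);
//      (payload.video as { room?: string })?.room  // should be "interview-<sessionId>"
```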
Audio Visualization Beyond BarVisualizer
LiveKit’s BarVisualizer works well, but you might want custom visualization for your brand. Here’s a canvas-based circular visualizer using the raw audio track:
// src/components/CircularVisualizer.tsx
import { useEffect, useRef } from "react";
import type { TrackReferenceOrPlaceholder } from "@livekit/components-react";
import { Track } from "livekit-client";
interface CircularVisualizerProps {
trackRef: TrackReferenceOrPlaceholder;
size?: number;
}
export function CircularVisualizer({ trackRef, size = 200 }: CircularVisualizerProps) {
const canvasRef = useRef<HTMLCanvasElement>(null);
const animRef = useRef<number | null>(null);
const analyzerRef = useRef<AnalyserNode | null>(null);
useEffect(() => {
const track = trackRef.publication?.track;
if (!track || track.kind !== Track.Kind.Audio) return;
const canvas = canvasRef.current;
if (!canvas) return; // bail before creating the AudioContext so we never leak one
const ctx = canvas.getContext("2d")!;
const audioContext = new AudioContext();
const source = audioContext.createMediaStreamSource(
new MediaStream([track.mediaStreamTrack])
);
const analyzer = audioContext.createAnalyser();
analyzer.fftSize = 256;
source.connect(analyzer);
analyzerRef.current = analyzer;
const bufferLength = analyzer.frequencyBinCount;
const dataArray = new Uint8Array(bufferLength);
function draw() {
animRef.current = requestAnimationFrame(draw);
analyzer.getByteFrequencyData(dataArray);
ctx.clearRect(0, 0, size, size);
const centerX = size / 2;
const centerY = size / 2;
const radius = size * 0.3;
const bars = 60;
for (let i = 0; i < bars; i++) {
const dataIndex = Math.floor((i / bars) * bufferLength);
const value = dataArray[dataIndex] / 255;
const barHeight = value * (size * 0.2);
const angle = (i / bars) * Math.PI * 2 - Math.PI / 2;
const x1 = centerX + Math.cos(angle) * radius;
const y1 = centerY + Math.sin(angle) * radius;
const x2 = centerX + Math.cos(angle) * (radius + barHeight);
const y2 = centerY + Math.sin(angle) * (radius + barHeight);
ctx.beginPath();
ctx.moveTo(x1, y1);
ctx.lineTo(x2, y2);
ctx.strokeStyle = `hsl(${260 + value * 60}, 80%, ${50 + value * 20}%)`;
ctx.lineWidth = 3;
ctx.lineCap = "round";
ctx.stroke();
}
}
draw();
return () => {
if (animRef.current) cancelAnimationFrame(animRef.current);
audioContext.close();
};
}, [trackRef, size]);
return <canvas ref={canvasRef} width={size} height={size} />;
}
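The geometry inside draw() is where off-by-one bugs hide (the data index, the angle offset), and it is easy to factor into a pure, unit-testable function. A sketch mirroring the same parameters as the component above — `barSegment` is an illustrative helper, not an SDK export:

```typescript
// Pure geometry for one visualizer bar — mirrors the math in the draw() loop.
export interface BarSegment {
  x1: number; y1: number; x2: number; y2: number; value: number;
}

export function barSegment(i: number, bars: number, data: Uint8Array, size: number): BarSegment {
  const center = size / 2;
  const radius = size * 0.3;                                      // inner ring
  const value = data[Math.floor((i / bars) * data.length)] / 255; // 0..1
  const barHeight = value * (size * 0.2);                         // outward extent
  const angle = (i / bars) * Math.PI * 2 - Math.PI / 2;           // start at 12 o'clock
  return {
    x1: center + Math.cos(angle) * radius,
    y1: center + Math.sin(angle) * radius,
    x2: center + Math.cos(angle) * (radius + barHeight),
    y2: center + Math.sin(angle) * (radius + barHeight),
    value,
  };
}
```

The canvas effect then only iterates and strokes, which makes the visual math trivial to verify without a browser.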
React Native with @livekit/react-native
The React Native SDK mirrors the web SDK closely, but with important differences around native modules and audio sessions.
npm install @livekit/react-native @livekit/react-native-webrtc
npx pod-install # iOS only
You need to register the WebRTC native module before using any LiveKit code:
// index.js or App.tsx — must run before any LiveKit code
import { registerGlobals } from "@livekit/react-native";
registerGlobals();
Audio Permissions on iOS and Android
Audio permissions are where React Native interviews fall apart if you’re not careful. Here’s a complete permission handler:
// src/hooks/useAudioPermissions.ts
import { useState, useEffect } from "react";
import { Platform, PermissionsAndroid, Alert, Linking } from "react-native";
export type PermissionStatus = "checking" | "granted" | "denied" | "blocked";
export function useAudioPermissions(): {
status: PermissionStatus;
request: () => Promise<boolean>;
} {
const [status, setStatus] = useState<PermissionStatus>("checking");
useEffect(() => {
checkPermission();
}, []);
async function checkPermission() {
if (Platform.OS === "android") {
const result = await PermissionsAndroid.check(
PermissionsAndroid.PERMISSIONS.RECORD_AUDIO
);
setStatus(result ? "granted" : "denied");
} else {
// iOS — use react-native-permissions (installed separately)
const { check, PERMISSIONS } = await import("react-native-permissions");
const result = await check(PERMISSIONS.IOS.MICROPHONE);
setStatus(
result === "granted"
? "granted"
: result === "blocked"
? "blocked"
: "denied"
);
}
}
async function request(): Promise<boolean> {
if (Platform.OS === "android") {
const result = await PermissionsAndroid.request(
PermissionsAndroid.PERMISSIONS.RECORD_AUDIO,
{
title: "Microphone Permission",
message: "This interview requires access to your microphone.",
buttonPositive: "Allow",
buttonNegative: "Deny",
}
);
const granted = result === PermissionsAndroid.RESULTS.GRANTED;
setStatus(granted ? "granted" : "denied");
return granted;
} else {
const { request, PERMISSIONS, RESULTS } = await import(
"react-native-permissions"
);
const result = await request(PERMISSIONS.IOS.MICROPHONE);
if (result === RESULTS.BLOCKED) {
setStatus("blocked");
Alert.alert(
"Microphone Blocked",
"Please enable microphone access in Settings to continue the interview.",
[
{ text: "Cancel", style: "cancel" },
{ text: "Open Settings", onPress: () => Linking.openSettings() },
]
);
return false;
}
const granted = result === RESULTS.GRANTED;
setStatus(granted ? "granted" : "denied");
return granted;
}
}
return { status, request };
}
The React Native Interview Screen
// src/screens/InterviewScreen.tsx
import React, { useState, useEffect } from "react";
import { View, Text, TouchableOpacity, StyleSheet, AppState, AppStateStatus } from "react-native";
import {
LiveKitRoom,
useVoiceAssistant,
AudioSession,
} from "@livekit/react-native";
import { useAudioPermissions } from "../hooks/useAudioPermissions";
export function InterviewScreen({ route, navigation }) {
const { candidateName, jobRole, sessionId } = route.params;
const [token, setToken] = useState<string | null>(null);
const [serverUrl, setServerUrl] = useState<string | null>(null);
const { status: permissionStatus, request: requestPermission } = useAudioPermissions();
// Handle iOS audio session for background audio and phone call interruptions
useEffect(() => {
AudioSession.configureAudio({
ios: {
defaultOutput: "speaker",
mixWithOthers: false,
// VoIP-style mode keeps audio alive when the screen locks
audioMode: "voiceChat",
},
});
AudioSession.startAudioSession();
return () => AudioSession.stopAudioSession();
}, []);
// Handle app state changes (backgrounding, phone calls)
useEffect(() => {
const subscription = AppState.addEventListener("change", handleAppStateChange);
return () => subscription.remove();
}, []);
function handleAppStateChange(nextState: AppStateStatus) {
if (nextState === "background") {
// On Android, audio will cut unless we have foreground service
// On iOS with proper audio session config, it continues
console.log("App backgrounded during interview");
} else if (nextState === "active") {
// Resume — check if we need to reconnect
console.log("App foregrounded");
}
}
useEffect(() => {
if (permissionStatus !== "granted") return;
fetchToken();
}, [permissionStatus]);
async function fetchToken() {
const res = await fetch("https://your-api.com/api/interview/token", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ candidateName, jobRole, sessionId }),
});
const data = await res.json();
setToken(data.token);
setServerUrl(data.serverUrl);
}
if (permissionStatus === "denied" || permissionStatus === "blocked") {
return <PermissionDeniedScreen onRequest={requestPermission} />;
}
if (!token || !serverUrl) return <LoadingScreen />;
return (
<LiveKitRoom
serverUrl={serverUrl}
token={token}
connect={true}
audio={true}
video={false}
onDisconnected={() => navigation.navigate("InterviewComplete", { sessionId })}
>
<MobileInterviewInterface
candidateName={candidateName}
jobRole={jobRole}
/>
</LiveKitRoom>
);
}
function MobileInterviewInterface({ candidateName, jobRole }) {
const { state } = useVoiceAssistant();
return (
<View style={styles.container}>
<View style={styles.header}>
<Text style={styles.title}>{jobRole} Interview</Text>
<Text style={styles.subtitle}>{candidateName}</Text>
</View>
<View style={styles.agentContainer}>
<AgentStateCircle state={state} />
</View>
<Text style={styles.hint}>
{state === "listening"
? "Speak clearly — the interviewer is listening"
: state === "thinking"
? "Processing your response..."
: state === "speaking"
? "Interviewer is speaking"
: "Connecting..."}
</Text>
</View>
);
}
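handleAppStateChange above only logs. In practice you want an explicit rule for what to do when the app returns to the foreground. One possible policy, sketched as a pure function — the names and the 2-minute grace period are assumptions, not part of the LiveKit SDK:

```typescript
// Illustrative resume policy for when the app returns to the foreground.
type ResumeAction = "none" | "rejoin" | "end-session";

export function resumeAction(
  roomConnected: boolean,
  backgroundedMs: number,
  maxAbsenceMs = 2 * 60 * 1000 // assumed grace period: 2 minutes
): ResumeAction {
  if (backgroundedMs > maxAbsenceMs) return "end-session"; // gone too long
  return roomConnected ? "none" : "rejoin";                // did the room drop while away?
}
```

You would record Date.now() when AppState reports "background" and compute backgroundedMs when it reports "active", then act on the returned value.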
Flutter with livekit_client
Flutter deserves a mention because many enterprise mobile teams prefer Dart over JavaScript. The Flutter integration is more manual than React Native, but equally capable.
# pubspec.yaml
dependencies:
livekit_client: ^2.3.0
permission_handler: ^11.3.0
// lib/screens/interview_screen.dart
import 'dart:convert';
import 'package:flutter/material.dart';
import 'package:livekit_client/livekit_client.dart';
import 'package:permission_handler/permission_handler.dart';
class InterviewScreen extends StatefulWidget {
final String candidateName;
final String jobRole;
final String sessionId;
const InterviewScreen({
super.key,
required this.candidateName,
required this.jobRole,
required this.sessionId,
});
@override
State<InterviewScreen> createState() => _InterviewScreenState();
}
class _InterviewScreenState extends State<InterviewScreen> {
Room? _room;
EventsListener<RoomEvent>? _listener;
RemoteParticipant? _agentParticipant;
String _agentState = 'connecting';
@override
void initState() {
super.initState();
_initInterview();
}
Future<void> _initInterview() async {
// Request permissions
final micStatus = await Permission.microphone.request();
if (!micStatus.isGranted) {
setState(() => _agentState = 'permission_denied');
return;
}
// Fetch token
final tokenData = await _fetchToken();
// Create and connect room
final room = Room();
_listener = room.createListener();
_listener!
..on<ParticipantConnectedEvent>((event) {
if (event.participant.identity.startsWith('agent-')) {
setState(() => _agentParticipant = event.participant as RemoteParticipant);
}
})
..on<DataReceivedEvent>((event) {
// Agent sends state updates via data channel
final message = String.fromCharCodes(event.data);
final data = jsonDecode(message);
if (data['type'] == 'agent_state') {
setState(() => _agentState = data['state']);
}
});
await room.connect(
tokenData['serverUrl'],
tokenData['token'],
roomOptions: const RoomOptions(
defaultAudioCaptureOptions: AudioCaptureOptions(
noiseSuppression: true,
echoCancellation: true,
autoGainControl: true,
),
),
);
// Enable microphone
await room.localParticipant?.setMicrophoneEnabled(true);
setState(() => _room = room);
}
Future<Map<String, dynamic>> _fetchToken() async {
// ... HTTP call to your token endpoint
return {};
}
@override
Widget build(BuildContext context) {
return Scaffold(
backgroundColor: const Color(0xFF0F0F1A),
body: SafeArea(
child: Column(
children: [
_buildHeader(),
Expanded(child: _buildAgentState()),
_buildHintText(),
_buildControls(),
],
),
),
);
}
Widget _buildAgentState() {
final colors = {
'listening': const Color(0xFF22c55e),
'thinking': const Color(0xFF3b82f6),
'speaking': const Color(0xFF8b5cf6),
'connecting': const Color(0xFFf59e0b),
};
final color = colors[_agentState] ?? const Color(0xFF6b7280);
return Center(
child: AnimatedContainer(
duration: const Duration(milliseconds: 300),
width: 120,
height: 120,
decoration: BoxDecoration(
shape: BoxShape.circle,
color: color.withOpacity(0.2),
border: Border.all(color: color, width: 3),
),
child: Icon(
_agentState == 'listening'
? Icons.mic
: _agentState == 'speaking'
? Icons.volume_up
: Icons.psychology,
color: color,
size: 48,
),
),
);
}
@override
void dispose() {
_listener?.dispose();
_room?.disconnect();
_room?.dispose();
super.dispose();
}
}
Reconnection Strategies for Network Drops
Network reliability is your biggest enemy in mobile interviews. Cellular drops, WiFi handoffs, and corporate firewalls all conspire against you. Here’s how to handle it:
LiveKit’s Built-in Reconnection
LiveKit handles basic reconnection automatically, but you need to configure it and handle the UI states:
// web client reconnection handling
import { RoomEvent, ConnectionState } from "livekit-client";
import { useRoomContext } from "@livekit/components-react";
import { useEffect, useState } from "react";
export function useReconnectionHandler() {
const room = useRoomContext();
const [connectionState, setConnectionState] = useState<ConnectionState>(room.state);
const [reconnectAttempt, setReconnectAttempt] = useState(0);
useEffect(() => {
function handleStateChange(state: ConnectionState) {
setConnectionState(state);
if (state === ConnectionState.Reconnecting) {
setReconnectAttempt((prev) => prev + 1);
} else if (state === ConnectionState.Connected) {
setReconnectAttempt(0);
}
}
room.on(RoomEvent.ConnectionStateChanged, handleStateChange);
return () => room.off(RoomEvent.ConnectionStateChanged, handleStateChange);
}, [room]);
return { connectionState, reconnectAttempt };
}
Configure the room with generous reconnection timeouts — candidates on mobile often have brief gaps in coverage:
<LiveKitRoom
token={token}
serverUrl={serverUrl}
connect={true}
options={{
// livekit-client expects a ReconnectPolicy object here; nextRetryDelayInMs
// returns the next delay in milliseconds, or null to stop retrying.
reconnectPolicy: {
nextRetryDelayInMs: ({ retryCount }) =>
retryCount >= 5 ? null : Math.min(1000 * 2 ** retryCount, 10000),
},
}}
connectOptions={{
websocketTimeout: 15000,
peerConnectionTimeout: 15000,
}}
>
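Whatever shape your installed livekit-client version expects for the policy object (recent versions take a ReconnectPolicy with a nextRetryDelayInMs callback that returns the next delay, or null to stop — check your version), the schedule itself is worth expressing as a pure function you can test. A sketch matching the parameters above:

```typescript
// Exponential backoff: 1s, 2s, 4s, 8s, 10s (capped), then give up.
// Returns null when retries are exhausted.
export function retryDelayMs(
  retryCount: number,
  maxRetries = 5,
  minDelayMs = 1000,
  maxDelayMs = 10000,
  multiplier = 2
): number | null {
  if (retryCount >= maxRetries) return null;
  return Math.min(minDelayMs * multiplier ** retryCount, maxDelayMs);
}
```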
Reconnection UI Banner
function ReconnectionBanner() {
const { connectionState, reconnectAttempt } = useReconnectionHandler();
if (connectionState === ConnectionState.Connected) return null;
return (
<div className={`reconnection-banner ${connectionState.toLowerCase()}`}>
{connectionState === ConnectionState.Reconnecting && (
<>
<span className="spinner" />
<span>Reconnecting... (attempt {reconnectAttempt}/5)</span>
</>
)}
{connectionState === ConnectionState.Disconnected && (
<>
<span>Connection lost. Your progress has been saved.</span>
<button onClick={() => window.location.reload()}>Rejoin Interview</button>
</>
)}
</div>
);
}
Mobile-Specific Challenges
Background Audio on iOS
iOS aggressively pauses audio when your app goes to background unless you configure the audio session correctly. In React Native:
// For iOS background audio, add these to Info.plist
// UIBackgroundModes: audio, voip
// And configure the audio session:
import { Platform } from "react-native";
import { AudioSession } from "@livekit/react-native";
if (Platform.OS === "ios") {
AudioSession.configureAudio({
ios: {
audioMode: "voiceChat", // Allows VoIP to continue in background
defaultOutput: "earpiece", // Natural phone call feel
mixWithOthers: false,
},
});
}
Handling Phone Call Interruptions
Nothing is more disruptive than a phone call interrupting your interview. Handle it explicitly:
import { InterruptionType, AudioSession } from "@livekit/react-native";
AudioSession.addEventListener("interruptionEvent", (event) => {
if (event.interruptionType === InterruptionType.Began) {
// Phone call started — mute the candidate to avoid confusion
room.localParticipant?.setMicrophoneEnabled(false);
showNotification("Interview paused — phone call in progress");
} else if (event.interruptionType === InterruptionType.Ended) {
// Call ended — prompt to resume
showResumePrompt();
}
});
Android Foreground Service for Background Audio
On Android, audio processing in the background requires a foreground service. Without it, the OS will kill your audio after a few seconds:
// android/app/src/main/java/com/yourapp/InterviewForegroundService.kt
class InterviewForegroundService : Service() {
override fun onStartCommand(intent: Intent?, flags: Int, startId: Int): Int {
val notification = buildNotification("Interview in progress — tap to return")
startForeground(NOTIFICATION_ID, notification)
return START_STICKY
}
}
PWA vs Native: The Trade-offs
Should you build a PWA or native apps? Here’s what I’ve learned shipping both:
PWA (Web)
Pros: Single codebase, instant updates, no app store friction, works on desktop and mobile. Cons: iOS Safari has historically had poor WebRTC support (improved significantly in iOS 16+), no access to background audio on iOS Safari, limited control over audio session routing.
React Native / Flutter
Pros: Full audio session control, background audio, foreground service support, better reliability on spotty networks, camera access for video interviews. Cons: Separate codebase, app store approval, installation friction.
My recommendation: start with the web client for your MVP. Most interviews happen on desktop, and the web experience is excellent there. Build the native client only when candidates start requesting mobile support, or when you add video features that require more camera control.
If you go native, React Native is the faster choice if your team knows JavaScript. Flutter is worth the Dart learning curve if you’re building a product where UI quality and animation smoothness are core differentiators.
Complete CSS for Interview UI
/* interview.css */
.interview-container {
display: flex;
flex-direction: column;
height: 100vh;
background: var(--color-bg);
color: var(--color-text);
padding: 1.5rem;
max-width: 720px;
margin: 0 auto;
}
.interview-header {
display: flex;
justify-content: space-between;
align-items: flex-start;
margin-bottom: 1.5rem;
}
.interview-timer {
font-family: var(--font-mono);
font-size: 1.5rem;
font-weight: 600;
color: var(--color-text);
}
.timer-warning {
color: #ef4444;
animation: pulse 1s infinite;
}
.section-progress {
display: flex;
gap: 0.5rem;
margin-bottom: 2rem;
flex-wrap: wrap;
}
.section-pill {
padding: 0.25rem 0.75rem;
border-radius: 9999px;
font-size: 0.75rem;
font-weight: 500;
transition: all 0.3s;
}
.section-pill.completed {
background: #22c55e20;
color: #22c55e;
border: 1px solid #22c55e40;
}
.section-pill.active {
background: #3b82f620;
color: #3b82f6;
border: 1px solid #3b82f6;
font-weight: 700;
}
.section-pill.upcoming {
background: var(--color-border);
color: var(--color-text);
opacity: 0.5;
border: 1px solid transparent;
}
.agent-state-container {
flex: 1;
display: flex;
flex-direction: column;
align-items: center;
justify-content: center;
gap: 2rem;
}
.agent-state {
display: flex;
align-items: center;
gap: 0.75rem;
font-size: 1.125rem;
font-weight: 600;
}
.state-dot {
width: 12px;
height: 12px;
border-radius: 50%;
background: currentColor;
}
.state-dot.pulse {
animation: pulse 1.5s ease-in-out infinite;
}
.agent-visualizer {
width: 300px;
height: 80px;
}
@keyframes pulse {
0%, 100% { opacity: 1; transform: scale(1); }
50% { opacity: 0.6; transform: scale(0.9); }
}
.reconnection-banner {
position: fixed;
top: 0;
left: 0;
right: 0;
padding: 0.75rem 1rem;
display: flex;
align-items: center;
justify-content: center;
gap: 0.75rem;
font-size: 0.875rem;
font-weight: 500;
z-index: 100;
}
.reconnection-banner.reconnecting {
background: #f59e0b;
color: #000;
}
.reconnection-banner.disconnected {
background: #ef4444;
color: #fff;
}
What We Built
We now have a complete client layer:
- A React web client using LiveKitRoom, useVoiceAssistant, and BarVisualizer
- Agent state visualization with distinct states for listening, thinking, and speaking
- A timer with section progress tracking
- A token generation API that dispatches the agent into the room
- React Native client with audio session management for iOS and Android
- Flutter as an alternative for Dart-native teams
- Reconnection handling for network drops
- Mobile-specific audio: background audio, phone call interruptions, foreground services
The client is the candidate’s only window into the interview. A confusing or unreliable UI undermines trust in the entire process, regardless of how smart the AI underneath is. The investment in getting this layer right pays off in completion rates and candidate satisfaction scores.
In Part 8, we add video to the mix. We’ll explore when video analysis actually helps (live coding interviews, engagement tracking) versus when it’s noise (emotional analysis, lie detection), and how to run Gemini Live’s multimodal sessions without blowing your per-interview cost budget.
This is Part 7 of a 12-part series: The Voice AI Interview Playbook.
Series outline:
- Why Real-Time Voice Changes Everything — The landscape, the vision, and the reference architecture (Part 1)
- Cascaded vs. Speech-to-Speech — Choosing your pipeline architecture (Part 2)
- LiveKit vs. Pipecat vs. Direct — Picking your framework (Part 3)
- STT, LLM, and TTS That Actually Work — Building the voice pipeline (Part 4)
- Multi-Role Agents — Interviewer, coach, and evaluator personas (Part 5)
- Knowledge Base and RAG — Making your voice agent an expert (Part 6)
- Web and Mobile Clients — Cross-platform voice experiences (this post)
- Video Interview Integration — Multimodal analysis with Gemini Live (Part 8)
- Recording, Transcription, and Compliance — GDPR, HIPAA, and getting it right (Part 9)
- Scaling to Thousands — Architecture for concurrent voice sessions (Part 10)
- Cost Optimization — From $0.14/min to $0.03/min (Part 11)
- Multi-Provider Support — OpenAI Realtime, Bedrock Nova, Grok, and the adapter pattern (Part 12)