In Part 6, we gave our agents expertise by wiring up a knowledge base with RAG. The agents now know what they’re talking about — they can answer domain-specific questions, recall relevant context, and stay grounded. But all of that intelligence is still running headless. It’s time to build the interfaces that candidates actually see and use.
This post covers the client side in full: the React web client, the React Native mobile app, a Flutter alternative, and all the cross-platform nuances that will trip you up if you’re not prepared. By the end, you’ll have a production-ready client that handles audio permissions, reconnection, mobile edge cases, and a UI that makes candidates feel at ease rather than anxious.
The Client Architecture Overview
Before writing any code, it helps to understand what the client is responsible for versus what the server handles.
The server side — which we covered in Parts 3 through 6 — runs the LiveKit SFU, the agent pipeline, the RAG system, and the evaluation logic. The client’s job is simpler but not easy:
- Acquire a token from your API
- Connect to the LiveKit room
- Capture microphone audio and send it
- Receive and play the agent’s audio back
- Show meaningful UI state (connecting, listening, thinking, speaking)
- Handle errors and network drops gracefully
What makes this non-trivial is audio — specifically, getting it right across browsers, iOS, Android, and varying network conditions.
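It helps to make those UI states explicit before reaching for an SDK. A minimal sketch of the client lifecycle as a state machine — the state and event names here are illustrative, not part of any LiveKit API:

```typescript
// Illustrative client session states — a model of what the UI must render,
// not a LiveKit API.
type SessionState =
  | "fetching-token" | "connecting" | "in-interview"
  | "reconnecting" | "ended" | "error";

type SessionEvent =
  | "token-received" | "room-connected" | "connection-lost"
  | "connection-restored" | "disconnected" | "failure";

// Legal transitions; any other event leaves the state unchanged.
const transitions: Record<SessionState, Partial<Record<SessionEvent, SessionState>>> = {
  "fetching-token": { "token-received": "connecting", failure: "error" },
  "connecting": { "room-connected": "in-interview", failure: "error" },
  "in-interview": { "connection-lost": "reconnecting", disconnected: "ended" },
  "reconnecting": { "connection-restored": "in-interview", disconnected: "ended", failure: "error" },
  "ended": {},
  "error": {},
};

export function nextState(state: SessionState, event: SessionEvent): SessionState {
  return transitions[state][event] ?? state;
}
```

Driving the render from a single state value like this keeps the connecting, reconnecting, and error UIs from drifting out of sync with the actual room connection.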
Web Client with LiveKit React Components
LiveKit’s React SDK is the fastest path to a working web client. Install the dependencies:
npm install @livekit/components-react @livekit/components-styles livekit-client
The core of your interview room component:
// src/components/InterviewRoom.tsx
import {
LiveKitRoom,
useVoiceAssistant,
BarVisualizer,
RoomAudioRenderer,
VoiceAssistantControlBar,
} from "@livekit/components-react";
import "@livekit/components-styles";
import { useState, useEffect, useCallback } from "react";
interface InterviewRoomProps {
candidateName: string;
jobRole: string;
sessionId: string;
}
export function InterviewRoom({ candidateName, jobRole, sessionId }: InterviewRoomProps) {
const [token, setToken] = useState<string | null>(null);
const [serverUrl, setServerUrl] = useState<string | null>(null);
const [error, setError] = useState<string | null>(null);
useEffect(() => {
async function fetchToken() {
try {
const res = await fetch("/api/interview/token", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ candidateName, jobRole, sessionId }),
});
if (!res.ok) throw new Error("Failed to get token");
const data = await res.json();
setToken(data.token);
setServerUrl(data.serverUrl);
} catch (err) {
setError("Could not start interview. Please try again.");
}
}
fetchToken();
}, [candidateName, jobRole, sessionId]);
if (error) return <div className="error-state">{error}</div>;
if (!token || !serverUrl) return <LoadingState />;
return (
<LiveKitRoom
token={token}
serverUrl={serverUrl}
connect={true}
audio={true}
video={false}
onDisconnected={() => console.log("Disconnected from interview")}
onError={(err) => console.error("LiveKit error:", err)}
>
<InterviewInterface candidateName={candidateName} jobRole={jobRole} />
<RoomAudioRenderer />
</LiveKitRoom>
);
}
The RoomAudioRenderer component is easy to miss but critical — it’s what actually plays the agent’s audio. Without it, you’ll be in the room but in total silence.
The Interview Interface Component
The inner interface uses the useVoiceAssistant hook, which gives you real-time state about what the AI agent is doing:
// src/components/InterviewInterface.tsx
import {
useVoiceAssistant,
BarVisualizer,
useLocalParticipant,
} from "@livekit/components-react";
import { useState, useEffect, useRef, useCallback } from "react";
import { AgentState } from "@livekit/components-react";
interface InterviewInterfaceProps {
candidateName: string;
jobRole: string;
}
export function InterviewInterface({ candidateName, jobRole }: InterviewInterfaceProps) {
const { state, audioTrack } = useVoiceAssistant();
const { localParticipant } = useLocalParticipant();
const [elapsedTime, setElapsedTime] = useState(0);
const [currentSection, setCurrentSection] = useState(0);
const [isMuted, setIsMuted] = useState(false);
const timerRef = useRef<ReturnType<typeof setInterval> | null>(null);
const sections = ["Introduction", "Technical Skills", "Problem Solving", "Culture Fit", "Q&A"];
const totalDuration = 45 * 60; // 45 minutes
useEffect(() => {
timerRef.current = setInterval(() => {
setElapsedTime((prev) => {
const next = prev + 1;
// Estimate section progress based on time
const sectionDuration = totalDuration / sections.length;
setCurrentSection(Math.min(Math.floor(next / sectionDuration), sections.length - 1));
return next;
});
}, 1000);
return () => {
if (timerRef.current) clearInterval(timerRef.current);
};
}, []);
const toggleMute = useCallback(async () => {
if (!localParticipant) return;
const newMuteState = !isMuted;
await localParticipant.setMicrophoneEnabled(!newMuteState);
setIsMuted(newMuteState);
}, [localParticipant, isMuted]);
const formatTime = (seconds: number) => {
const m = Math.floor(seconds / 60).toString().padStart(2, "0");
const s = (seconds % 60).toString().padStart(2, "0");
return `${m}:${s}`;
};
return (
<div className="interview-container">
{/* Header */}
<div className="interview-header">
<div className="candidate-info">
<h2>{candidateName}</h2>
<span className="job-role">{jobRole} Interview</span>
</div>
<div className="interview-timer">
<span className={elapsedTime > totalDuration * 0.9 ? "timer-warning" : ""}>
{formatTime(elapsedTime)}
</span>
</div>
</div>
{/* Section Progress Bar */}
<div className="section-progress">
{sections.map((section, index) => (
<div
key={section}
className={`section-pill ${
index < currentSection
? "completed"
: index === currentSection
? "active"
: "upcoming"
}`}
>
{section}
</div>
))}
</div>
{/* Agent State Indicator */}
<div className="agent-state-container">
<AgentStateDisplay state={state} />
{audioTrack && (
<BarVisualizer
state={state}
trackRef={audioTrack}
barCount={20}
options={{ minHeight: 4, maxHeight: 60 }}
className="agent-visualizer"
/>
)}
</div>
{/* Controls */}
<div className="controls">
<button
onClick={toggleMute}
className={`mute-button ${isMuted ? "muted" : ""}`}
aria-label={isMuted ? "Unmute microphone" : "Mute microphone"}
>
{isMuted ? "Unmute" : "Mute"}
</button>
</div>
</div>
);
}
function AgentStateDisplay({ state }: { state: AgentState }) {
// AgentState in @livekit/components-react is a string union
// ("disconnected" | "connecting" | "initializing" | "listening" | "thinking" | "speaking"),
// not an enum — so the lookup keys are plain strings.
const stateConfig: Record<string, { label: string; color: string; pulse: boolean }> = {
disconnected: { label: "Disconnected", color: "#ef4444", pulse: false },
connecting: { label: "Connecting...", color: "#f59e0b", pulse: true },
initializing: { label: "Preparing...", color: "#f59e0b", pulse: true },
listening: { label: "Listening", color: "#22c55e", pulse: true },
thinking: { label: "Thinking...", color: "#3b82f6", pulse: true },
speaking: { label: "Speaking", color: "#8b5cf6", pulse: false },
};
const config = stateConfig[state] ?? { label: "Unknown", color: "#6b7280", pulse: false };
return (
<div className="agent-state" style={{ color: config.color }}>
<span className={`state-dot ${config.pulse ? "pulse" : ""}`} />
<span className="state-label">{config.label}</span>
</div>
);
}
The AgentState value from LiveKit (a string union type in @livekit/components-react) tracks exactly what the AI is doing at any given moment. Showing this state clearly is the difference between a candidate who feels confused (“is it listening to me?”) and one who feels confident in the system.
Token Generation and Room Management API
Your backend needs a token endpoint. Here’s a minimal FastAPI implementation:
# api/interview/token.py
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel
from livekit import api
import os
import json
router = APIRouter()
class TokenRequest(BaseModel):
candidate_name: str
job_role: str
session_id: str
@router.post("/api/interview/token")
async def get_interview_token(request: TokenRequest):
livekit_api_key = os.environ["LIVEKIT_API_KEY"]
livekit_api_secret = os.environ["LIVEKIT_API_SECRET"]
livekit_url = os.environ["LIVEKIT_URL"]
room_name = f"interview-{request.session_id}"
token = (
api.AccessToken(livekit_api_key, livekit_api_secret)
.with_identity(f"candidate-{request.session_id}")
.with_name(request.candidate_name)
.with_grants(
api.VideoGrants(
room_join=True,
room=room_name,
can_publish=True,
can_subscribe=True,
)
)
.to_jwt()
)
# Also dispatch the agent to join this room
await dispatch_agent(room_name, request.job_role, request.session_id)
return {
"token": token,
"serverUrl": livekit_url,
"roomName": room_name,
}
async def dispatch_agent(room_name: str, job_role: str, session_id: str):
"""Tell the agent worker to join this room"""
lk_api = api.LiveKitAPI(
url=os.environ["LIVEKIT_URL"],
api_key=os.environ["LIVEKIT_API_KEY"],
api_secret=os.environ["LIVEKIT_API_SECRET"],
)
await lk_api.agent_dispatch.create_dispatch(
api.CreateAgentDispatchRequest(
agent_name="interview-agent",
room=room_name,
# json.dumps is safer than hand-built JSON if job_role ever contains quotes
metadata=json.dumps({"job_role": job_role, "session_id": session_id}),
)
)
await lk_api.aclose()
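While wiring this up, it is useful to sanity-check the token your endpoint returns. LiveKit access tokens are standard JWTs, so during development you can inspect the grants by decoding the (unverified) payload. A debug-only sketch — `decodeJwtPayload` is a hypothetical helper, and decoding without verification must never substitute for server-side checks:

```typescript
// Debug-only: decode a JWT payload WITHOUT verifying the signature.
// Useful for confirming the room grant during development; never use
// this for authorization decisions.
export function decodeJwtPayload(token: string): Record<string, unknown> {
  const parts = token.split(".");
  if (parts.length !== 3) throw new Error("not a JWT");
  return JSON.parse(Buffer.from(parts[1], "base64url").toString("utf8"));
}

// e.g. const payload = decodeJwtPayload(data.token);
//      (payload.video as { room?: string })?.room  // should be "interview-<sessionId>"
```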
Audio Visualization Beyond BarVisualizer
LiveKit’s BarVisualizer works well, but you might want custom visualization for your brand. Here’s a canvas-based circular visualizer using the raw audio track:
// src/components/CircularVisualizer.tsx
import { useEffect, useRef } from "react";
import type { TrackReferenceOrPlaceholder } from "@livekit/components-react";
import { Track } from "livekit-client";
interface CircularVisualizerProps {
trackRef: TrackReferenceOrPlaceholder;
size?: number;
}
export function CircularVisualizer({ trackRef, size = 200 }: CircularVisualizerProps) {
const canvasRef = useRef<HTMLCanvasElement>(null);
const animRef = useRef<number | null>(null);
const analyzerRef = useRef<AnalyserNode | null>(null);
useEffect(() => {
const track = trackRef.publication?.track;
if (!track || track.kind !== Track.Kind.Audio) return;
const canvas = canvasRef.current;
if (!canvas) return; // bail before creating the AudioContext so we never leak one
const ctx = canvas.getContext("2d")!;
const audioContext = new AudioContext();
const source = audioContext.createMediaStreamSource(
new MediaStream([track.mediaStreamTrack])
);
const analyzer = audioContext.createAnalyser();
analyzer.fftSize = 256;
source.connect(analyzer);
analyzerRef.current = analyzer;
const bufferLength = analyzer.frequencyBinCount;
const dataArray = new Uint8Array(bufferLength);
function draw() {
animRef.current = requestAnimationFrame(draw);
analyzer.getByteFrequencyData(dataArray);
ctx.clearRect(0, 0, size, size);
const centerX = size / 2;
const centerY = size / 2;
const radius = size * 0.3;
const bars = 60;
for (let i = 0; i < bars; i++) {
const dataIndex = Math.floor((i / bars) * bufferLength);
const value = dataArray[dataIndex] / 255;
const barHeight = value * (size * 0.2);
const angle = (i / bars) * Math.PI * 2 - Math.PI / 2;
const x1 = centerX + Math.cos(angle) * radius;
const y1 = centerY + Math.sin(angle) * radius;
const x2 = centerX + Math.cos(angle) * (radius + barHeight);
const y2 = centerY + Math.sin(angle) * (radius + barHeight);
ctx.beginPath();
ctx.moveTo(x1, y1);
ctx.lineTo(x2, y2);
ctx.strokeStyle = `hsl(${260 + value * 60}, 80%, ${50 + value * 20}%)`;
ctx.lineWidth = 3;
ctx.lineCap = "round";
ctx.stroke();
}
}
draw();
return () => {
if (animRef.current) cancelAnimationFrame(animRef.current);
audioContext.close();
};
}, [trackRef, size]);
return <canvas ref={canvasRef} width={size} height={size} />;
}
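The geometry inside draw() is where off-by-one bugs hide (the data index, the angle offset), and it is easy to factor into a pure, unit-testable function. A sketch mirroring the same parameters as the component above — `barSegment` is an illustrative helper, not an SDK export:

```typescript
// Pure geometry for one visualizer bar — mirrors the math in the draw() loop.
export interface BarSegment {
  x1: number; y1: number; x2: number; y2: number; value: number;
}

export function barSegment(i: number, bars: number, data: Uint8Array, size: number): BarSegment {
  const center = size / 2;
  const radius = size * 0.3;                                      // inner ring
  const value = data[Math.floor((i / bars) * data.length)] / 255; // 0..1
  const barHeight = value * (size * 0.2);                         // outward extent
  const angle = (i / bars) * Math.PI * 2 - Math.PI / 2;           // start at 12 o'clock
  return {
    x1: center + Math.cos(angle) * radius,
    y1: center + Math.sin(angle) * radius,
    x2: center + Math.cos(angle) * (radius + barHeight),
    y2: center + Math.sin(angle) * (radius + barHeight),
    value,
  };
}
```

The canvas effect then only iterates and strokes, which makes the visual math trivial to verify without a browser.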
React Native with @livekit/react-native
The React Native SDK mirrors the web SDK closely, but with important differences around native modules and audio sessions.
npm install @livekit/react-native @livekit/react-native-webrtc
npx pod-install # iOS only
You need to register the WebRTC native module before using any LiveKit code:
// index.js or App.tsx — must run before any LiveKit code
import { registerGlobals } from "@livekit/react-native";
registerGlobals();
Audio Permissions on iOS and Android
Audio permissions are where React Native interviews fall apart if you’re not careful. Here’s a complete permission handler:
// src/hooks/useAudioPermissions.ts
import { useState, useEffect } from "react";
import { Platform, PermissionsAndroid, Alert, Linking } from "react-native";
export type PermissionStatus = "checking" | "granted" | "denied" | "blocked";
export function useAudioPermissions(): {
status: PermissionStatus;
request: () => Promise<boolean>;
} {
const [status, setStatus] = useState<PermissionStatus>("checking");
useEffect(() => {
checkPermission();
}, []);
async function checkPermission() {
if (Platform.OS === "android") {
const result = await PermissionsAndroid.check(
PermissionsAndroid.PERMISSIONS.RECORD_AUDIO
);
setStatus(result ? "granted" : "denied");
} else {
// iOS — use react-native-permissions (installed separately)
const { check, PERMISSIONS } = await import("react-native-permissions");
const result = await check(PERMISSIONS.IOS.MICROPHONE);
setStatus(
result === "granted"
? "granted"
: result === "blocked"
? "blocked"
: "denied"
);
}
}
async function request(): Promise<boolean> {
if (Platform.OS === "android") {
const result = await PermissionsAndroid.request(
PermissionsAndroid.PERMISSIONS.RECORD_AUDIO,
{
title: "Microphone Permission",
message: "This interview requires access to your microphone.",
buttonPositive: "Allow",
buttonNegative: "Deny",
}
);
const granted = result === PermissionsAndroid.RESULTS.GRANTED;
setStatus(granted ? "granted" : "denied");
return granted;
} else {
const { request, PERMISSIONS, RESULTS } = await import(
"react-native-permissions"
);
const result = await request(PERMISSIONS.IOS.MICROPHONE);
if (result === RESULTS.BLOCKED) {
setStatus("blocked");
Alert.alert(
"Microphone Blocked",
"Please enable microphone access in Settings to continue the interview.",
[
{ text: "Cancel", style: "cancel" },
{ text: "Open Settings", onPress: () => Linking.openSettings() },
]
);
return false;
}
const granted = result === RESULTS.GRANTED;
setStatus(granted ? "granted" : "denied");
return granted;
}
}
return { status, request };
}
The React Native Interview Screen
// src/screens/InterviewScreen.tsx
import React, { useState, useEffect } from "react";
import { View, Text, TouchableOpacity, StyleSheet, AppState, AppStateStatus } from "react-native";
import {
LiveKitRoom,
useVoiceAssistant,
AudioSession,
} from "@livekit/react-native";
import { useAudioPermissions } from "../hooks/useAudioPermissions";
export function InterviewScreen({ route, navigation }) {
const { candidateName, jobRole, sessionId } = route.params;
const [token, setToken] = useState<string | null>(null);
const [serverUrl, setServerUrl] = useState<string | null>(null);
const { status: permissionStatus, request: requestPermission } = useAudioPermissions();
// Handle iOS audio session for background audio and phone call interruptions
useEffect(() => {
AudioSession.configureAudio({
ios: {
defaultOutput: "speaker",
mixWithOthers: false,
// VoIP-style mode keeps audio alive when the screen locks
audioMode: "voiceChat",
},
});
AudioSession.startAudioSession();
return () => AudioSession.stopAudioSession();
}, []);
// Handle app state changes (backgrounding, phone calls)
useEffect(() => {
const subscription = AppState.addEventListener("change", handleAppStateChange);
return () => subscription.remove();
}, []);
function handleAppStateChange(nextState: AppStateStatus) {
if (nextState === "background") {
// On Android, audio will cut unless we have foreground service
// On iOS with proper audio session config, it continues
console.log("App backgrounded during interview");
} else if (nextState === "active") {
// Resume — check if we need to reconnect
console.log("App foregrounded");
}
}
useEffect(() => {
if (permissionStatus !== "granted") return;
fetchToken();
}, [permissionStatus]);
async function fetchToken() {
const res = await fetch("https://your-api.com/api/interview/token", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ candidateName, jobRole, sessionId }),
});
const data = await res.json();
setToken(data.token);
setServerUrl(data.serverUrl);
}
if (permissionStatus === "denied" || permissionStatus === "blocked") {
return <PermissionDeniedScreen onRequest={requestPermission} />;
}
if (!token || !serverUrl) return <LoadingScreen />;
return (
<LiveKitRoom
serverUrl={serverUrl}
token={token}
connect={true}
audio={true}
video={false}
onDisconnected={() => navigation.navigate("InterviewComplete", { sessionId })}
>
<MobileInterviewInterface
candidateName={candidateName}
jobRole={jobRole}
/>
</LiveKitRoom>
);
}
function MobileInterviewInterface({ candidateName, jobRole }) {
const { state } = useVoiceAssistant();
return (
<View style={styles.container}>
<View style={styles.header}>
<Text style={styles.title}>{jobRole} Interview</Text>
<Text style={styles.subtitle}>{candidateName}</Text>
</View>
<View style={styles.agentContainer}>
<AgentStateCircle state={state} />
</View>
<Text style={styles.hint}>
{state === "listening"
? "Speak clearly — the interviewer is listening"
: state === "thinking"
? "Processing your response..."
: state === "speaking"
? "Interviewer is speaking"
: "Connecting..."}
</Text>
</View>
);
}
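handleAppStateChange above only logs. In practice you want an explicit rule for what to do when the app returns to the foreground. One possible policy, sketched as a pure function — the names and the 2-minute grace period are assumptions, not part of the LiveKit SDK:

```typescript
// Illustrative resume policy for when the app returns to the foreground.
type ResumeAction = "none" | "rejoin" | "end-session";

export function resumeAction(
  roomConnected: boolean,
  backgroundedMs: number,
  maxAbsenceMs = 2 * 60 * 1000 // assumed grace period: 2 minutes
): ResumeAction {
  if (backgroundedMs > maxAbsenceMs) return "end-session"; // gone too long
  return roomConnected ? "none" : "rejoin";                // did the room drop while away?
}
```

You would record Date.now() when AppState reports "background" and compute backgroundedMs when it reports "active", then act on the returned value.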
Flutter with livekit_client
Flutter deserves a mention because many enterprise mobile teams prefer Dart over JavaScript. The Flutter integration is more manual than React Native, but equally capable.
# pubspec.yaml
dependencies:
livekit_client: ^2.3.0
permission_handler: ^11.3.0
// lib/screens/interview_screen.dart
import 'dart:convert';
import 'package:flutter/material.dart';
import 'package:livekit_client/livekit_client.dart';
import 'package:permission_handler/permission_handler.dart';
class InterviewScreen extends StatefulWidget {
final String candidateName;
final String jobRole;
final String sessionId;
const InterviewScreen({
super.key,
required this.candidateName,
required this.jobRole,
required this.sessionId,
});
@override
State<InterviewScreen> createState() => _InterviewScreenState();
}
class _InterviewScreenState extends State<InterviewScreen> {
Room? _room;
EventsListener<RoomEvent>? _listener;
RemoteParticipant? _agentParticipant;
String _agentState = 'connecting';
@override
void initState() {
super.initState();
_initInterview();
}
Future<void> _initInterview() async {
// Request permissions
final micStatus = await Permission.microphone.request();
if (!micStatus.isGranted) {
setState(() => _agentState = 'permission_denied');
return;
}
// Fetch token
final tokenData = await _fetchToken();
// Create and connect room
final room = Room();
_listener = room.createListener();
_listener!
..on<ParticipantConnectedEvent>((event) {
if (event.participant.identity.startsWith('agent-')) {
setState(() => _agentParticipant = event.participant as RemoteParticipant);
}
})
..on<DataReceivedEvent>((event) {
// Agent sends state updates via data channel
final message = String.fromCharCodes(event.data);
final data = jsonDecode(message);
if (data['type'] == 'agent_state') {
setState(() => _agentState = data['state']);
}
});
await room.connect(
tokenData['serverUrl'],
tokenData['token'],
roomOptions: const RoomOptions(
defaultAudioCaptureOptions: AudioCaptureOptions(
noiseSuppression: true,
echoCancellation: true,
autoGainControl: true,
),
),
);
// Enable microphone
await room.localParticipant?.setMicrophoneEnabled(true);
setState(() => _room = room);
}
Future<Map<String, dynamic>> _fetchToken() async {
// ... HTTP call to your token endpoint
return {};
}
@override
Widget build(BuildContext context) {
return Scaffold(
backgroundColor: const Color(0xFF0F0F1A),
body: SafeArea(
child: Column(
children: [
_buildHeader(),
Expanded(child: _buildAgentState()),
_buildHintText(),
_buildControls(),
],
),
),
);
}
Widget _buildAgentState() {
final colors = {
'listening': const Color(0xFF22c55e),
'thinking': const Color(0xFF3b82f6),
'speaking': const Color(0xFF8b5cf6),
'connecting': const Color(0xFFf59e0b),
};
final color = colors[_agentState] ?? const Color(0xFF6b7280);
return Center(
child: AnimatedContainer(
duration: const Duration(milliseconds: 300),
width: 120,
height: 120,
decoration: BoxDecoration(
shape: BoxShape.circle,
color: color.withOpacity(0.2),
border: Border.all(color: color, width: 3),
),
child: Icon(
_agentState == 'listening'
? Icons.mic
: _agentState == 'speaking'
? Icons.volume_up
: Icons.psychology,
color: color,
size: 48,
),
),
);
}
@override
void dispose() {
_listener?.dispose();
_room?.disconnect();
_room?.dispose();
super.dispose();
}
}
Reconnection Strategies for Network Drops
Network reliability is your biggest enemy in mobile interviews. Cellular drops, WiFi handoffs, and corporate firewalls all conspire against you. Here’s how to handle it:
LiveKit’s Built-in Reconnection
LiveKit handles basic reconnection automatically, but you need to configure it and handle the UI states:
// web client reconnection handling
import { RoomEvent, ConnectionState } from "livekit-client";
import { useRoomContext } from "@livekit/components-react";
import { useEffect, useState } from "react";
export function useReconnectionHandler() {
const room = useRoomContext();
const [connectionState, setConnectionState] = useState<ConnectionState>(room.state);
const [reconnectAttempt, setReconnectAttempt] = useState(0);
useEffect(() => {
function handleStateChange(state: ConnectionState) {
setConnectionState(state);
if (state === ConnectionState.Reconnecting) {
setReconnectAttempt((prev) => prev + 1);
} else if (state === ConnectionState.Connected) {
setReconnectAttempt(0);
}
}
room.on(RoomEvent.ConnectionStateChanged, handleStateChange);
return () => room.off(RoomEvent.ConnectionStateChanged, handleStateChange);
}, [room]);
return { connectionState, reconnectAttempt };
}
Configure the room with generous reconnection timeouts — candidates on mobile often have brief gaps in coverage:
<LiveKitRoom
token={token}
serverUrl={serverUrl}
connect={true}
options={{
// livekit-client expects a ReconnectPolicy object here; nextRetryDelayInMs
// returns the next delay in milliseconds, or null to stop retrying.
reconnectPolicy: {
nextRetryDelayInMs: ({ retryCount }) =>
retryCount >= 5 ? null : Math.min(1000 * 2 ** retryCount, 10000),
},
}}
connectOptions={{
websocketTimeout: 15000,
peerConnectionTimeout: 15000,
}}
>
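Whatever shape your installed livekit-client version expects for the policy object (recent versions take a ReconnectPolicy with a nextRetryDelayInMs callback that returns the next delay, or null to stop — check your version), the schedule itself is worth expressing as a pure function you can test. A sketch matching the parameters above:

```typescript
// Exponential backoff: 1s, 2s, 4s, 8s, 10s (capped), then give up.
// Returns null when retries are exhausted.
export function retryDelayMs(
  retryCount: number,
  maxRetries = 5,
  minDelayMs = 1000,
  maxDelayMs = 10000,
  multiplier = 2
): number | null {
  if (retryCount >= maxRetries) return null;
  return Math.min(minDelayMs * multiplier ** retryCount, maxDelayMs);
}
```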
Reconnection UI Banner
function ReconnectionBanner() {
const { connectionState, reconnectAttempt } = useReconnectionHandler();
if (connectionState === ConnectionState.Connected) return null;
return (
<div className={`reconnection-banner ${connectionState.toLowerCase()}`}>
{connectionState === ConnectionState.Reconnecting && (
<>
<span className="spinner" />
<span>Reconnecting... (attempt {reconnectAttempt}/5)</span>
</>
)}
{connectionState === ConnectionState.Disconnected && (
<>
<span>Connection lost. Your progress has been saved.</span>
<button onClick={() => window.location.reload()}>Rejoin Interview</button>
</>
)}
</div>
);
}
Mobile-Specific Challenges
Background Audio on iOS
iOS aggressively pauses audio when your app goes to background unless you configure the audio session correctly. In React Native:
// For iOS background audio, add these to Info.plist
// UIBackgroundModes: audio, voip
// And configure the audio session:
import { Platform } from "react-native";
import { AudioSession } from "@livekit/react-native";
if (Platform.OS === "ios") {
AudioSession.configureAudio({
ios: {
audioMode: "voiceChat", // Allows VoIP to continue in background
defaultOutput: "earpiece", // Natural phone call feel
mixWithOthers: false,
},
});
}
Handling Phone Call Interruptions
Nothing is more disruptive than a phone call interrupting your interview. Handle it explicitly:
import { InterruptionType, AudioSession } from "@livekit/react-native";
AudioSession.addEventListener("interruptionEvent", (event) => {
if (event.interruptionType === InterruptionType.Began) {
// Phone call started — mute the candidate to avoid confusion
room.localParticipant?.setMicrophoneEnabled(false);
showNotification("Interview paused — phone call in progress");
} else if (event.interruptionType === InterruptionType.Ended) {
// Call ended — prompt to resume
showResumePrompt();
}
});
Android Foreground Service for Background Audio
On Android, audio processing in the background requires a foreground service. Without it, the OS will kill your audio after a few seconds:
// android/app/src/main/java/com/yourapp/InterviewForegroundService.kt
class InterviewForegroundService : Service() {
override fun onStartCommand(intent: Intent?, flags: Int, startId: Int): Int {
val notification = buildNotification("Interview in progress — tap to return")
startForeground(NOTIFICATION_ID, notification)
return START_STICKY
}
}
PWA vs Native: The Trade-offs
Should you build a PWA or native apps? Here’s what I’ve learned shipping both:
PWA (Web)
Pros: Single codebase, instant updates, no app store friction, works on desktop and mobile. Cons: iOS Safari has historically had poor WebRTC support (improved significantly in iOS 16+), no access to background audio on iOS Safari, limited control over audio session routing.
React Native / Flutter
Pros: Full audio session control, background audio, foreground service support, better reliability on spotty networks, camera access for video interviews. Cons: Separate codebase, app store approval, installation friction.
My recommendation: start with the web client for your MVP. Most interviews happen on desktop, and the web experience is excellent there. Build the native client only when candidates start requesting mobile support, or when you add video features that require more camera control.
If you go native, React Native is the faster choice if your team knows JavaScript. Flutter is worth the Dart learning curve if you’re building a product where UI quality and animation smoothness are core differentiators.
Complete CSS for Interview UI
/* interview.css */
.interview-container {
display: flex;
flex-direction: column;
height: 100vh;
background: var(--color-bg);
color: var(--color-text);
padding: 1.5rem;
max-width: 720px;
margin: 0 auto;
}
.interview-header {
display: flex;
justify-content: space-between;
align-items: flex-start;
margin-bottom: 1.5rem;
}
.interview-timer {
font-family: var(--font-mono);
font-size: 1.5rem;
font-weight: 600;
color: var(--color-text);
}
.timer-warning {
color: #ef4444;
animation: pulse 1s infinite;
}
.section-progress {
display: flex;
gap: 0.5rem;
margin-bottom: 2rem;
flex-wrap: wrap;
}
.section-pill {
padding: 0.25rem 0.75rem;
border-radius: 9999px;
font-size: 0.75rem;
font-weight: 500;
transition: all 0.3s;
}
.section-pill.completed {
background: #22c55e20;
color: #22c55e;
border: 1px solid #22c55e40;
}
.section-pill.active {
background: #3b82f620;
color: #3b82f6;
border: 1px solid #3b82f6;
font-weight: 700;
}
.section-pill.upcoming {
background: var(--color-border);
color: var(--color-text);
opacity: 0.5;
border: 1px solid transparent;
}
.agent-state-container {
flex: 1;
display: flex;
flex-direction: column;
align-items: center;
justify-content: center;
gap: 2rem;
}
.agent-state {
display: flex;
align-items: center;
gap: 0.75rem;
font-size: 1.125rem;
font-weight: 600;
}
.state-dot {
width: 12px;
height: 12px;
border-radius: 50%;
background: currentColor;
}
.state-dot.pulse {
animation: pulse 1.5s ease-in-out infinite;
}
.agent-visualizer {
width: 300px;
height: 80px;
}
@keyframes pulse {
0%, 100% { opacity: 1; transform: scale(1); }
50% { opacity: 0.6; transform: scale(0.9); }
}
.reconnection-banner {
position: fixed;
top: 0;
left: 0;
right: 0;
padding: 0.75rem 1rem;
display: flex;
align-items: center;
justify-content: center;
gap: 0.75rem;
font-size: 0.875rem;
font-weight: 500;
z-index: 100;
}
.reconnection-banner.reconnecting {
background: #f59e0b;
color: #000;
}
.reconnection-banner.disconnected {
background: #ef4444;
color: #fff;
}
What We Built
We now have a complete client layer:
- A React web client using LiveKitRoom, useVoiceAssistant, and BarVisualizer
- Agent state visualization with distinct states for listening, thinking, and speaking
- A timer with section progress tracking
- A token generation API that dispatches the agent into the room
- React Native client with audio session management for iOS and Android
- Flutter as an alternative for Dart-native teams
- Reconnection handling for network drops
- Mobile-specific audio: background audio, phone call interruptions, foreground services
The client is the candidate’s only window into the interview. A confusing or unreliable UI undermines trust in the entire process, regardless of how smart the AI underneath is. The investment in getting this layer right pays off in completion rates and candidate satisfaction scores.
In Part 8, we add video to the mix. We’ll explore when video analysis actually helps (live coding interviews, engagement tracking) versus when it’s noise (emotional analysis, lie detection), and how to run Gemini Live’s multimodal sessions without blowing your per-interview cost budget.
This is Part 7 of a 12-part series: The Voice AI Interview Playbook.
Series outline:
- Why Real-Time Voice Changes Everything — The landscape, the vision, and the reference architecture (Part 1)
- Cascaded vs. Speech-to-Speech — Choosing your pipeline architecture (Part 2)
- LiveKit vs. Pipecat vs. Direct — Picking your framework (Part 3)
- STT, LLM, and TTS That Actually Work — Building the voice pipeline (Part 4)
- Multi-Role Agents — Interviewer, coach, and evaluator personas (Part 5)
- Knowledge Base and RAG — Making your voice agent an expert (Part 6)
- Web and Mobile Clients — Cross-platform voice experiences (this post)
- Video Interview Integration — Multimodal analysis with Gemini Live (Part 8)
- Recording, Transcription, and Compliance — GDPR, HIPAA, and getting it right (Part 9)
- Scaling to Thousands — Architecture for concurrent voice sessions (Part 10)
- Cost Optimization — From $0.14/min to $0.03/min (Part 11)
- Multi-Provider Support — OpenAI Realtime, Bedrock Nova, Grok, and the adapter pattern (Part 12)