
AGENT.md - Comlink iOS: P2P Voice Intercom System

1. Project Overview

Comlink is a high-fidelity, peer-to-peer (P2P) voice intercom application designed for loud environments such as concerts, clubs, and festivals. The app enables offline voice communication using Apple's Multipeer Connectivity framework over Bluetooth and peer-to-peer Wi-Fi, without requiring an internet connection.

Key Features:

  • Offline-First Architecture: Functions entirely in Airplane Mode (with Wi-Fi/Bluetooth enabled)
  • Background Audio Support: Continues transmitting and receiving audio when the phone is locked
  • Noise Isolation: Leverages iOS built-in voice processing to filter background noise and isolate the user's voice
  • Ultra-Low Latency: Optimized audio pipeline for real-time communication
  • OLED-Friendly UI: Dark/black theme optimized for low-light concert environments

Technical Specifications:

  • Language: Swift 6.0+
  • UI Framework: SwiftUI
  • Architecture: MVVM (Model-View-ViewModel)
  • Networking: Apple Multipeer Connectivity Framework
  • Audio: AVFoundation & AVAudioEngine
  • Target iOS: iOS 16.0+

2. Architecture Diagram (Text-Based)

┌─────────────────────────────────────────────────────────────────┐
│                          SwiftUI Views                          │
│  ┌──────────────────┐  ┌──────────────────┐  ┌───────────────┐  │
│  │ ConnectionView   │  │     TalkView     │  │  SettingsView │  │
│  └────────┬─────────┘  └────────┬─────────┘  └───────┬───────┘  │
└───────────┼─────────────────────┼────────────────────┼──────────┘
            │                     │                    │
            └─────────────────────┼────────────────────┘
                                  │
                     ┌────────────▼────────────┐
                     │    ComlinkViewModel     │
                     │   (MVVM Coordinator)    │
                     └────┬───────────────┬────┘
                          │               │
            ┌─────────────▼────┐      ┌───▼──────────────┐
            │ MultipeerManager │      │   AudioManager   │
            │ (P2P Networking) │      │ (Audio Pipeline) │
            └─────┬────────────┘      └─────────┬────────┘
                  │                             │
      ┌───────────▼────────────┐      ┌─────────▼────────────┐
      │ MCNearbyServiceBrowser │      │  AVAudioEngine       │
      │ MCNearbyServiceAdv.    │      │  - Input Node        │
      │ MCSession              │      │  - Audio Tap         │
      └───────────┬────────────┘      │  - Output Node       │
                  │                   │  - Voice Processing  │
                  │                   └─────────┬────────────┘
                  │                             │
                  └──────────────┬──────────────┘
                                 │
                     ┌───────────▼────────────┐
                     │   Data Flow Pipeline   │
                     │                        │
                     │  Mic → Tap → Buffer →  │
                     │  Network → Peer →      │
                     │  Speaker               │
                     └────────────────────────┘

Data Flow:

  1. Audio Capture: Microphone → AVAudioInputNode → Audio Tap
  2. Processing: Audio Tap → Voice Isolation → PCM Buffer
  3. Transmission: PCM Buffer → MultipeerManager → MCSession → Peer
  4. Reception: Peer → MCSession → MultipeerManager → Audio Buffer
  5. Playback: Audio Buffer → AVAudioPlayerNode → AVAudioOutputNode → Speaker

3. Step-by-Step Implementation Plan

Phase 1: Project Setup & Permissions

Goal: Configure Xcode project with required capabilities and permissions.

Tasks:

  1. Create Xcode Project

    • New iOS App with SwiftUI
    • Minimum Deployment Target: iOS 16.0
    • Enable Swift 6.0 language mode
  2. Configure Info.plist

    <key>NSMicrophoneUsageDescription</key>
    <string>Comlink needs microphone access to transmit your voice to connected peers.</string>
    
    <key>NSLocalNetworkUsageDescription</key>
    <string>Comlink uses local network to discover and connect to nearby devices.</string>
    
    <key>NSBonjourServices</key>
    <array>
        <string>_comlink._tcp</string>
        <string>_comlink._udp</string>
    </array>
    
    <key>UIBackgroundModes</key>
    <array>
        <string>audio</string>
    </array>
  3. Configure Capabilities

    • Enable "Audio, AirPlay, and Picture in Picture"
    • Enable "Background Modes" → audio (the voip mode requires PushKit/CallKit integration and is not needed here — the audio mode alone keeps the engine alive in the background)
    • No extra networking entitlement is required; Multipeer Connectivity only needs the local-network permission declared above
  4. Project Structure

    Comlink/
    ├── App/
    │   ├── ComlinkApp.swift
    │   └── AppDelegate.swift (for background audio)
    ├── Models/
    │   ├── Peer.swift
    │   └── AudioPacket.swift
    ├── ViewModels/
    │   └── ComlinkViewModel.swift
    ├── Views/
    │   ├── ConnectionView.swift
    │   ├── TalkView.swift
    │   └── Components/
    ├── Managers/
    │   ├── MultipeerManager.swift
    │   ├── AudioManager.swift
    │   └── PermissionsManager.swift
    ├── Utilities/
    │   ├── AudioCodec.swift
    │   └── Logger.swift
    └── Resources/
        └── Assets.xcassets
    

Deliverables:

  • ✅ Xcode project configured
  • ✅ Info.plist with all required permissions
  • ✅ Directory structure established

Phase 2: Multipeer Manager (Advertising & Browsing)

Goal: Implement P2P discovery and connection logic using MultipeerConnectivity.

Tasks:

  1. Create MultipeerManager Class

    • Singleton pattern conforming to ObservableObject for SwiftUI integration (the @Observable macro requires iOS 17; the @Published properties below require ObservableObject, which fits the iOS 16 target)
    • Properties:
      • peerID: MCPeerID
      • session: MCSession
      • serviceAdvertiser: MCNearbyServiceAdvertiser
      • serviceBrowser: MCNearbyServiceBrowser
      • @Published var connectedPeers: [MCPeerID]
      • @Published var availablePeers: [MCPeerID]
  2. Service Discovery Configuration

    • Service Type: "comlink" (1–15 characters; lowercase letters, numbers, and hyphens only)
    • Discovery Info: Include device name, app version
    • Security: Implement custom invitation handler (accept/decline)
  3. Session Management

    • Implement MCSessionDelegate methods:
      • session(_:peer:didChange:) → Update connection state
      • session(_:didReceive:fromPeer:) → Handle audio data
      • session(_:didReceive:withName:fromPeer:) → Handle streams (future)
  4. Connection Flow (in MultipeerConnectivity, the browser sends the invitation and the advertiser accepts or declines)

    Device A (Host)          Device B (Client)
    ─────────────────        ─────────────────
    startAdvertising()       startBrowsing()
         │                         │
         │◄────────Discovery───────┤
         │                         │
         │◄───Invitation Request───┤
         │                         │
         ├──────Accept/Decline────►│
         │                         │
    Connected ◄──────────────► Connected
    
  5. Data Transmission Methods

    • sendAudioData(_ data: Data, to peer: MCPeerID) → Use .reliable or .unreliable mode
    • Decision: Use .unreliable for lower latency, handle packet loss gracefully
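The pieces above can be sketched end-to-end in a minimal manager. This is an outline rather than the final implementation — it auto-accepts invitations and uses trivial discovery info, both of which Phase 2 says should be replaced with a user prompt:

```swift
import MultipeerConnectivity
import UIKit

/// Minimal sketch of the P2P manager described above.
final class MultipeerManager: NSObject, ObservableObject {
    static let shared = MultipeerManager()

    // Service type: 1–15 characters; lowercase letters, numbers, hyphens.
    private let serviceType = "comlink"
    private let peerID = MCPeerID(displayName: UIDevice.current.name)

    private lazy var session = MCSession(peer: peerID,
                                         securityIdentity: nil,
                                         encryptionPreference: .required)
    private lazy var advertiser = MCNearbyServiceAdvertiser(peer: peerID,
                                                            discoveryInfo: ["v": "1.0"],
                                                            serviceType: serviceType)
    private lazy var browser = MCNearbyServiceBrowser(peer: peerID,
                                                      serviceType: serviceType)

    @Published var connectedPeers: [MCPeerID] = []
    @Published var availablePeers: [MCPeerID] = []

    private override init() {
        super.init()
        session.delegate = self
        advertiser.delegate = self
        browser.delegate = self
    }

    func startAdvertising() { advertiser.startAdvertisingPeer() }
    func startBrowsing()    { browser.startBrowsingForPeers() }

    /// The browser side initiates the connection (see the flow diagram above).
    func connect(to peer: MCPeerID) {
        browser.invitePeer(peer, to: session, withContext: nil, timeout: 30)
    }

    func sendAudioData(_ data: Data) {
        guard !session.connectedPeers.isEmpty else { return }
        // .unreliable trades occasional packet loss for lower latency.
        try? session.send(data, toPeers: session.connectedPeers, with: .unreliable)
    }
}

extension MultipeerManager: MCSessionDelegate {
    func session(_ session: MCSession, peer peerID: MCPeerID, didChange state: MCSessionState) {
        DispatchQueue.main.async { self.connectedPeers = session.connectedPeers }
    }
    func session(_ session: MCSession, didReceive data: Data, fromPeer peerID: MCPeerID) {
        // Route to AudioManager for playback (Phase 4).
    }
    // Stream/resource callbacks left empty for this sketch.
    func session(_ session: MCSession, didReceive stream: InputStream, withName streamName: String, fromPeer peerID: MCPeerID) {}
    func session(_ session: MCSession, didStartReceivingResourceWithName resourceName: String, fromPeer peerID: MCPeerID, with progress: Progress) {}
    func session(_ session: MCSession, didFinishReceivingResourceWithName resourceName: String, fromPeer peerID: MCPeerID, at localURL: URL?, withError error: Error?) {}
}

extension MultipeerManager: MCNearbyServiceAdvertiserDelegate {
    func advertiser(_ advertiser: MCNearbyServiceAdvertiser,
                    didReceiveInvitationFromPeer peerID: MCPeerID,
                    withContext context: Data?,
                    invitationHandler: @escaping (Bool, MCSession?) -> Void) {
        // Auto-accept for the sketch; a real build should prompt the user.
        invitationHandler(true, session)
    }
}

extension MultipeerManager: MCNearbyServiceBrowserDelegate {
    func browser(_ browser: MCNearbyServiceBrowser, foundPeer peerID: MCPeerID, withDiscoveryInfo info: [String: String]?) {
        DispatchQueue.main.async {
            if !self.availablePeers.contains(peerID) { self.availablePeers.append(peerID) }
        }
    }
    func browser(_ browser: MCNearbyServiceBrowser, lostPeer peerID: MCPeerID) {
        DispatchQueue.main.async { self.availablePeers.removeAll { $0 == peerID } }
    }
}
```
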

Deliverables:

  • ✅ MultipeerManager class with discovery/advertising
  • ✅ Peer connection and disconnection handling
  • ✅ Data transmission infrastructure

Phase 3: Audio Engine Setup (Input → Processing → Network)

Goal: Configure AVAudioEngine to capture microphone input, process it, and send to network.

Tasks:

  1. Create AudioManager Class

    • ObservableObject class managing the AVAudioEngine lifecycle (consistent with the @Published properties below and the iOS 16 target)
    • Properties:
      • private let audioEngine: AVAudioEngine
      • private let inputNode: AVAudioInputNode
      • private let audioSession: AVAudioSession
      • @Published var isRecording: Bool
      • @Published var audioLevel: Float (for UI meter)
  2. Configure AVAudioSession

    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playAndRecord, mode: .voiceChat, options: [
        .defaultToSpeaker,
        .allowBluetooth,
        .allowBluetoothA2DP
    ])
    try session.setActive(true, options: .notifyOthersOnDeactivation)

    Why .voiceChat mode?

    • Enables built-in echo cancellation
    • Enables the system voice-processing chain (noise suppression, automatic gain control); users can additionally select the Voice Isolation mic mode from Control Center
    • Optimizes for low-latency duplex communication
  3. Install Audio Tap on Input Node

    let format = inputNode.outputFormat(forBus: 0)
    inputNode.installTap(onBus: 0, bufferSize: 4096, format: format) { [weak self] buffer, time in
        self?.processAudioBuffer(buffer)
    }

    Buffer Size Selection:

    • 4096 frames = ~85ms latency at 48kHz (acceptable for voice)
    • Smaller buffer = lower latency but higher CPU usage
    • Larger buffer = smoother but increased delay
  4. Process Audio Buffer

    func processAudioBuffer(_ buffer: AVAudioPCMBuffer) {
        guard let channelData = buffer.floatChannelData else { return }
    
        // Convert PCM to Data
        let audioData = Data(bytes: channelData[0],
                            count: Int(buffer.frameLength) * MemoryLayout<Float>.size)
    
        // Optional: Compress with Opus codec (future enhancement)
    
        // Send to connected peers
        multipeerManager.sendAudioData(audioData)
    }
  5. Start/Stop Audio Engine

    • startRecording() → Prepare engine, start engine, activate session
    • stopRecording() → Stop engine, remove taps, deactivate session
    • Handle interruptions (phone calls, alarms)
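The interruption handling in item 5 can be sketched as a notification observer. It assumes AudioManager exposes the `isRecording` property and `startRecording()` method from item 1:

```swift
import AVFoundation

extension AudioManager {
    /// Observe audio-session interruptions (phone calls, alarms, Siri).
    func observeInterruptions() {
        NotificationCenter.default.addObserver(
            forName: AVAudioSession.interruptionNotification,
            object: AVAudioSession.sharedInstance(),
            queue: .main
        ) { [weak self] notification in
            guard let info = notification.userInfo,
                  let rawType = info[AVAudioSessionInterruptionTypeKey] as? UInt,
                  let type = AVAudioSession.InterruptionType(rawValue: rawType) else { return }

            switch type {
            case .began:
                // The system pauses the engine; reflect that in UI state.
                self?.isRecording = false
            case .ended:
                // Resume only when the system indicates it is appropriate.
                if let rawOptions = info[AVAudioSessionInterruptionOptionKey] as? UInt,
                   AVAudioSession.InterruptionOptions(rawValue: rawOptions).contains(.shouldResume) {
                    self?.startRecording()
                }
            @unknown default:
                break
            }
        }
    }
}
```
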

Deliverables:

  • ✅ AudioManager with AVAudioEngine configuration
  • ✅ Audio tap installed on input node
  • ✅ Audio buffer processing and transmission

Phase 4: Audio Receiver (Network → Buffer → Playback)

Goal: Receive audio data from peers and play it through the speaker.

Tasks:

  1. Receive Audio Data in MultipeerManager

    func session(_ session: MCSession, didReceive data: Data, fromPeer peerID: MCPeerID) {
        // Route to AudioManager for playback
        audioManager.playReceivedAudio(data, from: peerID)
    }
  2. Create Audio Playback Pipeline

    • Use AVAudioPlayerNode for real-time playback
    • Attach player node to audio engine
    • Connect player node to output node (speaker)
  3. Convert Data to AVAudioPCMBuffer

    func playReceivedAudio(_ data: Data, from peer: MCPeerID) {
        guard let buffer = createPCMBuffer(from: data) else { return }
    
        playerNode.scheduleBuffer(buffer) {
            // Buffer finished playing
        }
    
        if !playerNode.isPlaying {
            playerNode.play()
        }
    }
  4. Handle Buffer Queue

    • Implement jitter buffer to handle network variability
    • Drop packets if queue exceeds threshold (prevent accumulating delay)
    • Smooth playback with interpolation if needed
  5. Prevent Feedback Loop

    • Ensure echo cancellation is working (.voiceChat mode)
    • Test with two physical devices (NOT simulator)
    • Consider muting local input during playback (optional)
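The `createPCMBuffer(from:)` helper referenced in item 3 might look like this. It assumes both peers agree out-of-band on a mono Float32, 48 kHz format; a real build should negotiate or transmit the format alongside the audio:

```swift
import AVFoundation

/// Rebuild an AVAudioPCMBuffer from raw Float32 samples received over the network.
/// Assumes both devices use the same mono Float32, 48 kHz format.
func createPCMBuffer(from data: Data) -> AVAudioPCMBuffer? {
    guard let format = AVAudioFormat(commonFormat: .pcmFormatFloat32,
                                     sampleRate: 48_000,
                                     channels: 1,
                                     interleaved: false) else { return nil }

    let frameCount = AVAudioFrameCount(data.count / MemoryLayout<Float>.size)
    guard let buffer = AVAudioPCMBuffer(pcmFormat: format,
                                        frameCapacity: frameCount) else { return nil }
    buffer.frameLength = frameCount

    // Copy the raw bytes back into the buffer's first (only) channel.
    data.withUnsafeBytes { rawBuffer in
        guard let src = rawBuffer.bindMemory(to: Float.self).baseAddress,
              let dst = buffer.floatChannelData?[0] else { return }
        dst.update(from: src, count: Int(frameCount))
    }
    return buffer
}
```
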

Deliverables:

  • ✅ Audio reception and playback pipeline
  • ✅ AVAudioPlayerNode integration
  • ✅ Jitter buffer implementation
  • ✅ Echo cancellation verification

Phase 5: UI Implementation (SwiftUI Views)

Goal: Create intuitive, dark-themed UI for connection and communication.

Tasks:

  1. ConnectionView (Initial Screen)

    • Display list of discovered peers
    • Show connection status (Searching, Found, Connected)
    • Allow user to select a peer to connect
    • Display user's own device name
    • "Start Broadcasting" / "Stop Broadcasting" toggle
  2. TalkView (Active Communication Screen)

    • Large "Push to Talk" button (or toggle for continuous transmission)
    • Audio level meter (visual feedback)
    • Connected peer name/status
    • "Disconnect" button
    • Battery-saving dark background
  3. SwiftUI Integration with ViewModel

    final class ComlinkViewModel: ObservableObject {
        let multipeerManager: MultipeerManager
        let audioManager: AudioManager

        // Computed properties don't publish on their own — forward the managers'
        // objectWillChange (e.g. via Combine) or observe the managers directly in views.
        var isConnected: Bool { !multipeerManager.connectedPeers.isEmpty }
        var availablePeers: [MCPeerID] { multipeerManager.availablePeers }

        func connect(to peer: MCPeerID) { ... }
        func disconnect() { ... }
        func startTalking() { audioManager.startRecording() }
        func stopTalking() { audioManager.stopRecording() }
    }
  4. UI Design Principles

    • OLED Black: Use Color.black for backgrounds (true black = pixels off)
    • High Contrast: White/green text on black background
    • Large Touch Targets: Minimum 44x44pt for buttons
    • Haptic Feedback: Use UIImpactFeedbackGenerator for interactions
  5. Handle Background State

    • Display notification when app enters background
    • Continue audio transmission (background mode enabled)
    • Lock screen controls (MPRemoteCommandCenter - optional)
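A minimal push-to-talk control for TalkView, wired to the view-model methods from item 3. The DragGesture press detection and the styling are one possible approach, not a prescribed design:

```swift
import SwiftUI
import UIKit

struct PushToTalkButton: View {
    @ObservedObject var viewModel: ComlinkViewModel
    @State private var isPressed = false

    var body: some View {
        Circle()
            .fill(isPressed ? Color.green : Color(white: 0.15))
            .frame(width: 180, height: 180)   // well above the 44x44pt minimum
            .overlay(
                Text(isPressed ? "TALKING" : "HOLD TO TALK")
                    .font(.headline)
                    .foregroundColor(.white)
            )
            .gesture(
                // A zero-distance drag fires on touch-down, giving press/release.
                DragGesture(minimumDistance: 0)
                    .onChanged { _ in
                        guard !isPressed else { return }
                        isPressed = true
                        UIImpactFeedbackGenerator(style: .medium).impactOccurred()
                        viewModel.startTalking()
                    }
                    .onEnded { _ in
                        isPressed = false
                        viewModel.stopTalking()
                    }
            )
    }
}
```
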

Deliverables:

  • ✅ ConnectionView with peer discovery UI
  • ✅ TalkView with push-to-talk functionality
  • ✅ Dark theme optimized for OLED
  • ✅ ViewModel coordinating managers

4. Known Risks & Mitigation Strategies

Risk 1: Audio Feedback Loop

Description: When two devices are physically close, audio from the speaker can be picked up by the microphone, creating a feedback loop.

Mitigation:

  • ✅ Use .voiceChat mode for built-in echo cancellation
  • ✅ Test with headphones/earbuds (recommended use case)
  • ✅ Implement AGC (Automatic Gain Control) if needed
  • ✅ Consider adding "mute speaker during talk" option

Risk 2: Background Suspension

Description: iOS may suspend the app in background to save battery, interrupting audio transmission.

Mitigation:

  • ✅ Enable UIBackgroundModes: audio in Info.plist
  • ✅ Keep AVAudioSession active with .playAndRecord category
  • ✅ Use beginBackgroundTask for critical operations
  • ✅ Test extensively with device locked and app in background
  • ✅ Monitor AVAudioSession.interruptionNotification and resume session

Risk 3: Multipeer Connectivity Reliability

Description: MCSession can be unstable, especially with unreliable data mode. Packets may be lost or arrive out of order.

Mitigation:

  • ✅ Use .unreliable mode for low latency (accept some packet loss)
  • ✅ Implement sequence numbers in audio packets for ordering
  • ✅ Add jitter buffer to smooth playback
  • ✅ Gracefully handle missing packets (interpolate or skip)
  • ✅ Fallback to .reliable mode if latency is acceptable
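The sequence-number idea can live in the AudioPacket model from Phase 1. The 4-byte big-endian header below is an assumed wire layout for illustration, not a finalized protocol:

```swift
import Foundation

/// Audio payload with a monotonically increasing sequence number so the
/// receiver can detect loss and reordering over the .unreliable channel.
struct AudioPacket {
    let sequence: UInt32
    let payload: Data

    init(sequence: UInt32, payload: Data) {
        self.sequence = sequence
        self.payload = payload
    }

    /// 4-byte big-endian sequence header followed by the raw samples.
    func encoded() -> Data {
        var data = Data()
        withUnsafeBytes(of: sequence.bigEndian) { data.append(contentsOf: $0) }
        data.append(payload)
        return data
    }

    /// Fails on packets too short to carry the header.
    init?(decoding data: Data) {
        guard data.count >= 4 else { return nil }
        sequence = data.prefix(4).reduce(UInt32(0)) { ($0 << 8) | UInt32($1) }
        payload = data.dropFirst(4)
    }
}
```
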

Risk 4: Permission Denials

Description: Users may deny microphone or local network permissions, breaking core functionality.

Mitigation:

  • ✅ Create PermissionsManager to check and request permissions upfront
  • ✅ Show educational alert explaining why permissions are needed
  • ✅ Provide deep link to Settings if permission is denied
  • ✅ Gracefully degrade (disable features) if permissions unavailable
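A minimal PermissionsManager covering the microphone flow. `AVAudioSession.requestRecordPermission` fits the iOS 16 target (iOS 17+ offers AVAudioApplication as its replacement), and the Settings deep link uses the standard `openSettingsURLString`:

```swift
import AVFoundation
import UIKit

final class PermissionsManager {
    /// Ask for microphone access up front, after the educational alert.
    func requestMicrophone(completion: @escaping (Bool) -> Void) {
        switch AVAudioSession.sharedInstance().recordPermission {
        case .granted:
            completion(true)
        case .denied:
            completion(false)   // caller should offer the Settings deep link
        case .undetermined:
            AVAudioSession.sharedInstance().requestRecordPermission { granted in
                DispatchQueue.main.async { completion(granted) }
            }
        @unknown default:
            completion(false)
        }
    }

    /// Deep link into this app's page in the Settings app.
    func openSettings() {
        guard let url = URL(string: UIApplication.openSettingsURLString) else { return }
        UIApplication.shared.open(url)
    }
}
```
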

Risk 5: High Latency in Loud Environments

Description: Bluetooth/Wi-Fi performance may degrade in crowded environments (concerts) due to interference.

Mitigation:

  • ✅ Prefer Wi-Fi Direct over Bluetooth when available
  • ✅ Use smallest feasible buffer size (balance latency vs stability)
  • ✅ Implement adaptive bitrate (reduce quality if connection degrades)
  • ✅ Display connection quality indicator in UI
  • ✅ Consider Opus codec for better compression (future)

Risk 6: Battery Drain

Description: Continuous audio processing and transmission will drain battery quickly.

Mitigation:

  • ✅ Optimize audio pipeline (avoid unnecessary processing)
  • ✅ Use efficient data formats (compressed audio)
  • ✅ Provide "Low Power Mode" option (lower sample rate)
  • ✅ Display battery usage warning in UI
  • ✅ Allow user to close connection when not needed

Risk 7: Privacy & Security

Description: Audio data transmitted over local network could be intercepted or eavesdropped.

Mitigation:

  • ✅ Create the MCSession with encryptionPreference: .required rather than relying on the default, which permits unencrypted connections
  • ✅ Implement peer verification (confirm identity before accepting)
  • ✅ Add optional passcode/PIN for pairing
  • ✅ Display warning about secure environment usage
  • ✅ Future: Implement end-to-end encryption with custom keys

5. Development Workflow & Best Practices

Code Quality Standards:

  • Swift 6 Concurrency: Use async/await and @MainActor where appropriate
  • Error Handling: Comprehensive do-catch blocks, never force-unwrap in production
  • Logging: Use OSLog for debugging audio and network events
  • Testing: Unit tests for AudioManager and MultipeerManager logic
  • Code Review: All phases reviewed for performance and security
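The OSLog recommendation can be centralized in the Logger.swift utility from Phase 1. The subsystem string is a placeholder assumption:

```swift
import os

/// Centralized loggers so audio and network events can be filtered
/// separately in Console.app or via `log stream`.
enum Log {
    private static let subsystem = "com.example.comlink"   // assumed bundle ID

    static let audio   = Logger(subsystem: subsystem, category: "audio")
    static let network = Logger(subsystem: subsystem, category: "network")
}

// Usage:
// Log.audio.debug("Tap installed, buffer size 4096")
// Log.network.error("Peer disconnected unexpectedly")
```
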

Testing Checklist:

  • Test with two physical devices (iPhone required, simulator insufficient)
  • Test in Airplane Mode with Wi-Fi/BT enabled
  • Test with app in background and device locked
  • Test in noisy environment (play loud music)
  • Test with Bluetooth headphones connected
  • Test battery usage over 30-minute session
  • Test permission denial scenarios
  • Test connection/disconnection edge cases

Performance Targets:

  • Latency: < 200ms end-to-end (audio input → transmission → playback)
  • Packet Loss Tolerance: < 5% packet loss without noticeable degradation
  • Battery Life: > 2 hours of continuous use at 50% brightness
  • Discovery Time: < 5 seconds to find nearby peer

6. Future Enhancements (Post-MVP)

Phase 6: Advanced Features

  • Opus Codec Integration: Replace raw PCM with Opus — voice-quality Opus runs at roughly 24–32 kbps versus ~1.5 Mbps for 48 kHz Float32 PCM, well over an order of magnitude smaller
  • Multi-Peer Support: Allow 3+ people in a group chat
  • Noise Gate: Automatically mute when below threshold (save bandwidth)
  • Voice Effects: Optional filters (reverb, pitch shift) for fun
  • Message History: Brief text messages alongside voice
  • Spatial Audio: Use device orientation for 3D positioning
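The noise-gate enhancement is easy to prototype ahead of time: compute the RMS level of each captured buffer and skip transmission when it falls below a threshold. The -50 dBFS default below is an assumption to tune by ear:

```swift
import Foundation

/// Returns true when the buffer is loud enough to be worth transmitting.
/// `samples` are Float32 PCM in [-1, 1]; `thresholdDB` is in dBFS.
func passesNoiseGate(_ samples: [Float], thresholdDB: Double = -50) -> Bool {
    guard !samples.isEmpty else { return false }
    // Root-mean-square level of the buffer.
    let meanSquare = samples.reduce(Float(0)) { $0 + $1 * $1 } / Float(samples.count)
    let rms = Double(meanSquare.squareRoot())
    // Convert to dBFS, clamping to avoid log10(0).
    let db = 20 * log10(max(rms, .leastNormalMagnitude))
    return db >= thresholdDB
}
```
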

Phase 7: Optimization

  • Adaptive Bitrate: Dynamically adjust quality based on connection
  • Custom Transport Protocol: Replace MCSession with lower-level UDP if needed
  • Machine Learning Noise Reduction: Core ML model for superior filtering
  • Battery Optimization: Dynamic sample rate adjustment

7. Quick Start Commands

Clone and Setup:

git clone <repo-url>
cd comlink-ios
open Comlink.xcodeproj

Build and Run:

  1. Select physical iOS device (NOT simulator - audio features require hardware)
  2. Cmd+R to build and run
  3. Grant microphone and local network permissions
  4. Repeat on second device for testing

Testing P2P Connection:

  1. Device A: Tap "Start Broadcasting"
  2. Device B: Tap "Find Peers" → Select Device A
  3. Device A: Accept connection request
  4. Both devices: Test voice transmission

8. File Naming Conventions

  • Swift Files: PascalCase (e.g., MultipeerManager.swift)
  • Models: Singular nouns (e.g., Peer.swift, not Peers.swift)
  • Views: Descriptive + "View" suffix (e.g., ConnectionView.swift)
  • ViewModels: Same as View + "ViewModel" (e.g., ComlinkViewModel.swift)
  • Managers: Descriptive + "Manager" suffix (e.g., AudioManager.swift)

9. Commit Message Guidelines

Follow conventional commits:

  • feat: New feature (e.g., feat: implement MultipeerManager peer discovery)
  • fix: Bug fix (e.g., fix: resolve audio feedback loop)
  • refactor: Code restructuring (e.g., refactor: extract audio processing into utility)
  • docs: Documentation (e.g., docs: update AGENT.md with Phase 3 details)
  • test: Add tests (e.g., test: add unit tests for AudioManager)
  • chore: Maintenance (e.g., chore: update Xcode project settings)

10. Dependencies & Third-Party Libraries

Current: Zero Dependencies

This project uses only Apple frameworks to minimize complexity and binary size.

Considered for Future:

  • Opus-iOS: Opus codec bindings (if AVAudioEngine compression insufficient)
  • CocoaAsyncSocket: Alternative to MCSession for custom networking (if needed)
  • Realm/SwiftData: For message history persistence

Decision: Start with zero dependencies, add only if native frameworks are insufficient.


11. Security Considerations

Data Protection:

  • Audio buffers are ephemeral (not stored to disk)
  • No telemetry or analytics (fully offline)
  • User data never leaves device except during active P2P session

Network Security:

  • MCSession encrypts traffic when the session is created with encryptionPreference: .required
  • Peer identity verified via device name (user confirmation required)
  • Future: Add optional passcode pairing

Permission Sandboxing:

  • Request microphone access only when needed
  • Local network usage limited to Bonjour service type
  • No location, camera, or contacts access required

12. Success Criteria

MVP Definition:

  • ✅ Two devices can discover each other offline
  • ✅ Audio transmitted with < 200ms latency
  • ✅ Voice isolation filters background noise
  • ✅ App continues working when device is locked
  • ✅ Clean, dark UI suitable for concerts
  • ✅ Stable connection for > 10 minutes without drops

Beta Release Criteria:

  • ✅ All MVP features + tested by 10 users
  • ✅ Battery life > 2 hours
  • ✅ No critical bugs in 1-week testing period
  • ✅ App Store compliance (privacy policy, metadata)


13. Contact & Support

Project Lead: Senior iOS Engineer (AI-Assisted Development)
Issues: Use GitHub Issues for bug reports and feature requests
Documentation: This file (AGENT.md) is the source of truth

Important: Always refer to this document before making architectural decisions.


14. Changelog

Version   Date         Changes
───────   ──────────   ─────────────────────────────────────
1.0.0     2025-12-11   Initial architecture document created

Next Step for AI Agent: Proceed to Phase 1 implementation after confirming this plan with the user.