Add Bill Scanning with OCR for Automatic Amount Detection #38

@DevinByteX

📋 Feature Request

Add camera and gallery-based bill scanning functionality that uses OCR (Optical Character Recognition) to automatically detect bill amounts from receipt images and populate the bill amount input field.


🎯 Goals

  • Enable users to scan bills using their device camera
  • Allow users to select bill images from their photo gallery
  • Automatically detect and extract bill amounts using ML Kit Text Recognition
  • Auto-populate the bill amount input with the highest-confidence amount
  • Handle camera and photo library permissions gracefully on both iOS and Android
  • Provide seamless fallback to manual entry if OCR fails

📱 User Experience Flow

  1. User taps camera icon button in the bill amount input section
  2. Action sheet appears with options: "Take Photo" / "Choose from Gallery"
  3. User grants camera/gallery permissions (if needed)
  4. User captures photo or selects existing image
  5. OCR processes the image and extracts bill amounts
  6. The highest-confidence amount (confidence ≥ 0.7) auto-populates the input field
  7. Success toast notification appears
  8. If OCR fails or no amount reaches the 0.7 threshold, an error toast appears and the user falls back to manual entry

🔧 Technical Implementation Plan

1. Dependencies to Install

{
  "react-native-permissions": "^4.0.0",
  "react-native-vision-camera": "^4.0.0",
  "@react-native-ml-kit/text-recognition": "^1.0.0",
  "@react-native-camera-roll/camera-roll": "^7.0.0"
}

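Note: All four libraries support both iOS and Android and are expected to work with React Native 0.81.0 and the New Architecture; confirm against each library's release notes before pinning versions.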


2. Platform Configuration

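Android (android/app/src/main/AndroidManifest.xml):

<uses-permission android:name="android.permission.CAMERA" />
<uses-feature android:name="android.hardware.camera" android:required="false" />
<!-- Android 13+ (API 33 and above) -->
<uses-permission android:name="android.permission.READ_MEDIA_IMAGES" />
<!-- Android 12 and below -->
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" android:maxSdkVersion="32" />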

Android (android/app/build.gradle):

dependencies {
    // ML Kit Text Recognition
    implementation 'com.google.mlkit:text-recognition:16.0.0'
}

iOS (ios/TipCalculator/Info.plist):

<key>NSCameraUsageDescription</key>
<string>TipMate needs camera access to scan bill amounts from receipts</string>
<key>NSPhotoLibraryUsageDescription</key>
<string>TipMate needs photo library access to select bill images</string>

iOS (ios/Podfile):

# Add after target 'TipCalculator'
pod 'GoogleMLKit/TextRecognition', '~> 5.0.0'

3. File Structure

New files to create:

app/
├── components/
│   └── StyledBillScanner/
│       ├── StyledBillScanner.tsx       # Main scanner component
│       └── index.ts                     # Barrel export
├── hooks/
│   ├── usePermissions.ts                # Unified permission handling
│   ├── useImagePicker.ts                # Camera/gallery launcher
│   ├── useOCRProcessor.ts               # Image to text processing
│   └── useBillScanner.ts                # Main orchestration hook
├── types/
│   └── billScanner.ts                   # TypeScript interfaces
└── utils/
    └── ocrUtils.ts                      # OCR logic and amount extraction

Modified files:

  • app/components/StyledTotalAmountInput/StyledTotalAmountInput.tsx - Add scan button
  • app/hooks/index.ts - Export new hooks
  • app/components/index.ts - Export StyledBillScanner
  • package.json - Add dependencies

4. Implementation Details

Type Definitions (app/types/billScanner.ts)

export interface ScanResult {
  amount: number | null;
  confidence: number; // 0-1 scale
  rawText: string;
  detectedAmounts: DetectedAmount[];
  error?: string;
}

export interface DetectedAmount {
  value: number;
  confidence: number;
  context?: string; // Nearby text like "Total", "Amount Due"
}

export interface OCRConfig {
  minConfidence: number; // 0.7 default
  currencySymbols: string[]; // ['$', '€', '£', etc.]
  keywords: string[]; // ['total', 'amount', 'due', etc.]
}

export type PermissionStatus = 'granted' | 'denied' | 'blocked' | 'unavailable';
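
The defaults mentioned in the OCRConfig comments could live in one place. A minimal sketch, assuming the hypothetical names DEFAULT_OCR_CONFIG and resolveOCRConfig (neither is part of the plan above); the interface is repeated so the block stands alone:

```typescript
// Repeated from app/types/billScanner.ts so this block is self-contained.
export interface OCRConfig {
  minConfidence: number;
  currencySymbols: string[];
  keywords: string[];
}

// Hypothetical defaults: 0.7 comes from this issue; the symbol and keyword
// lists mirror the examples given in the OCR Utilities section.
export const DEFAULT_OCR_CONFIG: OCRConfig = {
  minConfidence: 0.7,
  currencySymbols: ['$', '€', '£', '¥', '₹'],
  keywords: ['total', 'amount due', 'balance'],
};

// Hypothetical helper: caller overrides win over the defaults.
export function resolveOCRConfig(overrides: Partial<OCRConfig> = {}): OCRConfig {
  return { ...DEFAULT_OCR_CONFIG, ...overrides };
}
```

For example, resolveOCRConfig({ minConfidence: 0.9 }) would tighten the threshold while keeping the default symbols and keywords.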

Permission Hook (app/hooks/usePermissions.ts)

  • Use react-native-permissions for unified API
  • Handle platform differences:
    • Android: PERMISSIONS.ANDROID.CAMERA, PERMISSIONS.ANDROID.READ_MEDIA_IMAGES (Android 13+; PERMISSIONS.ANDROID.READ_EXTERNAL_STORAGE on Android 12 and below)
    • iOS: PERMISSIONS.IOS.CAMERA, PERMISSIONS.IOS.PHOTO_LIBRARY
  • Implement checkPermission(), requestPermission(), openSettings()
  • Show Alert dialogs following useExternalLinkAlert.ts pattern
  • Use Linking.openSettings() for both platforms
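
The status handling above might reduce to two small pure functions that the hook calls after check() or request(). This is only a sketch: PermissionAction, toPermissionStatus and nextPermissionAction are hypothetical names, and the raw strings are assumed to match the result values returned by react-native-permissions.

```typescript
// Mirrors the union declared in app/types/billScanner.ts.
export type PermissionStatus = 'granted' | 'denied' | 'blocked' | 'unavailable';

// Hypothetical: what the caller should do next for a given status.
export type PermissionAction = 'proceed' | 'request' | 'openSettings' | 'unsupported';

// Assumption: `raw` is the string returned by check()/request().
// 'limited' (partial photo access on iOS) is treated as granted here,
// which is a design choice, not something the issue specifies.
export function toPermissionStatus(raw: string): PermissionStatus {
  switch (raw) {
    case 'granted':
    case 'limited':
      return 'granted';
    case 'blocked':
      return 'blocked';
    case 'unavailable':
      return 'unavailable';
    default:
      return 'denied';
  }
}

export function nextPermissionAction(status: PermissionStatus): PermissionAction {
  switch (status) {
    case 'granted':
      return 'proceed';
    case 'denied':
      return 'request'; // the system prompt can still be shown
    case 'blocked':
      return 'openSettings'; // "never ask again": only Settings can change it
    case 'unavailable':
      return 'unsupported';
  }
}
```

Keeping this logic free of React Native imports would also make the "Permission state handling" unit test straightforward.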

OCR Utilities (app/utils/ocrUtils.ts)

  • Initialize ML Kit Text Recognition
  • Extract text blocks from image
  • Parse amounts using regex patterns:
    • Currency symbols: /[$€£¥₹]\s*\d+\.?\d*/g
    • Decimal numbers: /\d+\.\d{2}/g
  • Score amounts based on:
    • Proximity to keywords ("total", "amount due", "balance")
    • Format validity (2 decimal places)
    • Text block confidence from ML Kit (if the React Native wrapper exposes it; otherwise rely on the heuristic score alone)
  • Use acceptNumbersAndDecimals from validationHooks.ts for validation
  • Return the highest-confidence amount that meets the 0.7 threshold, or null if none does
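
The extraction and scoring steps above could be sketched as follows. The weights, the keyword list and the function names extractAmounts and pickBestAmount are all assumptions made for illustration; ML Kit's own confidence is left out, so the score is purely heuristic.

```typescript
export interface DetectedAmount {
  value: number;
  confidence: number; // 0-1 scale
  context?: string;
}

// Hypothetical keywords and weights (in hundredths, so the sums stay exact).
const KEYWORDS = ['total', 'amount due', 'balance'];
const BASE = 30;
const KEYWORD_BONUS = 40;
const TWO_DECIMALS_BONUS = 20;
const SYMBOL_BONUS = 10;

export function extractAmounts(rawText: string): DetectedAmount[] {
  const results: DetectedAmount[] = [];
  for (const line of rawText.split('\n')) {
    const lower = line.toLowerCase();
    // \b keeps "subtotal" from counting as "total".
    const keyword = KEYWORDS.filter((k) => new RegExp('\\b' + k + '\\b').test(lower))[0];
    const pattern = /([$€£¥₹])?\s*(\d+(?:\.\d+)?)/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(line)) !== null) {
      const symbol = match[1];
      const digits = match[2];
      const twoDecimals = /\.\d{2}$/.test(digits);
      // Skip bare integers and odd decimals (quantities, table numbers).
      if (!symbol && !twoDecimals) continue;
      const score =
        BASE +
        (keyword ? KEYWORD_BONUS : 0) +
        (twoDecimals ? TWO_DECIMALS_BONUS : 0) +
        (symbol ? SYMBOL_BONUS : 0);
      results.push({ value: parseFloat(digits), confidence: score / 100, context: keyword });
    }
  }
  return results;
}

// Highest-confidence amount at or above the threshold, or null if none qualifies.
export function pickBestAmount(
  amounts: DetectedAmount[],
  minConfidence: number = 0.7,
): DetectedAmount | null {
  let best: DetectedAmount | null = null;
  for (let i = 0; i < amounts.length; i++) {
    const a = amounts[i];
    if (a.confidence >= minConfidence && (best === null || a.confidence > best.confidence)) {
      best = a;
    }
  }
  return best;
}
```

With these weights, an amount on a line without a keyword tops out at 0.6, so a receipt reading "Subtotal $18.50 / Tax $1.48 / TOTAL $19.98" yields 19.98 and a lone "Tax $1.48" yields null. Thousands separators such as 1,234.56 are not handled in this sketch.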

Image Picker Hook (app/hooks/useImagePicker.ts)

  • Camera: Use react-native-vision-camera API
  • Gallery: Use @react-native-camera-roll/camera-roll (CameraRoll.getPhotos() lists assets; the selection grid itself must be rendered in-app)
  • Platform-specific action sheet:
    • iOS: ActionSheetIOS.showActionSheetWithOptions()
    • Android: Custom modal component
  • Return image URI for processing
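
Both platforms can share the option list and the index-to-source mapping, so only the presentation differs. A rough sketch, in which SHEET_OPTIONS, CANCEL_INDEX, sourceForIndex and the onPick callback are hypothetical and the Cancel entry is an assumption:

```typescript
export type ImageSource = 'camera' | 'gallery';

// Labels come from the user flow above; the Cancel entry is an assumption.
export const SHEET_OPTIONS = ['Take Photo', 'Choose from Gallery', 'Cancel'];
export const CANCEL_INDEX = 2;

// Map the pressed button index to an image source, or null when cancelled.
export function sourceForIndex(index: number): ImageSource | null {
  if (index === 0) return 'camera';
  if (index === 1) return 'gallery';
  return null;
}

// On iOS this would be wired up roughly as:
//   ActionSheetIOS.showActionSheetWithOptions(
//     { options: SHEET_OPTIONS, cancelButtonIndex: CANCEL_INDEX },
//     (buttonIndex) => onPick(sourceForIndex(buttonIndex)),
//   );
// The custom Android modal can render SHEET_OPTIONS and call the same mapper.
```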

Bill Scanner Hook (app/hooks/useBillScanner.ts)

  • Orchestrate full scanning flow
  • Check permissions → Launch picker → Process image → Extract amount
  • Manage loading states: isScanning, isProcessing
  • Handle errors with Toast notifications (use existing toastConfig.tsx)
  • Return: { scanBill, isScanning, isProcessing, error, scannedAmount }
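
One way to keep the loading flags consistent is to drive them from a single reducer, for example via useReducer inside the hook. The state and event names below are hypothetical; this is a sketch of the idea rather than the required design.

```typescript
export type ScanState =
  | { phase: 'idle' }
  | { phase: 'scanning' } // camera or gallery picker is open
  | { phase: 'processing'; imageUri: string } // OCR is running
  | { phase: 'success'; amount: number }
  | { phase: 'error'; message: string };

export type ScanEvent =
  | { type: 'start' }
  | { type: 'imagePicked'; imageUri: string }
  | { type: 'amountFound'; amount: number }
  | { type: 'failed'; message: string }
  | { type: 'reset' };

export function scanReducer(state: ScanState, event: ScanEvent): ScanState {
  switch (event.type) {
    case 'start':
      return { phase: 'scanning' };
    case 'imagePicked':
      // Ignore stray picker results when no scan is in progress.
      return state.phase === 'scanning'
        ? { phase: 'processing', imageUri: event.imageUri }
        : state;
    case 'amountFound':
      return state.phase === 'processing'
        ? { phase: 'success', amount: event.amount }
        : state;
    case 'failed':
      return { phase: 'error', message: event.message };
    case 'reset':
      return { phase: 'idle' };
  }
}

// Derive the values the hook is described as returning.
export function selectHookValues(state: ScanState) {
  return {
    isScanning: state.phase === 'scanning',
    isProcessing: state.phase === 'processing',
    error: state.phase === 'error' ? state.message : null,
    scannedAmount: state.phase === 'success' ? state.amount : null,
  };
}
```

Because the flags are derived from one phase value, isScanning and isProcessing can never both be true, and the reducer can be unit tested without a device.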

UI Integration (app/components/StyledTotalAmountInput/)

  • Add camera icon button (use icons from StyledIcons)
  • Position: Right side of input field, before/after currency symbol
  • Platform-aware styling using Platform.OS
  • On press: Show action sheet → Call scanBill() returned by useBillScanner()
  • On success: Call onAmountChange(scannedAmount) + show success Toast
  • On failure: Show error Toast + keep input focused for manual entry
  • Maintain existing keyboard behavior (decimal-pad iOS, number-pad Android)

✅ Acceptance Criteria

  • User can tap camera button in bill amount section
  • Action sheet shows "Take Photo" and "Choose from Gallery" options
  • Camera permission request works on both iOS and Android
  • Photo library permission request works on both iOS and Android
  • Camera captures photo successfully on both platforms
  • Gallery picker opens and allows image selection on both platforms
  • OCR detects a bill amount in receipt images with confidence ≥ 0.7
  • Highest-confidence amount auto-populates in bill amount input
  • Success toast appears when amount is detected
  • Error toast appears when OCR fails or no amount reaches the 0.7 threshold
  • Manual input still works as fallback
  • No crashes or memory leaks during image processing
  • Works in both light and dark themes
  • Permission denied scenarios show appropriate alerts with settings link
  • App handles "never ask again" permission state gracefully

🧪 Testing Requirements

Unit Tests

  • ocrUtils.ts - Amount extraction regex patterns
  • ocrUtils.ts - Confidence scoring algorithm
  • usePermissions.ts - Permission state handling

Integration Tests

  • Camera capture flow on iOS/Android
  • Gallery selection flow on iOS/Android
  • OCR processing with sample receipt images
  • Amount population in bill input

Manual Testing

  • Test on physical iOS device (camera required)
  • Test on physical Android device (camera required)
  • Test with various receipt formats (printed, digital, handwritten)
  • Test with different currencies ($, €, £, ¥)
  • Test permission denial scenarios
  • Test offline functionality (bundled ML Kit model)
  • Test with poor image quality (blurry, dark, angled)

📊 Performance Considerations

  • ML Kit Model Size:
    • iOS: ~8MB (via CocoaPods)
    • Android: ~10MB (bundled model for offline support)
  • Processing Time: Target <2 seconds for OCR processing
  • Memory: Monitor image processing memory usage (compress large images if needed)
  • Battery: Camera usage impacts battery; ensure proper cleanup
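
For the memory point, target dimensions can be computed before handing a large photo to OCR. In this sketch MAX_EDGE and fitWithin are hypothetical, the 1600 px cap is a guess to tune against real receipts, and the resize itself would need an image library that is not in the dependency list above.

```typescript
export interface Size {
  width: number;
  height: number;
}

// Hypothetical cap on the longest edge, in pixels.
export const MAX_EDGE = 1600;

// Scale down so the longest edge is at most maxEdge, preserving aspect ratio.
// Images already within the cap are returned unchanged.
export function fitWithin(size: Size, maxEdge: number = MAX_EDGE): Size {
  const longest = Math.max(size.width, size.height);
  if (longest <= maxEdge) return size;
  const scale = maxEdge / longest;
  return {
    width: Math.round(size.width * scale),
    height: Math.round(size.height * scale),
  };
}
```

A 4000×3000 capture would be reduced to 1600×1200, while an 800×600 screenshot would pass through untouched.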

🔮 Future Enhancements (Out of Scope)

  • Detect multiple amounts (subtotal, tax, total) and let user choose
  • Support for multiple languages/currencies
  • Image cropping/editing before OCR
  • Save scanned receipt images with saved tips
  • Real-time OCR with camera preview overlay
  • AI-powered receipt parsing (line items, merchant info)
  • Cloud-based OCR for better accuracy

🤝 Implementation Notes

  • Follow existing code patterns in app/hooks/ and app/components/
  • Use react-native-unistyles for theming (already configured)
  • Maintain TypeScript strict mode compliance
  • Export all new hooks/components via barrel files (index.ts)
  • Follow naming convention: Styled* for components, use* for hooks
  • Ensure compatibility with React Native 0.81.0 and New Architecture
  • Test on both iOS and Android before submitting PR

Priority: Medium

Estimated Effort: 3-5 days

Dependencies: None (can be implemented independently)

Labels: enhancement