Open
Labels: enhancement (New feature or request)
Description
📋 Feature Request
Add camera and gallery-based bill scanning functionality that uses OCR (Optical Character Recognition) to automatically detect bill amounts from receipt images and populate the bill amount input field.
🎯 Goals
- Enable users to scan bills using their device camera
- Allow users to select bill images from their photo gallery
- Automatically detect and extract bill amounts using ML Kit Text Recognition
- Auto-populate the highest confidence amount into the bill amount input
- Handle camera and photo library permissions gracefully on both iOS and Android
- Provide seamless fallback to manual entry if OCR fails
📱 User Experience Flow
- User taps camera icon button in the bill amount input section
- Action sheet appears with options: "Take Photo" / "Choose from Gallery"
- User grants camera/gallery permissions (if needed)
- User captures photo or selects existing image
- OCR processes the image and extracts bill amounts
- Highest confidence amount (≥0.7 threshold) auto-populates in the input field
- Success toast notification appears
- If OCR fails or confidence is low, error toast appears with fallback to manual entry
🔧 Technical Implementation Plan
1. Dependencies to Install
{
  "react-native-permissions": "^4.0.0",
  "react-native-vision-camera": "^4.0.0",
  "@react-native-ml-kit/text-recognition": "^1.0.0",
  "@react-native-camera-roll/camera-roll": "^7.0.0"
}
Note: All libraries are cross-platform compatible with React Native 0.81.0 and New Architecture.
2. Platform Configuration
Android (android/app/src/main/AndroidManifest.xml):
<uses-permission android:name="android.permission.CAMERA" />
<uses-feature android:name="android.hardware.camera" android:required="false" />
<uses-permission android:name="android.permission.READ_MEDIA_IMAGES" />
Android (android/app/build.gradle):
dependencies {
    // ML Kit Text Recognition
    implementation 'com.google.mlkit:text-recognition:16.0.0'
}
iOS (ios/TipCalculator/Info.plist):
<key>NSCameraUsageDescription</key>
<string>TipMate needs camera access to scan bill amounts from receipts</string>
<key>NSPhotoLibraryUsageDescription</key>
<string>TipMate needs photo library access to select bill images</string>
iOS (ios/Podfile):
# Add after target 'TipCalculator'
pod 'GoogleMLKit/TextRecognition', '~> 5.0.0'
3. File Structure
New files to create:
app/
├── components/
│   └── StyledBillScanner/
│       ├── StyledBillScanner.tsx   # Main scanner component
│       └── index.ts                # Barrel export
├── hooks/
│   ├── usePermissions.ts           # Unified permission handling
│   ├── useImagePicker.ts           # Camera/gallery launcher
│   ├── useOCRProcessor.ts          # Image to text processing
│   └── useBillScanner.ts           # Main orchestration hook
├── types/
│   └── billScanner.ts              # TypeScript interfaces
└── utils/
    └── ocrUtils.ts                 # OCR logic and amount extraction
Modified files:
- app/components/StyledTotalAmountInput/StyledTotalAmountInput.tsx - Add scan button
- app/hooks/index.ts - Export new hooks
- app/components/index.ts - Export StyledBillScanner
- package.json - Add dependencies
4. Implementation Details
Type Definitions (app/types/billScanner.ts)
export interface ScanResult {
  amount: number | null;
  confidence: number; // 0-1 scale
  rawText: string;
  detectedAmounts: DetectedAmount[];
  error?: string;
}

export interface DetectedAmount {
  value: number;
  confidence: number;
  context?: string; // Nearby text like "Total", "Amount Due"
}

export interface OCRConfig {
  minConfidence: number; // 0.7 default
  currencySymbols: string[]; // ['$', '€', '£', etc.]
  keywords: string[]; // ['total', 'amount', 'due', etc.]
}

export type PermissionStatus = 'granted' | 'denied' | 'blocked' | 'unavailable';
Permission Hook (app/hooks/usePermissions.ts)
- Use react-native-permissions for a unified API
- Handle platform differences:
  - Android: PERMISSIONS.ANDROID.CAMERA, PERMISSIONS.ANDROID.READ_MEDIA_IMAGES (Android 13+; earlier versions use READ_EXTERNAL_STORAGE)
  - iOS: PERMISSIONS.IOS.CAMERA, PERMISSIONS.IOS.PHOTO_LIBRARY
- Implement checkPermission(), requestPermission(), openSettings()
- Show Alert dialogs following the useExternalLinkAlert.ts pattern
- Use Linking.openSettings() for both platforms
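The permission handling above can be pictured as a small decision table. The sketch below is only an illustration: `nextPermissionAction`, `PermissionAction`, and `galleryPermissionKey` are hypothetical names, not part of react-native-permissions, and it assumes that "denied" means the permission can still be requested while "blocked" means the user must change it in system settings. The real hook would wrap the library's check and request calls around this logic.

```typescript
// Hypothetical helper names; only PermissionStatus comes from the plan above.
type PermissionStatus = "granted" | "denied" | "blocked" | "unavailable";
type PermissionAction = "proceed" | "request" | "openSettings" | "abort";

// Decide what the hook should do next for a given status.
function nextPermissionAction(status: PermissionStatus): PermissionAction {
  switch (status) {
    case "granted":
      return "proceed"; // continue to camera or gallery
    case "denied":
      return "request"; // still requestable: show the system prompt
    case "blocked":
      return "openSettings"; // "never ask again": alert with a settings link
    case "unavailable":
      return "abort"; // feature missing on this device: fall back to manual entry
  }
}

// Pick the gallery permission key by platform. On Android the API level
// matters because READ_MEDIA_IMAGES only exists from API 33 (Android 13).
function galleryPermissionKey(
  platform: "ios" | "android",
  androidApiLevel: number
): string {
  if (platform === "ios") {
    return "PHOTO_LIBRARY";
  }
  return androidApiLevel >= 33 ? "READ_MEDIA_IMAGES" : "READ_EXTERNAL_STORAGE";
}
```

Keeping this mapping as a pure function would also make the "Permission state handling" unit test straightforward, since no native module needs to be mocked.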
OCR Utilities (app/utils/ocrUtils.ts)
- Initialize ML Kit Text Recognition
- Extract text blocks from image
- Parse amounts using regex patterns:
  - Currency symbols: /[$€£¥₹]\s*\d+\.?\d*/g
  - Decimal numbers: /\d+\.\d{2}/g
- Score amounts based on:
  - Proximity to keywords ("total", "amount due", "balance")
  - Format validity (2 decimal places)
  - Text block confidence from ML Kit
- Use acceptNumbersAndDecimals from validationHooks.ts for validation
- Return highest confidence amount ≥0.7 threshold
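One way the extraction and scoring steps could fit together is sketched below. It is not the final ocrUtils.ts: the function names are hypothetical, the two regexes from the plan are merged into one pattern for brevity, and the scoring weights (base 4, keyword +3, two decimals +2, currency symbol +1, out of 10) are illustrative assumptions. ML Kit block confidence and the acceptNumbersAndDecimals check are left out so the example stays self-contained.

```typescript
interface DetectedAmount {
  value: number;
  confidence: number; // 0-1 scale
  context?: string; // the line the amount was found on
}

const MIN_CONFIDENCE = 0.7; // threshold from the plan
// Word boundaries keep "Subtotal" from counting as "total".
const KEYWORD_PATTERN = /\b(total|amount due|balance)\b/i;
// Optional currency symbol, then digits with an optional decimal part.
const CANDIDATE_PATTERN = /([$€£¥₹])?\s*(\d+(?:\.(\d+))?)/g;

// Scan the recognized text line by line and score each candidate amount.
function extractAmounts(rawText: string): DetectedAmount[] {
  const results: DetectedAmount[] = [];
  const lines = rawText.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    const hasKeyword = KEYWORD_PATTERN.test(line);
    CANDIDATE_PATTERN.lastIndex = 0; // reset the global regex for each line
    let match: RegExpExecArray | null;
    while ((match = CANDIDATE_PATTERN.exec(line)) !== null) {
      const hasSymbol = match[1] !== undefined;
      const hasTwoDecimals = match[3] !== undefined && match[3].length === 2;
      // Skip bare integers and odd decimals (table numbers, dates, phone numbers).
      if (!hasSymbol && !hasTwoDecimals) continue;
      // Integer points avoid floating-point drift when summing weights.
      let points = 4;
      if (hasKeyword) points += 3;
      if (hasTwoDecimals) points += 2;
      if (hasSymbol) points += 1;
      results.push({
        value: parseFloat(match[2]),
        confidence: points / 10,
        context: line.trim(),
      });
    }
  }
  return results;
}

// Return the highest-scoring amount at or above the threshold, or null.
function pickBestAmount(
  amounts: DetectedAmount[],
  minConfidence: number = MIN_CONFIDENCE
): DetectedAmount | null {
  let best: DetectedAmount | null = null;
  for (const candidate of amounts) {
    if (candidate.confidence < minConfidence) continue;
    if (best === null || candidate.confidence > best.confidence) best = candidate;
  }
  return best;
}
```

For a receipt reading "Subtotal $38.00", "Tax 3.50", "Total $41.50", this sketch scores the "Total" line highest and returns 41.5. It does not handle thousands separators such as "1,234.56", which a real implementation would need to cover.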
Image Picker Hook (app/hooks/useImagePicker.ts)
- Camera: Use react-native-vision-camera API
- Gallery: Use @react-native-camera-roll/camera-roll
- Platform-specific action sheet:
  - iOS: ActionSheetIOS.showActionSheetWithOptions()
  - Android: Custom modal component
- Return image URI for processing
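Because iOS and Android present the choice differently, it may help to keep the option labels and the index-to-source mapping in one shared place. The sketch below assumes a hypothetical `sourceFromIndex` helper and a three-entry option list with "Cancel" last. On iOS the same array and cancel index could be passed to ActionSheetIOS.showActionSheetWithOptions(), whose callback receives the pressed button index; the custom Android modal could reuse the same mapping.

```typescript
type ImageSource = "camera" | "gallery";

// Labels from the user flow above, plus a cancel entry (an assumption).
const SHEET_OPTIONS: string[] = ["Take Photo", "Choose from Gallery", "Cancel"];
const CANCEL_INDEX = 2;

// Translate the pressed button index into an image source.
// Returns null for "Cancel" or any unexpected index.
function sourceFromIndex(buttonIndex: number): ImageSource | null {
  if (buttonIndex === 0) return "camera";
  if (buttonIndex === 1) return "gallery";
  return null;
}
```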
Bill Scanner Hook (app/hooks/useBillScanner.ts)
- Orchestrate full scanning flow
- Check permissions → Launch picker → Process image → Extract amount
- Manage loading states: isScanning, isProcessing
- Handle errors with Toast notifications (use existing toastConfig.tsx)
- Return: { scanBill, isScanning, error, scannedAmount }
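The flow "check permissions → launch picker → process image → extract amount" could be written as a plain async function that the hook wraps with React state and toasts. The sketch below is an assumption about structure, not the actual hook: `runScan`, `ScanDeps`, and `ScanOutcome` are hypothetical names, and every platform call is injected so the sequence can be exercised without a device.

```typescript
type ImageSource = "camera" | "gallery";

// Injected steps; each would be backed by one of the hooks/utilities above.
interface ScanDeps {
  ensurePermission: (source: ImageSource) => Promise<boolean>;
  pickImage: (source: ImageSource) => Promise<string | null>; // URI, or null if cancelled
  recognizeText: (uri: string) => Promise<string>;
  extractBestAmount: (rawText: string) => { value: number; confidence: number } | null;
}

type ScanOutcome =
  | { status: "success"; amount: number; confidence: number }
  | { status: "cancelled" }
  | { status: "failed"; error: string };

// Run the steps in order and stop at the first one that cannot continue.
async function runScan(source: ImageSource, deps: ScanDeps): Promise<ScanOutcome> {
  try {
    const allowed = await deps.ensurePermission(source);
    if (!allowed) return { status: "failed", error: "Permission not granted" };

    const uri = await deps.pickImage(source);
    if (uri === null) return { status: "cancelled" }; // user backed out, no toast needed

    const rawText = await deps.recognizeText(uri);
    const best = deps.extractBestAmount(rawText);
    if (best === null) return { status: "failed", error: "No amount detected" };

    return { status: "success", amount: best.value, confidence: best.confidence };
  } catch (e) {
    // Any thrown error becomes a failed outcome so the UI can show an error toast.
    return { status: "failed", error: e instanceof Error ? e.message : String(e) };
  }
}
```

Separating "cancelled" from "failed" lets the UI skip the error toast when the user simply closes the picker, while still falling back to manual entry in both cases.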
UI Integration (app/components/StyledTotalAmountInput/)
- Add camera icon button (use icons from StyledIcons)
- Position: Right side of input field, before/after currency symbol
- Platform-aware styling using Platform.OS
- On press: Show action sheet → Call scanBill() returned by useBillScanner()
- On success: Call onAmountChange(scannedAmount) + show success Toast
- On failure: Show error Toast + keep input focused for manual entry
- Maintain existing keyboard behavior (decimal-pad on iOS, number-pad on Android)
✅ Acceptance Criteria
- User can tap camera button in bill amount section
- Action sheet shows "Take Photo" and "Choose from Gallery" options
- Camera permission request works on both iOS and Android
- Photo library permission request works on both iOS and Android
- Camera captures photo successfully on both platforms
- Gallery picker opens and allows image selection on both platforms
- OCR extracts text from receipt images with ≥70% confidence
- Highest confidence amount auto-populates in bill amount input
- Success toast appears when amount is detected
- Error toast appears when OCR fails or confidence is low
- Manual input still works as fallback
- No crashes or memory leaks during image processing
- Works in both light and dark themes
- Permission denied scenarios show appropriate alerts with settings link
- App handles "never ask again" permission state gracefully
🧪 Testing Requirements
Unit Tests
- ocrUtils.ts - Amount extraction regex patterns
- ocrUtils.ts - Confidence scoring algorithm
- usePermissions.ts - Permission state handling
Integration Tests
- Camera capture flow on iOS/Android
- Gallery selection flow on iOS/Android
- OCR processing with sample receipt images
- Amount population in bill input
Manual Testing
- Test on physical iOS device (camera required)
- Test on physical Android device (camera required)
- Test with various receipt formats (printed, digital, handwritten)
- Test with different currencies ($, €, £, ¥)
- Test permission denial scenarios
- Test offline functionality (bundled ML Kit model)
- Test with poor image quality (blurry, dark, angled)
📊 Performance Considerations
- ML Kit Model Size:
  - iOS: ~8MB (via CocoaPods)
  - Android: ~10MB (bundled model for offline support)
- Processing Time: Target <2 seconds for OCR processing
- Memory: Monitor image processing memory usage (compress large images if needed)
- Battery: Camera usage impacts battery; ensure proper cleanup
🔮 Future Enhancements (Out of Scope)
- Detect multiple amounts (subtotal, tax, total) and let user choose
- Support for multiple languages/currencies
- Image cropping/editing before OCR
- Save scanned receipt images with saved tips
- Real-time OCR with camera preview overlay
- AI-powered receipt parsing (line items, merchant info)
- Cloud-based OCR for better accuracy
📚 References
- react-native-vision-camera docs
- react-native-permissions docs
- ML Kit Text Recognition
- React Native Platform API
🤝 Implementation Notes
- Follow existing code patterns in app/hooks/ and app/components/
- Use react-native-unistyles for theming (already configured)
- Maintain TypeScript strict mode compliance
- Export all new hooks/components via barrel files (index.ts)
- Follow naming convention: Styled* for components, use* for hooks
- Ensure compatibility with React Native 0.81.0 and New Architecture
- Test on both iOS and Android before submitting PR
Priority: Medium
Estimated Effort: 3-5 days
Dependencies: None (can be implemented independently)