Add Google AI Support & Refactor AI Provider Settings#101

Open
skorphil wants to merge 13 commits into Mikodin:main from skorphil:gemini-provider
Contributor

@skorphil commented May 3, 2026

@Mikodin I can't fully test with OpenAI because I have no funds on my account there. I tested only up to the point where the insufficient-funds error is returned. I re-introduced the custom fetcher and it seems to work (it fixes CORS issues with custom OpenAI-compatible providers). For Gemini, a separate LangChain adapter is used. However, I haven't integrated Gemini transcription yet.

Related to #81 and #54.

Summary

This PR adds Google AI (Gemini) support as an alternative processing provider and refactors the AI provider settings architecture for better maintainability and extensibility. The changes enable users to choose between OpenAI, Google AI, and custom OpenAI-compatible endpoints for transcription and LLM processing.

Key Changes

🆕 Google AI Support

  • New utility file: src/util/geminiAiUtils.ts
    • Implements summarizeTranscriptGemini() for transcript summarization using Google's Gemini models
    • Implements llmFixMermaidChartGemini() for mermaid chart repair
    • Exports LLM_MODELS enum with available Gemini models (gemini-2.5-pro, gemini-2.5-flash, etc.)
  • New settings: Added googleAiApiKey and googleModel to ScribePluginSettings
  • New dependency: Added @langchain/google-genai package

🔧 Obsidian Fetcher (CORS Fix)

  • New utility file: src/util/obsidianFetch.ts
    • Custom fetch implementation wrapping Obsidian's requestUrl() API
    • Solves CORS issues with OpenAI-compatible providers (e.g., Fireworks, local LLMs)
    • Properly handles form data, data: URLs, and request/response conversion
    • Integrated into both OpenAI SDK and LangChain's ChatOpenAI configurations

🏗️ Settings Architecture Refactor

  • Replaced AiModelSettings.tsx with modular settings structure:
    • New ai-provider-settings-tab/ folder with:
      • ProviderSettingsTab.tsx - Main tab with platform selectors
      • ProviderSettingsSections.tsx - Modular sections per provider (OpenAI, Gemini, Custom)
      • index.ts - Re-exports
  • New platform enums:
    • PROCESS_PLATFORM: openAi, google, customOpenAi
    • TRANSCRIPT_PLATFORM: Added customOpenAi option
  • Removed: useCustomOpenAiBaseUrl boolean (replaced by processPlatform enum)
  • Automatic migration: src/settings/migration.ts detects the old useCustomOpenAiBaseUrl: true flag and automatically sets processPlatform and transcriptPlatform to customOpenAi on first load — no manual reconfiguration needed
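
The migration step described above could look roughly like the following sketch. Only the names `useCustomOpenAiBaseUrl`, `processPlatform`, `transcriptPlatform`, `customOpenAi`, and `migrateSettings` come from this PR; the `SavedSettings` shape, the function signature, and the decision to drop the legacy key are assumptions for illustration, not the actual contents of src/settings/migration.ts.

```typescript
// Hypothetical shape of the data loaded from disk. Only the three named
// fields are taken from the PR description; everything else passes through.
interface SavedSettings {
  useCustomOpenAiBaseUrl?: boolean; // legacy flag removed by this PR
  processPlatform?: string;
  transcriptPlatform?: string;
  [key: string]: unknown;
}

// Sketch of a one-time migration: if the legacy flag was enabled, point both
// platforms at the custom OpenAI-compatible endpoint. All other saved values
// (base URL, API key, model names) are copied over unchanged.
function migrateSettings(saved: SavedSettings): SavedSettings {
  const { useCustomOpenAiBaseUrl, ...rest } = saved;
  if (useCustomOpenAiBaseUrl === true) {
    return {
      ...rest,
      processPlatform: "customOpenAi",
      transcriptPlatform: "customOpenAi",
    };
  }
  // Flag absent or false: keep whatever platforms were already saved.
  return rest;
}
```

Because the function returns a new object and never reads the legacy key again, running it on already-migrated settings is a no-op, which is what makes it safe to call on every load.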

⚡ Core Logic Refactoring (src/index.ts)

  • Platform-aware processing: Replaced if/else chains with switch statements for:
    • Transcription platform selection (OpenAI, AssemblyAI, Custom OpenAI)
    • LLM processing platform selection (OpenAI, Google, Custom OpenAI)
    • Mermaid chart fixing
  • Improved error handling: Better error messages, type-safe error handling, and validation checks
  • Early validation: Checks for valid platform selection before processing
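
As an illustration of the switch-based dispatch and early validation listed above, here is a minimal sketch. The enum member names come from this PR, but the string values, the helper names `isProcessPlatform` and `routeProcessing`, and the returned labels other than `summarizeTranscriptGemini` are placeholders, not the code in src/index.ts.

```typescript
// Member names are from the PR; the string values are assumed.
enum PROCESS_PLATFORM {
  openAi = "openAi",
  google = "google",
  customOpenAi = "customOpenAi",
}

// Early validation: narrow an arbitrary saved string to a known platform.
function isProcessPlatform(value: string): value is PROCESS_PLATFORM {
  return (
    value === PROCESS_PLATFORM.openAi ||
    value === PROCESS_PLATFORM.google ||
    value === PROCESS_PLATFORM.customOpenAi
  );
}

// Returns the name of the handler that would run for a platform.
// "openAiSummarizer" and "customOpenAiSummarizer" are placeholder labels.
function routeProcessing(platform: PROCESS_PLATFORM): string {
  switch (platform) {
    case PROCESS_PLATFORM.openAi:
      return "openAiSummarizer";
    case PROCESS_PLATFORM.google:
      return "summarizeTranscriptGemini";
    case PROCESS_PLATFORM.customOpenAi:
      return "customOpenAiSummarizer";
    default: {
      // Compile-time exhaustiveness check: adding an enum member without
      // a matching case makes this assignment a type error.
      const unreachable: never = platform;
      throw new Error(`Unsupported processing platform: ${String(unreachable)}`);
    }
  }
}
```

The `never` assignment in the `default` branch is one way to get what if/else chains cannot give: the compiler flags every switch that needs updating when a new provider is added.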

🛠️ Other Improvements

  • TypeScript: Added skipLibCheck: true to tsconfig.json to prevent type errors from dependencies
  • Code organization: Better separation of concerns between platform-specific logic
  • Type safety: Removed definite assignment assertion by properly initializing controlModal

Testing Checklist

  • Test transcription with OpenAI
  • Test transcription with AssemblyAI
  • Test transcription with custom OpenAI-compatible endpoint (groq: whisper)
  • Test LLM summarization with OpenAI
  • Test LLM summarization with Google AI
  • Test LLM summarization with custom OpenAI-compatible endpoint (Groq, Fireworks). NOTE: only LLMs that support schema (structured) output work (e.g. gpt-oss)
  • Test mermaid chart fixing with OpenAI
  • Test mermaid chart fixing with OpenAI-compatible endpoint
  • Test mermaid chart fixing with Google AI
  • Verify settings UI displays correctly for all provider combinations
  • Verify error handling for missing API keys. NOTE: this requires improvements to the Notice component (planned for later)
  • Verify CORS issues are resolved when using custom endpoints
  • Verify automatic migration to the new settings (custom OpenAI keys and settings should be preserved). It worked for me, but it is worth retesting

Breaking Changes

  • Settings schema change: The useCustomOpenAiBaseUrl boolean setting is removed and replaced with the processPlatform enum. Existing settings are automatically migrated on first load — base URL, API key, and custom model names are preserved as-is.

Future plans

  • Add a note to the UI that currently only models with schema output are supported
  • Add separate API key settings for OpenAI and Custom OpenAI
  • Add an error state to Notice (one that is dismissed with the X button)
  • Add a Notice for obsidianFetch errors
  • MAYBE separate the mermaid-fixing AI settings from processing, i.e. a mermaidPlatform setting
  • Fetch the list of models from OpenAI and Gemini instead of hardcoding it

…emini prompts

- Add `obsidianFetch` utility to wrap Obsidian's `requestUrl`, bypassing CORS restrictions for OpenAI-compatible providers.
- Integrate `obsidianFetch`
…schemas

- Update `llmFixMermaidChartGemini` to use explicit HumanMessage and enforce real newline characters in the output.
- Add debug logging to Gemini summarization and mermaid chart fixing functions.
- Modify `obsidianFetch` to strip `$schema` and `title` from JSON schemas to ensure compatibility with providers like Groq.
- Remove unused
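
The `$schema` and `title` stripping mentioned in the commit notes above can be sketched as a small pure function. The name `stripSchemaMetadata` and the choice to touch only top-level keys are assumptions; the PR does not say where in `obsidianFetch` this happens or whether it recurses.

```typescript
// Keys named in the commit message as incompatible with some providers.
const STRIPPED_KEYS = new Set<string>(["$schema", "title"]);

// Hypothetical helper: returns a shallow copy of a JSON schema without the
// top-level "$schema" and "title" keys. Nested objects are left untouched.
function stripSchemaMetadata(
  schema: Record<string, unknown>
): Record<string, unknown> {
  const cleaned: Record<string, unknown> = {};
  for (const key of Object.keys(schema)) {
    if (!STRIPPED_KEYS.has(key)) {
      cleaned[key] = schema[key];
    }
  }
  return cleaned;
}
```

This sketch stays at the top level on purpose: a recursive version would also have to skip `properties` maps, where `title` can be a legitimate field name rather than schema metadata.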
Introduce `migrateSettings` to handle updates to the plugin configuration
structure, ensuring backward compatibility when loading saved user data.