16 changes: 15 additions & 1 deletion .env.example
@@ -33,7 +33,9 @@ KIMI_BASE_URL=
KIMI_MODELS=

MINIMAX_API_KEY=
MINIMAX_BASE_URL=
# MiniMax Anthropic-compatible endpoint for the built-in Anthropic SDK integration
MINIMAX_BASE_URL=https://api.minimaxi.com/anthropic/v1
# Example: MiniMax-M2.7,MiniMax-M2.7-highspeed,MiniMax-M2.5,MiniMax-M2.1
MINIMAX_MODELS=

GLM_API_KEY=
@@ -66,6 +68,9 @@ TTS_GLM_BASE_URL=
TTS_QWEN_API_KEY=
TTS_QWEN_BASE_URL=

TTS_MINIMAX_API_KEY=
# MiniMax TTS endpoint (speech-2.8 / 2.6 / 02 / 01 series)
TTS_MINIMAX_BASE_URL=https://api.minimaxi.com
TTS_ELEVENLABS_API_KEY=
TTS_ELEVENLABS_BASE_URL=

@@ -96,6 +101,10 @@ IMAGE_QWEN_IMAGE_BASE_URL=
IMAGE_NANO_BANANA_API_KEY=
IMAGE_NANO_BANANA_BASE_URL=

IMAGE_MINIMAX_API_KEY=
# Example models: image-01, image-01-live
IMAGE_MINIMAX_BASE_URL=https://api.minimaxi.com

IMAGE_GROK_API_KEY=
IMAGE_GROK_BASE_URL=

@@ -113,6 +122,10 @@ VIDEO_VEO_BASE_URL=
VIDEO_SORA_API_KEY=
VIDEO_SORA_BASE_URL=

VIDEO_MINIMAX_API_KEY=
# Example models: MiniMax-Hailuo-2.3, MiniMax-Hailuo-2.3-Fast, MiniMax-Hailuo-02
VIDEO_MINIMAX_BASE_URL=https://api.minimaxi.com

VIDEO_GROK_API_KEY=
VIDEO_GROK_BASE_URL=

@@ -132,6 +145,7 @@ TAVILY_API_KEY=

# Optional server-side default model for API routes like /api/generate-classroom
# Example: anthropic:claude-3-5-haiku-20241022 or google:gemini-3-flash-preview
# MiniMax example: minimax:MiniMax-M2.7-highspeed
DEFAULT_MODEL=

# LOG_LEVEL=info
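
Review note: `DEFAULT_MODEL` uses a `provider:model` form (for example `minimax:MiniMax-M2.7-highspeed`). How the server actually parses it is not visible in this diff, so the following is only a minimal sketch of one way such a value could be split. `ModelRef` and `parseModelRef` are hypothetical names, not identifiers from the codebase.

```typescript
// Hypothetical helper, not taken from the OpenMAIC codebase.
// Splits a "provider:modelId" reference such as "minimax:MiniMax-M2.7-highspeed".
interface ModelRef {
  providerId: string;
  modelId: string;
}

function parseModelRef(value: string | undefined): ModelRef | null {
  if (!value) return null;
  const trimmed = value.trim();
  // Split on the first colon only, so a model ID that itself contains ":" survives.
  const sep = trimmed.indexOf(':');
  // Reject values with no colon, an empty provider, or an empty model ID.
  if (sep <= 0 || sep === trimmed.length - 1) return null;
  return {
    providerId: trimmed.slice(0, sep),
    modelId: trimmed.slice(sep + 1),
  };
}
```

Returning `null` for an empty or malformed value lets the caller fall back to whatever default it already has, which matches the "optional" wording in the comment above `DEFAULT_MODEL`.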
21 changes: 20 additions & 1 deletion README-zh.md
@@ -114,11 +114,30 @@ providers:
apiKey: sk-ant-...
```

Supported providers: **OpenAI**, **Anthropic**, **Google Gemini**, **DeepSeek**, **Grok (xAI)**, and any OpenAI-compatible API.
Supported providers: **OpenAI**, **Anthropic**, **Google Gemini**, **DeepSeek**, **MiniMax**, **Grok (xAI)**, and any OpenAI-compatible API.

MiniMax quick example:

```env
MINIMAX_API_KEY=...
MINIMAX_BASE_URL=https://api.minimaxi.com/anthropic/v1
DEFAULT_MODEL=minimax:MiniMax-M2.7-highspeed

TTS_MINIMAX_API_KEY=...
TTS_MINIMAX_BASE_URL=https://api.minimaxi.com

IMAGE_MINIMAX_API_KEY=...
IMAGE_MINIMAX_BASE_URL=https://api.minimaxi.com

VIDEO_MINIMAX_API_KEY=...
VIDEO_MINIMAX_BASE_URL=https://api.minimaxi.com
```

> **Recommended model:** **Gemini 3 Flash**, the best balance of quality and speed. For the highest quality, choose **Gemini 3.1 Pro** (slower).
>
> If you want the OpenMAIC server to use Gemini by default, also set `DEFAULT_MODEL=google:gemini-3-flash-preview`.
>
> If you want to default to MiniMax, set `DEFAULT_MODEL=minimax:MiniMax-M2.7-highspeed`.

### 3. Run

22 changes: 21 additions & 1 deletion README.md
@@ -114,11 +114,30 @@ providers:
apiKey: sk-ant-...
```

Supported providers: **OpenAI**, **Anthropic**, **Google Gemini**, **DeepSeek**, **Grok (xAI)**, and any OpenAI-compatible API.
Supported providers: **OpenAI**, **Anthropic**, **Google Gemini**, **DeepSeek**, **MiniMax**, **Grok (xAI)**, and any OpenAI-compatible API.

MiniMax quick examples:

```env
MINIMAX_API_KEY=...
MINIMAX_BASE_URL=https://api.minimaxi.com/anthropic/v1
DEFAULT_MODEL=minimax:MiniMax-M2.7-highspeed

TTS_MINIMAX_API_KEY=...
TTS_MINIMAX_BASE_URL=https://api.minimaxi.com

IMAGE_MINIMAX_API_KEY=...
IMAGE_MINIMAX_BASE_URL=https://api.minimaxi.com

VIDEO_MINIMAX_API_KEY=...
VIDEO_MINIMAX_BASE_URL=https://api.minimaxi.com
```

> **Recommended model:** **Gemini 3 Flash** — best balance of quality and speed. For highest quality (at slower speed), try **Gemini 3.1 Pro**.
>
> If you want OpenMAIC server APIs to use Gemini by default, also set `DEFAULT_MODEL=google:gemini-3-flash-preview`.
>
> If you want to use MiniMax as the default server model, set `DEFAULT_MODEL=minimax:MiniMax-M2.7-highspeed`.
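
Review note: the quick example above follows a `<SERVICE>_MINIMAX_API_KEY` / `<SERVICE>_MINIMAX_BASE_URL` naming pattern for TTS, image, and video. As a rough sketch of how such variables could be read, assuming (this is not confirmed by the diff) that an unset base URL falls back to `https://api.minimaxi.com`; `readMiniMaxServiceConfig` and its types are hypothetical:

```typescript
// Hypothetical reader for the <SERVICE>_MINIMAX_* variables shown above.
type MiniMaxService = 'TTS' | 'IMAGE' | 'VIDEO';

interface ServiceConfig {
  apiKey: string;
  baseUrl: string;
}

// Assumption: the media endpoints share this default, as in .env.example.
const MINIMAX_MEDIA_BASE_URL = 'https://api.minimaxi.com';

function readMiniMaxServiceConfig(
  service: MiniMaxService,
  env: Record<string, string | undefined>,
): ServiceConfig | null {
  const apiKey = env[`${service}_MINIMAX_API_KEY`]?.trim();
  // Without a key the service is treated as not configured.
  if (!apiKey) return null;
  const baseUrl = env[`${service}_MINIMAX_BASE_URL`]?.trim() || MINIMAX_MEDIA_BASE_URL;
  // Strip trailing slashes so callers can append paths consistently.
  return { apiKey, baseUrl: baseUrl.replace(/\/+$/, '') };
}
```

Passing the environment in as a plain object (for example `process.env` on the server) keeps the helper easy to test without touching real configuration.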

### 3. Run

@@ -483,3 +502,4 @@ If you find OpenMAIC useful in your research, please consider citing:
## 📄 License

This project is licensed under the [GNU Affero General Public License v3.0](LICENSE).

23 changes: 13 additions & 10 deletions app/api/generate/tts/route.ts
@@ -22,15 +22,17 @@ export const maxDuration = 30;
export async function POST(req: NextRequest) {
try {
const body = await req.json();
const { text, audioId, ttsProviderId, ttsVoice, ttsSpeed, ttsApiKey, ttsBaseUrl } = body as {
text: string;
audioId: string;
ttsProviderId: TTSProviderId;
ttsVoice: string;
ttsSpeed?: number;
ttsApiKey?: string;
ttsBaseUrl?: string;
};
const { text, audioId, ttsProviderId, ttsVoice, ttsSpeed, ttsModel, ttsApiKey, ttsBaseUrl } =
body as {
text: string;
audioId: string;
ttsProviderId: TTSProviderId;
ttsVoice: string;
ttsSpeed?: number;
ttsModel?: string;
ttsApiKey?: string;
ttsBaseUrl?: string;
};

// Validate required fields
if (!text || !audioId || !ttsProviderId || !ttsVoice) {
@@ -66,12 +68,13 @@ export async function POST(req: NextRequest) {
providerId: ttsProviderId,
voice: ttsVoice,
speed: ttsSpeed ?? 1.0,
model: ttsModel,
apiKey,
baseUrl,
};

log.info(
`Generating TTS: provider=${ttsProviderId}, voice=${ttsVoice}, audioId=${audioId}, textLen=${text.length}`,
`Generating TTS: provider=${ttsProviderId}, model=${ttsModel || 'default'}, voice=${ttsVoice}, audioId=${audioId}, textLen=${text.length}`,
);

// Generate audio
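
Review note: this hunk threads an optional `ttsModel` from the request body into the provider config and into the log line. The standalone sketch below mirrors that flow outside of Next.js so it can be exercised in isolation. The type and function names (`TTSRequestBodySketch`, `toTTSConfig`, `formatTTSLogLine`) are hypothetical, and the real route also resolves `apiKey` and `baseUrl`, which is omitted here.

```typescript
// Hypothetical, framework-free mirror of the request handling in this hunk.
interface TTSRequestBodySketch {
  text?: string;
  audioId?: string;
  ttsProviderId?: string;
  ttsVoice?: string;
  ttsSpeed?: number;
  ttsModel?: string;
}

interface TTSConfigSketch {
  providerId: string;
  voice: string;
  speed: number;
  model: string | undefined;
}

function toTTSConfig(body: TTSRequestBodySketch): TTSConfigSketch {
  const { text, audioId, ttsProviderId, ttsVoice, ttsSpeed, ttsModel } = body;
  // Same required-field check as the route: model and speed stay optional.
  if (!text || !audioId || !ttsProviderId || !ttsVoice) {
    throw new Error('Missing required fields: text, audioId, ttsProviderId, ttsVoice');
  }
  return {
    providerId: ttsProviderId,
    voice: ttsVoice,
    speed: ttsSpeed ?? 1.0, // ?? keeps an explicit 0 instead of replacing it
    model: ttsModel, // undefined lets the provider adapter pick its own default
  };
}

function formatTTSLogLine(config: TTSConfigSketch, audioId: string, textLen: number): string {
  return `Generating TTS: provider=${config.providerId}, model=${config.model || 'default'}, voice=${config.voice}, audioId=${audioId}, textLen=${textLen}`;
}
```

Leaving `model` undefined when the client sends nothing keeps older clients working, since only the provider adapter needs to know what "default" means.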
2 changes: 2 additions & 0 deletions components/generation/media-popover.tsx
@@ -9,8 +9,8 @@
Mic,
SlidersHorizontal,
ChevronRight,
Play,

Check warning on line 12 in components/generation/media-popover.tsx (GitHub Actions / Lint, Typecheck & Unit Tests): 'Play' is defined but never used. Allowed unused vars must match /^_/u
Loader2,
Check warning on line 13 in components/generation/media-popover.tsx (GitHub Actions / Lint, Typecheck & Unit Tests): 'Loader2' is defined but never used. Allowed unused vars must match /^_/u
} from 'lucide-react';
import { toast } from 'sonner';
import { Popover, PopoverContent, PopoverTrigger } from '@/components/ui/popover';
@@ -24,7 +24,7 @@
SelectTrigger,
SelectValue,
} from '@/components/ui/select';
import { Slider } from '@/components/ui/slider';

Check warning on line 27 in components/generation/media-popover.tsx (GitHub Actions / Lint, Typecheck & Unit Tests): 'Slider' is defined but never used. Allowed unused vars must match /^_/u
import { Switch } from '@/components/ui/switch';
import { cn } from '@/lib/utils';
import { useI18n } from '@/lib/hooks/use-i18n';
@@ -90,6 +90,7 @@
'qwen-tts': t('settings.providerQwenTTS'),
'doubao-tts': t('settings.providerDoubaoTTS'),
'elevenlabs-tts': t('settings.providerElevenLabsTTS'),
'minimax-tts': t('settings.providerMiniMaxTTS'),
'browser-native-tts': t('settings.providerBrowserNativeTTS'),
};
return names[providerId] || providerId;
@@ -136,9 +137,9 @@
const ttsVoice = useSettingsStore((s) => s.ttsVoice);
const ttsSpeed = useSettingsStore((s) => s.ttsSpeed);
const ttsProvidersConfig = useSettingsStore((s) => s.ttsProvidersConfig);
const setTTSProvider = useSettingsStore((s) => s.setTTSProvider);

Check warning on line 140 in components/generation/media-popover.tsx (GitHub Actions / Lint, Typecheck & Unit Tests): 'setTTSProvider' is assigned a value but never used. Allowed unused vars must match /^_/u
const setTTSVoice = useSettingsStore((s) => s.setTTSVoice);
Check warning on line 141 in components/generation/media-popover.tsx (GitHub Actions / Lint, Typecheck & Unit Tests): 'setTTSVoice' is assigned a value but never used. Allowed unused vars must match /^_/u
const setTTSSpeed = useSettingsStore((s) => s.setTTSSpeed);
Check warning on line 142 in components/generation/media-popover.tsx (GitHub Actions / Lint, Typecheck & Unit Tests): 'setTTSSpeed' is assigned a value but never used. Allowed unused vars must match /^_/u

const asrProviderId = useSettingsStore((s) => s.asrProviderId);
const asrLanguage = useSettingsStore((s) => s.asrLanguage);
@@ -166,7 +167,7 @@
needsKey: boolean,
) => !needsKey || !!configs[id]?.apiKey || !!configs[id]?.isServerConfigured;

const ttsSpeedRange = TTS_PROVIDERS[ttsProviderId]?.speedRange;

Check warning on line 170 in components/generation/media-popover.tsx (GitHub Actions / Lint, Typecheck & Unit Tests): 'ttsSpeedRange' is assigned a value but never used. Allowed unused vars must match /^_/u

// ─── Dynamic browser voices ───
const [browserVoices, setBrowserVoices] = useState<SpeechSynthesisVoice[]>([]);
@@ -215,7 +216,7 @@

// TTS: grouped by provider, voices as items (matching Image/Video pattern)
// Browser-native voices are split into sub-groups by language.
const ttsGroups = useMemo(() => {

Check warning on line 219 in components/generation/media-popover.tsx (GitHub Actions / Lint, Typecheck & Unit Tests): 'ttsGroups' is assigned a value but never used. Allowed unused vars must match /^_/u
const groups: SelectGroupData[] = [];

for (const p of Object.values(TTS_PROVIDERS)) {
@@ -260,7 +261,7 @@
}, [ttsProvidersConfig, locale, browserVoices, t]);

// TTS preview
const handlePreview = useCallback(async () => {

Check warning on line 264 in components/generation/media-popover.tsx (GitHub Actions / Lint, Typecheck & Unit Tests): 'handlePreview' is assigned a value but never used. Allowed unused vars must match /^_/u
if (previewing) {
stopPreview();
return;
@@ -274,6 +275,7 @@
speed: ttsSpeed,
apiKey: providerConfig?.apiKey,
baseUrl: providerConfig?.baseUrl,
model: providerConfig?.model,
});
} catch (error) {
const message =
99 changes: 64 additions & 35 deletions components/settings/audio-settings.tsx
@@ -16,6 +16,7 @@ import { useI18n } from '@/lib/hooks/use-i18n';
import { useSettingsStore } from '@/lib/store/settings';
import {
TTS_PROVIDERS,
MINIMAX_TTS_MODELS,
getTTSVoices,
ASR_PROVIDERS,
getASRSupportedLanguages,
@@ -39,6 +40,7 @@ function getTTSProviderName(providerId: TTSProviderId, t: (key: string) => strin
'qwen-tts': t('settings.providerQwenTTS'),
'doubao-tts': t('settings.providerDoubaoTTS'),
'elevenlabs-tts': t('settings.providerElevenLabsTTS'),
'minimax-tts': t('settings.providerMiniMaxTTS'),
'browser-native-tts': t('settings.providerBrowserNativeTTS'),
};
return names[providerId];
@@ -101,7 +103,7 @@ export function AudioSettings({ onSave }: AudioSettingsProps = {}) {

const handleTTSProviderConfigChange = (
providerId: TTSProviderId,
config: Partial<{ apiKey: string; baseUrl: string; enabled: boolean }>,
config: Partial<{ apiKey: string; baseUrl: string; model?: string; enabled: boolean }>,
) => {
setTTSProviderConfig(providerId, config);
onSave?.();
@@ -452,49 +454,76 @@ export function AudioSettings({ onSave }: AudioSettingsProps = {}) {

{(ttsProvider.requiresApiKey ||
ttsProvidersConfig[ttsProviderId]?.isServerConfigured) && (
<div className="grid grid-cols-2 gap-4">
<div className="space-y-2">
<Label className="text-sm">{t('settings.ttsApiKey')}</Label>
<div className="relative">
<>
<div className="grid grid-cols-2 gap-4">
<div className="space-y-2">
<Label className="text-sm">{t('settings.ttsApiKey')}</Label>
<div className="relative">
<Input
type={showTTSApiKey ? 'text' : 'password'}
placeholder={
ttsProvidersConfig[ttsProviderId]?.isServerConfigured
? t('settings.optionalOverride')
: t('settings.enterApiKey')
}
value={ttsProvidersConfig[ttsProviderId]?.apiKey || ''}
onChange={(e) =>
handleTTSProviderConfigChange(ttsProviderId, {
apiKey: e.target.value,
})
}
className="font-mono text-sm pr-10"
/>
<button
type="button"
onClick={() => setShowTTSApiKey(!showTTSApiKey)}
className="absolute right-2 top-1/2 -translate-y-1/2 text-muted-foreground hover:text-foreground"
>
{showTTSApiKey ? <EyeOff className="h-4 w-4" /> : <Eye className="h-4 w-4" />}
</button>
</div>
</div>

<div className="space-y-2">
<Label className="text-sm">{t('settings.ttsBaseUrl')}</Label>
<Input
type={showTTSApiKey ? 'text' : 'password'}
placeholder={
ttsProvidersConfig[ttsProviderId]?.isServerConfigured
? t('settings.optionalOverride')
: t('settings.enterApiKey')
}
value={ttsProvidersConfig[ttsProviderId]?.apiKey || ''}
placeholder={ttsProvider.defaultBaseUrl || t('settings.enterCustomBaseUrl')}
value={ttsProvidersConfig[ttsProviderId]?.baseUrl || ''}
onChange={(e) =>
handleTTSProviderConfigChange(ttsProviderId, {
apiKey: e.target.value,
baseUrl: e.target.value,
})
}
className="font-mono text-sm pr-10"
className="text-sm"
/>
<button
type="button"
onClick={() => setShowTTSApiKey(!showTTSApiKey)}
className="absolute right-2 top-1/2 -translate-y-1/2 text-muted-foreground hover:text-foreground"
>
{showTTSApiKey ? <EyeOff className="h-4 w-4" /> : <Eye className="h-4 w-4" />}
</button>
</div>
</div>

<div className="space-y-2">
<Label className="text-sm">{t('settings.ttsBaseUrl')}</Label>
<Input
placeholder={ttsProvider.defaultBaseUrl || t('settings.enterCustomBaseUrl')}
value={ttsProvidersConfig[ttsProviderId]?.baseUrl || ''}
onChange={(e) =>
handleTTSProviderConfigChange(ttsProviderId, {
baseUrl: e.target.value,
})
}
className="text-sm"
/>
</div>
</div>
{ttsProviderId === 'minimax-tts' && (
<div className="space-y-2">
<Label className="text-sm">{t('settings.ttsModel')}</Label>
<Select
value={ttsProvidersConfig[ttsProviderId]?.model || 'speech-2.8-turbo'}
onValueChange={(value) =>
handleTTSProviderConfigChange(ttsProviderId, {
model: value,
})
}
>
<SelectTrigger>
<SelectValue placeholder={t('settings.ttsModelPlaceholder')} />
</SelectTrigger>
<SelectContent>
{MINIMAX_TTS_MODELS.map((model) => (
<SelectItem key={model.id} value={model.id}>
{model.name}
</SelectItem>
))}
</SelectContent>
</Select>
</div>
)}
</>
)}
</div>
</div>
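
Review note: the new `<Select>` falls back to `'speech-2.8-turbo'` only when the stored `model` is empty. A stored value that is no longer in `MINIMAX_TTS_MODELS` would still be passed through. The sketch below shows a stricter variant that also validates against the option list. It is an alternative for discussion, not what the PR implements. `resolveSelectedModel` is a hypothetical helper and the second list entry is a placeholder, since the real contents of `MINIMAX_TTS_MODELS` are not shown in this diff.

```typescript
// Shape inferred from the JSX above, which reads model.id and model.name.
interface TTSModelOption {
  id: string;
  name: string;
}

// Placeholder list for illustration only; not the real MINIMAX_TTS_MODELS.
const EXAMPLE_TTS_MODELS: TTSModelOption[] = [
  { id: 'speech-2.8-turbo', name: 'speech-2.8-turbo' },
  { id: 'placeholder-model', name: 'Placeholder model' },
];

// Returns the configured model only if it is still a known option,
// otherwise the fallback ID.
function resolveSelectedModel(
  options: readonly TTSModelOption[],
  configured: string | undefined,
  fallbackId: string,
): string {
  if (configured && options.some((option) => option.id === configured)) {
    return configured;
  }
  return fallbackId;
}

const selected = resolveSelectedModel(EXAMPLE_TTS_MODELS, undefined, 'speech-2.8-turbo');
```

With this variant a model ID persisted by an older build silently resets to the fallback instead of rendering a `<Select>` whose value matches no item.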
5 changes: 5 additions & 0 deletions components/settings/index.tsx
@@ -123,6 +123,7 @@ function getTTSProviderName(providerId: TTSProviderId, t: (key: string) => strin
'qwen-tts': t('settings.providerQwenTTS'),
'doubao-tts': t('settings.providerDoubaoTTS'),
'elevenlabs-tts': t('settings.providerElevenLabsTTS'),
'minimax-tts': t('settings.providerMiniMaxTTS'),
'browser-native-tts': t('settings.providerBrowserNativeTTS'),
};
return names[providerId];
@@ -142,13 +143,15 @@ const IMAGE_PROVIDER_NAMES: Record<ImageProviderId, string> = {
seedream: 'providerSeedream',
'qwen-image': 'providerQwenImage',
'nano-banana': 'providerNanoBanana',
'minimax-image': 'providerMiniMaxImage',
'grok-image': 'providerGrokImage',
};

const IMAGE_PROVIDER_ICONS: Record<ImageProviderId, string> = {
seedream: '/logos/doubao.svg',
'qwen-image': '/logos/bailian.svg',
'nano-banana': '/logos/gemini.svg',
'minimax-image': '/logos/minimax.svg',
'grok-image': '/logos/grok.svg',
};

@@ -157,6 +160,7 @@ const VIDEO_PROVIDER_NAMES: Record<VideoProviderId, string> = {
kling: 'providerKling',
veo: 'providerVeo',
sora: 'providerSora',
'minimax-video': 'providerMiniMaxVideo',
'grok-video': 'providerGrokVideo',
};

@@ -165,6 +169,7 @@ const VIDEO_PROVIDER_ICONS: Record<VideoProviderId, string> = {
kling: '/logos/kling.svg',
veo: '/logos/gemini.svg',
sora: '/logos/openai.svg',
'minimax-video': '/logos/minimax.svg',
'grok-video': '/logos/grok.svg',
};
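
Review note: because these maps are typed as `Record<VideoProviderId, string>`, adding `'minimax-video'` to the ID union makes the compiler reject any map that lacks the new key, which is why each map gains an entry in this PR. The sketch below illustrates that property with a single combined table. It is a possible consolidation, not the project's structure. `VideoProviderIdSketch` lists only the IDs visible in this hunk (the real union may have more), and `VIDEO_PROVIDER_META` is a hypothetical name.

```typescript
// Only the IDs visible in this hunk; the real VideoProviderId may include others.
type VideoProviderIdSketch = 'kling' | 'veo' | 'sora' | 'minimax-video' | 'grok-video';

interface ProviderMeta {
  nameKey: string; // i18n key, as in VIDEO_PROVIDER_NAMES
  icon: string; // logo path, as in VIDEO_PROVIDER_ICONS
}

// Record<Union, T> requires exactly one entry per union member, so omitting
// 'minimax-video' here would be a compile-time error.
const VIDEO_PROVIDER_META: Record<VideoProviderIdSketch, ProviderMeta> = {
  kling: { nameKey: 'providerKling', icon: '/logos/kling.svg' },
  veo: { nameKey: 'providerVeo', icon: '/logos/gemini.svg' },
  sora: { nameKey: 'providerSora', icon: '/logos/openai.svg' },
  'minimax-video': { nameKey: 'providerMiniMaxVideo', icon: '/logos/minimax.svg' },
  'grok-video': { nameKey: 'providerGrokVideo', icon: '/logos/grok.svg' },
};

function videoProviderIcon(id: VideoProviderIdSketch): string {
  return VIDEO_PROVIDER_META[id].icon;
}
```

Keeping the name key and the icon in one record means a new provider touches one table instead of two, at the cost of a slightly larger refactor than this PR needs.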
