Google Gemini

Invoke supports Google’s Gemini image generation models through the Gemini API. This provider is a good fit if you want high-quality text-to-image and reference-based image edits without running a local model.

  1. Open Google AI Studio and sign in with your Google account.
  2. Generate a new API key.
  3. Note the key — it will only be shown once.

Add your key to api_keys.yaml in your Invoke root directory:

external_gemini_api_key: "your-gemini-api-key"
# Optional — only set this if you need to route requests through a different endpoint
external_gemini_base_url: "https://generativelanguage.googleapis.com"

Restart Invoke for the change to take effect.
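Before restarting, it can be useful to confirm that the key is accepted at all. The following is a minimal sketch, not part of Invoke: it assumes the public Gemini REST API's model-listing endpoint (`/v1beta/models`) and its `x-goog-api-key` header, and the function names `build_models_request` and `key_is_accepted` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.error
import urllib.request

# Default endpoint, matching the optional external_gemini_base_url above.
DEFAULT_BASE_URL = "https://generativelanguage.googleapis.com"


def build_models_request(api_key, base_url=DEFAULT_BASE_URL):
    """Build a GET request for the model-listing endpoint (hypothetical helper)."""
    url = base_url.rstrip("/") + "/v1beta/models"
    return urllib.request.Request(
        url,
        headers={"x-goog-api-key": api_key},
        method="GET",
    )


def key_is_accepted(api_key, base_url=DEFAULT_BASE_URL, timeout=10.0):
    """Return True if the endpoint answers the request, False on an HTTP error.

    Network failures (DNS, timeouts) are not caught and propagate as exceptions.
    """
    request = build_models_request(api_key, base_url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            json.load(response)  # the body should be a JSON list of models
            return True
    except urllib.error.HTTPError:
        # The API rejected the request, for example because the key is invalid.
        return False
```

Calling `key_is_accepted("your-gemini-api-key")` performs one small request. If you set a custom `external_gemini_base_url`, pass the same value as `base_url` so the check exercises the endpoint Invoke will use.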

Model | Modes | Reference Images | Notes
Gemini 2.5 Flash Image | txt2img | Yes | 10 aspect ratios, fixed per-ratio resolutions.
Gemini 3 Pro Image Preview | txt2img | Up to 14 (6 object + 5 character) | 1K / 2K / 4K resolution presets.
Gemini 3.1 Flash Image Preview | txt2img | Up to 14 (10 object + 4 character) | 512 / 1K / 2K / 4K resolution presets.

Reference images condition generation, but the request still counts as txt2img: Gemini supports neither img2img (denoising strength) nor inpainting (mask).
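To make the distinction concrete, here is a rough sketch of how a prompt and its reference images might travel together in one request. It assumes the public Gemini REST `generateContent` body format (`contents`, `parts`, `inline_data`). The helper name `build_request_body` is hypothetical, and this is not necessarily how Invoke assembles its requests.

```python
import base64


def build_request_body(prompt, reference_pngs):
    """Combine a text prompt and PNG references into one request body.

    `reference_pngs` is a list of raw PNG file contents (bytes). There is
    no mask and no denoising strength here: the references are simply
    extra parts alongside the prompt.
    """
    parts = [{"text": prompt}]
    for png_bytes in reference_pngs:
        parts.append(
            {
                "inline_data": {
                    "mime_type": "image/png",
                    # Binary data is base64-encoded to fit inside JSON.
                    "data": base64.b64encode(png_bytes).decode("ascii"),
                }
            }
        )
    return {"contents": [{"parts": parts}]}
```

Base64 encoding grows each image by roughly a third, so the request size scales directly with the size of the references you attach.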

All Gemini models are single-image-per-request — batch size is fixed at 1. To generate multiple variations, queue multiple invocations.

  1. Reference images are sent directly to the API as inlined PNG data. Large references increase request latency and cost — crop tightly where possible.
  2. Aspect ratios are mapped to the closest Gemini-supported ratio. For Gemini 3 models, use the resolution presets to stay at the provider’s native output sizes and avoid unnecessary rescaling.
  3. Pricing varies by model and region. Check Google’s documentation before running large batches.
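The aspect-ratio mapping mentioned above can be illustrated with a short sketch. This is one plausible way to pick the closest supported ratio, not Invoke's actual implementation. The list in `SUPPORTED_RATIOS` is an assumption based on the ten ratios Google documents for Gemini 2.5 Flash Image, and `closest_ratio` is a hypothetical helper.

```python
import math

# Assumed list of supported ratios (width:height); check Google's
# documentation for the authoritative set for each model.
SUPPORTED_RATIOS = (
    "1:1", "2:3", "3:2", "3:4", "4:3",
    "4:5", "5:4", "9:16", "16:9", "21:9",
)


def closest_ratio(width, height, supported=SUPPORTED_RATIOS):
    """Return the supported ratio label nearest to width/height.

    Distance is measured on a log scale so that a ratio and its
    inverse (for example 16:9 and 9:16) are treated symmetrically.
    """
    target = math.log(width / height)

    def distance(label):
        w, h = label.split(":")
        return abs(math.log(int(w) / int(h)) - target)

    return min(supported, key=distance)
```

For example, `closest_ratio(1920, 1080)` returns `"16:9"`. Requesting dimensions that already match a supported ratio avoids any remapping.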