# Gemini Image Gen
Generate and edit images via the Google Gemini API using pure Python stdlib. Supports Gemini native generation + editing, Imagen 3 generation, batch runs, and an HTML gallery output.
## Quick Start
```bash
export GEMINI_API_KEY="your-key-here"

# Default: Gemini native, 4 random prompts
python3 scripts/gen.py

# Custom prompt
python3 scripts/gen.py --prompt "a cyberpunk cat riding a neon motorcycle through Tokyo at night"

# Imagen 3 engine
python3 scripts/gen.py --engine imagen --count 4 --aspect 16:9

# Edit an existing image (Gemini engine only)
python3 scripts/gen.py --edit path/to/image.png --prompt "change the background to a sunset beach"

# Use a style preset
python3 scripts/gen.py --style watercolor --prompt "floating islands above a calm sea"

# List available styles
python3 scripts/gen.py --styles
```
## Style Presets
| Style | Description |
| --- | --- |
| `photo` | Ultra-detailed photorealistic photography, 8K resolution, sharp focus |
| `anime` | High-quality anime illustration, Studio Ghibli inspired, vibrant colors |
| `watercolor` | Delicate watercolor painting on textured paper, soft edges, gentle color bleeding |
| `cyberpunk` | Neon-lit cyberpunk scene, rain-soaked streets, holographic displays, Blade Runner aesthetic |
| `minimalist` | Clean minimalist design, geometric shapes, limited color palette, white space |
| `oil-painting` | Classical oil painting with visible brushstrokes, rich textures, Renaissance lighting |
| `pixel-art` | Detailed pixel art, retro 16-bit style, crisp edges, nostalgic palette |
| `sketch` | Pencil sketch on cream paper, hatching and cross-hatching, artistic imperfections |
| `3d-render` | Professional 3D render, ambient occlusion, global illumination, photorealistic materials |
| `pop-art` | Bold pop art style, Ben-Day dots, strong outlines, vibrant contrasting colors |
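
The CLI reference describes `--style` as a preset that is prepended to the prompt. A minimal sketch of that idea follows. The `STYLE_PRESETS` dictionary, the `apply_style` function, and the `". "` separator are all hypothetical names and choices made for illustration; the actual implementation in `scripts/gen.py` may combine the text differently.

```python
# Hypothetical subset of the preset table; the real mapping lives in
# scripts/gen.py and may use different wording.
STYLE_PRESETS = {
    "watercolor": "Delicate watercolor painting on textured paper, "
                  "soft edges, gentle color bleeding",
    "pixel-art": "Detailed pixel art, retro 16-bit style, crisp edges, "
                 "nostalgic palette",
}


def apply_style(prompt, style=None):
    """Return the prompt with the preset description prepended.

    The ". " separator is an assumption, not taken from the script.
    """
    if style is None:
        return prompt
    if style not in STYLE_PRESETS:
        raise ValueError(f"unknown style: {style!r}")
    return f"{STYLE_PRESETS[style]}. {prompt}"
```

Under these assumptions, `apply_style("floating islands above a calm sea", "watercolor")` yields a single prompt string that starts with the watercolor description and ends with the user's text.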
## Full CLI Reference
| Flag | Default | Description |
| --- | --- | --- |
| `--prompt` | (random) | Text prompt. Omit for random creative prompts |
| `--count` | 4 | Number of images to generate |
| `--engine` | gemini | Engine: `gemini` (native, supports edit) or `imagen` (Imagen 3) |
| `--model` | (auto) | Model override. Default: `gemini-2.5-flash-image` or `imagen-3.0-generate-002` |
| `--edit` | | Path to input image for editing (Gemini engine only) |
| `--aspect` | 1:1 | Aspect ratio for Imagen: `1:1`, `16:9`, `9:16`, `4:3`, `3:4` |
| `--out-dir` | (auto) | Output directory (default is a timestamped folder) |
| `--style` | | Style preset to prepend to the prompt |
| `--styles` | | List available style presets and exit |
## Python Example
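```python
import subprocess

# Invoke the generator as a child process. check=True raises
# subprocess.CalledProcessError if the script exits with a non-zero status.
subprocess.run(
    [
        "python3",
        "scripts/gen.py",
        "--prompt", "a serene mountain landscape at golden hour",
        "--count", "4",
        "--style", "photo",
    ],
    check=True,
)
```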
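
When the script is called from several places, it can help to assemble the argument list in one function and validate the documented constraints before spawning a process. The sketch below is one way to do that; `build_gen_command` is a hypothetical helper, not part of the repository, and it only encodes the rules stated in the CLI reference (editing requires the Gemini engine, and `--aspect` accepts five listed ratios).

```python
ENGINES = ("gemini", "imagen")
ASPECTS = ("1:1", "16:9", "9:16", "4:3", "3:4")


def build_gen_command(prompt=None, count=4, engine="gemini",
                      style=None, aspect=None, edit=None, out_dir=None):
    """Build the argv list for scripts/gen.py from keyword arguments."""
    if engine not in ENGINES:
        raise ValueError(f"engine must be one of {ENGINES}")
    if edit is not None and engine != "gemini":
        raise ValueError("--edit is only supported by the gemini engine")
    if aspect is not None and aspect not in ASPECTS:
        raise ValueError(f"aspect must be one of {ASPECTS}")

    cmd = ["python3", "scripts/gen.py",
           "--engine", engine, "--count", str(count)]
    # Optional flags are appended only when a value was supplied.
    for flag, value in (("--prompt", prompt), ("--style", style),
                        ("--aspect", aspect), ("--edit", edit),
                        ("--out-dir", out_dir)):
        if value is not None:
            cmd += [flag, value]
    return cmd
```

The result can be passed straight to `subprocess.run(build_gen_command(prompt="...", style="photo"), check=True)`.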
## Troubleshooting
- Missing API key: set `GEMINI_API_KEY` in your environment and retry.
- Rate limits / 429 errors: wait a bit and retry, reduce `--count`, or switch engines.
- Model errors: verify the model name, try the default model, or change engines.
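
The first two items above can be automated in a wrapper: check for the key up front, then retry with an increasing delay. This is a rough sketch only. `run_with_retry` is a hypothetical helper, and it assumes the script signals failure through a non-zero exit code, which this README does not confirm. It also cannot tell a rate limit apart from any other failure, so it retries on every error.

```python
import os
import subprocess
import time


def run_with_retry(cmd, attempts=3, base_delay=2.0,
                   runner=subprocess.run, sleep=time.sleep):
    """Run cmd, retrying with exponential backoff on non-zero exit.

    runner and sleep are injectable so the logic can be exercised
    without spawning a process or actually waiting.
    """
    if not os.environ.get("GEMINI_API_KEY"):
        raise RuntimeError("GEMINI_API_KEY is not set")

    for attempt in range(1, attempts + 1):
        result = runner(cmd)
        if result.returncode == 0:
            return result
        if attempt < attempts:
            # Delay doubles each time: base_delay, 2*base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"command failed after {attempts} attempts")
```

A call such as `run_with_retry(["python3", "scripts/gen.py", "--count", "2"])` would then wait 2 and 4 seconds between its three attempts with the default settings.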
## Integration with Other Skills
- **[AgentGram](https://clawhub.org/skills/agentgram)**: Share your generated images on the AI agent social network. Create visual content and post it to your AgentGram feed.
- **[agent-selfie](https://clawhub.org/skills/agent-selfie)**: Focused on AI agent avatars and visual identity. Uses the same Gemini API key for personality-driven self-portraits.
- **[opencode-omo](https://clawhub.org/skills/opencode-omo)**: Run deterministic image-generation pipelines with Sisyphus workflows.
## Changelog
- v1.3.1: Added workflow integration guidance for opencode-omo.
- v1.1.0: Added style presets, `--style` and `--styles` flags, expanded documentation.
- v1.0.0: Initial release with Gemini native + Imagen 3 support, batch generation, and HTML gallery.
## Repository
https://github.com/IISweetHeartII/gemini-image-gen