2 changes: 2 additions & 0 deletions bin/console
Original file line number Diff line number Diff line change
@@ -23,6 +23,8 @@ RubyLLM.configure do |config|
config.openai_api_key = ENV.fetch('OPENAI_API_KEY', nil)
config.openrouter_api_key = ENV.fetch('OPENROUTER_API_KEY', nil)
config.perplexity_api_key = ENV.fetch('PERPLEXITY_API_KEY', nil)
config.replicate_api_key = ENV.fetch('REPLICATE_API_KEY', nil)
config.replicate_webhook_url = ENV.fetch('REPLICATE_WEBHOOK_URL', nil)
config.vertexai_location = ENV.fetch('GOOGLE_CLOUD_LOCATION', nil)
config.vertexai_project_id = ENV.fetch('GOOGLE_CLOUD_PROJECT', nil)
end
66 changes: 62 additions & 4 deletions docs/_core_features/image-generation.md
@@ -26,6 +26,7 @@ After reading this guide, you will know:
* How to generate images from text prompts.
* How to select different image generation models.
* How to specify image sizes (for supported models).
* How to use model hosting platforms like Replicate.
* How to access and save generated image data (URL or Base64).
* How to integrate image generation with Rails Active Storage.
* Tips for writing effective image prompts.
@@ -119,12 +120,32 @@ image_portrait = RubyLLM.paint(
)
```

> Not all models support size customization. If a size is specified for a model that doesn't support it (like Google Imagen), RubyLLM may log a debug message indicating the size parameter is ignored. Check the provider's documentation or the [Available Models Guide]({% link _reference/available-models.md %}) for supported sizes.
> Not all models support size customization. Check the provider's documentation or the [Available Models Guide]({% link _reference/available-models.md %}) for supported sizes.
{: .note }

## Using model hosting platforms

Platforms like Replicate host a large collection of models with different capabilities and parameters. Because supported parameters vary from model to model, check the platform's documentation for the model you're using.

For example, Imagen 4 Ultra accepts an optional `aspect_ratio` parameter when used [via Replicate](https://replicate.com/google/imagen-4-ultra). To set it, pass it as a keyword argument to your `paint` call.

```ruby
image = RubyLLM.paint(
  "A photorealistic image of a red panda coding Ruby on a laptop",
  model: "google/imagen-4-ultra",
  provider: :replicate,
  aspect_ratio: "16:9"
)
```

> When you switch models, you'll typically need to change the parameters too, because each model supports a different set. Always check the model's documentation for the parameters it accepts.
{: .note }

## Working with Generated Images

The `RubyLLM::Image` object provides access to the generated image data and metadata.
When models return the image immediately, the `RubyLLM::Image` object provides access to the generated image data and metadata.

Some models generate images asynchronously. In this case, you receive a `RubyLLM::DeferredImage` object instead. You can still access the image data, but you need to either wait for generation to finish or fetch the image later, typically after a webhook notifies you.

### Accessing Image Data

@@ -133,6 +154,10 @@ The `RubyLLM::Image` object provides access to the generated image data and meta
* `image.mime_type`: Returns the MIME type (e.g., `"image/png"`, `"image/jpeg"`).
* `image.base64?`: Returns `true` if the image data is Base64-encoded, `false` otherwise.

### Accessing Deferred Image Data

* `deferred_image.url`: Returns the URL where you can check whether the image has been generated.

### Saving Images Locally

The `save` method works regardless of whether the image was delivered via URL or Base64. It fetches the data if necessary and writes it to the specified file path.
@@ -150,9 +175,42 @@ rescue => e
end
```

For deferred images, the `save` method will write the file and return its path only if the image has been generated. Otherwise, it will return `nil`.

The ideal way to handle deferred images is by having the provider notify you via webhook when the image is generated, and then fetching the image using `RubyLLM.image_from` as follows:

```ruby
output_url = webhook_payload['output']
image = RubyLLM.image_from(output_url, provider: :replicate) # returns a RubyLLM::Image instance
image.save("cartoon_panda.png")
```
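
The snippet above assumes `webhook_payload` is already parsed and that `output` holds a single URL. As a minimal sketch of the step before it, the helper below pulls the URL out of a raw request body. The `output_url_from` name, the `"succeeded"` status check, and the handling of `output` as either a string or an array are assumptions for illustration; confirm the actual payload shape in your provider's documentation.

```ruby
require "json"

# Hypothetical helper (not part of RubyLLM): extracts the image URL from a
# raw webhook request body. Assumes a JSON payload with "status" and
# "output" keys, where "output" is a single URL or an array of URLs.
def output_url_from(raw_body)
  payload = JSON.parse(raw_body)

  # Ignore events sent before the image is ready (assumed status value).
  return nil unless payload["status"] == "succeeded"

  # Array() wraps a single string and leaves an array untouched.
  Array(payload["output"]).first
end
```

If the helper returns a URL, you can pass it to `RubyLLM.image_from` as shown above. If it returns `nil`, the event did not carry a finished image.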

If you can't configure a webhook, you can poll instead: call `save` repeatedly, with a delay between calls, until it succeeds or you run out of attempts. Check your provider's documentation for any rate limits that may apply.

```ruby
# Only do this if you're not able to configure a webhook
def save_deferred_image(image, path, remaining_attempts = 10)
  return nil if remaining_attempts <= 0

  # `save` returns nil until the image is ready; wait 2 seconds, then retry
  image.save(path) || (sleep(2) && save_deferred_image(image, path, remaining_attempts - 1))
end

save_deferred_image(deferred_image, "deferred_image.png")
```
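
The same polling idea can also be written as a loop, which makes the attempt count and delay easier to tune. This is a sketch only: `poll_and_save`, its keyword arguments, and the `FakeDeferredImage` stand-in are hypothetical names for illustration, and the helper relies solely on the behavior described above, where `save` returns the path once the image exists and `nil` otherwise.

```ruby
# Hypothetical helper (not part of RubyLLM): polls any object whose
# `save(path)` returns the path on success and nil while still pending.
# `sleeper` is injectable so the delay can be skipped in tests.
def poll_and_save(image, path, attempts: 10, delay: 2, sleeper: ->(s) { sleep(s) })
  attempts.times do |i|
    saved = image.save(path)
    return saved if saved

    # Wait between attempts, but not after the final one.
    sleeper.call(delay) if i < attempts - 1
  end
  nil
end

# Stand-in for a deferred image, used only to demonstrate the helper.
# It reports "not ready" for the first `ready_after` calls to `save`.
class FakeDeferredImage
  def initialize(ready_after)
    @ready_after = ready_after
    @calls = 0
  end

  def save(path)
    @calls += 1
    @calls > @ready_after ? path : nil
  end
end

image = FakeDeferredImage.new(2)
poll_and_save(image, "deferred_image.png", delay: 0)
```

With a real deferred image, you would pass the object returned by `RubyLLM.paint` and keep a non-zero delay that respects the provider's rate limits.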

#### Configuring Replicate Webhooks

To configure Replicate webhooks, set the following configuration options:

```ruby
RubyLLM.configure do |config|
  config.replicate_webhook_url = "https://example.com/your-webhook-path"
  # Optional: choose which webhook events to receive (see Replicate's documentation)
  config.replicate_webhook_events_filter = %w[output completed]
end
```

### Getting Raw Image Blob

The `to_blob` method returns the raw binary image data (decoded from Base64 or downloaded from URL). This is useful for integration with other libraries or frameworks.
The `to_blob` method returns the raw binary image data (decoded from Base64 or downloaded from URL). This is useful for integration with other libraries or frameworks. Deferred images return `nil` if the image has not finished generating.

```ruby
image = RubyLLM.paint("Abstract geometric patterns in pastel colors")
@@ -277,4 +335,4 @@ Image generation can take several seconds (typically 5-20 seconds depending on t

* [Chatting with AI Models]({% link _core_features/chat.md %}): Learn about conversational AI.
* [Embeddings]({% link _core_features/embeddings.md %}): Explore text vector representations.
* [Error Handling]({% link _advanced/error-handling.md %}): Master handling API errors.
* [Error Handling]({% link _advanced/error-handling.md %}): Master handling API errors.
4 changes: 3 additions & 1 deletion docs/_getting_started/configuration.md
@@ -59,6 +59,8 @@ RubyLLM.configure do |config|
config.mistral_api_key = ENV['MISTRAL_API_KEY']
config.perplexity_api_key = ENV['PERPLEXITY_API_KEY']
config.openrouter_api_key = ENV['OPENROUTER_API_KEY']
config.replicate_api_key = ENV['REPLICATE_API_KEY']
config.replicate_webhook_url = ENV['REPLICATE_WEBHOOK_URL']

# Local providers
config.ollama_api_base = 'http://localhost:11434/v1'
@@ -363,4 +365,4 @@ Now that you've configured RubyLLM, you're ready to:

- [Start chatting with AI models]({% link _core_features/chat.md %})
- [Work with different providers and models]({% link _advanced/models.md %})
- [Set up Rails integration]({% link _advanced/rails.md %})
- [Set up Rails integration]({% link _advanced/rails.md %})