The Diffuser Module
The Udility.diffuser module is the core engine of the Udility library. It orchestrates the process of converting natural language prompts into high-quality, labeled SVG illustrations and subsequently rendering them into raster formats for display and distribution.
The module leverages Meta Llama 3.5 (via OpenRouter) to "reverse engineer" image descriptions into precise SVG scripting instructions.
Core Functions
generate_image_from_text
This is the primary entry point for the module. It handles the end-to-end pipeline: generating instructions, creating SVG code, converting to PNG, and displaying the result.
Signature:
generate_image_from_text(text_description, output_filename='output.png')
| Parameter | Type | Description |
| :--- | :--- | :--- |
| text_description | str | A natural language description of the illustration you want to generate (e.g., "The water cycle"). |
| output_filename | str | (Optional) The file path where the resulting PNG will be saved. Defaults to 'output.png'. |
Example:
from Udility import diffuser
# Generates a PNG and displays it in your environment
diffuser.generate_image_from_text("A labeled diagram of a plant cell.")
Utility Functions
While generate_image_from_text is recommended for most users, the module provides utility functions for more granular control over the image generation lifecycle.
svg_to_png
Converts raw SVG code into a PNG image file. This is useful if you have custom SVG scripts or have modified the output of the LLM.
Signature:
svg_to_png(svg_code, output_filename='output.png')
| Parameter | Type | Description |
| :--- | :--- | :--- |
| svg_code | str | The raw SVG XML string. |
| output_filename | str | The destination path for the saved PNG. |
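Because svg_to_png expects well-formed markup, it can be worth sanity-checking custom SVG with the standard library before rasterizing it. The sketch below uses a hand-written SVG string (illustrative only, not LLM output); the final call into the module is shown commented out:

```python
import xml.etree.ElementTree as ET

# A minimal hand-written SVG: a labeled circle.
svg_code = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="200" height="200">'
    '<circle cx="100" cy="90" r="60" fill="lightblue"/>'
    '<text x="100" y="180" text-anchor="middle">Cell</text>'
    '</svg>'
)

# Confirm the markup parses as XML before handing it to svg_to_png.
root = ET.fromstring(svg_code)

# from Udility import diffuser
# diffuser.svg_to_png(svg_code, output_filename='cell.png')
```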
display_image
Renders a saved image file using matplotlib. This is optimized for use in Jupyter Notebooks or Google Colab environments.
Signature:
display_image(image_path)
| Parameter | Type | Description |
| :--- | :--- | :--- |
| image_path | str | The path to the image file to be displayed. |
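A common pattern is to confirm the file is actually on disk before displaying it. The helper below is a hypothetical wrapper (not part of the module) that takes the display function as an argument:

```python
import os

def show_if_present(image_path, display_fn):
    """Call display_fn(image_path) (e.g. diffuser.display_image)
    only if the file exists; return whether it was shown."""
    if not os.path.exists(image_path):
        return False
    display_fn(image_path)
    return True

# Usage with the real module:
# from Udility import diffuser
# show_if_present('output.png', diffuser.display_image)
```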
Internal Logic Functions
The following functions are used internally by the module to communicate with the Meta Llama 3.5 model. While accessible, they are typically managed by the high-level generate_image_from_text function.
get_detailed_instructions
Sends the user's prompt to the LLM to generate a specific technical plan for an SVG illustration.
- Input: text_description (str)
- Returns: Detailed instruction set (str)
generate_svg_from_instructions
Takes the technical plan and converts it into valid SVG code.
- Input: detailed_instructions (str)
- Returns: Raw SVG code (str)
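Chained together, these internals reproduce what generate_image_from_text does end to end. The helper below is a hypothetical sketch (not part of the module) that takes the stage functions as arguments, which also makes the flow easy to exercise with stubs:

```python
def run_pipeline(text_description, instruct, to_svg, rasterize,
                 output_filename='output.png'):
    """Hypothetical composition of the diffuser stages:
    instruct  -> get_detailed_instructions
    to_svg    -> generate_svg_from_instructions
    rasterize -> svg_to_png
    """
    plan = instruct(text_description)      # prompt -> technical plan
    svg_code = to_svg(plan)                # plan -> raw SVG
    rasterize(svg_code, output_filename)   # SVG -> PNG on disk
    return svg_code

# With the real module this would be:
# from Udility import diffuser
# run_pipeline("The water cycle",
#              diffuser.get_detailed_instructions,
#              diffuser.generate_svg_from_instructions,
#              diffuser.svg_to_png)
```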
Configuration & Requirements
The module requires a valid OpenRouter API key to communicate with the inference models. Ensure this is set in your environment before calling any module functions:
import os
os.environ['OPENROUTER_API_KEY'] = 'your_api_key_here'
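A defensive pattern is to fail fast with a clear message when the key is missing, rather than letting a request error surface mid-pipeline. The helper below is an illustrative sketch, not part of the module:

```python
import os

def require_openrouter_key():
    """Raise early with a clear message if the API key is not set."""
    key = os.environ.get('OPENROUTER_API_KEY')
    if not key:
        raise RuntimeError(
            'OPENROUTER_API_KEY is not set; export it before using '
            'Udility.diffuser.'
        )
    return key
```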
Note: The module relies on cairosvg for rendering. If you are on a local machine (not Colab), ensure that the Cairo graphics library is installed on your system.