
Z-Image Turbo FP8 Hires Workflow (Low VRAM Optimized)
This is a high-efficiency ComfyUI workflow designed specifically for low-VRAM users. By combining FP8-quantized models with latent upscaling, it generates high-resolution images (1024x1792) rapidly while keeping resource usage minimal.

📂 Models Required & Downloads
To ensure the workflow functions correctly, download the following models and place them in the corresponding ComfyUI folders (see the download sketch at the end of this section):

1. UNet Model (place in models/unet/)
   - File Name: z-image-turbo-fp8-e4m3fn.safetensors
   - Download: HuggingFace - Z-Image-Turbo-FP8
2. CLIP / Text Encoder (place in models/clip/)
   - File Name: qwen3-4b-fp8-scaled.safetensors
   - Download: HuggingFace - Qwen3-4B-FP8

⚙️ Key Settings & Configuration
This workflow operates as a 2-pass system. Adhere to the following settings for the best results (the resolution and denoise math is sketched in code after this section):

🔹 Phase 1: Base Generation
- Latent Size: generates at a lower initial resolution (e.g., 512x896) to save compute.

🔹 Phase 2: Latent Upscale
- Upscale Method: uses LatentUpscaleBy.
- Scale Factor: default is 2, giving a final output of 1024x1792.

🔹 Phase 3: Hires Fix (Refiner)
This step is crucial for image clarity and detail:
- Sampler: res_multistep (highly recommended).
- Denoise: recommended range 0.5 to 0.6.
  - Below 0.5: changes are minimal and the image may remain slightly blurry.
  - Above 0.6: adds more detail, but setting it too high may alter the image structure or cause hallucinations.

📊 Performance Benchmark
Data based on actual testing:

| GPU | Output Resolution | Time |
| --- | --- | --- |
| NVIDIA RTX 5070 Ti | 1024 x 1792 | 8-9 sec |

📝 Usage Tips
- Memory Management: if you are extremely limited on VRAM, make sure no other large models are loaded in the background.
- Prompting: the Qwen text encoder has strong natural-language understanding, so detailed, sentence-based prompts work very well.
- Troubleshooting: if image details break apart or look "burnt", try slightly lowering the denoise value in the second KSampler.
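If you prefer to fetch the files programmatically rather than by hand, the sketch below shows one way to do it with huggingface_hub. The repo IDs are placeholders (only the file names are stated above), so substitute the actual repositories linked in the Models section.

```python
# Minimal download sketch using huggingface_hub.
# NOTE: the repo_id values below are placeholders, not the real repositories.
# Requires: pip install huggingface_hub
from huggingface_hub import hf_hub_download

# UNet model -> ComfyUI/models/unet/
hf_hub_download(
    repo_id="<namespace>/Z-Image-Turbo-FP8",       # placeholder repo ID
    filename="z-image-turbo-fp8-e4m3fn.safetensors",
    local_dir="ComfyUI/models/unet",
)

# CLIP / text encoder -> ComfyUI/models/clip/
hf_hub_download(
    repo_id="<namespace>/Qwen3-4B-FP8",            # placeholder repo ID
    filename="qwen3-4b-fp8-scaled.safetensors",
    local_dir="ComfyUI/models/clip",
)
```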
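The 2-pass arithmetic and the denoise window can be summarized in a few lines. This is purely illustrative and not part of the workflow itself; the clamp_denoise helper is hypothetical and simply encodes the 0.5-0.6 recommendation from Phase 3.

```python
# Illustrative sketch of the 2-pass resolution math and refiner denoise window.
BASE_W, BASE_H = 512, 896      # Phase 1 latent size
SCALE = 2.0                    # Phase 2 LatentUpscaleBy scale factor

final_w, final_h = int(BASE_W * SCALE), int(BASE_H * SCALE)
assert (final_w, final_h) == (1024, 1792)  # final output resolution

def clamp_denoise(denoise: float) -> float:
    """Keep the 2nd-pass denoise in the recommended 0.5-0.6 window.

    Below 0.5 the upscaled image stays slightly blurry; above 0.6 the
    refiner starts altering structure and may hallucinate detail.
    """
    return min(max(denoise, 0.5), 0.6)

print(f"Refiner pass: {final_w}x{final_h}, denoise={clamp_denoise(0.55)}")
```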
Features & Nodes
✨ Key Features
- Extreme Low VRAM Usage: a full FP8 pipeline (model and text encoder) drastically reduces the memory footprint.
- Lightning Fast: optimized for Turbo models and efficient sampling step counts.
- Hires Fix Pipeline: Latent Upscale plus a 2nd-pass KSampler ensures crisp detail without heavy VRAM cost.
- AuraFlow Architecture: sampling is configured via the ModelSamplingAuraFlow node.