ComfyUI Beginner Guide: From Installation to Your First Stable Diffusion Image

The easiest place to get stuck as a ComfyUI beginner is not prompt writing. It is the first time you open the interface and see a whole node graph: Load Checkpoint, CLIP Text Encode, KSampler, VAE Decode, and Save Image. It is natural to assume you must learn diffusion theory before you can generate one image.

You do not. At the beginning, treat ComfyUI as a visual generation pipeline. The checkpoint supplies the trained model weights, the prompt describes the image, the sampler runs the step-by-step generation process, and the save node writes the output. This guide focuses on one narrow goal: choose an installation route, place your model in the right folder, run the default text-to-image workflow, and troubleshoot the first round of errors in a sane order.

Quick Decision Table

| Your situation | Suggested route | Do not start with |
| --- | --- | --- |
| Windows + NVIDIA GPU, you just want an image quickly | Desktop or Windows portable | Manually configuring Python on day one |
| macOS Apple Silicon | Desktop | Following Windows CUDA tutorials |
| Linux or you need PyTorch/CUDA control | Manual install | Copying someone else’s environment variables blindly |
| No local GPU yet, you want to understand workflows | Comfy Cloud | Buying hardware or model packs immediately |
| You already have an Automatic1111 model library | Local install + extra_model_paths.yaml | Copying tens of gigabytes of models twice |

Your first-day target is not to make a beautiful image. It is to confirm three things: ComfyUI starts, Load Checkpoint can see a model, and the default text-to-image workflow can finish. Once those are true, LoRA, ControlNet, IP-Adapter, and more complex workflows become much easier to debug.

What ComfyUI Actually Is

ComfyUI is an open-source, node-based interface and inference engine for generative AI. Unlike form-style Stable Diffusion front ends, where you fill in prompt, size, and seed in one panel, ComfyUI lays the whole workflow out on a canvas.

A minimal text-to-image workflow can be split into five parts:

  1. Load Checkpoint: loads the base model, such as SD 1.5, SDXL, or another checkpoint.
  2. CLIP Text Encode: turns positive and negative prompts into conditioning the model can use.
  3. KSampler: generates latent data according to seed, steps, CFG, and sampler settings, starting from the blank latent whose size is set by the Empty Latent Image node.
  4. VAE Decode: decodes the latent into an image.
  5. Save Image: saves the result to the output folder.

That is the first chain a beginner should understand. Complex workflows usually add more nodes between these steps: ControlNet reads a pose or structure image, IP-Adapter references an image, LoRA changes style or identity, and upscale nodes enlarge the output.
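To make the chain concrete, here is a minimal sketch of the same graph written as plain data, loosely modeled on the JSON ComfyUI uses when a workflow is exported in API format. The node class names are ComfyUI's built-in ones; the model file name, prompts, and parameter values are placeholders chosen for illustration, and `execution_order` is a hypothetical helper, not part of ComfyUI. The Empty Latent Image node is included because the sampler needs a starting latent.

```python
# Illustrative only: the default text-to-image graph as a Python dict.
# Links are written as [source_node_id, output_index].
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "example_model.safetensors"}},  # placeholder name
    "2": {"class_type": "CLIPTextEncode",  # positive prompt
          "inputs": {"text": "a small wooden cabin beside a lake", "clip": ["1", 1]}},
    "3": {"class_type": "CLIPTextEncode",  # negative prompt
          "inputs": {"text": "blurry, low quality", "clip": ["1", 1]}},
    "4": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 512, "height": 512, "batch_size": 1}},
    "5": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["2", 0], "negative": ["3", 0],
                     "latent_image": ["4", 0], "seed": 42, "steps": 20, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0}},
    "6": {"class_type": "VAEDecode",
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveImage",
          "inputs": {"images": ["6", 0], "filename_prefix": "first_image"}},
}


def upstream_ids(node):
    """Return ids of the nodes this node reads from."""
    return [value[0] for value in node["inputs"].values() if isinstance(value, list)]


def execution_order(graph):
    """Order node ids so each node comes after everything it depends on."""
    ordered, seen = [], set()

    def visit(node_id):
        if node_id in seen:
            return
        seen.add(node_id)
        for dep in upstream_ids(graph[node_id]):
            visit(dep)
        ordered.append(node_id)

    for node_id in graph:
        visit(node_id)
    return ordered


order = [workflow[node_id]["class_type"] for node_id in execution_order(workflow)]
print(" -> ".join(order))
```

Reading the printed order from left to right matches how you read the canvas: loader first, save node last.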

Why the Node Graph Looks Intimidating

ComfyUI exposes steps that many other tools hide. That gives you control, but the first impression can be rough. You do not need to understand every node on day one. Read the workflow like a stream: left to right, top to bottom, then find the model, prompts, sampler, decode, and save nodes.

Troubleshooting follows the same direction. If the model is not loaded, later nodes cannot work. If the prompt is vague, the result may be weak. If sampler settings are changed randomly, the output can become unstable. If the save node is not connected, you may think nothing was generated.

How to Choose an Installation Route

The official documentation describes several routes, including Desktop, portable, manual installation, and cloud. A beginner does not need the most powerful path. You need the path with the least friction.

Desktop: Best for Most First Attempts

Desktop is the low-friction option. You do not have to decide on Python versions, PyTorch builds, CUDA backends, or virtual environments immediately. For macOS Apple Silicon users, Desktop is also the more natural official starting point.

Know the tradeoff: the official docs describe Desktop as being based on stable releases, so the newest features may arrive later than in portable or manual setups. That does not matter much for your first image. When you start needing specific new nodes, model formats, or plugin compatibility, then you can revisit the install route.

Windows Portable: Good for NVIDIA GPU Users

Windows with an NVIDIA GPU is a common local Stable Diffusion setup. The portable package is useful because the folder structure is easy to inspect. You can directly see folders such as ComfyUI/models/ and ComfyUI/output/, which makes model-folder debugging easier.

If your goal is to learn ComfyUI, portable is enough. Do not install dozens of custom nodes, Manager extensions, ten checkpoints, and a stack of LoRA files on the first day. Many beginner failures come from installing too much before the default workflow has ever worked.

Manual Install: For People Who Need Environment Control

Manual installation is better for Linux users, developers, or anyone who already knows they need to control Python, PyTorch, CUDA, ROCm, or MPS. It is flexible, but it also creates more places where errors can appear.

If you choose manual installation, treat the environment as a small project:

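# Get the source and enter the repository
git clone https://github.com/comfy-org/ComfyUI.git
cd ComfyUI
# Create and activate an isolated environment (Linux/macOS shell syntax)
python -m venv venv
source venv/bin/activate
# Install dependencies, then start the server
pip install -r requirements.txt
python main.py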

This command block shows the general shape of a manual setup. The actual PyTorch backend, GPU driver, and operating-system details should follow the official manual installation docs. Do not copy an old CUDA command from a random tutorial and assume it still fits your machine.
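Before installing dependencies, it can help to confirm that the virtual environment is actually active, since installing into the system Python is a common source of confusion. The following is a small sketch using only the standard library; `in_virtualenv` is a hypothetical helper name.

```python
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv (prefix differs from the base install)."""
    return sys.prefix != sys.base_prefix


print("Python", sys.version.split()[0], "| virtual environment active:", in_virtualenv())
```

If this reports that no virtual environment is active, activate it again before running `pip install`.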

Cloud: Useful for Learning the Workflow Concept

If you do not have a local GPU, or you only want to see whether ComfyUI makes sense for you, a cloud route can help. It is not a substitute for learning a local environment, but it lets you understand nodes, workflows, models, and prompts without buying hardware first.

Once you know you want to use ComfyUI regularly, come back to the local setup. That is usually a better sequence than downloading a huge model library before you know what you need.

Where Model Files Should Go

Many ComfyUI beginner problems eventually become the same question: why is Load Checkpoint empty?

The official docs explain that most installations do not include base models by default. Models usually live under the models/ directory inside the ComfyUI installation. Common subfolders include:

| File type | Common folder | Purpose |
| --- | --- | --- |
| checkpoint / .safetensors / .ckpt | ComfyUI/models/checkpoints/ | Base image-generation model |
| LoRA | ComfyUI/models/loras/ | Style, character, action, or concept tuning |
| VAE | ComfyUI/models/vae/ | Image decoding, color, and detail handling |
| embedding / textual inversion | ComfyUI/models/embeddings/ | Special trigger embeddings |
| upscale model | ComfyUI/models/upscale_models/ | Image upscaling |

For your first image, only handle the checkpoint. Put a usable base model in models/checkpoints/, start or refresh ComfyUI, then select it in the Load Checkpoint dropdown.
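The folder layout can be expressed as a small lookup, which also gives you a quick way to see what ComfyUI should find in the checkpoint folder. This is a sketch under assumptions: the install root `ComfyUI` is a placeholder, the folder names follow a standard install, and both helper functions are hypothetical.

```python
from pathlib import Path

# Folder names for a standard install; Desktop installs may use a different root.
MODEL_FOLDERS = {
    "checkpoint": "checkpoints",
    "lora": "loras",
    "vae": "vae",
    "embedding": "embeddings",
    "upscale": "upscale_models",
}


def target_folder(comfy_root, kind):
    """Return the folder where a model file of the given kind belongs."""
    return Path(comfy_root) / "models" / MODEL_FOLDERS[kind]


def list_checkpoints(comfy_root):
    """Return checkpoint file names in the checkpoint folder, or [] if it is missing."""
    folder = target_folder(comfy_root, "checkpoint")
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.iterdir()
                  if p.suffix.lower() in {".safetensors", ".ckpt"})


print(target_folder("ComfyUI", "checkpoint"))
print(list_checkpoints("ComfyUI"))
```

An empty list here usually means the same thing as an empty Load Checkpoint dropdown: the file is not where ComfyUI looks for it.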

Desktop Model Folders May Differ

Desktop users should not blindly follow portable-path tutorials. The official docs mention opening the models folder from the app menu, such as Help / Open folder / Open models folder. Use the folder opened by the app as the source of truth.

If you already have a model library in Automatic1111, Forge, or another tool, consider configuring extra_model_paths.yaml. That lets ComfyUI read external model folders without copying tens of gigabytes of files.

A practical rule:

  • One or two models: put them directly in the matching ComfyUI folder.
  • A large existing model library: map it with extra_model_paths.yaml.
  • You are still unsure whether you will use ComfyUI long term: keep the model setup simple.
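For the large-library case, the sketch below renders a minimal extra_model_paths.yaml for an Automatic1111-style folder layout. The keys and sub-paths follow the example file that ships with ComfyUI as far as I know, but treat them as assumptions and compare against the example file in your own install; the base path is a placeholder and `render_extra_model_paths` is a hypothetical helper.

```python
def render_extra_model_paths(a1111_root):
    """Return YAML text that points ComfyUI at an Automatic1111-style model library."""
    lines = [
        "a111:",
        f"    base_path: {a1111_root}",
        "    checkpoints: models/Stable-diffusion",
        "    vae: models/VAE",
        "    loras: models/Lora",
        "    embeddings: embeddings",
    ]
    return "\n".join(lines) + "\n"


# Placeholder path: replace with the real location of your existing library.
print(render_extra_model_paths("/path/to/stable-diffusion-webui"))
```

Starting with only the `checkpoints` line and adding the others once it works keeps the first test simple.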

Generate Your First Image

Use the default workflow for your first image. Do not start with a complex JSON workflow shared by someone else. The default workflow is valuable because it has fewer variables: if it works, your environment, model, and core nodes are connected.

Step-by-Step

  1. Start ComfyUI and open the web interface.
  2. Load the default Image Generation workflow.
  3. If the interface says a model is missing, install it through the download dialog that appears, or place a downloaded model in models/checkpoints/.
  4. Select the model in Load Checkpoint.
  5. Write the subject in the positive prompt, for example "a cozy desk setup, soft light, detailed illustration".
  6. Write what you want to avoid in the negative prompt, for example "blurry, low quality, distorted hands".
  7. Keep the default size, sampler, and steps at first. Do not change everything at once.
  8. Click Run, or press Ctrl + Enter (Cmd + Enter on macOS).
  9. Check the Save Image node, the interface output area, or the output/ folder.

It is fine if the first image is plain or even ugly. Its job is to prove the pipeline works. What you should record is the model you used, the prompt, whether any error appeared, and where the output was saved.
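One lightweight way to keep that record is a single JSON line per run. This is only a sketch: the field names, the `RunRecord` class, and the example values are all hypothetical, and a plain text note works just as well.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class RunRecord:
    model: str
    positive_prompt: str
    negative_prompt: str
    error: str        # empty string when the run finished cleanly
    output_path: str  # where the image ended up


def to_log_line(record):
    """Serialize one run as a single JSON line that can be appended to a notes file."""
    return json.dumps(asdict(record), ensure_ascii=False)


record = RunRecord(
    model="example_model.safetensors",  # placeholder file name
    positive_prompt="a cozy desk setup, soft light, detailed illustration",
    negative_prompt="blurry, low quality, distorted hands",
    error="",
    output_path="ComfyUI/output/",
)
print(to_log_line(record))
```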

A Simple First Prompt

Avoid overly abstract prompts at the beginning. Words like “beautiful girl” or “future city” can generate images, but they do not give you much feedback when the result is bad. A better first test prompt is:

a small wooden cabin beside a lake, morning fog, soft sunlight, detailed illustration, calm mood

A simple negative prompt can be:

blurry, low quality, distorted, extra fingers, bad anatomy

Do not stack a long list of style tags, camera terms, and artist names immediately. Make the model produce stable outputs first, then adjust prompt, size, steps, CFG, and seed one at a time.
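The one-change-at-a-time rule is easy to check mechanically. The sketch below compares the settings of two runs and reports which ones differ; the setting names and values are illustrative, and `changed_keys` is a hypothetical helper.

```python
def changed_keys(before, after):
    """Return the sorted names of settings whose values differ between two runs."""
    keys = set(before) | set(after)
    return sorted(k for k in keys if before.get(k) != after.get(k))


run_a = {"seed": 42, "steps": 20, "cfg": 7.0, "width": 512, "height": 512}
run_b = dict(run_a, steps=30)  # only the step count changes

diff = changed_keys(run_a, run_b)
print(diff)
if len(diff) > 1:
    print("More than one setting changed; the comparison will be hard to read.")
```

If the list contains more than one name, you cannot tell which change caused the difference in the image.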

How to Troubleshoot the First Round of Errors

Troubleshoot in order. Do not reinstall the environment, switch models, and rewrite the workflow at the same time. Change one variable at a time so you can see what actually fixed the issue.

Load Checkpoint Is Empty or Shows null

Check three things first:

  1. Is the model file a .safetensors or .ckpt file?
  2. Is it inside ComfyUI/models/checkpoints/, or inside the models folder opened by the Desktop app?
  3. Did you refresh or restart ComfyUI after moving the model?

If you use extra_model_paths.yaml, simplify it to one path first. Confirm that one path works, then add more. Paths with non-ASCII characters, spaces, or permission restrictions can create additional problems.
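The checks in this section that depend only on the file path can be written as a small function. This is a sketch, not a ComfyUI feature: `diagnose_checkpoint` is a hypothetical helper, it inspects the path text only (it does not touch the disk), and the example paths are placeholders.

```python
from pathlib import Path

CHECKPOINT_SUFFIXES = {".safetensors", ".ckpt"}


def diagnose_checkpoint(file_path, checkpoints_dir):
    """Return a list of readable problems; an empty list means the basics look fine."""
    path = Path(file_path)
    problems = []
    if path.suffix.lower() not in CHECKPOINT_SUFFIXES:
        problems.append(f"unexpected extension: {path.suffix or '(none)'}")
    if Path(checkpoints_dir) not in path.parents:
        problems.append(f"file is not inside {checkpoints_dir}")
    if not str(path).isascii() or " " in str(path):
        problems.append("path contains spaces or non-ASCII characters")
    return problems


print(diagnose_checkpoint("ComfyUI/models/checkpoints/example_model.safetensors",
                          "ComfyUI/models/checkpoints"))
print(diagnose_checkpoint("ComfyUI/models/loras/my model.zip",
                          "ComfyUI/models/checkpoints"))
```

A clean result does not prove the model will load, but it rules out the path-related causes before you move on to refreshing or restarting.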

A Workflow Opens With Red Nodes

Red nodes usually mean missing custom nodes, missing models, or a workflow that does not match your current environment. Do not debug a complex workflow first. Return to the default text-to-image workflow and prove that the basic path works.

Once the default workflow works, debug the shared workflow:

  • Read the red node names and identify the missing custom node.
  • Check model-loading nodes and confirm checkpoint, LoRA, and VAE files are visible.
  • Look at parameters last. Do not start by rewiring the graph randomly.

This is usually a second-day topic. Do not let custom nodes hijack the first day.
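When you do get to that second-day topic, listing the node types a shared workflow uses is a quick first pass. The sketch below assumes the saved workflow JSON has a top-level `nodes` list whose entries carry a `type` field; the set of known types is a short, deliberately incomplete sample, and `HypotheticalPoseNode` is an invented name standing in for a missing custom node.

```python
import json

# Assumption: a small sample of built-in node types, not the full list.
KNOWN_TYPES = {
    "CheckpointLoaderSimple", "CLIPTextEncode", "EmptyLatentImage",
    "KSampler", "VAEDecode", "SaveImage",
}


def unknown_node_types(workflow, known_types=KNOWN_TYPES):
    """List node types in a saved workflow that are not in the known set."""
    types = {node.get("type") for node in workflow.get("nodes", [])}
    return sorted(t for t in types if t and t not in known_types)


sample = json.loads(
    '{"nodes": [{"id": 1, "type": "KSampler"}, {"id": 2, "type": "HypotheticalPoseNode"}]}'
)
print(unknown_node_types(sample))
```

The names it prints are the ones to search for when deciding which custom node package a workflow needs.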

CUDA, Torch, or Backend Errors

These errors are usually not caused by a bad prompt. They often come from a runtime mismatch between the GPU driver, the installed PyTorch build, and the backend that build targets. Windows users should check the GPU driver and the chosen install package. Linux users should compare Python, PyTorch, and backend details against the manual installation docs. macOS users should not follow CUDA instructions meant for Windows or Linux NVIDIA setups.

If you do not want to spend time on environment debugging yet, use Desktop or a cloud route to learn the concept first. Once you know you will use ComfyUI regularly, come back to GPU and backend details.
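If you do decide to look at the environment, a useful first question is which backend PyTorch can actually see. The sketch below uses standard PyTorch calls and falls back gracefully when PyTorch is not installed; `detect_backend` is a hypothetical helper, and it should be run with the same Python interpreter that starts ComfyUI.

```python
def detect_backend():
    """Report which PyTorch compute backend looks usable in this environment."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if torch.cuda.is_available():
        # ROCm builds also answer through the CUDA API; torch.version.hip is set on those builds.
        return "rocm" if getattr(torch.version, "hip", None) else "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(detect_backend())
```

A result of `cpu` on a machine with a supported GPU is a strong hint that the installed PyTorch build does not match the hardware or driver.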

The Image Is Blurry or Ignores the Prompt

Do not assume ComfyUI is broken. Common causes include:

  • The selected model is not suitable for the image type.
  • The prompt is too abstract and lacks subject, scene, lighting, or style.
  • Size or sampler settings were changed too aggressively.
  • The negative prompt is over-constraining the model.

Keep the model and parameters fixed, then run three prompt variations. First write the subject, then add the scene, then add lighting and style. This makes it much easier to see how the prompt changes the result.
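The three variations can be built up mechanically so that each run adds exactly one layer. This is a small sketch with example text; `prompt_variations` is a hypothetical helper.

```python
def prompt_variations(subject, scene, lighting_and_style):
    """Return three prompts: subject only, subject + scene, then everything."""
    parts = [subject, scene, lighting_and_style]
    return [", ".join(parts[:n]) for n in range(1, len(parts) + 1)]


variations = prompt_variations(
    "a small wooden cabin",
    "beside a lake, morning fog",
    "soft sunlight, detailed illustration",
)
for prompt in variations:
    print(prompt)
```

Run each prompt with the same seed and settings so the only difference between the images is the added text.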

What Beginners Should Avoid at First

ComfyUI is powerful, but that power can slow beginners down.

First, do not install dozens of custom nodes immediately. More nodes mean more dependency and compatibility issues. Wait until the default workflow runs reliably, then install nodes for one concrete need.

Second, do not download ten checkpoints at once. Start with one base model, record what it is good at, then add more gradually. Too many models make it hard to know whether the prompt or the model caused a bad result.

Third, do not jump into API automation too early. The API is useful, but if you do not understand the workflow yet, automation only multiplies mistakes.

Fourth, do not treat someone else’s workflow as a universal answer. Shared workflows often depend on specific models, node versions, and file paths. Learn their structure, but do not expect every copied workflow to run immediately.

A steadier path looks like this:

  1. Run the default text-to-image workflow.
  2. Understand checkpoints, LoRA, and VAE.
  3. Learn how to read workflow JSON files and what red nodes mean.
  4. Pick one enhancement topic, such as ControlNet or IP-Adapter.
  5. Then move to batch generation, API usage, automation, and workflow reuse.

If you already understand local LLMs, think of ComfyUI as a local inference workbench for image generation. You can read Ollama Introduction: Your First Step to Running Large Language Models Locally to connect model files, runtime environments, and inference parameters. For prompt writing, continue with Prompt Engineering for Business. For GPU and runtime issues, see Ollama GPU Acceleration Setup.

Summary

ComfyUI beginners do not need to start with complex workflows. Keep the first goal small: choose a suitable installation route, place a base model in the correct folder, and run the default text-to-image workflow.

Once that path works, learn LoRA, ControlNet, custom nodes, and workflow reuse. Each step then has a clear diagnostic question: can the model be detected, are nodes missing, is the prompt specific enough, and was the output saved? ComfyUI has a real learning curve, but if you avoid mixing every topic on day one, the intimidating node graph becomes a workflow you can debug and reuse.

FAQ

Should a ComfyUI beginner choose Desktop, portable, or manual installation?
If you only want to generate your first image quickly, start with Desktop or Windows portable. Choose manual installation when you use Linux or need control over Python, PyTorch, CUDA, ROCm, or MPS.

Where should ComfyUI models be placed?
Base checkpoints usually go in ComfyUI/models/checkpoints/. LoRA files go in models/loras/, VAE files go in models/vae/, and embeddings go in models/embeddings/. Desktop users should use the models folder opened by the app.

Why is Load Checkpoint empty or null in ComfyUI?
Check the model file format, folder location, and whether ComfyUI has been refreshed or restarted. Most installations do not include base models by default, so ComfyUI cannot detect a model until one is placed in the checkpoint folder.

What should I do when a shared workflow shows red nodes?
Red nodes usually mean missing custom nodes, missing models, or a workflow that does not match your current environment. Beginners should first return to the default text-to-image workflow and confirm the basic path works.

Does a bad first image mean ComfyUI was installed incorrectly?
Not necessarily. The first image is only meant to prove that the environment, model, and workflow are connected. Blurry or weak images are usually a model, prompt, size, sampler, or negative prompt issue.

11 min read · Published on: Jun 1, 2026 · Modified on: Jun 2, 2026