VelvetToroyashi/WahtasticMerge


base_model:
  - Laxhar/noobai-XL-Vpred-1.0
pipeline_tag: text-to-image
tags:
  - not-for-all-audiences

Wahtastic Merge is a high-quality Stable Diffusion XL (SDXL) model designed to generate stunning images with improved aesthetics and excellent prompt adherence. This model is built upon the robust noobai-XL-Vpred-1.0 base and has been further refined through the strategic merging of various other models and extensive additional training.

The ultimate goal of this model is to provide an experience very similar to the already fairly competent NoobAI v-pred base, while smoothing out its rough edges. Many other merges are bimodal: they have either good prompt adherence (closer to base NoobAI) or good default aesthetics (closer to Illustrious).

Ideally, both can be encapsulated in a single model without sacrificing too much model knowledge to achieve this.

Up to V7, the model was entirely merged. V8 and above have additional fine-tuning applied on top of the merge for various fixes.

Wahtastic Roadmap

1536x Super-resolution support
- Allow for 1536x native generation (and slightly above), akin to Illustrious 2+
Fix e6 size tag implications (hyper ≠ huge ≠ big)
- In short, e6 tags have implications; hyper_* implies huge_*, and huge_* implies big_*
- Because of this, the model tends to associate big with huge and huge with hyper, so big_* tags sometimes produce disproportionately large body parts.
Natural language captioning
- Yes, CLIP sucks.
- Using lodestone-rock's natural-language captions, ideally some amount of natural-language understanding can be brought back.
- This is inspired by EasyFluff /XL
Superior style knowledge
- ~20k e6 artists with > 500 < 20 posts
- Potentially danbooru artists too
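
The size-tag item in the roadmap can be illustrated with a small caption-preprocessing sketch. This is not the method used for this model, only one plausible way to approach the problem: if a caption carries several tiers of the same size tag because of implications, keep only the strongest tier so that each tier is learned separately. The names `SIZE_TIERS` and `strip_implied_size_tags` are hypothetical, and the example tags are placeholders.

```python
# Size tiers ordered from weakest to strongest, following the implication chain
# hyper_* -> huge_* -> big_* described above.
SIZE_TIERS = ("big", "huge", "hyper")


def strip_implied_size_tags(tags):
    """Return tags with weaker size tiers removed when a stronger tier
    for the same suffix is present. Order of the remaining tags is kept."""
    strongest = {}
    for tag in tags:
        prefix, _, suffix = tag.partition("_")
        if prefix in SIZE_TIERS and suffix:
            rank = SIZE_TIERS.index(prefix)
            strongest[suffix] = max(strongest.get(suffix, -1), rank)

    kept = []
    for tag in tags:
        prefix, _, suffix = tag.partition("_")
        if prefix in SIZE_TIERS and suffix:
            # Drop this tag if a stronger tier exists for the same suffix.
            if SIZE_TIERS.index(prefix) < strongest[suffix]:
                continue
        kept.append(tag)
    return kept
```

For example, `strip_implied_size_tags(["hyper_tail", "huge_tail", "big_tail", "solo"])` keeps only `hyper_tail` and `solo`, while a caption that contains just `big_tail` is left unchanged.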

(Previously known as Pando Merge)

Compute is expensive, and while plenty has been granted to me by kind acquaintances, a fair bit of money has been poured into the training process. If you like the model, or would like to help me offset the sunk cost of this, please consider donating:

ETH Wallet Address for Donations: 0x645BebF82373865eC520d8AC2527524BfB174FF8

If you prefer PayPal/Stripe, please contact me on Discord @ velvet.toroyashi

How to Use

This model can be used with any standard SDXL-compatible interface or library (e.g., Diffusers, Automatic1111, ComfyUI).

Recommended Settings

For optimal results, we recommend the following inference parameters:

Sampler: Euler or Euler A
Scheduler: Normal or Beta
Steps: 16-24
CFG Scale: 3-6
Resolution:
For general use: 832x1200 (or similar aspect ratios with a total area around 1024x1024)
For V9.1 (if applicable): Can natively handle 1536x resolutions.
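
To pick a resolution for other aspect ratios while keeping the total area near the recommended budget, a small helper like the following may be useful. It is only a sketch: the function name `resolution_for_aspect` is hypothetical, and snapping to multiples of 64 is an assumption (the 832x1200 example above is a multiple of 8, not 64), so adjust `multiple` to taste.

```python
import math


def resolution_for_aspect(aspect_ratio, target_area=1024 * 1024, multiple=64):
    """Return (width, height) with width / height close to aspect_ratio and
    width * height close to target_area, both snapped to `multiple`."""
    height = math.sqrt(target_area / aspect_ratio)
    width = height * aspect_ratio

    def snap(value):
        # Round to the nearest multiple, never going below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


# Square image at the default area budget.
print(resolution_for_aspect(1.0))
# Portrait ratio matching the recommended 832x1200.
print(resolution_for_aspect(832 / 1200))
# Larger area budget, for versions that handle 1536x natively.
print(resolution_for_aspect(1.0, target_area=1536 * 1536))
```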

Example Usage (Python with Diffusers)

from diffusers import AutoPipelineForText2Image
import torch

# Load the pipeline in half precision. This assumes the repository publishes
# Diffusers-format weights with an fp16 variant.
pipeline = AutoPipelineForText2Image.from_pretrained(
    "VelvetToroyashi/WahtasticMerge",  # repo ID as shown on this model page
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True,
).to("cuda")

prompt = "a majestic fantasy landscape, vibrant colors, epic, detailed, masterpiece"
negative_prompt = "low quality, bad anatomy, deformed, ugly, distorted"

# Steps, CFG scale, and resolution follow the recommended settings above.
image = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=20,
    guidance_scale=5,
    height=1200,  # Example resolution
    width=832,
).images[0]

image.save("wahtastic_image.png")
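
Because the base model is a v-prediction checkpoint, a Diffusers scheduler generally has to be configured for v-prediction, or outputs can come out washed out or noisy. Whether this repository's bundled scheduler config already does so is not stated here, so the following is a hedged sketch: `use_vpred_euler` and `VPRED_SCHEDULER_ARGS` are hypothetical names, and the two settings are the ones commonly used for v-prediction SDXL checkpoints, assumed rather than confirmed for this model.

```python
# Assumed settings for v-prediction SDXL checkpoints; not confirmed for this repository.
VPRED_SCHEDULER_ARGS = {
    "prediction_type": "v_prediction",
    "rescale_betas_zero_snr": True,
}


def use_vpred_euler(pipeline):
    """Swap the pipeline's scheduler for Euler configured for v-prediction."""
    # Imported here so the helper can be defined without Diffusers installed.
    from diffusers import EulerDiscreteScheduler

    pipeline.scheduler = EulerDiscreteScheduler.from_config(
        pipeline.scheduler.config, **VPRED_SCHEDULER_ARGS
    )
    return pipeline
```

Calling `pipeline = use_vpred_euler(pipeline)` once after loading, and before generating, matches the Euler sampler recommended above.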

Model Details

Base Model: noobai-XL-Vpred-1.0
Merge Strategy: Various models were merged to combine their strengths, followed by extensive additional training.
Training Goal: Improve aesthetic quality, prompt adherence, and general versatility for SDXL generations.
Model Type: Diffusion-based text-to-image generative model.

License

This model is subject to the license of its base model, noobai-XL-Vpred-1.0, which is released under the Fair AI Public License 1.0-SD. Please review the original license for full terms and conditions regarding usage, including commercial use and derivative works.

Contributions and Support

If you find Wahtastic Merge useful and would like to support its continued development and future updates, donations are greatly appreciated!

Feedback and Issues

We welcome your feedback! If you encounter any issues or have suggestions for improvement, please open an issue on the Hugging Face repository.