Previous /sdg/ thread : >>100110132

>Beginner UI local install
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io

>Local install
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI (Node-based): https://rentry.org/comfyui
AMD GPU: https://rentry.org/sdg-link#amd-gpu
Intel GPU: https://rentry.org/sdg-link#intel-gpu

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Auto1111 forks
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
Anapnoe UX: https://github.com/anapnoe/stable-diffusion-webui-ux
Vladmandic: https://github.com/vladmandic/automatic

>Run cloud hosted instance
https://rentry.org/sdg-link#run-cloud-hosted-instance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
Inpainting: https://huggingface.co/spaces/fffiloni/stable-diffusion-inpainting
pixart: https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma

>Models, LoRAs & embeddings
https://civitai.com
https://huggingface.co
https://rentry.org/embeddings

>Animation
https://rentry.org/AnimAnon
https://rentry.org/AnimAnon-AnimDiff
https://rentry.org/AnimAnon-Deforum

>SDXL info & download
https://rentry.org/sdg-link#sdxl

>Index of guides and other tools
https://codeberg.org/tekakutli/neuralnomicon
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>View and submit GPU performance data
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html

>Share image prompt info
4chan removes prompt info from images, share them with the following guide/site...
https://rentry.org/hdgcb
https://catbox.moe

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg

Official: discord.gg/stablediffusion
>mfw Resource news

04/21/2024
>FlashFace Inference Code Released
https://github.com/ali-vilab/FlashFace
>ComfyUI MagickWand: Proper implementation of ImageMagick
https://github.com/Fannovel16/ComfyUI-MagickWand
>Moving Object Segmentation: All You Need Is SAM (and Flow)
https://www.robots.ox.ac.uk/~vgg/research/flowsam/
>Image Effect Scheduler Node Set for ComfyUI
https://github.com/hannahunter88/anodes/
>ComfyUI-Tripo: Generate 3D models using the Tripo API
https://github.com/VAST-AI-Research/ComfyUI-Tripo

04/20/2024
>Basic Stable Diffusion API GUI
https://github.com/ThioJoe/BasicStabilityAPI-GUI/
>IPAdapter Advanced Weighting support added to sd-webui-controlnet
https://github.com/Mikubill/sd-webui-controlnet/discussions/2770

04/19/2024
>Customizing Text-to-Image Diffusion with Camera Viewpoint Control
https://customdiffusion360.github.io/
>StyleBooth: Image Style Editing with Multimodal Instruction
https://ali-vilab.github.io/stylebooth-page/
>Sketch-guided Image Inpainting with Partial Discrete Diffusion Process
https://github.com/vl2g/Sketch-Inpainting
>ComfyUI ImageMagick: Image processing powered by ImageMagick
https://github.com/jtydhr88/ComfyUI-ImageMagick

04/18/2024
>Meta has released meta.ai, multimodal AI including image generation
https://www.meta.ai/
>Stability AI lays off roughly 10 percent of its workforce
https://www.theverge.com/2024/4/18/24133996/stability-ai-lay-off-emad-mostaque
>Stability API nodes for ComfyUI
https://github.com/Stability-AI/ComfyUI-SAI_API
>Dynamic Typography: Bringing Text to Life via Video Diffusion Prior
https://animate-your-word.github.io/demo/
>InFusion: Inpainting 3D Gaussians via Learning Depth Completion from Diffusion Prior
https://johanan528.github.io/Infusion/
>Factorized Diffusion: Perceptual Illusions by Noise Decomposition
https://dangeng.github.io/factorized_diffusion/
>KGen - A System for Prompt Generation to Improve Text-to-Image Performance
https://github.com/KohakuBlueleaf/KGen
>mfw Research news

04/21/2024
>Prompt-Driven Feature Diffusion for Open-World Semi-Supervised Learning
https://arxiv.org/abs/2404.11795
>MultiPhys: Multi-Person Physics-aware 3D Motion Estimation
https://www.iri.upc.edu/people/nugrinovic/multiphys/
>ProTA: Probabilistic Token Aggregation for Text-Video Retrieval
https://arxiv.org/abs/2404.12216
>BLINK: Multimodal Large Language Models Can See but Not Perceive
https://arxiv.org/abs/2404.12390
>Generating Human Interaction Motions in Scenes with Text Control
https://arxiv.org/abs/2404.10685
>Dual Modalities of Text: Visual and Textual Generative Pre-training
https://arxiv.org/abs/2404.10710
>DreamScape: 3D Scene Creation via Gaussian Splatting joint Correlation Modeling
https://arxiv.org/abs/2404.09227
>Conditional Prototype Rectification Prompt Learning
https://arxiv.org/abs/2404.09872

04/20/2024
>Align Your Steps: Optimizing Sampling Schedules in Diffusion Models
https://research.nvidia.com/labs/toronto-ai/AlignYourSteps/
>Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach
https://arxiv.org/abs/2404.11732
>Partial Large Kernel CNNs for Efficient Super-Resolution
https://arxiv.org/abs/2404.11848
>From Image to Video, what do we need in multimodal LLMs?
https://arxiv.org/abs/2404.11865
>GhostNetV3: Exploring the Training Strategies for Compact Models
https://arxiv.org/abs/2404.11202
>ANCHOR: LLM-driven News Subject Conditioning for Text-to-Image Synthesis
https://arxiv.org/abs/2404.10141
>StyleCity: Large-Scale 3D Urban Scenes Stylization with Vision-and-Text Reference via Progressive Optimization
https://arxiv.org/abs/2404.10681
>Weight Copy and Low-Rank Adaptation for Few-Shot Distillation of Vision Transformers
https://arxiv.org/abs/2404.09326
>Exploring Text-to-Motion Generation with Human Preference
https://arxiv.org/abs/2404.09445
>Reactive Model Correction: Mitigating Harm to Task-Relevant Features via Conditional Bias Suppression
https://arxiv.org/abs/2404.09601
Anyone tried fp8 training using Transformer Engine? Anyway gonna hope I can make this Docker container work and see what comes out.
can anon post a gen using PAG that doesn't look fried?
I am retarded, how do I actually run ComfyUI on Ubuntu?
I git cloned the repo, but I don't see a start/run/webui.sh. And I checked the readme before asking, I swear.
>>100114450
>>100115948
I'm not taking the bait but here's your (You)
>>100115982
I'm not baiting
There's no instructions on initiating the software. server.py or execution.py just return an error and don't start.
>>100116044
lol (You)
Planning on training an SD3 model, what would you want to see most in a new model?
>>100115877
good day
>>100116082
There's a severe lack of general purpose/versatile models.
>>100116082
ANIME
That's all we care about.
Of course, please train the model on furniture, different kinds of clothes, facial expressions, etc.
It takes a lot to make a model usable and not just some one-trick pony. Best of luck, friend.
>>100116082
seems like you should just go straight for an nsfw model since the most common complaint is gonna be "it can't do nsfw"
>>100116082
ideally it'd understand how to do anything going on in manga/hentai when instructed by a reasonably powerful LLM or (you), including the difficult ones like ha ku ronofu jin nsfw where things cause things.
>>100116044
main.py retard
>>100116082
a token limit greater than 75
>>100116082
the obvious answer is just anime porn. i wouldn't really know until i actually get to try it with a proper workflow (copium) and see what it's bad at. every finetune out there just completely butchers it into either a sameface anime porn generator or a sameface 1girl portrait generator
have they even released a set of captions or their captioner prompt? wasn't this supposed to be trained with natural language captions? it would be nice to know the length/terminology used so that finetune captions don't conflict
We haven't fully explored XL yet, why are we thinking about SD3?
>>100116263
Go slow, get lapped.
>>100116207
>>100116130
Honestly, a *booru like danbooru 202x is probably going to waste the least of your time tagging data anyhow.
>>100116281
I'm never getting an exquisite details tier XL model, am I
>>100116263
I have not fully explored 1.5 yet.
>>100116300
>implying booru tagging is good
>implying booru tagging is competent
>implying booru tagging is consistent enough to make a good dataset
Shortcuts just cause more problems for us all.
fuck off
Preparing a dataset needs to be a team effort.
>>100116444
Yes, all of this is just fine overall.
>>100116343
Probably not, if SD3 can do more exquisite and more XL.
Of course, that will need to happen fast or it'll get bumped to SD4.
>>100116493
>or it'll get bumped to SD4
If Stability AI lives to make that at all. Aren't they deep in the red? They are probably screwed, but only time will tell.
julien is shit
>>100116082
good dataset, train on copyrighted artists and characters, and keep tagging similar to the base model. if you do have to dig through booru tags, be aware that there will be a conflict between the natural language of the model and the tags you might end up using. booru style tags were a good solution to a dumb model problem. as the model gets smarter, tags like that are going to cause more harm than good.
caring about regular posters drama is very very low iq
>>100116532
>as the model gets smarter, tags like that are going to cause more harm than good
I don't think that's a fact that has been objectively demonstrated anywhere.
You can create a system where this is the case, but if it's a competent system, why wouldn't it be able to find via tags as well as natural language? If anything, the natural language people use is less exact.
>>100116516
Buy them for a dollar when they collapse, release the assets, the Internet finishes the job.
>>100116532
Natural language for the win.
Are we not able to get SD to understand that
>these, are, tags, just, put, them, somewhere
and
>when the line of text passes a basic grammar check it's natural language time
...?
Hell, give us two sets of prompts. Natural prompt, tags prompt, and negatives for both. Boomers and boorus will be happy, and AI kings will master working both together.
>>100116614
>the Internet finishes the job.
With what money, programmers, and hardware?
>>100116587
Even given how English works and how English speakers use it, you need to be able to point many alternative tokens at the same concept where it overlaps.
There should be no issue whatsoever if a tag is used too; if anything, the tag should often carry the most precise idea of a concept.
>>100116614
Obviously both, but let's also note that among search systems, tags have been far more successful and useful so far than natural language boomer descriptions. The issue might as well be on the human side, with people generally agreeing more on what tags mean than on what every word in English actually means visually or structurally.
>>100116587
>If anything natural language people use is less exact.
You mean the "natural language" of undereducated promptlets?
Idiot-proofing products is the stupidest, most pointless thing to try to do; countless companies can tell you that.
>hyperfine intricate details
So what should happen if a prompt has contradictory tokens? Like, say, black and blue hair, or holding a rifle and also crossing arms?
>>100116680
>with people generally having more clear agreement on what tags mean than what every word in English actually visually or structurally or otherwise means exactly.
People who want words to have specific meanings use the tag prompt; people wanting to control composition and style would go into natural language. It's really hard for a tag-based prompt to respect a described composition. You'll get the things you asked for, but positional relationships are a lost cause. But if the AI could be trained in a context where "X on top of Y" could be learned, so that "book on top of table" and "flowers on top of grave" and "top hat on top of dancing frog" all mean that what precedes "on top of" is higher on the canvas than what follows, we ought then to be able to use tag prompting to specify exact content and natural language to put those things into the drawn space, rather than rolling seeds until one accidentally gets the arrangement right.
I like both. When I am making a posed girl, booru tags are great. With pony it can also do simple multicharacter stuff when it hews closely to the kind of images boorus feature. When I am working on a more complicated image with multiple subjects and a lot of fine background details, it starts to get harder and harder to represent this with just tags. Ideally you'd use both: natural language for the base gen to set up the composition, then a fine-tuned model for people using tags to inpaint their poses and personal details precisely.
>>100116845
dual-colored hair, unless there are two subjects. for the rifle, you can hold a rifle with crossed arms at rest if it's slung correctly
Now that I have released my jizz to fat Frieren pussy caught in a mimic trap all is good in the world again.
>>100116754
>Idiot-proofing products is the stupidest, most pointless thing to try and do, countless companies can tell you that.
Only fools use fool-proof products.
it is with great pleasure that i announce cute horse girls with cute horse tails are welcomed in this thread
>>100117057
So how about 2girls and solo?
>>100117148
youtube or follow the github instructions in the OP links. get things running first and then go peruse some models on civit
>>100117150
that's a good one. dunno
>>100116279
FOUR TWENTY BLAZE IT.
>>100117140
fuck off to your furry boards gooner
>>100117185
edibles were too lit yesterday, so today is the stoner posting
>>100117311
West coast gooners have just woken up from the drug/goon overload last night. Everything goes to shit at 11am west coast time.
>absolutely outstanding image
>>100116200
Sigma 300 cap
>>100117312
cool helmet
world is gay, can't wait for the sweet release of death
>>100115936
? Are you having problems with it? When using PAG it's recommended to lower the CFG a bit
>>100117422
>claims to hate living
>doesn't even kill himself
poser
>>100116857
I'd personally prefer to use positional/relational/logical information with the tags even then, rather than natural language, but it doesn't actually work.
>UnboundLocalError: cannot access local variable 'h' where it is not associated with a valueWhat is this message telling me? It happens in img2img when I try to upscale beyond this resolution. Seems like if I want a higher resolution, I need to go with another program, then bring that image back in and run it through img2img again to get some detail.
>>100117484
>but it doesn't actually work
Which is kind of a problem.
We need some ChatGPT action that can let us do something like run a gen, then repeat it after adding fixes to the boomer prompt (like "four visible fingers and one thumb on the right hand of the leftmost woman") and have it actually get that, for the hand it drew over there, six fingers and two thumbs was a bit too ambitious.
>>100117758
nicely surreal and creepy
>>100117758
>>100117872
Another progressive rock album cover for songs we'll never get to listen to.
this is really just retreading old ground applying new gen settings
Anyone feel like proompting my schizo (legit) vision I had:
a dragon, of the DB type, but so fucking massive in the sky that I perceived it as a god
>>100117758
p cool
>>100117920
my Trypophobia
What the fuck causes regional prompter to gen slightly off pictures sometimes?
>>100118145
What do you mean by slightly off?
>>100118178
Like, occasionally it ignores some of the prompts and creates a "generic" looking picture.
Can I not use multiple style loras with regional prompter on forge without it looking completely fucked?
>>100118216
Might be a seed that just doesn't look like what you want it to find. Is it deterministic?
>>100118309
It's not a seed, it seems to be some random words in some particular order that seem to fuck with it.
>>100117573
I wonder if something like that will show up eventually. Would be nice.
the gaunaburger, only at toha heavy agriculture
is there any model that actually creates proper pixel art without errors?
>>100117573
instruction-based image editing is a thing but I dunno what resources are actually good
https://github.com/ali-vilab/Ranni
https://github.com/modelscope/scepter
>>100118299
is there no way to concat the loras?
>>100118644
slick
>>100118689
what even is going on here?
>>100118807
depends on what you consider an error. if it's a perfect pixel grid you want, that's a matter for postprocessing
>>100117573
i think that'll either be when parameters are high enough to sufficiently capture the nuance and specificity of english, and/or when/if image editors build up enough controlnets and quick/dirty interactive edits to corral the models
>>100118934
https://github.com/hako-mikan/sd-webui-regional-prompter#latent
>Slower, but allows separating LoRAs to some extent.
But also not completely. So perhaps you can't really, with this.
>>100118914
why would you do this
>>100118988
there was an attention couple one that came out not too long ago. maybe try that instead?
https://github.com/Haoming02/sd-forge-couple
>>100119082
jesus christ what is that thing
>>100119098
>jesus christ what is that thing
picrel is a hint
>>100119018
I find his gens rather fascinating
>>100119116
Actually yes
What's for dinner lads
>>100118939
I just want to mix them but I see a sharp degradation when using forge. One lora is fine but two style loras don't play well
>>100119163
Lotsa Spaghetti!
>>100119276
What will you put on it?
>>100119366
>>100119403
kek
>>100106072
Thank you for the helpful guidance! I'll try to apply what you've elaborated upon.
>>100107578
>>100107667
>>100107688
>>100107726
>>100107802
Ngl, this is my favorite thus far in these threads. Mysterious anon with godlike gens, please show yourself (and give me your Patreon kek)
>>100119474
brilliant glitch dada collage
>>100119565
he's on a 3 day vacation
>>100119690
momentous work, jules
>>100119699
my humble abode
>>100119699
>9x8ft one room house
>1.4mil
but it's in a nice neighborhood!
>momentous work, jules
>>100119699
indistinguishable from reality, gj.
>>100119324
sliced bananas and pineapple
has anyone tried out kohaku epsilon? I've been finding it rather hard to work with and dunno if it's just me or if it's a weak model
>>100115837
>>StyleBooth: Image Style Editing with Multimodal Instruction
>https://ali-vilab.github.io/stylebooth-page/
i am once again asking if anyone here has successfully tried this out, and if so, can you catbox a working python script or google colab notebook
>>100120089
It just seems strictly worse than animagineXL3.1 in every way. What a shame.
>>100120296
be the change you want to see
take a dive and let us know how it goes
>>100120296
we have enough followers. we need a leader
>>100120366
that's exactly how I feel about it :(
>>100119565
>Shitter with shit opinions
Fuck off
>>100120371
>>100120394
i'm trying but i'm retarded with python and dependencies n sheeit and I can't figure out why, even when i manually pip install torch==2.2.1, I get an error saying
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchaudio 2.2.1+cu121 requires torch==2.2.1, but you have torch 2.0.1 which is incompatible.
torchtext 0.17.1 requires torch==2.2.1, but you have torch 2.0.1 which is incompatible.
>>100115740
Am I the only one getting way better outputs with 1024x1024 on Pony compared to e.g. 896x1152?
>>100120454
I find pony works best at 768x1280, however 1024x1024 isn't bad
>>100120437
this hiking trail is only a few miles from where I live. there's benches down near the cliff where you can watch the waves crash
>>100120437
are you just feeding in a real picture at low denoising to get the filename, or is there a "weekend away snapshit" lora out there
>>100120544
check out the boring reality lora. it advises to use it with the base sdxl model but I've been using it with sds_film
>>100120571
NTA but cool, I might take this for a spin. I was eyeing the VHS ones for a bit to see if I could make some creepy gens, but this also tickles my fancy
>>100119659
thanks, here's a portrait just for you
>>100120394
It's fine for 1girl stuff I guess, but I feel the results are always worse than what Animagine would have produced.
It fucks up hands a lot more and losing good gens to that always sucks.
I'll keep trying it, but I don't have too high hopes desu...
>>100119660
Very specific answer desu
>>100120401
I am a shitter. We all gotta start somewhere.
>>100120653
Unfortunately for you there is no hope based on your taste.
>>100120647
the only thing I've found I really like about it is that it does really interesting manga layouts. but then it blunders all the details so it's worthless. maybe I should do a dual-model workflow with KE for the first pass and animagine for the hires
>>100120697
I want to know what it says
>backgrounds in pony
it's slightly better after I experimented with some merges but still horrible compared to 1.5
>>100120755
that looks fine and fitting for the character's illustrated style? it looks better than most 1.5 garbage. the background shouldn't be equally or more detailed than the character, that's one of the telltale signs of 1.5 ai slop: overly detailed nonsensical backgrounds
>>100120731
sadly we'll never know what the AI was thinking
>>100120793
hm, well, compared to this 1.5 picture, I think the background looks better
>>100120755
>horrible compared to 1.5
I fucking HATE when my backgrounds are consistent. I won't even give you any help cause you're trollin
>>100120806
that is rather subjective, and you are comparing two completely different types of shots
So is stable diffusion 3 released or not?
What does API release mean?
>>100120697
What I've found so far is that it's a lot better at generating zouri and tabi (the kinds of sandals and socks miko wear).
Other models always render generic socks and individual toes and that always bothered me.
>>100120859
no, and it means ignore it until they actually release
>>100120859
>What does API release mean?
money, and the usual free pass of coping that it isn't the final version, just like XL on clipdrop. also don't bother with SD3, the license is a mess.
>>100120855
the picture he posted that "looks better" makes 0 sense. why is there a 40 foot sand dune behind that rock formation? why does that rock formation have a perfectly straight pillar? why does that rock formation have a gun trigger? etc etc etc
>>100120859
SD3 doesn't matter until 6mo+ after people can start training models for it
>>100120904
>sand dunes are huge
>wtf are ruins
Here's your (You) since you're starving
>>100120890
>the license is a mess.
What changed between the SDXL license and the SD3 one?
If they release the model people are gonna fine-tune it anyway, license or not
>>100120422
you need matching versions of all the torch packages. per your error output, that's torchaudio==2.2.1 and torchtext==0.17.1 alongside torch==2.2.1; put them all in the same install command so pip resolves them together.
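what pip is complaining about there is an exact-pin conflict. a minimal sketch of that check, with the installed-versions dict as a stand-in for a real environment (a real resolver reads package metadata instead):

```python
import re

def parse_pin(requirement):
    """Extract (package, version) from an exact pin like 'torch==2.2.1'."""
    m = re.fullmatch(r"\s*([A-Za-z0-9_.-]+)\s*==\s*([0-9][0-9A-Za-z.+]*)\s*", requirement)
    return (m.group(1), m.group(2)) if m else None

def conflicts(installed, pins):
    """Return (name, have, want) for pins that disagree with installed versions.

    A '+cu121' local build tag on the installed side is ignored, so
    '2.2.1+cu121' satisfies a 'torch==2.2.1' pin.
    """
    bad = []
    for pin in pins:
        parsed = parse_pin(pin)
        if not parsed:
            continue
        name, want = parsed
        have = installed.get(name)
        if have is not None and have.split("+")[0] != want:
            bad.append((name, have, want))
    return bad
```

so with torch 2.0.1 installed, the pin `torch==2.2.1` declared by torchaudio/torchtext conflicts, which is exactly the error that anon pasted.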
...It *does* generate some pretty cute images, can't deny that...
>>100120931
>non-Euclidean backgrounds are LE GOOD
>>100120984
I think its sole purpose is "portrait of character" and it completely falls apart if you try to do anything else
I wonder if SD3 has the classic stable diffusion tendency that when you put 'elf' in the prompt it wants to give you a cross between green christmas elves and keebler elves.
>>100120957
60 year old saggers on a 20 year old
>>100121003
What? Image generations aren't perfect? oh my godddddddddddd
>>100121051
lol
lmao
>>100121061
you're clutching your black and white tv and screaming that it's better when anyone with eyes can see that you're wrong.
>>100121108
What are you even going on about? More (You)'s for the starving third-worlder
>>100121096
Nice
I don't understand why the number of pictures you have changes the number of steps necessary to train a lora.
>>100121280
inverse fletchet cosine
>>100121280
kek..
>>100121280
the images themselves add steps dingus
(Training Images * repeats) / batch size * epochs
(Training Images * repeats)/batch size * epochs
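that formula as a sketch (assuming a partial final batch still counts as a step; whether a given trainer rounds up or drops the remainder varies):

```python
import math

def total_steps(num_images, repeats, batch_size, epochs):
    """One epoch sees every image `repeats` times, grouped into batches."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs

# e.g. 40 images, 10 repeats, batch size 4, 10 epochs -> 1000 steps
```

so more images directly means more steps per epoch, which is why the step count moves with dataset size.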
So I've been training a few LoRAs on the same dataset recently and I can't help but notice how larger network ranks are directly tied to the quality of the output.
I feel like people recommending anything less than the largest rank your GPU can handle is just vramlet cope.
>>100121444
post them then faggot
>you wont
It's all larp
>>100121469
Thank you for saying so, it's great to hear this kind of information. Do you have any comparison between network ranks? It'd be great to see evidence of the kind of difference it makes.
>>100121487
Gimme a moment, I'm training some right now so I can't gen any comparisons. But I stand by what I said.
>>100121362
trudeau blackface lora when?
>>100121444
"optimal" is def higher than people recommend but it isn't as simple as higher = better.
On pony I have found 128 best, I can train at 256 but it starts to look fucked.
People saying train it on 8 are retards though
>>100121505
alright cool, I'm curious about the science you're doing/have done
>>100121509
True, 256 basically starts reprinting the training data very fast, but in fucked up ways.
>>100120935
you need to pay for commercial use
>>100121505
When I grab a 3090.
>>100121556
How would they ever police that?
>>100121509
>>100121524
I used to do requests for LoRA training and trained at 128 network and 64 alpha, but people complained about the size of the LoRA file. I ended up reducing it to 64/32. If there is evidence that 128 is better though I'd definitely want to switch back.
>>100121580
>nooo not the heckin 200mb file
tell them to keep themselves safe
>>100121564
sue you if you're using an sd3 generated image commercially? desu, that'd probably cause a big court loss in the corner of generative ai though...
>>100121564
emad unironically said "honesty"
>>100121595
the vast majority of celebrity loras on civit are between 800 and 900 MB. It's a real problem.
>>100121651
honestly, is it? it's like $50 per TB of storage at most
>>100121444
>>100121509
>>100121524
>>100121580
>>100121595
>>100121651
you can resize them after you train them
>>100121651
>>100121663
>>100121595
based and I-don't-give-a-shit-about-5-cents-of-storage-space pilled
>>100121671
or I can simply not
>>100121663
storing the loras isn't the issue. loras have to be loaded in VRAM. can quickly run out of room with 1gb loras.
>>100121671
Interesting. googling.
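the resize mentioned above is, under the hood, a truncated SVD of each layer's delta matrix. a minimal sketch of the idea (kohya's sd-scripts ships a resize script along these lines; this toy version ignores per-layer alpha rescaling and safetensors I/O):

```python
import numpy as np

def resize_delta(up, down, new_rank):
    """Approximate the LoRA delta (up @ down) at a lower rank via truncated SVD.

    up: (out_dim, r), down: (r, in_dim). Returns new (up, down) factors
    whose product is the best rank-`new_rank` approximation of the delta.
    """
    delta = up @ down
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    k = min(new_rank, s.shape[0])
    new_up = u[:, :k] * s[:k]   # fold singular values into the up matrix
    new_down = vt[:k, :]
    return new_up, new_down
```

keeping only the largest singular values is why a resized lora "retains a lot of the quality": most layers concentrate their effect in a few directions.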
>>100120965
thank you. trying this now
>>100116832
>>100117355
>maximum details
>extreme hyperrealistic details
>trending on artstation
kekt
>>100121676
>or I can simply not
it retains a lot of the quality without taking up so much space. there isn't a reason to keep them that big
>>100121686
>storing the loras isn't the issue. loras have to be loaded in VRAM. can quickly run out of room with 1gb loras.
this too
>>100121722
>score_9
I bet this will start showing up forever in future models completely unrelated to pony
>>100121740
everyone universally hates it so no
>>100121759
more like people will forever blindly copy prompts from images, and a huge % of images over this time period will have that
>>100121766
this
>>100121740
>>100121759
I'd want to train a non-cucked SD3 model so people don't have to deal with Pony anymore, but it would need to not suck to compete. I don't mind spending some money renting A100s for the training but the dataset needs to be well done and that seems like a challenge.
How about some chrome?
>>100121793
if you have a budget what you do is literally hire people (probably Indian) to tag massive amounts of data for you.
It isn't tech that is the limitation for a great model, it's datasets
>>100121793
It will need to be a coordinated group effort
>>100121804
me in the back (I'm an orb)
>>100121766
to be fair, in the early days of base 1.5, there were some decently complex negative prompts floating around that worked much better than embeddings, and the "amazing quality, masterpiece, award-winning photography" prompts did make a decent difference in quality when trying to gen photoreal people
>>100121793
we need a way to collaboratively put together datasets from all our lora training without retards shitting it up. the latter is the hard part
>>100121846
The only way to really vet people and pay for the training of such a model very quickly begins to resemble something like a real company, except its employees don't get paid.
>>100121887
>>100121846
which is why you just don't bother and pay up to the pajeets
>>100121894
>pay up to the pajeets
that's how you get LAION
What I don't understand about pony model tags:
>score_8_up
Does this mean score 8 and up? If that's the case it shouldn't even need a score_9? The advice I got when I first started using it, can't remember from where, said to use something like:
>score_9, score_8_up, score_7_up, score_6_up
But this seems redundant if "up" means what I think, so I assume I'm wrong. Also, do you have to put a BREAK after the score stuff? At first I was doing that, but then I stopped, partly because it was tedious to manage in ComfyUI and it didn't seem to make a whole lot of difference.
>>100121918
>>100121887
>>100121894
alright, let's see: to have a great model I need to compete in the same space and style as the billion-dollar companies, but do it for free with $2k of equipment instead of Microsoft's $1 billion.
I think I will continue to wait for others to provide the models and focus on loras
>>100121918
Check the PonyXL CivitAI page - the trainers literally admit there that they fucked up the training, so the score numbers are broken and don't function correctly/logically. Literally the only reason it's used is because there's no better alternative XL model. That's why I'm really hoping to either train something better or that someone else will, because as soon as there's something better that's not cucked there will be no reason to use PonyXL ever again.
can i train on sdxl_vaefix or do I have to train on the model without the built in vae?
>>100121840
You're highly reflective. Good.
>>100121936
they use synthetic datasets like they say they use in their papers
baker......
baker?
baker
>b
>a
>k
>e
>r
>>100121887
>>100121846
>>100121838
>>100121812
>>100121968
>>100121936
The process I was thinking of was to grab a lot of booru images since those are easy, use a Python script to clean/synchronize tags between different boorus, and then use an AI to convert the tags into natural language, which should greatly improve prompting based on the ELLA research: https://ella-diffusion.github.io/
That dataset could be supplemented with more manually gathered images to cover characters/styles/concepts people want, and those images would need to be fed to an AI for captioning too.
I'm thinking I'd need some huge harddrives if I'm going to store the dataset locally, maybe pay for a ChatGPT-4 subscription to caption safe images, and set up something local to caption explicit images. For people with any training experience, does that seem reasonable?
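the clean/synchronize step could start as something like this. the alias table is made up for illustration; a real run would pull alias/implication lists from each booru's API:

```python
# Hypothetical alias table mapping site-specific spellings to one canonical tag.
ALIASES = {"1girls": "1girl", "long_hair": "long hair", "blue_eye": "blue eyes"}

def clean_tags(raw_tags, aliases=ALIASES):
    """Apply aliases, replace underscores with spaces, lowercase,
    and drop duplicates while keeping the original order."""
    seen, out = set(), []
    for tag in raw_tags:
        tag = aliases.get(tag, tag).replace("_", " ").strip().lower()
        if tag and tag not in seen:
            seen.add(tag)
            out.append(tag)
    return out
```

deduplicating after normalization matters, since the same concept arrives as "long_hair" from one booru and "long hair" from another.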
>>100122115
would it improve prompting with tags, or just make it better at understanding natural language, though? I don't think you would get the result you are hoping for.
>>100121714
>>100120965
ok this seems to work but turns out the thing i'm trying to run is not the thing i actually want to run lmao
>>100121862
(painting, traditional media)
>>100122133
"1girl, apple" doesn't provide enough information to the AI to understand location, color, etc. If a language model can turn that into "1girl holding a red apple in her right hand" then the resulting model is leaps and bounds ahead in understanding prompts. That's what the ELLA research found; I was going to post one of the pics at https://ella-diffusion.github.io/ but we hit the image limit.
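a toy sketch of that tags-to-sentence step; ELLA-style pipelines use an LLM for this, so the template and the SUBJECTS map below are placeholders for illustration, not anyone's actual method:

```python
# Hypothetical subject-tag map; a real pipeline would let an LLM phrase this.
SUBJECTS = {"1girl": "one girl", "1boy": "one boy", "2girls": "two girls"}

def tags_to_caption(tags):
    """Pick a subject tag if present, then attach the remaining tags
    as a crude natural-language clause."""
    subject = next((SUBJECTS[t] for t in tags if t in SUBJECTS), "a scene")
    rest = [t for t in tags if t not in SUBJECTS]
    if not rest:
        return subject.capitalize() + "."
    return f"{subject.capitalize()} with {', '.join(rest)}."
```

even this crude version shows the point: the caption carries relational words ("with", and, in a real LLM pass, "holding", "in her right hand") that a bare tag list never does.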
it's actually over this time
>please wait before making a thread
>>100122188
4 u
Next thread
>>100122230
>>100122230
>>100122230
>>100116195
i like this
>>100120089
more intelligible paneling than oda
>>100120755
Try Worldly lora, DPM++ 3M SDE, and maybe perturbed attention guidance