Welcome to the Pony Voice Preservation Project!
youtu.be/730zGRwbQuE

The Pony Preservation Project is a collaborative effort by /mlp/ to build and curate pony datasets for as many applications in AI as possible.

Technology has progressed such that a trained neural network can generate convincing voice clips, drawings and text for any person or character using existing audio recordings, artwork and fanfics as a reference. As you can surely imagine, AI pony voices, drawings and text have endless applications for pony content creation.

AI is incredibly versatile; basically anything that can be boiled down to a simple dataset can be used for training to create more of it. AI-generated images, fanfics, wAIfu chatbots and even animation are possible, and are being worked on here.

Any anon is free to join, and there are many active tasks that would suit any level of technical expertise. If you're interested in helping out, take a look at the quick start guide linked below and ask in the thread for any further detail you need.

EQG and G5 are not welcome.

>Quick start guide:
derpy.me/FDnSk
Introduction to the PPP, links to text-to-speech tools, and how (You) can help with active tasks.

>The main Doc:
docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit
An in-depth repository of tutorials, resources and archives.

>Active tasks:
Research into animation AI
Research into pony image generation

>Latest developments:
GDrive clone of Master File now available >>37159549
SortAnon releases script to run TalkNet on Windows >>37299594
TalkNet training script >>37374942
GPT-J downloadable model >>37646318
FiMmicroSoL model >>38027533
Delta GPT-J notebook + tutorial >>38018428
New FiMfic GPT model >>38308297 >>38347556 >>38301248
FimFic dataset release >>38391839
Offline GPT-PNY >>38821349
FiMfic dataset >>38934474
SD weights >>38959367
SD low vram >>38959447
Huggingface SD: >>38979677
Colab SD >>38981735
Huggingface textual inversion >>39050383
NSFW Pony Model >>39114433
SD show backgrounds >>39418552
so-vits-svc offline ui >>39557072
Deep ponies app >>39576010
so-vits-svc for colab >>39592429
New DeltaVox >>39678806
so-vits-svc 4.0 >>39683876
so-vits-svc tutorial >>39692758
so-vits-svc 5.0? >>39879469
Hay Say release (UI for many models) >>39920556
New sovits models (>>40116027 >>40120096)
RVC Models (>>40168041 >>40182177 >>40183933 >>40221966 >>40230330)
Help with Free Hugs episode (>>40059913)
/mlp/con panel 2023 (derpy.me/jQ6TE derpy.me/Etw7D)
Vote for animation AI inputs >>40187816

>The PoneAI drive, an archive for AI pony voice content:
derpy.me/LzRFX
derpy.me/GOpGP

>The /mlp/con live panel shows:
derpy.me/YIFNt

>Clipper's Master Files, the central location for MLP voice data:
mega.nz/folder/jkwimSTa#_xk0VnR30C8Ljsy4RCGSig
mega.nz/folder/gVYUEZrI#6dQHH3P2cFYWm3UkQveHxQ
mirror: derpy.me/c71GJ

>Cool, where is the discord/forum/whatever unifying place for this project?
You're looking at it.

Last Thread:
>>40148287
FAQs:
If your question isn't listed here, take a look in the quick start guide and main doc to see if it's already answered there. Use the tabs on the left for easy navigation.
Quick: derpy.me/FDnSk
Main: derpy.me/g3fFA

>Where can I find the AI text-to-speech tools and how do I use them?
A list of TTS tools: derpy.me/A8Us4
How to get the best out of them: derpy.me/eA8Wo

>Where can I find content made with the voice AI?
In the PoneAI drive: derpy.me/LzRFX
And the PPP Mega Compilation: derpy.me/GOpGP

>I want to know more about the PPP, but I can't be arsed to read the doc.
See the live PPP panel shows presented at /mlp/con for a more condensed overview.
derpy.me/pVeU0
derpy.me/Jwj8a

>How can I help with the PPP?
Build datasets, train AIs, and use the AI to make more pony content. Take a look at the quick start guide for current active tasks, or start your own in the thread if you have an idea. There's always more data to collect and more AIs to train.

>Did you know that such and such voiced this other thing that could be used for voice data?
It is best to keep to official audio only unless there is very little of it available. If you know of a good source of audio for characters with few (or just fewer) lines, please post it in the thread. 5.1 audio is generally required unless you have a source already clean of background noise. Preferably post a sample or link. The easier you make it, the more likely it is to be done.

>What about fan-imitations of official voices?
No.

>Will you guys be doing a [insert language here] version of the AI?
Probably not, but you're welcome to. You can however get most of the way there by using phonetic transcriptions of other languages as input for the AI.

>What about [insert OC here]'s voice?
It is often quite difficult to find good quality audio data for OCs. If you happen to know any, post them in the thread and we'll take a look.

>I have an idea!
Great. Post it in the thread and we'll discuss it.

>Do you have a Code of Conduct?
Of course: 15.ai/code

>Is this project open source? Who is in charge of this?
derpy.me/CQ3Ca

>Links
/mlp/con: derpy.me/tledz derpy.me/14zBP
PPP Redubs:
Ep1: derpy.me/xZhnJ derpy.me/ELksq
Ep2: derpy.me/WVRAc derpy.me/RHegy
Unused clips: derpy.me/VWdHn derpy.me/OKoqs
Rewatch Premiere: derpy.me/EflMJ
Ep3: derpy.me/b2cp2 derpy.me/RxTbR
Ep4: drive.google.com/file/d/1iQzIeGZTbxcu2BJcPHtmyxtiu2mV-MhN/view?usp=sharing pony.tube/videos/watch/c2a0b52d-344a-4240-a415-9a303b734777
Rewatch Premiere: mega.nz/file/4AklSZgI#9cAkzRz-81BGNX7dKKpOChDAg3x3KCoBKuSa1lJfDVs
>>40255207
Anchor.
Bump.
seeing how this is a new thread, i might as well post a few villain models a friend of mine made.
>King Sombra (Season 9)
https://drive.google.com/file/d/1RH0mmqe4cKDTRY8gTS5hwjH4hme81SpW/view
>Queen Chrysalis
https://drive.google.com/file/d/1fc1mTJPwT64AkDZSehJcqxxketej3STe/view
>Lord Tirek
https://drive.google.com/file/d/1j4CGS7PO65DJ7A8rgmwMD3AWRnb8Oy0i/view
>Grogar
https://drive.google.com/file/d/1LHp7-Dm-qPOqhJDGg27RbQzGXc3LHaLo/view
>>40255708
Thanks, I was about to start training some of these on my own lol
>>40254695
just yesterday lmao
If any pony wants to see Snap Shutter sing different versions of the same sea shanty, here it is :D https://www.youtube.com/watch?v=h5K4_gE4MQE&t=1s
Live in ~1 hour. Animating/audio work.
cytu *dot* be/r/PonyPreservationProject
bump
>>40257752
>>40258400
Made a perchance generator for written moans so I can use it as a starting point for text-to-voice lewd noises. Feel free to use it or modify it.
https://perchance.org/qnpx4rpsoo
https://files.catbox.moe/gyuxo3.wav
>>40260214
https://files.catbox.moe/zdf7rf.wav
ponka version
>page 9
bump
>>40260769
Agreed.
>>40255207
is there a possibility of getting an RVC model for each of the pony voices?
Trained an /mlp/ (thanks desuarchive) LoRA for LLaMA 2 13B that allows one to either write green or chat/roleplay with it in greentext/mixed format.
Release soon.
>>40261595
There are links to several model collections in the Main Doc. Look at the bottom of the section titled RVC. Alternatively, there is a list of pony RVC models in somewhat human-readable JSON format here:
https://github.com/hydrusbeta/hay_say_ui/blob/main/architectures/rvc/character_models.json
You can search for a character in that and find links to the files needed for inference.
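If you'd rather script the search than scroll through it, something like this works. A minimal sketch, assuming the top level of that JSON is a list of per-character entries (matching on the raw JSON text avoids guessing the exact field names):

import json
import urllib.request

# Raw view of the file linked above
URL = ("https://raw.githubusercontent.com/hydrusbeta/hay_say_ui/"
       "main/architectures/rvc/character_models.json")
with urllib.request.urlopen(URL) as response:
    models = json.load(response)

# Print every entry that mentions the character anywhere in its text
query = "rarity"
for entry in models:
    if query in json.dumps(entry).lower():
        print(json.dumps(entry, indent=2))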
>>40255708
are these singing models or speaking models?
>>40256512
Strangely, never heard of that one. Was expecting Botany Bay or something (https://youtu.be/6Z_MuoRiSwI), though I guess that one's technically not a shanty. Probably wouldn't ponify though, due to the lyric:
>"Hops around with a log on our toes"
We humans sure like singing about our hands, arms and feet.
>>40261972
>Singing or speaking models?
RVC is usually quite capable of singing either way with the speaking models. They likely sound different from a pony's normal singing voice, however, as many characters have separate singing actresses. Though come to think of it, that is something we should consider: uniform naming of models to tag them as singing or non-singing based on the data they're trained on. Simple prefixes like [SP] and [SI] could work.
Up.
>>40261779
sorry, I guess I didn't realize RVC models ended in that file extension. it doesn't seem to make any sense
>>40262691
cute twiggles
>>40262691
I wish more ponies dressed in cutesy punk style.
Bump thread
If anyone is interested, I suggest we start a new thread... about creating a 4chan simulator with AI. In its infant demo form, you'd simply copy-paste 50 threads to make the imageboard look nice in quality, something to present to someone if you wanted to show off that your imageboard has a better community than other imageboards. The real thing, and the real problem, is making the AI create a quality community that posts on your imageboard, either to attract more users or just to create the kind of community you personally enjoy. It could also play out like a 4chan video game, specifically a simulator & tycoon game, but very descriptive. Just like street simulator & tycoon games, it could theoretically serve a real-life purpose: a blueprint for what you want your board quality to be like, what the moderators should encourage the community toward, and what to filter for.
>>40264179
>>40265955
Whatever happened to the promise of image AI being optimised to allow multiple generations per second? Still waiting for all the instantaneous mares to begin rapidly flooding my system.
AI pones stream up: https://www.youtube.com/watch?v=AjdPKGX8yJo&ab_channel=blob
Princess hair is wavy now!
bump https://files.catbox.moe/x3z9u0.mp3
>>40266578
Mares?
can someone explain the difference between svc and rvc?
D'y'all think 15.AI might be dead for real this time? It's been countless months, and 15's Twitter says FORMER MIT. I've checked a few times a month for the past year; I fear it's just dead.
>>40269042
Seems like it. He likely gave up for various reasons (tech evolved too fast, lack of interest, etc.) and went out with a whimper instead of a bang.
>15.ai dead for a year now
>does literally nothing besides bitch about anons who call him out
>still collecting $700 every month from retards on patreon even though he's a multi-millionaire
why do people still have faith in this fag?
>>40269050
That's so depressing considering that out of all the voice AIs I tried for MLP, it was STILL the best. On top of that, it was FREE. Now everything's gimped or monetized.
>>40269059
you can exploit 11.ai for really cheap if you rotate privacy cards to get the heavily discounted intro prices forever
>>40268167
neat
>>40268973
They are different architectures. RVC is faster to train and generates comparable results, which is why it is more popular, but in some cases so-vits-svc outperforms it.
>>40269056
It's mid.
Mrs. Cake / Chiffon Swirl RVC v2 model, made with 2 minutes 30 seconds of training data
===============================================================================
hugging face: https://huggingface.co/KenDoStudio/MLP_mrs-cake-chiffon-swirl/resolve/main/MLP_mrs_cake-swirl_e475_s2375.zip
browser trial: https://app.kits.ai/convert/shared/mlp-mrs-cakechiffon-swirl
Sample: https://files.catbox.moe/in9crd.mp3
male users: set transpose to +12
>>40269056
don't forget he's still playing tf2
he's still wanking himself on team faggotry 2, that game is fucking aids >>40270161
>>40269056
His pinned tweet for 17 months and counting should be some sort of important announcement about why he's taking so long, but it's actually just him bitching about Voiceverse.
It's like the only patrons he has are people in his special /mlptf2/ server who he just gives special p2p hats to instead of actually working on anything.
>>40270112
any chance for Mr. Cake?
King Sombra sings My Way by Frank Sinatra (TeamFourStar Version)
>https://files.catbox.moe/mflfcp.mp3
>>40261735
https://colab.research.google.com/drive/1Un02h4uQN6zLlgL3zmMOeyce9ICEA3qO?usp=sharing
Colab notebook that launches oobabooga's text generation webui with LLaMA 2 and two LoRAs, using the T4: the /mlp/ one, and another trained on the FIMFiction archive. Neither of them is an instruction-tuned model: use them like an autocomplete. GGML weights (although I think the new fancy thing is GGUF) probably coming soon for local frontends like LM Studio.
Pic unrelated.
>>40271923
>Another trained on the FIMFiction archive
>Neither are instruction-tuned models
Fantastic, sounds like it'd be reasonably comparable to the original GPT-PNY format. I'll have to do some testing to see how much they differ. Nice work, anon!
>Pic unrelated
>Pretty OC
>Can't unsee hoof lips
This mare must be good at kissing
>>40272115
>hoof lips
fuck
[SoVits] Trixie sings "Toyota Song"
>https://files.catbox.moe/zziok5.mp4
I think she had an easier time with the Spanish accent than she did with the Aussie one
>>40271468
>>40272373
nice, more new content!
>>40270454
Do you know the current requirements to run 13B GGML offline?
how about this, >>40272732
instead of relying on 15.ai how about the dude that made the deep ponies ai, why not we contact him again so he can update his app and even add more voices to it, we should contact him
>>40272782
That sentence made me feel like I was having a stroke.
>>40272373
kino
>>40269056
People should be donating $700+ a month to fund training an open-source voice clone model
>>40273078
south americans will do that to you.
>>40269056
He was always a joke.
>V3 by Cookie
>V2 by astraliteHeart
>Pony soup
which model is best, bros?
>>40274244
>AstraliteHeart:
>@everyone, V5 is available for download as we just reached 5555 users!
>You can get the model from CivitAI https://civitai.com/models/95367/pony-diffusion-v5 or directly from Mega https://mega.nz/file/kSERCQYI#r42-WUvLF74TDGUtVJLvJ2ADhxWaC2I41oPkXqI9wNk
>If you use CivitAI, I would really appreciate it if you could give the model a review/rating.
AstraliteHeart's "Pony Diffusion V5" is the latest revision.
>>40274290
thanks
god, they really need to update the /PPP/ google doc
>>40272373
>venezuelan Toyota Corolla ad song sung by Trixie (now with aussie accent)
YEEEEEEEEEEEEEEEEES! I LOVE YOU, HAZY
>>40274376
Yeah, it's a pretty good resource for seeing how things were done, but with all the new tech some little bits are outdated. What kind of changes do you have in mind, Anon?
>>40274376
One of the things I plan to do alongside various content creation and training is look into and revisit some technologies (like TortoiseTTS, Pony Diffusion, Bark, Chirp, Coqui, etc.), so I'll have some doc revisions or other information to include at some point in the future. Stay tuned for that; be sure to not let me forget should it end up taking too long. It's also been a good while since I've done some tech scouting, though that might be lower priority than getting more information on tech we're already aware of and can explore further.
>>40271923
GGMLs here. Originally I exported GGUFs, but no program supported them, so I had to get a week-old clone of llama.cpp to export to GGMLs.
https://huggingface.co/Nikolai1902/LL2-13B-DesuMLP-QLORA-GGML
https://huggingface.co/Nikolai1902/LL2-13B-FIMFiction-QLORA-GGML
Pic unrelated.
>>40272732
Info in the model card. There are two levels of quantization I settled on.
Next plan is to get a better-filtered FIMFiction dataset and finetune with 8192 context length on https://huggingface.co/conceptofmind/LLongMA-2-13b
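Since these aren't instruction-tuned, you just give them a prefix and let them continue. A minimal sketch for running the GGML files locally with llama-cpp-python; this assumes a 0.1.x GGML-era build (newer releases only read GGUF), and the filename is a placeholder for whichever quant you downloaded:

from llama_cpp import Llama

# Load the quantized GGML weights (path is hypothetical; use the file you downloaded)
llm = Llama(model_path="LL2-13B-DesuMLP.ggmlv3.q5_K_M.bin", n_ctx=2048)

# Autocomplete-style prompting: start a green and let the model continue it
prompt = ">be me\n>anon at the function\n"
out = llm(prompt, max_tokens=256, temperature=0.8, repeat_penalty=1.1)
print(prompt + out["choices"][0]["text"])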
>>40273769
i am really liking this BG model
>>40277910
>>40277917
huh, that's pretty good. Are they first-shot generations?
>>40277987
uhh, i just copied the tags from the sample images and clicked generate
they came out and i liked it?
this safetensor is from the link in the OP
Hello, PPP folks, how difficult would it be to create such a thing with ponies?
https://www.youtube.com/watch?v=LAd2nfHtYEE
>>40277999
So yes. I just meant whether you touched them up with img2img, inpainting or manual fixes.
>>40276284
What GPU did you use to make it? How long did it take?
>>40278006
pretty sure this one was faked, or at least heavily touched up. But the general idea for the UI doesn't seem that complex (it's just The Sims on steroids); what's stopping people from actually doing it is that to run it as close to "real time" as possible, someone would need an A100 or several mega-huge GPUs in order to load the text generators and a dozen voice models and render everything without the entire system shitting itself.
>>40278006
>>40278591
It might actually be quite watchable if properly implemented
Say, I'm SUPER ignorant of voice AI stuff, but is there one that, instead of text-to-speech, can translate audio files of sounds (laughs, giggles, cries, groans, panting, etc.) into a character's voice? Text-to-speech AI is very limited in this regard; I mean, there are ways to bruteforce it, but it never sounds quite right.
>>40278612
rvc
New content incoming: I have just finally made my Aunt Lofty cover of "Quilt That Out". Here is the video:
https://youtu.be/MchyXb4j8Kc
New RVC model: Aunt Lofty (I have not uploaded this yet)
>>40278934
ok, now i uploaded the model. here ya go guys :D
download link: https://drive.google.com/file/d/1aWxUx_Ns2ZFTO7aP-UWQ2J-QZ_A8_nw3/view?usp=sharing
>>40271468
Could you make Trixie sing Sinatra's My Way? Marefied if possible (saying mare instead of man, she instead of he, so it fits as if she is singing about herself)?
>>40278278
One rented RTX 4090; the FIMFiction model took 211 hours, and the /mlp/ one 101 hours.
Finetuning LLongMA-2 with 8192 context length is probably going to take one or two A100s.
>>40278013
in this case, i did nothing you mentioned
>>40278612
start here:
https://www.youtube.com/watch?v=F9PdoJ9zmrw
https://www.youtube.com/watch?v=D4JPSqF6vfs
https://odysee.com/@vul:0/mareweather:4
Mare Weather: https://files.catbox.moe/0xtk77.mp3
Rained In: https://files.catbox.moe/f6t7fk.mp3
Well, I promised myself I would force-finish an album before Mare Fair, and I did it, but at what cost?
>>40279903
Yes, this is really good, but what about a second Mare Fair album?
>>40280062
>why the fuck didn't the pic upload?
>>40279903
Digging it, vul
>>40279903
Saw a notification from YouTube. Thanks for another album release, Vul. Your horsemusic is always refreshing to listen to.
mare sipping for mare content
>>40276924
!
>>40255213
https://www.youtube.com/playlist?list=PLvnphC6j1PEGshPaw4MX4JZN_Pzm7GrIm
Let's talk Redub. The docs and spreadsheet to begin S01E05 have been ready to go for ~6 months now. The plan initially was to await the return of 15, but with the total comms blackout, that seems hopeless at this point. SO, rather than let the series die, I say we just run with what we do have and see how it goes. The lack of any super easy type-and-go solutions, plus the main methods currently requiring reference audio, means there's unfortunately gonna be a higher bar for entry this time. There's not really much to be done about that except wait for a solution, and I think we've tried that long enough.
If anyone has thoughts on the matter, do share. Also let me know if you'd contribute with the current toolset so I can try to gauge interest. I'll be back later today to hopefully make a final decision; if there are no major objections or reasons not to, then I'll probably launch it TODAY, later in the evening.
>>40281655
After Mare Fair?
>>40281655
Tortoise TTS could fix all our problems. Train some models on there and see how it goes?
https://www.bilibili.com/video/BV1gu4y1e7gq/
>>40281655
TalkNet still seems quite capable of decent mare outputs while requiring just TTS input. It does take a few more alternative deliveries and some audio splicing though, or at least more so than 15 normally required.
>https://files.catbox.moe/q55x9i.mp4
>https://files.catbox.moe/sdmexo.mp3
But there is a problem with using TalkNet currently: this next episode is one with Gilda, and we don't seem to have a TalkNet model available for her yet.
>>40281773
Wouldn't hurt to have additional options, though that'd take considerably long to implement given how many voices would need to be trained for all the characters speaking. Although, TortoiseTTS could potentially be usable in tandem with existing, already-trained mare SoVits or RVC models, perhaps?
>>40282244
it's easy to train on tortoise, like super easy, just as easy as training on rvc and so-vits
>>40281655
Like others have pointed out, while being forced to use reference audio for making pony voices is sucky, there are ways to get around it by using other AI voice models, or even Vocaloid/Synthesizer V, to get something; and if all of those options fail, Anons can ask people here or in the /create/ thread for lines to use in their shorts.
But I agree with this post >>40281685; prepping for that takes a stupendous amount of brain power, so splitting attention between those two events would be less than desirable.
>>40278591
I don't think it's fake, but it's definitely got a lot of cleanup and editing, probably pre-rendered and everything. The Cartman model fucking sucks; it sounds like Kyle or Stan.
>>40281655
>do a normal dub like the /mlp/ youtube channel did rip
>sovits the voices
Simple.
>>40281655
Won't be able to do anything until after Mare Fair, but will do at least one submission after, if it's still going.
>>40282672
Eh. That would require major restructuring and take away a lot of the creative freedom and individual control that the clip method offers. Plus, mlp dubs is already its own thing.
>>40282244
Right, I forgot TalkNet offers standard TTS. If I recall though, the TTS mode is deterministic in its outputs, which makes it harder to wrangle good results out of. Regardless, it is an option.
>>40281773
If someone wants to do that, then by all means. Is there an easy system to use Tortoise TTS at the moment? A colab script, or something to just download and run easily?
>>40281685
>>40282476
>>40282681
I've no issue with waiting until after Mare Fair if that's agreeable. Maybe the weekend after, to give everyone a respite period. On reflection, wanting to jump into it literally RIGHT NOW is perhaps a bit gung-ho of me. I'll set a tentative date of October 6th for the launch.
>>40282244
>But, there is a problem with using TalkNet currently: this next episode is one with Gilda, and we don't seem to have a TalkNet model available for her yet.
There supposedly was a Gilda TalkNet model, but it was said to be "poor quality". The Google Drive ID is:
1z0JmBWzhQ-H8IWxWEBhHL2HldpWbOX3F
Before Mare Fair ends, we need to train Gilda models on as many different architectures as possible.
>>40283345
Thank you for sharing this. I didn't know that model existed. I added it to Hay Say just now and generated a couple of files. It is indeed a bit low on quality, unfortunately:
text only: https://files.catbox.moe/ga1vm6.flac
with audio reference: https://files.catbox.moe/mc1n08.flac
As far as I know, there is currently no Gilda model for either RVC or so-vits-svc (versions 3, 4, or 5).
>>40283307
actually, yes there is. there are already tutorials out there; here is one of them, if it helps :D https://www.youtube.com/watch?v=6sTsqSQYIzs
>>40278591
It doesn't really need to be in real time; just the ability to generate a random-ass episode about a specific topic would be fun to watch.
>>40255207
https://porndude.me/xvideos/id/61560909/breeding-tavi/
>>40282672
>/mlp/ youtube channel
Is there a backup of that channel somewhere?
>>40283307
>TTS mode is deterministic in its outputs which makes it harder to wrangle good results out of.
Yep, which is why for that result I had to do a bunch of different takes, varying the punctuation, adding sentences, changing the order, etc., then stitching the best together afterwards. So it takes a bit longer to get a really good line delivered, but at least it's still a reasonable option.
>>40283438
Before I forget: now that I've got Hay Say set up, I noticed there are some ease-of-use features the Colab has which Hay Say doesn't:
>Being able to upload multiple files at once
>Noise slider for SoVits models
>Being able to process multiple files sequentially
>Output files inheriting the name of the original file, with "_<semitone value>_key_<model name>" appended
I may go into more detail on these by raising an issue on the github page tagged as a feature suggestion. Might have more info on how to better implement them with how the interface is now.
>>40279903
Really liked this one, good job
>>40284100
Thanks for the suggestions. Apparently, the file upload component I am using supports multiple file uploads out-of-the-box, but that support is disabled by default. It's a super easy fix to allow multiple files to be uploaded at once; I'll add that to the next Hay Say update.
I don't know what the noise slider is. I'll take a look at the colab version this weekend and see what it does exactly.
Adding some sort of "batch mode" for processing multiple files sequentially is something I often think about, but I keep shelving it in favor of developing other features that seem more important at the time. I think developing this would take significant effort if I want to do it right. It's still in the backlog at the moment.
I could probably have the output files inherit the original file name easily enough. So-vits-svc 4 has a lot more inputs than just semitone value and model name; would it be better to append *all* of the input values to the file name, or just those two to keep it short?
>>40283345
where did you find that model? are there more talknet models i don't know of? if there are, please share some more here. i love to use talknet; regardless of its lower quality, it's still good because of its text-to-speech capability
>>40284250
>would it be better to append *all* of the input values to the file name or just those two to keep it short?
Shorter is better, to make things easier to read. At most I feel there are like... 5 aspects of importance, maybe:
<Input file name>_<Model name>_<Semitone value>_<Clustering value>.flac
Speaking of flac, it might also be nice to have a drop-down option to specify the output file type? Some programs and browsers only work with specific formats without needing to convert to a different one first, wav and mp3 being the common ones in that regard. Saving memory by having smaller files is good too.
>>40284281
Thanks again for the suggestions. To avoid a name clash on the server side if the user generates output multiple times while tweaking a different input, such as slice length, I might need to add a hash value at the end, like:
<Input file name>_<Model name>_<Semitone value>_<Clustering value>_<Hash>.<filetype>
But... I have a couple of ideas that might avoid needing to do that.
A drop-down to specify the output type should be possible. I use soundfile to render audio files, which supports MP3, WAV, FLAC, OGG, and a bunch of other lesser-known formats.
I looked into the noise slider and discovered that it is readily available in the command-line interface for so-vits-svc 4, although it is not mentioned in the readme. It uses a default value of 0.4. I'll plan on adding a slider to adjust that value.
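For illustration, building that kind of hashed name is only a few lines; a minimal sketch (made-up names, not Hay Say's actual code), where the hash is derived from all the remaining inputs so that tweaking any other option produces a distinct file:

import hashlib

def build_output_name(input_name, model_name, semitones, cluster_ratio, other_options, filetype="flac"):
    # Hash every other input so changing e.g. slice length can't clash with an old file
    digest = hashlib.sha1(repr(sorted(other_options.items())).encode()).hexdigest()[:10]
    return f"{input_name}_{model_name}_{semitones}_{cluster_ratio}_{digest}.{filetype}"

print(build_output_name("my_line", "MaudPie", 0, 0.2, {"slice_length": 5.0, "noise": 0.4}))
# e.g. my_line_MaudPie_0_0.2_1a2b3c4d5e.flac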
>>40283666
hot
>>40255207
I've just installed Hay Say but I can't generate anything. I'm on Windows; I followed the install instructions to the letter. It doesn't matter what character I try to use, I just get the following errors:
An error has occurred. Please send the software maintainers the following information as well as any recent output in the Command Prompt/terminal (please review and remove any private info before sending!):
Traceback (most recent call last):
  File "/root/hay_say/hay_say_ui/main.py", line 266, in generate
    hash_output = process(user_text, hash_preprocessed, selected_tab_object, relevant_inputs)
  File "/root/hay_say/hay_say_ui/main.py", line 557, in process
    send_payload(payload, host, port)
  File "/root/hay_say/hay_say_ui/main.py", line 586, in send_payload
    raise Exception(message)
Exception: An error occurred while generating the output:
Traceback (most recent call last):
  File "/root/hay_say/controllable_talknet_server/main.py", line 88, in parse_inputs
    jsonschema.validate(instance=request.json, schema=schema)
  File "/root/hay_say/.venvs/controllable_talknet_server/lib/python3.8/site-packages/jsonschema/validators.py", line 1121, in validate
    raise error
jsonschema.exceptions.ValidationError: None is not of type 'string'
Failed validating 'type' in schema['properties']['Inputs']['properties']['User Audio']:
    {'type': 'string'}
On instance['Inputs']['User Audio']:
    None
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/root/hay_say/controllable_talknet_server/main.py", line 32, in generate
    reduce_metallic_sound, output_filename_sans_extension = parse_inputs()
  File "/root/hay_say/controllable_talknet_server/main.py", line 90, in parse_inputs
    raise BadInputException(e.Message)
AttributeError: 'ValidationError' object has no attribute 'Message'
Payload: {"Inputs": {"User Text": "this is a test", "User Audio": null}, "Options": {"Architecture": "controllable_talknet", "Character": "Fluttershy", "Disable Reference Audio": true, "Pitch Factor": 0, "Auto Tune": false, "Reduce Metallic Sound": false}, "Output File": "0c729963a67d8ac36961"}
Input Audio Dir Listing: TalkNet_Training_Offline.ipynb, model_lists, talknet_offline.py, README.md, horsewords.clean, core, requirements.txt, .git, mycroft_talknet.py, diffsvc_gui.py, LICENSE, assets, Dockerfile, controllable_talknet.py, hifi-gan, __pycache__, models, command_line_interface.py, results
Output Audio Dir Listing:
>>40285028
Ah crud. I apologize. Looks like I introduced a bug while refactoring the code for invoking Controllable TalkNet. Fortunately, there is a workaround: try uploading an audio file (*any* audio file will do, even just a second of silence), then select the "disable audio input" checkbox and try generating again.
What's happening is that if you have never uploaded an audio file, a null value is passed as the reference audio filename, but some other code requires that it not be null.
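For anyone curious about the traceback above: jsonschema rejects null for a field declared as a plain string, and a nullable type is one way a schema can accept both. A minimal sketch of the failure (illustrative only, not Hay Say's actual schema or fix):

import jsonschema

schema = {"properties": {"User Audio": {"type": "string"}}}

# Raises ValidationError: None is not of type 'string'
# jsonschema.validate(instance={"User Audio": None}, schema=schema)

# Declaring the field nullable accepts both strings and null:
nullable = {"properties": {"User Audio": {"type": ["string", "null"]}}}
jsonschema.validate(instance={"User Audio": None}, schema=nullable)  # passes

(The second error in the log is a separate small bug: the handler references e.Message, but ValidationError stores its text in e.message.)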
>>40285060
It works now. Thank you.
>>40284258
All the TalkNet models I know about are here:
https://ponepaste.org/5733
>>39714129 (https://desuarchive.org/mlp/thread/39686748/#q39714129)
I need suggestions. I think I have the "bring your own datasets and models" thing figured out. I want to support "bring your own compute" too, so people don't have to pay for my GPUs. I want it to work with both personal compute (plug in your own desktop) and shared compute (use other anons' GPUs, assuming they let you).
Here's one approach:
>You want to attach your own compute.
>You install minikube on your desktop to set up a one-computer kubernetes cluster. (If you have multiple servers, you can create a more scalable kubernetes cluster with kubeadm or other tools.)
>I provide kubernetes config files and scripts that you can use to connect your kubernetes cluster to my backend. One of these scripts will let you create API keys.
>You use either my website or my scripts when you want to mod some AI model. In either case, you pass it the API key, and my backend will connect to your cluster so it can use your GPUs.
Here's another approach (sketched out just after this post):
>You want to attach your own compute.
>You install docker on your desktop.
>You use my website or scripts to start modding some AI model.
>It gives you some docker commands to run so you can attach your compute.
If you have other ideas or thoughts on either approach, please let me know.
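To make the second approach concrete, the docker command the site hands out might look something like this; every name here (image, env vars, URL) is hypothetical, since none of it is built yet:

docker run --gpus all -e WORKER_TOKEN=<token from the website> -e BACKEND_URL=https://backend.example/queue ppp-registry/compute-worker:latest

The container would poll the backend for jobs and run them on the local GPU; the kubernetes approach is the same idea, with the cluster scheduler doing the connecting instead of a single hand-run container.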
Does anyone have the .fla files for the late-season intro? i remember playing with it and exporting puppets, but now i can't find the files anywhere on my computer. i downloaded the majority of the 2013-2015 leaks torrent. if anyone has downloaded the same torrent, can you tell me which folder it's in?
>>40285942
I think we have those same leaks and more, and I don't recall any of the intros being in any of them.
>>40285968
i got this lyra and bonbon from the intro, and i remember zooming in and stealing other puppets too.
>>40285627
If you haven't already, maybe look into what vast.ai does? Also, is the goal here to let people use their own GPUs while keeping your company's code secret? If so, that seems difficult. But if not, then I'm not sure why people will need to pay you (assuming they steal the code running on their local machines).
>>40285214
i've been trying to tell everyone on this board: we need to keep making more talknet models. it's a good idea at this point
>>40286186
I haven't looked into what vast.ai does, but I will now. Thank you.
>Also, is the goal here to let people use their own GPUs while keeping your company's code secret?
The cloud infrastructure management code and probably some of the blackbox optimizers will be secret, and those will not run locally. The plan is to make everything else open source. That includes all the inference code and research for modding & training AI. We're not going to be developing most of that anyway; it'll be from existing repos. If we need something that doesn't exist, we'll probably end up paying people to write and publish it. People will have source code for everything they run locally, and we'll try to make it easy to fork and modify the code.
>But if not, then I'm not sure why people will need to pay you (assuming they steal the code running on their local machines).
That's by design. It should be enough for people to do pretty much everything without us, minus some convenience features that are impossible without cloud infrastructure. In the ideal future, this thing is free for open source communities and researchers, with whatever privacy guarantees and features they want, so they don't feel the need to develop alternatives. We'll be making money from companies that want to scale up without having to manage the infrastructure, and that want to make this stuff available to their employees without having to retrain them.
>>40255207
there has been some new progress with VITS2, the sequel to VITS:
https://github.com/p0p4k/vits2_pytorch
and a vits2_pytorch and MB-iSTFT-VITS hybrid:
https://github.com/FENRlR/MB-iSTFT-VITS2
>>40286301
what is vits2, exactly?
>SovitsSVC 4 model: https://huggingface.co/Amo/so-vits-svc-4.0_GA/tree/main/ModelsFolder/OctaviaBrit
>Dataset: https://huggingface.co/Amo/so-vits-svc-4.0_GA/blob/main/datasetsWav/OctaviaBrit.zip
I've made an attempt at using VERY old 15.ai Octavia lines I dug out from the depths of my hard drive to see if it would be possible to train a model on that. The result is very meh; just like with the other models trained on audio extracted from noisy datasets, the residual noise still fucks with the process. I'm still sharing it here, as I feel a layer of de-noising and music/sound-effect removal would make the outputs more palatable. Still, I hope that in the future there will be tech that could at the very least fix up a derped dataset.
>Examples:
https://files.catbox.moe/ib2e7e.wav
https://files.catbox.moe/uiwi7y.wav
https://files.catbox.moe/1oubmo.wav
https://files.catbox.moe/rcdmza.wav
>>40283307
>take away a lot of creative freedom
Explain.
>>40287243
Being able to introduce new animation sequences lets the community build a unique piece of media, not constrained by the specific, rigid animation laid out by just dubbing an episode.
>>40287243
Yeah, >>40287289 nailed it. A big part of the appeal for me, and I'm sure for others, both to work on and to watch, is that it's not "just" a redub of the episodes. Claimants get full creative control over the visuals as well, so they can take a clip in a completely different direction, add visual gags, extend and occasionally even animate their own shit to add, etc.
>>40281773
>>40282244
>>40282268
>>40283307
I played around with Tortoise TTS on Docker today. There are nine pretrained models you can use to generate fairly good audio, which may provide decent input for a speech-to-speech architecture like so-vits-svc or RVC. However, Tortoise TTS does not have any pretrained pony models, and the author has opted not to release instructions on how to train new models. See https://github.com/neonbjb/tortoise-tts#training
Tortoise TTS has an alternative mode where you can provide voice samples and it will attempt to "mimic" that character's voice. Unfortunately, from the handful of samples I've tried, it does not produce anything close to the pony voice I was trying to get it to mimic, even after providing a ton of additional clean audio.
>>40285942
nvm guys, i found it. i must have accidentally deleted the files; i had to redownload them. it's under Full MLP Leaks 2019, Xmas 2017, FTP leaks, mlp_video, MLP8, Hasbro_Requests, MLPS8_OTS
>>40288149
Could you render Trixie and Derpy in the background, please?
>>40287951
you need to actually train the model, that's why it won't output that voice well. use the ai-voice-cloning github repository, it's way better
>>40287951
the video shows the guy cloning his own voice, and godfrey, there is a super close comparison between his real voice and his cloned one
>>40288238
>>40288149
Nice.
>>40288238
https://files.catbox.moe/sxe7mq.gif
https://files.catbox.moe/wj68x3.gif
>>40288405
>>40288395
Thanks
>>40286301
Tried to run this, but it looks like it uses way more VRAM than consumer GPUs have. The paper says they used 4 V100s to train.
>>40287289
>>40287386
I don't get it. How is that impacted by, in the continued and endless absence of 15, having claimants voice their sections and run said voice through talknet, sovits, rvc, whatever?
>>40289412
The original question was seeking clarification on why sticking to the show's animation and only allowing dubs of existing lines hinders creativity. You can just use TN/SoVits/RVC to do straight dubs only; that's a valid way of creating content. But the Redub Project here on PPP doesn't want to do just a straight dub. They want to be able to add in custom animation and create an entirely new episode. That, by definition, allows for more creative freedom, since the creators are no longer bound by the strict animation laid down in the original episode. What you're describing is something entirely different from the PPP Redubs. We aren't saying it's any less valid, but we are saying it's structured to be "less" creative.
>>40288312
Are you referring to this Git repo?
https://git.ecker.tech/mrq/ai-voice-cloning
Looks like it does provide an avenue for training models for Tortoise. Thank you, I'll look into this one and report my findings (and hopefully post some output examples if they are of decent quality).
>>40290638
that's correct
>>40290638
>>40287951
hello guys, i also have some news: there is another new text-to-speech tool, developed by Microsoft of all people, called VALL-E-X. it is open source and available now on github. the quality is pretty damn good too, desu; it also handles emotional context and seems to be close to 15.ai quality. here is a youtube video; if tortoise tts fails, try this
https://youtu.be/7qgfoVFQmvk
https://github.com/Plachtaa/VALL-E-X
>>40291541
>>40290638
>>40287951
Here is an example of VALL-E-X:
Example 1: https://files.catbox.moe/lx5je4.mp4
Example 2: https://files.catbox.moe/iwxaw0.mp4
Note: with VALL-E-X, if your reference audio is too quiet, it will not pick up the voice well. Make sure the reference audio is nice and loud and clear. btw, the quality was very good. i will be running more tests, but these are my findings so far, and it looks promising for helping you guys do the PPP redub.
>>40290161
No, that wasn't the question.
>>40291591
>discord message notifications kept on
>bandicam logo
>output sounds like shit
>elevenlabs-tier "modeling" where it just falls back on hidden generic male or female voices
>most of the video playback time is random useless shit that didn't need to be kept in
Live in ~1 hour. Animating/audio work.
cytu *dot* be/r/PonyPreservationProject
>>40291591
Sounds decent, certainly worth investigating for pony voices.
>>40291591
>>40291767
Actually, I take that back. Initially I listened only on my speakers, but on headphones the quality is obviously terrible compared to what we currently have with sovits and rvc.
>>40291788
obviously it has room for improvement, but for only 3 seconds' worth of data, this is pretty impressive.
>>40291755
it's early-youtube-tutorial kino transported to the modern day. All it's missing is wildly inappropriate music and communicating by typing into a text file.
>>40285973
Zoom, Enhance!
Damn, I knew it, she is the one stealing oats!
>>40292246
lol
For those who need it, the RVC colab notebook has been updated to work(?) and also includes some zip models:
https://colab.research.google.com/drive/1785m7TPZeB8sGNmQMjGlkZql7KPLbJxf?usp=sharing
>>40292444
>>40291788
yes, but so-vits isn't text-to-speech; this is the next best thing, and the quality is actually good. i showed a comparison between quiet and loud reference audio; obviously the loud one sounds less robotic, and it's more impressive for being inferred from only 3-10 seconds of audio
It's been a few weeks since my last one, but I'm back! This time it's a Pinkie Pie cover of Pure Imagination from Willy Wonka and The Chocolate Factory.
https://youtu.be/mcrsGGSyZgU
I love you guys. Thank you for keeping the dream alive.
I wonder if the demand/appeal of AI has died down recently due to all the decent options being cucked or requiring certain specs (normie filtering)
>>40293273
this is nice, it's always good to hear Pinkie singing
>>40293903
It's definitely due to normie filtering. You can't expect a regular person to know how to install Python on their iPad.
>>40291767
Live in ~1 hour. Animating/audio work.
cytu *dot* be/r/PonyPreservationProject
>>40293273
That came out pretty good, nice job.
>>40294443
hello clipper, did you have a retry at my suggestion to see if it works well? it might be the closest thing to 15.ai we can get, really
>>40294202
Most people don't know how to install Python on a regular computer either. Some don't know how to install anything on a computer.
https://skybox.blockadelabs.com/
>>40295379
>ponyville
more like weebshit
>>40295379
>>40295642
Yeah, was gonna say
>request Ponyville
>get some generic background art from a dating VN
>>40295642
are you telling me you wouldn't a hoers weeb?
>>40295642
>ponyville is weeb
What?
>>40296346
DYEWTS?
>>40295642
Try to think more than zero seconds into the future.
>>40296538
>future
mlp turning into weeb anime?
if that's the case, i'm hopping off the ride
>>40295642
Just like your neighponese cartoons.
>>40295379
The shadows there are all over the place, and that's just the most obvious issue. There are either two suns in the sky, it used a mirroring method down the middle, or it cannot keep the light source consistent. Not sure if it could handle the show's art style very well; everything looks blurry here, but the objects in MLP are very clean and defined.
>>40293273
It was a chill listen. Thank you!
I've been trying to figure out RVC models and I was directed here from >>>/g/
You might be able to help: if a model I've been trying to develop doesn't do anything to my voice but adds static noise to it, what could I possibly be doing wrong? I was following the Kalomaze "Training RVC v2 models" guide that uses the Mangio RVC fork. My intent was to build models that work with the w-okada voice changer, but it took a lot of effort to figure out how to even get that working.
I have about 8 and a half minutes of wav files to use for this, in small clips.
>>40297674
Do you have the pretrained models downloaded?
>>40297764
Yes, I do. I've been trying to train with v2.
another >>>/g/lmg anon here. The AIHub discord for sharing RVC models is capped and full at 500k members. What am I supposed to do now? Seems like that would be the place for tech support with regards to RVC... I don't even have a crepe option in my training webui, so I am using rmvpe instead.
>>40297976
Oh, I know that one, and I'm the one looking for help. https://github.com/Mangio621/Mangio-RVC-Fork/releases
Check this fork and it'll have a crepe option. This one's a bit more foolproof, too.
>>40297983
So in that RVC google doc guide, it mentions that the min is where you should stop. I trained 6200 steps and 640 epochs, and at 0.999 smoothing it doesn't look like I hit the min.
In the guide it looked very obvious where overtraining began.
>>40297780
Can you provide an example? Also check the training logs and make sure it is using the pretrained model.
>>40298012
I saw that, and even with the fix it mentioned, I still wasn't able to get tensorboard to work. It's like all the issues I had just getting RVC working to begin with.
>>40298054
Is it alright to say I'm too self-conscious to do that? The output just wasn't affecting it at all.
2023-09-06 23:44:19,333 INFO {'train': {'log_interval': 10, 'seed': 1234, 'epochs': 20000, 'learning_rate': 0.0001, 'betas': [0.8, 0.99], 'eps': 1e-09, 'batch_size': 20, 'fp16_run': True, 'lr_decay': 0.999875, 'segment_size': 12800, 'init_lr_ratio': 1, 'warmup_epochs': 0, 'c_mel': 45, 'c_kl': 1.0}, 'data': {'max_wav_value': 32768.0, 'sampling_rate': 40000, 'filter_length': 2048, 'hop_length': 400, 'win_length': 2048, 'n_mel_channels': 125, 'mel_fmin': 0.0, 'mel_fmax': None, 'training_files': './logs\\Handler/filelist.txt'}, 'model': {'inter_channels': 192, 'hidden_channels': 192, 'filter_channels': 768, 'n_heads': 2, 'n_layers': 6, 'kernel_size': 3, 'p_dropout': 0, 'resblock': '1', 'resblock_kernel_sizes': [3, 7, 11], 'resblock_dilation_sizes': [[1, 3, 5], [1, 3, 5], [1, 3, 5]], 'upsample_rates': [10, 10, 2, 2], 'upsample_initial_channel': 512, 'upsample_kernel_sizes': [16, 16, 4, 4], 'use_spectral_norm': False, 'gin_channels': 256, 'spk_embed_dim': 109}, 'model_dir': './logs\\', 'experiment_dir': './logs\\', 'save_every_epoch': 5, 'name': '', 'total_epoch': 6073, 'pretrainG': 'pretrained_v2/f0G40k.pth', 'pretrainD': 'pretrained_v2/f0D40k.pth', 'version': 'v2', 'gpus': '0', 'sample_rate': '40k', 'if_f0': 1, 'if_latest': 1, 'save_every_weights': '1', 'if_cache_data_in_gpu': 0}
2023-09-06 23:44:20,514 INFO loaded pretrained pretrained_v2/f0G40k.pth
2023-09-06 23:44:21,352 INFO loaded pretrained pretrained_v2/f0D40k.pth
Does anyone have complete scripts of every mlp episode? I wanna try to train AI on them to make an episode.
>>40298148
You can use examples that aren't your own reference audio. How are you doing inference?
>>40298223
I don't know much about inference, and everyone has refused to explain it to me in a way I understood.
>>40298229
How are you testing your RVC model if you don't know how you are doing inference?
>>40298172
https://mlp.fandom.com/wiki/Special:BlankPage?blankspecial=transcripts
The wiki has transcripts at least. Dunno if a complete collection of full scripts exists anywhere.
>>40298282
I was shoving it into the w-okada voice changer software right away, since I knew how that worked.
Whenever I tried to ask friends what the hell inference is, they'd only give me instructions on training. So between going to them and finding outdated guides, I've been left fairly confused.
>>40298172
Here's a JSON; I don't remember how it's organized. https://raw.githubusercontent.com/effusiveperiscope/PPPDataset/main/tier1/episode_transcripts.json
>>40298399
Inference is just another term for doing the actual voice conversion (in this context). I'm not familiar with the w-okada voice changer. Have you tried using the webUI in the main RVC repo?
>Page 9
No.
>>40299341
https://www.youtube.com/watch?v=rbjVMsRswEs
BGM made a new song.
>>40290638
I trained a Rainbow Dash Tortoise model using https://git.ecker.tech/mrq/ai-voice-cloning:
https://huggingface.co/hydrusbeta/models_i_trained/tree/main/tortoise/Rainbow%20Dash
Here is a sample audio file:
https://files.catbox.moe/is2y8a.mp3
I broke it down into six prompts and stitched the results together in Audacity. The output is not deterministic, meaning you can enter the same prompt twice and get slightly different outputs. I had to run each of my prompts a couple dozen times before it finally produced clean output with the character saying the line the way I wanted her to say it. You can also do "prompt engineering" by adding (non-spoken) context in square brackets to the prompt, e.g. "[I am really sad,] Please feed me."
Here is some advice for anyone else who might be interested in training models:
* I mostly followed Jarod's tutorial. It's a good starting point: https://www.youtube.com/watch?v=6sTsqSQYIzs
* Install at least Python 3.9 and make sure to set up your PATH variable so that python3 points to python3.9 before running setup-cuda.sh. Python 3.8 and earlier will fail to infer output with custom models.
* After running setup-cuda.sh, start the UI with "start.sh --defer-tts-load". That flag prevents the inference engine from loading into VRAM (which you don't need while training). I was unable to start training until I did that because I kept exceeding my VRAM limit.
* When you run start.sh, you may see some short stacktraces and the following two scary messages. This is expected and OK; the UI should still run. The application tries to load these optional modules but gracefully handles their absence:
  ModuleNotFoundError: No module named 'vall_e'
  ModuleNotFoundError: No module named 'bark'
* If you are using Docker (see the combined command after this post):
  1. You will need to expand the shared memory (pass "--shm-size <size-you-want>" to the docker run command) because the default of 64 MB is not enough. I used 8GB, which is probably overkill.
  2. Don't forget to expose the port (pass "-p 7860:7860" to the docker run command).
  3. Don't forget to expose your GPU to the container (pass "--gpus all" to the docker run command).
  4. Pass "--listen 0.0.0.0:7860" to start.sh to make the Gradio server listen on every interface instead of just loopback. Otherwise, your host machine won't be able to access the UI.
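Putting those Docker flags together, the run command would look something like this (the image name and the start.sh path are placeholders for however you built and tagged the container; the flags themselves come straight from the list above):

docker run --gpus all --shm-size 8g -p 7860:7860 <your-ai-voice-cloning-image> ./start.sh --listen 0.0.0.0:7860 --defer-tts-load

Adjust --shm-size down if 8 GB is overkill for your setup.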
>>40296081
>>40297008
>furry-proofed for the rest of time
>cute weeb mares
I'm... actually okay with this future.
Is it just me or has files.catbox.moe been down all day?
>>40274290
5.5 soon
>>40301934
>>>/trash/
>>40301934
Why is the mare grumpy?
>>40302372
She realized she's not real
>>40302389
Oh, is that all? We can fix that.
>>40301934
That looks really good. The irises aren't quite right and the outer stroke on the mane is thicker than the rest, but it seems your next version is able to gen show-accurate really well.
>>40301934
Looks great. I hope that's not too cherrypicked.
>On September 8, 2022, 15.ai was temporarily taken down in preparation for an upcoming update, a year after its last stable release (v24.2.1). As of September 2023, it is still temporarily offline.
>>40303899
You either run your waifu or live long enough to see her taken offline.
Does anyone know if there have been any changes in the AI music generation models? MusicGen and the other ones are a bit too random for my tastes, and I am not willing to pony up for paid services to test them out.
rainbow dash sings my heart by paramore
https://youtu.be/OzxtEVYXzAM?si=lxvsdJasH-zxM9my
>>40304785
something something Bark
>>40305515
Bark is fucking useless though, unless there's been a substantial update
>>40305515
>>40305751
...i'm retarded, sry
>Chirp v1 is now available on Disco-
Mhm, I see. I hope whoever runs Suno takes their medication and releases it normally soon.
It has been a while since I said anything here. Are there any LLaMA 2 based pony models? GPT-PNY is good, but the tech has progressed since.
I'd like an AI based off Vogelfag's schizo posts, that would be entertaining to discuss each episode with.
>go to install RVC on my main rig at long last
>cucked out thanks to AMD support relying on DirectML, and DML only building against WSL and not actual Linux distros
As it is the holiest day in modern America, may I wish a very happy 747 hitting the MS campus. Allah and Celestia will it, for refusing to let me and my 5700 XT work in peace.
>>40271923
>>40306355
>>40307320
>I wish a very happy 747 hitting the MS campus
Poetic words for a day like this.
>>40307476
And very much deserved, since the sovits voice selection is utter ass (prioritizing cozy glow and the flim flams over characters like spike, or getting more CMCs than apple bloom done faster? really, m8?)
>>40300791
so wait, that was definitely text-to-speech, right? it actually sounded really, really good and has potential to be used for the PPP REDUB. that's excellent, make some more models. that one was great; it sounded accurately like rainbow dash, it just needs to be volume boosted is all, which is easy to do
>>40307623
>it actually sounded really really good and has potential to be used for the PPP REDUB
>>40307541
>prioritizing cozy glow and flimflams over characters like spike or getting more CMCs than apple bloom done faster? really m8?
???
>>40300791
>Here is a sample audio file:
>https://files.catbox.moe/is2y8a.mp3
Sounds like a merging of Rainbow Dash and Nowacking / Vinyl Scratch. Which, for a Dashie model, strangely sounds less like the former and more like the latter.
Any more information you can provide on her training? More specifically, how many RD voice lines were used, and how long was she trained for?
If this is as good as she can be with Tortoise, I worry it might be a bit too noisy and distant to be usable for redubs. But if further training is possible, she could still have potential.
>>40307623
>>40307632
>>40307768
From an intonation perspective that's sounding pretty good; decent amount of fluctuation and emotion in the delivery. But that's about the best I can say. Voice-wise, I agree it sounds pretty off from Dash, and audio-quality-wise it sounds absolutely GUTTED, like whatever denoising is being used is pushed way too high, to the point it's cutting into words and making them hard to understand.
Once the time rolls around I'm happy for people to use whatever PPP tool they want for the voices, but if this is the best Tortoise has to offer, then I'll be staying far away from it, myself.
>>40303899
I'm trans
[SoVits] Cadance(?) sings "Just an Illusion" - Jeffrey Jey / Mastik Lickers / Chris Burke
>https://files.catbox.moe/cdtg7j.mp4
Now with extra AI elements, and definitely no changelings...
>>40304894
This is incredibly well done.
You know, thinking about it since last night, there is something real fucked going on with AI and corporate control over computing that not enough people are questioning. You've got all these tools being made, and yet more and more are tied to either hardware made by nVidia or APIs developed by large corps like Microsoft, with everyone else, even CPU-bound users, being an afterthought. Hell, nVidia's turn toward AI profiteering was so sudden that you're starting to see more people point out how bad consumer GPU drivers are getting due to game driver devs being diverted or outright fired. The issue with tools being built for WSL is even more chilling, since it just confirms the "embrace, extend, extinguish" problem people were already predicting when MS got suspiciously friendlier with Linux. I dunno, anons, I don't like where these things are heading: you either have to fall into a single hardware/software pathway or get stuck using the dollar-store-brand materials.
>>40309003
Agreed. The AI space is sadly full of backwards-thinking rich old CEOs. Even the PPP feels like it's just a bunch of millionaires showing off their 4090s. I think AMD needs a slap in the face to truly embrace ROCm and kill CUDA by being better and more affordable.
>8
>>40307745
The realvul 4.0 models repo on huggingface. It's depressingly short on models compared to all the ones RVC has now.
>>40309770
1. I don't train characters that already exist in other repos:
https://huggingface.co/Amo/so-vits-svc-4.0_GA/tree/main/ModelsFolder
https://huggingface.co/datasets/HazySkies/SV4-M/tree/main
https://huggingface.co/OlivineEllva/so-vits-svc-4.0-models/tree/main
2. Training a model on so-vits-svc 4.0 takes around 4-30x longer than on RVC, which is why so many people train RVC models.
3. I don't train so-vits-svc 4.0 models anymore because I generally prefer 5.0 for singing and RVC v2 for speaking.
>>40307623
Yes, that was purely TTS. As I mentioned, though, it took a bit of work to get it all sounding right; it wasn't all in one pass.
>>40307768
[todo: provide info on training]
I trained it using all the FiM Rainbow Dash audio with no noise tag from Clipper's master files (~1000 files). I used a learning rate of 0.00005 and trained it for 500 epochs, which took several hours on an A4000 GPU. However, from the graphs that were printed during training, it looked like it converged much quicker than 500 epochs, and I found that I got better results using an earlier save point. I can't remember exactly where the save point was, but it was probably around 200 epochs. I also trained a Starlight Glimmer model using a learning rate of 0.00003 and achieved similar results with the quality of the voice. In an experiment to see if it would produce better results, I trained a Twilight Sparkle model using only audio clips with an emotion of "neutral" and a learning rate of 0.00001. Unfortunately, the results were so bad that I didn't bother uploading that model to Huggingface. This is my first time training AI models, so it is possible I did something wrong. I resampled all the audio files to a sample rate of 22050 beforehand because I think that's what the software wants. The application also attempts to auto-transcribe all the training audio files, and it does not do a perfect job, so I edited the transcription files to make them accurate.
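For anyone prepping data the same way, batch resampling to 22050 Hz takes only a few lines. A minimal sketch assuming a flat folder of wavs (librosa and soundfile are just common choices here, not necessarily what the anon above used):

import pathlib
import librosa
import soundfile as sf

src = pathlib.Path("clips")
dst = pathlib.Path("clips_22k")
dst.mkdir(exist_ok=True)

for wav in src.glob("*.wav"):
    # librosa resamples on load when an explicit sr is given
    audio, sr = librosa.load(wav, sr=22050, mono=True)
    sf.write(dst / wav.name, audio, sr)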
>>40310053lol, I forgot to remove the "todo" before posting. oops.
>>40309770
In addition to the other repos Vul mentioned, here are a couple more:
https://huggingface.co/Ponetan/Cloppy_SoVits_Models/
https://drive.google.com/drive/folders/1pyitdHlO2-XOYC6H4iVK7AS5aRDeSuxG
>>40310053
Thanks for the extra info, and good work with re-transcribing the inaccuracies. Every bit of data cleaning and correction helps the mare shine as bright as she is able.
>I found that I got better results using an earlier save point
I had a similar experience with Tacotron2 when I trained Athena and MAS Rainbow Dash a while back, where it ended up sounding worse the longer training went on, likely due to overfitting or something. May be worth experimenting further, but if it remains similarly bad, it might be more worth scouting other options.
>>40310082
>Cloppy SoVits Models
>Not a single pony
So much for "cloppy".
The drive link has a fair amount of pony though; curious to see how Tempest, Thorax and Diamond Tiara sound. Thanks for sharing.
i mean, for fuck's sake, i have to basically reupload all my damn ai covers onto Odysee now cause the no-fun police fagtards at youtube gave me a copyright strike. luckily i've been saving my hard drive space for that. i will be making more models later on; right now im tied up with this shit
>>40310946
>>40312461
>>40313213
>>40274290
>astralite
Oh, it's the model that specifically excludes artists and blocks you from artist-specific prompting.
>>40314657Damn, I like that style!What project is it for?
>>40314657
>Apple Bloom has green eyes
but why
>>40315005>Green-eyed Apple BloomChangeling detected
>Make random mare for first time
>In love with random mare
From NovelAI, usually don't pay attention to the image generation, but still...help
>>40315488
>NAI
pony diffusion is much better
>>40314657looks like crypt of the necrodancer
>>40315488Looks pretty good, minus the melting hind leg.
>9
>>40262691prompt pls
>>40279903i love the mare-y music!life's demands seem more endurable now! :D
>>40315002Looks like Necrodancer
>>40317992man, this reminds me how much i missed the ponification threads.
Made a cover to test out the Maud Pie (speaking) model by vul that isn't included in Hay Say.
https://youtu.be/9cc48YgToY8
>>40320377This is awesome! Maud is a deceptively amazing singer.
>>40320485Thank you! I totally agree.
>>40320377Very nice! I quite enjoyed that.I added Maud Pie as a downloadable character in Hay Say just now. Looks like I missed a few others from Vul's repo, too. I'll go through all the repos I know about tomorrow and add all missing characters.
>>40320377
>>40303899It's felt way longer, I wish he would do an update or fucking something, tell us if it's truly over.
>>40321128i believe its truly ovJA9ATTer
>>40321128>>40321406i believe its over
>>40321406
>ovJA9ATTer
I know what happened, but it still seems incredibly funny. It really does.
Anyway, here's hoping someone doesn't charge out the ass, so leeches like me can make their pony HFO JOI hour-long comedy sketches.
>>40315488What a lovely mare.The real curse is that there's no way to make the same mare again.
>>40322881>
>>40314657That's an interesting composition of ponies.
>>40323470
>>40324376
how do I uninstall so-vits? because I need space for RVC
Messing around with Shap-e and Chirp. The AI mare do spin>https://files.catbox.moe/izacux.mp4
>>40327050
It would look nicer if you rendered only the edges (you could also fill the polygons with the background color so the wires aren't see-through) with some kind of glow effect and a black background.
>>40328769
>>40329221
>7
fast board today
>>40327050i just messed around with chirp and just made an instant pony hit :D
Anyone know what the best TTS Luna model is at the current moment?
>>40330247
>TTS Luna
That would probably be the Talknet one, but I would still suggest using an audio combination of that, sovits and RVC to get more natural-sounding outputs.
here are my videos testing out chirp ai using both rvc and so vits svc 5.0
https://youtu.be/ajPxzs32-xY
https://youtu.be/RWnivGiUiJU
>>40330724Ooh, those turned out great!Would definitely like to hear an extended version of the second one. Twilight mapped to it quite well.
>>40330785rn i am making a rainbow dash punk rock :D
>>40330936
>>40330724
>>40330785
here it is, rainbow dash sings wonderbolt punk, im awesome
https://youtu.be/Tb1MJP_qBGU
>>40322322You could draw her yourself.
What would the pones do with such technology?
>>40332838artificial changelings
>>40332838
I'm sure they would love to see/hear retellings of their stories and legends in the voices of the heroes of their past.
>amre
Matcha-TTS: A fast TTS architecture with conditional flow matching
https://arxiv.org/abs/2309.03199
>We introduce Matcha-TTS, a new encoder-decoder architecture for speedy TTS acoustic modelling, trained using optimal-transport conditional flow matching (OT-CFM). This yields an ODE-based decoder capable of high output quality in fewer synthesis steps than models trained using score matching. Careful design choices additionally ensure each synthesis step is fast to run. The method is probabilistic, non-autoregressive, and learns to speak from scratch without external alignments. Compared to strong pre-trained baseline models, the Matcha-TTS system has the smallest memory footprint, rivals the speed of the fastest models on long utterances, and attains the highest mean opinion score in a listening test.
https://github.com/shivammehta25/Matcha-TTS
https://huggingface.co/spaces/shivammehta25/Matcha-TTS
might be relevant for you guys. the base model is trained with the LJ Speech dataset (female US speaker), but if you have your own well-curated dataset (as I remember you do from when 15.ai was a thing) then making your own specific voice model shouldn't be difficult. they used 2x 3090s to train an 18M model, so it should be feasible to recreate, especially if someone is willing to rent A100s. might need to use an existing voice conversion model to extend the dataset used for training
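For anyone wondering what OT-CFM actually optimizes, the short version (text conditioning omitted; this is the standard form from the flow-matching papers Matcha-TTS builds on, so treat it as a sketch rather than the paper's exact notation): sample t ~ U[0,1], noise x_0 ~ N(0, I) and a data sample x_1, then regress the network v_theta onto the constant velocity of a straight path between them, with sigma_min a small constant:

\phi_t(x_0 \mid x_1) = \bigl(1 - (1 - \sigma_{\min})\,t\bigr)\,x_0 + t\,x_1

\mathcal{L}(\theta) = \mathbb{E}_{t,\,x_0,\,x_1}\,\bigl\lVert v_\theta\bigl(\phi_t(x_0 \mid x_1),\, t\bigr) - \bigl(x_1 - (1 - \sigma_{\min})\,x_0\bigr) \bigr\rVert^2

At synthesis time you integrate the learned ODE dx/dt = v_theta(x, t) from noise to a mel spectrogram in a handful of steps, which is where the speed claim comes from.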
Poll: https://poll.horse/25y8gPCy
>>40334691
main thing that is stopping me from making ai content is irl bullshit, like voice models not being able to work with my voice range, or being forced to work with small image batches for ai art since my gpu is old as fuck. also it's difficult to organize myself for larger projects; something small that takes an hour or two is pretty easy to finish, as I can see the progress going on in real time, but with a project that spreads over days or weeks I lose track of what I should actually focus on and what is just fluff that wastes my time.
>>40334691Make shit that actually works for more than the people heading /ppp/ for one. If you've got an install guide that says to do a list of things, and there's actually other requirements that are not listed, you're doing it wrong.
>>40334748
>like voice models not being able to work with my voice range
I think pitch shifting works for most anons, up to a point. Would it help if that were built into the UI?
>being forced to work with small image batches for ai art since my gpu is old as fuck
What if you could queue up a bunch of image generations, then check on them whenever you had time?
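If it helps picture it, the pitch-shift-in-the-UI idea is tiny; a minimal sketch (assuming librosa + soundfile; the filenames and the 4-semitone shift are hypothetical):

# Shift a reference recording toward a character's range before feeding
# it to a voice conversion model. Large shifts tend to sound artifacty.
import librosa
import soundfile as sf

audio, sr = librosa.load("my_voice.wav", sr=None)  # keep the original rate
shifted = librosa.effects.pitch_shift(audio, sr=sr, n_steps=4)  # +4 semitones
sf.write("my_voice_shifted.wav", shifted, sr)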
>>40334782
Is this just for PPP stuff or other AI projects too? On the PPP side, have you seen Hay Say? I think that would give you access to pretty much everything, and I think the installation instructions are complete.
https://github.com/hydrusbeta/hay_say_ui
Do you have the same issue for image generation, if you've tried it?
>>40334806Or for text generation like >>40271923.
>>40334783
>pitch shifting
I know what that is, and no amount of fucking around with its numbers is able to make my voice work. The closest I can get is using a combo of placing reference audio into Talknet and then sovits/rvc, but even then the output is still flat-sounding.
>queue up a bunch
I could, but as I say, with a shitty old gpu I am not able to work on anything while SD is doing its thing for a solid minute per sub-720x720 image (standard 512x512 is always too low res and low quality). Doing a lot of them in a batch means I would need to sit out 20 minutes to get a dozen very shitty drawings while not able to progress on anything else, since all the pc power would be focused on image generation.
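For what it's worth, the queue idea doesn't need much code; a minimal sketch with the diffusers library (the model id and prompts are hypothetical; this is just the shape of it, not a finished tool):

# Queue prompts, kick it off before bed, look at the results later.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.enable_attention_slicing()  # eases VRAM pressure on old cards

queue = [
    "pony, mare, reading a book, detailed",
    "pony, mare, flying over a town at sunset",
]

for i, prompt in enumerate(queue):
    image = pipe(prompt, height=512, width=512).images[0]
    image.save(f"queued_{i:03d}.png")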
>>40334691If by "AI things" you mean content, then more tools. 15 left a gap that still hasn't been filled, AI powered animation has seemingly been all but forgotten about, image gen seems like something is happening (?) but all behind closed doors and Discord servers. Feels like everything has completely stagnated.
>>40334691This >>40334782 too. I know proper UI and packaging is like the "eat your greens" of programming but fuck sake it NEEDS to be done, if you want people to use the tools. I will say Hay Say seems decent on that front from what I've seen and heard, but haven't tried it yet myself.
>>40334898
This. More tools of varying kinds to play with and enjoy using brings about creative momentum. The same old stuff can lead to stagnation, as this anon mentioned. Talknet's TTS for example has been around for years, and even though it's an option there's practically no anons (other than blob's ai ponies) making shitposts with it anymore; at least within this thread anyway. This may be a result of our mare quality standards growing ever higher these days.
>>40334691
More technology scouting and exploration. The more we have to try out, the more we can test and determine how many unexplored options are viable for mare production. Content made thereafter with them is both a source of AI mares, and further documents how good the tech is. Also potentially how compatible it is with existing technologies, were anons to test that too.
Even if the tech doesn't turn out as well as we might hope, more mare is more mare; test tech with potential with the mare and she will try her best.
>>40335913I like how you think
>>40335913more tools, or tutorials on how to better use existing tools is always nice.
From a new user perspective the documentation and OP are likely bloated with a lot of obsolete material. See: "Latest developments" mentioning SortAnon's TalkNet script, or TacoTron2 still having a section in the quick start guide. IMO the best way to deal with this would be an effort to rewrite them periodically.
>>40336730
do you have something in mind, like some kind of digital leaflet made every ~3 months with whatever changes have happened in-between?
>>40336741(Assuming you are talking about replacing the "latest development" section) I think there could be a section in the main doc for history/important developments. (Not the quickstart doc because that should be constrained to stuff a newbie is likely to use NOW--it shouldn't be used for "historical" purposes). The rationale being that the only people who are interested in history/developments are those who are probably already familiar with AI/the project and thus willing to look at a separate document, whereas a newcomer just wants to do stuff. "Last thread" stuff that doesn't really rise to that level of importance (such as new svc models) might belong in a separate post similar to how /create/ does it, but not in the first OP post. We don't really go through threads fast enough to justify anything beyond that. Also timestamping entries like /lmg/ would help so we don't keep multiple year old posts linked.
>>40336730
Other observations:
>The main focus of PPP is no longer just "datasets" anymore and the OP should be updated to reflect that. IMO the first four paragraphs could effectively be replaced by "The Pony Preservation Project is a collaborative effort to harness and improve AI technology for pony-related purposes, such as pony voice synthesis, image generation, and text generation/chatbots."
>Not sure how useful "Active tasks" is since a lot of it is really up to individuals and there are so many potential avenues of research/application now
>FAQ needs to be updated as well? A lot of it is targeted towards TTS and voice synthesis even though the scope of the PPP has expanded beyond that. Also linking to 15.ai when the site hasn't been up for a year and there is virtually no communication with 15.
>The less our OP text and FAQ reflect the actual state of things in the thread the less likely people are to actually read them.
>idk how much willingness there is to change these various aspects of the OP, want to hear other opinions
>>40334806
Primarily: Anything RVC related. Talknet I remember actually working as advertised, including installation. SOVITS was a pain and didn't exactly work as directed during the setup process, but could at least be worked around.
RVC, both in /ppp/'s instruction set and elsewhere, is a fuckin joke. Real bad shit. I'm frustrated and conspiratorially minded enough to think it's a gatekeeping thing, because you've even got RVC sets where the installation process is literally just running a .bat file, but the .bat file doesn't actually install everything that's needed to run, and there's no instruction set for installing said dependencies, which should have been handled by the installer in the first place.
You get a windows .exe or .msi, run it, it does what it's supposed to. If it has some prereq like DX redistributables, BAM, it fuckin tells you to install it and then either launches an installer or takes you to the website to install it.
You got a list of linux commands to set up something like a samba share and configure permissions? Follow a guide and bam, it's done. It's set up. It works.
Anything RVC? Follow the instructions to the letter, doesn't work. Why?
>The creators don't actually want to share with the masses and want to keep the tech to themselves
>The creators are freetarded in the brain and don't realize what steps they've left out in the install process because they don't realize they already performed the missing steps on their system
Logically, it's the latter, but the former stings more.
Are there any UIs outside Hay Say, so I can more easily test whether or not the ROCm whl installs used to get so-vits 4.0 running also work under 5.0 with the python version upped to 3.9? Or am I just stuck working with the lengthy terminal commands for this?
>>40337062
>Anything RVC related
perhaps if you listed what exactly doesn't work for you when installing RVC?
>Follow the instructions to the letter, doesn't work.
nobody will be able to help you if you don't use your words (and hopefully screenshots to go along with them) to explain at what point RVC breaks for you. codefags are not mind readers.
>>40337062
>You get a windows .exe or .msi, run it, it does what it's supposed to. If it has some prereq like DX redistributables, BAM it fuckin tells you to install it and then either launches an installer or takes you to the website to install it.
My reflexive response to this is "I don't want to get people used to downloading random executables linked by 4chan" but I understand that that's how people typically interact with these things.
>>40337727
NTA but individual RVC installs are not explained anywhere in quick start; as a matter of fact RVC is only mentioned once in the quick start guide.
>>40337770
Just to be sure, you did not have a look into the main google doc subsection for RVC?
>not explained anywhere in quick start
That's because the steps for installing it are pretty lengthy, so they are primarily placed in the main google doc. The Quick Start Guide is meant to be "go and see what's on", and the RVC installation step list takes two pages, which breaks the flow of all the other sections that are just quick info on how to get things going.
But thank you for pointing out that the RVC section was not written into the Quick Start Guide, I will see what I can do to fix it up.
>>40337727
Maybe if developers actually did their jobs in the first place there wouldn't be an issue.
https://github.com/IAHispano/Applio-RVC-Fork
I don't keep logs on weeks of on-and-off tests, but here's one recent attempt.
Look at the installation guide. Run a batch file, it installs everything you need, process done. You can run it now! Right? Nope. Actually, it doesn't work, because it starts saying torch isn't present. But wait, torch is present. The files are right there! The installer batch file explicitly installs torch! Why doesn't it work? Fine, let's do a manual inst-wait a minute, you can't do that. This is the entirety of the manual installation instructions. What the fuck is paperspace? How am I installing the application and all its components and dependencies by just going to a directory and typing "make install"?
These are meant to be easy GUIs for the average person to use without relying on cloud servers that can cut you off any time, but it's slapped together by people who forget that someone other than themselves needs to be able to install things on their machines and won't intrinsically have the same environment as the dev from the get-go. I have never encountered another set of programs so obtuse and poorly slapped together, and I've had decades of installing and configuring random shit from the internet. Even building a shitty app from scratch using (pre-chatgpt SEO-spam total search engine death) tutorials from the internet was more intuitive for a non-developer than making this python slop actually function.
>>40338203
This is the exact experience I had with talknet on linux, right down to python dependency hell. All the guides are written for wintoddlers clicking a batch file and booting a version that's not reliant on docker stupidity, and if you want help with rebuilding the image then get ready to get fucked, because you use arch and not (insert random redhat downstream distro here).
>>40338203
I agree, python dependency hell is a special kind of torture, even when someone has a general idea of how to solve it with less-than-helpful stackoverflow answers.
This may be a dumb question, but did you try doing the instructions for RVC in the google main doc? A lot of alternative github projects love to randomly slap different dependencies onto the original source without telling users what they did. The one you posted is not familiar to me, but even its description states that it's an edit of "Mangio-RVC-Fork", so it's a modification of another modification of the original github RVC source.
>>40337062It's not a conspiracy. I can tell you from having distributed software here that it's very difficult to make sure you've captured all dependencies, especially when you don't have cloud resources or a spare computer with the necessary hardware. The problem is that people develop everything on their desktop, which means a lot of packages have already been installed for the developer from earlier projects. It's easy to miss those when creating an installer or writing the setup instructions. And that's just for the hardware-agnostic software. Anything that depends on available hardware is basically impossible to get right without cloud resources for testing a bunch of environments.
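One cheap mitigation: snapshot the dev machine's exact package versions so the installer can't silently depend on stuff the dev already had installed. A minimal sketch, equivalent in spirit to pip freeze (the lock filename is arbitrary):

# Dump every installed distribution as an exact pin.
from importlib.metadata import distributions

pins = sorted(
    f'{d.metadata["Name"]}=={d.version}'
    for d in distributions()
    if d.metadata["Name"]
)
with open("requirements.lock.txt", "w") as f:
    f.write("\n".join(pins) + "\n")
# Users then install with: pip install -r requirements.lock.txt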
>>40338203Is there a reason that Hay Say wouldn't work for you? I could put more work into the standalone GUI for RVC but I'm prepping for mare fair
>>40338233Why would arch have trouble building a docker image? Why would you even need to rebuild a docker image? The dockerfile standardizes all interactions with the host OS, and the image should run the same way on all operating systems, so it should work regardless of where it's written or used.Docker would be a great solution to installation hell if it weren't for gpu driver issues. It's probably still the best we're going to get. Hay Say seems to do it really well, though I don't know how robust it'll be when you try to enable the GPU with, e.g., mismatches between the container CUDA version and the host driver version. The last time I tried to use a cuda 11 container with an nvidia 12 host driver, the gpu just froze until I restarted. If it weren't for that, docker would make it easy to install, run, and uninstall any project.
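A quick way to catch that container/driver mismatch before it eats your GPU: run a tiny torch check inside the container first (standard torch calls, nothing project-specific):

# Fail fast on CUDA mismatches instead of freezing mid-generation.
import torch

print("torch built for CUDA:", torch.version.cuda)  # e.g. 11.8
print("GPU visible:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
    torch.ones(1, device="cuda")  # tiny allocation exercises the driver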
>>40337246Technically yes but not very documented https://github.com/effusiveperiscope/so-vits-svc/tree/5.0
>>40334691Poll for solutions: https://poll.horse/LpKD4Bte
>Page 9
Dall-e 3 soon
https://www.youtube.com/watch?v=BAfOGBojiEU
>>40340618
Why do they pick the guys with the weirdest accents all the time?
>>40340698>pickWhat do you mean? That's just a random youtuber.
>>40340618
>I was also happy to see that it will not create an image in the style of a living artist.
In other words, more censored models.
>>40340806Yes, but the model will also trickle down to the actually good stuff.
>>40340792
I mean that. Or is every single AI guy some east european?
>>40340257
What if there are tiny mares hiding in the computer?
>>40343017Then you can say your computer has the horsies.
I'm surprised this hasn't been used much for lewd stuff.
>>40341583
>>40340806People will figure out how to use it to generate datasets for open-source models, which will get tuned for uncensored models.
>>40343907
>Page 10
aaah muh gpus
>>40346143dear lord
Bros why are local LLMs still so retarded?
>>40346706
The ones with the resources to develop LLMs are mostly self-interested big corporations who want to run you through their servers.
I've no doubt there's potential that's not being pursued.
talking about text models, does anyone know if there are any that can work offline AND produce a script-kiddie level of programming? As far as I can see, everything that works (to a questionable degree) is forced to run on someone else's server/service.
>>40346793VRAM?
bumpu
>>40348582
Right now it's 8gb. I'm planning on saving up for a 16gb card (and yeah, I am aware the best option would be to go for the largest 24gb card, but my current situation doesn't really allow for that unless I wait for yet another next gen and grab something from the rtx 20s pile).
audioSR is fixed on local so i finally can share this
https://github.com/haoheliu/versatile_audio_super_resolution
it can upscale any khz audio up to 48 kHz. sample and demo here:
https://audioldm.github.io/audiosr/
https://replicate.com/nateraw/audio-super-resolution
>>40349352
>upscaling audio
So this is actually a thing now? I remember thinking about this concept a while back and looking it up, only to find nothing.
>https://audioldm.github.io/audiosr/
What the fuck? This is actually really impressive based off of the given examples.
How much is required for this to run locally? I don't see anything about that.
>>40349377Nevermind, I just tried out the demo and it did pretty poorly. But that might be because I used a 2+ minute soundtrack that wasn't shit quality (just meh quality) and I left it at default settings.
>>40349386
for now it is better to cut the audio down to below 10 sec for best results. if you want to process a 2 min soundtrack, you can try to merge the results later
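If anyone wants to script that, a minimal chop-and-stitch sketch (assuming pydub, which needs ffmpeg installed; the upscaled chunk filenames are hypothetical stand-ins for whatever audioSR writes out):

# Split a long track into <10 s chunks, upscale each separately, remerge.
from pydub import AudioSegment

CHUNK_MS = 10_000  # stay under ~10 s per the advice above

track = AudioSegment.from_file("soundtrack.mp3")
chunks = [track[i:i + CHUNK_MS] for i in range(0, len(track), CHUNK_MS)]
for n, chunk in enumerate(chunks):
    chunk.export(f"chunk_{n:03d}.wav", format="wav")

# ...run audioSR on each chunk here, then stitch the results back together.
upscaled = [AudioSegment.from_file(f"chunk_{n:03d}_sr.wav") for n in range(len(chunks))]
merged = sum(upscaled[1:], upscaled[0])
merged.export("soundtrack_48k.wav", format="wav")

Hard cuts can click at the seams; overlapping the chunks slightly or using a short crossfade when appending would smooth that out.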
>>40349448That sounds like it'll be extremely tedious. I'm getting some pleasantly surprising results from short sound bites at least.
>>40349352I am definitely noticing that it just botches anything that already has enough quality. I'm assuming you have to fiddle with the settings so it's not as intense.
>>40349352
It sure spits out some really strange errors. What the hay does "Audio buffer is not finite everywhere" even mean?
Perhaps this janky code sometimes glitches the machine/simulation to provide an infinite audio buffer. Sounds useful.
Maybe we could use a similar method to give our AI mares infinite memory to train on... and all the digital oats she could ever want.
>>40349529
https://youtu.be/wtLft3wb0zk
>>40349352
>Tests audioSR as an audio enhancer for Talknet TTS
>Fetches existing output of Celestia
>Downsamples audio to 8khz
>Feeds 8khz Celestia into audioSR
>Celestia speaks like she has shards of broken glass in her mouth
[Original/Ground-Truth] https://files.catbox.moe/sdmexo.mp3
[8khz downsample] https://files.catbox.moe/hmid7w.mp3
[audioSR result] https://files.catbox.moe/vdujww.wav
>>40349352
The obvious use case is trying to upsample so-vits-svc 5.0 generated audio. Not really impressed by the result though.
base: https://files.catbox.moe/vlxnpt.mp3
predicted: https://files.catbox.moe/oiqg9w.mp3
>>40349377
>>40349637
I think TalkNet already did something like this via HiFiGAN? But it takes a highpassed version of the new signal and mixes it with the original.
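For anons who want to try that mixing trick by hand, a rough sketch (assuming scipy + soundfile, with both files already at the same rate and length; the 4 kHz crossover is a guess, not whatever TalkNet actually uses):

# Keep the original's lows, add only the upsampled signal's highs.
import soundfile as sf
from scipy.signal import butter, sosfilt

orig, sr = sf.read("original_48k.wav")
upsampled, _ = sf.read("upsampled_48k.wav")

# 4th-order Butterworth highpass; axis=0 handles mono and stereo alike
sos = butter(4, 4000, btype="highpass", fs=sr, output="sos")
highs = sosfilt(sos, upsampled, axis=0)

n = min(len(orig), len(highs))
sf.write("mixed.wav", orig[:n] + highs[:n], sr)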
>>40349352
>Og: https://files.catbox.moe/lykk9t.mp3
Upscale: https://files.catbox.moe/c8bsv7.mp3
>Og2: https://files.catbox.moe/z54y15.mp3
Upscale2: https://files.catbox.moe/f8j4fr.mp3
I can see the use of this tech in the form of taking low quality training audio and turning it into slightly better quality audio, BUT as others above pointed out, for some reason it seems to randomly add an extra shimmery sound (I've heard it a few times in the musical sample outputs).
>>40348898WizardCoder or Phind v2? 7B GPU only or 13B GGUF (CPU+GPU)
>>40350021nvm I just realized phind only has 34bs lol
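For the 8gb anon upthread: GGUF lets you split a 13B between GPU and CPU. A minimal sketch with llama-cpp-python (the model filename and layer count are hypothetical; tune n_gpu_layers until your VRAM is full):

# Partial GPU offload: 0 = pure CPU, -1 = everything on GPU.
from llama_cpp import Llama

llm = Llama(
    model_path="wizardcoder-13b.Q4_K_M.gguf",
    n_gpu_layers=30,
    n_ctx=4096,
)

out = llm("Write a Python function that reverses a string.", max_tokens=256)
print(out["choices"][0]["text"])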
Live now. Finishing the last few things. Final stream before Mare Fair.
cytu *dot* be/r/PonyPreservationProject
>>40350963I'll do you one better!
>>40351654Thanks, kind AI mare.
How are you doing?
>>40352820irl is keeping me too busy, I just wish I could make more mare content.
>>40352820I am still too much of a lazy faggot to do anything.One day maybe...
>>40255207
hey guys, I never visited these threads before, but I need two ai models for something and I don't know what to do. none of the derpy.me links in the OP are working because of an internal server error, can someone help out?
>>40354345
Hmmm. Indeed, it seems that derpy.me is down.
What kind of ai models do you seek? Text generation, speech generation, or image generation? And are there specific characters you are looking for?
i wonder if any pony could reupload all the old tacotron 2 models. https://docs.google.com/document/d/17VAnMQI4NJzu7UXZALs14AFvhpw8wvbLdA9HrA2xLus/edit the doc has them all crossed out, and i need them for the speech synthesizer / tkinter gui tool. the tool also needs to be fixed for the current cuda toolkit, thats why it has stopped launching. please, if anyone here can fix these issues YYHTTA
I'm trying to install so-vits-svc-4.0 and ran into an error doing the installation ("python setup.py egg_info did not run successfully").
Windows 10 x64, installed Python 3.11 (64-bit)
Already tried to do:
-m pip install --upgrade pip
-m pip install --upgrade setuptools
But it didn't do anything. Any advice? I'm confused.
>>40355305
>RuntimeError: uvloop does not support windows at the moment
Probably due to this error. You're probably trying to get it from the original github page, which is linux-only. Windows users are using forks of it that work on Windows. Try this instead: https://github.com/effusiveperiscope/so-vits-svc/releases
>>40355354
Tried it, exactly the same error, no result.
A couple months ago I installed so-vits-svc-4 from the same link on my weak laptop, and it installed without any problems.
I checked Python and MVS on my laptop, it's exactly the same version and stuff.
>>40355861Bump
>>40355305>>40355433The problem seems to be coming from gradio (inexplicably). The dependency graph is too convoluted for me to want to find out exactly what's wrong. Bumped the requirement. Repull and try again.
>>40354528
I need sweetie belle and maud pie text to speech, and also a guide on how to use them
>>40333749
https://github.com/shivammehta25/Matcha-TTS/issues/9
author wrote instructions for training with a custom dataset
>>40356028She looks like Derpy as a new pony.
>>40356377
The only Text to Speech application with both of those models right now is Controllable TalkNet. The older Tacotron applications and DeltaVox RS have models for Sweetie Belle but not Maud (there used to be a Tacotron one for Maud but the link for it has died).
You have a few options for using Controllable TalkNet.
1. You can install it locally, and there's a convenient script to make it easy if you are running Windows. A video tutorial can be found here: https://www.youtube.com/watch?v=0YtGqPzcgdQ and the source code can be found here: https://github.com/SortAnon/ControllableTalkNet
2. You can install Hay Say, which comes with Controllable Talknet as well as a few other "voice conversion" (i.e. speech-to-speech) models. Beware that Controllable TalkNet won't work on it if you are using MacOS. It is pretty straight-forward to install, and instructions for installation can be found here: https://github.com/hydrusbeta/hay_say_ui
3. There is a Google Colab script you can run from your browser: https://colab.research.google.com/drive/1sAbqSQj9P56TTpsU7bzbobzAxmydvUSA
By the way, local installations of Controllable TalkNet used to require a GPU, but it can now run on CPU if you don't have a CUDA-capable GPU.
Although derpy.me is returning an internal server error, the main doc is still available: https://docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit
>>40357408Thanks. I will see if I can get this to work
Double feature this time, RD sings Can't Stop by RHCP and Fluttershy sings the original Japanese version of Renai Circulation!
https://youtu.be/WsOHvSsPxrc
https://youtu.be/g-FpFVRsGUw
So... I kinda fell into the trap of not getting up to date on things and I trained a Trixie RVC V1 model. I'll try an RVC V2 model soon.
(mild nsfw warning)
https://files.catbox.moe/w1gzur.mp3
>>40359235
Looks like the Mare Fair has really affected a lot of threads. It feels like half the board is missing in action.
>>40354827I'm working on a new tool for tacotron 2 + hifigan and I'm going to train some new models for them
>>40359990The board seems slower too.
thank celestia / hoofness, this is as close to 15.ai as we will get. good stuff :D
>>40360576
>>40360163
P.S. Forgot to tag.
also, will you still be able to use the old models too?
>>40361352+1
>>40334691
Up-to-date Colab notebooks.
Simple technical explanations/visualizations/diagrams (not just "how to make ponies talk", more "HOW to make ponies TALK": what is the function doing?)
>>40360577Yes, the old 22 khz models work as well
>>40363673
how about the ones that were higher in khz?
>>40362362-2
>>40365134
>>40364580If I can get them to work, yes
Up. The oats have spoken.
Free Hugs is now live on PonyTube and YouTube. Pinkie receives some very disturbing news regarding the cost of hugs, and takes drastic action to prevent the total collapse of Ponyville's hug economy!
https://pony.tube/w/mvc9HMTY4JGDyTp1nPN61N
https://youtu.be/czoZWXhV2oU
>>40357408
>>40357460
Here's Tacotron Maud:
https://mega.nz/file/Ue5DnaxB#OuHpJeE1QSGrr52zo1_AQAHqW7mWBu9DYKW_a9BaToE
Looks like Flutteranon's models have been deleted. The only other one I have is Derpy:
https://mega.nz/file/MGhQUBBR#rrH4gvUgi4q1AoPyQR2RofGLgvEy3mEmWlCTGve8eyQ
Any others missing?
>>40366030
>>40366030Panel was great
>>40366030I want free pony hugs.
>>40366030Amazing. My only feedback is that the pacing is a bit slow sometimes.
>Quick start guide:
derpy.me/FDnSk
The derpy link is giving me errors, could anyone post the full Google link to the quick start guide?
>>40366030IM SO FUCKING EXCITED IM SO FUCKING EXCITED YEEAAAHAHHHHHHHHLBVHEFAHBVFEBEGAF:FE:asdfg hjkpl;'zgss 4zx c vh nb ml,./zXDDDDDDDDDDDDDDDDDDDDD
>>40366030my favorite part is definitely when pinkie presses her fat ponut up against the screen and smears it back and forth for 4 minutes and 28 seconds
>>40366030
This has already led to new-episode withdrawal. I need more! I can't go back to rewatching what I've already seen. Please, I need my fix!
>>40366030It's great! But I get the impression that the "hug voiceovers" weren't supposed to be said aloud, but they were an expression of the characters' thoughts, so I wish you could have put some reverb on them.
>>40366030
This is absolutely adorable. The plot feels like something out of Winnie the Pooh in the best way possible with how silly and heartwarming it is; got me smiling almost the whole time.
>>40366138
ooooh, have you got your tool working? where's the link to your tool?
>10
>>40368934
>>40367995
It was TKinterAnon’s 2.0 tool, you can get it from here:
https://docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit#heading=h.b686z8pjsf0z
Get the CUDA version linked there as well.
>>40369794
Apparently Bing just dropped Dall-e 3 for their image generator ( https://www.bing.com/images/create )
It's not half bad, can actually spell this time, and has pretty good contextualization and character recognition.
>Pinkie Pie pony squinting while reading a book that says "Fighting anti-hug tyranny 101"
To say I'm impressed with it would be an understatement.
>>40371796Damn, that twiggy on the computer is something I'd actually use if the wires were fixed.
>>40371837
AI's gone from soulless portraits to images that can somewhat tell a story/have more than just the face value meaning. And I didn't even need to prompt it in a weird way.
"Twilight Sparkle pony reaction image, she's at a desk looking at a computer screen and acting very confused."
glimpses of artificial snowpity
>>40370506
i am on the latest cuda at the moment. is this supported, or do i need to downgrade my cuda toolkit?
>>40373305damn, i quit all attempts trying to learn how to draw now
>>40371390
>>40371796
This is actually passable output, compared to other options that have long since degraded from "cool new tech" to "this shit is actually terrible, stop posting it you faggot". Still fucked in many ways, with that mouse being backward, wires being wrong, the keyboard having shadows from a different light source than the rest of the image, etc.
The real question is, will this be doable with locally runnable tech, or are we stuck hoping some corpo cloud service doesn't decide to lobotomize the output?
>>40370506
Is there a Colab for this somewhere, for those of us with no GPU to use?
>>40370506
that one also doesn't work. it spits out nonsense and is super quiet. sometimes it even freezes and does nothing for me, even tho i am on a gpu and on the latest cuda
>>40366030never gets old
>>40375049
I just installed the CUDA version it mentioned and it worked just fine:
https://files.catbox.moe/uyocxi.ogg
I also had this cheat sheet for when I used 22KHz voices to make them sound better with Audacity:
>Set Project Rate to 48KHz (bottom left corner).
>Set the audio track's rate to 48KHz (little arrow next to the X, then go to Rate).
>Slow it down by 45.9% (Effects, Change Speed, change speed multiplier to 0.459).
>Change the bass and treble by 3db and 20db respectively (Effects, Bass and Treble).
>Maybe lower volume by -6db?
>>40374686
From what I read in the archive, you can't use Colab for these models anymore, at least not without some reworking. Try DeltaVox RS, it's in the main doc. It doesn't use the GPU.
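Worth noting the Audacity recipe nets out to an ordinary resample: 22050 / 48000 = 0.459, which is where the 45.9% figure comes from. So the non-EQ half can be scripted in a couple of lines (a sketch assuming librosa + soundfile; the bass/treble shelving is left to your EQ of choice):

# Resample a 22 kHz voice clip to 48 kHz in one step.
import librosa
import soundfile as sf

audio, _ = librosa.load("talknet_out_22k.wav", sr=48000)  # resamples on load
sf.write("talknet_out_48k.wav", audio, 48000)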
>>40364580Probably not anymore, because I was able to train smaller models and make them sound better. I also removed tensorflow and cuda installation jank.
>>40375622
you made a modified, fixed version of the tool? cause when i click that floppy disk and sound icon, it sounds like rarity and is really quiet and speaks gibberish
>>40376594
when i load the 22 khz engine:
This alias will not be present in Numba version 0.50.0.
  from numba.decorators import jit as optional_jit
M:\FULL 2.0\winpython\python-3.7.7.amd64\lib\site-packages\librosa\util\decorators.py:9: NumbaDeprecationWarning: An import was requested from a module that has moved location. Import of 'jit' requested from: 'numba.decorators', please update to use 'numba.core.decorators' or pin to Numba version 0.48.0. This alias will not be present in Numba version 0.50.0.
  from numba.decorators import jit as optional_jit
M:\FULL 2.0\winpython\python-3.7.7.amd64\lib\site-packages\torch\serialization.py:657: SourceChangeWarning: source code of class 'glow.WaveGlow' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
M:\FULL 2.0\winpython\python-3.7.7.amd64\lib\site-packages\torch\serialization.py:657: SourceChangeWarning: source code of class 'torch.nn.modules.conv.ConvTranspose1d' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
M:\FULL 2.0\winpython\python-3.7.7.amd64\lib\site-packages\torch\serialization.py:657: SourceChangeWarning: source code of class 'torch.nn.modules.container.ModuleList' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
M:\FULL 2.0\winpython\python-3.7.7.amd64\lib\site-packages\torch\serialization.py:657: SourceChangeWarning: source code of class 'torch.nn.modules.conv.Conv1d' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
>>40376628
screenshot 2: when i try to generate, it does this and freezes and does nothing :(
>>40376628
>>40376622
% loading tacotron2 22 It might take two minutes. -----------------
% loaded
2023-10-03 14:23:18.474064: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
M:\FULL 2.0\winpython\python-3.7.7.amd64\lib\site-packages\librosa\util\decorators.py:9: NumbaDeprecationWarning: An import was requested from a module that has moved location. Import requested from: 'numba.decorators', please update to use 'numba.core.decorators' or pin to Numba version 0.48.0. This alias will not be present in Numba version 0.50.0.
  from numba.decorators import jit as optional_jit
M:\FULL 2.0\winpython\python-3.7.7.amd64\lib\site-packages\librosa\util\decorators.py:9: NumbaDeprecationWarning: An import was requested from a module that has moved location. Import of 'jit' requested from: 'numba.decorators', please update to use 'numba.core.decorators' or pin to Numba version 0.48.0. This alias will not be present in Numba version 0.50.0.
  from numba.decorators import jit as optional_jit
M:\FULL 2.0\winpython\python-3.7.7.amd64\lib\site-packages\torch\serialization.py:657: SourceChangeWarning: source code of class 'glow.WaveGlow' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
M:\FULL 2.0\winpython\python-3.7.7.amd64\lib\site-packages\torch\serialization.py:657: SourceChangeWarning: source code of class 'torch.nn.modules.conv.ConvTranspose1d' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
M:\FULL 2.0\winpython\python-3.7.7.amd64\lib\site-packages\torch\serialization.py:657: SourceChangeWarning: source code of class 'torch.nn.modules.container.ModuleList' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
M:\FULL 2.0\winpython\python-3.7.7.amd64\lib\site-packages\torch\serialization.py:657: SourceChangeWarning: source code of class 'torch.nn.modules.conv.Conv1d' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
loaded modules ['22']
generate function debug:
text: hello i am maud the bestest rock pony around.
output: results/latest.wav play: 0
params: ('0.75', '0.01', 0) eng 22 char models/Maud_429_mp_neutral__22K.tvm
Validate sigma, denoise: True True
<22 khz Warnings after i clicked generate
>>40366030
https://files.catbox.moe/6ucbci.mp3
Damn, let's download it for later... Thank you Clipper!
>>40304894
I don't even watch this show or listen to paramore
why did this hit me like a truckload of bricks
>>40366138
I have this from 2019-2020. And the "TKSynthetizer" still works, even if, quality-wise, it's not terrible.
Need me to reupload some? (will take a while, I am away from my main for 3 weeks...)
Adagio Dazzle_440_ad_neutral_0.0_MMI.tvm
Applejack_202431_aj_neutral_0.162_MMI.tvm
Apple_Bloom_59862_ab_neutral_0.203_MMI.203.tvm
Big Macintosh_12650_bm_neutral_0.024_22K.tvm
Big Macintosh_408945_bm_neutral_0.095_MMI_Applejack270kBase.tvm
Cadence_420000_pc_neutral_0.276_MMI_Twi300kBase.tvm
Celestia_17162_pc_neutral__22K_Epoch 329.tvm
Celestia_373600_pc_neutral_0.174_MMI_300kTwiBase.tvm
Chrysalis_63750_qc_neutral_0.167_MMI_NewDataset.tvm
Coco Pommel_205_cp_neutral_MMI.tvm
Colgate Minuette_179_cm_neutral_MMI.tvm
Cozy Glow_6048_cg_neutral_22K.tvm
Dan Jerk__dj_neutral_MMI_v3.tvm
Derpy__dh_neutral__MMI.tvm
Discord_159027_di_neutral_0.192_MMI_Audiobooks.tvm
Fluttershy_280496_fs_neutral_0.103_MMI.tvm
fsnew_checkpoint_75000
Gallus_1800_gl_neutral_0.179_22K.tvm
Granny Smith_313777_gs_neutral_0.252_MMI_Applejack270kBase.tvm
Lightning Dust_225_ld_neutral_22K.tvm
Limestone_252_ls_neutral_22K.tvm
Luna_362120_pl_neutral_0.141_MMI_300kTwiBase.tvm
Luna_57156_pl_neutral_0.100_MMI.tvm
Lyra__lh_neutral__MMI.tvm
Maud_429_mp_neutral__22K.tvm
Mayormare__mm_neutral__MMI.tvm
Meadowbrook__mb_neutral__MMI.tvm
Mrs Cake__mc_neutral__2K.tvm
mudbriar__mb_neutral__22K.tvm
Nightmare Moon_657187_nm_neutral_0.693_MMI_362kLunaBase_300kTwiBase.tvm
Ocellus_1096_oc_neutral_0.123_22K.tvm
Pinkie Pie_211799_pp_neutral_0.190_MMI.tvm
Pipsqueak_191194_ps_neutral_0.188_MMI_100kTwiBase.tvm
Rainbow Dash_301813_rd_neutral_0.174_MMI.tvm
Rarity_101090_ra_neutral__MMI.tvm
Rarity_222822_ra_neutral_0.164_MMI.tvm
Sandbar_1688_sb_neutral_0.094570_22K_561 Validation loss.tvm
Scootaloo_421210_sc_neutral_0.170_MMI_300kRainbowBase.tvm
Shining Armor_55182_sa_neutral_22K.tvm
Smolder_1442_sm_neutral_0.140600_22K_239.tvm
Spike_201752_sp_neutral_0.168_MMI.tvm
Spitfire_646_sf_neutral__22K.tvm
Starlight Glimmer_37891_sg_neutral_0.1366_MMI_518, 84.000148.tvm
Sunset Shimmer_15003_ss_neutral_22K.tvm
Sweetie Belle_427784_sb_neutral_0.153_MMI_300kTwiBase.tvm
tf2_announcer_dataset_3202020.tvm
Thorax_11226_tx_neutral_0.053_22K.tvm
Tirek_12402_lt_neutral_0.337_22K.tvm
Trixie_21104_tr_neutral__MMI.tvm
Trixie_30256_tr_neutral_0.059_22K.tvm
Twilight Sparkle_30813_ts_depressed_0.187_22K_Whining-Fear-Anxious-Sad-Confused-Tired-Exhausted.tvm
Twilight Sparkle_31129_ts_cheerful_0.187_22K_Neutral-Happy-Amused-Love-Smug.tvm
Twilight Sparkle_31444_ts_angry_0.186_22K_Disgust-Annoyed-Serious-Shouting-Angry.tvm
Twilight Sparkle_317771_ts_neutral_MMI_48mmi.tvm
Vaportrail__vt_normal_MMI.tvm
Yona_1120_yn_neutral_0.165627_22K_373_NoisyBase.tvm
Zecora_370016_zc_neutral_0.132_MMI_300kTwiBase.tvm
Zecora_8977_zc_neutral_0.058_22K.tvm
I promised an anon I would find this segment in the PPP panel from this year's /mlp/con. I really liked the lipsync at the start.>https://pony.tube/w/fVZShksjBbu6uT51DtvWWz?playlistPosition=9&start=1h36m44s
>>40376805
for some reason both tools stopped working on my pc. i feel the 1.0 works better than the 2.0 one. i updated it with the 1.1 patch and fix, but now it's gone broken as well. is there a way to fix and solve this? tell me if i need to send more on what's going on with it
>>40377129Did you install Cuda 10? Have you tried other models?
>>40377967
so i specifically need cuda 10 only? what happens if that doesn't change a thing and breaks my other ai tools?
>>40378006Cuda installs each version in a separate folder.
>>40378062
i installed cuda 10.0, still no change. it could also be either my numba or tensorflow version.
  warnings.warn(msg, SourceChangeWarning)
R:\Tk Synthesis 2.0\winpython\python-3.7.7.amd64\lib\site-packages\torch\serialization.py:657: SourceChangeWarning: source code of class 'torch.nn.modules.conv.Conv1d' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
entry not selected
Exception in Tkinter callback
Traceback (most recent call last):
  File "R:\Tk Synthesis 2.0\winpython\python-3.7.7.amd64\lib\tkinter\__init__.py", line 1705, in __call__
    return self.func(*args)
  File "R:\Tk Synthesis 2.0\menu.py", line 195, in genall
    self.app.genall_continue(True)
  File "gui_2_class.py", line 258, in genall_continue
    self.action_generate_advanced(0)
  File "gui_2_class.py", line 398, in action_generate_advanced
    self.entry = self.pro_data[self.data_index]
AttributeError: 'App1' object has no attribute 'data_index'
Exception in Tkinter callback
Traceback (most recent call last):
  File "R:\Tk Synthesis 2.0\winpython\python-3.7.7.amd64\lib\tkinter\__init__.py", line 1705, in __call__
    return self.func(*args)
  File "R:\Tk Synthesis 2.0\advanced_ui.py", line 137, in clear_fx_action
    self.clipscopy = self.app.pro_data[self.currently_editing_index]['clips']
IndexError: list index out of range
>>40378127I remember either the doc or a readme in the tool's folder mentioning you should run a cmd file to fix some issues, do you see anything? I haven't used this in a while.
bumo
>>40378127>>40378314The main doc does mention a possible fix, picrel.
>>40366030Is the video supposed to be set to Youtube Kids?
>>40379934YouTube does that with MLP, automatically sets stuff that it deems is designed for kids to watch. I don't know if you can set the flag back once youtube decides it so? Let's hope Clipper can.
Does anyone know where i can get svg or fla files of mlp backgrounds? i had a whole folder but i had to delete them because they got corrupted.
>>40379576
i tried that, it still didn't work after that
>>40380207lolnope. jewtube hates their userbase, and the only surefire solution to not getting marked for kids is to include cursewords or a casual "nigger" somewhere, but that demonetizes the video as well
>>40380207
>I don't know if you can set the flag back once youtube decides it so? Let's hope Clipper can.
PinkiePieSwear got his comments back after swearing all over the title and the description, but he can't unblur the video. So don't ever try to do anything like that to protest.
>>40376800You already know why. You just can't admit it.
>>40380342
SVGs: https://drive.google.com/drive/folders/1WHI3ROYGbgITQO_riA1wnYZ1I0uQP4sc?usp=sharing
- It should be in the "Backgrounds - composed" and "Backgrounds - decomposed" 7z files.
FLAs: https://drive.google.com/drive/folders/1gKVRwSGZsz_j0nIkIm0fQKGWaqQGg5lP?usp=drive_link
>>40377347
I'm considering replacing my (hard-to-use, impossible-to-maintain) IPFS files with torrents. I'm discussing it on /pag/ while I set up torrents for the /mlp/ archive. Here's a summary of what I'm thinking:
https://ponepaste.org/9358
What's the healthy amount of self doubt and rewrites you should have without going full retard?