TwAIlight welcomes you to the Pony Voice Preservation Project!
https://clyp.it/tm03e5en

This project is the first part of the "Pony Preservation Project" dealing with the voice. It's dedicated to saving our beloved ponies' voices by creating a neural network based Text To Speech for our favorite ponies.
Videos such as youtu.be/GuJKTodX1FA or youtu.be/DWK_iYBl8cA have proven that we now have the technology to generate convincing voices using machine learning algorithms "trained" on nothing but clean audio clips. With roughly 10 seasons (9 seasons and 5 movies) worth of voice lines available, we have more than enough material to apply this tech for our deviant needs.

Any anon is free to join, and many are already contributing. Just read the guide to learn how you can help bring on the wAIfu revolution. Whatever your technical level, you can help.
Document: https://docs.google.com/document/d/1xe1Clvdg6EFFDtIkkFwT-NPLRDPvkV4G675SUKjxVRU

We now have a working TwAIlight that any Anon can play with:
https://15.ai/
https://derp.link/vCzm2 (48KHz Training)
https://derp.link/hdJQF (48KHz Synthesis)
https://derp.link/NR7Xi (Ngrok Synthesis)
https://derp.link/YTJ94 (Guide)

>Active Tasks
Cookie is working on controllable speech
Research into animation AI
Research into pony image generation

>Latest Developments
Clipper sorts animation files (derp.link/O24pp)
Clipper looking for AI skit ideas (derp.link/JfVsA)
Clipper collecting sound effects from show (>>36723767)
New DeltaVox (>>36812261)
Training notebook for HiFi-GAN (>>36874641)
New guides and notebooks for training/exporting models for DeltaVox RS (>>36898031)
Clipper voice dataset (>>36901235)
Clipper added to the HiFi-GAN notebook (>>36905521)
Train your own CLIP model (>>36930047)
GPT-2 model released (>>36930714)
Animation script finished (>>37003821)
New audacity to TacoTron training text tool (>>37025693)
15 makes updates to test site
TalkNet as a potential replacement for TacoTron (>>37040781)
TalkNet update (>>37082982 >>37119597)
Public contributions reintroduced for next year's panel (>>37099451)
Singing Talknet models (>>37134971 >>37144858)
Animate automation tool available (>>37147092)
DiffSVC UI done (>>37150296)
GDrive clone of Master File now available (>>37159549)
New TalkNet models (>>37179832)
Better copy of show bible available in doc (>>37246652)
FiMFic based GPT-J-6B demo notebook (>>37284129)
Latest Synthbot progress report (>>37241505 >>37251301 >>37253865)
Latest Cookie progress report (>>37241115)
Latest Clipper progress report (>>37189422 >>37193768)

>Voice samples
https://derp.link/fHs3K
https://derp.link/O1xdh

>Clipper Anon's Master File 2.0:
https://mega.nz/#F!L952DI4Q!nibaVrvxbwgCgXMlPHVnVw
https://mega.nz/folder/0UhSmYAB#WBrB-qCprQTofkAhwMp5CQ

>Synthbot's Torrent Resources
https://derp.link/ZJNca

>Cool, where is the discord/forum/whatever unifying place for this project!?
You're looking at it.

Last Thread:
>>37240950
FAQs:
>READ THE DOC
Do it now
derp.link/V7cMp

>Where can I find things made with the voice AI?
In the Good Poni Content folder: derp.link/23EUs

>Did you know that such and such voiced this other thing?
Yes. We are very much aware. It is best to keep to official audio only unless there is very little of it available. If you know of a good source of audio for characters with few (or just fewer) lines, please post it in the thread. 5.1 is generally required unless you have a source already clean of background noise. Preferably post a sample or link. The easier you make it, the more likely it will be done.

>What about fan imitations of official voices?
No.

>How do I make the voices?
Several guides are available. In-depth guides on how to do training and synthesis (making the ponies speak) are in the doc. If you don't want to use the navigation bar in the doc, the sections are also directly linked in the OP. If you want to use the WiP 48KHz notebook, some kind Anons have put together some image guides for you.
48KHz Training: derp.link/wW2hX
48KHz Synthesis: derp.link/j4MXQ

>How do I make the ngrok links?
Doc: derp.link/SfIhY
Video: derp.link/qYgIp

>Where are all the voice samples?
In the doc.

>Is there a place I can find all the pony models?
In the doc.

>What about muh waifu?
Check the doc.

>Will you guys be doing a [insert language here] version of the AI?
Probably not, but you're welcome to. You can however get most of the way there by using phonetic transcriptions of other languages.

>What about [insert OC here]'s voice?
Not a priority. Again, however, you're welcome to. There are already people doing this.

>Where can I view the PPP /mlp/con panel?
>2020:
YouTube: youtu.be/WtuKBm67YkI
CyTube chat: pony.tube/videos/watch/b83fbbfc-6d4e-4768-8deb-edb61ea38abb
>2021:
YouTube: youtu.be/RAYWr1uOGVM
CyTube chat: pony.tube/videos/watch/56cf0502-0ef8-41a7-96c5-bd7cb727bb9f

>I have an idea!
Great. Post it in the thread and we'll discuss it.

>Do you have a Code of Conduct?
Of course: 15.ai/code

>Is this project open source? Who is in charge of this?
derp.link/CQ3Ca
i haven't seen 15 in a bit, i miss them already
>>37286879He posted not even 3 hours ago.
>>37241115
>Cookie 11 days ago
Damn I'm slow.
Anyway, I've:
- written DiffWave
- written HiFi-GAN
- written Fre-GAN
- written new universal BOHB modules so every model and every line of every config file can be "tuned" automagically to improve results
- written Unet, DilatedWN and FFT versions of CTC models (conformer version soon too)
- trained the new HiFi-GAN to 842k steps on a single GPU
- tuned a few DiffWave + CTC model variants
- moved my data to faster storage
- reduced RAM usage on dataloaders (letting me spin more up at a time to help with small model training)

In fact, reading
>>37241115
> Some main things I expecting to fix;
it looks like I've completed almost everything I said I would do.
Next up is training all the old baselines and comparing them against new models/modules programmatically, and crushing all the bugs I can find.

>>37241494
>What's DiffPitch? A diffusion version of FastPitch?
Diffusion version of DiffSVC.
PPG + Noisy F0 -> Denoised F0 (for ~10 steps)
It's a small network that works in conjunction with the mean shifting pitch preprocess to ensure all pitch values for the speaker are within their speaking range (using PPG to take into account the phoneme being spoken simultaneously).
>>37286975Oh, using diffusion to clean up F0, that's clever
>>37286884It's like he's still here, with us, even now. I swear sometimes I can still even see his posts...
>>37286879>them15 is a team of people? didn't know about that
>>37286975>universal BOHB modulesWhat is this?
>>37287144Here's your (You).
>>37287224
It is a parameter tuning algorithm that is slightly faster than the normal ones, combined with my ~600 lines of config file.
I can add any line of the config to a tuning file (along with appropriate limits) and it'll run many, many training runs and slowly find the values that result in the lowest validation loss (or whatever I set) possible.
It's really slow, but extremely low effort to use, and it can find patterns that I miss because I don't expect them to work and never test them.
Since it works directly with the config files and train module, it will work with every model and data feature I build in future, so it should reduce the amount of manual work by a decent chunk or let me work with more GPUs at a time.
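For anons wondering what "add a config line plus limits and let it run" looks like in practice, here is a rough sketch. It is not Cookie's code. BOHB pairs Hyperband's successive halving with a Bayesian model that proposes configs; this sketch keeps only the successive-halving half and samples configs uniformly at random. `SEARCH_SPACE`, `train_and_validate` and the toy loss are all made-up stand-ins.

```python
import math
import random

# Hypothetical tuning file: config key -> (low, high) limits.
SEARCH_SPACE = {
    "learning_rate": (1e-5, 1e-2),
    "dropout": (0.0, 0.5),
}

def sample_config(rng):
    """Draw one config uniformly within the limits (BOHB would use a model here)."""
    return {key: rng.uniform(lo, hi) for key, (lo, hi) in SEARCH_SPACE.items()}

def train_and_validate(config, budget):
    """Toy stand-in for a real training run; returns a fake validation loss.

    The loss is lowest near learning_rate=1e-3 and dropout=0.1, and a larger
    budget (more training steps) shrinks a constant penalty term.
    """
    lr_term = (math.log10(config["learning_rate"]) + 3.0) ** 2
    dropout_term = (config["dropout"] - 0.1) ** 2
    return lr_term + dropout_term + 1.0 / budget

def successive_halving(n_configs=27, min_budget=1, eta=3, seed=0):
    """Score many configs cheaply, keep the best 1/eta, repeat with eta times the budget."""
    rng = random.Random(seed)
    configs = [sample_config(rng) for _ in range(n_configs)]
    budget = min_budget
    while len(configs) > 1:
        scored = sorted(configs, key=lambda c: train_and_validate(c, budget))
        configs = scored[: max(1, len(scored) // eta)]
        budget *= eta
    return configs[0], budget

best_config, final_budget = successive_halving()
print(best_config, final_budget)
```

The point of the halving is that most of the compute goes to configs that already looked good on short runs, which is why it can afford to try values nobody would bother testing by hand.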
Hello? Yes, 15 inc.? Your Fluttershy model generated static at me, This is unacceptable and I demand to speak to the manager so I may receive compensation for this injustice.
>>37287270Fluttershy works fine for me, what browser are you using?
>>37287275I'm just shitposting, models work fine
>>37287325>ppp is actually polacksKek, thx slavs.
wew, they really are doing it for free.
Anyhow, anybody making new projects or something? I'm feeling like re-dubbing some meme songs with pony voices.
>>37287369talknet got celly voice or nah?
>>37287369
I'm about to start work on PTS4. Also working on another thing that'll probably come out before that, depending.

>>37287381
Yes for Celly voice, no singing model for her yet though.
>>37287408>Yes for Celly voice, no singing model for her yet though.I mean we are not really far away from it but still I'm really grateful for what all of us made real.Literally what other fandom made multiple artificial intelligences because they love their show/waifu that much?In retrospect it really is fucking insane what we have done.
After training GPT I will want to make a porn StyleGAN2. Has anyone mass scraped images from one of the boorus yet?I'll also want the TPDNE checkpoint to finetune from.
>>37287502The previous thread isn't archived yet, you can still make it! You just have to believe!
>>37287486
There's the TPDNE dataset, but it's just faces, so it won't be helpful. Clipper was labeling pony butts, could be useful depending on what you mean by porn.
For mass scraping, maybe talk to the altboorus. Don't know if TPA or iwiftp has something.
Checkpoint is linked on the TPDNE website. Pretty sure it uses the estimator branch of Shawn's fork. See https://make.pony.pictures for a usage example.
>>37287369Making one more effort to attempt the TalkNet. Hasn't worked on my end, but that's probably because I use Firefox. If it still doesn't work, I'll stick to 15ai until I can figure it out; probably will just have to add another browser that works for it.Currently working on a Sci-Twi audio idea I had a while back. It was on hold when the test sites went down, so I'm working on it a bit tonight. Probably won't be done for a while. Currently sits at 18 minutes long.
>>37287517To narrow down your search, it doesn't work in Netscape Navigator either.
>>37287517>Sci-TwiIs this the part where I'm supposed to say >no hooves?
>>37287550Say the line, Anon!
For some reason 15's models really don't like sentences starting with 'Thus, '. Doing so results in audio that either has noticeable artifacts/static or skips over the problematic part really quickly. Using high-emotion contextualizers makes it worse:
>Thus, the two sisters maintained balance for their kingdom and their subjects, all the different types of ponies.|I'm bored.
https://u.smutty.horse/mcfeqknxwzs.wav
>Thus, the two sisters maintained balance for their kingdom and their subjects, all the different types of ponies.|What?!
https://u.smutty.horse/mcfesppbomu.wav
>Thus, the two sisters maintained balance for their kingdom and their subjects, all the different types of ponies.|I can't wait!
https://u.smutty.horse/mcfermkotbx.wav
Interestingly, the issue doesn't seem to apply to similar words like 'therefore'.

>>37287550
>Sci-Twi
>no hooves
Wait hold on, you mean that the (pony) Twilight with glasses that appears in some fan videos is actually from eqg and not an original fan idea? That explains a lot.
>>37287578>no hoovesAlso, the models are really picky like that sometimes. I remember BGM posting an example text for his second pony rap video, and the models only accepted the very last sentence and ignored the rest.
>>37287517>Currently working on a Sci-Twi audio idea I had a while back. It was on hold when the test sites went down,Too bad it couldn't be on hold indefinitely you barbie faggot. Take it to one of the EQG threads. It doesn't belong here.
>>37287625If you niggers put half as much energy into loving ponies as you do into hating EQG we would have robo-waius and probably our own 10th season at this point.
>>37287650>You will never go to Equestria.Extremely ironic coming from someone defending a character who left Equestria.>>37287665You apparently weren't here from the start but the idea of scrapping audio from eqg caused controversy in the early stages of PPP despite the characters sharing VAs. Ultimately it's a good thing that it was done but it wasn't unanimously praised.Let me rephrase it, even touching the eqg audio (which objectively doesn't differ from FiM) was treated like a minor blasphemy.
>>37287680>even touching the eqg audio (which objectively doesn't differ from FiM) was treated like a minor blasphemy.Tell me about it. I'm glad Clipper was willing to pick that up for me because I sure as hell wasn't looking forward to dealing with it. I acquired the audio, but that was about all I did involving EQG here. It would be a mistake for any of these barbiefags to think that this meant they were ever welcome here.
>>37287685>I'm not adding potential data to the project that would increase the quality of our waifus because FINGERSokay buddy
>>37287743Do you even read, retard?
Boy am I glad deleted posts don't count towards bump limit, or PPP threads would be a few hundred short thanks to these goobers who piss about purity instead of posting ponies.
>>37287901Why the hell would you bring this up again?At best you're an idiot and at worst a troll trying to start the reignite the shitflinging.
>>37287910Nah, just wanted to make my quirky observation about bump limits, I also feel if it went unreferenced, some jackass might just come in and do it all again thinking it's for the first time.
Just migrated to Linux 'cause Windows kicked it. Speaking of, looks like 15 kicked away the "final" huh? I look forward to the 14th update past here just so we can get to 15.15 and double it up.Back to Linux though, is there a Linux specific version for DeltaVox RS? Or am I gonna have to try and Wine the existing one?
>>37287932
Last time I checked the doc, I don't think I saw a Linux specific version. Wine might be your best shot. Imagine not dual booting Win7.
>>37287932>Linux specific version for DeltaVox RS? Or am I gonna have to try and Wine the existing one?You're gonna have to use Wine, I have many things imported as Windows-only DLLs (even if they're cross-platform) and porting to Linux is planned but only when I have absolutely nothing else to do. Although it might be unfeasible without major refactors because the Logitech LED API is Windows-only.
>>37287946Alrighty then, thanks for the quick response. I guess I'll let you know how the Wine-ing goes. If worst comes to worst I guess I'll be able to run it within a virtual box? Let's hope it doesn't come to that.I look forward to your linux version though, whenever that comes around.
>>37287970From the distant memory of trying out a very early build in my Kubuntu 18.04 LTS installation back when I was dual-booting, it didn't go very well, possibly because Tensorflow is pretty big and complex. Interested to see your results.
>>37288045So far results don't seem too good. It can't find certain dll libraries it seems, even though most of them seem to exist still. Strange. The directory in question is identical to what I had working on Windows, so there's no missing files. And Wine is working too as I was able to run the included 'Visual C++ Redistributatable (Install if v140 dll error).exe' just fine.
>>37287262
That's pretty cool. Other than BOHB for hyperparameter tuning, HiFi-GAN for vocoding, and diffusion methods for improving audio quality, have you found anything else that looks like a clear winner?

By the way, it looks like you're not looking into anything involving equivariant networks. I think those are getting really popular. Here's an example used by AlphaFold2: https://arxiv.org/abs/2006.10503

Equivariance sounds complicated, but it just means "hidden layer outputs have the same structure as their input, and they deliberately reflect certain characteristics of the input." For example, a convolutional layer's outputs have the same pixel structure as their inputs, and their outputs will reflect translations in their inputs. Convolutions also reflect rotations when using rotationally-symmetric kernels, but that's really rigid since that always turns, e.g., 50 degree rotations in the input into 50 degree rotations in the output. SE(3) transformers also reflect rotations, but they can do so more flexibly, e.g., by stretching or shrinking rotations.

I don't think it's been applied to speech yet, but it's something to watch for, especially in voice conversion.
>>37288297I read that AlphaFold2 is using Invariant Point Attention instead of SE(3) transformers, but I think the idea is the same.I'm curious what kind of equivariance would be good for speech, though. Translational invariance, of course. That's already built into 1D convs (though aliasing may hurt performance a bit, e.g. see Alias-Free GAN). "Flip invariance" via symmetric kernels doesn't seem that useful. Maybe something could be done with spectrograms, since they're 2D?
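To make the "1D convs are already translation-equivariant" point concrete, here is a tiny pure-Python check. It assumes circular (wrap-around) padding, which is what makes the property hold exactly; the signal and kernel values are arbitrary examples.

```python
def circular_conv1d(signal, kernel):
    """Circular 1D convolution (cross-correlation form, as used by most NN libraries)."""
    n = len(signal)
    return [
        sum(kernel[k] * signal[(i + k) % n] for k in range(len(kernel)))
        for i in range(n)
    ]

def shift(signal, steps):
    """Circularly shift a signal to the right by `steps` samples."""
    n = len(signal)
    return [signal[(i - steps) % n] for i in range(n)]

signal = [0.0, 1.0, 3.0, -2.0, 0.5, 0.0, 4.0, 1.5]
kernel = [0.25, 0.5, 0.25]

shift_then_conv = circular_conv1d(shift(signal, 3), kernel)
conv_then_shift = shift(circular_conv1d(signal, kernel), 3)

# Equivariance: shifting the input shifts the output by the same amount.
print(all(abs(a - b) < 1e-12 for a, b in zip(shift_then_conv, conv_then_shift)))
```

With zero padding or strided/downsampling layers the two orders only agree approximately, which is roughly where the aliasing concern mentioned above comes in.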
>>37288683Highlights how strange it is to shout out loud what you're typing as you type it.
Is SortAnon's TalkNet not working for anyone else? Last two times I tried to use it, the generate button just doesn't work. No error or anything, it just doesn't create an output. I'm running Chrome, I've changed nothing on my end since it was last working.
>>37288808I tried signing out of my Google account and signing back in, that fixed it for me.
>>37288297
>have you found anything else that looks like a clear winner
There are no clear winners that I can think of. Everything has at least 1 trade-off. Be it performance, variability, stability or coding complexity.

>I don't think it's been applied to speech yet, but it's something to watch for, especially in voice conversion.
Thanks.
I'm going to be rewriting all my networks to use my new modules for at least a few more weeks, but I'll keep this in mind for when I want new stuff to test out.
>>37288816Hmm, didn't seem to make a difference for me. Still not getting any output from the models.
>>37288565PCM is a phase and amplitude assigned over time. Voice conversion should probably be equivariant with shifts in most of these things and their rates of change. That means phase shifts (to capture spatial positioning), time shifts (to capture timing... invariance might work better here than equivariance), frequency changes (i.e., change in phase over time, to capture formants), and amplitude shifts (to capture volume... maybe use the Bark scale version of these). The only one it probably shouldn't be equivariant with is shifts in power / MFCCs since that one is usually used as a voice signature to identify the speaker.
>>37288808
Well, it did work for me yesterday, but now I'm getting "No module named 'dash'". I was able to get it to the point of generating the UI window by adding the missing modules above step 2:
!pip install dash
!pip install jupyter_dash
!pip install crepe
!pip install psola
!pip install torch_stft
!pip install kaldiio
!pip install pydub
!pip install frozendict
!pip install unidecode
!pip install pyannote.audio
!pip install g2p_en
!pip install pesq
!pip install pystoi
!pip install ffmpeg-python
HOWEVER, that just leads me to the same point as BGM, where clicking the Generate button does nothing.

>>37288816
Following this in a new incognito tab gives me error 403, both with step 3 and step 3B.
Then I factory reset the runtime and ran everything again without the extra code, and was still getting error 403. It seems Colab is being very fickle with this particular code.
Can someone tech savvy figure out how to run this offline? It seems Google is really going out of its way to fuck around with the Colab code.
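When a notebook dies with "No module named ...", it can save a few reruns to list everything that is missing in one go before installing anything. A minimal sketch using only the standard library; the module names in `WANTED` are assumed from the pip packages listed above (a pip package name and its import name are not always the same), so treat them as placeholders.

```python
import importlib.util

def missing_modules(module_names):
    """Return the names from module_names that cannot be found in this environment."""
    missing = []
    for name in module_names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised, e.g., when the parent package of a dotted name is itself missing.
            found = False
        if not found:
            missing.append(name)
    return missing

# Assumed import names for some of the packages from the post above.
WANTED = [
    "dash", "jupyter_dash", "crepe", "psola", "torch_stft", "kaldiio",
    "pydub", "frozendict", "unidecode", "g2p_en", "pesq", "pystoi",
]
print(missing_modules(WANTED))
```

This only tells you what is absent; it does not explain the 403 errors or the dead Generate button reported above.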
>>37284805
>The only downside is that I was previously getting the origin point for each shape from JSFL. I can't do that anymore since I don't have the mapping between XFL elements and JSFL elements. I was doing that because I couldn't figure out how to get that information from the XFL. It looks like the <transformationPoint> needs to be converted somehow, and I haven't figured out how.
I'm still struggling with this. I found some code that does this for Unity, and it looks like transformation matrices and origins are tracked separately. As in, it doesn't apply origin changes and matrices alternatingly, it only applies matrices to matrices and origin shifts to origin shifts. I don't understand how that's supposed to work, but I can try it.
>>37288864just tossing an idea out there but could it be possible to make talknet save the generated wavs in the google drive folder like the ngrok does ? would that make any difference SortAnon ?
Hmm, it seems Colab isn't a fan of re-downloading files from the same GitHub source, so maybe having all the anons duplicate the TalkNet notebook to their Google Drives and then run it from their personal accounts would solve this?
>>37288901Just tried saving a copy in my drive and running it from there, still getting the same error BGM reported of the generate button not doing anything. At this point I'm pretty sure that the issue is something inherent to the script that only SortAnon can fix.
I've made a list of all the tags used in the master folders for the Music and SFX files for organizing reasons. Just posting those here, as perhaps some other anons could use them for their own projects.
https://pastebin.com/2CHjh5tW
https://pastebin.com/6BkgJV1w
>page 10Bumping with this chart now that the FIMFiction archive 20k steps finetuning run is complete. Currently readying the model for inference in Colab by slimming it.
>>37290865
FIMFiction-20k model can be played with in Colab.
https://colab.research.google.com/drive/13R8MJEDTwinEmUJMLqydKOIcAvWiBIlT?usp=sharing
On 15.ai, voices of other characters bleed into generated speech clips from models that are trained with 0.3 minutes of audio data.For instance, the Snails voice model sounds this way: https://u.smutty.horse/mcfotncapfe.wav
>>37291191Yeah I noticed this as well, I thought Octavia sounded like Twilight and Rarity a bit too much at times.
>>37291191Good thing nobody gives a shit about snips or snails.
>>37291162is there a way to download the text model so i could generate my own text offline ?
>>37291346
You can download it from the same link the notebook downloads it from. Here, I'm pasting it for you: https://storage.googleapis.com/xdisk/fimfmed.tar
But you need an RTX 3090 and Linux with everything installed, or TPUs, to run it locally. It's a really, really big model.
>>37290137Thanks for that, I've added these as "taglist.txt" in the respective "SFX" and "Music" folders.
>>37291366
https://github.com/arrmansa/Basic-UI-for-GPT-J-6B-with-low-vram
Apparently it is possible to run it on a 1060 6GB GPU, but it also seems to take a large hit in performance / quality.
>>37291366>Needs RTX 3090Wouldn't it be able to work with any decent card, but just allow it to process longer for the same result?Like... If a GTX 1650 is around 400% slower, wouldn't that just mean it'd take 4 times as long to get a similar result?
>>37291903
The problem is that you need to physically be able to fit the model in memory while the GPU runs the code. The GitHub code above gets away with using older cards because they are still compatible with the tensorflow code, BUT they still need to split the load between GPU and RAM.
Having a GPU with over 20GB just means it's all in one spot and can be processed all at once. You can't just tell the computer to load 1/4 of the GPT model (imagine reading one part of a book that was chopped into four parts like a pie; it would be impossible).
>>37291930Oh I see, so it's not so much a matter of processing the same amount of data over time, but rather needing the required memory overhead in order to support such a massive model. Thanks for clearing that up, anon.
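Back-of-envelope numbers for why the card's memory matters more than its speed here. This only counts the weights, assuming roughly 6 billion parameters (the "6B" in GPT-J-6B); activations and framework overhead come on top, so real usage is higher.

```python
def weights_gib(n_params, bytes_per_param):
    """Memory needed just to hold the weights, in GiB."""
    return n_params * bytes_per_param / 2**30

N_PARAMS = 6_000_000_000  # rough count implied by the "6B" in the model name

for label, nbytes in [("float32", 4), ("float16", 2)]:
    print(f"{label}: ~{weights_gib(N_PARAMS, nbytes):.1f} GiB for the weights alone")
```

That works out to about 22.4 GiB in float32 and about 11.2 GiB in float16, which is why a 24GB card is the comfortable option and why smaller cards have to shuffle parts of the model in and out of system RAM.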
>>37291162I'm getting MUCH better results than from the 500 step iteration, though I guess that's to be expected. It actually seems to generate things somewhat coherently, perhaps even getting close to the coherency of the GPT-2 AID model. I'm having a blast with it so far.
>>37288901>re-downloading filesAh fuck, I think I know why it's broken now. I'll look into it in a few hours.
Try the new TalkNet notebook and see if it spits out an error message. Post it here if it does. If the error is "VersionConflict", try restarting the runtime (but copy the error message first).If it still fails without an error, post a screenshot of your models folder, like this. You might need to click the refresh icon for it to show up.
>>37292758I got the VersionConflict error but restarting the runtime fixed it. I copied the error message if you want me to paste it. Apart from that it seems to be working for me now at least. Tested two different lines and got the inputs as expected.
>>37288867
I figured it out.
- For shape objects, the edge format defines the origin point. That origin is relative to the midpoint of the shape. I can dump the origin point and the size of the shape with Animate.
- https://u.smutty.horse/mcftdfylrke.mp4

It turns out that didn't fix the issue I had last time. I'm getting the same animation with the messed up eyebrows, choppy animation, and non-animated wings. BUT the whole thing runs way faster, it's way cleaner (you'll have to trust me on this since the code still looks like shit), and I no longer run into any of Animate's random failures when trying to dump shapes.

I got a hint from the anon in the last thread about what's making the animation choppy. I'm going to commit my stuff for now, upload the updated tool for dumping animations, and work on the choppiness when I get back in several days.

Render Anon or Morph Anon, if you want to try figuring it out while I'm away, I should still be able to respond to posts occasionally. If you need samples to test out, Clipper knows how to convert files to the new format.
I am now beginning the /mlp/ dataset finetuning run - it is as I explained in one of my earlier posts last thread. When trained, it'll work as a shitposter and green writer.
>>37292891Correction: I can't post the updated auto-animate tool yet. Pyinstaller doesn't play nicely with cairocffi. I'll need to figure that out first.
>>37288851
I think human hearing is invariant to phase, so I'm not sure if phase equivariance is useful. And if you use magnitude spectrograms, you're already ignoring phase. I've only seen the mono -> binaural audio NN care about phase since it affects perception there.
Time equivariance seems good. Some invariance could be good for changing prosody? Depends on if it's a 1 to 1 conversion or a seq to seq.
Frequency equivariance makes sense. I guess that's what pitch augmentations aim for.
Amplitude equivariance makes sense too.
Wonder if all of these things need special architectures or just tweaks to existing ones.
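A small numeric illustration of two of the claims above, for the simplest possible case of a single steady tone: the magnitude spectrum ignores a constant phase offset (invariance) and scales with amplitude (equivariance). The naive DFT and the tone parameters are just for the demo and are not taken from anyone's pipeline.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Magnitude of each DFT bin, computed the slow, explicit way."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples)))
        for k in range(n)
    ]

def tone(n, cycles, phase, amplitude=1.0):
    """A sine with a whole number of cycles over n samples."""
    return [amplitude * math.sin(2 * math.pi * cycles * t / n + phase) for t in range(n)]

N = 64
base = dft_magnitudes(tone(N, cycles=5, phase=0.0))
phase_shifted = dft_magnitudes(tone(N, cycles=5, phase=1.3))
louder = dft_magnitudes(tone(N, cycles=5, phase=0.0, amplitude=2.0))

# Invariance: a constant phase offset leaves the magnitudes unchanged.
print(max(abs(a - b) for a, b in zip(base, phase_shifted)) < 1e-9)
# Equivariance: doubling the amplitude doubles every magnitude.
print(max(abs(2 * a - b) for a, b in zip(base, louder)) < 1e-9)
```

This says nothing about relative phase between channels or between components, which is the part that matters for spatial positioning.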
>>37292881
Recently remade the /tg/ station Start/End Round Sounds for the /mlp/ SS13 server (/vg/ Codebase), not sure if it fits here but it's content.
Original Start/End Round Sounds: https://u.smutty.horse/mcftodlmcrj.wav
Ponified: https://u.smutty.horse/mcftodotcky.wav
>>37292923Human hearing uses phase to spatially position a sound. That doesn't matter much for our ability to recognize a speaker, but it can have a big impact on how realistic something sounds.
>>37292945Yeah, I guess I meant to say that absolute differences in phase don't matter. But relative differences certainly do
>>37292758
Got the 'VersionConflict' error, so I followed the runtime->restart routine steps and ran it all again.
Now, after pressing the generate button, I'm getting the error 'cuDNN error: CUDNN_STATUS_NOT_INITIALIZED'
>>37293086ive tried to run it in the incognito opera and once again getting the error 403
Is there a way to make Trixie say The Great And Powerful Trrrrixie?
https://vocaroo.com/1iomCiEQTi0D
>>37293322
I've tried making her roll her R's but to no avail, she sure does sound cute while saying it though
https://u.smutty.horse/mcfvloqfhqy.wav
>>37293086I think I've fixed this error. If it shows up again, post your output from step 1.
By the way, there's no reason TalkNet can't run offline. How many of you have gaming PCs?
https://www.strawpoll.me/45515369
>>37293388
>output from step 1
I got the usual GPU: Tesla T4, 15109MiB. And now Colab has randomly decided it does want to cooperate on Firefox once again. This inconsistency makes me think that maybe the problem is too many people downloading the files at the same moment?

>>37293388
Also, since you are talking about an offline version, that would be pretty great, since I could use the other HiFi-GAN or ngrok notebook at the same time when working on projects.
If I may make a suggestion for a code upgrade, could it be possible to add a "ticked on" option for adding an extra 3~5 seconds of silence at the end of clips? I've noticed that messing around with echo and reverb is a bit difficult, as many editing programs refuse to "go over" the audio clip length, making a weird hard cut in the echo effects (nothing that can't be fixed in Audacity, BUT editing 100+ clips by hand does add up).
>>37293398How would you put together a Controllable TalkNet model that runs offline, exactly? And how would the speeds compare even on a good gaming GPU?
>>37293486
>could it be possible to add a "ticked on" option for adding an extra 3~5 seconds of silence at the end of clips?
Sure, I could do that.

>>37293512
It already runs offline. I just need to write a guide on how to set it up.
>And how would the speeds compare even on a good gaming GPU?
About the same speed as on Colab.
>>37293398
I knew being a VRAMlet would bite me in the ass eventually. Got a 1060 3GB here.
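Until that option exists, padding clips in bulk can be scripted instead of done by hand in an editor. A minimal sketch with Python's standard `wave` module, assuming the clips are plain integer PCM WAV files (`wave` does not read float WAVs). `append_silence` and the file names are made up for the demo and have nothing to do with TalkNet's own code.

```python
import os
import tempfile
import wave

def append_silence(src_path, dst_path, seconds=3.0):
    """Copy a PCM WAV file and append `seconds` of digital silence to the end."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())
    n_silent_frames = int(round(seconds * params.framerate))
    # 8-bit WAV is unsigned (silence = 0x80); wider PCM is signed (silence = 0x00).
    silent_byte = b"\x80" if params.sampwidth == 1 else b"\x00"
    silence = silent_byte * (n_silent_frames * params.sampwidth * params.nchannels)
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(frames + silence)  # header frame count is fixed up on close

# Demo on a throwaway one-second silent 16-bit mono clip at 22050 Hz.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "clip.wav")
    dst_path = os.path.join(tmp, "clip_padded.wav")
    with wave.open(src_path, "wb") as clip:
        clip.setnchannels(1)
        clip.setsampwidth(2)
        clip.setframerate(22050)
        clip.writeframes(b"\x00\x00" * 22050)
    append_silence(src_path, dst_path, seconds=3.0)
    with wave.open(dst_path, "rb") as padded:
        padded_seconds = padded.getnframes() / padded.getframerate()

print(padded_seconds)
```

Looping `append_silence` over a folder of generated clips gives reverb and echo tails room to ring out before the file ends.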
>>37293398wait all it takes is 4GB?
>>37293322
Here's what I got from the attached input as the second generation.
https://u.smutty.horse/mcfxsozxuaq.wav

>>37293373
You guys using ARPAbet strings?
>>37293322You're going to have to hire Kathleen Barr, or another voice actress who can imitate her. Rolling the Rs is a very human skill.
>>37295424Everything is a "very human skill" until we get a machine to do it better.
>>37295424You can get Trixie to trill her R's, you just have to be very lucky and use the right contextualizer.
>>37295424Funnily enough I remember the earlier models like the 22khz google colab being able to trill the Rs quite often. So the AI *can* pick up on it. It just probably thinks it's unnecessary. So you find a way to specifically tell the AI you need to include it with that phrase.
>>37295515>betterMachines can never be better humans than humans.
Yo anons. I'm not sure if this is a known issue because I haven't participated in anything related to the PPP whatsoever until just now, but I tested the latest MMI Pinkie Pie voice model (11NULGhxh1JTwb7oHBdmT7TAMKvuX-rCg) with the text set as the singular word "I". This causes a glitch that results in her saying the word "I" like 40 times instead of just once! It's actually hilarious. https://drive.google.com/file/d/1YqWwkJ1U3Og6qC-jog33P2cDal0nz9xw/view?usp=sharing
>>37296295
Then don't do that. If you want it to say "I", just use a word that sounds like it, like "eye".
God! Why do I keep hitting "Generate" instead of downloading the sound when I get a great result!?
>>37296295
Has there not been a HiFi-GAN model trained of Pinkie yet?
>https://drive.google.com/file/d/1YqWwkJ1U3Og6qC-jog33P2cDal0nz9xw/view?usp=sharing
Also could you use smutty.horse in the future for file hosting?
>>37296357Ironically, that has the same result. I don't need the sound for anything though; I was just surprised the notebook broke in such a strange way.
>>37296369>Also could you use smutty.horse in the future for file hosting?lol Is that the standard here?
>>37296389>lol Is that the standard here?Yes, because there is no chance for arbitrary deletions or content policing.Also other file hosts compress files sometimes that can cause feedback issues.
>>37296379
Yeah, the notebooks and even 15.ai are pretty easy to break. If it's spitting out garbage, you should just change what you're putting in.
>>37296362I know the struggle, Muscle memory is a blessing and a curse.
>>37296080I've been thinking about it a lot, I don't think there are any words where R is followed by a W, so I'm wondering if any trilled R could be rewritten in the training data as "RW". It might be dumb, but it would also bake a consistent prompt for trilled Rs into the models for now. (Until models are trained on the full IPA.)
>>37296621otherwiseAnd holy hell the captcha was DRW2Y. What are the odds?
>>37296841
1 in n^5, where n is the number of possible characters.
Assuming the full alphabet and numbers (I know some aren't used, bear with me) it'd be 36^5, so your odds of exactly that combination are 1 in 60,466,176.
Does Ruiji still lurk here? The one who posted https://vocaroo.com/1nvYTdC84VZt and https://u.smutty.horse/maujfrovyne.mp3
>>37296362If only there was a way to temporarily keep the previous generation in memory. Just one cycle, then delete after "generate" is pressed a second time.
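The "keep one previous generation" idea can be sketched with a two-slot buffer. This is an assumption-heavy illustration, not how 15.ai or any notebook actually stores audio; the class name `GenerationHistory` and the use of raw bytes for clips are hypothetical.

```python
from collections import deque


class GenerationHistory:
    """Holds the current clip plus exactly one previous clip."""

    def __init__(self):
        # maxlen=2 drops the oldest clip automatically on the next push.
        self._clips = deque(maxlen=2)

    def push(self, clip):
        self._clips.append(clip)

    @property
    def current(self):
        return self._clips[-1] if self._clips else None

    @property
    def previous(self):
        return self._clips[-2] if len(self._clips) == 2 else None


history = GenerationHistory()
history.push(b"great take")   # first press of "Generate"
history.push(b"worse take")   # accidental second press
print(history.previous)       # the great take is still recoverable here
```

A third push would evict the first clip, which matches the "just one cycle" behaviour described in the post.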
>>37296977 or just about 1 in 324 if we're only looking for the odds of an R followed by a W appearing anywhere in a captcha (4 possible positions, each 1 in 36^2 = 1,296), instead of the odds for the full string.
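The captcha odds above can be checked by counting instead of estimating. The sketch below assumes a 5-character captcha drawn uniformly from 36 symbols, which is the same simplification the posts make; the function name `count_containing_pair` is made up for this example.

```python
def count_containing_pair(alphabet_size, length):
    """Count strings that contain one fixed adjacent pair of distinct symbols.

    Tracks three states while reading a string left to right:
    no progress, "just saw the first symbol (R)", and "pair already found".
    """
    no_progress, saw_first, found = 1, 0, 0
    for _ in range(length):
        no_progress, saw_first, found = (
            # Any symbol except R, or except R/W right after an R.
            no_progress * (alphabet_size - 1) + saw_first * (alphabet_size - 2),
            # An R always moves to (or keeps) the "saw R" state.
            no_progress + saw_first,
            # W right after R completes the pair; afterwards anything goes.
            found * alphabet_size + saw_first,
        )
    return found


total = 36 ** 5
hits = count_containing_pair(36, 5)
print(f"exact string: 1 in {total:,}")
print(f"'RW' anywhere: {hits:,} of {total:,}, about 1 in {total / hits:.1f}")
```

The exact count lands very close to the 1-in-324 estimate; the tiny difference comes from strings that contain the pair twice.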
>>37292906 You were asking in the last thread whether 20k iterations was enough. This might help. https://arxiv.org/abs/2001.08361 Also, the roto-translation equivariance stuff seems both complicated and extremely useful. I'll write up a summary once I understand it better and get some time. >>37288565, if you already understand it, a summary would be great. I'm having trouble seeing where the Wigner-D matrices come from and how exactly to use them.
>>37297527 Unfortunately, I don't understand it. I only skimmed this blog post by the author (https://fabianfuchsml.github.io/alphafold2/), which brings up irreducible representations and Wigner-D matrices quite suddenly. Maybe the paper will give more background info or at least have citations. But yes, the overall idea is interesting. Again, not sure how effective 2D/3D stuff will be for audio, but you never know.
Out of curiosity, where the hell are you guys going to store all of this data anyways and how big is the filesize now?
>>37297736My PPP folder is 330 GB. It has some non-pony data in it, though.
>>37297736>Out of curiositydon't shoot glowing one
This week on things we need to apply to ponies eventually:https://www.youtube.com/watch?v=0zaGYLPj4Kk
>>37286871I’m just curious: what tools are you guys using to mine the data and make predictions? I took a data mining class where I had the choice of PySpark, Pytorch, and others for a project and am probably going to use Prophet in a work project.
>>37298709shooo fbi plus hasbro, go and glow somewhere else
>>37287582this is cute, got more?
I've made a script that lets you "install" TalkNet on Windows. I've only tested it on one machine, so it might still have some bugs. https://github.com/SortAnon/ControllableTalkNet/releases/latest/download/TalkNetOffline.zip Extract this somewhere, and run setup.bat first. It'll take 20 minutes, and you need 10 GB of free space and an NVIDIA card with 4+ GB VRAM. When it's done, run talknet.bat. If everything works, the TalkNet UI should run at http://127.0.0.1:8050/.
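The stated requirements (10 GB free, 4+ GB VRAM) can be checked before running setup.bat. This is a rough sketch and not part of the installer: the thresholds come from the post above, the helper names are made up, and it assumes `nvidia-smi` is on the PATH and reports memory in MiB.

```python
import shutil
import subprocess

REQUIRED_FREE_GB = 10        # from the post above
REQUIRED_VRAM_MB = 4 * 1024  # "4+ GB VRAM", treated as 4096 MiB here


def free_disk_gb(path="."):
    """Free space on the drive holding `path`, in GiB."""
    return shutil.disk_usage(path).free / 1024 ** 3


def parse_vram_mb(nvidia_smi_output):
    """Parse one integer per GPU from nvidia-smi's csv,noheader,nounits output."""
    lines = (line.strip() for line in nvidia_smi_output.splitlines())
    return [int(line) for line in lines if line.isdigit()]


def query_vram_mb():
    """Total VRAM per GPU in MiB, or an empty list if nvidia-smi is unavailable."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return []
    return parse_vram_mb(result.stdout)


disk_ok = free_disk_gb(".") >= REQUIRED_FREE_GB
vram_ok = any(mb >= REQUIRED_VRAM_MB for mb in query_vram_mb())
print(f"disk ok: {disk_ok}, VRAM ok: {vram_ok}")
```

Run it from the folder you extracted the zip into, since everything gets installed there.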
>>37298804But suppose I were interested in working on this project for an entry-level salary for a master’s graduate. What tools would I need to know how to use? Though I’d probably need to brush up a little on my Python or learn to program in R regardless.
>>37299690 yeehaw! https://u.smutty.horse/mcgkzkjxdmr.wav pov you just messed up your lines: https://u.smutty.horse/mcgkzpdjsqw.wav outtakes: https://u.smutty.horse/mcgkzowqtbi.wav https://u.smutty.horse/mcgkzprbnso.wav what a silly pony: https://u.smutty.horse/mcgkzmjhmry.wav
>>37299594Is this TalkNet for training models, or Controllable TalkNet for actually generating outputs?
>>37299916And does the installation allow me to set which drives/directories will be used, or does it use default directories?
>>37299928It installs everything to the folder you run it from.
>>37299936 Ah, nice. I'll try getting this set up on my primary PC at some point and see how everything runs. Thanks for all the work you've put into this, by the way.
>>37299594I would've tried to Wine this, but my GPU sadly only has 3GBs, not quite enough to match the required 4. Damn.
How much would it be worth it to get one of these to train models on? https://www.pugetsystems.com/recommended/Recommended-Systems-for-Machine-Learning-AI-174/Buy_200
>>37296085Just a quick reminder, humans are just machines of a different kind. I'm sure some day after ponies are completely digitised we can immortalize ponyfags too.
>>37300284 The ones >>37298709 mentioned are free and open-source, but Pydub and PyAudio may do the job. The issue may be the amount of storage you’ll need to hold the training data, which, as many here will likely concur, is exactly why NO copy of SM64 is personalized.
>>37298804 >>37298709 >when the subsequent leaks proved the absolute basement-tier attempts by Hasbro to find out who leaked the first time Wish I could relive those days. Need more leaks, they're too fucking funny at seeing how terrible billion dollar corporations are
>>37300297The page I linked to was for a Machine Learning PC.
>>37300307Oh. I thought you were referring to the algorithm or tool to use.
>>37300385The url itself is pretty vague though.
Are there any idiot guides out there for getting KoboldAI to run GPT-J with the fimfiction mod? Idiot guides for idiots on the level of not knowing wtf any of what I just typed up there means. I just want a new AIdungeon for frantic pony fuckery.
I trusted the plan and 15 still hasn't come back m8s. It's over.
>>37300412I'm looking at their example Colab and I think I can implement the same API and web service but in my notebook, give me tens of minutes to a few hours, depending on how complicated. It doesn't look difficult.
>>37300556Then give me Carl
>>37300412 from the /aids/ thread. I don't know about fimfiction but I assume that's something you've made? https://rentry.org/itsnotthathard
>>37299594 It says I need Visual Studio. Do I really need Visual Studio for this to work, or can I just bypass it?
>>37300748Normally any program that says you need Visual Studio means you have to bite the bullet, and download the dependency. -T. Various Versions.
>>37300748>>37300778that doesn't look good
>>37300790Ooh, that's a lot of red.
>>37299594 Well, that was something. My PC froze for about 20 mins, then the setup failed.
>>37300778Does Visual Studio have compatibility with Python and R?
>>37301095Some cursory google tells me it gets goofy.
>>37300256It might work with 3 GB, depending on how much you're running in the background.>WineIt runs natively on Linux. Do you want some setup instructions?
>>37299695 What predictions are you talking about? Most of the AI stuff is Python, and some C++. People are using a lot of tools to work with audio data and to scale up training. I don't think anyone is using anything special to preprocess data.
>>37301060 What happens if you just install this and rerun setup.bat? I really hope I don't need to include a full Visual Studio setup. https://visualstudio.microsoft.com/thank-you-downloading-visual-studio/?sku=BuildTools&rel=16
>>37299594 Having trouble running it. When starting the talknet batch file it gives me the message below. >no module named 'nemo' I've run pip install on nemo but it still tells me there isn't a nemo module installed. I've also tried to upgrade it, but it tossed this message at me. Whatever the 'git' command is, Python is not recognizing it. Are you sure it's not supposed to be 'get' or something else here?
>>37301734 I thought about being a smartass and forcing the module to be installed before the code asks for it, but it's still a no-go. I'll try messing around with admin permissions to see if that fixes anything.
>>37301746 I've somehow solved the problem by installing 64-bit Git from here: https://git-scm.com/download/win then running update and letting every module get sorted. Now I have just a few questions. Where the fuck do I upload the reference wavs in the browser-only version (since there isn't a Colab folder to drop those in)? How do I get custom models working completely offline? (As far as I understand, the script can only reference prewritten models, and all new ones need to be re-downloaded?)
>>37301778and im getting some weird errors again.
>>37301788 Never mind, it seems there is some problem with extracting the Trixie model: the script makes the proper folder for it but does not actually extract anything into it, so I just had to do it by hand. Also, it seems I misunderstood how this works: you do need internet the first time you run a new code to download the model, but after that it will use the model reference to grab the file from the model folder. I'm sorry, Sortanon, if I sounded assholeish above, but I'm just really annoyed with getting random setbacks in everything when all I want is to listen to pony voices. Well, there is one more error: I'm getting a message that CUDA is out of memory when clearly the code isn't even using half of my GPU memory. And yes, I did close it down and reopen it, but the same thing happened anyway, so it seems to be a code problem.
>>37301830 Were you offline when you first selected Trixie? I know why >>37301734 is happening, but >>37301788 is still a mystery. >im getting message that cuda is out of memory were clearly the code isn't even using half of my gpu memory Windows could be underreporting the VRAM usage. Try opening a command prompt and typing "nvidia-smi". >>37301778 >where the fuck do I upload the reference wavs in the browser only version For now, it's the folder called ControllableTalkNet. I'll add a more convenient way to manage it later.
>>37302077 >Were you offline when you first selected Trixie? No, all the models downloaded fine, even the custom ones. For some reason just Trixie had to be done by hand. Actually, let me post the ones made in the past few months here for the other anons to use: tf2_soldier_TalkNet.zip 1Gt7sD4fsU0aC06V2zQsn4Vrnj6g2E6xQ, MrRogerTalknet_TalkNet 1qbcYrxgO3f3RIWfOrL9QqFJVuzy0H_W_, NamelessHeroTN 1lhtg5jPfz-9j-re2d4DQ0P1vQCXBqcLw >Windows could be underreporting the VRAM usage. Try opening a command prompt and typing "nvidia-smi". I've tried that, it still runs out of memory, even after I closed my video editor and RTX Voice. This is a bit weird since I'm able to run the medium GPT-2 text model, which also requires 4 GB of GPU memory.
>>37302203what gpu do you have?
>>37302228gtx 1080, it sure doesn't rocks socks off but it can play even the newest games at medium settings.
>>37302238Same. I'm building a new machine but >GPU prices.
pinkie pie huge ass
>>37302203It shouldn't use that much VRAM. What happens if you do a line without reference audio?
>>37302444 >What happens if you do a line without reference audio? If I tick "disable reference" it works fine; there is no problem generating audio. If I untick it and click update list it still works (it just asks me to put the reference in). It only breaks when I click the dropdown list and choose the reference wav file.
>>37302457So it's the pitch estimator that's the problem. I should try replacing it with torchcrepe.
>>37299594Working perfectly over here.
If nothing derails me too much for next few days I think I will be able to get new audio episode this thursday/friday (also bump).
>>37304192 Awesome, I'm excited to see what you have in store. I get the feeling this week might be a good one for content.
>>37300293>he fell for the "brains are just bio-computers" memeLol. Lmao.>>37301514Lost hard. It's just like Terry used to say.Also now I've gotta go jack off to the thought of AJ calling me a fucking nigger.
yo who the fuck is this nigger?https://youtu.be/zqklInNM9H4
>>37304842Random faggot who doesn't browse the board and got mad that the 15.ai is down so he now lies about it on the internet for views. Ignore him.
>>37304842>yo who the fuck is this nigger?>ThunderShyOfficialSomeone who doesn't want to be impersonated apparently. Too bad for them.I hope this video is satire.
Figured I wouldn't let these go to waste: The Weeknd - Can't Feel My Face (WIP) https://u.smutty.horse/mcgzletwotq.mp3 Gwen Stefani - Rich Girl (WIP) https://u.smutty.horse/mcgzlessmrw.mp3 The AI shits itself around the lines with overlapping or faint vocals, so maybe one day it can learn how to distinguish them for better results. Still fun as heck.
>>37304842>look through his videos>hes a shitfagevery single time
>>37299594 Thanks very much for this! It works great so far, although I did run into the same problems this anon did >>37301060 >>37301734 but fortunately downloading this >>37301462 and this >>37301778 and rerunning the install .bat fixed everything.
>>37305246>Gwen Stefani - Rich Girl (WIP)Wouldn't that be more appropriate for Rarity to sing? What kinds of problems does Rarity's TalkNet model have?
>>37305355 Twi is the more stable and flexible horse. Rarity doesn't sound bad now that I notice, but she still needs a bit more practice: https://u.smutty.horse/mcgzzxscvkg.mp3
>>37305343Thanks for confirming we only need the build tools. I'll update the setup script later today.
>>37305628I needed to install the "Desktop development with C++" under the "workloads" tab in order for the install to build correctly. Was that what you were referring to?
>>37305650Yes. I have Visual Studio installed, so I never ran into any errors.
TalkNet installer's been updated. It should be bulletproof now.https://github.com/SortAnon/ControllableTalkNet/releases/latest/download/TalkNetOffline.zip
>>37306812 Made a fresh installation, and once again the no-reference audio works while the reference audio makes the GPU run out of memory. Is there a chance to make the part of the code that converts the reference wav to reference pitch use the CPU instead of the GPU?
>>37307147 It could be an issue with the CUDA install or NVIDIA's drivers. I had issues like that last year. Did you update your drivers, and what GPU are you using?
>>37293398RTX 3060 12gb here, gonna have sum fun.I hope
>>37307279gtx 1080, and whatever the newest drivers were updated four days ago.
>>37306812there's a whole lot of red words here
>>37307550And no other applications that could be using a significant amount of VRAM are running while generating the audio?
>>37307675 I haven't changed anything since last time >>37302203 I guess Sortanon hasn't yet changed whatever in the pitch reference code is causing this error.
>>37307590 >Cannot open include file: 'io.h' Don't tell me you need the entire Windows 10 SDK to read a file. Try this. Go to the installers folder, and run vs_BuildTools.exe. Follow the steps in this image, and run setup.bat again when it's done installing. Does that fix the error?
>>37307697 I tried replacing the pitch estimator today, and it broke the models' ability to hold notes. So without retraining every character, I'm stuck with the existing one. https://u.smutty.horse/mchhrqrwewg.ogg It does run on CPU, but it's very slow. I'll add it as an option. >>37308099 Installer's been updated again. No one else should run into >>37307590.
>>37308290Any word on >>37307147 ?
>>37308345 That's what the new option fixes, at the cost of speed. Run update.bat. Open ControllableTalknet/controllable_talknet.py in a text editor. Go to line 41, change "CPU_PITCH = False" to "CPU_PITCH = True", and save. That should fix the memory problems.
>>37308516 Ran the update and still nope, it's still giving the same old message >>37307147 Here is a screenshot from a moment before it spams the "out of memory" message.
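For anyone who'd rather not hunt for line 41 by hand, the same edit can be sketched as a small script. This is only an illustration and not part of the TalkNet release: it assumes the flag sits on its own line exactly as quoted above, and the function name `set_cpu_pitch` is made up.

```python
import re
from pathlib import Path


def set_cpu_pitch(script_path, enabled=True):
    """Rewrite the `CPU_PITCH = ...` line in place. Returns True if it was found."""
    path = Path(script_path)
    source = path.read_text(encoding="utf-8")
    new_source, count = re.subn(
        r"^CPU_PITCH[ \t]*=[ \t]*(True|False)[ \t]*$",
        f"CPU_PITCH = {bool(enabled)}",
        source,
        flags=re.MULTILINE,
    )
    if count:
        path.write_text(new_source, encoding="utf-8")
    return count > 0


# Example call, using the path given in the post above (run from the install folder):
# set_cpu_pitch("ControllableTalknet/controllable_talknet.py", enabled=True)
```

If the function returns False, the flag wasn't found, which most likely means update.bat hasn't been run yet.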
>>37308575And you're sure it's set to CPU_PITCH = True? I've tested it on two different machines, and it doesn't use CUDA on either of them.
>>37308617 >CPU_PITCH = True In what file is that written?
>>37308814controllable_talknet.py, in the ControllableTalknet folder. Line 41, just beneath all the import stuff.
>>37308835yaa, it works now, happy times.
Oh man.. I want to make a version of Alabama Nigger but with apul and talk of Ziggers, but im more retard then a nigMe no understand compooter stuff, what do
>>37308870You just gotta fiddle around with it until you come to understand it. Too bad there isn't any text anywhere to read that explains it, like say a long running thread, or literal guides written into the notebooks.
>>37308885I'm trying but all the words start dancing around and shit ahhh
>>37308889Sounds more like you need a diagnoses.
>>37308896It's not worth it, FUCK doctors
>>37308899Take the meds, or face the feds.
>>37308904>fedsbecause I can't read good?
>>37308908Because the government treat the doctor avoidant poorly, despite doctors costing a lot. Also because it rhymes.
>>37308931Government sucks dick
is the test site down?
>>37309498Looks like it.
>>37309534I am back in action, after some stuff happened. Today I experimented with programming angles into the servos. I don't know how to make a walking robot with 4 legs yet.
>>37305246these are amazing pls make more
>>37309545 I can't help but wonder how many generals the schematics are gonna touch.
>>37304842lol hes right faggot.
>page 9 Bumping with this mini-PTS I made last week: https://u.smutty.horse/mcgnzmibuvu.mp4
>>37292942Can I get a link to the git?
>>37309498I do hope a test site comes back soon.
>>37310795 https://github.com/AlphaPassive/mlpstation13 and here's a link to the thread if you're interested >>37306319
https://www.youtube.com/watch?v=kgjvnI_FVcc I'm getting annoyed with some sfx/voice stuff from the main audio episode and took a break to finish the meme song I've been messing around with for some time. So I hope you guys will enjoy the DRD 100% Gamer song.
>>37300564I don't like how bloated KoboldAI is, I will instead make a simple web interface like Cookie's ngrok notebooks. Hell, I could probably use one of the free TPUs under the TRC to serve the FIMFiction model if I didn't accidentally waste a ton of free Google Cloud credit on data ingress between continents.
First time using the talknet thing and I'm getting this error: "CUDA out of memory. Tried to allocate" etc. Did I fuck up something?
>>37311317 Try this >>37308835 >>37308516 It would be nice if there was a small readme file added to the zip download to explain that to people.
>>37306812If the filename for the reference audio is too long it will throw an error when trying to generate the audio.
>>37311628 By default, the Windows API doesn't support paths longer than 260 characters. I'm working on a UI change that might help with this, but it's not really a bug.
>>37311747 >>37311774 OK, good to take note of it, because people might stick the lyrics they want to generate in the filename and run into the same issue I did. Thanks.
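Until the UI changes, an over-long reference filename can be trimmed before it is dropped into the folder. A minimal sketch, assuming the classic 260-character limit (which includes the terminating NUL) and Windows-style paths; the helper name `shorten_to_fit` and the example folder are hypothetical.

```python
import ntpath  # Windows path rules, usable on any OS

MAX_PATH = 260  # classic Windows limit, including the terminating NUL


def shorten_to_fit(directory, filename, limit=MAX_PATH):
    """Truncate the filename stem so the full path stays under the limit."""
    stem, ext = ntpath.splitext(filename)
    # Room left for the stem: limit minus NUL, directory, separator, extension.
    room = limit - 1 - len(directory) - 1 - len(ext)
    if room < 1:
        raise ValueError("directory path is already too long")
    return ntpath.join(directory, stem[:room] + ext)


# Hypothetical example: lyrics pasted into the filename.
long_name = "la" * 200 + ".wav"
print(len(shorten_to_fit(r"C:\TalkNet\ControllableTalkNet", long_name)))
```

Short names pass through unchanged, so it is safe to run on every file.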
>>37311780I just type the lyrics separately in Notepad or something.
>>37311226 GPT-J-6B inference notebook has been updated with an ngrok-access web interface. Same link. Instructions are to use Runtime->Run all and wait for the last cell to output pic related, then click the link. https://colab.research.google.com/drive/13R8MJEDTwinEmUJMLqydKOIcAvWiBIlT?usp=sharing Now I have to write a post on my Tumblr since the TRC program requires progress tracking.
>>37312085 I've tried to run it several times to no avail. It fails every time on the first step with pic related as the error. I'm using TPU and I've tried both restarting and factory resetting the runtime, with no change in the end result.
>>37312593Go to the first code block, move line 25 to line 17 and try again. I'm not the owner so I can't update it myself.
>>37312911Thanks, this seems to have worked.>>37312085I've only generated once and I'm already impressed by its coherency.
Been away from the thread for a bit, what happened to that anon who said he was trying to rig up voice-controlled generation? Did that ever make it to a workable stage?
>>37313305 It works like a charm, everybody loves it. https://colab.research.google.com/drive/1aj6Jk8cpRw7SsN3JSYCv57CrR6s0gYPB?usp=sharing#scrollTo=tOXejargIPTq (Here's Pinkie doing an a cappella song as an example.) https://u.smutty.horse/mbrjeellwhq.wav
>>37312593>>37312911Silly mistake. Fixed.
>>37292923 Actually, one way to achieve frequency equivariance could be with the constant-Q transform. It's like a Fourier transform, but the bins are geometrically spaced. So, you can make the center frequencies correspond to a musical scale. This means that a pitch shift would correspond to a shift of the bins in the same direction. I've seen some papers use it, but overall it's not as common as the STFT.
>>37314028 Excuse my dumbness, but doesn't the AI use a Mel transform for the frequency stuff?
>>37314569 Yes, but not all bins of a Mel spectrogram are geometrically spaced. So a pitch shift, i.e. a multiplicative scaling of the frequencies, won't always result in an additive shift of the bins. Maybe this is good, because human perception of pitch doesn't perfectly follow a geometric scale anyway. But, if you want equivariance to pitch shifts on a musical scale, the CQT is still probably better.
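The difference can be shown with plain arithmetic, no audio needed. The sketch below assumes a CQT with 12 bins per octave starting near C1 (32.70 Hz) and one common HTK-style mel formula; both choices are illustrative, and other mel variants give different numbers but the same qualitative picture.

```python
import math


def cqt_bin(f_hz, f_min=32.70, bins_per_octave=12):
    """Fractional CQT bin index: bins are geometrically spaced above f_min."""
    return bins_per_octave * math.log2(f_hz / f_min)


def hz_to_mel(f_hz):
    """HTK-style mel scale: roughly linear at low, logarithmic at high frequencies."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


shift = 2 ** (3 / 12)  # pitch everything up by three semitones
cqt_deltas, mel_deltas = [], []
for f in (110.0, 440.0, 1760.0):
    cqt_deltas.append(cqt_bin(f * shift) - cqt_bin(f))
    mel_deltas.append(hz_to_mel(f * shift) - hz_to_mel(f))
    print(f"{f:7.1f} Hz: CQT moves {cqt_deltas[-1]:.3f} bins, "
          f"mel moves {mel_deltas[-1]:.1f} mels")
```

On the CQT axis every note moves by the same three bins, so a pitch shift is a pure translation. On the mel axis the displacement depends on where the note started, which is why a convolution over mel bins is not equivariant to pitch shifts.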
TPDNE has been used in a paper on unsupervised StyleGAN segmentation using CLIP. As far as I know, this is the second time ponies have been used in a published ML paper (first time was iCartoonFace). >Segmentation in Style: Unsupervised Semantic Image Segmentation with Stylegan and CLIP >https://arxiv.org/abs/2107.12518 >https://github.com/warmspringwinds/segmentation_in_style
>>37316125>When AI can do better muzzles than G5.
>>37310111Is there a way to share files without me having to manually clear metadata every single time? Otherwise, it won't be an anonymous project.
>>37317212Depending on how the meta-data is stored, there's probably programs that'll do it for you.
>>37317231I'm mostly concerned about the computer name and location data in stl, (solidworks) part, and (prusa) 3mf files.
>>37317250 I think the best way to handle the metadata for the stl is to use a patching program like http://www.romhacking.net/utilities/240/ to create a patch that you can apply to any stl file. (You do this by taking one of your files that still has the metadata, manually editing the metadata out of it, then making an .ips file from the two.) This may not work, but if it does, it'll save time. I haven't found anything that says 3mf stores metadata in a common way across files, but it would be weird if it didn't. So this method may work for both.
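For binary STL files specifically, there is a simpler route than a byte patch: the format starts with an 80-byte free-text header that carries no geometry, so it can be blanked directly. This is a sketch under assumptions: it only handles binary STL, it assumes any identifying text lives in that header (the thread doesn't confirm what the exporter writes there), and it does nothing for SolidWorks part files or 3MF. The function name is made up.

```python
import struct

HEADER_SIZE = 80     # free-text header of a binary STL
TRIANGLE_SIZE = 50   # 12 floats (normal + 3 vertices) + 2-byte attribute


def scrub_binary_stl_header(data: bytes) -> bytes:
    """Return the same binary STL with its 80-byte header zeroed out."""
    if len(data) < HEADER_SIZE + 4:
        raise ValueError("too short to be a binary STL")
    (triangle_count,) = struct.unpack_from("<I", data, HEADER_SIZE)
    expected = HEADER_SIZE + 4 + TRIANGLE_SIZE * triangle_count
    if len(data) != expected:
        # ASCII STL or something else entirely; refuse to touch it.
        raise ValueError("size does not match a binary STL")
    return bytes(HEADER_SIZE) + data[HEADER_SIZE:]


# Usage sketch with hypothetical file names:
# clean = scrub_binary_stl_header(open("leg.stl", "rb").read())
# open("leg_clean.stl", "wb").write(clean)
```

Opening the result in a hex viewer is still worth doing before uploading, since this only covers one known spot.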
>>37317338Wow anon, thanks! I found anonymous github, and looking into that now too.
https://u.smutty.horse/mcihghgobfv.wav Gaius van Baelsar's speech from FFXIV delivered by Glimmer, hope you enjoy
>>37317680Starlight Glimmer is a nigger.
>>37313563Jesus, the dub of the Mentally Advanced Series almost sounds just like the VAs did it. Fits perfectly.
>>37286875 https://www.youtube.com/watch?v=NynmEU-tCBA So here it is, another long audio episode. This time I've done it 100% without using 15's voices while going full ham on the ambient and sfx sounds. I hope you guys will all enjoy watching this one.
>>37318223 That was a good watch. Flutters could be really adorable as a tap-dancing god
>>37286875 The Mane 6 get together for a friendly rap battle, courtesy of Pinkie befriending Tyrone, but it turns out there’s more to him than anypony expected... https://youtu.be/psS0fTknd-c This is another thing I’ve wanted to do for ages, now finally realised with SortAnon’s TalkNet, and a slammin’ beat courtesy of BGM. Thanks very much to you both for making this possible. Based on this /mlp/ thread from 2012: https://desuarchive.org/mlp/thread/6709157 https://u.smutty.horse/mcidnwljcym.png High quality downloads: Full version - https://mega.nz/file/Nc4mFTYZ#h65qzHd57Ser8ObYz0BXn5imdzaBeoSjfHqY104hw9o Rap battle only - https://mega.nz/file/dIwSFJiB#_AnhKoHngCn4NKGTTu5FU2OdkRUoOHKZ8g-wzn7BR4M Instrumental - https://mega.nz/file/oZgyzZxT#qaTr8vMcYMakKdZNSFN0dmSBEaeXyJ2bGN1fvaAtQT8 >>37318223 Nice work mate, I enjoyed listening and seeing you get better at using sound effects. One thing I did notice is that a lot of the lines got cut off just before the end, which sounded a bit weird. Not sure what was going on there, but it was still a fun story and I'm happy to see you still making these.
>>37319348 >Mane6 Rap Battle Now that's good shit. Btw, where do you get the Tyrone sounds? I hear that voice inserted in YTPs from time to time but I can never find the original source. >lot of the lines got cut off just before the end I ran into a problem where generation kind of breaks the last word in the sentence (for some reason mostly with TS and RD lines): like 9 out of 10 times, a weird crackling, shimmer or other noise is added to it. So I solved it by adding an extra word and just cutting it out in the editor. I know I can generate the word on its own or in another sentence and try to swap it in, but a lot of the time it will have a different speed, pitch and/or tone from the original sentence, and to me that sounds more jarring than cutting the sentence short. I think that could be solved with ARPAbet control like in DeltaVox; with enough dicking around, any word can be broken up and rearranged in such a way that I can force it to work. And before someone posts to just use more reference wavs: I've used that function for some of the words that the TTS had trouble generating, but sadly most of the time my way of speaking did not go hand in hand with the way the pony models wanted to pronounce the word (and on a more autistic note, pretty much all the lines were generated between 5 and 20 times, pic related).
>>37319348 Love how it turned out. It was a lot of fun to collaborate on! >>37319473 >were do you get the Tyrone sounds? They're from my music program, Logic Pro. I exported them for Clipper; if anyone else wants them, here they are. They're all locked to 100 BPM 'cause that's what was relevant for the project. https://drive.google.com/file/d/1wXk1FfoLzzIR11-5qApgu2_3Jvq2tuPY/view?usp=sharing
Hey Sortanon, I think you need to fix something in your Colab TalkNet training code. I just got some missing modules on step 5 (easy to fix with the '!pip install' command) and a wall of error text on step 7. I've changed the batch size to 1 but that didn't fix it. https://pastebin.com/dhWxrtxQ RuntimeError: The size of tensor a (217) must match the size of tensor b (685) at non-singleton dimension 1
>>37320238Still works on my end. Post your dataset.
>>37318223What voice is Twilight Sparkle at 6:15?
Voiced MLP mod for Idol Manager when
Hello. Which model was used for this? https://u.smutty.horse/mbtxajqrtdk.mp3
>>37321564The singing model
>>37321578This one? https://colab.research.google.com/drive/1aj6Jk8cpRw7SsN3JSYCv57CrR6s0gYPB
The Weeknd - I Feel It Coming https://u.smutty.horse/mciudfgpcmg.mp3 5am drop, chorus might be a bit off but whatevs. Enjoy
>>37321626I love how she holds those long notes, or whatever they’re called. Good song choice too, nice job anon.
Can't get the setup.bat to run with the offline talknet
Hey Sortanon, the offline TalkNet is not detecting my GPU anymore. I've done nothing code/driver-wise with my PC between >>37308861 and now.
>>37321741 Fixed. Delete your output folder and try again. >>37321922 What happens when you click on it? >>37322229 I haven't changed anything either. Try rebooting your PC. If that doesn't help, go to line 40 and change "cuda:0" to "cuda".
>>37321626 >>37321663 Fuck, I forgot to fix some of the notes https://u.smutty.horse/mcixvofmfxh.mp3
>>37322342This is fantastic, very well done.
>>37319348Damn this was good
>>37322255 >Fixed. Yep, it's training all right. >Try rebooting your PC Well, it is working now, but I also got welcomed by this message on startup. Not sure if this file and the error were connected or not.
>>37322549 https://u.smutty.horse/mciyzjcedsx.mp3 Hmm, the trained model sounds really out of breath (Spectrogram 400, HiFi-GAN. 2,000 steps). I think next time I will try using the pretrained spectrogram and see if that makes any difference.
>>37322342That was amazing to listen to, great work anon.
>>37322549 That's the auto-updater for GeForce Experience, I think. There's not a lot of info about it. Something's wrong with your system. Install CrystalDiskInfo and check your C drive. Run Windows Memory Diagnostic and test your RAM. If none of those are failing, remove all the NVIDIA stuff and do a clean install of the latest drivers.
Is anyone training a pony module for NovelAI? I wonder how it'd compare to Delta's FIMFiction model. https://novelai.medium.com/custom-ai-modules-dbc527d66081
>>37323935is this something that I could download and use with KoboldAI to play offline?
>>37323960No, but others will probably copy what they've done. A 200 KB module is a lot easier to share than an entire GPT-J model.
>>37323935 Delta's model should be better: >12 GB of Fimfic stories vs 10 MB max for NovelAI >GPT-J with 6B params vs GPT-Neo with up to 2.7B params >Trained for 20k steps (about 34h according to >>37290865) vs a maximum of 8k steps (about 24 min) per month >Free vs having to pay $25/month for Opus tier to train NovelAI might be better if you only want to finetune on a tiny amount of data and don't mind paying up. Maybe running GPT-Neo on their servers is faster than running GPT-J on Colab. But quality is probably much better with Delta's model.
>>37324237They're both based on GPT-J.
>>37324281 Oh, where does it say that? I'm not familiar with the company, so I just looked at two recent blog posts, both of which say Neo: >https://novelai.medium.com/roadmap-pricing-launch-scaling-new-features-cfb7efa445eb >https://novelai.medium.com/the-first-month-of-novelai-30a4a551a4ba The second post also confirms that it's the 2.7B Neo model.
>>37324292Never mind, I see in the FAQ that they do have a GPT-J model too. The main advantage of Delta's model is just the dataset size and training time, then.
>>37324947huh, fast weekend, I guess people are busy with stuff.
>>37325467The board has been stupid fast recently. I don't know what exactly is going on, but you can't even leave a thread unattended for a couple of hours anymore without dying due to the insane post volume these days. I know for a fact this wasn't a problem a couple of years ago.
>>37325479And the crazy thing is it seems less stuff than ever is going on. Probably just EQG and G5 spam ramping up again.
>>37325479>>37325496>probably just EQG and G5 spam ramping upThere are usually 8-12 eqg "human" threads at once, the last three days we've had 20+. Yesterday there were 23. That extra dozen threads makes a big difference.
>page 9IM GONNA SAY IT
>>37322868 https://u.smutty.horse/mcjksnphbvp.mp3 Trained to the same spectrogram and HiFi-GAN steps, but soft-starting from another model's spectrogram made it sound worse. However, this is probably my fault because I used a male voice. Welp, I've used up all my free GPU so I will need to wait a few days for further tests. BTW, it'd be nice if the guy who trained the Soldier on TalkNet could share what options he used and how long he trained his model. I mean, I know there is going to be a massive difference between a 2-minute model and a 30-minute one, but still.
>>37325641I wish the mods actually did their fucking jobs and stopped this spam. We lost a bunch of extremely valuable threads already to this rampant spamming, like instances of tempo or elaowf. All because the mods just don't give a shit.
Since TEMPO is dead, I'm looking for any copy of a Glitch VST or Glitch 2 by Illformed. Thank you. This is for a vaporwave.
is there a tutorial for the talknet stuff?
>>37327847Follow the instructions written at the top of the notebook. Ask here if you run into any errors.
>>37327852I found it, sorry for pestering. Even more embarrassing, I had a copy of a GLitch 2 VST in my folder already. So if anyone wants it, I can provide.
okay I checked the Google Doc but I don't see the link to TALK.net. do I need to check the archive for those dead posts in the OP?
>>37327886You added a period that shouldn't be there, that's why you can't find it. It's "talknet" not "talk.net"
Has anyone wasted their free trial of splitter.ai's better 2-stem model to figure out if it's viable for getting clean voicelines for training or talknet stuff yet?
>>37328145 From what I've seen lalal.ai is still better. You have to give an email for a full 10 min max song though
>>37328145>>37328759I used lalal.ai to extract vocals for a tempo collab a while back. It works pretty well for what it's intended but it'll probably damage the vocals and retain too much noise to be usable for the purpose of clean lines. Izotope RX7 would probably work better anyway since lalal.ai is specifically for music while the dialogue isolate in RX7 seems to be much more geared towards general purpose.
Page 10 in four hours? Still not used to the recent boost in board speed.
>>37329304>the recent boost in board speed.It's spam. It's just getting worse.
>>37329304>>37329372The cup is going on. That always brings a bit more activity to the board as well.
>>37328145 >>37328759 >>37328778
There's also `vocal remover 5`, which has new models optimized for extracting the vocals (at the cost of confusing naming: the vocals end up in the INSTRUMENT file).
I'd say it's similar to the old lalal.ai model (to my untrained $20-headphone-having ear), that is to say you might be able to extract some bits.
Samples with unhelpful commentary, based on this Wander Over Yonder clip: https://files.catbox.moe/qxs5mb.wav
vr5 vocal_2band: https://files.catbox.moe/p6g7v6.mp3 (meh)
vr5 vocal_hp_4band: https://files.catbox.moe/pdl2k0.mp3 (similar?)
vr5 regular 4band: https://files.catbox.moe/5krclh.mp3 (a lot more foley makes it in)
lalal.ai new mild: https://files.catbox.moe/q1i92z.mp3 (a bunch of foley but at least it doesn't dip)
lalal.ai new aggressive: https://files.catbox.moe/5q7dgq.mp3 (similar to above but cuts out the bit at the end)
lalal.ai old mild: https://files.catbox.moe/vbfm8e.mp3 (bad awful horrible)
lalal.ai old aggressive: https://files.catbox.moe/a0s12r.mp3 (similar to vr5)
The Python code can be downloaded from: https://github.com/Anjok07/ultimatevocalremovergui/tree/v5-beta-cml (the 5.0.0 in `releases` is outdated)
and the models from: https://github.com/Anjok07/ultimatevocalremovergui/releases
Alternatively, here's a Colab notebook for it: https://colab.research.google.com/drive/1eK4h-13SmbjwYPecW2-PdMoEbJcpqzDt?usp=sharing
`pretrained_model` should be `MGM-v5-Vocal_2Band-32000_BETA1` or `...BETA2`, `parameter` should be `2band_32000.json`, and `aggressiveness` should apparently be `0.5`, but I can't tell the difference.
>>37329971 The vr5 samples are really impressive. I'm surprised you think they sound worse.
What's the latest on the animation AI? Is that still making good progress?
Yesterday I heard a 15.ai pony song crossover with the song Everlong by Foo Fighters, but I've been looking through this thread and haven't found it, unless I'm blind and retarded. Any help?
>>37286871 Is it finally online? C'mon, I want to start my pony review series and I don't want everyone to recognize my voice. I want to use my voice for other things without everyone yelling "Haha he started off with ponies, haha he can't write worth a shit but he criticized the show for bad writing"
>>37331533
>caring about what normies think
Give up, you've already lost.
>>37331452 https://u.smutty.horse/mccpdvywsdw.mp3 This is a reupload link; I've renamed the file and can't find the original (I think it was first posted in the TEMPO threads).
>>37331557 Fine, it's actually because I'm too lazy to voice it myself, but I can't stand the way Google.AI sounds.
>>37331569 Thank you, based fren. Foo Fighters is my jam.
>>37331569 fuck, this sounds better than expected
>>37331572 Give him a couple more days, it'll be back soon
https://u.smutty.horse/mcjxkqgbrwp.mp3
TF2spyTN_TalkNet, Train spectrogram 400, Train HiFi-GAN 7000 steps
1BbatM94deM1iCYBiH8Lib-tfv-olKF4h
I'm taking a step back from doing audios to train the TF2 voices. Here's a Spy example: the no-reference talk, singing reference and talking reference. It seems the Spy model prefers to work with deeper male voices and has a bit of trouble "stretching" its notes. I totally expect one of you guys to make a cover of some French song by the end of the week.
>>37331681 ...fuck, forgot to quote the anchor post again >>37286875
>>37331681 The normal talking part actually sounds pretty good, nice job!
>>37331452 See >>37261796
>>37331569 That's the original link, it came from a previous PPP thread
>>37323935 Here you go, Anon. There are only two MLP modules at the moment. One is for Fallout Equestria and is trained on the original plus part of Project Horizons; the other is my Friendship is Optimal module, which I managed to train on most of the canon-compatible stories. I do want to try Delta's model on KoboldAI to compare the two, though. It's just that NAI has a way better interface and memory functions. anonfiles dot com slash zaPeX79du7 slash MLP_NAI_Modules_7z
>>37331972 This doesn't have the Fallout Equestria module, only the lorebook.
>>37332182 Well oops, here: anonfiles dot com slash F2k0Y896uc slash Fallout_Equestria_Universe_2_module
>>37332212 Thanks. I'm surprised no one's trained a more general pony module yet, though. The closest I've seen is a foalcon model on /vg/.
>>37332280 Well, I do have more training steps; we were given another full month's worth after a glitch. I just don't know what fics and stuff I should use for it.
Could somebody please train TalkNet models for the CMCs? They don't have to be singing, but it'll be a nice bonus.
>>37332819 I'll have a new batch of characters trained in a few days. Thanks for reminding me.
>>37332847 Have you considered training from the Kristin Chenoweth dataset? https://mega.nz/folder/0UhSmYAB#WBrB-qCprQTofkAhwMp5CQ/folder/JMJnlYYJ Clipper was complaining that the voice Chenoweth used here to narrate her audiobook is too neutral to sound like Skystar, but this is exactly the kind of application that voice style transformation is suited for. If we want her to sound more emotional, we can supply the emotion ourselves in the reference audio.
>>37332847 I'm interested; since TalkNet is a single-speaker model, what would happen if you trained it on data from multiple characters? Would it end up sounding like an amalgamation of those characters, or would it sound uneven and constantly shift in and out of different "voices"?
Is there just a plug and play way to make use of this? Like just feed a bot some text and get a voice out?
>>37332997 Not until 15 puts his site back online.
>>37333036 >>37332932 What about the Tacotron 2 notebook, or the TalkNet notebook?
>>37333053 I don't know if you're the same anon from >>37332997, but someone who asks that question wouldn't consider them to be "plug and play".
Just a note: if you're using the local TalkNet program, don't keep too many GPU-intensive programs open while generating audio. I had some editors open in other windows and the GPU soft-crashed, causing the programs to close without saving. Nothing major was lost, but be careful when doing such things. Also, always make backups.
>>37332847 Great work on the TalkNet stuff, just started messing with the "normal" voices using reference audio and the results are very interesting. Also the program is a godsend.
>>37331681 Now we can have Spy voice all of his lines from the TF2 comics. https://www.teamfortress.com/comics.php
"CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 7.43 GiB total capacity; 193.70 MiB already allocated; 4.81 MiB free; 244.00 MiB reserved in total by PyTorch)" and that's what TalkNet said
>>37333744 What version are you using? The offline version has an option to make the pitch stuff run on the CPU: in the file controllable_talknet.py, change "CPU_PITCH = False" to "CPU_PITCH = True". For the Colab one, I guess you can try Runtime -> Restart runtime and then run the last cell.
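For anyone who would rather script that edit than do it by hand, here is a minimal sketch. It assumes the flag appears in controllable_talknet.py exactly as quoted above (`CPU_PITCH = False` at the start of a line); the helper names `enable_cpu_pitch` and `patch_file` are made up for illustration and are not part of TalkNet.

```python
import re
from pathlib import Path

# Matches the flag line as quoted in the post; the real file may format it differently.
FLAG_PATTERN = re.compile(r"^([ \t]*CPU_PITCH[ \t]*=[ \t]*)False\b", flags=re.MULTILINE)


def enable_cpu_pitch(source: str) -> str:
    """Return the source text with CPU_PITCH switched from False to True."""
    return FLAG_PATTERN.sub(r"\g<1>True", source)


def patch_file(path: str) -> bool:
    """Rewrite the file in place; return True if anything changed."""
    file = Path(path)
    original = file.read_text(encoding="utf-8")
    patched = enable_cpu_pitch(original)
    if patched != original:
        file.write_text(patched, encoding="utf-8")
        return True
    return False
```

Calling `patch_file("controllable_talknet.py")` from the program's folder would apply the change; keep a backup of the file first, since it is overwritten.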
>>37333761 Found the fix. Restarting didn't work, but factory resetting worked.
>>37333865
>still thinking LINK is a good buy
imagine not holding at least $100 worth of GE before today
>>37333744 >>37333772 This is what happens when Google lends you a Tesla P4 instead of a T4. In the last thread, I mentioned that you can see this by running only Cell 1 first: >>37262053
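To check which card Colab handed out before running the rest of the notebook, something like the following could go in a scratch cell. This is a sketch, not the notebook's actual Cell 1: `nvidia-smi` and its `--query-gpu` / `--format` options are real, but the helper names and the idea of flagging the P4 by name are assumptions based on the posts above.

```python
import subprocess
from typing import Optional


def query_gpu_name() -> Optional[str]:
    """Ask nvidia-smi for the first GPU's name; return None if it is unavailable."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    lines = result.stdout.strip().splitlines()
    return lines[0].strip() if lines else None


def is_problem_gpu(name: Optional[str]) -> bool:
    """True for the card reported in this thread to run out of memory (Tesla P4)."""
    return name is not None and "P4" in name.upper().split()
```

If `is_problem_gpu(query_gpu_name())` comes back true, a factory reset of the runtime (as reported above) may get you a different card.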
I am in dire need of 15 to return, but if he's gone for this long, that must mean he's making sure it's extra good, or trying some new thing he thought of.
>>37335387
>if he's gone for this long that must mean he's making sure its extra good
>>37335439 He's passed out in a lab with VR goggles and headphones. Victim to his own invention as his mind lives in Equestria while the body withers away. God I wish that was me.
>>37334266
>This is what happens when Google lends you a Tesla P4 instead of a T4.
Interesting. Pascal cards seem to be the common factor, but I've never seen it happen on a P100. Maybe the extra VRAM is enough to mask the problem.
>>37332981 I trained a multispeaker TalkNet last month. It works fine, but the audio quality's worse. I plan to experiment with it more later.
>>37335823 He finally made it to Equestria... I actually know someone who's friends with him. Dude's been playing TF2 here and there.
>>37334266 I hope not to get the P4 too often
>>37336056
>It works fine, but the audio quality's worse.
Is this regardless of the source audio quality? If you fed it, say, hours and hours of perfectly clean, perfect quality audio ripped from, say, a video game, or from a sufficiently large audiobook sample, does it make any difference? I'm mostly interested in whether it would be possible to essentially create a "hybrid" voice based on a sufficiently large and high-quality sample set of two character voices.
Why don't we just create an AI to create better AIs
Apple Bloom, Scootaloo, Sweetie Belle and Cozy Glow have been added to TalkNet.
>>37336775 You'd want a multispeaker model for that as well. I haven't tried it, but mixing speaker embeddings should let you "morph" between two different voices, kinda like pic related with StyleGAN.
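The "morph" idea amounts to linear interpolation between two speaker embedding vectors. A toy sketch, assuming a multispeaker model that exposes one fixed-length embedding per speaker; the vectors and the `blend_embeddings` helper are made up for illustration, and this is not TalkNet code.

```python
from typing import List, Sequence


def blend_embeddings(a: Sequence[float], b: Sequence[float], t: float) -> List[float]:
    """Linearly interpolate: t=0 returns a, t=1 returns b, t=0.5 is the midpoint."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be between 0 and 1")
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


# Hypothetical 4-dimensional embeddings; real ones would come from the trained model.
speaker_a = [0.0, 1.0, 2.0, 4.0]
speaker_b = [1.0, 1.0, 0.0, 0.0]

halfway = blend_embeddings(speaker_a, speaker_b, 0.5)
```

Sweeping `t` from 0 to 1 and synthesizing with each blended vector would be the voice equivalent of a StyleGAN latent walk, though whether the in-between voices sound good is untested here.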
>>37336502 He's always playing TF2 and he's fuckin nasty at the game too
>>37336796 If only it were that simple...
>>37336903 Voice morphing sounds very interesting. Would allow for many possibilities in regards to new voices. Like a voice version of This Pony Does Not Exist, but not to that extreme and with more control.
Would it be greedy and selfish if I were to ask SortAnon for a singing model of Discord so I can do this dumb meme better? https://u.smutty.horse/mckppypurdz.wav
>https://derp.link/NR7Xi (Ngrok Synthesis)
Why does this thing have minor-ass characters that only showed up once, yet no Twilight Velvet?
>>37336903
>but mixing speaker embeddings should let you "morph" between two different voices
I don't suppose this would help voices with very little data by using a much more robust and stable voice as a crutch, right?
>>37337725 Multi-speaker models always sound way worse than single-speaker models, that's why everyone except 15 is using single speakers.
>>37337725 We're already kind of doing that: instead of training a fresh model every time, we use a pretrained model which had 24 hours of training data ( https://keithito.com/LJ-Speech-Dataset/ ). Though I do wonder if a model pretrained specifically on cartoon voices would fare better.
>>37336903 Awesome, great to see the character list expand.
>>37336056
>multispeaker TalkNet
This brings a question to mind. Would it be possible to get singing models for characters with little to no singing data with a multispeaker model? I ask because I recall many of the speakers from Cookie's Ngrok model being capable of rudimentary singing, even those that had little to no singing parts in the actual show. I don't know how this all works, I just want more singing models.
>>37337304 Thank goodness Travis Stebbins provided acapella tracks.
>>37338602 It was really nice of her, yes.
>>37338602 The acapellas have a lot of harmonies that mess up the pitch detection, so I'm actually singing the harmonies myself in this. However, at least for Discord, TalkNet can't do stretched notes well at all, so it's tough to do the rest of the track, especially the "discord~!" chants.
>>37337963 I think it would, specifically because the LJ Speech dataset uses older/more formal English and lacks emotion and dynamics in voice tone.