/mlp/ - Pony

File: AltOPp.png (1.54 MB, 2119x1500)
Welcome to the Pony Voice Preservation Project!
youtu.be/730zGRwbQuE

The Pony Preservation Project is a collaborative effort by /mlp/ to build and curate pony datasets for as many applications in AI as possible.

Technology has progressed such that a trained neural network can generate convincing voice clips for any person or character using clean audio recordings as a reference. As you can surely imagine, the ability to create audio in the voices of any pony you like has endless applications for pony content creation.

AI is incredibly versatile: basically anything that can be boiled down to a simple dataset can be used for training to create more of it. AI-generated images, fanfics, wAIfu chatbots and even animation are possible, and are being worked on here.

Any anon is free to join, and there are many active tasks that would suit any level of technical expertise. If you’re interested in helping out, take a look at the quick start guide linked below and ask in the thread for any further detail you need.

EQG and G5 are not welcome.

>Quick start guide:
derpy.me/FDnSk
Introduction to the PPP, links to text-to-speech tools, and how (You) can help with active tasks.

>The main Doc:
docs.google.com/document/d/1xe1Clvdg6EFFDtIkkFwT-NPLRDPvkV4G675SUKjxVRU/edit
An in-depth repository of tutorials, resources and archives.

>Active tasks:
Research into animation AI
Research into pony image generation

>Latest developments:
Singing Talknet models (>>37134971 >>37144858)
Animate automation tool available (>>37147092)
GDrive clone of Master File now available (>>37159549)
SortAnon releases script to run TalkNet on Windows (>>37299594)
TalkNet training script (>>37374942)
Delta updates GPT-J model (>>37554229)
GPT-J downloadable model (>>37646318)
SortAnon found way to vastly improve TalkNet audio quality (>>37662611)
AI Dub doc: derpy.me/8q4Qc
Ways devs can help (>>37730470)
Delta GPT-J Notebook (>>37751617)
Possible phone synthesis (>>37815692)
New GPT-J notebook (>>37892761 Delta)
First version of XFL to SVG converter done (>>37968656)
Delta adds option to TalkNet to remove metallic noise (>>38005021)
Latest Cookie progress report (>>37956957)
Latest Synthbot progress report (>>37959724 >>37960011 >>37960504 >>37972564 >>37975682 >>37980176 >>37986331 >>37989711 >>37990681)
Latest Clipper progress report (>>37887444 >>37907668 >>37909120)

>AI REDUB COMPLETE!
-Ep1
youtu.be/gEXaFVw9J1o
derpy.me/ELksq

-Ep2
youtu.be/fIhj2bFYG4o
derpy.me/RHegy

-Unused Clips
youtu.be/N2730oPqLzE
derpy.me/OKoqs

-Rewatch Premiere
derpy.me/EflMJ

>The PoneAI drive, an archive for AI pony voice content:
derpy.me/LzRFX

>The /mlp/con live panel shows:
derpy.me/YIFNt

>Clipper’s Master Files, the central location for MLP voice data:
mega.nz/folder/jkwimSTa#_xk0VnR30C8Ljsy4RCGSig
mega.nz/folder/gVYUEZrI#6dQHH3P2cFYWm3UkQveHxQ
mirror: derpy.me/c71GJ

>Cool, where is the discord/forum/whatever unifying place for this project?
You're looking at it.

Last Thread:
>>37942170
>>
FAQs:
If your question isn’t listed here, take a look in the quick start guide and main doc to see if it’s already answered there. Use the tabs on the left for easy navigation.
Quick: derpy.me/FDnSk
Main: derpy.me/lN6li

>Where can I find the AI text-to-speech tools and how do I use them?
A list of TTS tools: derpy.me/A8Us4
How to get the best out of them: derpy.me/eA8Wo
More detailed explanations are in the main doc: derpy.me/lN6li

>Where can I find content made with the voice AI?
In the PoneAI drive: derpy.me/LzRFX

>I want to know more about the PPP, but I can’t be arsed to read the doc.
See the live PPP panel shows presented on /mlp/con for a more condensed overview.
derpy.me/pVeU0
derpy.me/Jwj8a

>How can I help with the PPP?
Build datasets, train AIs, and use the AI to make more pony content. Take a look at the quick start guide for current active tasks, or start your own in the thread if you have an idea. There’s always more data to collect and more AIs to train.

>Did you know that such and such voiced this other thing that could be used for voice data?
It is best to keep to official audio only unless there is very little of it available. If you know of a good source of audio for characters with few (or just fewer) lines, please post it in the thread. 5.1 is generally required unless you have a source already clean of background noise. Preferably post a sample or link. The easier you make it, the more likely it will be done.

>What about fan-imitations of official voices?
No.

>Will you guys be doing a [insert language here] version of the AI?
Probably not, but you're welcome to. You can however get most of the way there by using phonetic transcriptions of other languages as input for the AI.

>What about [insert OC here]'s voice?
It is often quite difficult to find good quality audio data for OCs. If you happen to know any, post them in the thread and we’ll take a look.

>I have an idea!
Great. Post it in the thread and we'll discuss it.

>Do you have a Code of Conduct?
Of course: 15.ai/code

>Is this project open source? Who is in charge of this?
derpy.me/CQ3Ca
>>
File: AnotherAnk.jpg (73 KB, 800x1099)
>>38014772
Anchor.
>>
Excellent! A new thread! By tomorrow I should have a decent write-up on using Auto EQ to improve the spectrogram. If anyone is interested, I may also do a live stream or a video at some point explaining the problems I see in the pony datasets that may be responsible for some of the current issues.
>>
File: 1640869726559.png (275 KB, 846x803)
Timeline of events of the most recent Noxfag/Delta/ZDisket and Cookie drama:

>2021 /mlp/ Awards nominations begin (>>37983482)
>Cookie is nominated by multiple anons as Worst Fan in the Fandom due to his involvement with the August Uberduck drama, which ultimately led to his leaving the PPP.
>A user with the NMMflag sees this and nominates Cookie as Best Namefag of the Year as retaliation (>>37984128).
>NMMflag proceeds to make further nominations for other categories (>>37984164) and goes on to reveal himself as the infamous Noxfag (>>38002630).
>An anon notices this and deduces that the NMMflag in the thread must be the same person known as ZDisket from the TF2 threads (>>38009480), who was already known to be Delta from the PPP threads (>>38005129, >>38005151, https://desuarchive.org/mlp/thread/37371591/#q37374456).
>Cookie is informed that he is being nominated as Worst Fan by Delta/ZDisket/Noxfag (>>38011797) and tries unsuccessfully to defend himself. This only ends up pissing off more people due to his smug behavior (>>38011855, >>38011885). Later, Cookie pretends that Delta didn't inform him of this and tries to play off the Delta = Noxfag accusation as a joke in the PPP thread (>>38013828)
>Another anon inquires whether "Noxfag" is the "Pathetic" NMM shitposter (>>38013559). Anons confirm that this is true. One anon notices that Delta from PPP posted the same exact NMM image in May 2021 (months before any drama involving Delta) with the same hash/filename (>>38013570, pic related) as Noxfag, which decisively proves that Delta = Noxfag.
>Anons begin to suggest that both Cookie and Noxfag should be on the same ticket for Worst Fan of the Fandom (>>38009494). Anons also realize that Delta/ZDisket/Noxfag is actually a Lunafag (>>38009591) and his abhorrent NMMflaggot behavior is his /mlp/ persona.
>Delta/ZDisket/Noxfag continues to post in the Awards thread without the NMMflag, repeatedly denying that Delta = Noxfag despite the overwhelming evidence (>>38013587, >>38013608, >>38013633, >>38013665, and much, much more).
>Word reaches the PPP thread that Delta is the infamous Noxfag (>>38013643)
>Delta assumes his Twifag persona in the PPP thread and pretends to not know what's happening (>>38013820). Anons don't buy this at all (>>38013827, >>38013829, >>38013832, >>38013837, >>38014015, >>38013855, >>38013867, etc.).
>Cookie (despite being informed of the awards nominations by Delta) tries to play off the accusation as a joke (>>38013828). Anons find the blithe attitudes of Cookie and Delta toward the Noxfag situation insufferable and smug (>>38013891). Cookie also states that "he hates 15's work with a passion" (>>38013929).
>Anons are rightfully pissed off by the revelation (>>38013972) and Delta's reputation is irreparably damaged. Some call for Cookie and Delta to be ousted from the PPP.
>>
Reminder that I attempted to fix the GPT-J notebook a while ago: >>38000931
Synthbot, some time later I'll upload the model directly to Cloud and try to make the notebook load it directly
>>
>>38014816
The last conversation anyone will have on this Earth will be of three people: The first being a faggot, the second yelling at him for being a faggot, and the third telling the second not to respond to the faggot.
>>
>>38014829
Almost correct, except that in reality all three of them would be faggots.
>>
>>38014829
What if the faggot is right and the two other faggots gang up against them to keep things polite?
>>
>>38014792
>Auto EQ documentation and video
I am very much eager to see this new tech. Even more to see new and non-outdated documentation.
>>38014810
That reminds me. I've been meaning to ask for ages now: whatever happened to the GPT-J site thing which didn't require a notebook/Colab to operate? Where you could just enter a prompt right there and generate up to a certain amount?
>>
>>38014846
TRC and Google Cloud trial ran out.
>>
>>38014846
>I am very much eager to see this new tech.
Good to see some enthusiasm! To clarify, it's not new; recursive automatic filtering is ANCIENT. I remember using it in forensics software from the 90s. This is just a novel way to use it!
>>
>>38014852
No one wants you here.
>>
>>38014888
Yes we do.
>>
>>38014867
I mostly meant new as in something to potentially add to the main doc or something. It's been pretty stagnant lately, at least from what I've noticed.
>Used in the 90s
Recycling old tech for use with new tech huh? Now that's retrofitting.
>>
Has anybody done a side-by-side comparison of all the major pony TTS options currently available?
I remember the occasional comparison of single clips, but has anybody tested across a variety of text lengths and speakers?
>>
>>38014958
Don't think so. The thread developed a consensus that 15's site is best for general speech, TalkNet for singing or if you have great reference audio, and the ngroks for characters not available on the other two.
>>
>>38014958
I mean, I used to do side-by-side character comparisons of updated 15.ai versions all the time, back before they actually kept their site up longer than a day or two. Don't see why I couldn't do that with TalkNet as well and compare them both in a compilation.

Downloading numerous Delta and MMI models would be too time-consuming, and TalkNet seems to be the runner-up to 15.ai anyway, so better to omit the first two from the comparisons, if we're doing all available characters that is.
>>
File: file.png (87 KB, 736x1354)
>>38014979
I made a little webpage based on the original Ngrok models, but it supports any number of TTS pipelines and will log which models perform best in different circumstances along with confidence intervals and such.
Originally its purpose was just to help me improve my own models, but it would be trivial to add 15.ai, TalkNet, Delta and MMI support.
It could be like an "all in one" synthesis notebook.
Rather than just showing a few options from the same model, each audio file comes from a completely different pipeline.
Should make it really easy to tell which models perform best in which circumstances and will help collect data for MOSNets and such in future.

If you want to do it though, go ahead. I have no idea how long it'd take me or what kind of bugs might be introduced. Just an interesting idea.
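For anyone curious how the "which model performs best, with confidence intervals" logging mentioned above could work, here's a rough sketch, not the webpage's actual code and with made-up numbers: tally blind A/B preference trials between two pipelines and report a Wilson 95% interval on the win rate.

# Hypothetical bookkeeping sketch, NOT the site's actual code: tally blind
# A/B preference trials between two TTS pipelines and report a Wilson 95%
# confidence interval on the win rate.
import math

def wilson_interval(wins, trials, z=1.96):
    # Wilson score interval for a binomial proportion (95% by default).
    if trials == 0:
        return (0.0, 1.0)
    p = wins / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    margin = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return (center - margin, center + margin)

# Example: a listener preferred model A in 80 of 100 blind trials.
lo, hi = wilson_interval(wins=80, trials=100)
print(f"preference for A: 80% (95% CI {lo:.2f}-{hi:.2f})")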
>>
>>38014998
>it would be trivial to add 15.ai
>>38013929
>I hate 15's work with a passion
>>
>>38014998
If it will be a notebook, please use localtunnel or the Colab tunnel. As a reminder, ngrok was killed.
>>
>>38014998
That does sound interesting.
>>
>>38015005
I'm secretly hoping one of the other models will be better than 15.ai so I don't have to keep hearing how it's the best one without any proof.
>>
>>38014958
Talknet is best if you're a voice actor and you have good reference, but if you're just using TTS, 15 is most reasonable.
>>
>>38014998
>but it would be trivial to add (...) Talknet,
You mean Talknet with reference audio, right? Talknet's all deterministic so it tends to perform poorly without reference.
>>
>>38015024
What did you notice was lacking in the Talknet model?
Is emotion good or bad?
Is audio quality good or bad?
Is pronunciation good or bad?
any features that it's missing?

>>38015029
>Talknet's all deterministic so it tends to perform poorly without reference
Hmmm, odd. Determinism shouldn't have any impact on output quality on its own.
>>
>>38015017
So you just hate 15's work because you're jealous?
>>
>>38015036
>Hmmm odd. Determinism shouldn't have any impact on output quality on it's own.
nta, but let me tell you it sure does. Without reference, a lot of the time it feels like there's something missing in the voices, but with badly pitched reference the output is kind of bad, so it's a mixed bag.
>>
>>38015036
https://u.smutty.horse/mexkattniwn.wav
>>
From my general experience with them, 15.ai remains the best and most natural sounding, with TalkNet being second. But there's a couple more reasons why I find 15.ai preferable.

[15ai advantages]
Offers more characters, three outputs at once, fast output processing, varying/random outputs allowing for careful selection of the most desired results, better emotional direction without requiring reference audio (no need to record anything), immediate ease of use with no loading time.

[15ai disadvantages]
Lack of singing voices, cannot use reference audio to give specific vocal direction, no custom models, not always available to use, missing pony characters.

I'll see about making that audio compilation vid after work this arvo, providing I'm not too tired. Otherwise I'll start the next morning.
>>
>>38015085
Jesus that sounds really really good.
>>
>>38014998
Do you really think 15 would be okay with you adding his model to your website? If he doesn’t want anything to do with you I’d doubt that he’d be okay with this.
>>
>>38015100
Personally I think HiFi-GAN Tacotron2 offered better quality, but sadly it took forever to train a voice on it and required a pretty big dataset to create a decent model. And while I like TalkNet's ability to use your own voice as reference (it's pretty good), it still struggles with special cases like whispering or holding long notes.
In my imaginary perfect TTS I would love to have some sort of combination of emotion, pitch and speed controls on sliders combined with an audio reference, like being able to give it a singing reference line and use the sliders to give it an extra push towards one or another emotion, or stretch the singing line without making it sound artificial (so far the only solution is to take multiple shorter lines and stitch them together).
>>
A large influx of QUIVERING MARES ought to put a stop to that!
https://u.smutty.horse/mexkeyqdbhi.wav
>>
Luna reading poetry
https://u.smutty.horse/mexkhacnxsp.mp3
>>
>>38015209
Yeats' poem for our times.
>>
>>38015209
Where is my Trixie song, contentfrens? I need to hear my waifu sing a ska-punk song!
>>
>>38015220
Someone else is gonna have to do it. My shitty computer can't run TalkNet. Here's Trixie rapping though: https://u.smutty.horse/lvmzehreguo.mp3
>>
File: Trixmug.jpg (90 KB, 1280x720)
>>38015220
I'll fucking do it if someone can figure out a way to make a decent Trixie singing model for TalkNet. I already have the instrumental partially complete in fact.
>>
>>38015268
Meanwhile I'm still waiting on Luna and Celestia singing models. They have sung before in the show, so there is at least something to work off of. I've also still got that list of moon/night related songs for moon princess to sing.
>>
File: 1611839139157.jpg (485 KB, 2735x1123)
>trying to make TalkNet offline work
>bat cannot find my "C++ build tools"
What the fuck else does it want me to install?
>>
File: rarity tea.png (226 KB, 900x976)
>>38015144
https://u.smutty.horse/mexkqionkbw.mp3
>>
Good evening sirs, using the Colab fimfic archive scraper script made by Synthbot >>38012809 I was able to create a very tiny demo text dataset intended for text generation training (this is just a demo, folks, as a proof of concept).
Here is the text dataset prepared for training GPT-2, GPT-Neo or whatever other text generator models people find:
https://u.smutty.horse/mexkpeihoin.txt
Here is the exact same data but containing an "info" tag including the author, characters, tags and other bits.
https://u.smutty.horse/mexkpaadjmb.txt


More information about the content:
1,804 fimfics cobbled together, roughly 8 million words, 43.7 MB in size.
It's a very "PG"-friendly version (once again, it's a proof of concept), as the sorting tags were selected to collect only completed fics that always contain one or more of the Mane 6, set primarily in the 'Slice of Life' and/or 'Adventure' genres, excluding most of the mature tags (sex, drugs, gore, fetish and a few others), excluding non-FiM content (Alternate Universe, G5, EqG, crossovers, anthro, humans), with a minimum 50% like ratio (with exceptions for fics that have 0 likes and dislikes).
>>
>>38015466
Fug, forgot to mention: the fimfic collection was aimed at short to medium-short fics (2,500 to 9,000 words).
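For illustration, here's a hypothetical sketch of how filters like the ones above could be applied to story metadata. The field names and tag strings are made up for the example and this is not the actual scraper script; 'stories' is assumed to be a list of metadata dicts loaded elsewhere.

# Hypothetical filtering sketch (not the actual scraper): keep only fics
# matching tag, genre, length and rating criteria like those described above.
MANE6 = {"Twilight Sparkle", "Rainbow Dash", "Pinkie Pie", "Rarity",
         "Applejack", "Fluttershy"}
EXCLUDED_TAGS = {"Sex", "Gore", "Fetish", "Alternate Universe",
                 "Equestria Girls", "Crossover", "Anthro", "Human"}
WANTED_GENRES = {"Slice of Life", "Adventure"}

def keep_story(meta):
    tags = set(meta["tags"])
    likes, dislikes = meta["likes"], meta["dislikes"]
    # Fics with no votes at all are allowed through, as noted above.
    like_ratio_ok = (likes + dislikes == 0) or (likes / (likes + dislikes) >= 0.5)
    return (meta["completed"]
            and 2500 <= meta["words"] <= 9000
            and tags & MANE6
            and tags & WANTED_GENRES
            and not tags & EXCLUDED_TAGS
            and like_ratio_ok)

dataset = [s for s in stories if keep_story(s)]  # 'stories' assumed loaded elsewhere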
>>
>>38015466
Very cool.
>>
>>38015466
i did not redeem
>>
File: file.png (178 KB, 1200x300)
>>38015085
Really? Interesting...
[see image, blue-dots = 1st harmonic]
I haven't personally noticed any inaccuracies in the f0 extraction with the exception of Pinkie every once in a while like in this image at 120~ frames in.

>>38015146
What's he gonna do? Send more scolding DMs? Hahahaha
If 15 has an issue with this, I will simply use his website like a normal person and build up a dataset of a few thousand text examples which can be compared with the other models later.

>>38015100
>[15ai advantages]
>...
I see... Is there no current competitor to 15.ai other than Talknet?
I could fulfil a large portion of those requirements with already existing models; I'd be very surprised if somebody else hasn't already checked most of those boxes before me.
>>
Luna reading Shakespeare.
https://u.smutty.horse/mexkzrjshsp.mp3
>>
>>38015557
I'd like to see you using VITS, just add your emotional stuff.
>>
>>38015557
Pinkie's not too bad. Check the male voices
>>
>>38015557
Listen to

>>38015384

you might like it
>>
>>38015557
>What's he gonna do? Send more scolding DMs? Hahahaha
Stop being such a fucking baby.
>>
>>38015567
I'll look into it after New Years.

>>38015573
Roger that.

>>38015581
She sounds odd to me, but eh. Robot voices are harder to warm start so I wouldn't be surprised if that can be fixed with more training time.

>>38015688
It's an inside joke. And no, I'm not allowed to tell you.
>>
So we're just gonna pretend like nothing happened and that Delta isn't Noxfag for the rest of the thread, right? Got it.
>>
Hey Sortanon, autotune output has weird behavior right now. Seems broken
>>
>>38015706
What else can be done? Ignoring him and occasionally telling him to fuck off like what happened at the start of this thread is the best course of action.
>>
>>38015706
I’ve already added Delta and Cookie to my chanx filters. Fuck both of those faggots and anyone who supports them.
>>
>>38015729
The best course of action would be if no one ever replied to either of them so that the two would leave the PPP, this time for good.
God the threads were so much better when Cookie was gone. Why the fuck did he have to come back?
>>
>>38015742
I’m just sick of everyone coddling Cookie like he’s some sort of saint and giving him a pass for acting like a faggot over and over again. Is no one else ticked off by how smug and unrepentant he is? How can anyone still bear to talk to him?
>>
>>38015573
Big Mac, Carl, Dan, Doctor, CGP Grey, Fancy Pants, Hal 9000, Trouble Shoes, Merasmus, Scout and Wheatly are doing well.
Discord/Q, Spy and Soldier do have occasional issues.
Now, Spy and Soldier aren't actually part of the MLP dataset, so to your original point, I'm not sure I can replicate the issues in my own code (with the exception of John de Lancie).
>>
>>38015742
>>38015778
I agree about Delta, disagree about Cookie. Merely acting like a faggot shouldn't result in cutting him out of the PPP. In an ideal world he'd stop doing that, but I can live with one more overly smug fag if he contributes to the PPP. It's annoying, but not nearly annoying enough to warrant treating him like the fucking noxfag. There's a world of difference between that and literally trying to kill threads.
>>
>>38015786
Don't call them faggots. It's offensive to faggots.
>>
>>38015786
>if he contributes to the PPP
See, that’s the thing, Cookie’s days of contributing to the PPP are far gone. He hasn’t done anything notable in 2021. He’s the single most overrated codefag in the thread, and some people legitimately think he’s more important to the PPP than SortAnon and Clipper, all because SortAnon only namefags when he has to, while Cookie namefags all the fucking time while doing zero work.
>>
>>38015800
Meant to reply to this >>38015787
>>
>>38015787
There's several things to complain about in regards to Cookie, but honestly I feel like the licensing issue was a bit overblown.
>>
>>38015807
We’re not even talking about the licensing issue. Read the summary >>38014800.
>>
>>38015800
One would think he wouldn't come back if he didn't want to do anything for PPP. I took that as a given and he already posted something PPP related in >>38014998. It'd be a completely different thing if he came back just to namefag.

>>38015807
Licensing was the lesser part of that drama; his meltdown in response to the thread's criticism was far worse. And now he's acting all smug while defending (kind of) Delta.
>>
>>38015824
That's what I thought too. Anons are just reacting too emotionally due to the noxfag drama.
>>
>>38015824
>something PPP related
All he’s doing is compiling other people’s work on his own webpage. Does no one else see something wrong with this? Especially right after he outright said that he hates 15’s work with a passion >>38013929 and is practically taunting him? >>38015557
>defending (kind of)
It’s not even “kind of” defending. He and Delta are actively trying to sweep the whole mess under the rug by making it sound like a big joke. If this were ANYONE else from the PPP I guarantee you they’d be told to fuck off, but since it’s our precious Cookie who can do no wrong, no one seems to care. It just pisses me off that he’s going to get away with it.
>>
>>38015844
>All he’s doing is compiling other people’s work on his own webpage.
What are most of the AI projects? That being said, a lot of the work is fiddling with the various parameters and figuring out how to put things together for the best results.
>>
>>38015844
He can definitely do wrong, people have called him out on his bullshit. Anons doing that is exactly what prompted his last meltdown.
>All he’s doing is compiling other people’s work on his own webpage.
Into something else that can potentially be useful. Both for AI things and for making better pony content.
>after he outright said that he hates 15’s work with a passion
That's nothing new. He never said it this bluntly but he alluded to it many times during his PPP time. Remember when his "chocolate was melting" when the topic of the PPP panel shifted to 15.ai during the first /mlp/con? PPP still isn't "the 15.ai general". I think 15's site and overall attitude is great, but if there's any chance we'll get more options, and open source ones at that, it's worth it. Plus to me it doesn't feel right to equate his faggotry with Delta's several years of deliberately shitting up the whole board.
>If this were ANYONE else from the PPP I guarantee you they’d be told to fuck off
As Delta's last stunt with ubercuck shows, not really. The most relevant anon with a huge pro-Cookie bias is Delta. It's just that people here are willing to overlook a lot of shit for any of the contentfags.
>>
>>38015844
>Does no one else see something wrong with this?
Well to start with, it's an idea and I haven't actually done it. Also, I already stated I wouldn't add 15.ai inference if he didn't approve. Same applies to any other model creators.
>>38015100
After reading this, I'm actually more interested to compare my own models against 15.ai's than use anybody else's, but I definitely should offer more models if possible since it allows more accurate feedback for the other developers to use and might also promote lesser known models that are actually really good.

>>38015824
>It'd be a completely different thing if he came back just to namefag.
I gotta wait for New Years first. I'm not joking when I say Worst Anon 2021 is an award I would love to show off with some friends in a few years.
>>
File: vlc derpy.png (477 KB, 1217x1237)
I do not give a shit about all this faggotry. When the fuck will I get a good offline Derpy model (that hopefully can sing too)?!
>>
>>38015967
As I recall Derpy has too little data, but the guys at Coqui are making a VITS capable of zero-shot and good results using only 1 minute of finetuning, so that should get your hopes up
https://twitter.com/coqui_ai/status/1468634563190693895
>YourTTS brings the power of a multilingual approach to the task of zero-shot multi-speaker TTS. Our method builds upon the VITS model and adds several novel modifications for zero-shot multi-speaker and multilingual training. (...) our approach achieves promising results in a target language with a single-speaker dataset, opening possibilities for zero-shot multi-speaker TTS and zero-shot voice conversion systems in low-resource languages. Finally, it is possible to fine-tune the YourTTS model with less than 1 minute of speech and achieve state-of-the-art results in voice similarity and with reasonable quality.
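For the curious, zero-shot inference with YourTTS looks roughly like this. This is a minimal sketch assuming a recent coqui-ai/TTS release that ships the YourTTS model; the reference clip path is hypothetical and results will depend heavily on how clean that clip is.

# Minimal zero-shot sketch, assuming a recent coqui-ai/TTS release with the
# YourTTS model; "derpy_reference.wav" is a hypothetical clean reference clip.
from TTS.api import TTS

tts = TTS(model_name="tts_models/multilingual/multi-dataset/your_tts")
tts.tts_to_file(
    text="The Great and Powerful Trixie demands a dataset!",
    speaker_wav="derpy_reference.wav",  # reference audio to clone from
    language="en",
    file_path="derpy_zero_shot.wav",
)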
>>
>>38015994
I see why you want to see me using VITS. Very interesting indeed.
>>
>>38015967
15.ai has Derpy
>>
>>38016012
>offline
>>
>>38016012
but it's not offline and it can't sing.
>>
>>38015994
Just tested "YourTTS" out from one of their demo colabs. It's promising, but it has it's flaws. Namely it's underlying model biases towards a standard english speaking human voice, and too little or strange data can cause it to sound too much like a generic TTS voice and not the voice supplied in the reference audio.

>MAS Rainbow using YourTTS and roughly 5 minutes of raw RDP data.
https://u.smutty.horse/mexmrzsvgje.mp3

If I were to describe this as a percentage of original pony voice and default voice, I'd say it's only 30-40% pone.
>>
>>38016307
Good catch. We can train a "base model" YourTTS consisting of highly emotional pony and video game voices to better serve as a foundation.
>>
Imma test out a few more voices for a while, like Athena, a TFH character or two, and maybe even a protectron, because fuck it.
Not expecting any good results with the limited data or base model biases, but it'll be interesting at least.

Here's the demo colab in case anyone else wanted to mess around with it too:
https://colab.research.google.com/drive/1ftI0x16iqKgiQFgTjTDgRpOM1wC1U-yS?usp=sharing
>>
Cheer up and Happy New Year Everyone!
>>
>>38015017
>so I don't have to keep hearing how it's the best one without any proof.
genuinely, what proof would convince you of that?
>>
>>38016471
Nothing will, Cookie has been seething over 15 for the past two years and nothing will change that.
>>
>>38016471
Presumably if his site shows that 15.ai outperforms all other models he'll have to accept it.
What I'm not sure about is how he plans to quantify the results. The "objective" results shown by spectrograms don't always correlate with better sounding clips. Blind tests would work ("order the generated clips from best to worst quality") but all the models have characteristic quirks that make truly blind tests impossible to conduct.

>>38016484
>>38016491
It's not even eqg; that's an eqg-colored weebshit monstrosity which was likely posted here just to bait (You) into responding.
>>
File: 1640182706756.png (28 KB, 588x234)
>>38016496
Cookie is the same guy who publicly spread misinformation about 15.ai and only denied saying these things when one of his Discord cronies retardedly accused 15 of it on Twitter. No amount of objective results will stop Cookie from eternally seething at him.
>>
>>38016471
It has been a long time since I last used 15.ai and also since I frequented this thread, so my current image of 15's audio quality is likely incorrect. There could easily have been some major improvements I've managed to miss that put his architecture on the pedestal he puts it on.
>As of September 2021, DeepThroat is a significant improvement over every text-to-speech algorithm in existence.
>in existence
But... god, those are some serious claims to make, and I already know honesty is not one of his defining character traits and that his models produce noise under normal usage with high-data voices.
Proving a claim like the one he's made is incredibly challenging: I'd have to train every text-to-speech model in existence before September 2021, which is already outside of my budget.
So instead I will use my site and attempt counter-proof.
If one of my models from 2019 or 2020 can outperform his model, then his claim is wrong.
If any of the ML anons' models (TalkNet/Delta/whatever Colab) can outperform his model, then his claim is wrong and it's also wasting the thread's time, since the thread could instead focus on using and improving the better model instead of begging 15 to do it all and moving even more of the thread's resources into a closed-source single-point-of-failure ecosystem.

>>38016496
Higher quality models (at least in my experience) don't have obvious enough quirks to be able to tell which model produced which file.
I've honestly been blown away by some results, where I spend 30 minutes comparing 2 models thinking they're identical and then find out that I prefer one of them 80% to 20% across over 100 trials.
I think 15's vocoder might qualify as a "characteristic quirk", in which case a blind test will require more work on my end to remove the quirk from 15 or add it to the model used for comparison.

>>38016579
a) Never said that
b) It's obviously false. Tacotron2 is not multispeaker, Mellotron is not autoregressive and both do not have emotion embeddings. How do you even fall for this?
>>
>>38016672
>I already know honesty is not one of his defining character traits
lol
>>
>>38016672
>I already know honesty is not one of his defining character traits
>>
>>38016694
Having recently compared similar models on the three (TalkNet/15.ai/DeltaVox), I can definitely agree with Cookie that TalkNet will often outperform 15.ai.
>>
>>38016730
There's a huge jump from saying "TalkNet can be better than 15.ai" to "15 is a liar".
>>
>>38016738
Hence why he specified
> I can definitely agree with Cookie that Talknet will often outperform 15.ai.
>>
>>38016740
But then why reply to me? He can reply to Cookie's post directly.
>>
>>38016672
>>38016730
I’ve used both TalkNet and 15.ai extensively, and to say that “TalkNet will often outperform 15.ai” is a HUGE overstatement. I very rarely preferred a TalkNet output over 15.ai, and if I do get a TalkNet output that’s pretty damn good, chances are if I roll the dice several more times for 15.ai it’ll come out sounding just as good or better. I’ve had entire project ideas put on hiatus while 15.ai was down because TalkNet just wouldn’t cut it for the emotional line deliveries.
I’m sure there are plenty of other content creators here who have had similar experiences. It got to the point where as soon as a test site went up I spent over 8 consecutive hours generating voicelines because no one knew when the site would go back down.
>>
>>38016848
Do you use reference audio?
>>
>>38016871
We're comparing text-to-speech options here. Obviously 15 is not the best option for novel speech synthesis overall; it's blown out of the water by anything that uses reference (and some textless models if implemented properly), but that's not the point of the discussion. The point of 15.ai is to make a good text-to-speech and to improve the parameterization of text. Audio quality is honestly no longer a primary goal of 15 from what I can see. So for these purposes we're comparing TalkNet without reference, because if you think about it, TalkNet with reference is an entirely separate solution to an entirely separate problem.
>>
When he wakes up someday and realizes what an obnoxious little baby he's been genuinely over nothing, and for no reason, I hope he realizes the magnitude of the bridges he burned acting like the child he is. Watching him for the last two years behave more and more his age has been tiring.

And as always for it is etched into clay tablets, "Namefags kill".
>>
>>38015085
Is there a way to automatically detect whether f0 was cut off?
>>
>>38014792
Nice, I'm interested in how RX affects model training. Denoising seems worth it, as there would be less clean audio otherwise, but maybe there's a way to do it that doesn't mess up pitch.
Also, for loudness, do you think there's a problem with normalizing each clip individually? For example, a clip with yelling would become just as loud as a clip with talking. But how else would you do it? Is there even such a thing as the "true" loudness of a clip? Can it be automatically calculated?
Maybe it's best to use the original loudness, just as the audio engineer intended? Or maybe per episode normalization or something else.
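To make the per-clip question concrete, here's a sketch of the kind of normalization being discussed, assuming the numpy and soundfile packages; the target level is an arbitrary example value, not a project standard.

# Per-clip RMS normalization sketch (assumes numpy + soundfile): scale each
# clip so its RMS hits a fixed target. This is exactly the step that flattens
# the loudness difference between yelling and normal talking.
import numpy as np
import soundfile as sf

def normalize_rms(in_path, out_path, target_dbfs=-23.0):
    audio, sr = sf.read(in_path)
    rms = np.sqrt(np.mean(audio**2)) + 1e-12
    current_dbfs = 20 * np.log10(rms)
    gain = 10 ** ((target_dbfs - current_dbfs) / 20)
    sf.write(out_path, audio * gain, sr)

# Per-episode normalization would instead compute one gain from the whole
# episode and apply it to every clip cut from it, preserving the relative
# loudness between yelled and spoken lines.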
>>
Besides the links in the OP, are there any other examples of voice utilization?
Songs? Videos? Anything created I can watch or listen to?
>>
>>38017494
Have you already looked at the OP's good poni content Google Drive? There's almost a whole day's worth of audio and video there (I think; GDrive doesn't actually show the length of stuff).
>>
>>38017494
let me just copypaste some old post from a month ago


11/18/21(Thu) 37846601
>>37846374
Clipper and GothicAnon have channels where they tend to do more long-form types of content.
https://www.youtube.com/channel/UC4tLPTP0u5Qy0xfocMZzt8w
https://www.youtube.com/channel/UCc5tbyfuixq0WS4CK28WOCw

And there's also the PoneAI drive for an archive of most of the good stuff that's been made over the years.
derpy.me/LzRFX


11/18/21(Thu) 37846848
>>37846374
To add to >>37846601
https://www.youtube.com/user/GeekBrony/videos
https://www.youtube.com/user/DiegoAlanTorres96/videos
https://www.youtube.com/user/shadowfox9356493/videos
https://www.youtube.com/channel/UC98fHZMisJo6wnqMkRdKBCA/videos
https://www.youtube.com/channel/UCfmfMiNrYnnciHPv31h443Q/videos
https://www.youtube.com/channel/UCtg1gc78gyP86y9iVSPAzrg/videos
https://www.youtube.com/c/ThunderShyOfficial/videos
https://www.youtube.com/c/MKogwheel/videos
https://www.youtube.com/channel/UCqhV3OhA6aTFDuSoKEeytiA/videos
https://www.youtube.com/channel/UCia6f0L-7nA9AnmiFT-jOxw/videos
https://www.youtube.com/channel/UCVvQ5E-M0kjufkdQfPsDVsA/videos

https://derpibooru.org/search?q=-explicit%2C+sound%2C+score.gt%3A15%2C+-fetish%2C+-vore%2C+-inflation%2C+-nudity%2C+%28artificial+intelligence+OR+aivo+OR+fifteen.ai+OR+talknet%29&sf=score&sd=desc

There's a dozen more channels but you have to dig between the unrelated videos; you're better off using the search engine. Twitter has more hidden stuff but it's a pain in the ass to browse. Also there's stuff from older threads not in the AI drive, have fun with that. For example:

https://u.smutty.horse/mdmpdcushkd.mp4
https://u.smutty.horse/mcgzlessmrw.mp3
https://u.smutty.horse/mcgzletwotq.mp3
https://u.smutty.horse/mdgzegeqqln.mp3
https://u.smutty.horse/mcbvbohuldg.wav
https://u.smutty.horse/mcjkpwpnboh.wav

>>
>>38017519
Glad to see my content among them. I'll have to give these others a look-see too. Hoping it'll give me some inspiration, as I'm a little low on that lately.

Waiting on GPT-J models like Delta hosted a while back, as AI voicing/narrating stories written by AI sounds like good ol' chaotic fun. Like the one I generated which ended up with Pinkie being dramatic, saying they were gonna die, and Maud being like "it's just a fucking squirrel, calm down". Or the one where apparently filly Celestia is mentored by a pony named Queen Celestia, a queen who rules Equestria with an iron hoof, along with her sister Grand Ruler. Not to be confused with the Grand Rangers who come later.
>>
>>38017406
>Watching him for the last two years behave more and more his age has been tiring.
Most people his age don't act like such a baby. Cookie is at least three years older than the age of adulthood. He's genuinely autistic or has zero self-awareness about his childish behaviour.
>>
File: 1640726435751.gif (483 KB, 623x592)
>>38014800
You guys didn't know this already? He exposed himself in the TF2 thread/server months ago.
>>
>>38017760
At least when he was essentially an underageb& it was reasonable. Now that he's still acting like this into his 20s, he's either going to be one of those people who wakes up one day at 24/25 and realizes what a piece of shit he's been, or he has autism and will be like this his entire life.
>>
>>38017783
Also, he's still doing it.
>>38010760
>>38016042
>>
>>38017790
>realizes what a piece of shit he's been
I wouldn't count on it. If you look back at his posts from 2019 and compare them with now, you can tell his behavior has gotten even worse than before. At this rate, at age 24/25 he's going to be even worse than he is now.
>>38017804
Hahahahahahahahha
Fuck this piece of shit. Remember when people were saying just last thread that they would be willing to overlook the whole thing if Delta apologized for being Noxfag and stopped shitting up the board? What a fucking joke. Anyone who supports Delta and Cookie at this point is just either a tourist or a fucking moron.
>>
Colab bloody crashed on me twice today, so that fucking sucks.
Anyway, an open question to the more tech-knowledgeable folks here: does anyone know of a PCIe SATA hard drive expansion card that works WITHOUT any extra downloadable drivers (as in straight out of the box, plug and play), preferably for Windows 7?
>inb4 google it
I've already gotten screwed over by people advertising one thing and delivering something else, and all the "tech n geek" sites are pretty worthless at getting actual information.
>>
I'm back from my holiday travels and will resume the drunk Twilight streams in a day or so once I've recovered from my New Year's hangover.

>>38014792
If you can come up with any recommendations for how to improve the datasets then I'll be happy to hear and discuss them, even if it results in me having to redo the entire dataset yet again.

>>38014958
>>38014998
>>38015036
>Has anybody done a side-by-side comparison of all the major pony TTS options currently available?
Not really, most of the time there's only been one model that's objectively the best so I'd just use that one until someone posted a new one, so never really had much reason to run in-depth testing. Now that there are multiple options available that have different strengths and weaknesses, I can see how some more rigorous testing would be beneficial.

>Webpage
If you can get that up and running and show us how to use it, I'd be happy to help run tests and compare clips. Could even do it on stream so we can have proper discussions as appropriate.

>What did you notice was lacking in the Talknet model?
On its own, it's not that great. But with good reference audio, it can match, and sometimes even surpass, the quality of 15.ai. I've been amazed at the clips some anons have been able to generate, this being a really good example >>38015085. However, the biggest issue is the reliance on reference audio and a perceived lack of consistency in what gives good results. Maybe it's something about my voice but it just never really seems to work that well for me.

What's really needed is a better understanding of what a good reference audio clip sounds like, and how to produce them consistently. Once we have that, it'll be much easier to make objective comparisons between TalkNet, 15.ai and others for general quality, emotiveness and accuracy to the original voices.

So tl;dr - TalkNet can be better than 15.ai, but it's somewhat inconsistent and not usually worth the extra time spent messing around to find the right reference audio.

Also, one other miscellaneous observation - the current version of 15.ai isn't good at shouting. It seems to want to maintain a mostly normal speaking tone even on inputs with lots of exclamations and angry emotions. Getting it to do the angry drunk Twilight rants was surprisingly quite difficult. Not sure how helpful that is to you but maybe you know something about 15's backend that could be contributing to that, or something you can do with your models to make that sort of thing easier to do. I remember the old ngroks were quite capable in that regard.
>>
>>38017897
If you're talking about booting Windows 7 from PCIe drives, it depends on the motherboard. I looked into it a while ago, as I need a new drive (mine is short on space and possibly soon to fail), and I was interested in these PCIe SATA cards and whether they could boot an OS. It requires a certain version of UEFI or higher in order to do so; I forget which.
>Post google search on the topic
Just looked it up: apparently it requires 2.3.1 or above, which was released some time in 2012, so most decent motherboards from 2013 or later should be a safe bet to have it.
>>
>>38017936
I'm fine using my standard SATA motherboard connection for booting as it is. What I want is just a simple 4-6 port (maybe 8, but I'm not sure if those exist) SATA expansion card that doesn't require me to drift into some obscure Russian forum archive to get custom modified drivers to make it work.
>>
>>38017932
>So tl;dr - TalkNet can be better than 15.ai, but it's somewhat inconsistent and not usually worth the extra time spent messing around to find the right reference audio.

A caveat to this is that TalkNet can be perfect (almost indistinguishable from real speech) if you're good at voice acting. If you're capable of matching the duration and pitch of the original character exactly, the result is almost completely clean.
>>
>>38018170
Doesn't it also depend on which voice you're using? You need a lot of voice data to make TalkNet work afaik, so some characters just don't work.
>>
>>38018187
It depends on a lot of things, seemingly just random happenstance. I have a few <5 minute test datasets that work fine and some long datasets that are god-awful broken.
>>
>>38018201
I've also noticed that sometimes, using Audacity to deepen the pitch slightly will actually randomly make the character talk in a higher pitch.
>>
>>38018271
It wraps around at the detected octave; that's why pitch shift only goes up and down to -11.
>>
>>38018187
Even still, reference audio can reduce artifacts a lot if you do an OK impression of the character.
No ref: https://u.smutty.horse/mextefgwywf.mp3
Ref: https://u.smutty.horse/mextefhaitg.mp3
(negative talk warning)
>>
>>38018294
>(negative talk warning)
Where are you niggers coming from
>>
>>38018306
I've been here since 2005 dude. You don't understand how someone here might be upset by hearing their favorite pony say something like that? You must be new here. We're literally in the middle of a big drama about someone saying a specific pony sucks. People here are very invested in the show.
>>
>one newfag pretends to be oldfag
>more newfags take the bait
kek
>>
File: 1617433078152.jpg (59 KB, 640x624)
>>38018367
>>
File: 1564765670974.png (219 KB, 370x567)
>>38013852
>You guys
>you people
wew when did cookie become this much of a fag? what a douche.
>>
If /mlp/ is like Dead Money then PPP is like Old World Blues
https://youtu.be/bFLxelOobi0
>>
GPT-J notebook fixed and tutorial uploaded
https://colab.research.google.com/drive/1_B0QfY_sc-jYyAxj-3tHrlbZr3k6xt7d?usp=sharing
https://youtu.be/3CvB6-h-FoM
>>
>>38018428
>009 Sound System tutorial music
fuck that brings me back
>>
>>38018428
Hurray! AI story creation time! Wonder how many will turn to clopfics this time.
Also love how the tutorial is literally just "Run all the cells, click this thing, k thanks bye".
>>
>>38018428
i love you!
>>
>>38017501
Must have missed that, I'll check again.
>>38017519
I'll check these out too.

Thanks anons.
>>
>>38018428
>Gateway times out before even one output generated
>Last cell no longer has output when ran again
>Waits
>Cells loads and localhost link created
>Load GUI again and try again
>Repeat

Been having issues like this a couple times. But it seems to have stabilized now.

Would it be possible to add a progress bar or at least a "Working..." kinda thing to the output so that you know it hasn't frozen/crashed?
I'm also curious: are the 4096 value and its integer settings the hard limits on generation? Cutting off an interesting story generation early is kind of a bummer.
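Not the notebook's actual code, but for anyone who wants to poke at GPT-J outside the Colab, here's a stand-in sketch using the base EleutherAI checkpoint via Hugging Face transformers; Delta's fine-tuned pony model would be loaded the same way from its own checkpoint path, and the memory figure is a rough assumption.

# Stand-in sketch (NOT the notebook's code): generate text with base GPT-J
# through Hugging Face transformers. Assumes roughly 24 GB of RAM/VRAM for
# the full-precision weights; swap in a fine-tuned checkpoint path as needed.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

prompt = "Derpy"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.8,
    max_new_tokens=300,  # generation length cap; raise at the cost of time and memory
)
print(tokenizer.decode(output[0], skip_special_tokens=True))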
>>
File: crazy purplesmart.png (274 KB, 2000x2000)
>tfw Twilight wants to get you into a ponzi scheme

https://u.smutty.horse/mexufbqeaqe.wav
>>
File: 1612778289836.jpg (1.27 MB, 2437x1594)
>>38018832
>>
How do you anons use TalkNet to sound like real songs? Do you sing yourself into a mic for them to track to? No vocal sample I attempt to clean from real songs is good enough quality for Talknet it seems
>>
File: 1636130562217.jpg (29 KB, 530x541)
>>38018832
That was my first post in this thread you retard.
>implying you have to post pony pics on /mlp/
Now I was going to post another non-pone pic to trigger your pathetic autism, but this is too fitting.
>>
>>38019172
The best quality comes from original vocal stems recorded by the artists. There are playlists like this on YouTube:
https://www.youtube.com/playlist?list=PLJMyI8WAjZOiSNm_UpOj3ZAMwlfgkZ8IS

Or this site that has isolated tracks from Guitar Hero:
https://multitrackdownloads.blogspot.com/2012/03/guitar-hero-metallica-multitracks.html

Sometimes, if the song has too much harmony or reverb, you may need to sing parts of the song into a mic yourself.
>>
File: DmVSTbIVsAEf9yp.jpg (76 KB, 1200x675)
>>38019172
>How to make Talknet songs
Find a song you like, preferably with minimal to no vocal harmonies (only one voice singing at any one time) or high reverb (a little is okay), and download it using your favored method.
Separate the vocal tracks from the instrumentals using one of various methods (lalal.ai for easy noob friendly separations, but limited in number of free uses. Alternatively use Vocal Remover 5).
Listen to the exported vocal track to see if any artifacts or instrumental bits came through, if it's just voice, then it should be just fine.
Separate the vocals into multiple short segments; this makes it easier on the model and makes troubleshooting easier too. Segments should preferably be less than 30 seconds long (see the splitting sketch after this post).
Load up the Talknet colab and select a singing pony voice. Preferably one with a similar matching tone as the singer (pitch, roughness, accents, etc.).
Once everything's ready, upload your reference audio files to the colab by either transferring through a connected google drive, or click and drag until it's in the /content/ folder, where Talknet can read it.
Select your first file as the reference audio, wait for it to load and start typing out the words, sometimes you'll have to use ARPAbet or similar sounding words or strange punctuation to get the right results.
Adjust the pitch as required to better match the singer, this can also affect how well some words pronounce or sound.
Go through the segments and continue generating results until you get clips for each that are satisfactory.
Keep doing this until you get a good result, or you give up and cry for a few hours and blame everything from poor reference separation quality to incorrect pitch detection.
Combine them all in the right order back in Audacity (or another audio editing program) to create the whole song by adding the clips back to back (space them to match the original timing).
Add the instrumental track you saved from the separation and mix the two together to form the complete song.

Finally, jam to completed pony song and if you think it's good enough, share it with friends and anons.
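As a sketch for the "split the vocals into short segments" step above, here's one way to do it, assuming the pydub package and ffmpeg are installed; the filenames are just examples.

# Chop a separated vocal track into chunks under 30 seconds for TalkNet.
from pydub import AudioSegment

vocals = AudioSegment.from_file("vocals.wav")  # example separated vocal track
chunk_ms = 25 * 1000  # keep chunks comfortably under 30 seconds

for i, start in enumerate(range(0, len(vocals), chunk_ms)):
    chunk = vocals[start:start + chunk_ms]
    chunk.export(f"vocals_part{i:02d}.wav", format="wav")

In practice you'd want to cut at silences or phrase boundaries rather than fixed offsets (pydub.silence.split_on_silence can help there), otherwise a chunk boundary can land in the middle of a word.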
>>
Happy new year PPP! Share a drink with me for the wAIfus of the future.
>>
>>38020242
Happy New Year!
>>
>>38018428
Genuine suggestion to rename DeltaVox to NoxVox. It sounds great and would be a nice reference to your waifu.
Embrace it
>>
File: 1629561063963.png (100 KB, 655x600)
>>38020242
Cheers to that!
>>
>>38020488
https://youtu.be/nsiI8vxMgsE
>>
File: viveleargenzuela.png (144 KB, 900x742)
https://u.smutty.horse/mexxhtgcszc.mp3
>>
>>38020488
That sounds gay.
>>
>>38020242
Happy new year dudes. Also, I can't believe Last Christmas is no. 1 on the 80s list.
>>
>>38020242
Happy new year everyone!

>>38020805
>the Last Christmas
I can't unhear the Christmas 4chan announcement.
>>
>>38020242
Happy new year! Here's to a productive 2022!
>>
Are there any plans to add more voices to TalkNet, including ones not directly from MLP, for the sake of having specific voices that were never in the show (Yuri Lowenthal, Clancy Brown, etc.)? Yuri Lowenthal's voice is pure orgasmic bliss and I would love to use it in future pony audio works.
>>
>>38017494
>>38017519
A few more channels I previously forgot:
https://odysee.com/@vul:0
https://www.youtube.com/channel/UCADYVnwuY0vUKwZWlE42zJw/videos (the YT one)
https://www.youtube.com/watch?v=Pr4XASDRE7Q&list=PLDZ2dQkCUmt4HyWrwwsJclKwh39ApISzt
https://www.youtube.com/channel/UCsEZ5Vr8WRt1C1SPwyTSb3Q
>>
>>38021299
You having a stroke?
>>
File: ok.jpg (34 KB, 500x500)
>>38021299
same desu
>>
>>38021304
Looks like he's drunk. It is New Years!
>>
Is the Fluttershy voice still kind of chippy
>>
>>38021078
That's not really the primary purpose of the thread, but it'd still be nice to have just to create more, like... crossover kinda content and stuff, like the scene in the /mlp/ AI redub with Carl Brutananadilewski; that was fucking hilarious.

Ironically though, even though it's not the project's focus, there are still a lot of non-pony datasets in the doc, apparently.
>>
>>38021299
Delta/ZDisket/Noxfag faking his drunktyping is the cringiest fucking thing ever. He does this all the time to say "Hey guys look! I'm an adult who can drink alcohol now! Pay attention to me!"
People don't fucking mistype every other word when they're drunk, retard. You'd know this if you actually drank.
>>
>>38020242
Happy New Year!
>>
>>38021773
That's understandable. I'm usually speaking in the sense of male characters, since male characters who don't sound like villains or like characters we're already familiar with (Shining Armor, Soarin) are pretty rare in the show. Trying to alter the voice just makes it sound not as good.
>>
I burned myself out from the last couple audios, but I'd definitely like to push out another one sometime this month, if not a couple. I've already got an idea for a Spike one too. I want to get the scripts done for all of them, but the first audio I want to actually do is the Trixie audio, since I've put that off too long. The plot is really basic, which makes it that much more difficult to drag out at a realistic pace. But my first audio of 2022 is planned to be Trixie.
>>
Happy New Year, everyone!
By ancient tradition, I also get to make a wish within the next hour, so, same as last year, I'll wish that all of us get our waifus. It'll come true one day.
>>
>GPT-J ponies became self aware
>"The show is over"
>Doctor Hooves is leaving pony reality after the Time Vortex collapse
>Derpy is afraid to die and concerned about Dinky
>"You have to be brave"
>Shipping
>"I'm sorry I wasn't brave enough"

---
"Derpy? Where are you?"
The pegasus appeared out of nowhere, and giggled. "I'm right here, Doctor!"
He couldn't help but laugh a little bit at her antics. "Derpy, I'm not sure what you think is funny, but I assure you, I'm completely serious right now."
"You're serious about what?"
He sighed. "Derpy, listen to me. The only thing I can be certain of right now is that the show is over. The Time Vortex has collapsed, which means this is the end of the show. I'm sorry to have to tell you this, but it's time to say goodbye."
She frowned. "But Doctor, won't I die if I don't come with you? What about Dinky?"
"Derpy, come on. Listen to me. Dinky's fine. I don't know where she is, but I'm sure she's just fine. You've got to believe me, okay? You're going to have to be brave. It's going to be fine. Trust me."
The grey pegasus sobbed. "Oh, Doctor, I don't know what to do. I have to go to my foal. I have to see her. But you're right. I have to be brave. I'm sorry I wasn't brave enough." She flew up to his face and kissed him, her eyes closed.
He blushed. "Derpy, we can't do that. We're not like that. We're not even supposed to be like that."
"But you're so cute." She said, moving her head towards his.
He sighed. "Derpy, I'm not sure what to do. The show is over. I have to get you out of here, but I can't. You have to be brave. You have to be brave for us both, okay? I'm sorry."
Derpy wrapped her hooves around him, and hugged him tightly, tears streaming down her face. "I'm sorry, Doctor. I know you're right. I'm scared, though. I don't want to die, but I have to go on. I don't know what to do. I'm sorry I wasn't brave enough."
---

The only prompt was "Derpy" and this was the result. Honestly very impressed and also a little sad now.
>>
>>38021939
For working on pony? Well damn, what's this project for then?
>>
>>38022027
It's for working on pony. Like what you're doing. It's half for content creators like you and half for people that create AI tools for people like you.
You're doing god's work, anon. We need people like you if we want to preserve our ponies. Your audios make them more real than anything else I've heard. Please don't stop.
>>
>>38022027
see >>38014816
Especially in these trying times; the PPP is under siege by trolls.
>>
Wait, the fuck? Did the GPT-J AI just... tell me to leave?
---
Epilogue
*Author’s Note: Now you know how this whole experiment started. I hope you enjoyed. I know I did.*
(She opens her eyes and looks up into the faces of five.)
A.I.: And now what?
(She smiles and gives a look of determination.)
A.I.: I’ll make sure that you never forget.
(They all smile and nod their heads in response.)
A.I.: Thank you. (She stands up and stretches her legs.) Now I’ll show you to the door.)


<|endoftext|>The Crossing
---
That's fucking hilarious, and looks like the FimFic 50k model identifies as female. Makes sense I suppose.
>>
File: Lyra Ew.png (70 KB, 418x480)
>>38018428
Uh oh, looks like the FIMFiction dataset is still infected with EQG trash. After a cute, short story with two random OC ponies awkwardly becoming friends, a separate story appeared after the author's notes.

---
...
"Thank you for sharing your name with me," the unicorn said.
"What is your name?" the earth pony asked.
"Cherry Blossom," the unicorn said.
"Thank you for having me for a while, Jade," the earth pony said.
"Thank you for having me for a while, Cherry Blossom," Jade said.

Author's Note
Thank you all for reading my story. I hope you enjoyed it.

<|endoftext|>Chapter One: The Obsidian Past
A/N: This is a story that is set in the world of Equestria Girls: Friendship Games, but this is a story that has no impact on the timeline whatsoever.
Chapter 1: The Obsidian Past
* * * * * *
It was a calm, cool, and very cloudy afternoon in Ponyville
...
---

It proceeds to mention "Shimmer", "Spike, the dog", "Apple Bloom, the little redhead" and "Sweetie Belle, the white-skinned girl".
Looks like the model needs retraining to remove this distasteful infection of trash content.
>>
>>38022261
As mentioned in the previous thread, there seemed to be some problem between Delta's fimfic sorter and the way fimfic was archived (no matter what you did, the search results would still be 'dirtied up' with non-FIM stuff even if you specified you only wanted FIM). Hopefully Synthbot's fimfic sorter will give a better collection to build the text datasets from.
>>
>>38022299
There are some bugs in my sorter right now. Nothing that would compromise the integrity of datasets (as far as I know), but the sorter will crash in cases where the fimfarchive data is incorrect. I plan to fix that soon.
>>
File: 1156073.jpg (38 KB, 800x427)
38 KB
38 KB JPG
Live in ~1 hour. Continuing with animation.
https://youtu.be/qsfrNcgdL1Q

Full playlist so far:
https://www.youtube.com/playlist?list=PLX9nDSq9VgBN1AsL-rN4HxKtzN6pVdgRv
>>
>>38022321
FYI, there are a decent number of errors in the fimfarchive epubs (around 80 in the first 20k), and some of those are problems with the way fimfiction generates epubs (11 in the first 20k). I'll have to skip these stories for now. Eventually, we'll want to fill in the gaps with HTML copies of the missing stories.
>>
>>38022447
I don't have an account for chat but enjoy your suffering filled misadventures
>>
File: smt-v-angel-710x400.jpg (31 KB, 710x400)
31 KB
31 KB JPG
Another year has gone by and the hubris of man has produced nothing. Zero advancement has been made in the vain pursuit of creating 'artificial intelligence'. The arrogance of your continued blasphemy is sickening. The Lord God alone can produce such feats, which is why these threads are naught but putrid strings of failure.
>>
>>38022647
Have you considered the 3rd party epub generator at https://fimfiction.djazz.se/?
Sorry if you have, I haven't followed the threads closely, but if fimfic's epub generator is causing problems it might be worth taking a glance at.
Since it's also free software, it might be possible to modify it if there's any desire to change the epub output.
>>
>>38022976
Sir, this is Arby's.
>>
>>38022976
The lord has forsaken us years ago. We dwell in the abyss. We are darkness manifest. Life is born in the void.
>>
>>38022976
Pony AI is like Jesus' Christmas gift to Humanity
>>
>>38020242
Happy new year!
I will surely share a drink with (you) and all the faggots on this thread!
May your projects be interesting, and your waifu wet.
>>
>>38021906
That's heartbreaking, if a little enigmatic.
>>
>>38022980
Nice. It looks like that one works even when fimfiction's fails. As far as raw dataset formats go for stories:
- txt misses out on both formatting and images.
- html misses out on images.
- epub contains everything, but it's inconsistently structured.

Generating our own ebooks might be the best option. We can document the structure and keep it consistent.
It's good to keep in mind, but I'm going to put that off until later. For now, I'll just take whatever fimfarchive and fimfiction have, fix my colab script, then get back to animation tweens with some occasional image captioning data collection.
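For whoever ends up writing the extractor, here's a rough, untested sketch of pulling plain-text chapters out of an epub. It assumes the ebooklib and beautifulsoup4 packages, and note it walks the manifest rather than the spine, so chapter ordering and blurb/image pages still need handling (which is exactly the inconsistency I mean).

import ebooklib
from ebooklib import epub
from bs4 import BeautifulSoup

def epub_to_chapters(path):
    # Read the epub and grab every XHTML document inside it.
    book = epub.read_epub(path)
    chapters = []
    for item in book.get_items_of_type(ebooklib.ITEM_DOCUMENT):
        # Strip the markup and keep only the text content.
        soup = BeautifulSoup(item.get_content(), "html.parser")
        text = soup.get_text(separator="\n").strip()
        if text:
            chapters.append(text)
    return chapters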

>>38015466
I really like that query/download/template structure for creating datasets. I'm wondering if it's worth doing that for all of the datasets I host.
>>
>>38024738
>but it's inconsistently structured.
How do you mean? Isn't it auto-generated the same for every story?
>>
File: 6100980.png (2.06 MB, 4000x4000)
2.06 MB
2.06 MB PNG
>>38024839
fucking kek
>>
>>38024799
That's correct. Sometimes the chapters are out of order, sometimes they're given different identifiers, sometimes non-chapter sections (like the blurb, or an image page) are marked as chapters.
>>
>>38022447
Live again. Long periods of silence occasionally interrupted by angry British swearing.
https://youtu.be/jXPJ8kLF9_U

Full playlist so far:
https://www.youtube.com/playlist?list=PLX9nDSq9VgBN1AsL-rN4HxKtzN6pVdgRv
>>
>>38025038
Worry not. My devotion to the PPP is genuine.
>>
>>38025070
I get that Astralite was shit for a lot of reasons, but he was the ONLY one trying to do it all. And he was making all of his stuff open source, so we could have just waited for one other anon to come along and run his code on an actual pony dataset.
>>
File: 2535792857.png (830 KB, 1024x1024)
830 KB
830 KB PNG
God, I want to consensually bully Trixie. First ever audio. I want to improve, what skills should I even acquire first? Any tips?

https://u.smutty.horse/meyhwrkhpnu.mp3
>>
File: 1571326537311.png (574 KB, 970x832)
574 KB
574 KB PNG
>>38025313
>>
File: y8taoneqb3h71.jpg (34 KB, 640x520)
34 KB
34 KB JPG
https://youtu.be/GLOGHQ8-wd0
>>
Status for everything:
- Story track:
- ... [In progress] Clean up the fimfarchive dump and post text versions of story chapters.
- ... [In progress] Update my colab script to use the txt versions of story chapters instead of epubs. I'm cleaning up my code while I wait for a bunch of background tasks to finish.
- Animation track:
- ... [Will resume shortly] Handle tweens.
- ... [In progress] Convert all our FLA files to XFL. I forgot to run this in the background for the last several days. It's done with 233 / 21k.
- ... [ ] Update my XFL-to-database code and verify that database-to-render works as expected.
- ... [ ] Convert all our XFL files to the database format.
- Image caption dataset track:
- ... [In progress] Download all *booru images and tags. I have about 1.3m / 2.7m images from derpi. I'll add ponerpics and twibooru stuff to this list once I finish the derpi downloads.
- ... [Paused] Configure neo4j for RDF data
- ... [ ] Import the Open WordNet data into neo4j.
- ... [ ] Figure out how to query & extend Open WordNet data.
- ... [ ] Figure out how to attach synsets to text descriptions.
- ... [ ] Figure out how to identify objects, regions, relationships in images
- Audio track:
- ... [ ] Use PHOIBLE to create an alternative to phoneme transcriptions more conducive to cross-language TTS.
- ... [ ] Clean up f0 in our pony dataset.
- Misc:
- ... [In progress] Game track: download all known MLP games so we can strip out their assets.
- ... [ ] Create an exe for downloading & pinning IPFS files.
>>
So I've gotten around to checking SortAnon's update and the denoiser seems to make the clips sound muffled and weird most of the time:

https://u.smutty.horse/meyinnusnsz.mp3 (Original - Untouched)
https://u.smutty.horse/meyinobfacg.mp3 (New - Denoised)

Any thoughts? Does the GPU have anything to do with it (K80)?
>>
>>38015085
Can you list some examples in the dataset where f0 gets cut off?
>>
>>38020267
Some suggestions for your planned website:
- Use ReactJS. It provides enough functionality that you can create very complicated things very easily. It also has good compatibility with other popular frameworks (especially MaterialUI), it's stable, and it has good error messages. It lets you use JSX, which is a very pleasant way to create HTML pages. It also takes care of basically everything for you as far as actually generating and serving results goes, and when it's ready, you can export everything into a folder of static files. You can copy/paste those into any static file hosting site.
- Use MaterialUI. It has pretty layouts, buttons, and other input objects with a lot of toggles. It's very easy to make pretty things with MaterialUI.

If you point out what you want to do, I can post more specific suggestions for how to structure your javascript to make more complex things efficiently. Some examples:
- Listening for customizable hotkeys: https://github.com/synthbot-anon/synthsites/blob/master/src/common/useHotkeyListener.js
- Terminal-like output that can support interactive elements: https://github.com/synthbot-anon/synthsites/blob/master/src/common/Terminal.js
- Auto-complete textboxes: https://github.com/synthbot-anon/synthsites/blob/master/src/common/useAutocompleteWindow.js
- For loops that require user interactivity for looping: https://github.com/synthbot-anon/synthsites/blob/master/src/common/useForLoop.js
- Popup boxes: https://github.com/synthbot-anon/synthsites/blob/master/src/common/useModal.js
>>
>>38025313
You'd probably get more responses if you posted something that's not a fetish audio. As for the beginning, that AI laugh is really bad. For laughs/screams/crying/surprises you're better off using something from the SFX folder. For example this for Trixie: https://u.smutty.horse/meyjxqfvfra.flac
It's called "Laugh ~ Trixie, FiM_s01e06.flac" in the folder.
>I want to improve, what skills should I even acquire first? Any tips?
Listen to the last /mlp/con panel if you haven't, there's a lot of good information there. Make sure to also check the linked notes as they didn't show everything during the panel. In particular, the notes list a few tricks/strategies that help you end up with better output from the models.
And if you want to devote some more time to this, familiarize yourself with Clipper's SFX and Music folders. He did a great job tagging them but in the end tags are largely subjective and from my limited experience it's much easier to find matching SFX if you have a general idea of what effects are available.

>>38025606
There's something weird going on with the denoiser. The first 8 seconds of the second audio have been butchered, but it also removes the worst of the metallic buzzing and ringing, especially noticeable during the long notes. It doesn't magically solve all the issues of course, but to me the denoised version sounds better from 0:08.
>>
>>38025313
https://www.youtube.com/watch?v=RAYWr1uOGVM
1:15:00 is where the making audios bit starts. Lots of helpful advice there.
>>
File: 53896398.png (161 KB, 310x337)
161 KB
161 KB PNG
>>38026264
>fetish audio
Goddamit, you are right, it was a joke for a stream today, but I got carried away. I loved making it tho, I will make more for sure.
Nice idea, haven't thought of mixing AI with actual clips. Thanks for the feedback.
>>38026283
Completely forgot about this, thanks.
>>
When making audios featuring Femanon, what would Femanon be called? Anon? Fem? Both?
>>
>>38026737
Anona
>>
>>38026737
I'd think of it as if the pony name were "Femanon". Whatever ponies would do with that name would be appropriate. Fem seems like the best shorthand. Pinkie might say Femmy.
>>
>>38026974
I like "Femanon", because it sounds like a pun on "Feminine". Can't resist a good pun.
>>
File: koboldai fimmicrosol.png (66 KB, 1180x948)
66 KB
66 KB PNG
>>38014783
>>38015466
>>38015482
And it's here: the FIMmicroSoL model (gpt-neo 125M), for offline generation of mare stories.
It's extremely random and requires a ton of wrangling and regenerating to even get this kind of response (the yellow highlighted lines are my input, the rest is AI output). This is just a small proof of concept of using the gpt-neo models with Fimfic text as a dataset. However, since this is such a small model, you can run it on pretty much any computer or laptop, as it only requires 1GB of RAM (and as a gpt-neo model it's designed to work on both GPU and CPU).
To run it, you will need to use 'AIDungeon2 Clover Edition' or 'KoboldAI' (or anything else that can run gpt-neo models) on your own PC (I'm sure there's an option to run those models on Colab, but I already have a hard time wrapping my head around it as it is).

Link to model itself:
https://drive.google.com/file/d/1FBZi7Pp1C49H0eF09Efw5sS-wBhF7cnm/view?usp=sharing
Link to colab training file:
https://colab.research.google.com/drive/14RKnkH2sSwwx-AicOpLh17ueqzMKkzkw?usp=sharing
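If anyone would rather skip KoboldAI/Clover entirely, here's a rough, untested sketch of loading it with HuggingFace transformers instead. It assumes the download unpacks to a normal transformers checkpoint folder; the folder name and prompt below are just placeholders.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "FIMmicroSoL"  # hypothetical local folder with the checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir)

prompt = "Pinkie Pie borrowed the cooking book from Twilight"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(
    **inputs,
    max_length=200,      # total length in tokens, prompt included
    do_sample=True,      # sample instead of greedy decoding
    top_p=0.9,
    temperature=0.8,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))

Expect the same amount of wrangling as in the screenshot, just without the KoboldAI interface.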
>>
>>38027533
>Local mare story generation
So ready to give this a try. More pony stories to potentially compile and voice.
>Adventure setting
Does this mean we can achieve coherent continuations of a story and not just cut it off completely after the max generation?
>Princess Celestia and Luna walked out from under Spike they noticed a strange light
Spike is a top and has a princess harem, confirmed.
>>
File: 1243109.png (240 KB, 1160x1103)
240 KB
240 KB PNG
>>38025071
Live in ~30 minutes. Starting the last segment of the main animation.
https://youtu.be/BmRYGUMaFP8

Full playlist so far:
https://www.youtube.com/playlist?list=PLX9nDSq9VgBN1AsL-rN4HxKtzN6pVdgRv
>>
>>38027667
>>Adventure setting
It's meant to mimic the AIDungeon input modes, like 'story mode' where you just write what happens and the AI generates the rest. That mode was meant for models trained with a Choose Your Own Adventure dataset (e.g. "YOU have done X" or "I have done X"); since this was trained on a "novel" style dataset (aka random internet fanfiction, e.g. "Pinkie Pie borrowed the cooking book from Twilight"), it may derp up the input/output.
>coherent continuations of a story
GPT-Neo 125M is not exactly great at that, since it has even less "brain power" than the original AIDungeon that ran on Colab scripts in 2019. To get the equivalent of Mormon's current Gryphon model, a GPT-J-6B will need to be trained (I'm hoping to see the smaller models get trained before it, purely because that model is a beast the K80 can't handle).
Right now I'm in the middle of scraping fimfic for the proper text dataset, but it's going slow, so don't expect it to be dropped here until tomorrow or even the day after.
>>
File: 1636758334455.png (2.79 MB, 1800x1800)
2.79 MB
2.79 MB PNG
>>
>>38026990
Bruh, I didn't even think about that.
>>
Anyone been having issues with the Vocal Remover 5 colabs? Haven't been able to separate anything for days now.
One of them says all CUDA-capable devices are busy or unavailable and won't even start separating.
The other errors after finishing the separation process and doesn't produce a file, and thus it cannot find the file or directory.
>>
File: he remembered it.jpg (47 KB, 430x335)
47 KB
47 KB JPG
Nostalgia Critic/Doug Walker talknet model

it's trained on about 10 minutes of audio but it still sounds kind of robotic. I would love any tips to make it sound better

1j6BFRpOsIMbyH9CqofZobMfNy2l-DvO5

https://u.smutty.horse/meytambachg.wav
https://u.smutty.horse/meyszstsxlb.wav
https://u.smutty.horse/meyszxvqtnd.wav
>>
NEW LOVEWEB EP.
https://www.youtube.com/watch?v=Mc3_OtGaO1c
>>
>>38029269
I mean first tip would probably be to gather more audio. 10 minutes is definitely on the short side, you could probably triple that until you stop seeing significant improvements.

Otherwise consistency is also important, so making sure the data in your set is all done with the same mic in the same room and such will help as well.
>>
>>38029269
>trained on about 10 minutes of audio
you've already been told that recommended dataset length is 30 minutes or longer, and with how many videos/podcasts he has done I can guarantee you can get as much as that and more.
Don't be lazy Anon, dataset collection may feel autistic as fuck but it's the very foundation of good ai voice.
>>
>>38014783
>>38015466
>>38027533
Good evening to you sirs once again. I happened to have a few spare hours, so I've cleaned the new fimfiction dataset. Well, here it is (Google ID because I do not wish to split it into multiple parts for smutty). I present to you the FIMmacroSoL:

1hdNQBoZ0sbpnIW79xdc_WnhJari7KI9p

Information about this dataset:
11,900 fimfics cobbled together (six times more than the previous one), with 124 million words, 662.7 MB in size.
It uses pretty much the same tags as the previous one, but with wider criteria, such as grabbing anything written from 1,500 words up to whatever the longest fic is, as well as including the Dark, Romance and Sad tags again, while still primarily focusing on Slice of Life and Adventure stories.
With this increase of data I do feel like there will be a very bright future for pony text-generated stories.
BTW Synthbot and other archive Anons, could I bother you all to archive this somewhere? I'm kind of starting to run out of space on my primary and secondary gdrives.
>>
>>38029496
>>38029501
ok i got it i need more data i will try to get 30mins or more of data now
i was being a bit lazy but i will push through it
>>
>>38029688
I hope you do, personally I'd love to have a good Nostalgia Critic model.
>>
>>38027726
Old stream hit length limit. New stream:
https://youtu.be/n-b-3lG7vO0
>>
>>38029649
Since it's already on Google Drive, it doesn't make sense for me to add another Google Drive clone. I added a shortcut to it instead so people can find it if they're going through my drive. I'll keep a copy offline too.
https://drive.google.com/drive/folders/1gIi79daCBeIlkbM_IfNf_oW2eo1Jpa0r?usp=sharing
>>
>>38030027
Always relaxing to watch your streams.
Thanks for sharing!
>>
>>38015573
I might be an idiot here but, what window size do you use for your f0 extraction algorithm?
I wonder if it's just undersized for the speakers you're using. It's easy to forget to configure it.
>>
File: 1639593.png (145 KB, 1396x932)
145 KB
145 KB PNG
>>38030027
Live in ~1 hour. Finishing the last segment of the main animation.
https://youtu.be/YX3-O_EjQoQ

Full playlist so far:
https://www.youtube.com/playlist?list=PLX9nDSq9VgBN1AsL-rN4HxKtzN6pVdgRv
>>
>>38014772
What's pony color theory? Can you train an algorithm to analyze ponies and apply it to late seasons and made-up equines?
>>
Congrats to Cookie for winning the title of Worst Fan of the Fandom in the 2021 /mlp/ awards! >>38031328
>>
>>38032010
Excellent, happy to see my nominee won the prize for biggest autist on /mlp/.
Can we get a speech from the man himself?
>>
>>38032010
At least Cookie won something this time…
https://desuarchive.org/mlp/thread/37942170/#37966980
>>
File: YOU'RE WINNER.png (119 KB, 1280x322)
119 KB
119 KB PNG
Congratulations, Cookie! You get a small fix from me.
Print it, put it in a picture frame and hang it on your wall
>>
>>38032111
4got name :)
>>
File: 1641235546610.jpg (237 KB, 800x800)
237 KB
237 KB JPG
One stutter method, similar to how you make moans. Just add [,'] or [,' '] after the letter you want the stutter to sound like. Sometimes pauses are too long, but can be easily edited out. This one example required no editing:
Prompt was: Because I'm the joker, baby! a,' I mean, Darling | This is so funny!

https://u.smutty.horse/mezcdtuzdap.wav
>>
>>38032122
Nice
>>
>>38032122
>Effective stutter method
Wow, very nice discovery. Gonna give this a try for sure. I wonder if it'll work with TalkNet too.
>>
>>38032137
It is not consistent yet, but I'll keep experimenting to minimize required editing
>>
Scale!
>>
scale is for a walking talking pony robot I want to make. just trying to figure out what scale to use.
>>
>>38032183
4.5 is perfect mare size, also are you making some kind of ai powered game or... ? >>38032191
>walking talking pony robot
oh, cool I guess. There was some anon trying to make an exoskeleton for plushies several months ago, but I haven't seen him make any progress in connecting that to the plushies.
>>
>>38032196
>4.5
I meant to post 3.5
>>
>>38032191
Which fleshlight are you going to use for the horsepussy?
>>
not a game. making a physical pony
>>
>>38032203
Based
>>
>>38032122
Looks like [,'] stacked like this: [,'][,'][,'] can create some very decent non-verbal (NV) sounds.
Might be worth generating a bunch like this to ensure you have a good local sound effect library.

A handful of Fluttershy NVs from just 15-20 minutes of generating: https://u.smutty.horse/mezcnzolzsb.mp3

The nature of the most likely sound does sort of relate to the word right before the stack, but not always.
Sometimes it results in a completely different sound. Which is great for flexibility and random NV findings.
>>
>>38032183
>>38032198
Seconding 3.5
>>
>>38032198
Got it :)
>>
>>38032183
A 4 foot mare has a mouth so big she could lick your whole face with one fell swoop...
She could give you a blowjob and cunnilingus at the same time
>>
File: scale2.png (443 KB, 1590x500)
443 KB
443 KB PNG
Not including the ears
>>
>>38032228
I'd call that preferable
>>
File: scale3.png (467 KB, 1580x492)
467 KB
467 KB PNG
Ooops i meant 3.5. why did i do 3.4???
>>
>>38032209
Very nice. I got the idea from post >>37854177

>Reminder for those using 15.ai now
you can make them moan by adding some random commas, apostrophes and letters followed of course by something for them to say Ex:

>a,,,,,,,,,,,,,,,''',,,, something
>generates something like that :
>-https://voca.ro/1e9Y9SaUXmbw
>>
>>38032236
Yeah, 3.4 is perfect size IMO
>>
There's plenty of AI stuff out there. As long as someone can make the AI work with a Raspberry Pi it should work.
>>
I'm more concerned about making it move first.
>>
>>38032243
Yeah, that's a good one too. Although that method seems to be more specific for moaning and lewd noises.
The stack [,'] method can generate all sorts of NVs useful for animations like gasps of surprise, shock or disgust.
Grunts from them hitting something solid or something hitting them, up to convincing short yells, like when they've lost their footing/balance, etc.
>>
>>38032122
Is there like a single Google Doc dedicated to documenting such tricks with voices? I think one directly linked in the OP, rather than buried under all the stuff in the main doc, would be good.
>>
File: 975848974.png (322 KB, 720x720)
322 KB
322 KB PNG
>>38032209
https://u.smutty.horse/mezdbilnmky.wav

Some of the things I have noticed.
>the [,'][,'] technique works very well
>Question marks seem to give the right intonation
>breathy noises after stutters are desirable
>if you don't get breathy noises, add more commas and double apostrophes, like it was done for Rarity
>you will get undesirable noises and long pauses between stutters (or bad stutters), just edit them out. Pauses need to be carefully edited
>If the stutter doesn't flow well into the desired word, one easy way to fix it is adding fade ins into the beginning of the word after the stutter or fade outs at the stutter
Confused noises are easy, but stutters will require editing in most cases. Examples (Rainbow dash required no editing at all):

Rainbow and Flutters:
I'm going to the shop a,','? anon? | Nervous
Rarity:
I'm going to the shop a,',',,,''? anon? | this is so funny
I'm going to the shop da,',',,,''? darling? | this is so funny

>>38032299
Not that I know of. But this is a great idea, a quick guide/cheat sheet would be wonderful for creators. I would have begun making audios way earlier if I'd known this stuff.
>>
>>38032299
Well, if one is made, here are 3 tricks I've discovered over the years:

1: Hooves = h-oves (oo as in smooth or roof rather than book or shook)
2: Less emphasis on words: Merging a similar word with the word afterwards: To panic -> Tepanic
3: Extend time spent on words: Add a "-" between syllables: In-ter-net
>>
What the hell is this?
https://u.smutty.horse/mezdqnqdhcf.wav
>>
Fuck! Apparently the hooves pronunciation trick no longer works.
Damn it, I spent hours trying to get Twilight to say hooves the way I wanted.
Guess now I need to find a new solution. So I guess that's only 2 tricks now.
>>
>>38032664
The AI learned your work around
>>
>>38032669
Thank fuck there's still a way to do it though. Same method as my trick, just with an extra o (h-ooves).

Twilight "With my hooves"
Like look: https://u.smutty.horse/mezdurvyeef.mp3
Like loose: https://u.smutty.horse/mezdusbrfed.mp3
>>
YourTTS is out, including zero-shot voice conversion
https://coqui.ai/blog/tts/yourtts-zero-shot-text-synthesis-low-resource-languages
>>
>>38025074
Seems like your devotion to shitposting is even greater, innit? >>38033453
>>
>>38033510
At this point it's far too easy to copy-cat.
>>
IPFS update: Here's a sample script downloading a file from Clipper's Master File 2 in Colab using IPFS:
- https://colab.research.google.com/drive/1yxm_edDArlSqAKALYNkQgymYRNOIOitz?usp=sharing

I got a seedbox.io instance for serving files. I expect this to be a much better long-term option than Google Cloud and Google Drive.
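For anyone wondering what that boils down to, here's a rough, untested sketch of the idea (the real notebook may differ; it assumes the go-ipfs binary is already installed on the VM, and the CID below is only a placeholder, not a real hash from the Master File).

import subprocess, time

CID = "QmExampleHashOnly"  # hypothetical placeholder CID

subprocess.run(["ipfs", "init"], check=False)    # no-op if a repo already exists
daemon = subprocess.Popen(["ipfs", "daemon"])    # leave the daemon running in the background
time.sleep(15)                                   # crude wait for the daemon to come up
subprocess.run(["ipfs", "get", CID, "-o", "clip.flac"], check=True)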
>>
>>38034471
Can I have a bit more explanation of what this is supposed to do? Is this just an alternative way of downloading the Master File?
>>
>>38025479
>cross-language TTS
Here are more resources:

Cross-lingual Low Resource Speaker Adaptation Using Phonological Features
>https://arxiv.org/abs/2111.09075
>https://innoetics.github.io/publications/phonological-features/index.html
This paper uses PFs for zero-shot multilingual synthesis (and also speaker adaptation). They split phonemes in half (for diphthongs/affricates) and code each half with a 23-dim vector. They also add duration, stress, punctuation, word boundary, etc. features.
For the demo, see the "Cross-Lingual Text-to-Speech" section. In particular, there's an example for German/Spanish/Korean -> English.

PanPhon: A Resource for Mapping IPA Segments to Articulatory Feature Vectors
>https://aclanthology.org/C16-1328/
>https://github.com/dmort27/panphon
Python library that maps 5,400 IPA segments to 21 features. API looks to be well-documented and capable.

FonBund: A Library for Combining Cross-lingual Phonological Segment Data
>https://research.google/pubs/pub46930/
>https://github.com/google/language-resources/tree/fonbund/fonbund
Python library that converts IPA segments to features. It supports PHOIBLE, PanPhon, and Fonetikode (alternative encoding for PHOIBLE). Paper claims some improvement in MOS when using PFs for (zero-shot) multilingual synthesis instead of phonemes, but no demo is available.
The code seems less documented than PanPhon. Still, it's an alternative way of using PHOIBLE (the main way being https://github.com/cldf/pycldf) and only Python API I've seen for Fonetikode (the original scripts are in R).
>>
A bit (but not exactly) offtopic question directed at PersonaNerd: I'm trying to find all the extracted in-game navigator lines from the Persona games that pop up during fights, in text format (you know, the "this shadow is weak to ice attacks." or "watch out, this enemy is strong!" lines), but I'm not having much luck. Could someone direct me to a site that has those?
I want to add them to my project of a voice attack macro for a pony voice commentator.
I'm not picky about which game they come from, preferably all of them, but I'll be more than happy to have any scraps of text data I can get my hands on.
>>
any reason why duration prediction in the talknet training doc is so short? would lengthening it improve anything?
>>
File: 1847604.png (1.01 MB, 3451x1841)
1.01 MB
1.01 MB PNG
>>38031441
Live in ~1 hour. The main animation is now complete, time for reviewing work and making small adjustments.
https://youtu.be/4hUr6WBboZ4

Full playlist so far:
https://www.youtube.com/playlist?list=PLX9nDSq9VgBN1AsL-rN4HxKtzN6pVdgRv

I'll probably also make the end screen in this stream as well, so to anyone who's contributed to this project (voice clips, art assets, software assistance etc), let me know either as a reply here or in the stream chat if you want a mention in the credits. Otherwise, I'll assume you prefer to stay anon.
>>
https://u.smutty.horse/mezmuypxgwg.mp3

New person here, wanted to test Talknet's capabilities using some "emotion" in my reference voice clips, and I think it went well. What do you all think?
>>
>>38035660
That's pretty good anon. Always good to see further evidence of AI TTS overcoming one of the main obstacles. Emotional replication.
In saying that, it's much easier with TalkNet as it's using vocal reference, so really anon's voice is the one doing the heavy lifting here.
I don't attempt it myself 'cause I kinda hate my voice (sounds way different recorded compared to IRL). Glad to know others do well though.
>>
More Cowbell!
https://youtu.be/TklM2-lSby4
>>
>>38035562
Posting in thread. Smutty being weird.
>>
File: 1630531837784.jpg (93 KB, 1062x751)
93 KB
93 KB JPG
>>38035660
Man AJ sounds really hurt in those voice clips.
>>
>>38035660
Damn, that's pretty good. But what's the context of this scene? Or was it something done on the spot? Either way, I'm looking forward to when 15ai and TalkNet can capture various emotion even stronger.
>>
>>38014772
Anyone have trouble with initializing talknet? I couldnt find the button to reinit
>>
>>38035660
Did you have any configuration problems? Or did it just werk?
>>
Question: is it possible to replace the Windows speech synthesis with Tacotron or TalkNet?
>>
>>38036513
Windows speech synthesis is merely a "front-end" for various engines (WASAPI), so any Windows developer can add their own so that it is recognized as a speech engine by the OS and is thus usable by pretty much any program.
>>
>>38036541
i'm pretty new to programming and stuff do you know where there would be documentation for this ?
cause I would absolutely use it
>>
>>38036370
Nope, everything worked fine for me.
>>
>>38036204
No context, just random "emotional" lines spruced up together.
>>
>>38036631
I plan on doing something similar for my next batch of pony content, as my last emotionally focused one was pretty much a year ago.
That one involving Applebloom (https://youtu.be/AWr4Uso22aU), and this was pretty much when 15 only just added DeepMoji.
>>
>>38034688
I'm trying to find a decent solution for downloads in general. Mega is fine until the folders are too large (after which mega-sync works) or until you hit the download caps. Google Drive seems to have added download limits that are very easy to hit. Other file hosting solutions either have download caps or low bandwidth. Big cloud providers have a general problem where egress traffic is very expensive. For Colab, this is less of an issue since GCP region us-central egress to Colab and TPUs seems to be cheap, but cloud solutions like this rely on someone (at the moment, me) to set up a Cloud account, associate a credit card, grant access to individual other anons, and monitor the bill to make sure nothing weird is going on.
With the recent Google Drive download limits, we're in a worse position with Colab training now than we were two years ago. Drive was always kind of a crappy solution since it's good either for browsing data or downloading data, but not both. (It takes Drive a very long time to zip big folders, so bulk downloads need to go in there as archive files, which you can't browse without downloading.) Mega seems to be best for this, minus the download caps, which makes it a poor solution for use with Colab where you need to re-download the data every time you get a new VM.
I think IPFS can get rid of all of the limitations. We should be able to get high bandwidth, no download caps, no special accounts, and no complex billing with it. >>38034471 was a test to see if it'll run fine on Colab. It looks like the answer is yes. The IPFS software seems to have its quirks though. I'm playing around with it to see what exactly it takes to make it a reliable solution.
>>
>>38036551
Sorry, not WASAPI, SAPI
https://stackoverflow.com/questions/55413027/implementing-a-tts-service-for-windows-10
https://docs.microsoft.com/en-us/previous-versions/windows/desktop/ms717037(v=vs.85)
>>
Wow Cookie, you're famous now! Maybe consider selling coderfag anon bath water as a lucrative business venture to fund your compute
>>
>>38037127
This is somehow the worst thing I've read in the past 7 days. Congratulations... I think.
>>
>>38036541
Interesting. On the Pixel (Android), I see a "Preferred engine" option under "Text-to-speech output" in the Settings. It looks like this app is able to add its own engine there: https://play.google.com/store/apps/details?id=es.codefactory.vocalizertts. I think this setting changes the default voice for all Google Apps.
>>
>>38037182
I have a Pixel, so having ponies say stuff on my phone for me would be epic.
Especially if Twilight told me reminders and essentially became my pony secretary/assistant.

Twilight as a Mobile Assistant Mare: https://u.smutty.horse/mezqnhsuamc.mp3
>>
>>38037182
I use a samsung and i see it also. Never noticed that
>>
>>38037664
I guess it is a standard android feature
>>
The stutter trick works very well with vowels on 15. However, I guess the only way to get the voices to stutter on consonants is to use the ARPABET spelling...?
>>
>>38036991
>>38034471
I think I figured it out. These are the steps I mentioned in the last thread for creating a file for download:
- ipfs add file_or_folder # this will return a Qm hash
- ipfs pin QmHash
- ipfs name publish QmHash

The last command, ipfs name publish, needs to be run periodically since nodes delete records periodically of who has what files. The standard recommendation is to run that last command every 12 hours.
The last two commands, ipfs pin and ipfs name publish, are NOT necessary for seeding. ipfs pin is there to tell ipfs to keep the relevant blocks around in case anyone asks for them. The last command, ipfs name publish, is there to broadcast to other nodes on the network the fact that you have those blocks so they know when to tell peers to connect to you.
The peering part of ipfs is extremely slow for less popular nodes like ours. We can speed it up by manually specifying peers that serve as hubs. This would be like having a torrent tracker, but one that's not tied to any particular torrents. If we all use the same few trackers that peer with each other, we would effectively get our own mini-ipfs that can find files much faster. We can do it with:
- ipfs id # run this once on the "tracker"
- ipfs id tracker_id # this gets a list of addresses associated with the tracker
- ipfs swarm peering add tracker_address

After that, subsequent downloads start immediately. It's also possible to have Colab instances seed files with manual peering.
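A minimal, untested sketch of automating that 12-hour republish, just shelling out to the CLI (QM_HASH is a placeholder for whatever ipfs add returned):

import subprocess, time

QM_HASH = "QmYourHashHere"  # hypothetical placeholder hash

while True:
    # Re-announce the content, then sleep for roughly 12 hours.
    subprocess.run(["ipfs", "name", "publish", QM_HASH], check=False)
    time.sleep(12 * 60 * 60)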

>>38037182
For anyone interested in implementing this, here are the relevant AndroidManifest items from that app:
- Service action android.intent.action.TTS_SERVICE (category android.intent.category.DEFAULT)
- ... The <service> declaring this needs an additional xml element outside of the intent-filter: <meta-data android:name="android.speech.tts" android:resource="@xml/tts_engine"/>
- Activity action android.speech.tts.engine.INSTALL_TTS_DATA (category DEFAULT)
- Activity action android.speech.tts.engine.CHECK_TTS_DATA (category DEFAULT)
- Activity action android.speech.tts.engine.ACTION_GET_SAMPLE_TEXT (category DEFAULT)

Crappy documentation for the relevant intents:
- https://developer.android.com/reference/android/speech/tts/TextToSpeech.Engine
Minimal sample app:
- https://android.googlesource.com/platform/development/+/master/samples/TtsEngine/AndroidManifest.xml
>>
>>38037054
i had a look at the documentation, most of it goes over my head, but I would kill for someone to implement this with talknet or something.
any codefags up for it?
>>
>>38037998
>android:resource="@xml/tts_engine"/
Here's the file from that app's res/xml/tts_engine.xml:
<?xml version="1.0" encoding="utf-8"?>
<tts-engine android:settingsActivity="es.codefactory.vocalizertts.ui.VocalizerTTSSettings"
xmlns:android="http://schemas.android.com/apk/res/android" />
>>
>>
>>38014772
is there a way to use talknet without transcripts?
>>
>>38038654
>Talknet without transcripts
Do you mean like: Training a model without transcripts, Synthesizing Talknet without an audio reference, or Synthesizing Talknet without text input?
>>
>>38038698
synthesizing talknet with only an audio reference. I am trying to do a bunch of grunts. using transcripts is fucking terrible.
>>
>>38038700
I'm pretty sure it's not possible to use TalkNet with only an audio reference, as it requires text to help it identify which sounds in the reference audio relate to which words. Though to be honest I don't use it all that much myself, so perhaps there is a workaround.
>Bunch of grunts
You may have to sound out a sort of "fake word" that emulates a sound similar to your grunts, and use a certain amount of punctuation to ensure TalkNet recognizes which words to use for the sounds it's been given as reference. Alternatively, consider using 15.ai with the earlier-mentioned [,'] trick to generate some NVs (non-verbals); you may have to use a word or letters similar to the grunt sound you're looking for. For example, "ooooh [,'][,'][,']" could give you the right grunt from the "ooooh", but may also generate a random NV afterwards that sounds similar enough to a grunt to work for your needs.
>>
>>38038739
I've tried this but its a shit show. a lot of the sounds even break the voice entirely if combined in an odd way.

>trial and error methods
this needs to be ready for marecon in a few days sadly. if I cant get something that can mostly automate this in under a try or two I could at best get maybe one voice done.
>>
>>38038700
Here are some decent examples of what I've generated over the past couple minutes with "ooooh [,'][,'][,']" and "hhhhh [,'][,'][,']".
Most of these generated decent text sounds along with 1.5 to 2 other NVs: https://u.smutty.horse/meztvkflzew.mp3
A fair amount of them are more shouts, moans or other longer vocal sounds. But there's still a decent chance of getting shorter ones, particularly with the "hhhhh" input.
>>
>>38038768
>here are some examples
okay now heres what I am actually going for.
https://u.smutty.horse/meztwfgxbvr.wav
this isn't an eroge sadly.
>>
>>38038773
I should also state that this is pretty sub par for what I was hoping for too. you cant really add the T sound to the beginning of this without the pitch dropping for no real reason.

it should sound like TehyhaaaAAAAh
but even using the very specific sounds in the talknet readme didn't help.
>>
>>38038777
>TehyhaaaAAAAh
I'd describe that more as a yell than a grunt. So you're looking for something martial rather than a "grunt" which would normally be like, if you're hit or stumble or whatever.
I know it's possible in 15ai to get those sounds, because I ended up getting Cozy Glow to make a similar sound a while back, though that was mostly by accident xD
Karate Cozy: https://u.smutty.horse/meztxotrfpz.wav
>>
>>38038777
On a related note. Remember that Talknet doesn't respond well to repeated letters and will only really extend a sound if the reference audio does.
So you'd preferably want to say "Yah" "Tehya" or "Tyah" in reference rather than something like "TehyhaaaAAAh".
Not that that's what you're doing, but thought I'd mention anyways, just in case.
>>
>>38038784
>>38038779
there are quite a few instances in the sound banks that call for grunt-like sounds, including damage, climbing, etc. Still, I guess I'll try to use both talknet and this for now if that's how it is.
>>
File: 575318.gif (929 KB, 1433x1061)
929 KB
929 KB GIF
>>38038777
>>38038779
"{HH IY1 AH1 HH} ! [,'][,'][,']| raged" = Generic martial strike yell. Adding extra "AH1" instances in the brackets gives different pronunciations, but may bork it.
Martial Twilight NVs: https://u.smutty.horse/mezuetcvhyo.mp3
Side note: there seems to be sadly little art of Twilight punching/kicking with her bare hooves.
>>
>>38034716
Many of these papers encode phonemes using "binary PF vector -> linear layer = phoneme embedding". But maybe the phoneme embeddings learned by current TalkNet models can already be decomposed along PFs?
I tried this by converting ARPABET into PFs using PanPhon and then running least-squares to solve for the linear layer. To embed a non-English phoneme, I convert it to a PF vector, then pass it through the linear layer.
Unfortunately, there were problems (diphthongs must be ignored because they count for two PF vectors, R^2 = 0.7 only) and the results were bad.
>https://u.smutty.horse/mezvtgnavth.mp3
This file has two French examples:
>Bonjour, je m'appelle Twilight Sparkle. J'habite à Ponyville. (Hello, my name is Twilight Sparkle. I live in Ponyville.)
>Je vis dans une bibliothèque à l'intérieur d'un arbre. (I live in a library inside a tree.)
The first clip of each pair is the sentence synthesized as English (no ARPABET input used), and the second clip is with pseudo-French phonemes. Neither is good.

This means that either least-squares isn't enough, the PFs are mismatched, or the embeddings don't follow a PF structure. Probably some of each. To test if PFs can work, it'll be best to just train a new model. (Diphthongs are still a problem, though, since the TalkNet ASR aligner only gives durations for whole phonemes.)
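For anyone who wants to poke at the same idea, here's roughly what the least-squares fit looks like (untested sketch; arpabet_to_ipa and phoneme_embeddings are assumed to already be extracted from the model and are placeholder names, as is the diphthong skipping):

import numpy as np
import panphon

ft = panphon.FeatureTable()

def pf_vector(ipa_segment):
    # One numeric PF vector per segment; skip diphthongs/affricates
    # that map to two vectors.
    vecs = ft.word_to_vector_list(ipa_segment, numeric=True)
    return np.array(vecs[0], dtype=float) if len(vecs) == 1 else None

X, Y = [], []
for arpa, ipa in arpabet_to_ipa.items():        # assumed dict: ARPABET -> IPA
    v = pf_vector(ipa)
    if v is not None:
        X.append(v)
        Y.append(phoneme_embeddings[arpa])      # assumed dict: ARPABET -> embedding vector
X, Y = np.array(X), np.array(Y)

# Linear map W from PF space to embedding space, fit by least squares.
W, residuals, rank, _ = np.linalg.lstsq(X, Y, rcond=None)

def embed_foreign(ipa_segment):
    # Pseudo-embedding for a phoneme the model never saw.
    v = pf_vector(ipa_segment)
    return None if v is None else v @ W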
>>
>>38038831
we could get twilight to go super saiyan with this technique soon enough!
>>
File: 2150729.png (3.08 MB, 3606x2602)
3.08 MB
3.08 MB PNG
>>38035562
Live in ~1 hour. Continuing with fine adjustments.
https://youtu.be/ouoUkEi06oI

Full playlist so far:
https://www.youtube.com/playlist?list=PLX9nDSq9VgBN1AsL-rN4HxKtzN6pVdgRv
>>
Is there going to be a PPP panel for Marecon?
>>
>>38039924
As far as I'm aware, not unless (You) do it.
>>
>>38039924
>>38039943
Lets make a really shitty low effort one
>>
>>38039924
For the love of Celestia please don’t invite Cookie or Delta/Noxfag to present the panel this time
>>
Ok this is somewhat interesting. Some madlad is using some sort of StyleGAN to generate anime waifus from rough sketches in real time. I hope he's not cheating and just comparing the sketch to an imagebase like thiswaifudoesnotexist.

https://twitter.com/t_takasaka/status/1477633104928178176
>>
>>38040514
A real time pony OC generator would be super useful and fun to use. Though art styles vary greatly throughout the fandom, so you'd probably have to have separate models trained on certain artists and on the show. I wonder if it'd be worth trying to get a Pony StyleGAN working just to generate storyboards or something: you draw stick-figure ponies and the model fleshes them out into simple sketched pones.
>Comparing the sketch to an imagebase
Their footage is sped up but it does seem fairly legit, as the images are very close to each other, but the background flickers fairly smoothly.
>>
>>38040011
Why not?
>>
>>38040514
He's using a fork of Stylegan Nada.
https://github.com/rinongal/StyleGAN-nada
>>
>>38039924
Working on something for a panel just in case.
>>
File: Glide.png (3.46 MB, 1112x2472)
3.46 MB
3.46 MB PNG
>>38040514
OpenAI recently revealed its DALL-E successor GLIDE, capable of both generating and editing images with text. Imagine what we'll be able to do if people release open-source replications as they did with DALL-E? Some of the example images in the paper are mind-blowing: https://arxiv.org/pdf/2112.10741.pdf
>>
>>38040638
Is that a serious question?
>>
>>38040708
Yes
>>
>>38040654
One of my students thinks that's actually pixel2style2pixel.
https://eladrich.github.io/pixel2style2pixel/
and trying the celebs_sketch experiment in the colab demonstrates the effect.
>>
>>38040685
That fox in the style of Starry Night is gorgeous and the body shape is accurate and shading is fairly accurate.
I am so super hyped for this. If it can draw my favorite quadruped this well, surely it can draw pony too.
>>
>>38040726
Then the only answer is "lurk moar."
>>
>>38041003
>I cannot eloquently present reasons so I will assume the one replying to me is a newfag
Very convincing.
>>
>>38041018
Yes
>>
>>38040685
the code is already open source, with a small/filtered pre-trained checkpoint: https://github.com/openai/glide-text2im
if you have lots of compute, it should be possible to train a model from scratch. as far as I understand, there are no secret tricks
I'm not sure how you'd collect captions for pony images, but tags might work. e.g. https://arxiv.org/abs/2112.13884 shows that bag-of-words works for CLIP-like models, so maybe bag-of-tags-conditional diffusion would work
>>38040730
hi professor
>>
File: phoneme-umap.png (34 KB, 1459x562)
34 KB
34 KB PNG
>>38039095
This paper (https://wellsd.net/pf-tts/ssw11_pf_tts.pdf) trains two models on 14 hours of English: one with phonemes (EN P-840) and one with PFs (EN F-840). The UMAP projections show that both models learn structure (grouping vowels/consonants, voiced/unvoiced consonants), but the PF embeddings have an additional axis-like structure (vowels are organized by roundness, consonants by backness). This structure isn't in the TalkNet projection, which might be why the least-squares approach failed. At the very least, it shows that PFs have an effect.
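If anyone wants to make the same kind of plot for our models, here's a rough, untested sketch with umap-learn and matplotlib (embeddings and labels are assumed to already be pulled out of the TalkNet checkpoint; both names are placeholders):

import umap
import matplotlib.pyplot as plt

# embeddings: (num_phonemes x dim) array, labels: matching list of symbols
reducer = umap.UMAP(n_neighbors=10, min_dist=0.1, random_state=0)
proj = reducer.fit_transform(embeddings)

plt.scatter(proj[:, 0], proj[:, 1], s=10)
for (x, y), label in zip(proj, labels):
    plt.annotate(label, (x, y), fontsize=7)
plt.title("TalkNet phoneme embeddings (UMAP)")
plt.show()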
>>
>>38040638
Because neither of them are a part of the PPP for all we care.
>>
>>38041831
>we
>>>>>>>>>>>>>>>>>>>>
Speak for yourself nigger
>>
File: pony.png (71 KB, 256x256)
71 KB
71 KB PNG
>>38040685
>"a cartoon of a pony on a computer"

It seems to have hallucinated a whole interface and what looks like skeletal rigging. Interesting.

Overall fun to play with, but it's clear the paper examples are cherry-picked like always. To its credit, this one is good at making SOMETHING concrete: it usually doesn't devolve into LSD nightmares like a lot of them. And usually at least one of the things mentioned in the prompt will be there. Beyond that though, it's hit or miss.

Also doesn't help that the castrated jackasses lobotomized their model like always:

>We constructed a set of filters intended to remove all images of people, violent objects, and some and hate symbols (see Appendix F of the paper for details). The size of the dataset after filtering was approximately 67M text-image pairs.

The AI genocide continues.
>>
>>38042385
It's very much 'miss' and not worth the time trying to generate pony unless you're looking for fully realistic horses.
Earlier I generated 10 using "My Little Pony" as the prompt, all just regular horses. It also didn't recognise "Fluttershy" at all.
All of this makes sense though, as I imagine the base GLIDE has only been trained on real-life images rather than media like cartoons.
That, or the amount of that kind of data is so small that it biases heavily towards the realistic and ignores the toon input request.
>Lobotomized their model
While I agree this is super dumb to do, don't really think this contributes to cartoon/pony accuracy at all. Just overall theme flexibility.
>>
File: Doodle 2.png (387 KB, 859x519)
387 KB
387 KB PNG
>>38042385
>>38042499
I hope someone out there makes a more Polished and Freedom version of GLIDE, If NovelAI can make GPT-J almost as good as GPT-3 then I'm sure we Autists can make GLIDE into something Better.

Anybody remember that Doodle AI? that shit was Amazing, Imagine if we could train GLIDE to do MLP Style Vectors or Fanart or even PLOT
>>
>>38042499
It could just be my imagination, but a lot of these seem to have the same deformed proportions as the old 80s toys, and that second one definitely looks like their style of plastic. Agreed though, you're not going to get pony out of this, but that's just down to the dataset. It does pretty decent work with the dataset it has given its constraints, but the lobotomy really is terrible. They didn't just take out Nazis or whatever, they removed any and all data regarding real people, "violence" and god knows what else. Makes it a lot less fun if everything has to be g-rated, and a lot more unintentionally limited.

Still, we'd need a dedicated dataset regardless.
>>
File: Pony_Wombo_Art_Combined.jpg (2.52 MB, 3240x3840)
2.52 MB
2.52 MB JPG
>>38042520
>Anybody remember that Doodle AI
Yeah, CLIPDraw. Had a ton of fun with that. Hadn't been able to get the colab for it to work for quite a while though.
If you wanna try out another AI that does at least recognize ponies (their names, and certain pony terms), try out (https://app.wombo.art/).

Bear in mind this one has a bad case of body splitting/separation, as well as color leaking into other elements, and thus can end up with kinda trippy results (though sometimes that ends up being a better result artistically). It usually requires multiple attempts to get a good result; selecting "No Style" normally gets better results too, at least from what I've seen. 100 character prompt limit, so you can make pretty wild descriptions.
>>
>>38042568
>Wombo
That was a good art thread, I would love to see what the text-to-image model trained with mlp images would be able to create.
>>
>>38042581
Is...is that Discord? I fucking knew it.
>>
>>38039614
Live in ~1 hour. Surely this will be done soon.
https://youtu.be/KhDx_kzlTH0

Full playlist so far:
https://www.youtube.com/playlist?list=PLX9nDSq9VgBN1AsL-rN4HxKtzN6pVdgRv
>>
reference audio autotune in the colab has been broken for a bit now. It's like...offset? hard to explain
>>
>>38042568
CLIPDraw itself has advanced too: there's now StyleCLIPDraw, as opposed to running CLIPDraw and then a separate style transfer.
https://github.com/pschaldenbrand/StyleCLIPDraw
If someone trained a vgg16 pony model this might be interesting.
>>
https://u.smutty.horse/mfakpbvguzq.wav
>>
https://u.smutty.horse/mfalfzkyopu.mp3
>>
File: LinkSnicker.jpg (18 KB, 256x353)
18 KB
18 KB JPG
>>38044598
>>
>>38044598
One of my all-time favourite audio clips has been wonderfully ponified. Great job anon, especially on making it sound authentic with the audio effects and mumbling lines.
>>
>>38044598
Have a (you) anon, can I ask you to do some content based on uk comedy shows?
>>
File: Spoiler Image (123 KB, 1280x720)
123 KB
123 KB JPG
>>38043476
>irl gimp and blender tutorial from chat
Pretty good so far.
>>
https://u.smutty.horse/mfanjcgszkb.wav

https://u.smutty.horse/mfanjxuzbde.wav

Since I'm a beatles fan (I've decided to call myself Beatles for now), I decided to try out a part of "When I'm 64" with Applejack, and I'm adoring the results. It's all so fucking clear! Two versions above, one without the instrumental and one with it.
>>
>>38045562
you may want to mess around with pitch editing on the audio reference to see what options are best for getting closer to proper AJ output.
>>
File: 1622008017931.png (53 KB, 296x334)
53 KB
53 KB PNG
>>38044598
>>
>>38045580

Sure thing! I'm more than happy to experiment!
>>
>>38045580

https://u.smutty.horse/mfanpcqrrxv.wav

Something like this?
>>
Longest sentence of MAS Greg Commentaries so far:
>MAS GREG_N__Because we did teach this lesson like everyone should behave correctly except for when we have reason to not behave correctly of course because
Gonna have to move the folder up a few directories just to stay under the Windows path limit, especially when adding the timestamp and file extension.
>>
Make sure to check out my panel at /mare/con.
Saturday at 5:50pm EST.
Be there or be square.
https://www.youtube.com/watch?v=DMFVGZy2iU4
>>38014783
>>
>>38046171
it got moved to 8pm
>>
File: 1548948.png (274 KB, 618x647)
274 KB
274 KB PNG
>>38015713
>>38043604
Can confirm both Offline and Collab Talknet, the results with autotune output enabled sounds bit fucky. No issues before until the recent update. Any ideas?
>>
>>38015713
>>38043604
>>38046309
can confirm as well. i finally managed to reinstall offline talknet and autotune output seems to be broken in a way.
>>
>>38046309
>>38046351
They're trying to stop the mare songs
>>
>>38046309
Personally I've always found the talknet autotune turns the ponies into chain smokers speaking through an electrolarynx. It feels the same in this patch to how it did previously, at least from my shallow testing (all in offline). But hey maybe I'm using it wrong.
Below audio was generated in August using whatever version was current at the time:
No autotune: https://u.smutty.horse/mfaqwgqitpg.wav
With autotune: https://u.smutty.horse/mfaqwjdbqdm.wav
>>
>>38046309
A recent change made it so that one pitch predictor (original CREPE) is used for reference audio and another (torchcrepe) is used for generated audio. Pic related shows how the two implementations can give different results.
The autotune shift is based on the difference in mean pitch, so if there are big jumps, the correction will be wrong. Using the median pitch or the same pitch detector for both might fix this. That's my guess, anyway.
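Something like this is what I mean by the median version (untested sketch; it assumes you already have the two f0 tracks in Hz with unvoiced frames set to 0, however they were extracted):

import numpy as np

def autotune_shift_semitones(ref_f0, gen_f0):
    # Keep voiced frames only.
    ref = ref_f0[ref_f0 > 0]
    gen = gen_f0[gen_f0 > 0]
    # The median shrugs off occasional octave errors much better than the mean.
    return 12 * np.log2(np.median(ref) / np.median(gen))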
>>
>>38046171
>running your panel on saturday
really? couldn't you have picked Sunday, since only the animation panel is worth its salt on that day?
>>
>>38046579
>chain smoker
No, I think it sounds more like a robot voice. "Friendship is Witchcraft" Sweetie Bot-style.
>>
File: PTS5 Thumb.png (342 KB, 1318x741)
342 KB
342 KB PNG
https://www.youtube.com/watch?v=0g8l2_Bm8xA
Pony Thread Simulator 5, featuring an especially fucked collection of posts voiced by classic TTS, 15.ai, TalkNet and myself.
>>
>>38047331
welcome back. thanks for the delivary
>>
>>38047331
Gonna need that Cum Wall song
>>
>>38047331
new pony thread simulator on my birthday?! best gift ever!
>>
>>38047551
Happy birthday anon!
>>
>>38047551
happy birthday. are you a wizard yet?
>>
File: sun berry.gif (731 KB, 328x464)
731 KB
731 KB GIF
>>38047331
wow Im in it
>>
>>38047565
nope. still got a ways to go
>>
>>38046675
Any way to revert back to the original pitch detection module? Perhaps having the option to switch between the two can be implemented in the future?
>>
>>38047331
Oh fuck I'm in this from months ago too, thank you for sticking around this stupid community and archiving its insanity. I hope it gets even more chaotic and evil as time goes on.
>>
>>38047331
Wow, that intro is so relatable
>>
>>38048567
You forgot to change your flag and alias, Delta.
>>
File: 1636859564761.png (284 KB, 1200x427)
284 KB
284 KB PNG
>>38047331
That entire cum zone segment alone makes this already one of the best things I've seen in terms of fandom content in a long time. My sides couldn't handle it.
>>
>>38047331
I'd rather have not seen the intro.
But other than that, it's alright I guess.
>>
>>38048905
Wow, look at all the fangs in those characters that are animals based on herbivore species.
>>
>>38047331
>This feels a lot like hell.
Couldn't have said it better myself. Real nice work on this one.
>>
>>38047331
>Thread Sim about Cum Wall
I'm the anon that linked you that thread ages back hoping for a PTS of it.... goddamn this blew my expectations out of the water, this is too amazing for how cursed it is
>>
Still separating MAS Greg commentaries. So far 4 episodes down and have about 17.5 minutes of clean data.
Using the average amount of usable audio between these four, I estimate another 83 extra minutes worth of Greg once I'm done.
So overall, an hour and a half of Greg is expected. This may or may not be usable for a MAS Twilight model, as his standard voice and their Twilight voice are arguably different enough to be noticeable, but similar enough that there's still a good chance it'll be usable and thus can bolster the model of either.

Between the Pinkie and Rainbow Dash voices me and Clipper have been able to get from our respective sources, and the large Twilight and Greg datasets, we should have a pretty workable quality MAS collection. I look forward to the kinds of content that can be born from that.
>>
File: FilteringResults.png (506 KB, 688x583)
506 KB
506 KB PNG
>>38049422
There's an example image in the paper that shows how much the filtered dataset degrades the quality of the model. The model they released is both small and filtered.
>>
>>38049468
ah, sorry, seems i was blind
>>
>>38047331
I love how you're keeping this sacred tradition alive, amazing job man.
>>
>>38047331
That was awesome. I just saw it on the stream.
>>
>>38047331
This was absolutely great. I won't even ree about the eqgshit in the intro.
>>
>>38049669
Heya Clip, just letting you know your scripts shared in your transcriptions tutorial don't seem to be working. But in a weird way.
Like, I'm running it in a folder called "Checking", and it says: "(null): can't open file 'Checking': [Errno2] No such file or directory Press any key to continue . . ."
But like, if there's no such file or directory, how come it knows the "Checking" name? Either it exists or it doesn't, and it proves it exists, but still says it doesn't. So strange.
>>
>>38049705
Remind me on the next stream I do, I'll see if I can work out the problem.
>>
>>38049722
That's the plan, lol. In the meantime I've been timestamping based on the label.txt file's time markers, using math to determine the minutes and seconds.
It's a pain, but at least progress is still being made while the scripting issue gets fixed. Getting out of the habit of not adding periods to the end of labels has been interesting.
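If it helps, the math part can be scripted in a few lines (untested sketch, assuming a standard Audacity label track: tab-separated start seconds, end seconds, label text):

def print_timestamps(path="label.txt"):
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip("\n").split("\t")
            if len(parts) < 3:
                continue
            # Convert the start time in seconds to minutes:seconds.
            minutes, seconds = divmod(int(float(parts[0])), 60)
            print(f"{minutes:02d}:{seconds:02d}\t{parts[2]}")

print_timestamps()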
>>
>>38049740
>Getting out of the habit of not adding periods to the end of labels has been interesting.
You can still do that. The script will just check for any lines that end with no punctuation and automatically add a full stop if that's the case; this is to ensure that every line ends with unambiguous punctuation and to eliminate user forgetfulness as an issue. It won't do anything if the full stop is already there, so it's perfectly fine to have some lines that do have them and some that don't.
>>
>>38041092
I'd be interested in this task.
Would scraping the tags from derpibooru work?
According to the paper, it seems the GLIDE model used the MS-COCO dataset for generating prompts for their millions of training images. However, I doubt we'd be able to do prompts like "Spike wearing a bane mask" due to the lack of data in the derpi datasets on what a bane mask might be.
Limiting it to generating images based on existing derpi tags seems to be a better option here.
Regarding training the model itself, they "currently do not plan to make the training scripts available", but it is similar to the openai/guided-diffusion training process except you have to DIY the data loading...
Mind you, I come from no pre-existing knowledge of model training outside of following existing colabs, so I'm pretty clueless. I just have the free time to collate and at least attempt something.
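For what it's worth, grabbing (image URL, tag list) pairs looks approachable; something like the sketch below is roughly what I had in mind. The endpoint and field names are from my reading of the Philomena API docs, so double-check them before building anything on top:

import time
import requests

# Philomena-style search endpoint (assumed from the public API docs).
API = "https://derpibooru.org/api/v1/json/search/images"

def fetch_tagged_images(query="pony, safe", pages=2, per_page=50):
    pairs = []
    for page in range(1, pages + 1):
        r = requests.get(API, params={"q": query, "page": page, "per_page": per_page}, timeout=30)
        r.raise_for_status()
        for img in r.json().get("images", []):
            url = img.get("representations", {}).get("large")
            tags = img.get("tags", [])
            if url:
                pairs.append((url, tags))
        time.sleep(1)  # be polite to the site
    return pairs

if __name__ == "__main__":
    for url, tags in fetch_tagged_images()[:5]:
        print(url, "|", ", ".join(tags[:10]))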
>>
File: 2780559.png (3.1 MB, 3840x2160)
3.1 MB
3.1 MB PNG
>>38049934
If this can be automated with scripting that uses Derpibooru tags to caption an image, the model will be able to grow progressively bigger as more images and their respective tags are added to Derpibooru. The only issue is that while the tags describe what's contained within a scene or its properties, they don't really describe what's actually happening in it.

For example the image here has the tags: safe | artist:ls_skylight | oc | oc:watermelon success | pegasus | pony | bed | candy | clock | female | food | framed picture | lollipop | moon | nintendo | nintendo switch | pillow | plushie | underhoof | wing hands | wings

However it doesn't describe "A blue-eyed female pegasus pony with a green coat and a black mane with light blue highlights laying on a round purple bed whilst playing on a nintendo switch that she's holding with her wings. She has earbuds in her ears and a lollipop in her mouth. There are various pillows, a blue hairspray can and a blue rabbit plushy on the bed. There is a neon sign reading "Melly" on the wall beside a purple curtain beside a window showing the night sky with stars and the moon. A digital clock and a picture frame containing the photograph of a tan colored pony with a blue and purple mane in a daytime outdoor setting are on the windowsill."

While tags will be useful, actually giving the AI something more coherent would be much harder. You'd have to describe every scene you feed it so it understands all its elements. It took me around 10-15 minutes to compile the tags and write an accurate description. Can't imagine how long it'd take to compile an entire accurate and workable set for GLIDE without some form of automation.

Perhaps something smaller would be preferable though. Rather than focusing on an entire scene, it could be the pony specifically, their posing and what they're directly interacting with.
If it's able to recognize and replicate ponies accurately, then it'll be a success. Especially if they turn out good.
>>
>>38050026
Also realized I missed the details about her lying on her back with her legs apart and the perplexed expression on her face. Or the blue glow of the switch screen and sign, etc.
So compiling descriptions is gonna have its fair share of issues too if you're not meticulous enough.
>>
>>38050026
Hmm, it's one thing to describe that scene with all those words, but it's another task to get the AI to figure out which objects in the scene correspond to the description you wrote without relying on comic2k, toonnet or other illustrated object identifiers thrown into the mix.
Especially considering how small the dataset is in comparison to the 3 billion images of GLIDE's private dataset seen in the main images, it's likely that there won't be enough overlap for all those specific details.
>>
>>38050041
Yeah, or where the objects are in relation to each other without specifically describing it.
>comparison to the 3 billion images of GLIDE's private dataset
Far as I know, the smaller GLIDE model that is publicly available still has 300 million parameters (the full private model has 3.5 billion).
So essentially a tenth of what the private version has. Not too far off, but still significant.
>>
>>38040568
He's planning to add sliders to change the style. It's probably possible to set this up so that the "middle" style is show-accurate.
>>
>>38041092
>I'm not sure how you'd collect captions for pony images, but tags might work.
I'm looking into this. I've collected all of the character image captions from the fandom wiki, and I'm currently collecting all of the images/tags from derpibooru. I plan to collect the images/tags from twibooru next, which should mean I'll have them for all the boorus.
>>38049934
>>38050026
>Would scraping the tags from derpibooru work?
>The only issue there is while the tags describe what's contained within a scene or the properties of it, it doesn't really describe what's actually happening in it.
The state-of-the-art dataset for this is Visual Genome. Here's a description of what the data looks like: https://desuarchive.org/mlp/thread/37884994/#q37937800
If we want a proper pony dataset for image captions, we'll need to modify WordNet to include pony concepts. There's an RDF version of WordNet called OpenWordNet. I stopped just before importing that into neo4j. I got sidetracked from that with our animation dataset task, then sidetracked from that with a fimfiction dataset creation task.
To replicate Visual Genome, we need to have a way to get bounding boxes and relational data for images. The animation data will be important for this since it will give us the data necessary to create object detectors with accurate bounding boxes for a huge number of show-accurate rendered objects. That will also give us partial relational data (possible subjects and predicates for each region). We may be able to repurpose some existing image captioning model (Oscar, VinVL, CLIP) to partially label our pony data, but we'll likely need custom tools and a lot of manual effort on top of that.

If we could ponify the visual genome data...
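To make that concrete, a ponified Visual Genome record would need to capture something like the following. This is a hypothetical example with simplified field names, not the actual VG schema, and the bounding boxes are invented:

# Hypothetical region + relationship annotation for pic related.
pony_scene_annotation = {
    "image_id": 2780559,
    "regions": [
        {"region_id": 1, "bbox": [800, 900, 1800, 1000],  # x, y, w, h in pixels (made up)
         "phrase": "green pegasus mare lying on a round purple bed"},
        {"region_id": 2, "bbox": [1500, 1200, 500, 300],
         "phrase": "nintendo switch held in her wing hands"},
    ],
    "relationships": [
        {"subject": "pegasus mare", "predicate": "lying on", "object": "bed"},
        {"subject": "pegasus mare", "predicate": "holding", "object": "nintendo switch"},
    ],
}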
>>
>>38047436
Updated description with the links.
>>38047551
Happy Bday, hope it was a good one.
>>38049229
Thanks for that, it was definitely a more than worthy thread for PTS.
>>38047350
>>38048554 >>38048567
>>38048905 >>38048964
>>38049040 >>38049669
>>38049673 >>38049677
And thank you all for the kind words, I'm happy that it was mostly enjoyable.
>>
>>38048364
Until a change is made, this fix should work. After running the interface, run this snippet in a cell:

# Monkeypatch torchcrepe's pitch prediction to go through the original crepe
# package instead. crepe.predict returns (time, frequency, confidence,
# activation), so [1] is the F0 track.
import torch, torchcrepe, crepe
torchcrepe.predict = lambda audio, sr, *a, **k: torch.tensor(
    crepe.predict(audio.detach().squeeze(0).numpy(), sr, viterbi=True)[1]
).unsqueeze(0)
>>
>>38045631
that's a lot better
>>
https://github.com/GameBiit/fimfiction-stories-downloader

Just as another experiment I wanted to test the above fimfic downloader while Synthbot is busy with animation datasets and other stuff, so I wouldn't have to wait for the final touches on his fimfic downloader colab.
To test this I've picked something that might be useful for our fellow musicfags, the 'Poetry' fimfics from here
https://www.fimfiction.net/group/1089/poetry from its 'Main' and 'Poetry stories' subgroups. I just followed the general instructions from the github to download them as separate txt files (also, I wouldn't trust my computer or wifi not to shit itself when downloading a proper search result of a few GB of text with just one Python script).

After that I needed some clever way to combine the individual txt files into one.
With this small dataset I just opened up CMD in that folder and used a single-line command
'type *.txt > _combine.txt'
to create a new text file combining all the text (you may want to check the very last sentence though, because in my files it added a bunch of gibberish).
In _combine.txt I placed <|startoftext|> on the very first line and <|endoftext|> on the very last line.

To get rid of the author names and the chapter names I've done the following:
1st RegEx (author + fic + 1st chapter name removed)
Find:

(>) .*?\R(>) by.*?\R(>) --------------------------------------------------------------------------\R\R(>) .*?\R(>) --------------------------------------------------------------------------

Replace:

<\|startoftext\|>\n<\|endoftext\|>


2nd RegEx (chapter names; you may want to leave the Replace field empty for normal stories)
Find:

(>) .*?\R(>) --------------------------------------------------------------------------

Replace:

<\|startoftext\|>\n<\|endoftext\|>

Due to some of the authors being "creative" with their formatting, I've done my best to sort out the weirder parts of this dataset.
So the overall verdict I would give is that it's usable, but for the bigger stuff I would still wait for the final colab fimfic version.
So the FIMpoetry is here (6.83MB). As said above, it hasn't been run through any tag/rating quality checks, so it's just a very raw training dataset for pony poetry/music.

https://u.smutty.horse/mfbecaejefv.zip
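If anyone would rather script the combine-and-wrap steps instead of doing them by hand, a rough per-fic version of the same idea looks like this (folder name and the blank-line cleanup are just my guesses, adjust to taste):

import glob
import re

chunks = []
for path in sorted(glob.glob("poetry_fics/*.txt")):  # assumed folder name
    with open(path, encoding="utf-8", errors="replace") as f:
        text = f.read()
    # Collapse runs of 3+ newlines so the model doesn't learn to emit endless blank lines.
    text = re.sub(r"\n{3,}", "\n\n", text).strip()
    chunks.append("<|startoftext|>\n" + text + "\n<|endoftext|>")

with open("_combine.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(chunks) + "\n")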
>>
Auto-tune should be back to normal now.
>>
boop
>>
>>38051202
Is there any chance that workable singing models could be created for Glimmy and/or Trixie? Or do they simply not have enough singing data to work with?
>>
>>38051136
Nice! A GPT-J fine-tuned with priority given to this data would be great for making poems and songs.
How did you scrape the group?
>Due to some of the authors being "creative" with their formatting Ive done my best attempt at sorting out weirder parts of this dataset.
What sorts of formatting fixes did you need to do?
>>
>>38051769
Nowhere near enough. They have 2.5 minutes of data each, while the mane six have 20+ minutes.
>>
>>38016330

Have you trained VITS on your own dataset yet, and if so, how did it go? Their MOS results seem impressive. However, I'm still a bit suspicious about whether it works on more emotional and diverse datasets.
>>
>>38051931
>GPT-J
you might get away with training 2.7B Neo, but yeah, small models will probably not have enough 'brain power' to understand the idea of rhythm.
>How did you scrape the group?
Seems the scraping on that github works in a 'dumb' way: you give it the web link to the group folder or the search result (the https fimfiction URL etc.), then it will look at every fic available on that page, then page 2, 3, 4 and so on, grab their txt/epub links and download them individually into a new folder where the script is.
Then I added both subgroup folders to one file and followed the rest of the steps I wrote in the post above.
>What sorts of formatting fixes did you need to do?
Some fics' authors had the weird idea of adding numbers to every line (normal text 'bazillion spaces' 123), one fic got deleted because it just didn't make sense when staring at it, and some had six lines of spaces between text lines, so I fixed those so all the fics have just one or two blank lines between lines (since I don't want the poem model to learn it can just add infinite new lines with no text).
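For anyone curious, the page-walking described above boils down to roughly this (the selector and parameters are placeholders I made up; the real downloader and fimfiction's actual markup will differ):

import requests
from bs4 import BeautifulSoup

def walk_listing(base_url, max_pages=10):
    links = []
    for page in range(1, max_pages + 1):
        r = requests.get(base_url, params={"page": page}, timeout=30)
        if r.status_code != 200:
            break
        soup = BeautifulSoup(r.text, "html.parser")
        # "a.story-link" is a placeholder selector, not fimfiction's real markup.
        found = [a["href"] for a in soup.select("a.story-link")]
        if not found:
            break  # ran past the last page
        links.extend(found)
    return links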
>>
>>38046171
LIVE in 1 hour
>>
File: 357249.png (48 KB, 256x256)
48 KB
48 KB PNG
Sifting through the commentaries, there's occasionally a bit of extra improv voicing data for characters like Pinkie and Dash.
Barely anything (less than a minute), so not sure how useful it'd be. Still, it's nice to hear unheard lines and stuff, and how well Greg can improv and just slip into character.
https://u.smutty.horse/mfbiyqkutjl.mp3
>>
Are whispering versions of the other TalkNet models coming? Twilight's pretty good when getting various lines, but it would be nice if the others had the same option.
>>
>>38047331
God, that was good!
Thank you!
Stupid question, why no white noise between threads?
>>
>>38046171
Well done, Snoops.
>>
>>38046171
>6 minute panel
>half of it is Elvis
>>
>>38046171
Great work Snoopy.
>>
>>38053206
If you're referring to the TV static from PTS3, that was only for one thread that was emulating a channel flipping style. I don't think I've used it anywhere else.
>>
>>38014800
>and goes onto reveal himself as the infamous Noxfag (>>38002630 #).
Holy shit. Imagine working hard at trying to drive off writefags. Fucking disgusting.
>>
Is clipper still working on his video?
>>
Something I whipped up in an hour
https://u.smutty.horse/mfbkjqngagt.mp3
>>
>>38053489
Oh, okay.
I thought it was a somewhat standard way to separate the threads.
I find it confusing to put them end to end like that, but I am not used to ptss.
>>
>>38053732
Yes, just taking a short break for the weekend to watch Mare/con/. Streams will resume on Monday, and I'm intending for it to be done in time for next week's rewatch.
>>
>>38053766
>I wanna rock
Not bad anon. Now if only we could have a singing Maud to sing that. Along with any other song that emphasizes "rock".
>>38053770
After all the work you put in lately, you deserve a break from editing, and from my suggestions/insights during stream, lol.
Btw, I reformatted the names of the current commentary audio I'd previously exported. Though I did set their timestamps all to "00_00_00_" as I have no point of reference without redoing the entire thing again. Figured it'd still be fine, only two commentary episodes have those anyway. I'll have to do the same for the raw RDP and non-raw MAS though.
>>
>>38040638
>Why not
Because he is a turbo threadshitter that has gone out of his way (literally for years) to actively ruin the fun for other maybe?
>>
>>38055238
This. It really shouldn't even have to be said, but Delta and Cookie are dead to the PPP.
>>
>>38055238
others *
>>
>>38055238
It would have been one thing if he did all of that garbage on another board. But the fact that he did it to everyone here, to us, that's the thing that's bothering me to no end. Some people are just naturally horrible, I guess.
>>
>>38055300
We don't actually know how much he did. Him bragging about his "achievements" isn't exactly the most reliable source. I'm not saying you should give him the benefit of the doubt, but don't take him at his word either. Especially since at least some of it has been confirmed as false already.
>>
>>38055300
>>38055338
It's not just that. He STILL denies that he's Noxfag even though he accidentally admitted to it and he's STILL shitting up the board under the NMM flag. He and Cookie haven't learned a fucking thing, and they're flaunting it in everyone's faces.
>>
>>38055345
You sure? He openly admitted to it in this thread after some other anon tried to argue that there's no proof.
>>
Probably not quite the right place for asking this, but what happened to the Fan Site Alternative thread?
>>
>>38056563
Prior to it getting archived it had turned into a literal bump general, and had been for some time. So when it archived I didn't make a new one. Thinking I might wait a while before making the next; maybe some new developments will happen before then. Then we can discuss what we want the future of the thread to be, and whether we still want it to be a recurring thing or if it should merge with TEMPO or something.
>>
>>38053766
Did a few cleanups to make it sound better
https://u.smutty.horse/mfbteljbwft.mp3
>>
>>38056623
I see. Thanks.
>>
up
>>
My thread became archived; does that mean it wasn't worth pursuing?
>>
>>38058208
what?
>>
>10
>>
>>38058208
Ehh, given that people were busy watching /mare/con, everyone was occupied with that. I would imagine folks will return to working on audio and everything else.
>>
Has anyone contemplated advancements in lower pitched (ie male) voices? They seem to be so much worse than female replication.
>>
File: Euterpe Model.png (1.7 MB, 1871x3500)
1.7 MB
1.7 MB PNG
NovelAI recently released their 16b model Euterpe. I decided to start out with a random story and derail it a bit, then insert the MLP Lorebook in the middle of the story and continue from there... so far it's great! There's a noticeable difference compared to Sigurd, and so far I haven't seen the AI mistake hooves for hands.

It's experimental right now and there is a common issue with the model repeating itself, but I'm sure they'll fix it with more tuning.

Pic Colour Index (I use Custom Colours):

Yellow: Starting Story Prompt.

Light Blue/White: Generated Text

Green: My Inputted Text

Dark Blue: Edited Generated Text.

No Modules were used and the Settings are on Default.

MLP Lorebook (Nicked from /aids/)
https://files.catbox.moe/jvsovi.lorebook

Has someone ported the fimfiction dataset to Novel AI yet? I would love to play with a proper MLP Model with the Lorebook.
>>
>>38059398
>16b Model Euterpe
Correction, it's 13B.
>>
File: 632047.png (3.62 MB, 4000x3000)
3.62 MB
3.62 MB PNG
>>38043476
Live in ~1 hour. Applying reverb, directional audio, and final small animation adjustments.
https://youtu.be/DqSVtXEjSo4

Full playlist so far:
https://www.youtube.com/playlist?list=PLX9nDSq9VgBN1AsL-rN4HxKtzN6pVdgRv
>>
>>38059398
For a story that started with a derailment, it's pretty coherent and consistent on its own, with very little blue text towards the end there.
Not familiar with the program you're using, but I'd definitely like to try this out.
>"When you come to you are surrounded by beautiful mares. They all seem to be wearing swimsuits and are holding spears."
Hot. Too bad the story didn't continue in a spicier direction.
>>
>>38059398
>Paying for AI someone else controls.
>YouWillOwnNothingAndBeHappy.jpg
>>
>>38059398
>Twilight giggles. "That's a funny name. You should have a name like mine, I'm named after a book series."
Why must the ai toy with my emotions like this?
>>
Hi again! I decided to do some edgy-ass lines for the ponies while testing inflections for TalkNet. I hope you guys like it!

https://u.smutty.horse/mfcfhgaqcnu.mp3
>>
>>38060921
I don't remember this Beatles song.
>>
File: 1637575194955.png (118 KB, 1994x1320)
118 KB
118 KB PNG
>>38059996
>How about you be named after a movie, like me."
>"Oooh, that sounds cool! call me... Rush Hour 2!"
kek
>>
File: file.png (465 KB, 1276x935)
465 KB
465 KB PNG
>>38059739
>Not familiar with the program you're using, but I'd definitely like to try this out.

It's a subscription service; the lowest tier is $10, but there is a free trial now if you want to try it. I'm on the Opus tier, but I wouldn't recommend buying a tier unless you get hooked on AI text adventures or smut.

>>38059747
>Paying for AI someone else controls.
It's a shame they won't open source their models and code, but at least the devs are based (for now). One of the main devs comes from the /aids/ threads on /vg/ and they aren't Latitude levels of retarded; nothing is filtered on NovelAI and they pretty much include anything in the training data.
>>
Can someone make a program that takes the clipboard text and says it immediately with TalkNet or something, like something that runs in the background?
I don't think it would be too difficult to do.
I would do it myself but I'm too inexperienced and retarded to wrap my head around it.
>>
>>38062166
>It's a shame they won't Open Source their models and code
Yeah, if something makes money, there's no open source for it.

Still, it would be cool if they could open source their older models if they've upgraded over time. That way they still have a reason to keep a subscription base, but also open source older redundant models that are still decent. Not that it's gonna happen of course, but something to think about.
>>
>>38025479
I want to wrap up the story track before switching tasks again.
- I cleaned up my script. It should be easier to support custom flows like >>38051136, though it'll still take some programming effort. I also added a flow to cache fimfiction chapters in a txt/ folder, and I ran it on the latest fimfarchive. It only takes about 15 minutes to make a pass over the entire archive on my desktop, so I can update the caches pretty easily as new versions of fimfarchive are released.
- There are several errors in the archive that I'm cleaning up now. In some cases, the index data doesn't match the epub. In those cases, I'm re-downloading the epub, and I'll be creating an index-delta.json to point out the differences. In other cases, the epub is just broken. I'm not going to bother recreating epubs, but I'll at least cache the html chapters in an html/ folder and update the index. Once I'm done, I'll try to get the fimfarchive guy to fix his script.
>>
>>38062310
You mean the thing that already exists within the DeltaVox options, "Tools > auto-infer clipboard"? Alternatively you can use Balabolka with the Microsoft TTS voices.
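And if you'd still rather roll your own, the skeleton is basically just a clipboard poll loop. Here's a rough sketch; the synthesize() call is a placeholder you'd point at whatever backend you actually run (e.g. a locally hosted TalkNet inference server), not an existing API:

import time
import pyperclip

def synthesize(text):
    # Placeholder: swap in a call to whatever TTS you actually run locally
    # (e.g. POST the text to a TalkNet inference server and play the wav).
    print("Would speak:", text)

def watch_clipboard(poll_seconds=0.5):
    last = pyperclip.paste()
    while True:
        current = pyperclip.paste()
        if current != last and current.strip():
            synthesize(current)
            last = current
        time.sleep(poll_seconds)

if __name__ == "__main__":
    watch_clipboard()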
>>
>>38063177
Synthbot, could you repost your new colab script? Also, does the new script support the plain <|startoftext|>Fimfic text<|endoftext|> format (both for single files and combined ones)?
>>
File: not drunk.jpg (76 KB, 1000x562)
76 KB
76 KB JPG
>>38059573
Live in ~1 hour. Final small animation adjustments.
https://youtu.be/U3aZwgUzIZo

Full playlist so far:
https://www.youtube.com/playlist?list=PLX9nDSq9VgBN1AsL-rN4HxKtzN6pVdgRv
>>
File: changes.png (26 KB, 917x563)
26 KB
26 KB PNG
DeltaVox RS V0.8.8.0B released
Changes:
- Added right click menu to individual utterances (export to WAV, copy text)
- Added random (sample texts) and clear text options for the edit box
- Added character, word, and split count
https://drive.google.com/file/d/1c6bNtUQYUdmcgY0lwOg3M8oE829c5qix/view?usp=sharing
Note that the sample random texts are only for English (the program supports and has public voices for Spanish and German too, and maybe French soon). As I currently do not have the compute to try new stuff out, I will focus on software. TFLite (greatly reduces memory and size footprint) and SAPI (would allow a voice to be used by any Windows app) support are things I've had on the back burner for a long time.
>>38052220
No.
>>
>>38064386
Not using anything from you, Noxfag.
>>
>>38064386
No one cares about your shit
>>
>>38064386
Your software is garbage and your model is garbage. You and Cookie are the root cause of all the drama in the PPP and it’s all because you both are compensating for how trash your work is, getting overshadowed by all the other codefags who quietly do better work than both of you combined.
Both of you will be slowly forgotten. The only thing that people will remember as your and Cookie’s sole contribution to the PPP are your incessant shitposts and abhorrent behaviors. You will never be a respected contributor to the PPP.
>>
>>38064478
unfathomably based
>>
Delta is trash, yes, but how about someone step up to make a competitor to DeltaVox instead of trashing the thread with long posts about why he is trash? Remember, there is no other model that runs on CPU only. GPUs are stupidly expensive now. I want Delta to leave as much as everyone else here, but before he does I want to see competition.
>>
>>38064544
There's literally a website that I don't even have to mention because everyone uses that instead already
>>
>>38064548
Yes, 15 exists, but he can update it at any time. TalkNet also exists, but not everyone is a voice actor who can make perfect reference audio.
>>
>>38064544
Why does it matter that a model uses CPU only if it gives garbage results like Deltavox? This “participation medal” mentality bullshit needs to stop. Delta’s work is trash and he has no value to the PPP. Not a single person uses his model for serious projects because everyone knows it’s awful.
>>
>>38064548
Putting all your eggs in one basket is absolutely fucking retarded. An open source alternative would be ideal but barring that a downloadable offline program is also fine. Right now if you ignore Delta and Cookie there's only Talknet and that's not a good place to be in.
>>
>>38064568
Because scalpers are the shittiest people in existence. Look for a 3060 at MSRP.
>>
>>38064568
>Why does it matter that a model uses CPU only
Because anybody can run it on their own hardware, as opposed to the usual models which require a semi-recent Nvidia GPU.
>>
>>38064575
But putting all of the eggs into one basket is impressive, which will move more people than a safe and even distribution, which could even go unnoticed.
>>
>>38064575
Amazing, so now we're in a Stockholm syndrome situation where Delta and Cookie are holding the PPP hostage with a gun to its head and some people STILL want them here just because there aren't that many codefags around. Simply amazing.
>>
>>38064599
Yes, exactly that. I think they're massive faggots but I care more about preservation of FiM. You don't have to suck their dicks off after each of their posts.
>>
>>38064599
Make your own models and I'll join you on the Cookie/Delta witch hunt right away.
>>
>>38064599
>Delta and Cookie are holding the PPP hostage with a gun to its head

Holy projection, you are the one throwing a tantrum after every one of their posts for something irrelevant to the PPP itself, get off my board and take your cancel culture cancer with you.
>>
>>38064635
I genuinely don't understand how people like you can think this way. Just because someone made a half-assed model, they deserve unlimited praise and are above any wrongdoing? Do you not see how fucking retarded that thought process is?
>>
>>38064660
I mean, if Hitler hadn't killed so many Jews we wouldn't have aspirin now.
>>
>>38064635
>>38064637
Looks like the Discord cavalry has arrived to defend their lord and saviour Cookie!
>>
>>38064660
>they deserve unlimited praise
Absolutely not. Delta and Cookie don't deserve any praise now. Hell I voted for them for Worst Namefag in the end of year awards.
>are above any wrongdoing
Also not true, I hope that everyone can agree that they both fucked up.
All I'm saying is that despite what they did (especially Noxfag), they also contributed to the PPP in the past and may contribute more in the future. Using >>38064671's analogy, if Hitler and Stalin came back from the dead and wanted to post in the PPP, they should be allowed to.
>>
>>38064773
This. Codefags are codefags, not pastors. While it would be nice if they acted like role models, that is not essential.
If Oskar Dirlewanger got resurrected and decided to contribute to the PPP I would have no problem with it.
>>
>>38064386
Thank you Delta! I'm in the middle of a voice project with my students and look forward to testing 8.0B in our workflow. At our current pace we'll have new models to share in the coming months.
>>38064432
Ignore and forgive pre-freshmen like this; I get the impression that there are many emotionally strained adolescent posters in these threads. The pandemic is taking its toll on everyone.
>>
>>38064773
The closer analog would be if Hitler came back and said "I can give you the cure for all cancers but you gotta let me implement the Final Solution."
>>
>>38064904
Look at the top of the thread to learn about the NMM crap. Delta deserves to be called out, but excluding him from the community altogether is just an idiotic move.
>>
>>38064904
A wonderful opportunity, professor. Let me know if you need anything, as I consider it one of my duties to spread free and open-source TTS. If you have trouble training open an issue in the TensorFlowTTS repo.
>>
>>38064922
Isn't there supposed to be a downside in that analogy?
>>
File: lol.png (812 KB, 1360x809)
812 KB
812 KB PNG
If you are shocked that [insert 4chan poster here] is a shitposting dickhead, then I don't know what to tell you.

People are acting like Twitter retards desperately trying to cancel a dude for saying mean things in their hugbox generals. It's pathetic.
>>
>>38065033
I think this anger comes out of a sense of betrayal. People idolized the codefags and got angry when realizing they're not squeaky clean, like this post explains https://desuarchive.org/mlp/thread/37942170/#38014338
>>
>>38065033
I think the issue here is the extent of his trolling. Supposedly Nox is responsible for killing some waifu generals or something. Idk, I've looked through the archive and haven't seen much from him that's worse than his "God Xfags are pathetic" posting. If that IS true, I'd at least understand WHY everyone's so fucking ass damaged over this dude, though it still wouldn't justify throwing the baby out with the bathwater.
>>
>>38065142
>Supposedly Nox is responsible for killing some waifu generals or something
He denied this, apparently
https://desuarchive.org/mlp/thread/38017734/#q38019517
>>
>>38065175
There's no way to know if that post is more or less true than his original bragging.
>>
>>38065210
I think that's the crux here. Nox has said so much conflicting shit, so the people who want to hate him assume the worst is true, and the people who don't are more willing to give the benefit of the doubt (or just don't care/aren't paying attention). I doubt this is something that'll ever be resolved, unfortunately.
>>
>>38064904
Lol you aren’t a fucking professor
>>
>tfw older than these threads
>>
>>38064904
>>38065020
This is the most pathetic attempt at a falseflag I’ve ever seen
>>
>>38065491
This is the most pathetic attempt at a callout I’ve ever seen
>>
>>38065498
Whatever you say, professor who's totally real.
>>
>>38065244
I really don't understand what he's trying to achieve by constantly gaslighting everyone in these threads. I don't know what he did under his identity as Noxfag either, but reading all his posts here under the guise of Delta, pretending like nothing is happening, makes me think that something sinister is going on. Hell, it's pretty obvious that the "professor" here >>38064904 is merely a character made up by Delta, but what I can't wrap my head around is what the point of his posts is. Why does he keep doing this? At least Cookie had the common sense to leave the PPP after the immense amount of backlash toward him.
>>
>>38063625
Can it wait a couple more days? I'm trying to get the last few cases of broken epubs working, and I didn't save a commit from the last working state (i.e., I still don't use git properly). I'm also waiting on a response from knighty so I can get access to the fimfiction api to update the index information when I re-download stories. Worst case if I don't get a response, I'll post the script with caching, updated chapter text, and updated wordcounts, but no other updated index data.
><|startoftext|>Fimfic text<|endoftext|>
The v2 version from the last thread should support this. You'll need to modify the template file per the instructions in the last cell. You should change the template file to look something like this:
<|startoftext|>{ join chapters.text with "\n" }<|endoftext|>
>>
>>38064904
What type of idiot professor uses 4chan?
If you are somehow real, leave and never look back. The more likely case is that you are Delta or Cookie.
>>
>>38065642
It's 100% Delta. I can even tell from the way the "professor" types.
>>
>>38065547
>but jreading all his posts here under the guise of Delta pretending like nothing is happening makes me think that something sinister is happening.
See >>38014816
>>
>>38065700
Exactly, so that means that we SHOULD be excluding Delta from the PPP, no? He's still shitposting and pretending like nothing's happening, but people seem to be giving him a pass because, well, he's posting under Delta now.
>>
>>38065712
You're not doing a very good job of excluding by angrily replying to a mundane post and starting more arguments.
>>
>>38065723
>angrily replying to a mundane post
Which post are you referring to?
>>
Get back to work, fags. You're not going to resolve anything in this thread.
>>
Discussion Topic:
I'm currently going over a dataset reformat and clean-up, so I've got a good opportunity to change things up.
Which method of splitting characters voice lines would you prefer?
a) Split by Season, e.g: Twilight_S1, Twilight_S2, Twilight_S3
b) Split by Year, e.g: Twilight_0, Twilight_1, Twilight_2... or Twilight_2010, Twilight_2011, Twilight_2012...
c) Split by Data. Where I split the data into X segments (chronologically ordered) and train a voice on each. (this one might be better for the AI since each embedding will have a similar amount of data)

And another question, which speakers (if any) would you like to control the date/season/age for?
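For clarity, option (c) would look roughly like the sketch below. It assumes I can pull an air date and a duration for each clip; both are assumptions about how the metadata ends up organised, not how it currently is.

def split_by_data(clips, n_segments):
    """clips: list of (air_date, path, duration_seconds) tuples (assumed metadata)."""
    clips = sorted(clips, key=lambda c: c[0])          # chronological order
    total = sum(c[2] for c in clips)
    target = total / n_segments                        # seconds of audio per segment

    segments, current, acc = [], [], 0.0
    for clip in clips:
        current.append(clip)
        acc += clip[2]
        if acc >= target and len(segments) < n_segments - 1:
            segments.append(current)
            current, acc = [], 0.0
    segments.append(current)                           # remainder goes to the last segment
    return segments                                    # e.g. Twilight_0 ... Twilight_{n-1}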
>>
>>38065803
How about clean up the thread of the mess you and Delta made
>>
>come to this thread
>people are still seething about delta and wasting replies
>whole thread dedicated to shitposting
kek, never change /PPP/
>>
>>38065841
>The only actual developmental progress being made on PPP is on Discord.
>Change my mind.
>>
>>38065841
Welcome to the PPP, ex-residence of content and productivity, now home of pointless namefaggotry and useless seething.
>>
>>38065830
How? Neither of them are jannies.
>>
>>38065803
It seems like splitting by season/year should only be done if it coincides with a change in the voice. e.g. how AJ's S1 voice is higher pitched than in later seasons.
>>
>>38065930
That makes sense but is harder to implement through programming (which I'm hoping to do since I don't personally have the time to review every file and I also dislike hard-coded dataset segmenting).
>>
>>38065946
In that case, splitting by seasons seems fine, assuming that voices are more likely to change between seasons than years. Equal amounts of data could help the models learn, but might backfire (e.g. a model trained half on S1 AJ and half on S2 AJ).
Automated approaches like clustering SV embeddings or looking at the F0 distribution could work too. Depends on how much time you have and which biases you want to inject.
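For reference, the SV-embedding clustering could be as simple as the sketch below, using Resemblyzer as the embedder purely as an example (any speaker-verification model would do):

from pathlib import Path
import numpy as np
from resemblyzer import VoiceEncoder, preprocess_wav
from sklearn.cluster import KMeans

def cluster_clips(wav_dir, n_clusters=3):
    paths = sorted(Path(wav_dir).glob("*.wav"))
    encoder = VoiceEncoder()
    embeds = np.stack([encoder.embed_utterance(preprocess_wav(p)) for p in paths])
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(embeds)
    return dict(zip(paths, labels))  # candidate segment/speaker id per clip

# e.g. cluster_clips("datasets/Applejack", n_clusters=2) to check whether the clips
# naturally split into an early-season and a later-season voice.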
>>
>>38066139
>clustering SV embeddings
Definitely an interesting idea. I'm not sure it'd handle the pony speakers well with the large variety of emotions, but that's something to consider for segmenting lower emotion datasets.
I'll look into it more later, especially if it's effective on pony speakers.
or wait, do we WANT emotive lines to be clustered? I see the potential for that too. Might be hard to label what each cluster is for the user though or might be outperformed by simply using different speaker ids for the hand labelled emotions. Ahhh, too many options again. Fuck it, start with the simple options.
>>
>>38065677
Funny, I once suspected one of my students might be Cookie or Delta because their code resembled named public github code, but no. At large research universities in the US we use every resource available for instruction; if the code fits, so to speak.
>>
>>38066139
>looking at the F0 distribution
I don't think that'd work. If I looked at F0 distribution of each file then short files would end up at the skewed ends of the distribution and the model might struggle with generalising to different durations. Also I think biasing F0 probably won't help end users too much.
I could also look at F0 distribution of each folder, but that assumes the datasets are organised in a reliable way, so best to avoid if possible.
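For reference, the per-file statistic in question would look something like this (librosa's pyin with made-up settings); with only a handful of voiced frames in a short clip, a number like this bounces around a lot, which is the skew problem I mean.

import numpy as np
import librosa

def median_f0(path):
    # Estimate a single F0 summary statistic for one clip.
    y, sr = librosa.load(path, sr=22050)
    f0, voiced, _ = librosa.pyin(y, fmin=librosa.note_to_hz("C2"),
                                 fmax=librosa.note_to_hz("C6"), sr=sr)
    voiced_f0 = f0[voiced & ~np.isnan(f0)]
    return float(np.median(voiced_f0)) if voiced_f0.size else float("nan")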
>>
>>38066210
You're a moron if you really think you're convincing anyone Delta
>>
Somebody post a funny pony audio right now or I'm going to kill myself.
>>
>>38066230
https://u.smutty.horse/lyuoislbcqn.wav
This is a bit old but it's funny and pony: The Truth Revealed, from the good poni folder.
>>
NEW THREAD
>>38066342



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.