/mlp/ - Pony

  • There are 72 posters in this thread.





File: AltOPp.png (1.54 MB, 2119x1500)
TwAIlight welcomes you to the Pony Voice Preservation Project!
https://clyp.it/tm03e5en

This project is the first part of the "Pony Preservation Project" dealing with the voice.
It's dedicated to saving our beloved pony's voices by creating a neural network based Text To Speech for our favorite ponies.
Videos such as https://youtu.be/GuJKTodX1FA or https://youtu.be/DWK_iYBl8cA have proven that we now have the technology to generate convincing voices using machine learning algorithms "trained" on nothing but clean audio clips.
With roughly 10 seasons' worth (9 seasons and 5 movies) of voice lines available, we have more than enough material to apply this tech to our deviant needs.

Any anon is free to join, and many are already contributing. Just read the guide to learn how you can help bring on the wAIfu revolution. Whatever your technical level, you can help.
Document: https://docs.google.com/document/d/1xe1Clvdg6EFFDtIkkFwT-NPLRDPvkV4G675SUKjxVRU

We now have a working TwAIlight that any Anon can play with:
https://fifteen.ai/
https://derpy.me/vCzm2 (48KHz Training)
https://derpy.me/hdJQF (48KHz Synthesis)
https://derpy.me/NR7Xi (Ngrok Synthesis)
https://derpy.me/YTJ94 (Guide)

>Active Tasks
Cookie is working on controllable speech
Research into animation AI
Research into pony image generation

>Latest Developments
Clipper sorts animation files (derpy.me/O24pp)
15 working on a new model (final2.15.ai)
Clipper looking for AI skit ideas (derpy.me/JfVsA)
New datasets (>>36548549 >>36550153 >>36567503 >>36545686)

>Voice samples
https://derpy.me/fHs3K
https://derpy.me/O1xdh

>Clipper Anon's Master File 2.0:
https://mega.nz/#F!L952DI4Q!nibaVrvxbwgCgXMlPHVnVw
https://mega.nz/folder/0UhSmYAB#WBrB-qCprQTofkAhwMp5CQ

>Synthbot's Torrent Resources
https://derpy.me/ZJNca

>Cool, where is the discord/forum/whatever unifying place for this project!?
You're looking at it.

Last Thread:
>>36536892
>>
FAQs:
>READ THE DOC
Do it now
derpy.me/V7cMp

>Where can I find things made with the voice AI?
In the Good Poni Content folder: derpy.me/23EUs

>Did you know that such and such voiced this other thing?
Yes. We are very much aware. It is best to keep to official audio only unless there is very little of it available. If you know of a good source of audio for characters with few (or just fewer) lines, please post it in the thread. 5.1 is generally required unless you have a source already clean of background noise. Preferably post a sample or link. The easier you make it, the more likely it will be done.

>What about fan-imitations of official voices?
No.

>How do I make the voices?
Several guides are available. In-depth guides on how to do training and synthesis (making the ponies speak) are in the doc. If you don't want to use the navigation bar in the doc, the sections are also directly linked in the OP. If you want to use the WiP 48KHz notebook, some kind Anons have put together some image guides for you.
48KHz Training: derpy.me/wW2hX
48KHz Synthesis: derpy.me/j4MXQ

>How do I make the ngrok links?
Doc: derpy.me/SfIhY
Video: derpy.me/qYgIp

>Where are all the voice samples?
In the doc.

>Is there a place I can find all the pony models?
In the doc.

>What about muh waifu?
Check the doc.

>Will you guys be doing a [insert language here] version of the AI?
Probably not, but you're welcome to. You can, however, get most of the way there by using phonetic transcriptions of other languages.

>What about [insert OC here]'s voice?
Not a priority. Again, however, you're welcome to. There are already people doing this.

>Where can I view the PPP /mlp/con panel?
YouTube: youtu.be/WtuKBm67YkI
CyTube chat: pony.tube/videos/watch/b83fbbfc-6d4e-4768-8deb-edb61ea38abb

>I have an idea!
Great. Post it in the thread and we'll discuss it.

>Do you have a Code of Conduct?
Of course: fifteen.ai/code

>Is this project open source? Who is in charge of this?
derpy.me/CQ3Ca
>>
File: ASmallAnchorUUUU.jpg (35 KB, 284x300)
>>36577682
Anchor.
>>
I like how that last thread got like over 200 posts as soon as 15.ai came back online and hit the bump limit right after
>>
>>36577726
15's work has undergone so much change in just a year. Something we couldn't have imagined at the start of this project, he accomplished in just a couple months and continues to improve.

We all enjoy his work, and his work has no doubt kept some folk inspired to continue their own, as well as keep bystanders watching this or even joining in to contribute.
>>
Happy 16th birthday to 15 :)
>>
NEW:
https://a5f97c414a78.ngrok.io/
>>
>>36577682
The final2.15.ai link in the OP doesn't work. How are people accessing the site?
>>
>>36578006
Just go to the regular 15.ai site. The testing link will be removed in the next OP.
>>
>>36578006
are you retarded
>>
>>36578019
>>36578026
why would you leave a dead link in the OP
>>
>>36578030
Mistakes/forgot. But fifteen.ai is linked first.
>>
File: 1587919941414.webm (553 KB, 1280x720)
>>36578006
>The final2.15.ai link in the OP doesn't work.
>>
First the retard in the last thread >>36577908 who couldn't get the site to work, now this >>36578006
>>
>>36577791
can't believe the site has been up for a year already. PPP is turning 2 in about 1.5 months. time moves too quickly
>>
>>36578051
Is anyone still keeping track of the number of PPP posts graph? The last time I saw that was maybe a year ago, wondering what an updated graph would look like now
>>
>>36578085
I made the most recent graph, different anon(s) made the previous graphs. was probably going to update it for mlpcon, but I could do one now if you want
>>
>>36578124
do it
>>
Reminder to link any /v/ or /co/ threads here when they happen
>>
https://vocaroo.com/1nqS9D5ssVqz

I wanted Fluttershy to say "Well done, girls, you did the right thing" at the end, but I just couldn't get a good take after several tries.
>>
>>36578146
For Fluttershy I use "|Yay!" if I want her to sound pleased. This is what I got on my first try with "Well done girls, you did the right thing.|Yay!", I dunno if this is what you were aiming for
https://voca.ro/14VlL2jWvHO4
>>
>>36578175
I was going for something more solemn, but thanks, anyway.
>>
>>36578175
Use this one instead https://voca.ro/1oeKvCYnm3hX
She sounds so smug, it's perfection
>>
>>36577693
https://vocaroo.com/1cyLe4eCHQuF
>>
>>36578195
That's a good one. Here's another one, fellas. E-points if you know the song.

https://vocaroo.com/12iPkXelR62q
>>
I've been experimenting with a system where I tag fanfics with characters, any shippings, and their ratings, and train a GPT-2 model on them. The result is a model where you can insert a prompt and have it start from there like normal, OR you can specify some tags and let it come up with something on its own.

I tried it with a bunch of BFDI fics, and I'm probably going to do it with MLP fics too if I have the time. Here's a set of examples https://pastebin.com/TKwdx4q1

It seems REALLY prone to inserting the ship 4x into stories, even when the tags specify something completely different. It's probably because 4x is a popular ship and has a ton of fics compared to everything else, so whenever it uses anything it learned from a 4x fic it tends to completely derail things.

Posting this here because I figured this concept might be good for anyone working with GPT-2.
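A sketch of what that tag-conditioning might look like as a data-prep step. The control-token names (CHARS, SHIPS, RATING, STORY) are invented for illustration; the anon didn't post his actual format.

```python
# Hypothetical formatting for tag-conditioned GPT-2 training examples.
# Prefixing each fic with its tags teaches the model tag -> story,
# so at generation time you can supply only the header and let it
# invent the story that follows.

def make_example(characters, ships, rating, text):
    """Build one training example: tag header, then the fic body."""
    header = (
        f"CHARS: {', '.join(characters)}\n"
        f"SHIPS: {', '.join(ships) if ships else 'none'}\n"
        f"RATING: {rating}\n"
        "STORY:\n"
    )
    # <|endoftext|> is GPT-2's standard document separator.
    return header + text + "\n<|endoftext|>"

example = make_example(["Twilight"], [], "safe", "Once upon a time...")
```

With enough examples per tag combination, the model should stop defaulting to the overrepresented ship, though heavily imbalanced tags (like 4x here) may still bleed through.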
>>
>>36578274
Chestboobs only look bad on horses.
>>
File: 2545789.png (1.35 MB, 1204x1700)
>>36578274
>>36578209
>>
>>36578297
Very nice edit
>>
File: 1594855918578.png (241 KB, 300x310)
>>36578006
>>
>>36576994
https://vocaroo.com/1hpSjW2TiSP2
https://vocaroo.com/1hEO9PVobppI
https://vocaroo.com/13EQ9Ox4qvuf
https://vocaroo.com/17m96vsN6jJI

more moany pony
>>
>>36578336
https://u.smutty.horse/lzrdaxjdhzh.wav
>>
>>36578336
>>36578341
>>
I think I found a word that's not in 15's lexicon, should I email it to him? Does he respond to them?
>>
Is it just me or does Discord sound a bit off in the latest version of 15.ai? Occasionally he'll sound like himself on certain words but most of the time he sounds almost like a different character.
>>
>>36578406
Wait, apparently it was just the line I was trying to generate??? Regenerating a few more times didn't fix it, but improved things a bit. No clue why this specific line would make him sound so strange, but here's an example. It's mainly in the middle of the sentence.
https://vocaroo.com/18ncy2NqonpY
>>
>>36578404
Yes. Missing words are something he's very likely to respond to in an e-mail.
>>
>>36578446
Maybe it's a data issue? Clipper, do you know what might be happening?
>>
>>36578452
The only dataset issue I can think of that would cause that effect would be labelling two different voices as the same character and therefore confusing the model. Given how many times I've gone over the MLP voice dataset, and the fact that this strange effect seems to only be present in very limited circumstances rather than a consistent thing, I find it quite unlikely that that's the case.
My guess is that it's a quirk of the model applying its learning from all characters to each output, something that (maybe?) will work itself out as the model is trained more.
>>
clipper, im testing the new ngrok voices and i must say, they are pretty good for the most part. Obviously it's gonna need more training since characters have random background crackling and/or the voice changes dramatically (for the life of me i cant get Cozy Glow to stop sounding like a toddler high on helium), while there are a few characters that are sounding 90% on the spot.
I also need to report one bug (i think?): when selecting a character model that has an alternative version (e.g. Mean Twilight), it will also automatically select that character's main model as well.
btw, what happened to the Postal Dude voice?
>>
>>36578549
I mean, I don't think the problem is with the quality, it's just that the characters don't quite sound like themselves? They sound more like what you would imagine a cyborg version of the characters to sound like. This is especially bad for TF2 mercs like Demo and Heavy.
>>
Last link from me for the day probably
http://f6883fc417a4.ngrok.io/
>>
File: choc.jpg (29 KB, 630x414)
>>36578336
https://vocaroo.com/146OV9ecR12K
https://vocaroo.com/1dww4Kiye0It
https://vocaroo.com/13UoWyEZgHuq
https://vocaroo.com/12VlGr2PhpTt
https://vocaroo.com/1nBfKd2s6P0D
https://vocaroo.com/1cRXuPx11nJ7
>>
>>36578566
not sure if im correct, but could it be that there are simply too many voices and the training model just shits its pants when it tries to map out hundreds of different voices?
Like im messing with the nameless model and it's pretty 50/50 on him saying stuff correctly (im trying to get the word "working" but the model keeps making the sound "wowking").
>>
>>36578616
Well, by the Nameless model you mean the ngrok? I think we're talking about the 15.ai Discord model.
>>
>>36577302 (Philomena Scraper Anon)
Any chance you can go through the boorus and check which ones have database dumps, which ones have a JSON API, and which ones need to be scraped the hard way? We only need to scrape enough information to figure out:
- What is the image URL
- What are the tags

>>36577194 (Practical Anon)
I think we're starting with butts, but we should be good with whatever anons feel like cropping, as long as it's tagged appropriately. If we don't have enough images right now, we can just wait a year or two until the algorithms are good enough.
>>36577210 (Cookie)
How long did it take to scrape that? Do all of those images have the original image filename or the image id? Any chance you can upload those 800GB as a split tar file?
>>36577323 (Cookie)
>>36577352 (Philomena Scraper Anon)
(You)ing these so I can reference the code samples later.

>>36577461
Tagpls looks promising, but there's a lot it doesn't do that we would need. For example:
- We can't label derpibooru tags as relevant/irrelevant for a given experiment and have the results update the set of images under experiment.
- We can't train a YOLOv3 model on already-labeled images and display results to labelers to reduce their workload.
- The export functionality seems to be broken.
- I don't think it's open-source, so I can't modify tagpls to handle any of this.

Maybe we can share tags with each other since our data should be compatible.
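The two fields needed per image (URL and tags) are easy to pull from any Philomena-style JSON API. A minimal sketch follows; the endpoint path and field names are taken from Derpibooru's public API as I understand it, so verify them against whichever booru you actually scrape.

```python
# Minimal Philomena-style API scraper sketch: fetch one page of search
# results and keep only (image URL, tags) per image.
import json
import urllib.parse
import urllib.request

# Assumed endpoint; other Philomena boorus should expose the same path.
API = "https://derpibooru.org/api/v1/json/search/images"

def search_url(query, page=1, per_page=50):
    """Build the paginated search URL for a tag query."""
    params = urllib.parse.urlencode(
        {"q": query, "page": page, "per_page": per_page}
    )
    return f"{API}?{params}"

def extract(payload):
    """From one page of JSON results, keep only (url, tags) per image."""
    return [
        (img["representations"]["full"], img["tags"])
        for img in payload["images"]
    ]

def fetch_page(query, page=1):
    with urllib.request.urlopen(search_url(query, page)) as resp:
        return extract(json.load(resp))
```

Since the parsing is isolated in `extract`, swapping in a different booru (or a database dump) only means changing `API` and the two field lookups.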
>>
>>36578549
>clipper, im testing the new ngrok voices
What made you think the new ngrok was written by Clipper?
(for future reference, I'm the one to nag about bugs)

>for life of me i cant get Cozy Glow to stop sounding like toddler high on helium
Most of the child characters are struggling. I've added pitch conditioning, which is currently in training. That should resolve some of the issues.

>when selecting character model that has alternative version (e.g. Mean Twilight) it will also automatically select that character og main model as well
That IS definitely a bug. Thanks for pointing it out, I'll have a look into it.

>>36578566
>it's just that the characters don't quite sound like themselves?
I think it's related to the fact this model is a VAE. It's much more stable and can produce a large variety of outputs, and can be controlled better once I get around to it, but it takes longer to learn a latent Z code that doesn't contain speaker information.

>>36578616
>im trying to get the word "working" but the model keep making the sound "wowking"
Hmmm, I can't say for sure if that's because of bad training data or because of a failure on the model side.
I have not checked every single file since there's too much; I just remove ones that perform exceptionally poorly. If a file is only slightly incorrect, it will normally be left in the training data.
>>
Can you guys tell which character this was?
https://voca.ro/161e9mtsr8zm
(this is from the ngrok)
>>
>>36578832
the input was "faggot nigger faggot" if anyone is wondering lol
>>
>>36578832
Sounds like Derpy lol
>>
>>36578784
>How long did it take to scrape that?
It took multiple days, but I did it on a 17mbps connection so it should go faster for most people.
>Do all of those images have the original image filename or the image id?
{image id}.{ext}
>Any chance you can upload those 800GB as split tar file?
I deleted a lot of the data to save space.
I can redownload it all again, but if you have a large enough Google Drive, I think it'd make sense to use a Google Colab session and just download straight into the Drive.
>>
>>36579026
800gb seems a bit small for all of Derpi. My archive is approximately 2.5TB compressed.
>>
>>36579084
https://derpibooru.org/search?q=score.gt%3A100%2C+tag_count.gt%3A10%2C+-screencap%2C+-meta%2C+-edit%2C+-transparent+background%2C+%28explicit+OR+safe%29&sf=id&sd=desc
That's the search query left in the source code right now. From what I remember, I ended up using a minimum score of 100 so the number of safe images was roughly equal to the number of explicit images. (first goal with that dataset was creating a basic NSFW detector)
>>
>>36578832
>>36578976
it was actually Twilight but to me it doesn't sound like her at all :/
>>
Is it really necessary to have the message about crediting 15.ai in so many different places?
>>
>>36579419
https://twitter.com/fifteenai/status/1362927419779334147
>>
>>36579419
>>36579443
Even with that many reminders Twitter retards still manage to fuck that up
>>
>>36578784
I've put together a super basic metadata scraper. Gathers image ID, tags, and location of full res pic. Should be easy to modify for other fields and change the format of the output. Can get it here:
https://pastebin.com/QzK0ZTRm

I only know of Derpi doing the database dumps; at least from a quick look I don't see any mention of them for the other boorus. Would recommend scraping from PonerPics as it has imports from both Derpibooru and Ponybooru. Twibooru might have some original stuff, but likely not too much that hasn't also been added to the other boorus. Also, Twibooru is based on BoR and not Philomena, so this script would not work for it. It does have an API though.

>Philomena Scraper Anon
It's just Doc Anon again.
>>
>>36579443
i mean it's an easy thing to forget. most internet stuff doesn't involve credits, you just throw some junk up on the screen. this is special, so it should be credited, but.. i'm sure it's not a lot of people's habit.
>>
>>36578784
quick search of "butt,-eqg,-anthro,-human,-animated,score.gt:0" with the default filter on derpi yields 13.7k images. should be able to get at least 5k, more if NSFW is included. seems okay, transfer learning should help

regarding boorus:
- derpibooru: has a DB dump and API. DB dump is best for large-scale scraping, though you need 31 GB of space
- ponybooru, ponerpics, bronyhub: have the Philomena API (enabled even if the docs page is not, e.g. ponerpics) but no dumps. maybe you could ask admins for dumps
- twibooru: uses the old API from booru-on-rails. no dumps, maybe talk to admins
- rainbooru: no API or dumps
- bronibooru: shutting down in two weeks. probably no API. I think TPA will host a dump
- onlycomfy.art: curated selection of comfy art, mostly from derpi I think. API for reverse image search
- others: manebooru (offline, also a derpi import I think), lyrabooru/foalcon (too specialized)

for any Philomena booru, you can use DerPyBooru (https://github.com/joshua-stone/DerPyBooru) for convenience
getting images from derpi is probably enough, since it has the most images and largest community (so more people voting). using just a single booru also means the tags will be more consistent
the recent boorus (ponybooru, ponerpics, bronyhub, twibooru) are mostly imports from derpi, but not completely; could be worth scraping if you want to deal with deduplicating, checking tag consistency, etc.
I'm not sure about rainbooru/bronibooru, but you'll have to check tag consistency and scrape (or wait for a bronibooru dump)

>We can't train a YOLOv3 model on already-labeled images and display results to labelers to reduce their workload.
some of the server-side software supports this:
CVAT: https://github.com/openvinotoolkit/cvat#deep-learning-serverless-functions-for-automatic-labeling
Scalabel: https://www.scalabel.ai/doc/auto-label.html
requires setup though. and I think there was another one but I don't remember the name
>>
>>36579564
Why not train it on NSFW images (excluding eqg, anthro, human, and a few other things) outright?
>>
>>36579577
yeah, you could. just depends on what the goal is
>>
>>36579564
Might want to filter humanized as well.
>>
>>36579443
Dude said he was on an iPad and didn’t see the message. Give him some slack, jesus is 15 autistic?
>>
>>36579640
He uses 4chan like the rest of us, so probably.
>>
Anyone got any advice for setting up a Colab for Mozilla TTS? It sucks being a non-codefag who can't make heads or tails of this shit.
>>
File: graph.png (60 KB, 2078x1280)
>>36578132
Okay, here's a preliminary version with the recent post counts. TwAIlight pic still needs to be added and text sizes need to be adjusted. Let me know what developments in the last ~7 months should be included.
>>
>>36580314
Nice
Your convergence seems clean.
Oh wait, it's not a NN, it's (us)

What happened early January? 15's reopening or something?
>>
>>36580316
Looks like it was thread 70, which only lasted ~4 days: Jan 2nd to Jan 5th. 15.ai was brought back online on the 3rd/4th after at least a month of downtime (multiple months?). Lots of clips/content, and discussion about animation, potential legal problems, and 15 (after the site was posted on HN).
>>
File: 1601181672599.png (148 KB, 2078x1280)
>>36580314
Fake news, just got the Fulton County addendum. Here's the real chart.
>>
>>36579640
i have an ipad, it's fucking unmissable because of the layout
>>
>>36577693
it ain't mine, but i will keep posting until i have confirmation that it's going to be used
Lilac (new).zip https://www21.zippyshare.com/v/5SawkLi5/file.html
Carol (new).zip https://www91.zippyshare.com/v/FXEwBQIp/file.html
Milla (new).zip https://www19.zippyshare.com/v/X3emKUC6/file.html
>>
bump
>>
The emotional contextualizer is astonishing tech. When you finally find the right zoomer emojis to tweak the text you can get human-tier emphasis.

55 seconds long, an argument about nothing copied from ATHF feat. Jane, Applejack, and Scout. Sorry no soundfx, I was really invested in the raw enunciations.

https://voca.ro/1mAKPaqFltRr
>>
>>36580878
This is really impressive. What's the process for finding different contextualizers? Do you just input random sentences that you think will sound close to what you want?
>>
>>36580878
Great work. It is tough finding the right emojis, because even if your context sentence works for one take, it may not work for another where you want a similar or the same emotion. Certain words, for whatever reason, seem to make the character speak in a different way. Some characters want to blow past three words at once, while on other words they get hung up and come out really quiet or silent.
>>
https://cdn.discordapp.com/attachments/115996574416502798/812726749208117248/fluttershy-1613839502063-82.1289.wav
https://cdn.discordapp.com/attachments/115996574416502798/812727697499095100/fluttershy-1613839741686-78.9551.wav
https://cdn.discordapp.com/attachments/115996574416502798/812729212737552414/queen-chrysalis-1613840083257-74.8047.wav
having way too much fun
>>
File: 1600074552874.jpg (21 KB, 200x200)
>>36581358
>cdn.discordapp
>>
>>36581137
I'm still exploring and excited to see what more I'll find. That was only about 30 mins of work (generation and editing, the whole thing) to make that 1-minute scenario. I only generated at most twice for each sentence, since you get three possibilities with each gen now. I used very basic contextualizer phrases for that. Mostly I used "i'm tired", which gave me that more exasperated, downbeat vibe, plus "lol", "Help me!", "Fuck you". Really basic stuff, but it worked so well. It used to take me like 45 minutes of just generating to get sort of what I was going for, but in 30 mins I was able to do all that. If I dedicated even more time, I'm absolutely positive it could have been better.
>>
>>36580343
holy fuck that bump lol
>>
>>36578813
I vaguely remember that at one point with the old Ngrok model, you said something about the possibility of making hybrid voices halfway between characters. Would that still be a possible future addition with this new model? It'd be a pretty neat thing to mess with.
>>
>>36581878
Would hybrid voices be a feasible way of giving a voice to characters that don't have one, without it ending up sounding too much like "X character but off-brand"?
>>
>>36581878
>Would that still be a possible future addition with this new model?
Yep, just like text|text2, no additional training required, just need to know how to write a website/interface for the feature.
>>
>>36580878
nice
>>
>>36579640
The site literally stops you for 10 seconds telling you to credit the site, it's not 15's fault lol
>>
>>36579640
>Give him some slack, jesus is 15 autistic?
I'd be annoyed too if I spent years working on a neural network only for some retard to take all the credit, despite my insisting that all you have to do is give credit like a decent human being
>>
>>36578784
>>36579469
>>36579026
>>36579564
CVAT looks good enough. We'll just need the list of image URLs so Clipper can dump them into the site and start labeling.

>>36578494
Do you have a query in mind for what you want to label? We can turn that into a giant dump of image URLs. If not, maybe we can use:
>butt,-eqg,-anthro,-human,-animated,score.gt:0
>>
So anybody doing anything interesting with their waifu's voice?
I am just making Twilight say "Good morning <myname>, it is time to get up, my love."

"OO <myname>, somebody is calling you! I wonder who it could be!"
etc.
>>
>>36582570
A list of URLs should work; those can be easily imported to the VIA thing posted in the previous thread. As for tags, "butt" is a bit vague; I'd say use anatomically correct instead, since those kinds of images should be more consistent. Also filter pony life.
>>
>>36582579
We should all collaborate and work on /mlp/ makes MLP, episode 1, with actual good writing, audio editing, animation and all. Anyone else down for a project like this?
>>
>>36582579
A while ago someone made a Princess Celestia car navigator voice.
>>
>>36582593
Here's the full query:
>anatomically correct,-eqg,-anthro,-human,-humanized,-animated,score.gt:0,-pony life

>>36579469
>>36579026
Can either of you generate a list of image URLs that match that query? Preferably sort them in descending order by score. If not, I can do it later tonight.
>>
>>36582593 >>36582620
> A list of URLs should work
Is it just a text file with 1 URL per line or something like that?
I should have all the metadata already downloaded so I'll upload something that matches that query in a few minutes.
>>
>>36582610
ehhh
>>
>>36582625
Yes, that should be good. 1 URL per line.
>>
>>36582610
I'm interested to see how quickly that would devolve into absolute degeneracy
>>
>>36582610
>/mlp/ makes MLP
>good writing
>>
>>36582654
that's the problem, everything has become so polarized lately. In my day, we enjoyed small, tasteful amounts of things that you're not allowed to have on TV (or whatever medium).
but then it was like STRICTLY BANNED (even on the internet), so if you're going to go 1% forbidden content you might as well go 100%, and that is so dull.
>>
>>36582672
First rule of content creation is including nigger in any project to keep them away
>>
>>36582593 >>36582620 >>36582649
https://u.smutty.horse/lzrpyvjyakg.txt
This contains everything from that query with score greater than 100
https://u.smutty.horse/lzrqbwtyjpx.txt
This contains everything from that query with score greater than 0
>>
>>36582620
On it. Shouldn't take very long. I've modified the scraper to record the scores as well. Using PonerPics as source as it has everything Derpi and Ponybooru have, as well as its own stuff. If you want me to rerun later on Derpi, let me know. Can get the fully configured script here:
https://pastebin.com/t1XvZSRG
>>
>>36582708
Note that >>36582705 put the lists up for derpibooru. If you can run that on the other boorus, that'd be great. Also, can you add that to the Master Doc for future image scraping tasks?

>>36582593
You're up. If you can post some early results of labeling, that'd be great. We (probably Cookie) can use that to start playing around with networks.
>>
>>36582705
Looks like you beat me to it.

>>36582620
>>36582733
Here's what I generated. It includes all pics and not just ones above zero score. Since it's ordered by score it would be easy to edit that.
https://u.smutty.horse/lzrqdwnrxkq.txt

>can you add that to the Master Doc for future image scraping tasks?
Will do, just got to find somewhere to put it.
>>
>>36582733
First 100 on the list from >>36582705. For images that were unsuitable, I just skipped them without labelling them. Hopefully that's a good enough filter system for rejected images.
https://u.smutty.horse/lzrqechexsw.csv
>>
>>36582743
To clarify, I used the list of images with the minimum score of 100, in case that matters.
>>
>>36582743
I think it might be good to label the direction of ponies in the images. Like directly facing, lean right/left, right/left. Just because I think some of the odd generations from the other project might be from it combining a left and right view on each side. An idea, anyway.
>>
>>36582740
>>36582708
Can you update the script to dump just the image URL, one per line? That way people can just copy/paste the results into any labeling tool that accepts URLs.
>Looks like you beat me to it.
We'll still need the URLs for non-derpibooru images.
>>
>>36582770
>Can you update the script to dump just the image URL, one per line?
Easy to do:
https://pastebin.com/ujc2UfdV

Here's the list with only URLs:
https://u.smutty.horse/lzrqhvslwct.txt
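The one-URL-per-line format these lists use is easy to regenerate from any scraped metadata. A sketch, assuming each scraped record is a dict with "url" and "score" keys (field names are my assumption, not what the actual script uses):

```python
# Dump scraped records as one URL per line, highest score first, so
# labeling tools can ingest the file directly.

def dump_urls(records, path, min_score=0):
    """Write URLs of records scoring above min_score; return the count."""
    keep = [r for r in records if r["score"] > min_score]
    keep.sort(key=lambda r: r["score"], reverse=True)
    with open(path, "w") as f:
        for r in keep:
            f.write(r["url"] + "\n")
    return len(keep)
```

Keeping the score threshold as a parameter makes it trivial to emit both the score>100 and score>0 variants posted earlier from the same metadata.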
>>
Does anyone else still think it’s surreal how we went from "is this even possible" to "oh shit this might be possible" to "this is already possible" in less than two years?
>>
>>36577693
>>36582705
>>36582770
I've labelled the first 2000 images on the URL list, again skipping any that were unsuitable.
https://u.smutty.horse/lzrqechexsw.csv (1 - 100)
https://u.smutty.horse/lzrqsqaofvw.csv (101 - 1100)
https://u.smutty.horse/lzrqspzjdmo.csv (1101 - 2000)
This only took ~2 hours to do, so I don't think trying to use YOLO for the remaining ~14000 will save any time compared to me just doing it manually. Could you let me know if these labels are useful? If so, I can (probably) do all the rest tomorrow.
>>
>>36583042
> This only took ~2 hours to do, so I don't think trying to use YOLO for the remaining ~14000 will save any time compared to me just doing it manually
Oh wow, okay yeah, you might be able to do it before I can train a YOLOv3 for it.

>Could you let me know if these labels are useful?
I think they're good. Looks like x and y correspond to the top left corner of the selection?
Then assuming that's correct, I can do x+width and y+height to obtain the bottom right coordinate.

>>36582819
Do you know of the image IDs from PonerPics match the Derpibooru ones? We might have to explicitly label the source site in future.
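The corner math mentioned above is a one-liner; the tuple layout assumes the labels come through as (x, y, width, height) with (x, y) at the top-left, as the CSV columns suggest.

```python
# Convert a labeled box from (top-left, extent) to corner coordinates,
# the (x1, y1, x2, y2) form most detection tooling expects.

def to_corners(x, y, width, height):
    """(x, y, w, h) -> (x1, y1, x2, y2)."""
    return (x, y, x + width, y + height)
```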
>>
>>36583057
>Do you know of the image IDs from PonerPics match the Derpibooru ones? We might have to explicitly label the source site in future.
There are two types of imports on PonerPics: The older images are staff imports. These will share image IDs with Derpibooru. Then there are user imports which imports the images from other boorus within about 30 minutes of their posting. Every so often the staff will make another import from Derpi and merge the user imports into these, so that they share the same ID. All user uploaded images on PonerPics will have an ID above 6000000 so that they are easily identifiable.

>>36583042
I wonder what way we could split this up? Perhaps by block of IDs?
>>
>>36583075
Should also note that all imports are marked as such in the tags. Derpi imports are tagged "Imported from Derpibooru", Ponybooru imports are tagged "Imported from Ponybooru".
>>
>>36582672
I remember when Blaster Nation's advertisers threatened to pull everything if they didn't go back like 100 pages and censor out a casual topless scene. That was basically the beginning of the end. They caved, and then every webcomic and every web video went 100% sharia law, leaving only flat-out porn (which is what Brad and Leslie did next, and it was great until it was awful).
>>
>>36583036
this is all lagging way behind where it should have been by now.
>>
>>36583092
Elaborate
>>
>>36581358
at least use anonfile or u.smutty.horse
>>
>>36582610
>He doesn't remember /mlp/ makes mlp
killed by discordfaggotry
>>
>>36583192
throwback to when the vocodes faggot tried to "recruit" on dicksword for his shit cashgrab website
>>
https://www.youtube.com/watch?v=86QU7_SF16Q

Can this technology eventually be used to (relatively) easily edit Twilight's wings out of every episode and comic?
>>
https://twitter.com/JtSenop/status/1363297627941441537
>>
>>36583117
given the hard work put in and the advancements we saw, it took waaay longer than it should have. many areas of tech are like that, especially medical stuff. it's kind of depressing seeing how close we were 20 years ago and how little has happened since then
>>
>>36583281
She'd still have the weird smallheaded proportions.
>>
>>36583422
I don't really follow. Doesn't that mean that we've reached this point BECAUSE of these advancements we've seen?
>>
>>36583401
Lmfao
>>
>>36583467
I'm saying that while the changes in the last two years are impressive, the changes over the previous two decades are fucking shameful. There's a real "Fuckin' finally, what took you so long?" feeling to every recent development.
>>
>>36583479
Eh, that's typically how scientific progress is made. One person discovers something new and takes it as far as possible. The only problem is that being that person requires both luck and commitment so they're very few and far between.
I'd say that we (this whole project) have actually been very very lucky.
>>
File: 1501545806274.gif (28 KB, 210x210)
Why can't I get these mares to just sing in straight eighth notes. Do I have to stitch each word together individually?!
>>
>>36583759
How do you get them to sing
>>
>>36583766
How did I?
>>
>>36582610
You could get the animation and writing down perfectly, but it also falls on the performance of the voices. And as clear as voices like 15's have become, they still need to be modified manually to at least sound like they are conveying the emotion you want them to.

But that just depends on if the goal is to have the voices sound as natural as possible or to be as clear as possible. When it comes to the audios I make with it, I try to get both, so the audio process is more tedious for me.
>>
>>36583243
>vocodes patreon is still at $0
lmao
>>
Anybody know how to get a decent or good whispering tone for the 15 models? I used to use "|I like to whisper" most of the time. Not sure if it was a placebo, but it used to work alongside ",,,,," at the beginning of sentences, but it's not as effective at the moment, so I'm curious to know how you folks have tried this.
>>
>>36577693
>>36582770
>>36583042
>>36583057
So I've put together a tool to quickly label pony plots. I wasn't sure what fields we would need, so I did horizontal direction (which way the pony is facing), vertical direction (looking from above/below/middle), and distance (near/mid/far). Once we've figured out what fields we need I can modify as needed. As is, the tool will load a list of URLs from a txt file and export into a .plot file (just to make it easier to keep track of which file contains processed data). I think for splitting up the work we could package up a list of txt files for people to download, process, and then submit. That way you're not working on too big of a block at once. You can try the tool below; let me know what you think:
https://mega.nz/folder/NW5UEboD#cmfmR1GH879T-fYQ9uceZA

Some example files are below:
https://u.smutty.horse/lzrthlpgboi.txt
https://u.smutty.horse/lzrthlpfyab.plot
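For the work-splitting idea, chunking the master URL list into fixed-size blocks is a one-liner to script (a sketch; the block size is a placeholder, and each block can then be written out as its own txt file):

```python
def split_into_blocks(urls, block_size=1000):
    """Split a list of image URLs into fixed-size blocks so each anon can
    claim one txt file's worth of work (the last block may be shorter)."""
    return [urls[i:i + block_size] for i in range(0, len(urls), block_size)]
```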
>>
>>36583401
my fucking sides
>>
File: LinkSnicker.jpg (18 KB, 256x353)
>>36583401
Kek, nicely done.
>>
>>36583401
Beautiful
And also that Twilight sounds literally perfect, anyone would easily be fooled into thinking Tara might have actually said this
>>
>>36584495
sorry what? we're doing ai horsebutt now?
that's the best thing I've ever heard
>>
>>36584519
From the little I've used them thus far, the intonations of the current models are insanely good. It's easier to get lifelike intonation than ever before, in most cases.
>>
>>36584536
This, I don’t understand when anons say that the older models were more lifelike than the new ones. The new ones are such a huge improvement they’re not even comparable.
>>
>>36583401
Haha I can’t stop watching this, it’s cracking me up every time
>>
File: 1226193__tbc_placeholder.jpg (172 KB, 1360x1176)
>>
Has 15 not revealed the site is back up to his 10 year olds? It's been lovely not waiting in line.
>>
>>36584495
I might be doing something wrong. I'm not sure if I'm running it properly.
>$ apt-get install -y libnss3 libxcomposite1 libxcursor1 libcups2 libxss1 libxrandr2 libatk1.0-0 libgtk-3-0
>$ ./nacl_helper_bootstrap nw
>https://pastebin.com/k6EvErgM
>>
>>36585471
I'll admit I haven't tested on Linux. Scirra's advice is to chmod a+x the nw file and run that. Someone on the forum suggested running "./nw package.nw". I'll see what I can figure out tomorrow (it's quite late now).
>>
https://u.smutty.horse/lzrwnatohlq.wav
>>
File: grid.png (1.61 MB, 2760x1676)
here are the results of using CLIP on TPDNE with the Mane 6 as test cases. for each row, the text input was just the name of the character (e.g. "Twilight Sparkle"). it works by generating 1000 random inputs, picking the top 10 (loss based on CLIP, discriminator, and L2 regularization), and then optimizing each picture for 100 iterations. truncation was arbitrarily set to 0.5 because 1.0 was too high (model was too "creative")
clearly, CLIP knows Twilight, Rainbow, and Pinkie. Fluttershy and Rarity are okay, but it fails on Applejack (rest in peace) and "Princess Celestia" (didn't bother including). haven't tried it with more complex or abstract prompts yet
I used this notebook: https://github.com/nagolinc/notebooks/blob/main/CLIP_%2B_TADNE_(pytorch)_v2.ipynb
I just had to make some tweaks because the TPDNE model is internally a 512x512 model, but produces 1024x1024 images. I'll post my version of the notebook tomorrow
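For anyone curious, the select-then-optimize first stage boils down to something like this (a rough sketch with a toy scorer standing in for the combined CLIP + discriminator + L2 loss; the real run works on GPU tensors, not Python lists, and the 512-dim latent is an assumption from the model's internal size):

```python
import random

def pick_candidates(score, n_random=1000, keep=10, dim=512, seed=0):
    """Sketch of the selection stage: sample many random latent vectors,
    score each (lower is better), and keep the best few for the per-image
    optimization step. `score` stands in for the combined CLIP +
    discriminator + L2 loss."""
    rng = random.Random(seed)
    latents = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_random)]
    latents.sort(key=score)
    return latents[:keep]

# toy scorer: prefer latents with small mean magnitude
best = pick_candidates(lambda z: sum(abs(v) for v in z), n_random=100)
```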
>>
>>36585587
>AJ
huh?
>>
File: 1602534652205.gif (2.2 MB, 558x428)
>>36585587
>Applejack
>>
>>36585587
What the holy fucking fuck in hell!?
Are you telling me ALL the images are AI generated!?
One can now ask for a Twilight pic, and a computer can generate as many images as one wants!?

ARE YOU FUCKING TELLING ME THAT THE FIRST STEP TOWARD INFINITE PONE CONTENT IS ALREADY BEHIND US!?
>>
>>36586026
now we need to combine all of that with a text generator and an animation generator and we are set for infinite pony episodes.
>>
>>36586026
This Pony Does Not Exist has been a thing for a while now. This "just" pipes that output through another mechanism to pick out specific images based on a given description. Not to downplay too much, it's neat to see, but what will be truly impressive is if we can replicate DALL-E: https://openai.com/blog/dall-e/. Not just "Give picture of Twilight," but "Give picture of Twilight holding AppleJack at gunpoint." The initial implementation isn't something you can just pick up and play with (GPT-3 is involved for one thing) but there are already open attempts to replicate it: https://github.com/lucidrains/DALLE-pytorch.
>>
>>36583057
Update on the ngroks?
>>
>>36583401
Has anyone uploaded to this derpi yet?
>>
File: 1591883332391.jpg (71 KB, 600x506)
>>36586598
>derpi
>>
>>36586229

The post was a bit long, so read the Pastebin:

https://pastebin.com/mZQj04HA

tl;dr Some suggestions.

https://files.catbox.moe/aop8ar.mp4
>>
>>36586330
When transfer learning, the model failed to use the pitch postnet because the latent space appears to already contain all the pitch information, or something else is up. I think I've either got a bug or I have to restart the entire training run to get the sharp audio quality working.
Yesterday I rewrote the loss function to calculate one of the terms in log space and the model is now completely stable in half precision which means today I'll start the process of uploading my data onto Cloud and getting this model running on 8x V100's or something along those lines to speed this up.
Also added caching to the dataloader for pitch information. The CPU/dataloader was getting strained once I got the 2080 Ti's running in half precision.
Also given sometimes the model would fail to speak correctly, I'm currently retraining the AlignTTS model with
https://github.com/jaywalnut310/glow-tts#update-notes
> 2) putting a blank token between any two input tokens to improve pronunciation.
so I can provide more accurate Alignment information to the VDVAETTS model (VDVAETTS = the new ngrok = hopeful replacement for tacotron2).

My checklist of stuff to do is getting big again, but next I'm going to;
- train a normal tacotron2 and extract alignments from that model before scoring alignments and feeding into Viterbi Algorithm. I need to check if normal tacotron2 alignments perform better than AlignTTS (and by how much if so).
- try and incorporate a system for filtering out bad audio files without requiring tacotron2's soft alignments which is how I currently do it.
- train HiFi-GAN directly on VDVAETTS predicted spectrograms. Similar to training on Ground Truth Aligned tacotron2 outputs, this should give the audio a slight bump in quality.
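For reference, the blank-token trick from the quoted Glow-TTS note amounts to interspersing a blank ID through the token sequence (a sketch modeled on the Glow-TTS repo's approach, which also pads both ends; the exact padding is an assumption):

```python
def intersperse(tokens, blank_id):
    """Insert a blank token between every pair of input tokens (and at both
    ends), per the Glow-TTS update note quoted above; this gives the aligner
    room between phonemes and tends to improve pronunciation."""
    out = [blank_id] * (2 * len(tokens) + 1)
    out[1::2] = tokens
    return out

# intersperse([7, 8, 9], blank_id=0) -> [0, 7, 0, 8, 0, 9, 0]
```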
>>
>>36585471
>>36585496
I just tested it on Debian and can confirm that all you need to do is run "chmod a+x nw" and then run it with "./nw".
>>
>colab asked me to complete a captcha to prove I was still there while my model trains
Is this new?
>>
>>36587586
I think so. It did the same thing to me while I was last hosting ngroks. I'd never seen it until recently.
>>
>>36577693
>>36583042
>>36586749
>>36582770
Images 2001 - 11000, in blocks of 1000:
https://u.smutty.horse/lzsbvovhkeu.csv
https://u.smutty.horse/lzsbvoztzsl.csv
https://u.smutty.horse/lzsbvpexeze.csv
https://u.smutty.horse/lzsbvpkfqxt.csv
https://u.smutty.horse/lzsbvpoqksc.csv
https://u.smutty.horse/lzsbvptwzpv.csv
https://u.smutty.horse/lzsbvpzfwkc.csv
https://u.smutty.horse/lzsbvqeheum.csv
https://u.smutty.horse/lzsbvqjixot.csv

~5000 to go, I'll aim to have them done tomorrow.

>>36584495
I gave it a go but wasn't able to load any images from URL for some reason. I'd paste the URL in and click load but nothing happened, aside from the progress bar displaying 1 of 113 for every URL I tried. Clicking all the way through to 113 also didn't do anything. Trying to import local files directly also didn't work. Tried this with both Windows 32 and 64 versions.
>>
File: 1589385534619.png (363 KB, 720x640)
>>36586598
>derpi
>>
>>36587741
>I'd paste the URL in and click load but nothing happened
To get it to work you need to put your URLs in a txt file, one per line, click choose file, find your file, and then click load. See the example files. When you click export, it will give you a .plot file. You can load these in the same way you do a txt file of URLs, but it'll load in your tags as well.
>>
>>36584495
>>36587741
>>36587895
I put together a demo video.
https://u.smutty.horse/lzscjivgssz.mkv
>>
>>36585496
>>36587047
I could have sworn I tried that. I spawned a new instance of my container for this, and ./nw worked just fine.
Dockerfile: https://pastebin.com/txz3W41G
Run command: https://pastebin.com/BtePC3E7
Pulse isn't necessary, but it's just part of my standard setup. I needed to start dbus for this to work.
>docker exec -u root [container id] service dbus start
>>
>>36587977
>I needed to start dbus for this to work.
Correction: even that doesn't seem necessary now. I have no idea what I changed since last night, but it works now with just that dockerfile and that run command.
>>
File: Derpy Thinking.png (94 KB, 750x906)
>>36587741
Wait, what are you tagging?
I mean, you basically create a rectangle across the pone?
Is there a post explaining how to help that I missed?
Like, what is the tag format?
Does one "only" tag the orientation? (not the race, coat color and mane color, for example? Or whatever else?)

More importantly, if the database correctly tags everything, would the AI be able to generate a new image from the given tags? (Like, okay AI, generate a blue pegasus with yellow mane, facing 3/4 and showing her genitals? Well, probably more {coat:blue,race:pegasus,mane:yellow} etc.)

>List of all images
>>36582705
>>36582740
>>36582819

>Tagged data
>>36582743
>>36583042
>>36587741

>Tool to Tag
>>36584495
>>36587953
>>
>>36588271
He's tagging pony butts by drawing a rectangle around pony butts. He's using VIA https://www.robots.ox.ac.uk/~vgg/software/via/via_demo.html to tag them. VIA has its own tagging format.
>More importantly, if the database correctly tags everything, would the AI be able to generate a new image from the given tags?
I think Cookie is doing the AI part, so it depends on what model he ends up using. Maybe >>36583057 can answer.

>>36587953
Would it make more sense to have the tool read in Clipper's CSV files instead of a URL list? Some images might have multiple butts, and the tags for each butt might need to be different.
>>
https://arxiv.org/abs/2102.09978
>TransMask: A Compact and Fast Speech Separation Model Based on Transformer
>Speech separation is an important problem in speech processing, which targets to separate and generate clean speech from a mixed audio containing speech from different speakers. Empowered by the deep learning technologies over sequence-to-sequence domain, recent neural speech separation models are now capable of generating highly clean speech audios. To make these models more practical by reducing the model size and inference time while maintaining high separation quality, we propose a new transformer-based speech separation approach, called TransMask. By fully un-leashing the power of self-attention on long-term dependency exception, we demonstrate the size of TransMask is more than 60% smaller and the inference is more than 2 times faster than state-of-the-art solutions. TransMask fully utilizes the parallelism during inference, and achieves nearly linear inference time within reasonable input audio lengths. It also outperforms existing solutions on output speech audio quality, achieving SDR above 16 over Librimix benchmark.
>>
>>36586635
Oh ho, neat! I'll have to poke around with this and see if/how well it currently runs. Thanks!
>>
File: 1604233611164.png (419 KB, 300x900)
>>36588324
>He's tagging pony butts by drawing a rectangle around pony butts.
Six million years of human evolution and it finally leads to this. We truly have achieved the apex as a species.
>>
>>36588324
>Would it make more sense to have the tool read in Clipper's CSV files instead of a URL list? Some images might have multiple butts, and the tags for each butt might need to be different.
Maybe? I'm really not sure what all we want to tag. I was hoping once we figured out what we need that I could add it to the tool to make tagging easier. The tool's kind of a proof of concept until then. Was thinking we'd do the cropping in a second step.

>multiple butts per image
Would have to rethink how the tool operates a bit.
>>
>>36588324
>I think Cookie is doing the AI part
I don't have enough GPUs to do TTS and image synthesis at the same time (and I'm not as familiar with image synthesis as I am with TTS).
I mainly wanted to do this on the side with a few lower power GPUs to have something fun to do while VDVAETTS is in training.
> it depends on what model he ends up using.
Haven't picked one, I'd like to sweep through a few repos with pretrained models and go from there.
for StyleGAN2 with 8x V100's
>Its total training time was 9 days for FFHQ and 13 days for LSUN CAR
So I'm not planning on going all in without at least a couple of good results from initial testing.

>>36588271
>More importantly, if the database correctly tags everything, would the AI be able to generate a new image from the given tags?
I did some reading yesterday about conditional image generation (basically what you're asking), and I still haven't found a solid answer.
I know it's obviously possible (and probably very easy) to add inputs to the model, but I'm not sure about controlling outputs without affecting the output quality. Especially when there is so little training data, it's unlikely we'd have enough examples of each mane colour and coat colour, or that they could actually be controlled independently.

Note: I am saying this as someone that only started reading about this yesterday.
>>
File: RPG Victory Dance.gif (1.74 MB, 480x270)
https://mega.nz/folder/oHYxmSYI#1vRaf-wG9X730dbsFN4oTg

I have excellent news. Just as soon as I brought up the 15.ai thing with /tlhg/ over in /trash/, it turns out some guy already had the clean audio channel from the first 120 episodes of The Loud House. There's still SFX, but those can likely be removed with some post-processing program like the ones you guys used. This is a gigantic first step in the goal to hear the sisters in the same way you guys listen to your favorite MLP characters now.
>>
File: thumbs up.gif (1.92 MB, 500x390)
>>36589393
Congratulations! It's a pain for the first few transcriptions but you eventually get into the rhythm; hope everything goes well for you. I don't know anything about the show but I hope to see some good results!
>>
>>36589401
Thanks, man. Now that there's actual audio, I gotta start readin up on all this fancy script work to make transcribin easier. I'm also tryin to drum up volunteers to make the work go by faster.
>>
>>36586749
>I think I've either got a bug or I have to restart the entire training run to get the sharp audio quality working.
I found a bug.
(and it's with the dataloader)
I'm also going to take the opportunity to add all the new datasets that were uploaded since last time, so here's a checklist;
[ ] - download new datasets (if not already downloaded)
[ ] - extract and add speaker metadata
[ ] - train AlignTTS on new speakers
[ ] - train VDVAETTS on new speakers
[ ] - upload to ngrok
I expect to lose about 2-3 weeks to this. Longer if the additional speakers cause problems.
(sorry to anyone that wanted an updated model, yay to anyone that wants Fallout voices)
>>
question, in colab tacotron2 training, how do I swap the "pretrained_model"?
Do I just fill in the google drive link in place of "download_from_google_drive("1c5ZTuT7J08wLUoVZ2KkUs_VdZuJ86ZqA" ?
>>
>>36590174
Yes, that's one way to do it. I'd suggest copying the original link and pasting it as a comment next to that line, in case you ever want to use it again.
>>
>>36590181
so if my model is named "twilight_ABC" and the google link is "XYZ123", do I just need to make the line:
download_from_google_drive("XYZ123","twilight_ABC")
or do I need to rename the model to 'pretrained_model'?
>>
>>36590191
Don't change the name of anything, just the link. The output name will be whatever you set in the standalone cell that tells you to name your model.
Also, if the model is already in your drive and you want to keep training it (rather than keeping it separately as it currently is) don't bother changing the link, just change the name in the aforementioned cell and the notebook will automatically resume training on that model (and overwrite it as it trains, so again, only if you don't care about keeping its current state as backup).
>>
could any of you convert ponies the anthology 1(or episode 1) with this
https://github.com/Sterophonick/mirror-meteo-avi2gba
following these instructions?
http://www.gameboy-advance.net/video/meteo_avi-2-gba_video_codec.htm
I'm on linux and I don't have enough space to install wine (+ I never got it to work anyway) and I want to play with gba video
use mgba to test if it worked.
>>
>>36589393
>>36589463
nice to see that you are doing well with your project. Do you have a full list of audio sources? (the podcasts, the casagrandes, some commercials, the lines in that recent nick racing game (if they are from the same VAs), etc)
that could be an easy warmup for the denizens of your general to accrue. I'd also recommend (since you people are on /trash/) that you use some of our best pone results to try and rally the troops.
>>
>>36589393
if I may recommend a "trick" in case you aren't able to get the subs from the dvd/video files: upload just the audio with a random still image to Youtube (with settings made so ONLY those with the direct link can see it), wait between 6 to 12 hours for their servers to auto-transcribe all of it, and then use some online youtube sub ripper to get the srt from them. I've done it with an audiobook and the results were pretty decent.
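once you've ripped the srt, flattening it into plain transcript lines for the dataset is easy to script (minimal sketch; ignores styling tags and assumes standard SubRip cues of index, timestamp, text):

```python
import re

def srt_to_lines(srt_text):
    """Flatten ripped .srt subtitles into plain transcript lines: drop cue
    numbers and timestamp lines, keep only the caption text."""
    lines = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        for line in block.splitlines():
            line = line.strip()
            if not line or line.isdigit() or "-->" in line:
                continue  # cue index or "00:00:01,000 --> 00:00:03,000"
            lines.append(line)
    return lines
```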
>>
https://www.youtube.com/watch?v=jH-6-ZIgmKY

Soon.
>>
>>36590451
Soon, a virtual chatbot for your favorite pony waifu, voice included. In about 20 years, a full-on robot.
>>
>>36590220
Not yet, but I know some people certainly got the entire series downloaded and are already lookin at the podcasts.

>>36590232
Sounds like a great idea, I'll keep it in mind. Thank ya.
>>
>9
Bumparooney
>>
File: crazy.png (429 KB, 1600x1402)
>>36590203
Where in retrowave hell did you find this program!?
I don't even know what codecs it can gobble up (and strangely enough, I already have all the required programs and even the requested video (but in webm, I need to convert it to mpeg1...))
And more importantly, why in hell did I convert this video!?
Here it is, faggot.

>https://litter.catbox.moe/khi0qm.gba

Also, update your git, some forks are clearer, and at least copy the link you gave into your repo.
Also, a working codec is:
Stream #0:0: Video: msmpeg4v2 (MP42 / 0x3234504D), yuv420p, 640x360, 604 kb/s, SAR 1:1 DAR 16:9, 25 fps, 25 tbr, 25 tbn, 25 tbc
Stream #0:1: Audio: adpcm_ms ([2][0][0][0] / 0x0002), 44100 Hz, 2 channels, s16, 352 kb/s
Figure out what ffmpeg command line converts to it and add it to the git too.
>>
File: 652263.gif (147 KB, 1113x846)
F
https://u.smutty.horse/lzsmahudekn.mp3
>>
File: PONYANTHO1.png (76 KB, 728x526)
>>36590203
>>36592264
Okay, I think you won't like it.
20 mins of video in less than 32MB with an old-ass codec can't lead to good quality.
But you asked for it.
So here it is, behold! The mighty PONYANTHO1.gba !

>https://litter.catbox.moe/ioafv7.gba
(beware, it's a litterbox link and will expire in 3 days)

Here is the ffmpeg line used.

set "VCODEC=msmpeg4v2"
set "ACODEC=adpcm_ms"
set "FILTER=scale=240:160:force_original_aspect_ratio=decrease:flags=lanczos,pad=240:160:(ow-iw)/2:(oh-ih)/2,format=yuv420p"
ffmpeg -i "%INPUT%" -vf "%FILTER%" -c:v %VCODEC% -c:a %ACODEC% "%OUTPUT%"

It seems that Meteo doesn't recompress the images itself and only does some shady tricky conversions.
So you will probably be able to pump up the video quality by adjusting the audio/video bitrate in the AVI rather than in Meteo.
Makes you wonder what Windows can do that Linux can't?
>>
>>36592857
wouldn't it be easier to just use the VLC "convert/save" option?
>>
File: Corpse_raping.webm (420 KB, 1920x1080)
>>36592950
I don't know, I don't use VLC.
If you can tweak the encoding options as much as ffmpeg, then, please, have a try.
I for one will end the experience here.
I mean, if I have to encode pones, at least may it be something useful.
Okay, memes are useful, but you get my point.
>>
File: dalle dataset.png (301 KB, 1657x313)
Saw someone mention https://github.com/lucidrains/DALLE-pytorch, currently working on a Derpibooru dataset for it out of boredom. I'm doing it all by hand because I don't know jack shit about automating this stuff.
>>
File: DaftPony.png (850 KB, 1280x905)
>>36592415
Sad news, but something like this seemed likely after the last album. Kinda had that vibe to it. And as we've seen with MLP, better to end on a high note.
>>
>>36593327
That's great! But do you know how to train it? I kinda got stuck because I couldn't find out how I was supposed to start the training process.
>>
>>36592415
Nice!
>>
>>36577693
>>36587741
11001-16200:
https://u.smutty.horse/lzsmwmvxgnb.csv

That's the whole list done now. Relinking all deliveries for convenience - >>36583042 >>36587741
And the original list of URLs - https://u.smutty.horse/lzrpyvjyakg.txt

>>36587953
I still can't get the actual images to load in the tool. I think the issue may be something to do with the format of the URLs I'm using; they're quite different from the ones in the example file. Here's a sample list of what I was using, are you able to make these work on your end?
https://u.smutty.horse/lzsnadjzwkz.txt

>>36588271
For now, I've done two things with this list - (1) filter out unsuitable images (like when the butt is significantly obscured, poorly drawn etc.) and (2) draw a rectangle around the general location of the butt(s) in each picture. In the CSV files, you'll see the information stored as a list of the image URLs (the image ID number is the most important part of that), the start of the rectangle as x and y coordinates, and then the width and height of the rectangle, which you can add to the original coordinates to get the other corners. Any rejected images will be listed as having no coordinates.

For at least the first attempt at image generation, it's probably a good idea to keep it relatively simple, there's no point spending the time doing any more complicated labelling if we can't get the basics to work. Sometime later I can maybe go back through them and precisely draw around and tag the butt, tail, legs, face etc. and add a simple tag describing the pose and direction the pony is facing, most likely using the tool >>36587953 is kindly developing.
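For anyone wanting to script against the delivery CSVs, here's a minimal parser sketch; the exact column layout (url, x, y, width, height) is my assumption from the description, so adjust it to the real files:

```python
import csv
import io

def load_plot_labels(csv_text):
    """Parse a delivery CSV into accepted label boxes, skipping rejected
    images (rows with no coordinates). Column order (url, x, y, width,
    height) is an assumption; adjust to the real files."""
    boxes = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue
        url, *coords = row
        if len(coords) < 4 or any(c.strip() == "" for c in coords[:4]):
            continue  # rejected image: no coordinates recorded
        x, y, w, h = (int(float(c)) for c in coords[:4])
        boxes.append((url, x, y, x + w, y + h))  # (url, left, top, right, bottom)
    return boxes
```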
>>
>>36593327
Admire the passion, but as someone who's gone up against some similar issues, you really might want to learn some basic python and make yourself a tool. The demo images on that github are their results with a 2000 image dataset: presumably we'd need easily twice that or more to start getting good images. No idea on the contribution of dataset size vs. training time though.
What do the guidelines look like for making a dataset? I see you've standardized resolution at least, but I worry those still may be too noisy with the text and other elements. I should probably look through existing datasets and see what their images look like.
>>
u.smutty.horse/lzsndlluioh.mp3
>>
>>36593472
>I still can't get the actual images to load in the tool
Figured it out. I have the tool automatically add "https://" to the front of the URLs when requesting the image. You can make your URLs work by removing the "https://" from the front of them (The request that gets sent if you don't becomes "https://https://"). Could set up a macro in Notepad++ if you wanted. Will see about fixing this on the next version.

Also, could we figure out a list of all attributes we might want to record? Even if we don't use them all in the initial pass, I could get the tool set up with everything we might need. I have a few ideas already to improve the tool, will see how far I get I suppose.
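If you'd rather script it than set up a Notepad++ macro, stripping the scheme from each URL before feeding the list to the tool is one small function (sketch; the tool re-adds "https://" itself, per the post above):

```python
def strip_scheme(url):
    """Remove a leading scheme so the tagger tool (which prepends https://
    itself) doesn't end up requesting https://https://... URLs."""
    for prefix in ("https://", "http://"):
        if url.startswith(prefix):
            return url[len(prefix):]
    return url
```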
>>
>>36592415
Sad news, but rad sound.
>>
>>36593599
u.smutty.horse/lzsndlluioh.mp3
add an https for it to embed on 4chanx
https://smutty.horse/lzsndlluioh.mp3
>>
Out of curiosity, does anyone know of anybody who's compiling voices for Fire Emblem: Three Houses?
>>
>>36594257
forgot the u.
https://u.smutty.horse/lzsndlluioh.mp3
>>
>>36589603
Poll for how often I should post these;
https://www.strawpoll.me/42683861#

Next ngrok model Checklist;
[x] - download new datasets (if not already downloaded)
[?] - extract and add speaker metadata
[ ] - update duration predictor to output variance
[ ] - train AlignTTS on new speakers (2~ days)
[ ] - validate new transcripts using alignment data (1~ day)
[ ] - train VDVAETTS on new speakers till completion (2+ weeks)
[ ] - upload to ngrok

>>36586635
Thanks for including the readme and dataset times in the Adventure Time dataset!
Dataset looks good, hope it gets onto 15.ai quickly!
>>
>>36594436
Thanks Cookie! The only thing I'm disappointed about with the dataset is the lack of audio for other characters, plus some of the Game Audio I mentioned probably won't be compatible, but I guess we'll see.
>>
>>36594469
Whoops forgot the name.
>>
File: 1564545494198.jpg (67 KB, 463x500)
>>36594491
>Whoops forgot the name.
Never thought I'd live to see those posts on /mlp/ again.
>>
File: file.png (67 KB, 945x945)
>>36594506
https://u.smutty.horse/lzsogxwrhwm.mp3
>>
File: 1584032136238.jpg (88 KB, 600x500)
>>36594436
weekly seems too long, maybe once every 3 days could be nice; I look forward to seeing these updates, they give me an insight into the overall picture of where we're heading and let me lend a hand where necessary
>>
>>36594506
it's a sad world we live in
>>
>>36594506
Better get some practice before the G5 zoomers arrive ^:)
>>
>>36590203 (me)
>>36592264
>>36592857
billion thanks!!
I must say it sounds and looks a little better than shrek in the same format
not my fork though, so can't do much about that
full disclosure, if you were to have done episode 1 my response would have been
"now THIS is the quality the show was meant to be watched at, not that YayPony color corrected crap"
also many thanks for using the one with the intro song intact, youtube being youtube made them take it out.
>>
chess, but pone (just thought I'd switch it up from simply saying "bump")
>>
>>36595539 (me, again)
I feel kind of evil... could you do "never gonna give you up"?
it'd be sweet to be able to link it to people asking for a gba rom
>>
>>36577693
>>36584495
>>36593472
Here's an update to the Pony Plot Tagger Tool. v1.01:
https://mega.nz/folder/gf4EHTwT#Yfy2NWLbWe9n9PAbJXx6Qg

In this update:
>Fixed the issue loading links with "https://" at the beginning
>Made the UI scalable, so you can adjust to get a better view of the plot
>Switched from storing data in a dictionary to in an array
>Added ability to crop images
>Added loading indicator for images

Known bugs:
>Crop location data isn't correct in exports
>UI may not scale correctly if window is snapped, drag the corner a little to fix
>Probably some weirdness with the cropping tool I'm sure

To do list:
>Fix cropping data saves
>Add multi-plot support
>Add ability to remove entries
>Look into possibly supporting the CSV exports that currently exist
>Possibly some kind of illustration to help with tagging. May or may not happen.

With the crop data export issue I probably would have waited until I fixed it, but it was already uploaded by the time I realized and it's getting late for me. Mainly provided so people can try it out.

Sample data:
https://u.smutty.horse/lzsrefiobza.plot
>>
Is 15.ai really that much better? I feel like the old models had their strengths, too.

Here's one of the old Tacotron Rarity models. Not even the MMI variants, just a regular 22KHz one. The only thing I've done is swap out the vocoder with HiFi-GAN. Can anyone get a better-sounding take on 15.ai, with or without stitching?

https://u.smutty.horse/lzsrttahxmb.ogg
>What's that? I look like I fuck human men? Darling, I'm an {IY1 K W AH0 L} opportunity slut. I fuck {W IH1 M AH0 N}, too!
>>
>>36596660
Neat, the 15 model can jizz soft-serve ice cream though.
>>
>>36589393
Feels good to get your waifu talking. I'm working single-handed on the Tangled show dataset. I just want to hear Rapunzel say cute things.
>>
>>36596660
Holy shit. Is switching the vocoder difficult? Because it's amazing how good that sounds for a dinky 22KHz model.
>>
>>36596660
>vocoder with HiFi-GAN
huh, how did you do that?
Did you use that when training on colab or just the synthesizing audio (colab or offline) ?
>>
>>36596660
>Doesn't present how to do what he did
Lelelelelelellleleelel
>>
>>36577693
>>36596839
>>36596850
>>36597015
Here's a notebook that lets you try out HiFi-GAN. It features Rarity and JC Denton for now, but I could train more models if there's any demand for it.

https://colab.research.google.com/drive/1dxVcqe4m-AU8NAA1I1MW1N9HYBO_oii_?usp=sharing

You could also wait for Cookie's new thing, which also uses HiFi-GAN and will probably sound better.
>>
>>36597481
>and will probably sound better.
I've gone from 900 speakers to around 3800 speakers after updating my dataset. Even after the bug fix, I don't think I'll have a competitive Rarity (the training time alone is going to be massive with such a large dataset).
I can transfer learn from the big dataset to an MLP subset after training, but I don't have an estimated date for that.
The original ngrok tacotron2 model took 2 weeks to train for 350 speakers. It turns out the additional speakers increase training time almost linearly (which fucking sucks!), so it's possible 3800 speakers would take almost 10x longer to train than the 350 speakers.

I'll also increase the weighting for the MLP datasets just like last time:
https://github.com/CookiePPP/codedump/blob/master/tacotron2-PPP-1.3.0/data_utils.py#L52-L55
which reduces the impact of the larger dataset.
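For anyone curious what "increasing the weighting" looks like in practice, one simple approach is to repeat the favoured filelist entries so the sampler sees them more often. This is a rough sketch of the idea, not Cookie's actual code (the "mlp/" path prefix is made up):

```python
# Sketch: upweight a dataset subset by repeating its filelist entries.
# mlp_weight=3 means MLP clips are seen roughly 3x as often per epoch.
# One simple way to do it; not necessarily what the linked repo does.
def weight_filelist(entries, mlp_weight=3):
    """entries: list of (audio_path, transcript, speaker_id) tuples."""
    weighted = []
    for path, text, speaker_id in entries:
        repeats = mlp_weight if path.startswith("mlp/") else 1
        weighted.extend([(path, text, speaker_id)] * repeats)
    return weighted
```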
>>
>>36597565
>3800 speakers
Holy shit! How many of those are characters vs. generic speakers?
>>
>>36597580
https://pastebin.com/MKQ53FhP
3200-ish come from video games.
>>
>>36597565
That's one big dataset! But I have an idea: remember how I wrote instructions on how to train custom ngrok models?
Since fewer speakers train faster, do you think it'd be possible to cut down the number of speakers to make an alternate, smaller version of the dataset, then have another anon train that to hold everyone over until the stupidly large model is done?
I'd have to update my training script, but I could give it a try.
>>
>>36597678
>Remember how I wrote instructions on how to train custom ngrok models?
Do you have a pastebin? I lost that.
>>
>>36597684
Here, it isn't updated to the latest version of Cookie's repo, but it works.
Training Notebook: https://colab.research.google.com/drive/1uvP6cHtDYsgy_0mmlguY_CZrzy6T5e5r?usp=sharing

Synthesis Notebook: https://colab.research.google.com/drive/1pArfzHa_m4RkkvwtbMYshbGxunhzqqh8?usp=sharing
>>
https://desuarchive.org/mlp/thread/36432529/#36464860
> Milla, Carol, Lilac
Can somebody reupload these for me?
I get a 403 error from the UK, but according to zippyshare it's just region locked.

>>36597678
Sure!
Currently converting hundreds of mp3 files to flac but I'll be able to upload a subset once I start training AlignTTS.
Any specific voices you need?
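If anyone else is doing the same mp3 -> flac conversion, here's a rough sketch that shells out to ffmpeg (assuming ffmpeg is on PATH; the 22050 rate and mono downmix match the models discussed here, but check your own pipeline):

```python
import subprocess
from pathlib import Path

def flac_cmd(src: Path, rate: int = 22050) -> list[str]:
    """Build an ffmpeg command converting one mp3 to a mono flac."""
    dst = src.with_suffix(".flac")
    return ["ffmpeg", "-y", "-i", str(src),
            "-ar", str(rate), "-ac", "1", str(dst)]

def convert_all(folder: Path) -> None:
    # Walks the folder recursively and converts every mp3 in place.
    for src in sorted(folder.rglob("*.mp3")):
        subprocess.run(flac_cmd(src), check=True)
```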
>>
>>36597716
Works on my machine.

Carol (new).zip
u.smutty.horse/lzsuqtblhdf.zip
Lilac (new).zip
u.smutty.horse/lzsuqtldiuq.zip
Milla (new).zip
u.smutty.horse/lzsuqsipitv.zip
>>
>>36597716
Obviously the Clipper MLP dataset, and also the BFDAI BFDI and Poopsikins Adventure Time datasets would be good to have. I forgot how many speakers the MLP dataset has, but the other two have around 85 combined. Feel free to throw in any other datasets you feel would be good for padding. I think that my absolute limit would be around 800 speakers, but that's just an estimate.
>>
https://vocaroo.com/172v1FY3r3yd
>>
>>36597607
>95% is just NPCs
Bloody hell Cookie, this is some dedication, hats off to you.
And while I'm sure that 'Witcher 3__Novigrad Citizen Woman 03' gets plenty of fanfics, wouldn't it be a bit more practical to do a test run with just the MLP characters, in case something funky happens with the code and the multi model ends up needing retraining?
>>
hey poopsikins, where is the midna data so that cookie or 15 can use it?
>>
>>36597896
I'm not really expecting those speakers to be used very often. The interesting thing to me is how many speakers I can add to a model while keeping quality up.
For stuff like the automatic fanfic -> audio drama conversion, I want to be able to pick/generate new speakers dependent on what happens in the text.

> wouldn't it be a bit more practical to do a test run with just the MLP characters in case something funky happens with the code and the multi model needs retraining?
This is kinda true. It really depends on the problem, since having fewer speakers makes cloning their voices easier.
For confirming that the model isn't bugged, definitely best to train on datasets that are confirmed good.
For something like testing the effect of the pitch postnet, having more speakers is better in my opinion: the average audio quality drops, and it's much easier to tell whether the modification had a positive impact.
For doing the 15.ai thing of making a couple of speakers with extremely high quality/accuracy, having a small number of speakers is definitely helpful.
And for fixing edge cases, train/test exclusively on the hardest speakers.
Yeah... I'm definitely making this harder for myself. Eh whatever, I can transfer learn from the big dataset to the smaller one and get the best of both worlds later anyway.

>>36597896
>the multi model would be in need of retraining?
Thankfully, that's quite rare. Almost every time I can retain the speaker embeddings, encoder and attention between training runs.
>>
Also, anybody know anything about hosting home servers securely?
I'm pretty sure the standard recommendation is "Don't", but it costs £0.05 more for me to get business broadband, at which point I can just host the inference server from home and have something less annoying for people to use.
>>
>>36598045
install gentoo
>>
>>36598045
If you're worried about security, consider using your home server as a backend only. Host the actual website on something like AWS. That should help a little.
>>
>>36598099
Is there a search term for that? I'm not sure what to search for to find guides/tutorials/blogs about that kind of setup.
>>
>>36598272
You should ask 15 about it. He's hosting his servers on AWS and he probably knows more about web hosting than any of us.
>>
>>36598279
meant to write
>his servers and his website on AWS
but the point stands regardless.
>>
>>36598272
you could also ask in the Fan Site Alternative thread, a bunch of webdev/admin people lurk there
search terms: maybe "deploying ML models to production" and general web/server/hosting stuff. I'm not an expert, but hosting locally should be okay as long as you take security precautions: don't expose your home IP (proxy server? e.g. a VPS on EC2?); sandbox the ML model (e.g. chroot jail, containers, etc.) so it can't touch unnecessary files or make unneeded network requests; and so on.
you'll also probably need to set up a domain name, set up HTTPS (Let's Encrypt provides free certificates), and maybe port forward your local server to the proxy
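To make the backend-only idea concrete, here's a stdlib-only sketch: bind the inference server to 127.0.0.1 so only a local reverse proxy or an SSH tunnel from the VPS can reach it. The endpoint and port are made up; a real inference handler would go where the /health stub is:

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only expose known endpoints; everything else 404s.
        if self.path == "/health":
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, fmt, *args):
        pass  # keep the console quiet

def make_server(host="127.0.0.1", port=8080):
    # Binding to 127.0.0.1 means only local processes (e.g. an SSH
    # tunnel or reverse proxy terminating here) can connect.
    return ThreadingHTTPServer((host, port), Handler)
```

On the home box you'd run `make_server().serve_forever()` and point the VPS proxy (or tunnel) at it.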
>>
File: Buttsauce.png (97 KB, 1008x962)
97 KB
97 KB PNG
https://voca.ro/1cxwojeNoz2y
How fitting.
>>
>>36598798
By the way am I doing this right?
>>
>>36597979
It's just a handful of voice clips spliced and banjo-kazooified with some reverb.
https://www.youtube.com/watch?v=6YLakSs0mlU
>>
>>36597979
It was only around 15 seconds of audio, so I didn't bother with it. You can look at the archives: someone put up a lot of Midna sources, so if you want to do it yourself, all the files are there.
>>
File: example.png (294 KB, 1292x784)
294 KB
294 KB PNG
Been working a bit more with the DALLE thing.

>>36593354
Yeah, I tested it with what I have so far and it seems to train properly in Colab, but I'll probably rent a GPU once I have enough images and then train it using that.

>>36593486
I'll see if I can get to work on a tool; this is taking forever. The dataset is 512 x 512 jpegs, with matching text files containing a few different descriptions of each image.
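Since a couple of anons asked about the layout: as I understand it, each foo.jpg sits next to a foo.txt with one description per line. A small sketch of the pairing/validation logic (a hypothetical helper; the 512 x 512 size check is omitted since it would need PIL):

```python
from pathlib import Path

def load_pairs(folder: Path):
    """Return (image_path, [captions]) for every jpg with a matching txt.

    Expected layout: foo.jpg next to foo.txt, one caption per line.
    The 512x512 size check is left out here (it would need PIL).
    """
    pairs = []
    for img in sorted(folder.glob("*.jpg")):
        txt = img.with_suffix(".txt")
        if not txt.exists():
            print(f"warning: {img.name} has no caption file")
            continue
        captions = [ln.strip() for ln in txt.read_text().splitlines() if ln.strip()]
        pairs.append((img, captions))
    return pairs
```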
>>
>>36599842
Funny but clever to provide different descriptions of the same image.
>>
>>36599842
Would it be too troublesome to write up simple instructions on how to train it? I unfortunately could not understand the github page's readme.
>>
>>36599879
Here's a colab script I cobbled together. I can't guarantee it'll work for you, I had to use a high ram runtime and make a few changes to the repo files to get a test run going. Even if you can't use it directly, hopefully you can use it for reference!
>>
>>36599903
https://colab.research.google.com/drive/1xYW-eSmlueBll1mMdE5wpT25cPVYtT5H?usp=sharing
FUCK i failed to put the link yet again, i do this fairly often
>>
>https://arxiv.org/abs/2102.11533
>Accurate Learning of Graph Representations with Graph Multiset Pooling
>Graph neural networks have been widely used on modeling graph data, achieving impressive results on node classification and link prediction tasks. Yet, obtaining an accurate representation for a graph further requires a pooling function that maps a set of node representations into a compact form. A simple sum or average over all node representations considers all node features equally without consideration of their task relevance, and any structural dependencies among them. Recently proposed hierarchical graph pooling methods, on the other hand, may yield the same representation for two different graphs that are distinguished by the Weisfeiler-Lehman test, as they suboptimally preserve information from the node features. To tackle these limitations of existing graph pooling methods, we first formulate the graph pooling problem as a multiset encoding problem with auxiliary information about the graph structure, and propose a Graph Multiset Transformer (GMT) which is a multi-head attention based global pooling layer that captures the interaction between nodes according to their structural dependencies. We show that GMT satisfies both injectiveness and permutation invariance, such that it is at most as powerful as the Weisfeiler-Lehman graph isomorphism test. Moreover, our methods can be easily extended to the previous node clustering approaches for hierarchical graph pooling. Our experimental results show that GMT significantly outperforms state-of-the-art graph pooling methods on graph classification benchmarks with high memory and time efficiency, and obtains even larger performance gain on graph reconstruction and generation tasks.
>>
What do you guys think is the next step?
>>
>>36600434
Wait for v14 of 15.ai to release
>>
>>36577693
Found two more Adventure Time games to rip audio from and added them to the dataset. I'm starting to think maybe I didn't rip everything from the previous games, so I may dig a bit deeper later, but for now I'll leave this here and send it to 15.

Expansion:
https://drive.google.com/file/d/110lDyDSXFDRwwM4wQzAbOX0taN8Ogfiq/view?usp=sharing

I think I own all of the AT games, so I'm at a dead end for new sources.
>>
File: PTS2Thumb.png (121 KB, 1280x720)
121 KB
121 KB PNG
https://www.youtube.com/watch?v=xkr4yEMhCsc
I made another Pony Thread Simulator with the voices.
I used 2 different TTS methods in it so hopefully 15 doesn't send his kill bots after me.

On a different note, does anyone happen to know a good, reliable way to restore corrupted files? I've recently had a project folder completely corrupt on me (all contents are gone; it reads as 0 bytes in size), and it would be nice not to have to completely start over. Yes, I know rule zero is to have backups; I'm doing that now that skimping on it has bitten me in the ass.
>>
>>36600514
i've used 'Stellar Drive Recovery' in the past, but whether it can recover any old files all depends on what went funky with your files/drive.
>>
>>36584495
So soon we might have ai generated pony porn? My man...
>>
wtf happened to the new ngroks, I haven't been in this thread in a month
>>
>>36600603
I can summarise:
https://desuarchive.org/mlp/thread/36536892/#36577018
https://desuarchive.org/mlp/thread/36536892/#36577176
>>36586749
>>36589603
>This model is 26% trained ... give me any failure points/issues you find... most of them should disappear with more training
>In theory I can do ... and create controllable speech
> ... I think I've either got a bug or I have to restart the entire training run
>I found a bug... I expect to lose about 2-3 weeks to this.
>>
>>36600514
>"You open the door for Rainbow Dash. As she walks in, she blushes heavily as her heavy mare cock swings from side to side."
holy fuck I'm dying
>>
>>36600630
man that star trek episode where Data tries to make a daughter and her brain just catastrophically fails when emotionally jostled too hard... is making more and more sense the more I learn about how hard it is just training a simple pony voice AI
>>
Here's a Google Sheets link that keeps track of datasets I've been sent. It will be updated in real-time. Feel free to remind me if you don't see a dataset listed here even though you've sent me one.
https://docs.google.com/spreadsheets/d/1dd8yv2MyRhxCNWWO04xOk6-h_XnKU_xj34qZ8pobCfc/
>>
>>36599307
>only around 15 seconds of audio so I didn't bother with it
but classic derpy had even less and she worked...
>>
File: file.png (7 KB, 1299x46)
7 KB
7 KB PNG
>>36600976
Man, had a good chuckle on this one.
>>
>>36600976
You're missing my dataset: 1hqydpu00jwYk6IpGdEHZNlhM7MybeyRR
Do I need to make any adjustments to it?
>>
>>36600490
you have done the entire series?
>>
>>36601059
also found something interesting
https://soundeffects.fandom.com/wiki/Adventure_Time
https://soundeffects.fandom.com/wiki/My_Little_Pony:_Friendship_is_Magic
https://soundeffects.fandom.com/wiki/My_Little_Pony:_Equestria_Girls_(2013)
https://soundeffects.fandom.com/wiki/Category:My_Little_Pony_Franchise
also is pony life added?
https://soundeffects.fandom.com/wiki/My_Little_Pony:_Pony_Life
and what about old mlp?
>>
>>36601059
Yes, there are more characters here. >>36548549
I ripped animatics from the boxset and ripped audio from most of the games to create this dataset.
>>
>>36601089
are these gonna be on 15?
>>
>>36601101
Look at 15's post above, it's not concrete but I hope a few of these characters will be on 15 one day.
>>
>>36600976
Shouldn't the "Distorted" column (N) have "yes" be red and "no" be green? Because "no distortion" is good, and "yes distortion" is bad.
>>
>>36600976
Can we make comments on the spreadsheet directly?
>>
>>36601089
>animatics
what does that mean?
that you did just the animatics or that you did all the episodes and the animatics?
also you could look into this >>36601083 to maybe find which sound effect pack you should use to try to remove them from the audio, like clipper did with the dub. also, what dub did you use, and how well did it work?
and clipper if you read this, do you have the list of low-to-no-usability clips?
the sound effects i found could come in handy for trying to make them usable
and poopsikins, i repeat myself >>36600980 give midna
>>
>>36601104
oh yeah. i read that post. my brain saw "list of datasets i've been sent" and it just shat its pants and decided to go home. Thanks for holding my hand
I don't know why people get mad at me when I'm like "like I/someone said a minute ago..." that's how you learn that you're not paying attention.
>>
>>36600976
>twilight
>Most consistent across all three seasons
my dude, she is nine seasons worth
>>
>>36600976
I've just sent you a dataset for JC Denton. It's the same one I use in >>36597481.
>>
>>36600976
based
>>36601192
unbased
>>
>>36599842
Excellent work so far. You might want to check the labeling though. Trying to understand any of this crap is like dealing with infinite matryoshka dolls, but the stuff here:

https://towardsdatascience.com/variational-autoencoders-vaes-for-dummies-step-by-step-tutorial-69e6d1c9d8e9

Seems to suggest labels should be more like booru tags: individual words or phrases describing key elements in the pics. The natural language combination of multiple elements ("Applejack and Rainbow Dash pressing their cheeks together") comes later in the process I think: here you're just trying to tell the system a few basic things that are in each pic, so it can go "Applejack, Rainbow_Dash, Cuddling, ok, this is a pic that has those three things." It might also be a good idea to start very, VERY simple at first: say a dataset of just the Mane 6, and just a small set of actions and props, lest you go mad. But I still wish I could get better info on exactly what these datasets should look like. Everything with neural net stuff feels like the classic "How to draw a Pony" joke: Step 1: draw a circle. Step 2: fill in the details. Step 3: enjoy pone.
>>
>>36601493
Never mind, I'm an idiot, you had it right the first time. It IS supposed to be brief natural language descriptions, separated by newline, as described waaay down at the bottom of that github. Wild.
>>
>>36601177
There are bonus animatics if you have the blu-ray; these animatics don't have any music at all and the sound effects are nearly non-existent, so they were an excellent source to transcribe. As for Midna, I'll maybe look into it.

>>36600976
I don't mean to prod, but there are more AT characters in the earlier dataset I sent you. Love reading your notes in the spreadsheet.
>>
>>36600976
Any chance of Flash Sentry being added? He was on the ngroks after all.
>>
File: 1611890163314.jpg (321 KB, 945x945)
321 KB
321 KB JPG
>>36600976
>Twilight Sparkle
>Most consistent across all three seasons
>all three seasons
Twilicornfags btfo
>>
Look 15, I know that there are people out there who leaked the 15.ai test site a few times. But you can't honestly expect to test the site by yourself, can you?
>>
>>36577693
>>36600976
I would like to toss in the (yet to be completed) audio dataset of Mr Rogers. I've extracted 82 minutes out of 5 (of 9 total) YouTube interviews, downsampled from 44100 to 22050, with lots of background noise such as an air-conditioning fan, a squeaky chair, and Mr Rogers touching his mic from time to time (in the next post I'm going to include links to the OG Audacity files in case someone can do some Audacity magic and remove all that noise).
I'll probably continue cleaning the auto-transcription for files 6~9, but with how things are IRL it's gonna take bloody forever, so I think it would be better to let you guys have all of this audio instead of just hoarding it on my PC.

Mr Rogers WIP (1 to 5): audio, transcription, and basic English-to-ARPAbet transcription.
https://mega.nz/file/sssCkChJ#a-aGO0vg2pbIQBQD_LWNKUFq8uFScEaEKcJEKRnu8vY
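Side note for anyone prepping similar audio: 44100 -> 22050 is an exact 2:1 ratio, so the decimation itself is trivial; the real work is the anti-aliasing filter, which is why you'd normally let Audacity/sox/ffmpeg do it. A toy sketch of the 2:1 idea using a crude 2-tap average:

```python
def downsample_2to1(samples):
    """Halve the sample rate by averaging adjacent sample pairs.

    Toy illustration only: real resamplers (sox, ffmpeg, Audacity)
    apply a proper anti-aliasing filter, not a 2-tap average.
    """
    # Drop a trailing odd sample so the pairs line up.
    n = len(samples) - (len(samples) % 2)
    return [(samples[i] + samples[i + 1]) / 2 for i in range(0, n, 2)]
```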
>>
>>36602100
It's called izotope you dumb nigger. Can't even read the tools section of the doc and you're trying to clip shit
>>
>>36602123
>not shitting on people for being ignorant without spoonfeeding them too
>>
>>36600976
Hmm, just out of curiosity, you haven't gotten a dataset of teenage Steven? I have the movie in 5.1 so I could clip it for you. The voice is substantially different so it'd be a separate character.
>>
File: computer sailor moon.jpg (55 KB, 640x480)
55 KB
55 KB JPG
>>36602100
Audacity 1~5
https://mega.nz/file/dplGgIzK#Qt1UNUJB8zYSHr1okszz4qYe8WPg4ME_asHbqU1Vs88
https://mega.nz/file/I51kHArD#mZpe1eoTnUiPyfxclO_YUhzjJMgJhbe9injzPT3qT9s
https://mega.nz/file/E88E3QCT#pNPAJEd8mowlvB4Z9waANr83x7EkiPUfKcUUY1PVV2M
https://mega.nz/file/dhsChaRI#1FrttQIVBkyJNg6rXy77dGOi-9JJlbljgQXpMSqvosk
https://mega.nz/file/V48mDaqL#J3tPjkZ6sd-kW9yIMem63Lcsz-CwABiBhd195R0-NaI
Audacity 6~9
https://mega.nz/file/9w9EACQD#5ZUDwwDtI6e5Tymq01_kney9deNAN3JPJ2xOS_MiZtQ
https://mega.nz/file/Et00mAbQ#1Vse1r4-xPqYQQC5P-o6heU7G7azc0vXnULvA_NX4wE
https://mega.nz/file/JwtWzaxI#wOwBi10g_qYePdl5OVsSpT4NS-8GZty7jIT1xBbg8gA
https://mega.nz/file/Ips0VSKA#mjxUOqKqf5wnLJOEW9WULVI9VKOQqHWXjhr6O-ImRsA
>>36602123
I swear on my mum, I was going to look into that after I completed the whole set.
Anyhow, going from pic related to actually contributing to the project feels pretty fucking good; it's nice to see my little 'brick' in the construction of the PPP and other future TTS AI projects.
>>
File: 1549529704367.png (58 KB, 512x512)
58 KB
58 KB PNG
>>36600976
>Yona - I don't know what Neanderthalic language this character is trying to emulate
>Cheerilee - Nearly identical to Celestia, except Celestia sounds more like a goddess
>Twilight - Most consistent across all three seasons.
>EQG - I don't watch this show, so I have no idea.
Those are some excellent opinions, or dare I say, based?
>>
Hey guys. Is it still possible to find the presentation you did for the /mlp/ convention somewhere on YT? If so, where?
>>
>>36602409
not sure if it is on YT but i got you an alternative link here senpai https://pony.tube/videos/watch/22793881-dafd-4efd-98a2-58d27ea51dab
>>
>>36602409
There was a link in the main doc to the source presentation:
https://mega.nz/folder/OFZzRQqK#Coi5IEZOnfd8Tc-YYEIiqg

Unfortunately the folder appears to have been deleted (probably automatically by Mega).
>>
>>36600976
I sent you over 16 hours of fully transcribed Geralt from The Witcher 3.
>>
>>36601975
Begone, faggot
>>
>>36601975
He wouldn't have to if people stopped leaking the test site.
>>
>>36602303
Dangerously based.
>>
I've started working on a Bloodlines dataset, a short one (Pisha) to see if the audio quality will be fine enough and to get some practice with making the sets. The game's convenient as dataset fodder, as all audio is neatly organized in folders and the game even has a script file with all lines written out, so making a transcript should be just a matter of copypasting and some editing instead of manually transcribing audio files.
I wanted to ask, are non-dictionary words (i.e. Malkav, for instance) supposed to be transcribed in ARPABET, or nah? And what's the best format for bot training, be it 15's or otherwise?
>>
>>36600976
Hey 15, I sent a request, can I have access to edit the sheet?
>>
File: profile_twilight.png (1.2 MB, 1500x2635)
1.2 MB
1.2 MB PNG
Is the anon who made these still around?
These are very helpful. I'm wondering if they'd be interested in making more of these.
>>
>>36603302
I think he posted here recently? I remember someone asking the same question and he replied, I think.
>>
>>36603302
>wings
stopped reading
>>
File: 1532457858238.jpg (83 KB, 915x874)
83 KB
83 KB JPG
>Mr Roger is on the list
Sasuga 15-sama
>>
File: retarded.gif (2 MB, 240x180)
2 MB
2 MB GIF
>>36602100
Just noticed that automatic regex script fucked up some of the labels, here are the re-exported English transcript:
https://pastebin.com/LuimVgUJ
ARPAbet version:
https://pastebin.com/PuECs476
>>
>9
>>
>>36603302
>>36560127
>>
>>36603051
I'd be happy to just be able to leave comments on cells. I want to fill in information on voice actors.
>>
>>36605613
Can't you just add notes?
Usually everyone can do that, no?
>>
>>36577693
>>36596413
Another update to the Plot Tagger. v1.0.2:
https://mega.nz/folder/sSZxGagR#nIhHwcrjrBjePgReiggTeQ

In this update:
>Now saves and exports correct values for the crop data
>Can now save and load .csv files
>Fixed issue with tagger not loading more than 10 images
>Added a button to duplicate entry for multi-plot support
>Added a button to remove an entry from the list
>Fixed issue where elements didn't resize on window snap/maximize

If you have feedback for this thing, let me know. I'm thinking that the distance tag is not necessary since there is now the ability to crop. Maybe I'll remove it in the future.
>>
>>36602100
Thank you, you glorious bastard. If anyone deserves to be in Equestria, it's Mr. Rogers.
>>
File: 1600872506316.jpg (23 KB, 600x400)
23 KB
23 KB JPG
thread's slowing down now
it's always the quietest moments that are the most deafening
>>
>>36608677
Looks like everyone's freaking out over G5. Things'll probably be back to normal in a week or so.
>>
>>36608745
Saw your thread simulator, had me on the floor. Keep it up lmao
>>
>>36608767
Glad to hear it! Will do.
>>
>>36608745
They'll have to get over it. It's not like we can make our own show or anything.
>>
>>36609016
>It's not like we can make our own show or anything.
>
>>
>>36609294
This is the best thread to post that kinda comment
>>
>>36606104
My only options when right-clicking are "Copy" and "get link to this cell".
>>
>10 in 5 hours
>>
>>36611682
it's over
>>
File: 1590355355996.png (165 KB, 400x329)
165 KB
165 KB PNG
SUPERCHARGE
THE
BLUEPRINT
>>
will g5 have a dataset made out of it?
im asking since i don't see the option to use dishwater slog from pony life, so i don't know the stance with respect to pre- and post-G4 voices...
>>
>>36615754
>will g5
who gives a fuck
>>
>>36615754
Nobody cared about PL. If G5 becomes at least somewhat popular (which it will with these designs) it will definitely have a dataset made out of it.
>>
>>36615818
Ha based.
>>
File: Dishwater_Slog_ID.png (281 KB, 419x469)
281 KB
281 KB PNG
>>36615964
>Nobody cared about PL
i do, for him
>>
>>36616174
Fuck off, get fucked, die in a fire and take your utterly revolting bullshit with you.
>>
>>36616228
Not the same anon, but I liked Pony Life.
It was clearly different from (and not) Friendship Is Magic, but it was pretty enjoyable as a quick mood lifter on the days I was working from home.
I mean, they do make cute pone stuff most of the time, so it's not half bad actually.
But it's definitely not the same thing as FiM
>>
Great, now we have nu-bronies
>>
>>36615754
>pony preservation project
>>
>>36616228
https://voca.ro/1opEC1xZKLYq
>>
A skit I made using my input and AI Dungeon. Feel free to dub this in 15.ai voices.

Princess Celestia: Hey Luna, have you heard of Lucid Dreaming? It's like lucid dreaming, except it's real! You could be having your own dream right now!
Princess Luna: (Sigh) No.
Princess Celestia: Well it turns out that people can use it to control their own dreams. So you won't be needing to watch over anypony in their sleep anymore.
Princess Luna: What? But I'm Princess Luna, guardian of the Night. I'm supposed to watch over people's dreams.
Princess Celestia: Look, Luna, I know you're upset about this, but there is no way for us to stop what we believe to be a natural phenomenon. We'll just have to live with it.
Luna closes her eyes, and looks down. She bites her lip painfully hard.
Princess Luna: How could you do this to me? You know I'm not wrong. You know I never ever ever lie.
Princess Celestia: Stop it! Stop being so dramatic, this is for the best, I promise you.
Princess Luna: NO! You don't understand! If you really knew me, you'd see that I'm right! I've been watching over dreams for centuries! Centuries! That's a lot of watching! I've got a lot of experience. You think YOU were watching over the same set of foals for over a millennia?
Princess Celestia: For Goodness Sake. Don't you realise you can use Lucid Dreaming to focus on your dreams instead of monitoring other's? Try it! You'll see!
Princess Luna: (Sigh) Very well. (smirks) I'll prove you right, but only because I'm curious.
Later that night, Princess Luna goes to sleep and realises she's dreaming. She tries to control her dreams, and succeeds after a while.
Princess Luna: OK, so it'll take some getting used to, but I think I can do it.
From that moment on, she manages to master her dreams. She learns to have multiple lucid dreams a night, and can control them as much as she wants.
>>
>10
>>
>>36615754
No. This thread is about preserving FiM. Others Anons with other series may have joined in over time but it was always FiM at the core.
>>
>>36615754
If you want a character's voice, the best thing you can do is make the dataset yourself. It's easier than you might think, just a bit tedious. Otherwise, I wouldn't bet on it. Most folks around here seem perfectly content with the G4 roster, with not much excitement for the new stuff.
>>
File: GetOut_sqna.webm (265 KB, 853x480)
265 KB
265 KB WEBM
>>36617812
Dhur Duhr Durh, wessa only save FiM!
And porn too, because porn is based.
And Equestria girls, because it's the same voices.
No, no pony life, it's... disgusting because I don't watch it, but I know I will not like.
>>
>>36618594
Literally the only reason I ripped and worked with the EQG voices at all was for FiM.
>>
>>36616694
Why do you have a name?
What do you contribute to the thread that is so important that needs a name?
>>
File: unknown (61).png (36 KB, 747x247)
36 KB
36 KB PNG
>>36618676
WELL GEE AND HERE'S THE ANSWER TO THIS MYSTERY

BEGONE TWITTERNIGGERS
>>
File: get_a_load_a_this_guy.png (612 KB, 878x1080)
612 KB
612 KB PNG
>>36616694
>>
>>36618594
Pony Life's odd, fast pacing and high pitch would harm any model if added alongside the main voices. On top of that, there's too much noise from sound effects all over the dialogue, and we still haven't got a high-quality release to even get clean voices for when the characters talk at a normal pace. It's not a matter of opinion.
>>
>>36600514
Very nice work
>>
>>36616174
i like and identify with this horse
i assume there's FiM-style art of all new characters
>>
File: what.png (912 KB, 1920x1080)
912 KB
912 KB PNG
>>36616694
>have you heard of Lucid Dreaming? It's like lucid dreaming, except it's real!
>>36618676
oh fuck off names are great
>>
>>36618706
this is a child, anon. you're raising him right now. you want to know what kind of kids result when you raise them on random belligerence?
i'll give you a hint, they have a lot of piercings
>>
>>36619309
Not our problem.
>>
>>36593472

Hey Clipper! (And everyone else!)

I’m the guy from a few weeks ago who said I also want to try to make a voiced green, after seeing yours.
Well, I did in fact try.
I have a more serious one in mind for the future, but for now I made a short(ish), silly, low-quality one, mainly for practice purposes.

I’m not a voice actor and I’m not a very good editor, so don’t expect much :P

https://www.youtube.com/watch?v=IS2flGbDJ2Q
>>
File: GLORIOUS CONTENT2.png (2.64 MB, 1280x720)
2.64 MB
2.64 MB PNG
>>36619752
Fucking glorious. This is a greentext I haven't seen before, helluva way to first experience it. Great work! Some of the lines sound a bit noisy/raspy, but otherwise very well done.
>>
>>36619926

Thank you so much! I've never done anything like this before. So it's really good to hear you say that.
>>
>>36619752
>>36577693
>>
>>36619752
Oh good lord I actually remember this greentext.

Great job anon. You got a pretty nice voice. I love the emphasis you got in some of the lines too. Like RD saying "a whole mmminute!" Maybe could have redone a couple others but I give it a good 8.5/10.
>>
>>36620238
Thank you!!!

Yes that one is great. I really love how a lot of the lines came out.
>>
>>36619752
I'm very happy to see you actually went and made something, especially inspired by the stuff I've made. You also sell yourself short - even if you don't think the quality's as good as you'd like, you've still done infinitely better than everyone else who did nothing.

For your first try, I'd say you did quite well. Especially good use of music - I know how tedious it can be to find that one clip that actually works. Shame about the voices being rather noisy but not much you can do about that I suppose, hopefully the next iteration of 15.ai and Cookie's ngroks will be able to fix that. On that note - remember that you can take stuff directly from the show, especially things like "oh", "um" and "er". I've never really been able to get any TTS to do those convincingly.

A few other tips to improve if you later try something more serious - try to work on varying the pace and pitch of your voice to emphasise the key points of dialogue. I don't really have a set of "rules" for how to do this well, and I can't predict how well your voice will adapt to that, so the best I can suggest is to revisit the voice lines I did in other projects and note how I deliver lines that have particular emotional weight. Also, generally decrease your pace a bit, some of the lines follow each other a bit too closely imo.

Also, you don't have to narrate every line of the greentext verbatim. In fact, I've come to believe that for this kind of thing it's generally better that you don't. If you can replace (or supplement) a line of descriptive narration with a sound effect, then I'd strongly suggest you do that. The part with the glass breaking is a good example of that. Every time you have to explicitly clarify what's going on with narration, it removes a little bit of the immersion. Stuff like "Rainbow said" and "Twilight shouted" also isn't necessary when you already have the voices doing that.

tl;dr - good job. It's a solid foundation to build from and I hope you do more.
>>
>>36615754
Maybe. Depends if g5 is actually any good and if there's a demand for voices there. I reserve all judgement for when it's actually released.
>>
>>36620466
I just want to preserve FiM and be rid of all this other stuff.
>>
>>36620452
Hello Clipper! Thanks for watching it!

Good tip on the “oh”s. I used a sound effect for RD’s laugh but not much else in terms of pony voices.

Yeah, I could’ve put more effort into my voice lines. I really just wanted to make one to be honest. I’m eager to do more though!

I actually did change a lot of the lines, mainly to remove those “(pony) said”s you’re talking about. I know there were too many left (stupid lazy brain) but it used to be a lot worse if you look at the original green!

Thanks so much for your feedback. I’m glad you think my first attempt wasn’t a complete flop.
>>
>>36619395
you are literally experiencing a problem with this fucker as we speak. you're contributing to making the problem worse. pro tip: anytime you say 'not my problem' you are probably the problem.
>>
>10
>>
>>36621064
>The only way forward is to make everything worse for everyone until the most slack jawed wide eyed negro can just barely skate by
No.
>>
>>36622569
no, the opposite of that.
>>
>>36619309
Anon, the fucker couldn't be bothered to look up the proper etiquette for posting here.
Coddling this behavior is how you get problem glasses and dangerhair people who think their opinion on something is the most important ever and must be forced down everyone's throats.
>>
>>36622643
Mass graves? I agree
>>
>>36622654
No, your behavior is what causes that. Before those people, we all got along fine and enjoyed fun shit like using names and saying whatever the fuck we wanted on the internet. Then little tinpot dictators like you making up 'etiquette' fucked it all right in the ass. You're a second away from accusing that poor innocent kid of thoughtcrime. How can the simple act of using a name bother you THAT much?
>>
>>36622930
>anon, don't you get it, you should throw away everything the site you are a part of stands for to make sure every single retard is accommodated!
>site culture? Traditions? Who would want that?
Even boomers on /pol/ figured out you're not supposed to namefag, have some faith in this kid.
>>
>>36622930
>Yeah back in my day the REAL cool kids namefagged and jerked each other off all day about how spoonfeeding was great
No, we didn't. Is gaslighting completely infused with your DNA at this point?
>>
File: These are just hooks.png (78 KB, 622x622)
>>36622948
>>36623051
Stop entertaining this retard/troll and his worthless opinions.
>>
File: pinky cute.gif (2.58 MB, 334x390)
Don't mind me, I'm just posting cute pones...
>>
>>36619309
Nice.
We need to make his parents pay for not aborting him.
>>
>>36600514
fucking kek, this was excellent. nice intro, good use of music/SFX, and full size images/GIFs, along with good thread choices and voices
>>
>>36615754
Nobody likes it.
NPCs are nobodies.
>>
>>36616694
One day we will have an AI to write convincing speeches.
>>
>>36623839
I like it so far ^:)
>>
I don't really contribute to the server. I just wanted people to identify me as this name.
>>
File: 1610528369734.png (326 KB, 1091x1028)
>>36624092
>Sonictroon
>>
>>36624107
Hey it's a good name, you should make a name too they are cool.
>>
File: 1604005870513.jpg (59 KB, 528x672)
>>36624092
>>
>>36624125
Fuck off namefag. You have contributed nothing of value to the thread.
>>
I mean if the audio is easy to obtain and people make datasets on them for 15 or the Ngroks... I'll probably do audios in the future with the G5 voices. Izzy will probably be a fun one. She looks cute. I imagine her sounding like Kerfuffle or Autumn Blaze, and now I'll be disappointed if she doesn't.

Speaking of, I really hope Autumn is included in the new batch of characters for V14. There's a lot of potential for content with her character.
>>
>not even PPP is free of G5 faggotry
Fuck this world.
>>
File: 1585378317521.gif (2.77 MB, 418x594)
>all the namefags congeal out of the gutterwork to try and convince everyone else that G5 isn't THAT bad
>>
File: Pony computer.gif (136 KB, 250x273)
>>36624601
That ship sailed long ago when non-pony voices got added, Anon.

>>36624605
>G5 isn't THAT bad
Can't be worse than EqG or S8/S9.
>>
>>36624616
>That ship sailed long ago when non-pony voices got added, Anon.
Maybe. I got started with this to keep FiM going, specifically. That's why I've been contributing to these threads for over 2 years since they began. This thread is about preserving them. Surely here more than anywhere else on the board we can keep them from getting overrun by all this other nonsense going on.
>>
>>36594436
Next ngrok model checklist [21/02/26]:
[x] - download new datasets (if not already downloaded)
[x] - extract and add speaker metadata
[x] - update duration predictor to output variance
[?] - train AlignTTS on new speakers (unknown training time. Still hasn't converged.)
[ ] - validate new transcripts using alignment data (1~ day)
[ ] - train VDVAETTS on new speakers till completion (2+ weeks)
[ ] - upload to ngrok
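The checklist above is effectively a sequential pipeline; for anyone tracking their own training runs, here's a minimal sketch of it in Python. The step names are just labels taken from the post, and the helper functions are hypothetical, not anything from the actual training scripts:

```python
# Minimal sketch: the ngrok model checklist as a tracked pipeline.
# Statuses: "done", "running", or "pending".
from dataclasses import dataclass

@dataclass
class Step:
    name: str
    status: str

PIPELINE = [
    Step("download new datasets", "done"),
    Step("extract and add speaker metadata", "done"),
    Step("update duration predictor to output variance", "done"),
    Step("train AlignTTS on new speakers", "running"),
    Step("validate new transcripts using alignment data", "pending"),
    Step("train VDVAETTS on new speakers", "pending"),
    Step("upload to ngrok", "pending"),
]

def next_pending(pipeline):
    """Return the first step that hasn't started yet, or None."""
    return next((s for s in pipeline if s.status == "pending"), None)

def progress(pipeline):
    """Fraction of steps fully completed."""
    return sum(s.status == "done" for s in pipeline) / len(pipeline)

print(next_pending(PIPELINE).name)  # validate new transcripts using alignment data
print(f"{progress(PIPELINE):.0%}")  # 43%
```

Note the ordering matters: transcript validation needs the AlignTTS alignment data, so it can't start until that training run converges.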

>>36593472
>That's the whole list done now. Relinking all deliveries for convenience
Heads up, I don't think I'll be able to do any pony plot training. Got some IRL stuff that has to take priority for a bit.
>>
>>36624594
Please do a Trixie audio before any G5 stuff
>>
>>36624630
That's why I'm here as well. Ironically this project has gathered enough attention that the ratio of nor/mlp/eople to outsiders is worse than in any other thread on /mlp/.
>>
File: headless bloom.png (107 KB, 376x395)
Been playing around with DALL-E still, trying to train a semi-working model on a WIP dataset to make sure that I know what I'm doing. At one point it generated an Apple Bloom nearly identical to a piece of fanart in the dataset, except without her head for some reason.
>>
>>36624689
Oh, definitely! G5 won't have any dataset-obtainable audio till at least when the show comes out in 2022, so that's a whole year off. Once 15ai comes back up, I'm gonna try to continue my Fluttershy audio (was trying with V13, but getting her to whisper was impossible, so hopefully that's not the case this time). Once that's done, I'll either continue my other planned Dash audio (since I've already got a script for that) or hop straight onto the Trixie one. While 15ai is down, I'll probably create the Trixie script this weekend, see what I can cook up apart from the general concept I've already got in mind.

>>36624601
I mean, this is called the PONY PRESERVATION PROJECT. It may have started with G4, but I can't see it stopping there. Don't know if G5 will be as good as G4, or good in general, but some people are gonna want at least a few characters preserved.
>>
>>36624763
>I mean, this is called the PONY PRESERVATION PROJECT
Under this logic it could also be about preserving real life 3D meatspace ponies you autistic literal nigger. You know damn well why this project was created.
>>
>>36624763
>this is called the PONY PRESERVATION PROJECT.
It was always about our ponies, the mane 6.

>Don't know if G5 will be as good as G4, or good in general
It won't be. It'll likely be EQG tier.
>>
>>36624770
Of course everyone knows why this project was created, to preserve the FiM characters. I'm only saying that with G5 coming, it's definitely going to include those characters too at some point. I'm betting we see datasets for these characters in the first week of the show airing on Netflix. This project is bound to grow beyond the reign of G4 eventually.
>>
File: 1608268481050.gif (3 MB, 311x392)
>>36624760
>except without her head for some reason
what did the AI mean by this
>>
>>36624788
Well they're not going to receive support from me for G5. Unless some of the VAs make some appearances as their old characters, I'm out. And considering the quality of the g5 posters I doubt they'd have even half the competency to see it out.
>>
>>36624544
Why do you feel like you don't contribute enough to have a name?
>>
>>36624763
How much is left to do on the fluttershy audio?
>>
>>36624816
Not him, but not every contributor uses a name.
>>
>>36624814
No doubt G4 VAs are gonna be in G5. They've all been in the previous gens, or at the very least G3, and I'm sure they had enough fun that they'd like to come back in some form.

>>36624817
About 5-ish minutes, I think. It's a "short" audio, and the remaining clips I have to generate are just a few lines of dialogue that consist of Fluttershy speaking softly/whispering. How easy this is depends entirely on the version of the models. V9 and a bit of V12 were easy, but V13 was impossible. I've got like maybe a minute's worth of dialogue left to generate. The rest is lewd.
>>
>>36624899
>No doubt G4 VAs are gonna be in G5
Well if they make one or two lines as their original characters when it comes time for their cameo, then maybe I'll go to collect them. Outside of that, no.
>>
>>36624779
>It'll likely be EQG tier.
Unless they physically dig up the Mane 6's graves or something similarly ridiculous, it's not possible for G5 to be as bad as EqG. EqG was particularly insulting because it reused the Mane 6 and put them in a bastardized version of the human world. At the very least G5 is its own thing. After yesterday's stream I'm hoping for G1 level of quality, but G3.5 is also a possibility.
>>
>>36624798
we must sacrifice apple bloom to the gods if we ever want to see the rest of g4 again




