/mlp/ - Pony

  • There are 66 posters in this thread.




File: AltOP.png (1.54 MB, 2119x1500)
TwAIlight welcomes you to the Pony Voice Preservation Project!
https://clyp.it/tm03e5en

This project is the first part of the "Pony Preservation Project" dealing with the voice.
It's dedicated to saving our beloved ponies' voices by creating a neural-network-based text-to-speech system for our favorite ponies.
Videos such as https://youtu.be/GuJKTodX1FA or https://youtu.be/DWK_iYBl8cA have proven that we now have the technology to generate convincing voices using machine learning algorithms "trained" on nothing but clean audio clips.
With roughly 10 seasons (9 seasons and 5 movies) worth of voice lines available, we have more than enough material to apply this tech for our deviant needs.

https://derpy.me/PHsCn

Any anon is free to join, and many are already contributing. Just read the guide to learn how you can help bring on the wAIfu revolution. Whatever your technical level, you can help.
Document: https://docs.google.com/document/d/1xe1Clvdg6EFFDtIkkFwT-NPLRDPvkV4G675SUKjxVRU

We now have a working TwAIlight that any Anon can play with:
15.ai
https://derpy.me/vCzm2 (Training)
https://derpy.me/hdJQF (Synthesis)
https://derpy.me/YTJ94 (Guide)

>Active Tasks
Create a dataset for speech synthesis (https://youtu.be/Bsu7mwa-QGY)
AI Training/Interface/Refinement
Synthbot working on story tagger for voiced greentexts (https://synthbot.ai/)
Researching alternative vocoders
Looking into an animation dataset, animators needed
Anon transcribing books and comics

>Latest Developments
We had a panel in /mlp/con (See FAQs for links)
Anypony can train on Google Colab
Research into free TPUs from T.R.C.
15.ai

>Voice samples
https://derpy.me/3TBK4
https://derpy.me/fHs3K
https://derpy.me/O1xdh

>Clipper Anon's Master File 2.0:
https://mega.nz/#F!L952DI4Q!nibaVrvxbwgCgXMlPHVnVw
https://derpy.me/6im1i (torrent)

>Synthbot's Torrent Resources
https://derpy.me/ZJNca

>Cool, where is the discord/forum/whatever unifying place for this project!?
You're looking at it.

Last Thread:
>>35459053
>>
FAQs:
>READ THE DOC
Do it now
https://derpy.me/V7cMp

>Did you know that such and such voiced this other thing?
Yes. We are very much aware. It is best to keep to official audio only unless there is very little of it available. If you know of a good source of audio for characters with few (or just fewer) lines, please post it in the thread. 5.1 is generally required unless you have a source already clean of background noise. Preferably post a sample or link. The easier you make it, the more likely it will be done.

>What about fan imitations of official voices?
No.

>How do I make the voices?
Several guides are available. In-depth guides on how to do training and synthesis (making the ponies speak) are in the doc. If you don't want to use the navigation bar in the doc, the sections are also directly linked in the OP. If you want to use the WiP 48KHz notebook, some kind Anons have put together some image guides for you.
48KHz Training: https://derpy.me/wW2hX
48KHz Synthesis: https://derpy.me/j4MXQ

>Where are all the voice samples?
In the doc.

>Is there a place I can find all the pony models?
In the doc.

>What about muh waifu?
Check the doc.

>Will you guys be doing a [insert language here] version of the AI?
Probably not, but you're welcome to. You can, however, get most of the way there by using phonetic transcriptions of other languages.

>What about [insert OC here]'s voice?
Not a priority. Again, however, you're welcome to. There are already people doing this.

>Where can I view the PPP /mlp/con panel?
YouTube: https://youtu.be/WtuKBm67YkI
CyTube chat: https://pony.tube/videos/watch/b83fbbfc-6d4e-4768-8deb-edb61ea38abb

>I have an idea!
Great. Post it in the thread and we'll discuss it.

>Do you have a Code of Conduct?
Of course: https://fifteen.ai/code

>Is this project open source? Who is in charge of this?
https://derpy.me/CQ3Ca
>>
File: biganchor.jpg (161 KB, 640x640)
>>35514662
Anchor.
>>
New Ngrok link:
https://071ff1e293c3.ngrok.io/
>>
File: training tutorial.png (697 KB, 1920x5900)
>>35514666
training
>>
File: 1579202665541.png (1.7 MB, 934x4670)
>>35514666
old synthesising pic (grab your tinkeranon program if you can, it's way faster doing it offline)
>>
>>35514916
*IF YOUR GPU HAS CUDA SUPPORT
>>
File: please respond.png (142 KB, 492x258)
is Anyone there?
>>
>>35515035
I'm in the middle of a project; sadly, my non-pony trained model keeps derping, making the project take three times longer than it needs to.
But hey, here is an idea. How about we set up some kind of creation challenge? For example: find a comic with sub 50 words, and giving them a dubbing, or make your own stuff.
>>
>>35515035
I'm here, still working on a project, and waiting for the return of 15.ai.
>>35515061
>find a comic with sub 50 words, and giving them a dubbing

There's already a whole thread dedicated to this, it's not doing well unfortunately. Lacking content and only alive due to a few hopeful anons that keep bumping it hoping that someone will deliver.
>>
>>35514662
We have a section in our Master Doc on how anons can contribute:
https://docs.google.com/document/d/1xe1Clvdg6EFFDtIkkFwT-NPLRDPvkV4G675SUKjxVRU/edit#heading=h.czj6ixnrrbe8

If you want to work independently on a new task, take a look at https://pastebin.com/qwNWzPYL.
If you're a dev & audio anon, you can help us figure out what kinds of errors we're seeing in our audio clips.
If you know how to use Colab, you can create ngrok sites so other anons can generate clips.
If you're an AI anon, you can try attaching any public audio super-resolution solution to 22khz WaveGlow results and post the results.
If you're a patient anon, you can help transcribe the dialogue in comics so we can have a bigger dataset for natural language tasks.
If you're willing to learn, you can do any of the above.

There are more details in the doc.
https://docs.google.com/document/d/1xe1Clvdg6EFFDtIkkFwT-NPLRDPvkV4G675SUKjxVRU/edit#heading=h.czj6ixnrrbe8
>>
New Ngrok:
https://09caa2dd633d.ngrok.io/
>>
File: anonfilly qt.png (290 KB, 2383x2651)
>>35514662
How can we create an Anon Filly dataset?
>>
>listening to the early audio samples
You fucking assholes are literally summoning demons and teaching them to take pony forms. Haven't you learned anything from Shin Megami Tensei?
>>
>>35515748
Once Cookie exposes some way to manually set the speaker embeddings, search for a speaker embedding that seems to suit her well. Use the Rainbow Passage to test the embedding.
>>
>>35515748
Just use some voice from a non-canon barbie show.
>>
>there's ponylife threads out there
Now we need a containment board for our containment board.
>>
>>35515748
Every anon contributes one sentence, we pitch-shift it to sound like a little girl and train on that.
>>
File: large.png (497 KB, 1024x1024)
>>35516224
https://u.smutty.horse/lvsswfygeol.flac
>>
There are many good pony pics that aren't colored, perhaps we could find a public repo, rip the derpibooru dumps for training data, and train a colorization AI? I found this
https://github.com/richzhang/colorization-pytorch
>>
File: 1583472452269.png (129 KB, 425x290)
>step outside the preservation project thread for the first time in months
>entire board is on fire
Right then, close that door again.
>>
File: 1500983.png (537 KB, 3911x3687)
>>35516179
Well here's a summary of how things are going today.
2 lines of code next to each other.
First line, get list of text files that exist inside directory.
Second line, check all files inside list actually exist.
> txt_files = sorted([os.path.abspath(x) for x in [*glob("**/*.txt", recursive=True), *glob("**/*.csv", recursive=True)]])
> assert all([os.path.exists(x) for x in txt_files])
run it.
>Traceback (most recent call last):
> File "start_preprocess.py", line 262, in <module>
> meta_local = get_dataset_meta(dataset_dir)
> File "/media/cookie/Samsung PM961/TwiBot/CookiePPPTTS/CookieTTS/_1_preprocess/scripts/metadata.py", line 205, in get_dataset_meta
> assert all([os.path.exists(x) for x in txt_files])
>AssertionError
It fails the check...
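
A bare assert only says that something is missing, not what. A minimal sketch of a diagnostic version follows, assuming the same two glob patterns as the snippet above; the function name `find_missing` and its `root` parameter are made up for illustration. It also reports whether each failing path is a dangling symlink, since `os.path.exists` returns False for broken symbolic links while `os.path.lexists` returns True. That is only one possible cause and may not be what happened here.

```python
import os
from glob import glob


def find_missing(root="."):
    """Return (path, is_dangling_symlink) for every globbed file that fails os.path.exists.

    Hypothetical helper for illustration; not part of the original script.
    """
    patterns = ("**/*.txt", "**/*.csv")
    paths = sorted(
        os.path.abspath(p)
        for pattern in patterns
        for p in glob(os.path.join(root, pattern), recursive=True)
    )
    # lexists() is True for a symlink even when its target is gone,
    # so (False from exists, True from lexists) means a dangling link.
    return [(p, os.path.lexists(p)) for p in paths if not os.path.exists(p)]
```

Printing the result of `find_missing(dataset_dir)` before the assert would show exactly which entries trip the check.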
>>
>>35516424
yeah I've learned to stay off of catalogs and just stick to threads I know I like. 4chan has become insufferable just like everything I enjoy.
>>
>>35516695
What file is it failing on?
>>
>>35517092
> "00_13_00_Littlepip_Sarcastic__Huh wonderful response Pip, so elegant.txt"
Nothing unique about it from what I could find. I just deleted it and continued testing.
>>
>>35515035
We'd have more activity if the damn ngrok links lasted more than three hours
>>
https://f20cbdaab9d4.ngrok.io/
>>
>>35517201
If your ngrok lasts longer than three hours, consult a physician.
>>
>>35517206
Helpful tip: You can make the result flow better by sticking in a - wherever the character speaks too fast.
>>
>>35514662
Haven't been here in a while. Any new developments?

15.ai is still down. I saw someone was making a StyleGAN2 Pony Maker, is that still going?
>>
>>35517305
Or maybe that just shakes up the entire thing. iunno
>>
>>35517354
we got ngroks and some anons are looking into animation but that's the extent of it
>>
File: 467265.png (439 KB, 437x600)
>>35514666
>>35512924
I've done the SFX and BGM for s2e1 - 4.

>>35515035
I'm always here.

>>35515748
Ideally with the voice of a female character that has some kind of association with /mlp/, similar to Dan. Maybe Miss Persona? >>35515451 >>35515452
>>
>>35517305
thanks
>>
>>35515748
>>35516224
>>35516306
https://vocaroo.com/4kklKNtqQMu
idk i think it could work
>>
File: step_2000.png (65 KB, 864x864)
>>35514576
Finally got a notebook running. This FastSpeech2 can synthesize samples with WaveGlow but I turned it off because I was too lazy to find a WaveGlow model that was compatible (v5 gave layer errors).
https://u.smutty.horse/lvsvdkfhefd.wav
>>
>>35517729
>FastSpeech2
is this the one that would allow people to create their own audio with just cpu power?
>>
>>35518184
Tacotron2 gets an acceptable time on CPU, but FastSpeech 2 is much faster.
>>
File: large.jpg (289 KB, 674x1024)
>>35514472
Posted some more Tempest experiments in the comic voice over thread: >>35518429
>>
>>35518435
>FastSpeech 2 is much faster
could it be possible to run FastSpeech 2 on nvidia gpu to run it even faster?
>>
>>35518515
Yes.
>>
New link:
http://08c41502893c.ngrok.io/
>>
>>35519364
can't you use the Discord model already available instead of the shitty one?
>>
>>35519464
The ngrok notebook doesn't use the same models and is incompatible with them. If I understand things right, it's a single multispeaker model with all characters trained together. This is why so many more voices were available.
>>
>>35519364
The fuck? I have no idea why this one went down so quickly, but whatever. NEW:
https://87d0f2647b00.ngrok.io/
>>
>>35519480
it just breaks my heart how bad it is. :(
>>
>>35519813
I hold out hope for 15's next iteration of models. Hopefully once they're out, this thread and the comic dub thread will kick back into high gear again.
>>
[Done] Learn how to use pandas dataframes
[In progress] Load audio records into a pandas dataframe. This requires a bit of refactoring.
[ ] Create pandas dataframe for dictionary items... maybe
[ ] Create utility function for dumping missing pronunciations and relevant audio files
[ ] Dump a list of missing pronunciations for all the extra data
[ ] Load persona nerd's data
[ ] Create Montreal Forced Aligner Inputs dataframe from audio data and dictionary data
[ ] Create utility class to dump MFA data to a folder
[ ] Create utility class to serialize/deserialize dataframes with protobuf
[ ] Try to get a programmatic interface to MFA
[ ] Create a wrapper around MFA to get alignments for individual characters
[ ] Run MFA to get pronunciations for the new data
[ ] Refactor test cases to work with the new flows
[ ] Create a new preprocessing notebook to show how to add new data
[ ] Create utility classes to simplify generic audio preprocessing
[ ] Add sample preproc flows for creating spectrograms, trimming audio files, adding phase information, and adding speech metrics
[ ] Add utility classes for creating new output formats
[ ] Add Tacotron-compatible output
>>
>>35519987
It's nice to see these progress reports. Keep up the good work SynthBot.
>>
>>35517729
Can you provide your notebook?
>>
>>35520002
https://colab.research.google.com/drive/1ThqIlh5_7QCuXWL65Wag5OyW0ZuKELXh?usp=sharing
>>
File: TWOTWOTWOTWOTWO.png (253 KB, 3000x2710)
Model of Two from Battle For Dream Island: 1sew4iilpij9tvY8HTEYKZuik0BMMf8vX

Validation Loss is 0.126941
>>
File: Two_Icon.png (55 KB, 677x626)
>>35520728
>>35514666
I'm a dumbass, forgot to include examples.
https://voca.ro/a4uzxEpv9uc

Again, model of Two from Battle For Dream Island: 1sew4iilpij9tvY8HTEYKZuik0BMMf8vX
>>
>>35520728
Woah, you're doing BFDI characters?
>>
>>35520929
Yep, already did Four (their code is in the document), and I'm planning on trying to do as many as possible.
>>
File: sadaloo.png (107 KB, 800x551)
>>35519670

it's already down
>>
File: 1578602188450.gif (20 KB, 256x256)
>>35517677
https://www.youtube.com/watch?v=XCE3UyEAiCk&t=7s I'm not really feeling it, maybe someone that sounds more like the little girl.
>only four lines
>goes away 25 seconds into the video
kek, she really is a loser.
>>
Made a version two of the Four model! This one is a bit clearer.

1-I2C5Fy7nirEBS7ZApraCYLiUsuZEQuJ

https://voca.ro/euecA50O6oo
>>
>>35521634
>>35514666
WHY DOES MY IDIOT ASS KEEP FORGETTING TO ANCHOR
>>
File: WizTree64_qvEPoXaqC8.png (41 KB, 901x454)
>>35515748
I've got a metric fuckton of anime girl voices now that I've got literally every audio line from P4G dumped and transcribed.

I'm sure one of these voices could be used?
>>
>>35521624
Nice Flag.
>>
>>35521195
NEW: https://6ee4080cd81b.ngrok.io/
>>
File: 1979992.jpg (1.58 MB, 1789x1265)
>>35514666
>>35517506
I've done the SFX and BGM for s2e5 - 7.
>>
File: 1593341296666.jpg (44 KB, 1098x640)
I hoped to see you all in this renewed communion with the fathers of the nascent machine intellects.
https://www.youtube.com/watch?v=C2Yx90pytqs
01001001 00100000 01100110 01101111 01110010 00100000 01101111 01101110 01100101 00100000 01110111 01100101 01101100 01100011 01101111 01101101 01100101 00100000 01101111 01110101 01110010 00100000 01101110 01100101 01110111 00100000 01100001 01110010 01110100 01101001 01100110 01101001 01100011 01101001 01100001 01101100 00100000 01101001 01101110 01110100 01100101 01101100 01101100 01101001 01100111 01100101 01101110 01100011 01100101 00100000 01101111 01110110 01100101 01110010 01101100 01101111 01110010 01100100 01110011 00101110
Please check gets and praise the-night-that-is-always-the-darkest-before-the-dawn.
>>35522222
This service has been brought to you by the Department of Religion.
>>
What the absolute fuck did Kek mean by this?
>>
>>35517729
>>35520002
Synthesis notebook (you can hear Twilight samples here). The model is trained up to 30k steps.
https://colab.research.google.com/drive/14uX9mlC-9hWPNh8GQoccIyTQe0tgpgj2?usp=sharing
>>
How do we add models to an ngrok setup?
>>
>>35523473
I think you'd need to get Cookie to do it.
>>
>>35523483
Cookie plz add Daria.
>>
>>35523483
>>35523504
currently rewriting my dataset processing
Just got a first test done of Montreal Forced Aligner.
Now cleaning up function and writing *simple* parser for output files.
>>
>>35520702
Ech, this is not the best sounding for 30k steps. I'll give it a shot with Rise though, I just gotta figure out how to feed tacotron datasets.
>>
>>35523551
Those 30k steps only took 2 hours, I'll train it more until I get GPU capped on my alt and see if it improves.
>>
File: Pony-Party-DEAD.gif (1.43 MB, 640x360)
Day 33 without 15.AI... I think I'm going crazy. Supplies and content are running low. Morale is down and I don't know how much longer the contentfags can hold out.

https://voca.ro/gz1gdSsleXh

In other words I'm bored out of my mind and I'm getting impatient
>>
>>35523559
Got it. I was going to look into using this to train locally on my RTX 2060S, but I'm not sure how it expects the data to look for training. If you can do a writeup on that, it'd be extremely useful.
>>
>>35523674
use the notebooks nerd
>>
File: firmhoofshaketia.gif (396 KB, 1876x1200)
>>35523674
>>
So has the project progressed since 15 released his site in any way or is it dead now?
>>
>>35523675
Drop flist.txt and valist.txt, formatted the way you would for your NVIDIA/Tacotron2 datasets (the split doesn't matter since they're instantly fused into one file), into the working directory, and put the wavs folder in there too. Then run the cells in "Prepare dataset".
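
For anyone who hasn't built these before, here is a rough sketch of writing the two files. It assumes the usual NVIDIA/Tacotron2 filelist layout of one `path|transcript` entry per line. The function name `write_filelists`, the `transcripts` dict, and the `val_every` split are made up for illustration, and which file counts as training versus validation is a guess (per the post, the split doesn't matter here).

```python
from pathlib import Path


def write_filelists(transcripts, workdir=".", wav_dir="wavs", val_every=10):
    """Write flist.txt and valist.txt as 'path|text' lines.

    transcripts: dict mapping a wav file name to its transcript (hypothetical input).
    Every val_every-th entry goes to valist.txt, the rest to flist.txt.
    Returns (number of flist lines, number of valist lines).
    """
    workdir = Path(workdir)
    train, val = [], []
    for i, (name, text) in enumerate(sorted(transcripts.items())):
        line = f"{wav_dir}/{name}|{text.strip()}"
        (val if i % val_every == 0 else train).append(line)
    (workdir / "flist.txt").write_text(
        "".join(l + "\n" for l in train), encoding="utf-8")
    (workdir / "valist.txt").write_text(
        "".join(l + "\n" for l in val), encoding="utf-8")
    return len(train), len(val)
```

Calling `write_filelists({"clip_0001.wav": "Hello there."})` in the working directory would leave both files next to the wavs folder.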
>>
>>35523733
Cookie's still very active in researching and trying out new tech. Synthbot's looking into animation automation. Clipper is collecting background sound effects. I'm slowly adding to the doc. Part of the quietness I think is due to the happenings outside the thread.
>>
>>35521852
We should make each board a different anime girl.
>>
>>35523771
most of the content is coming out of the ngrok links now
I'll post some stuff soon, it's funny as hell
>>
>>35523787
for
>>35523733
>>
File: file.png (66 KB, 850x462)
>>35523674
>I'm bored out of my mind
just do what i do, try to learn machine learning, get really into it, then fail spectacularly, repeat until something happens https://developers.google.com/machine-learning/crash-course/ml-intro i suck at this
>>
>>35523816
Post questions in the thread if you have trouble understanding something. I want to put those together into a machine learning FAQ.
>>
http://0dfc13f0fc2e.ngrok.io/
>>
>>35524299
Thanks. Seems like i'm out of the game on hosting these for the moment as Colab is saying I've reached my limit on GPU usage.
>>
Starlight is drunk

https://vocaroo.com/3O9ZLdnj8CH
>>
>>35514662
I can't wait for Applejack to say the N-word.
>>
>>35524425
https://files.catbox.moe/gvl8k2.webm
>>
>>35515748
Use Microsoft Mary. Regular Anon is Microsoft Sam.
>>
File: wtf man.jpg (26 KB, 407x470)
Everyone is absolutely terrified of dinos

https://vocaroo.com/hV9PqjL69tQ Starlight
https://vocaroo.com/2RwPUjcfPT8 Apple Bloom
https://vocaroo.com/cyTeRM6iWoS SWEETIE BELLE
https://vocaroo.com/astpJWEu1Xa Scootaloo
https://vocaroo.com/4xL5QopeeN0 Flutters
https://vocaroo.com/5EDk8qjpcZe RD
https://vocaroo.com/41rofLA7ll8 AJ
https://vocaroo.com/l1JA1kaEbk4 TWILIGHT
https://vocaroo.com/irvClAK5rkL Vapor Trail
https://vocaroo.com/dkRgvTnSTiM Stellar Flare
https://vocaroo.com/m6b0vjkzIVd CELESTIA
https://vocaroo.com/8x3lZvTTQx0 Luna
https://vocaroo.com/9EIQcgpxId5 DISCORD I'M HOWLING AT THE MOON
https://vocaroo.com/cklT6ie9IzA Chrysalis
https://vocaroo.com/4UIbPgck65y COZY
https://vocaroo.com/6VsZPD4ZVw4 Tiara
https://vocaroo.com/ai9epxzbP6y TIREK
https://vocaroo.com/kJqE4vXCGII Silverstream
>>
File: 2371987.jpg (11 KB, 210x216)
>>35524848
this is a fetish thing, isn't it?
>>
>>35524936
Maybe
>>
>>35524848
>>35524936
https://www.youtube.com/watch?v=428IyxSfsls
>>
>>35524848
>https://vocaroo.com/m6b0vjkzIVd
Also, hearing Celestia, who usually maintains a high level of composure, completely lose her shit at the dino is fucking hilarious to me
>>
>>35525129
https://www.youtube.com/watch?v=KvFyJLgnwW0
>>
>>35525138
2:19 for the dinosaur.
>>
>>35524848
Some of these are unreasonably funny.

>>35519987
Updates:
[Done] Load audio records into a pandas dataframe. By default, this filters out incomplete records since those are only useful when checking for dataset errors.
[In progress] Create utility function for dumping missing pronunciations and relevant audio files
[In progress] Dump a list of missing pronunciations for all the extra data
[ ] Create pandas dataframe for dictionary items... maybe
[ ] Load persona nerd's data
[ ] Create Montreal Forced Aligner Inputs dataframe from audio data and dictionary data
[ ] Create utility class to dump MFA data to a folder
[ ] Create utility class to serialize/deserialize dataframes with protobuf
[ ] Try to get a programmatic interface to MFA
[ ] Create a wrapper around MFA to get alignments for individual characters
[ ] Run MFA to get pronunciations for the new data
[ ] Refactor test cases to work with the new flows
[ ] Create a new preprocessing notebook to show how to add new data
[ ] Create utility classes to simplify generic audio preprocessing
[ ] Add sample preproc flows for creating spectrograms, trimming audio files, adding phase information, and adding speech metrics
[ ] Add utility classes for creating new output formats
[ ] Add Tacotron-compatible output
>>
ngrok.io anyone?
>>
File: 1588497313707.jpg (35 KB, 403x408)
>>35525634
no
>>
>>35525634
here you go anon.

https://d27cd1e60f6d.ngrok.io/
>>
>>35523559
Trained it up to 100k steps. No improvements.
>>
>>35525137
It's fun to see how easily you can turn Celestia into Daybreaker with this thing.
https://vocaroo.com/htFHw5fSX6F
>>
>>35519813
The emotions you can get out of them makes up for it imo

>>35519829
Or he does the same thing with the neutral models, hopefully he gets the message this time
>>
File: 1593045133700.gif (54 KB, 342x342)
>>35525667
I actually have something similar running for my stuff. You can try it if you want. Though Idk if it'll work for you.
You need the localtunnel client from https://localtunnel.github.io/www/

lt --host http://lt.romesilvanus.io:1234 --port YOURLOCALPORT --subdomain WHATEVERYOUWANT
Or leave out --subdomain if you want a random one.
>>
>>35525879
Although, could the new WaveGlow v5 model be the culprit?
>>
test
>>
>>35526135
FastSpeech 2 is very susceptible to overfitting, just check out these Celestia samples:
500 steps: https://u.smutty.horse/lvtpshihybw.wav
1000 steps: https://u.smutty.horse/lvtpshgqvvi.wav
2000: https://u.smutty.horse/lvtpshgmilc.wav
3000: https://u.smutty.horse/lvtpshgmamb.wav
Careful hparam tuning is going to be necessary for smaller datasets.
>>
File: TiI.png (7 KB, 240x160)
>>35514662
>we have more than enough material to apply this tech for our deviant needs.
>>
File: ReAlIzAtIoN.gif (2.74 MB, 200x198)
>>35526099
>Or he does the same thing with the neutral models
Fuck, some of my excitement is now replaced with dread. The neutral models were still good but man were they tough to work with if you wanted to get emotion out of them.
>>
File: 748337.jpg (188 KB, 1280x720)
>>35514666
>>35522230
I've done the SFX and BGM for s2e8 - 10.
>>
>>35527141
Say Clipper, how many times have you listened to all 9 seasons?

Do you hear ponies in your dreams now?
>>
>>35527302
For the later seasons, I only had to clip them once. By the time I got to those the clipping process was pretty solid. For the earlier ones though, particularly seasons one and two, it’ll be much more, especially if you count reviewing with the PonySorter as listening to the episode three times and the SFX and BGM extracting as two times. Even if you count those as only one listen, a conservative estimate would probably be around four or five times by now, and many more on top of that if you count me just watching the episodes for fun.

I don’t dream all that often, but I can do decent mental impressions for most of the more prominent characters when reading greens and fanfics, and I’ve become quite good at predicting sounds by just looking at the waveforms. Clipping the SFX and BGM has also given me a deeper appreciation for how they can be used to enhance the voices, which I plan to put to good use when 15.ai comes back online.
>>
>>35527387
What's your favorite fanfic?
>>
>>35527387
I am humbled by your autism.
>>
NEW:
https://56ff7de276fb.ngrok.io/

Is there a reason to care about what graphics card Colab gives me? Are any in particular especially better or worse for this?
>>
>>35527827
In our experience the K80s tend to cause the most issues, but I haven't seen how they perform with the ngrok servers. I just factory reset if I get a K80 regardless, just in case.
>>
>>35527827
K80's seem to run incorrectly from time to time.
P100 has 250W power limit.
T4 has Tensor cores.
T4's are best for the ngrok server (half precision), and P100's are best for full precision (training).
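
A small sketch for checking which card a session got, with the rules of thumb from this post and the one above it baked in. The names `GPU_NOTES`, `advise`, and `current_gpu_name` are made up for illustration, the notes are thread lore rather than benchmarks, and the GPU name is read by calling `nvidia-smi`, which is assumed to be on the PATH.

```python
import subprocess

# Rules of thumb taken from the posts above; not an official benchmark.
GPU_NOTES = {
    "T4": "has Tensor cores; suits half-precision inference (ngrok server)",
    "P100": "suits full-precision training",
    "K80": "reported to misbehave at times; consider a factory reset",
}


def advise(gpu_name):
    """Return the thread's note for a GPU name such as 'Tesla T4'."""
    for key, note in GPU_NOTES.items():
        if key in gpu_name:
            return f"{gpu_name}: {note}"
    return f"{gpu_name}: no note from the thread"


def current_gpu_name():
    """Ask nvidia-smi for the first GPU's name; return None if unavailable."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None
    lines = out.strip().splitlines()
    return lines[0].strip() if lines else None
```

In a notebook cell, `name = current_gpu_name()` followed by `print(advise(name) if name else "No NVIDIA GPU detected")` is enough to decide whether to factory reset.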
>>
>>35527845
>>35527858
Alright gotcha, thanks for the rundown.
>>
>>35514662
Hey anons, I'm autistically keeping a paper and .pdf (for my phone) journal of this project just in case I suddenly end up in Equestria. Is there something I should be sure to add?
>>
>>35527876
Yeah, make a note to save Trixie for me.
>>
https://vocaroo.com/dK6d0Ybf9Tp
im not sorry
>>
>>35528164
kek
>>
>>35527387
>I’ve become quite good at predicting sounds by just looking at the waveforms
I didn't even know it was possible to become good at that.
>>
Made rap song with Celestia. Will make actual skits after this, I promise.
https://u.smutty.horse/lvtvkskzzse.mp3
>>
File: 1531375437353.gif (34 KB, 360x360)
>>35528164
lmao, that got me
>>
NEW:
https://e7bd24afa847.ngrok.io/

>>35529071
Nice work! I'd imagine keeping it all in time requires a lot of tweaking to get right.
>>
Has anyone tried using an ensemble of WaveGlow models to see if it improves sound quality? Or maybe mixing WaveGlow and another vocoder?
>>
>>35529520
To answer my own question:

https://vocaroo.com/3A81rq73Lm6

The sound waves are out of phase, and she ends up sounding like a robot. I don't suppose this is a solvable problem?
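
To illustrate why mixing misbehaves, here is a toy sketch, assuming the outputs were averaged sample by sample (the post doesn't say exactly how they were combined). It averages two unit sine waves that differ only in phase; by the identity (sin x + sin(x + φ)) / 2 = sin(x + φ/2) · cos(φ/2), the result shrinks toward silence as the offset approaches half a cycle. The function name and its default parameters are made up for illustration, and real vocoder output is far messier than a single sine.

```python
import math


def peak_of_average(phase_offset, freq=220.0, sample_rate=22050, n=2205):
    """Peak amplitude after averaging two unit sine waves that differ only in phase."""
    peak = 0.0
    for i in range(n):
        t = i / sample_rate
        a = math.sin(2 * math.pi * freq * t)
        b = math.sin(2 * math.pi * freq * t + phase_offset)
        peak = max(peak, abs((a + b) / 2))
    return peak
```

Comparing `peak_of_average(0)`, `peak_of_average(math.pi / 2)`, and `peak_of_average(math.pi)` shows the drop. Because each frequency in two different vocoder outputs can land at a different offset, some parts of the spectrum would be attenuated more than others, which is one plausible source of the robotic sound.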
>>
More content.
>>35529506
>>
NEW:
http://5f9e1593ad98.ngrok.io/
>>
File: large2.png (144 KB, 1280x722)
https://u.smutty.horse/lvtxvknpzeh.wav

Introducing ‘Pinkie Fucking Dies: The Audio Experience™’
Okay maybe not really, this is more so a collection of tests and practice I’ve been doing in preparation for the return of 15 than an actual ‘skit’ like I usually do. For those interested, all that I’m testing here is:
General workflow and tools of a new program which I plan to use from here on out
360 audio positioning
Clipper’s show-cut SFX
Custom SFX and sound design
Making BGM that hopefully better fits what’s going on
(Hopefully) more realistic room reverb
Learning when to shut the fuck up with the BGM and let the characters speak
Instruments and FX from the new program to help with the music aspect

I didn’t even script any of this; it’s literally all off the top of my head, which is why the writing is cancer.

Two Anons helped out with the outro jingle, by supplying the trumpet and trombone, and the piano playing.

Audio quality wise I’m pretty happy with how this turned out overall. I’ve learned a lot about how to make my future, more serious stuff better.
>>
File: kek.png (1009 KB, 4300x5697)
>>35529728
That's some high quality stuff regardless. Fantastic work.
>>
>>35529728
Great job
>>
>>35514666
Made an X model, and updated my Four model for the third time!

X from Battle For Dream Island model, 48Khz MMI. Uses a Fluttershy base.
1AeLT-PVsEmRVt2-jMnAXQ0lCF0_N-0db
https://voca.ro/klmz0Gubr57


Four from Battle for Dream Island, Version 3. 48Khz MMI, uses a Twilight base.
1Lej9KoOi8N_NBLAe1pWgU2K16LxfBRIV
https://voca.ro/ftQSj6r91jm
>>
>>35524848
Forgot to update Scoots

https://vocaroo.com/VyLB3fHeQoE
>>
>>35529728
Damn, this is good. What program did you do the 360 audio with?
>>
>>35527441
I don't really have a #1 definitive favourite, rather a whole bunch that I like because they do a few specific things really well. If you're looking for a recommendation, I'm currently reading "Shape Your Home" from the tech thread and really liking it so far. It centres on the theme of wAIfus and has a pretty interesting premise - https://pastebin.com/XMi6VhS5
"Steel Sanctuary" is also pretty good - http://pastebin.com/3Mt5iYBQ

>>35528371
The waveforms for some sounds are pretty distinctive, such as hoofsteps, magic, bangs, crashes and screams. You can also learn to infer what kind of sounds are coming next by looking at how the upcoming waveform compares to what you're currently hearing in the moment.

>>35529071
>>35529728
These are great, and it's nice to see that the sound effects are useful. Thanks for continuing to make stuff like this, it's always nice to see the AI put to good use.
>>
https://www.youtube.com/watch?v=0sR1rU3gLzQ
>>
>>35531164
whoa sick bro never saw this before it's gonna revolutionize the project!
>>
>>35531169
epic
>>
File: hehehe.gif (132 KB, 440x440)
AN IMPORTANT MESSAGE FROM CELESTIA

https://vocaroo.com/cW5iV70PPD1
>>
https://77d99960a22a.ngrok.io/
>>
>>35531390
That sounds pretty close to singing, especially that last bit. How'd you make it do that?
>>
File: 1592271274270.png (391 KB, 779x736)
>>35531390
>>
File: 1564619157373.jpg (584 KB, 3840x2160)
>>35531390
Based Cute & Funny Stacy
>>
>>35531458
THEY CAN SING

https://vocaroo.com/nMmc1sfZaqD

Just add a couple of -- before a sentence and many at the end (they can also extend words in between), then add ??!! and variations:
-------DISCORD, I'M HOWLING AT THE MOON-----------------??!!
---------AND SLEEPING IN THE MIDDLE OF A SUMMER AFTERNOON-----------?!!!

I'll be adding a bunch more samples later, it's the funniest shit ever. I've even managed to get 99% perfect moans from Starlight.
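
The dash padding described above can be sketched as a tiny helper for batch-preparing lines. The function name `pad_for_singing` and its default counts are made up for illustration; the counts are a starting point to tweak per character, not known-good values, and it is only a guess that the all-caps text in the examples matters.

```python
def pad_for_singing(line, lead=7, trail=17, ending="??!!", upper=True):
    """Wrap a line in runs of dashes plus emphatic punctuation.

    lead/trail: number of dashes before and after the text.
    ending: punctuation appended after the trailing dashes.
    upper: upper-case the text, as in the examples (assumed, not confirmed, to help).
    """
    text = line.strip()
    if upper:
        text = text.upper()
    return "-" * lead + text + "-" * trail + ending
```

For example, `pad_for_singing("and sleeping in the middle of a summer afternoon", lead=9, trail=11, ending="?!!!")` builds a line in the same shape as the second example.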
>>
>>35531390
SHE CAN TALK
SHE CAN TALK
SHE CAN TALK!
>>
>>35531564
holy shit.
>>
>>35531564
Can she play the piano any-more?
>>
File: 1591644922067.png (522 KB, 720x720)
>>35531564
I thought this thread already figured out singing and I was like DAMN, but you. YOU may have made a break through here, (You)!!
>>
File: laughing villains.png (1.77 MB, 1920x1080)
>>35531390
On one hand, how dare you make my waifu say that.

On the other, I can't contain my sides. It's like she's singing sarcastically.
>>
>>35531564
That's amazing, but I can only get it to work with Celestia. Every other character just starts panicking and hyperventilating.

https://vocaroo.com/mN8dm73yla1
(If someone could tune this, that'd be great)
>>
>>35531653
I'm dying
>>
File: 1567816239788.jpg (1.27 MB, 2437x1594)
>>35531653
She sounds white girl wasted out of her mind doing the thing of walking down the street at 4AM screaming some song.
>>
>>35531653
Truly amazing
it feels like >>35531564 did a gold strike here
>>
https://voca.ro/adnCH9Yu1wR
I'm already in love with drunk singing Celestia
>>
>>35531741
That's fucking hilarious, I love these models.
>>
>>35531022
Logic pro, a built in plugin called Direction Mixer made it relatively trivial.

>>35531390
>>35531564
>>35531653
Holy shit, this is a significant discovery. If it's discovered how to do this consistently with all of the characters it'll be fucking awesome.
>>
>>35531741
>>35531653
>>35531564
>>35531390

how are you making these?
>>
>>35531796
They work well for the most part for the decent models

Cadence
https://vocaroo.com/6aQDKZZdeDJ
https://vocaroo.com/g2OohFSAdi6
https://vocaroo.com/bFUbes08Ca2
Golly
https://vocaroo.com/4Ztg7yv1h2w
https://vocaroo.com/oRUWDWSspHe
Luna
https://vocaroo.com/6xd3qJpejZc

>>35531841
Look again at >>35531564 they are for these links:
https://77d99960a22a.ngrok.io/
>>
https://voca.ro/Nk2FRUFZklL

Heh. Managed to make Rainbow sound pretty lewd.
>>
https://voca.ro/I5DqcnFSxJy
>Oh nothing, just making you sing :D
https://vocaroo.com/mLps0hh5j6D
>>
File: aaaaa.gif (149 KB, 226x218)
>>35531653
>tune this
https://u.smutty.horse/lvucukvpzzf.wav
It's a bit uncanny but it works. This opens up SO many possibilities for future content!
>>
https://voca.ro/8upkbzVsjFU
>>
>>35529728
Oh wow. Nice music.
>>
>>35531390
>>35531564
>>35531653
Hey, maybe you can autotune that.
Maybe >>35531796 can make music to go with their "singing" kind of like what this guy does: https://www.youtube.com/watch?v=6QnrxhsBQJk
>>
File: 1267664806511.jpg (241 KB, 680x662)
>>35531653
>>35531564
>>35531390
holy shit
>>
File: celestia_cute.png (21 KB, 107x148)
https://clyp.it/iu2k0wvo
Drunk Celestia is so much fun to play with.

>>35532125
Oh yes yes YES
>>
>>35531957
that's because real sex noises are half laugh/cry anyway, so rainbow is always halfway there.
>>
>>35532125
That's incredible. The right intonation in "reality" is a bitch to get right so there's room for improvement.

https://vocaroo.com/6C0Ua3hfq4q
>>
Tried it on the Fluttershy model. It started out great, but now I can't get it to sing at all.
https://voca.ro/1cxkPPjR6nK
>>
>>35531957
https://vocaroo.com/h9M8LqaWf2e
https://vocaroo.com/5Zo8X4LPvfV
https://vocaroo.com/57jHgSJ0zOq
https://vocaroo.com/1hA3REMkOIK
>>
File: seed_42_sample_2.png (2 KB, 32x32)
Has Image GPT's 64x64 version been released?
>>
>>35529728
damn, the outro turned out great. nice work
>>
Is there any tool to quickly convert a tacotron list into phonetics? I thought the tacotron transcription tool in the doc did that, but turns out I've been using raw input this whole time and that's why my models aren't working
>>
>>35533347
I did find a few but they're for earlier on in the process and I don't want to have to go back and do it all again
>>
>>35533347
I don't know what a "tacotron list" is. It would help if you explained.
>>
>>35527876
Tell them to send a message to /mlp/con, to build a portal to HarmonyCon, and to not trust the normalfags.
>>
https://vocaroo.com/alXJ9deYbyp
https://vocaroo.com/4USdQQY5tVs
https://vocaroo.com/daoGVrs53ao
>>
Working on a very long skit. It's kinda like a variety show, so there's some stuff I made that had to be cut. Here's one of the outtakes.
https://u.smutty.horse/lvugtuysvjk.mp3
>>
>>35534950
Is that really Lauren Faust?
>>
>>35534996
yeah
>>
>>35534950
wait... is this... an actual call??


who the fuck?
>>
>>35534950
I never noticed before Snoopy has heels that could pry nails out of boards.
>>
File: laughing_skeleton.gif (955 KB, 360x360)
>>35534950
Please tell me you didn't actually prank call Lauren Faust.
>>
>>35534950
https://old.vocaroo.com/i/s0ojWq3SVFAD
>>
>>35535118
https://pony.tube/videos/watch/4e280d24-ac3b-4671-bc39-a1a006ae8615
>>
>>35535127
shhh. let him believe.
>>
>>35534950
Why did this call happen, back in 2012?
>>
>>35535124
she sounds better and better, god damn I need this on a ngrok
>>
NEW: http://5a1a9808dcb0.ngrok.io/
>>
>>35531077
Some errors in the mobile game transcripts:
https://pastebin.com/huDkiNWs

>>35525373
List of missing pronunciations from our pony-adjacent data:
https://drive.google.com/drive/folders/1dXm_dLxpjUvhmYpDDjOL5ymnIunTD9zi?usp=sharing
List of missing pronunciations from our songs:
https://drive.google.com/drive/folders/1zbfkJ1j471noeIR1xTAbadsP0NOy24CA?usp=sharing

If anyone wants to work on these:
- There are examples here of what pronunciations should look like: https://drive.google.com/file/d/1zQUceBUuC-SclNIuXFeDhoRBdvDY4v8a/view?usp=sharing
- You can use the existing pronunciations in merged.dict.txt as a guide for what the new pronunciations should be: https://drive.google.com/drive/folders/1DQGul6hOqi227MJSJ-pPBv051YFbcDAi?usp=sharing
- You can ignore any punctuation in the missing word.
- Report any typos. Don't add pronunciations for them.
- If any word appears twice, you only need to create the pronunciation once unless the pronunciation changes between clips.
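
For anyone who wants to check their own transcripts the same way, here is a minimal sketch of how a missing-word list could be built. It assumes the dictionary uses CMUdict-style lines (`WORD  PH1 PH2 ...`), which may not match merged.dict.txt exactly; both function names and the sample entries are hypothetical.

```python
import re


def load_dictionary(lines):
    """Parse 'WORD  PH1 PH2 ...' lines into {WORD: [list of phoneme lists]}."""
    entries = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        entries.setdefault(parts[0].upper(), []).append(parts[1:])
    return entries


def missing_words(transcripts, dictionary):
    """Return the sorted unique words that have no dictionary entry."""
    missing = set()
    for text in transcripts:
        # Keep letters and apostrophes; all other punctuation is ignored.
        for word in re.findall(r"[A-Za-z']+", text):
            word = word.strip("'").upper()
            if word and word not in dictionary:
                missing.add(word)
    return sorted(missing)


# Sample entries for illustration only.
sample_dict = load_dictionary([
    "TWILIGHT  T W AY1 L AY2 T",
    "SPARKLE  S P AA1 R K AH0 L",
])
print(missing_words(["Twilight Sparkle, meet Starlight!"], sample_dict))
```

Each word is reported once no matter how often it appears, so clips where the pronunciation changes still have to be caught by ear.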
>>
>>35531564
>>35531899
Uh... Cookie >>35527858? Is this part of your augmented dataset, or should we be scared?

>>35525373
>>35535788
Updates:
[Done] Create utility function for dumping missing pronunciations and relevant audio files
[Done] Dump a list of missing pronunciations for all the extra data
[Done] Create pandas dataframe for dictionary items... maybe
[In progress] Load persona nerd's data. I'll create a generic Tacotron data loader for this.
[ ] Create Montreal Forced Aligner Inputs dataframe from audio data and dictionary data
[ ] Create utility class to dump MFA data to a folder
[ ] Create utility class to serialize/deserialize dataframes with protobuf
[ ] Try to get a programmatic interface to MFA
[ ] Create a wrapper around MFA to get alignments for individual characters
[ ] Run MFA to get pronunciations for the new data
[ ] Refactor test cases to work with the new flows
[ ] Create a new preprocessing notebook to show how to add new data
[ ] Create utility classes to simplify generic audio preprocessing
[ ] Add sample preproc flows for creating spectrograms, trimming audio files, adding phase information, and adding speech metrics
[ ] Add utility classes for creating new output formats
[ ] Add Tacotron-compatible output
>>
>>35535798
Something that I need to do is find a way to autosort by emotion into separate datasets, as otherwise high audio volume characters can sound dead inside
>>
>>35535877
https://github.com/navierula/mood-class
Found this, will try it tomorrow.
>>
https://1c559beb3dfd.ngrok.io/
>>
File: 1574803595801.jpg (28 KB, 450x450)
>>35532269
>that's because real sex noises are half laugh/cry anyway
>>
File: yes_please.png (142 KB, 166x200)
https://vocaroo.com/3PZAvz9we4A

I did it

I have achieved peak poner
>>
File: 1570943852444.gif (274 KB, 570x692)
Singing Derpy warms my heart.

https://vocaroo.com/IMp2PNYDOzo

The next question is, PPP sings when?
>>
File: chrysalis frenemies.png (528 KB, 1280x867)
>>35537217
https://vocaroo.com/bmaWw3kKZom

it has a lot of potential
>>
>>35537179
SHITPOSTING: THE MUSICAL!
>>
>>35535877
Cookie is using TorchMoji to get emotion embeddings. You may be able to use the same thing to get very accurate emotion labels.
https://github.com/huggingface/torchMoji
>>
File: chrysalis frenemies 2.png (486 KB, 743x939)
>>35537236
https://vocaroo.com/bvsNp7wjPbB
>>
>>35537179
>>35537217
>>35537236
>>35537292
This is beyond cute

https://vocaroo.com/gOLUo32ZCYM
>>
>>35535798
>or should we be scared?
you should be scared.
I added the singing data, but the singing data has separate start+end tokens AND separate speaker embeddings. I have no fucking idea how it's singing using normal inputs.
>>
https://vocaroo.com/fZb5MFa7pPY
>>
File: triggered milf.png (263 KB, 1773x1672)
getting defensive

https://vocaroo.com/kIijnzgkzGt
>>
>>35537377
>you should be scared.
>>35537480
>https://vocaroo.com/kIijnzgkzGt

Scare me some more, this is good stuff.
>>
>>35537377
Any closer to a training rundown for others to build ngrok tables?
>>
>>35527141
Where do I find the SFX and BGM you have done so far?
>>
>>35537530
https://mega.nz/#F!L952DI4Q!nibaVrvxbwgCgXMlPHVnVw
Music and SFX folder.
>>
>>35531575
SHE CAN SIIIIIIIIIING
>>
>>35537480
>DING DING IT'S THE THOT PATROL
>>
>>35537377
Maybe that one anon was right about us summoning literal demons.
>>
>>35537501
https://vocaroo.com/lbl7QNnSbTe

We'll make our own musical with blackjack and hookers.
>>
File: cozy glow cute.gif (699 KB, 352x352)
>with singy
https://vocaroo.com/hIykXd6Ou34

>without singy
https://vocaroo.com/1uno4fBxsLf
You can use it to make sentences flow much more naturally and make them sound less artificial
>>
File: 1590612657378.jpg (453 KB, 2044x1682)
epic
>>
https://voca.ro/cGiWQoXCjrd
>>
>>35537882
>light theme
epic x2
>>
File: derpy_shrug.jpg (34 KB, 400x372)
>>35537882
Sometimes I look at my posts and realize I could've made them better. I don't see what's wrong with deleting and reposting an improved post.
>>
>>35537916
You type and behave like you fell off the boat from the derpibooru threads.
>>
>>35537926
lmao I started posting on 4chan when you were probably around the age where you were actually the target demographic of the show
>>
>>35538001
If you want to have productive discussions here, you should pay attention to and respect the board/thread culture.
>>
File: after rain rarity.png (50 KB, 233x336)
Well hot damn, came here at the start, helped a tad and you autists really have made something amazing

https://vocaroo.com/jdIj1bXyhi1
>>
File: 319363.png (923 KB, 6000x5581)
>>35514666
>>35527141
I've done the SFX and BGM for s2e11 - 13.
Also loving all the new singing content, here's my attempt at some more drunk Celestia - https://clyp.it/biovni3n

>>35535788
I'll fix those when I re-organise the files later.
>>
>>35538142
Wow, eat a small pile of dicks.
>>
>>35537800
>https://vocaroo.com/lbl7QNnSbTe
This one is especially musical, so i turned it into this
https://vocaroo.com/176Q2UvoJcz
>>
New ngrok link for more singing.
https://95084bdfa90f.ngrok.io/

>>35537377
Do you think it's TorchMoji accidentally accessing data it's not supposed to? And you might wanna back up the models so we don't lose it in an update if we don't know how it happened.
>>
>>35538587
Thanks, I love it.
>>
File: laughing trixie.gif (2.74 MB, 858x482)
>>35538587
>>
>>35538411
>https://clyp.it/biovni3n
I like where this is going
>>
>>35538443
https://vocaroo.com/i7s4UG3gLPf
https://vocaroo.com/i7s4UG3gLPf
https://vocaroo.com/i7s4UG3gLPf
>>
clyp.it/v3hr1e2p?token=d2f682075c981cc93689b9b0f5b01287
>>
>>35538928
>glimmerniggers
>>
>>35537260
This doesn't seem like a good idea since it only works on the text, and not the audio.
>>
>>35539104
It only needs to be correct enough to strongly bias the network.
>>
https://7da63b91a91c.ngrok.io/
>>
File: 2140115.jpg (103 KB, 768x1024)
>>35537867
Has anyone ever figured out how to make the multispeaker voices (like Celestia) release excruciating screams of pain (like getting your leg broken by a sledgehammer kinda pain)?
>>
>>35539420
You again...
I haven't heard any clips do that, so I'm guessing not.
>>
>>35539420
Not that I know of right now, although you can probably do it by inserting ARPA input. This is something I got from the above ngrok link by spamming a lot of
>{AA1}{AA1}{AA1}{AA1}{AA1}{AA1}{AA1}{AA1}{AA1}{AA1}{AA1}{AA1}{AA1}
https://u.smutty.horse/lvupzhxyczg.wav
Just remember to go to advanced options and turn off dictionary (ARPAbet)
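
If you don't want to paste the same token by hand, something like this works as a sketch. The curly-brace token format is taken from the example above; the helper names are invented, and nothing here checks that a phoneme is valid ARPAbet.

```python
def arpa(*phonemes):
    """Format ARPAbet phonemes as curly-brace tokens, e.g. {AA1}."""
    return "".join("{" + p.upper() + "}" for p in phonemes)


def sustained(phoneme, count):
    """Repeat a single phoneme token `count` times to stretch a sound."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return arpa(*([phoneme] * count))


print(sustained("AA1", 13))
```

`arpa("HH", "AH0", "L", "OW1")` builds a whole word the same way, if you want full control over what's being said.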
>>
>>35539677

Would turning ARPAbet off make the model produce the natural sound of the consonant rather than saying the letter name?
>>
>>35539908
Maybe, but you're best off inserting your own ARPA to take full control of what they're saying.
>>
Ugh, I really would like to get my hands on that voice cloning tool...I have all the raw files of vocals only episodes and maybe I could train new models in almost no time...
>>
>>35540191
What do you mean?
>>
>>35540232

What I mean is, if someone could make a colab notebook out of this:

https://github.com/CorentinJ/Real-Time-Voice-Cloning

Anons and others could be making models so fast, that just about every character from the show would be available in 44K MMI if that's even possible.
>>
>>35540361
Is this bait?
>>
>>35537366
How did you make that "la-la-lay" bit at the end?
>>
>>35540428
>--la-lalala-lala-laa------------------------------------------------ laa---------------------------------------------------------------------------------------------------!!!?
>>
>>35540393
Something that caught my eye when browsing through the replies on 15's twitter
>>
https://voca.ro/dq6iBr9QJHl
So much to do---
>>
Cadence, NO.

https://vocaroo.com/nVUsbcCyo9d
>>
File: blaze.png (273 KB, 644x476)
Which "Blaze" is the third speaker from the top? Either way, I assume there's no audio of her singing, but the trick works all the same.

Singing: https://vocaroo.com/7hvaARC1VF7
Not singing: https://vocaroo.com/8ggY5FxgDUt
>>
>>35540462
>>
File: impressive_very_nice.gif (1.78 MB, 350x255)
>>35540530
These just keep getting better. If we could control pitch, it'd blow Vocaloid out of the water.
>>
Don't listen to her

https://vocaroo.com/oR6v0aYINOp
>>
File: TwiFace.jpg (515 KB, 3131x3000)
>>35540530
https://vocaroo.com/k4Vqlf5Zxca
Here have some classic rock, it's what I heard when I played that one

I don't think mere mortals were ever meant to hold this much power, how long until someone makes a full song with their waifu singing it?
>>
>record self pronouncing dialogue as it should sound
>type it out
>render
>horse voice says it correctly
would be magic
>>
>>35538587
moar!
>>
>>35541336
It's the newest song from One More Girl.
https://en.wikipedia.org/wiki/One_More_Girl
>>
File: 2204324.png (761 KB, 850x1200)
>finding out the models can sing
>listening to applejack sing this in my head
>crying immensely
https://www.youtube.com/watch?v=ctzoU8YWrrQ
>>
>>35541336
To me, it feels like something she'd sing in the shower.

https://vocaroo.com/l1DuhIKFHF2
>>
new ngrok
http://aa04b8373596.ngrok.io
>>
File: 1524234636620.png (145 KB, 274x274)
>>35535798
>>35537377
Lord almighty
>>
File: The Director.png (596 KB, 658x638)
>>35514666

>>35531390
>>35531564
>>35531653
>>35531899
>>35532125
>>35534950
>>35535798
>>35537377
One Step Closer Anons
>>
>>35540492
I like that the last little bit sounds so plaintive
>>
>>35531564
Graceful!
Keep improving it anons!
Waifu singing lullaby by the end of 2021!
>>
Wallflower :3

https://vocaroo.com/78152kbidKY
>>
File: Badonkershy.gif (39 KB, 297x253)
https://vocaroo.com/8DovLEanxLb
>>
File: maudd.png (197 KB, 800x450)
https://vocaroo.com/hup7cTEHTp7
>>
Wew, it's much more work than I thought. Kudos to all you guys contributing with quality content
https://voca.ro/ekNzmbH5VAp
>>
has anyone been able to open an ngrok link?
>>
>>35543052
Lurk more.
>>35541667
>>
>>35541667
goddamnit, I thought it went poof
>>
>>35543052
>>35543073
>>35543081
the ngrok overloaded, so starting a new one
https://29b9fcdc8508.ngrok.io/
>>
File: Cailou.jpg (32 KB, 286x281)
>>35541620
>>35542021
>https://vocaroo.com/78152kbidKY
>>35542796
>https://vocaroo.com/8DovLEanxLb
This discovery is exactly what the thread needed. Great work everyone, some real neat stuff is coming out of this.
>>
>>35531564
You've opened up a new world.
https://voca.ro/7mRy9AOXTkv
>>
https://vocaroo.com/h1t7YCJUB91
>>
File: 1740454.png (492 KB, 1108x1020)
>>35514666
>>35538411
I've done the SFX and BGM for s2e14 - 16.
>>
>>35537236
can you help my chryssy? she gets nervous and stutters, poor thing

https://vocaroo.com/1SLdkZpr3rs
>>
File: lamar.png (534 KB, 622x774)
https://voca.ro/3Jc4zci9sPz
>>
File: do it.png (302 KB, 800x601)
dump your clips, fags
https://vocaroo.com/42DM4ChfqYP
https://vocaroo.com/gtJvdUN9KzP
https://vocaroo.com/nI9gTnQNmlD
https://vocaroo.com/8yFvKgPb87c
https://vocaroo.com/RMhBoI2gvPg
https://vocaroo.com/nJIMZDv1z3o
https://vocaroo.com/dNjVVF5NT2p
https://vocaroo.com/2sSrmrdCnUD
https://vocaroo.com/kyWgObdxD4J
https://vocaroo.com/ckhlyF8p84B
https://vocaroo.com/6qP0ao80oKK
https://vocaroo.com/hHN6IJks5Ol
https://vocaroo.com/o0e41Ggq43N
https://vocaroo.com/4zzqcz7lzPe
https://vocaroo.com/mTCOljl5l86
>>
>>35543663
forgot the important one

https://vocaroo.com/7HhHBMnFuxe
>>
>>35543663
I've always been proud of this one
https://clyp.it/owdkos00
>>
NEW:
https://80a59cbd27c4.ngrok.io/tts
>>
>>35543841
not working
>>
>>35544011
https://80a59cbd27c4.ngrok.io
I forgot the link breaks if /tts is included in it, this one should work.
>>
>35544052
very no. on the other hand, i wonder what the voice source is for Fluffle Puff's gasping
>>
>>35543663
It's too much fun.
https://u.smutty.horse/lvuzoknhxlf.wav
https://u.smutty.horse/lvuzokgidqg.wav
>>
>>35543836
>>35544312
kek
>>
I can’t get the singing to work on most of the models, am I doing something wrong? I’ve just been adding — at the beginning and between certain words.
>>
>>35544352
— isn't --
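
Phones and word processors tend to auto-replace two hyphens with a single long dash, so it can help to run the text through a small cleanup before pasting it. A minimal sketch; the function name is made up, and mapping the en dash to a single hyphen is an assumption.

```python
# Map typographic dashes back to the plain ASCII hyphens the trick relies on.
DASH_REPLACEMENTS = {
    "\u2014": "--",  # em dash, usually produced from two hyphens
    "\u2013": "-",   # en dash (assumed to stand in for one hyphen)
}


def normalize_dashes(text):
    """Replace em/en dashes with ASCII hyphens."""
    for fancy, plain in DASH_REPLACEMENTS.items():
        text = text.replace(fancy, plain)
    return text


print(normalize_dashes("\u2014\u2014la la la\u2014\u2014!!"))
```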
>>
>>35544352
This >>35531564 explains all that there is to it, at least as we know so far. Getting it to actually work just seems to require a LOT of tweaking, trial and error. Also, some characters do work better than others for this, Celly seems to be the easiest to get results out of.
>>
Celestia floods the castle

https://vocaroo.com/nZinfiOGTpr
>>
File: wallflower.jpg (100 KB, 768x1024)
I think I'm in love

https://vocaroo.com/8PGxFVkqY31
>>
For anybody interested, I've finally got the Witcher 3 Geralt voice sorted.
22 wav:
https://mega.nz/file/R08lUbiJ#lOQZ2tqvdi_25aBKPnfn1tvvOXynCmWfEqQlRwtcTUE
48 wav:
https://mega.nz/file/J1t3yZrA#aTUf2ADbACbzevtFHeTRCQ89Tp8FbEQvq_JEAH1Xf8Y
Text for training and validation (ARPAbet already included):
https://mega.nz/file/d9kh3BLC#AlOBucXkmbiDvEjRGEpso0Ef89eoSeys63mqdo3eqlE

I'm just leaving those links here in case anybody else would like to train this voice model.
>>
>>35546068
Interested anon here. Thanks for sharing.
>>
>>35546140
Oh yeah, it's five and a half hours long. The original compiled files were almost 6~7 hours long, but I threw away any clip under 4 seconds to encourage the model to learn proper-length sentence structure.
BTW, I didn't bother to check if there are any audio clips/edits mixed in with it, but it shouldn't be that much of a problem, right?
>>
>>35546183
I checked a few scattered clips, and the audio seems fine.
>I threw away any clip under 4 seconds to encourage the model to learn proper-length sentence structure
That kind of filtering is very easy to do in the training scripts, so you don't need to do them yourself. The algorithm also implicitly adjusts how it generates clips based on clip length, so I don't think short clips should cause problems as long as they're long enough to not trip up training performance. In practice, I think clips around 0.7 seconds and shorter cause problems. It's better to keep the dataset as complete as possible.
That said, 6-7 hours should be plenty of data to work with.
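
As a sketch of what that filtering can look like if you do want it outside the training scripts: this uses only the Python standard library and assumes plain PCM WAV files. The function names are hypothetical, and the 0.7 second default is just the rough figure mentioned above, not a tested cutoff.

```python
import wave
from pathlib import Path


def clip_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def usable_clips(folder, min_seconds=0.7):
    """Yield WAV paths in `folder` longer than `min_seconds`, sorted by name."""
    for path in sorted(Path(folder).glob("*.wav")):
        if clip_seconds(path) > min_seconds:
            yield path
```

Calling `list(usable_clips("wavs"))` on a folder of clips gives the files worth listing in a training filelist, and the originals stay untouched so the dataset remains complete.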
>>
NEW:
https://053f2d45fa8f.ngrok.io/
>>
>>35541336
Does anyone have ideas on how we can get a database of common chord progressions played at various tempos, with various instruments, in various styles?
>>
>>35546469
do you seriously need help knowing what singing sounds like? am I just that naturally musical?
>>
FastSpeech2 is really awesome, I trained a model with my Rainbow Dash Presents dataset which I previously gave up on with Tacotron2 due to resultant models being plagued with alignment issues, and it performs extremely well.
https://u.smutty.horse/lvvcmdhswwc.wav
https://u.smutty.horse/lvvcmdofcyt.wav
>>
>>35546574
I'm not asking for a transcription of chord progressions in songs, I'm asking for an extension of Clipper's BGM and SFX work to include music we can just add to generated clips. Like, generate a clip, figure out what tempo it's in, find a chord progression that would work with it, paste it in.
>>
>>35546617
oh that's actually super smart. I rescind my condescension
>>
>>35546613
Is there a training script?
>>
>>35546633
>>35520702
Synthesis notebook >>35522980, although the model it's running was my first ever one before I realized just how fast it overfits so it's bad. I plan to add the enhancements I did to my private version tomorrow.
>>
>>35546650
cool, thanks for your work. hope to try it out with my data tomorrow
>>
>>35546659
If your data is 1 hour or less it might start overfitting after about 500 to 1k steps with the n_warmup_steps hparam set to 2000. I'll write a comprehensive guide sometime.
>>
File: cadence wheeze.png (171 KB, 1000x820)
>>35543472
>https://vocaroo.com/bmaWw3kKZom

she sounds like she has brain damage
>>
>>35546802
linked the wrong file
>https://vocaroo.com/1SLdkZpr3rs
>>
File: pusspuss (248).png (475 KB, 900x900)
>>35546444
rip?
>>
File: 1593174666890.jpg (218 KB, 2007x1365)
>>35514666
This is a google drive for all of the good Pony AI content that you guys make using 15.ai
https://drive.google.com/drive/folders/1E21zJQWC5XVQWy2mt42bUiJ_XbqTJXCp?usp=sharing

The content that resides there is entirely dependent on anons that lurk these threads and see these links to add saved content to the drive.

You Can Upload Content Here
https://drive.google.com/drive/folders/1ghKZKsOvBoI8KnDgDdLOQrUB2aon0Xod?usp=sharing
>>
>>35547097
It's up to someone else, I'm at my GPU usage limit again
>>
https://135fb2aeb16c.ngrok.io/
>>
b
>>
I've noticed an issue with the resources section in the Google Doc: some of the Mega links for the downloads of the 5.1 Netflix audio don't have anything in the folders. Could whoever did those uploads please double-check the folders and re-upload?
S2 - https://mega.nz/folder/p1kTyaIK#bTia7IpcRrWFkFkkCwnZJA
S4 - https://mega.nz/folder/0xUjUQyK#-bbdcJQHDRIZDAf-7SnDxw
S5 - https://mega.nz/folder/9xUzVCwS#MXAaSROO3dxbT2lYkhlM6A
>>
New FastSpeech2 synthesis notebook, with RDP and Celestia models:
https://colab.research.google.com/drive/1GvWSYXzJ7CQLp5FGseUgNU9ujGfKKymL?usp=sharing
>>
>>35549042
Working on it. Will update the doc as I have the links ready.
>>
>>35531390
Based and cunnypilled.
>>
File: 828074.png (268 KB, 900x553)
>>35514666
>>35543453
I've done the SFX and BGM for s2e17 - 19. I've also done the SFX for s2e20, but I can't do any of the music for 20 - 26 until the 5.1 audio's available to download. I'll keep working on the SFX for 21 - 26 in the meantime.
>>
>>35550393
Season 2 should already be fixed in the doc. Others coming soon.
>>
>>35514662
So, anything interesting to report?
>>
>>35550450
Ngrok can sing apparently.
>>
Speaking of, can anyone host a new Ngrok?
Did anyone try to use Rome's reverse proxy instead of Ngrok?
>>35526121


