Fine-tune Coqui Cantonese TTS

By admin, 28 October, 2024

This is a continuation of Coqui Cantonese Development Notes.

2024-10-30

   --> TIME: 2024-10-29 13:49:43 -- STEP: 129/157 -- GLOBAL_STEP: 68425
    | > decoder_loss: 0.4340108335018158  (0.41140215849691586)
    | > postnet_loss: 0.40669307112693787  (0.39625994965087535)
    | > stopnet_loss: 0.2480345070362091  (0.3161741283743881)
    | > decoder_coarse_loss: 0.42141956090927124  (0.3995050713997479)
    | > decoder_ddc_loss: 0.004036255180835724  (0.0066533084226006916)
    | > ga_loss: 0.00028112504514865577  (0.0008964127673254916)
    | > decoder_diff_spec_loss: 0.17114706337451935  (0.16164369188075847)
    | > postnet_diff_spec_loss: 0.19191323220729828  (0.18958402153595474)
    | > decoder_ssim_loss: 0.6188733577728271  (0.6018433372179665)
    | > postnet_ssim_loss: 0.6118573546409607  (0.6032356453496356)
    | > loss: 1.179294466972351  (1.2175781227821527)
    | > align_error: 0.722508043050766  (0.7450022502231969)
    | > amp_scaler: 1024.0  (1024.0)
    | > grad_norm: tensor(1.0017, device='cuda:0')  (tensor(0.9195, device='cuda:0'))
    | > current_lr: 1.275e-06 
    | > step_time: 2.3045  (0.6297789155974869)
    | > loader_time: 2.3656  (0.14574224265046823)
! Run is kept in /gemini/code/TTS/recipes/mdcc/tacotron2-DDC/mdcc-ddc-October-29-2024_10+40AM-7dc2f6fd
Traceback (most recent call last):
 File "/root/miniconda3/lib/python3.11/site-packages/trainer/trainer.py", line 1833, in fit
   self._fit()
 File "/root/miniconda3/lib/python3.11/site-packages/trainer/trainer.py", line 1785, in _fit
   self.train_epoch()
 File "/root/miniconda3/lib/python3.11/site-packages/trainer/trainer.py", line 1504, in train_epoch
   outputs, _ = self.train_step(batch, batch_num_steps, cur_step, loader_start_time)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/root/miniconda3/lib/python3.11/site-packages/trainer/trainer.py", line 1360, in train_step
   outputs, loss_dict_new, step_time = self.optimize(
                                       ^^^^^^^^^^^^^^
 File "/root/miniconda3/lib/python3.11/site-packages/trainer/trainer.py", line 1226, in optimize
   outputs, loss_dict = self._compute_loss(
                        ^^^^^^^^^^^^^^^^^^^
 File "/root/miniconda3/lib/python3.11/site-packages/trainer/trainer.py", line 1157, in _compute_loss
   outputs, loss_dict = self._model_train_step(batch, model, criterion)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/root/miniconda3/lib/python3.11/site-packages/trainer/trainer.py", line 1116, in _model_train_step
   return model.train_step(*input_args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/gemini/code/TTS/TTS/tts/models/tacotron2.py", line 338, in train_step
   loss_dict = criterion(
               ^^^^^^^^^^
 File "/root/miniconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
   return self._call_impl(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/root/miniconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
   return forward_call(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/gemini/code/TTS/TTS/tts/layers/losses.py", line 511, in forward
   decoder_ssim_loss = self.criterion_ssim(decoder_output, mel_input, output_lens)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/root/miniconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
   return self._call_impl(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/root/miniconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
   return forward_call(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/gemini/code/TTS/TTS/tts/layers/losses.py", line 148, in forward
   assert not torch.isnan(y_hat_norm).any(), "y_hat_norm contains NaNs"
AssertionError: y_hat_norm contains NaNs
real    191m19.757s
user    389m26.424s
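
To see how the metrics moved in the steps before the assertion fired, the trainer output can be parsed into per-step records. The following is only a minimal sketch using the Python standard library. It assumes the log lines have exactly the shape shown above ("GLOBAL_STEP: <n>" on the step line, "| > name: value  (average)" on metric lines). The function names parse_training_log and non_finite_metrics are made up for this note and are not part of Coqui TTS or the trainer package.

```python
import math
import re

# Assumed line shapes, copied from the log above:
#   "--> TIME: ... -- STEP: 129/157 -- GLOBAL_STEP: 68425"
#   "| > decoder_loss: 0.4340108335018158  (0.41140215849691586)"
STEP_RE = re.compile(r"GLOBAL_STEP:\s*(\d+)")
METRIC_RE = re.compile(r"\|\s*>\s*(\w+):\s*(\S+)")


def parse_training_log(lines):
    """Group per-step metrics into a list of {"global_step", "metrics"} dicts."""
    records = []
    for line in lines:
        step = STEP_RE.search(line)
        if step:
            records.append({"global_step": int(step.group(1)), "metrics": {}})
            continue
        metric = METRIC_RE.search(line)
        if metric and records:
            try:
                value = float(metric.group(2))
            except ValueError:
                # e.g. grad_norm is printed as "tensor(1.0017, ...)"; skip it.
                continue
            records[-1]["metrics"][metric.group(1)] = value
    return records


def non_finite_metrics(records):
    """Return (global_step, name, value) for every nan or inf metric."""
    return [
        (rec["global_step"], name, value)
        for rec in records
        for name, value in rec["metrics"].items()
        if not math.isfinite(value)
    ]
```

Assuming the training output was saved with "tee out" as in the init steps, it could be used as: with open("out") as f: records = parse_training_log(f). In the excerpt above every logged value is still finite at GLOBAL_STEP 68425, so non_finite_metrics may well return an empty list; the per-step records are then mainly useful for looking at the trend of loss and align_error leading up to the crash.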

 

New init steps:

  1. Clone the code: git clone https://github.com/hgneng/TTS.git
  2. Install dependencies: cd TTS && pip install -e .[all,dev]
  3. Extract the dataset: cd /gemini/data-1/ && time tar xvf mdcc-dataset.tar -C /tmp/
  4. Copy train.csv into the dataset: cp /gemini/code/TTS/recipes/mdcc/tacotron2-DDC/train.csv /tmp/mdcc-dataset/
  5. Copy valid.csv into the dataset: cp /gemini/code/TTS/recipes/mdcc/tacotron2-DDC/valid.csv /tmp/mdcc-dataset/
  6. Link the dataset: cd /gemini/code/TTS/recipes/mdcc/tacotron2-DDC/ && ln -sf /tmp/mdcc-dataset
  7. Run training: cd /gemini/code/TTS && time bash recipes/mdcc/tacotron2-DDC/run.sh | tee out

Model test command:

~/code/hgneng/TTS$ TTS/bin/synthesize.py --text "ngo5 wui2 syut3 jyut6 jyu5" --config_path recipes/mdcc/tacotron2-DDC/tacotron2-DDC.json --model_path recipes/mdcc/tacotron2-DDC/model_10000_411.pth --out_path ./demo.wav
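
When testing several checkpoints or sentences, it can be handy to build the same command from Python. This is a rough sketch only: the flags and paths are the ones from the command above, while the helper name build_synthesize_cmd is invented for this note. It also assumes the script is executable and is run from the TTS checkout, as in the shell command.

```python
import shlex


def build_synthesize_cmd(text, config_path, model_path, out_path,
                         script="TTS/bin/synthesize.py"):
    """Return the argv list for one synthesis run, mirroring the command above."""
    return [
        script,
        "--text", text,
        "--config_path", config_path,
        "--model_path", model_path,
        "--out_path", out_path,
    ]


cmd = build_synthesize_cmd(
    "ngo5 wui2 syut3 jyut6 jyu5",
    "recipes/mdcc/tacotron2-DDC/tacotron2-DDC.json",
    "recipes/mdcc/tacotron2-DDC/model_10000_411.pth",
    "./demo.wav",
)
# Print a copy-pasteable shell form of the command (the text gets quoted).
print(shlex.join(cmd))
```

To actually run it, pass the list to subprocess.run(cmd, check=True). Passing a list rather than a single string keeps the Jyutping text, which contains spaces, as one argument without manual quoting.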

 
