Related page: https://cto.eguidedog.net/node/1407
2025-6-10
TODO: test_sentences has not been updated yet.
--> TIME: 2025-06-10 12:23:11 -- STEP: 2025/2035 -- GLOBAL_STEP: 2025
| > loss: 2.2471892833709717 (2.505195030148514)
| > log_mle: 0.7013579607009888 (0.7612298656042398)
| > loss_dur: 1.545831322669983 (1.7439651638343387)
| > amp_scaler: 16384.0 (16400.262034739455)
| > grad_norm: tensor(5.0095, device='cuda:0') (tensor(6.8441, device='cuda:0'))
| > current_lr: 2.5e-07
| > step_time: 1.6003 (0.38814580022552847)
| > loader_time: 0.0039 (0.06788439256173602)
> EVALUATION
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: yue-cn
| > phoneme backend: yue_cn_phonemizer
| > Number of instances : 5663
| > Preprocessing samples
| > Max text length: 57
| > Min text length: 1
| > Avg text length: 12.11566307610807
|
| > Max audio length: 237120
| > Min audio length: 4480
| > Avg audio length: 51397.32473953735
| > Num. instances discarded samples: 0
| > Batch group size: 0.
! Run is removed from /gemini/code/TTS/recipes/mdcc/glow_tts/run-June-10-2025_12+07PM-4a6359e8
Traceback (most recent call last):
File "/root/miniconda3/lib/python3.11/site-packages/trainer/trainer.py", line 1833, in fit
self._fit()
File "/root/miniconda3/lib/python3.11/site-packages/trainer/trainer.py", line 1787, in _fit
self.eval_epoch()
File "/root/miniconda3/lib/python3.11/site-packages/trainer/trainer.py", line 1643, in eval_epoch
for cur_step, batch in enumerate(self.eval_loader):
File "/root/miniconda3/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
data = self._next_data()
^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1346, in _next_data
return self._process_data(data)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1372, in _process_data
data.reraise()
File "/root/miniconda3/lib/python3.11/site-packages/torch/_utils.py", line 722, in reraise
raise exception
AssertionError: Caught AssertionError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/gemini/code/TTS/TTS/tts/datasets/dataset.py", line 638, in compute_or_load
ids = np.load(cache_path)
^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.11/site-packages/numpy/lib/npyio.py", line 427, in load
fid = stack.enter_context(open(os_fspath(file), "rb"))
^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/gemini/code/TTS/recipes/mdcc/glow_tts/phoneme_cache/bWRjYyNhdWRpby80NDdfMjAxMjIxMTYyNF83MTczOV81MTMuOTZfNTE0LjI0_phoneme.npy'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/root/miniconda3/lib/python3.11/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
data = fetcher.fetch(index)
^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
~~~~~~~~~~~~^^^^^
File "/gemini/code/TTS/TTS/tts/datasets/dataset.py", line 212, in __getitem__
return self.load_data(idx)
^^^^^^^^^^^^^^^^^^^
File "/gemini/code/TTS/TTS/tts/datasets/dataset.py", line 268, in load_data
token_ids = self.get_token_ids(idx, item["text"])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/gemini/code/TTS/TTS/tts/datasets/dataset.py", line 251, in get_token_ids
token_ids = self.get_phonemes(idx, text)["token_ids"]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/gemini/code/TTS/TTS/tts/datasets/dataset.py", line 228, in get_phonemes
out_dict = self.phoneme_dataset[idx]
~~~~~~~~~~~~~~~~~~~~^^^^^
File "/gemini/code/TTS/TTS/tts/datasets/dataset.py", line 613, in __getitem__
ids = self.compute_or_load(string2filename(item["audio_unique_name"]), item["text"], item["language"])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/gemini/code/TTS/TTS/tts/datasets/dataset.py", line 640, in compute_or_load
ids = self.tokenizer.text_to_ids(text, language=language)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/gemini/code/TTS/TTS/tts/utils/text/tokenizer.py", line 110, in text_to_ids
text = self.phonemizer.phonemize(text, separator="", language=language)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/gemini/code/TTS/TTS/tts/utils/text/phonemizers/base.py", line 132, in phonemize
p = self._phonemize(t, separator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/gemini/code/TTS/TTS/tts/utils/text/phonemizers/yue_cn_phonemizer.py", line 40, in _phonemize
return self.phonemize_yue_cn(text, separator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/gemini/code/TTS/TTS/tts/utils/text/phonemizers/yue_cn_phonemizer.py", line 36, in phonemize_yue_cn
ph = cantonese_text_to_phonemes(text, separator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/gemini/code/TTS/TTS/tts/utils/text/cantonese/phonemizer.py", line 26, in cantonese_text_to_phonemes
jyutpings = pycantonese.characters_to_jyutping(text)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.11/site-packages/pycantonese/jyutping/characters.py", line 101, in characters_to_jyutping
words_to_jyutping, chars_to_jyutping = _get_words_characters_to_jyutping()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.11/site-packages/pycantonese/jyutping/characters.py", line 14, in _get_words_characters_to_jyutping
corpus = hkcancor()
^^^^^^^^^^
File "/root/miniconda3/lib/python3.11/site-packages/pycantonese/corpus.py", line 396, in hkcancor
reader = _HKCanCorReader.from_dir(data_dir)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.11/site-packages/pylangacq/chat.py", line 187, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.11/site-packages/pylangacq/chat.py", line 1057, in from_dir
return cls.from_files(
^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.11/site-packages/pylangacq/chat.py", line 187, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.11/site-packages/pylangacq/chat.py", line 1009, in from_files
return cls.from_strs(strs, paths, parallel=parallel)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.11/site-packages/pylangacq/chat.py", line 187, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.11/site-packages/pylangacq/chat.py", line 970, in from_strs
reader._parse_chat_strs(strs, ids, parallel)
File "/root/miniconda3/lib/python3.11/site-packages/pylangacq/chat.py", line 255, in _parse_chat_strs
executor.map(self._parse_chat_str, strs, file_paths)
File "/root/miniconda3/lib/python3.11/concurrent/futures/process.py", line 837, in map
results = super().map(partial(_process_chunk, fn),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.11/concurrent/futures/_base.py", line 608, in map
fs = [self.submit(fn, *args) for args in zip(*iterables)]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.11/concurrent/futures/_base.py", line 608, in <listcomp>
fs = [self.submit(fn, *args) for args in zip(*iterables)]
^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.11/concurrent/futures/process.py", line 809, in submit
self._start_executor_manager_thread()
File "/root/miniconda3/lib/python3.11/concurrent/futures/process.py", line 748, in _start_executor_manager_thread
self._launch_processes()
File "/root/miniconda3/lib/python3.11/concurrent/futures/process.py", line 775, in _launch_processes
self._spawn_process()
File "/root/miniconda3/lib/python3.11/concurrent/futures/process.py", line 785, in _spawn_process
p.start()
File "/root/miniconda3/lib/python3.11/multiprocessing/process.py", line 118, in start
assert not _current_process._config.get('daemon'), \
AssertionError: daemonic processes are not allowed to have children
real 17m42.845s
user 39m16.468s
sys 36m43.403s
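Reading the traceback: the phoneme cache file was missing, so the DataLoader worker fell back to phonemizing on the fly, and pycantonese then tried to start a process pool from inside that worker. DataLoader workers are daemonic, and Python refuses to let a daemonic process start children. A minimal sketch that reproduces the same assertion with only the standard library follows. It assumes a POSIX system where the "fork" start method is available, and the function names are made up for this illustration.

```python
import multiprocessing as mp

# Assumption: POSIX system with the "fork" start method available.
ctx = mp.get_context("fork")


def _noop():
    """Child target that does nothing."""


def _try_to_spawn(conn):
    """Runs inside a daemonic process and tries to start a child of its own."""
    try:
        child = ctx.Process(target=_noop)
        child.start()  # raises AssertionError when the caller is daemonic
        child.join()
        conn.send("child started")
    except AssertionError as exc:
        conn.send(str(exc))
    finally:
        conn.close()


def spawn_from_daemon():
    """Start a daemonic worker and return what happened when it tried to spawn."""
    receiver, sender = ctx.Pipe(duplex=False)
    worker = ctx.Process(target=_try_to_spawn, args=(sender,), daemon=True)
    worker.start()
    sender.close()  # the parent keeps only the receiving end
    message = receiver.recv()
    worker.join()
    return message


if __name__ == "__main__":
    print(spawn_from_daemon())
```

Possible workarounds, not verified here: make sure the phoneme cache is complete before evaluation starts so that no worker has to phonemize, or run the eval loader without worker processes (in Coqui TTS configs this should be `num_eval_loader_workers=0`).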
New init steps:
- clone code: git clone https://github.com/hgneng/TTS.git
- change pip index: pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
- install deps: cd TTS && pip install -e .[all,dev]
- extract phoneme cache: cd /tmp && tar xJvf /gemini/code/TTS/recipes/mdcc/glow_tts/phoneme_cache.tar.xz
- run training: cd /gemini/code/TTS/recipes/mdcc/glow_tts && time TRAINER_TELEMETRY=0 python train_cantonese.py 2>&1 | tee out
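To find out which sample a missing cache file belongs to, the file name from the FileNotFoundError in the traceback can be decoded. The sketch below assumes the part before `_phoneme.npy` is URL-safe base64 of the sample's unique name. That is a guess based on the name decoding cleanly, not something confirmed from the TTS source, and `decode_cache_name` is a helper name invented here.

```python
import base64
from pathlib import Path

CACHE_SUFFIX = "_phoneme.npy"


def decode_cache_name(cache_file: str) -> str:
    """Recover the sample name from a phoneme cache file name.

    Assumption: the file name is urlsafe-base64(sample name) + "_phoneme.npy".
    """
    name = Path(cache_file).name
    if not name.endswith(CACHE_SUFFIX):
        raise ValueError(f"not a phoneme cache file: {name}")
    encoded = name[: -len(CACHE_SUFFIX)]
    # Restore any '=' padding that may have been stripped from the name.
    encoded += "=" * (-len(encoded) % 4)
    return base64.urlsafe_b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    missing = (
        "/gemini/code/TTS/recipes/mdcc/glow_tts/phoneme_cache/"
        "bWRjYyNhdWRpby80NDdfMjAxMjIxMTYyNF83MTczOV81MTMuOTZfNTE0LjI0_phoneme.npy"
    )
    print(decode_cache_name(missing))
```

For the file in the traceback this prints mdcc#audio/447_2012211624_71739_513.96_514.24, which points at one clip of the mdcc dataset.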
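After training finishes, link the latest checkpoint.pth to best_model.pth; otherwise the next training run cannot resume from where this one stopped.
Copy the event files to /gemini/output/ to see the training trends in TensorBoard. Do not create symbolic links there: there seems to be a bug that deletes the directory and terminates training.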
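The two housekeeping steps can be scripted. This is only a sketch under assumptions: checkpoints are taken to match `checkpoint_*.pth`, event files to match `events.out.tfevents.*`, "link" is read as a relative symbolic link for the checkpoint, and both function names are invented here. Event files are copied, never linked.

```python
import shutil
from pathlib import Path


def link_latest_checkpoint(run_dir, pattern="checkpoint_*.pth",
                           target_name="best_model.pth"):
    """Point best_model.pth at the most recently modified checkpoint."""
    run_dir = Path(run_dir)
    checkpoints = sorted(run_dir.glob(pattern), key=lambda p: p.stat().st_mtime)
    if not checkpoints:
        raise FileNotFoundError(f"no files matching {pattern} in {run_dir}")
    latest = checkpoints[-1]
    target = run_dir / target_name
    if target.is_symlink():
        target.unlink()  # drop a stale link from a previous run
    elif target.exists():
        # Keep a real best_model.pth as a backup instead of deleting it.
        target.rename(target.with_suffix(".pth.bak"))
    target.symlink_to(latest.name)  # relative link inside the run directory
    return latest


def copy_event_files(run_dir, output_dir):
    """Copy TensorBoard event files into output_dir (real copies, no links)."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for event_file in sorted(Path(run_dir).glob("events.out.tfevents.*")):
        copied.append(Path(shutil.copy2(event_file, output_dir / event_file.name)))
    return copied
```

Usage would be something like `link_latest_checkpoint(run_dir)` followed by `copy_event_files(run_dir, "/gemini/output")`, with `run_dir` set to the run directory created by the trainer.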
Model test command:
~/code/hgneng/TTS/recipes/mdcc/glow_tts $ python test.py