Description:
With the model vsr_trlrwlrs2lrs3vox2avsp_base.pth, results on the LRS3 dataset are significantly worse than expected. With the data processed the same way as before, the model's word error rate (WER) on LRS3 comes out at around 23-24%. The final evaluation output is:
================ Final Result ================
Total samples found : 1321
Valid samples : 1321
Skipped samples : 0
Total words (N) : 9890
Substitutions (S) : 1667
Deletions (D) : 407
Insertions (I) : 214
WER : 0.231345
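As a sanity check on the numbers above, here is a minimal sketch that recomputes the WER from the reported counts, assuming the evaluation script uses the standard definition WER = (S + D + I) / N. The function name `word_error_rate` is hypothetical and is not taken from the evaluation code.

```python
def word_error_rate(substitutions: int, deletions: int, insertions: int, total_words: int) -> float:
    """Standard WER: total edit operations divided by the number of reference words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return (substitutions + deletions + insertions) / total_words


# Counts copied from the final result above.
wer = word_error_rate(substitutions=1667, deletions=407, insertions=214, total_words=9890)
print(f"WER : {wer:.6f}")  # WER : 0.231345
```

The recomputed value matches the reported one, so the counts and the final WER are internally consistent. The gap would then come from the hypotheses or references themselves rather than from the WER arithmetic.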
Steps to Reproduce:
1. Use the model vsr_trlrwlrs2lrs3vox2avsp_base.pth.
2. Process the same data in the same way as in the previous steps.
3. Evaluate the model on the LRS3 dataset.
Has anyone encountered a similar situation?