Contributing
hexgrad edited this page Mar 21, 2025 · 1 revision
Although misaki is a powerful G2P toolkit, there is a lot of room for improvement. PRs are welcome.
AI-generated code is acceptable if and only if it has been tested.
- To add G2P for a new language, define a callable Python class that implements `__init__` and `__call__`.
  - `__init__` should set up heavy resources such as dictionaries.
  - `__call__` takes a string of graphemes and returns a pair: a string of phonemes (required) and a sequence of `MToken` (optional).
  - Refer to the README to see how first-class G2P solutions have been implemented in other languages.
- Because each language gets its own Python `G2P` class and language-specific dependencies, you can freely develop in your specific language without impacting the functionality of other languages. As a result, if you are the sole maintainer for your language, PRs will likely be rubber-stamp approved.
- Some languages (English, Japanese, Chinese) implement token-level alignments, but many still do not. Token-level alignment means that, instead of just string-in-string-out, the `G2P` class also returns a sequence of `MToken` on call.
  - This enables quality-of-life improvements often associated with TTS, such as smarter chunking, auto-scrolling, and word highlighting.
- Unified G2P interface.
- English tokenizer can probably be made much faster and more memory efficient.
- Add unit tests.
- Python allows for fast development, but porting to other languages may enable easier deployment to edge devices.
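
The callable-class interface described in the first bullets can be sketched roughly as follows. This is a toy illustration, not misaki's actual implementation: `ToyG2P`, `ToyToken`, the two-word lexicon, and the phoneme strings are all hypothetical, and `ToyToken` only stands in for `MToken`, whose real fields may differ.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ToyToken:
    """Hypothetical stand-in for MToken; the real class may have other fields."""
    text: str        # original graphemes for this token
    phonemes: str    # phonemes aligned to this token
    whitespace: str  # whitespace that follows the token, if any


class ToyG2P:
    """Illustrative callable G2P class for a made-up language."""

    def __init__(self, unk: str = "?") -> None:
        # Heavy resources are set up once here. A real implementation
        # would load a full pronunciation dictionary or a model.
        self.lexicon: Dict[str, str] = {
            "hello": "həlˈoʊ",  # illustrative entries only
            "world": "wˈɜːld",
        }
        self.unk = unk

    def __call__(self, text: str) -> Tuple[str, List[ToyToken]]:
        words = text.split()
        tokens: List[ToyToken] = []
        for i, word in enumerate(words):
            # Look up each word; fall back to the unknown marker.
            ps = self.lexicon.get(word.lower(), self.unk)
            ws = " " if i < len(words) - 1 else ""
            tokens.append(ToyToken(text=word, phonemes=ps, whitespace=ws))
        # The phoneme string is required; the token sequence is the
        # optional token-level alignment.
        phonemes = "".join(t.phonemes + t.whitespace for t in tokens)
        return phonemes, tokens
```

Calling `ToyG2P()("Hello world")` returns the joined phoneme string together with one token per word. Because each token keeps its own graphemes and phonemes, a caller could use the sequence for features such as word highlighting or chunking at token boundaries.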
…to be continued…