A Python script to extract sentences, audio, and translations from 50Languages and generate Anki flashcards from them for language learning.
- Install dependencies:
    pip install -r requirements.txt
- Run the script using something like:
    python fiftylangs2anki.py --src en --dest tr --start 5 --end 10
- You should see a deck package named 50Languages_en-tr_5-10.apkg in the current working directory. (You can also provide a different output file using the --out option.)
The --src and --dest flags take the codes of the source and destination languages. These are the languages you choose in the 50Languages interface,
and their codes are shown in the URL of each lesson.
The --start and --end flags specify the range of lessons to download. By default,
all lessons will be downloaded (from 1 to 100).
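To make the flag behaviour above concrete, here is a minimal sketch of how the options could be parsed and how the default output name could be derived. It is not the script's actual code: the flag names, the 1 to 100 default range, and the file name pattern come from this README, while the function names (`build_parser`, `default_output_name`, `resolve_options`) and the range check are assumptions made for illustration.

```python
import argparse
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    # Flag names follow the README; the help texts are illustrative.
    parser = argparse.ArgumentParser(prog="fiftylangs2anki.py")
    parser.add_argument("--src", required=True, help="source language code, e.g. en")
    parser.add_argument("--dest", required=True, help="destination language code, e.g. tr")
    parser.add_argument("--start", type=int, default=1, help="first lesson to download")
    parser.add_argument("--end", type=int, default=100, help="last lesson to download")
    parser.add_argument("--out", default=None, help="output .apkg path")
    return parser


def default_output_name(src: str, dest: str, start: int, end: int) -> str:
    # Mirrors the file name pattern shown in the usage example.
    return f"50Languages_{src}-{dest}_{start}-{end}.apkg"


def resolve_options(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    # Assumed validation: the README only states that lessons run from 1 to 100.
    if not 1 <= args.start <= args.end <= 100:
        raise SystemExit("--start and --end must satisfy 1 <= start <= end <= 100")
    if args.out is None:
        args.out = default_output_name(args.src, args.dest, args.start, args.end)
    return args
```

Calling `resolve_options(["--src", "en", "--dest", "tr", "--start", "5", "--end", "10"])` yields an `out` value of `50Languages_en-tr_5-10.apkg`, matching the example above.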
The generated Anki notes use a notetype consisting of front-back and back-front card types. More options to customize this may be added in the future.
If you want to generate an updated deck and import it into Anki again,
the new notes must use a notetype with the same ID as the existing ones; otherwise the import creates duplicates.
By default, this script generates a random notetype ID on each run, which is undesirable in this situation.
The solution is to pass the --model-id option on the second run, giving it the notetype ID of the notes
generated in the first run. After importing a deck generated by this script into Anki,
you can find that notetype ID by running the following code in the Debug Console:
pp(mw.col.models.by_name("50Languages_en-tr_1-100")['id'])
(Change "50Languages_en-tr_1-100" to the name of your notetype.)
This will give you the ID, which you can pass to the script like:
python fiftylangs2anki.py --src en --dest tr --model-id 1409094762
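The choice between a random and a fixed notetype ID can be sketched as follows. This is only an illustration of the idea, not the script's implementation: `resolve_model_id` is a hypothetical helper, and the random range used here is an assumption.

```python
import random
from typing import Optional


def resolve_model_id(cli_model_id: Optional[int] = None) -> int:
    """Return the notetype ID to use for this run."""
    if cli_model_id is not None:
        # Reusing the ID from a previous run lets Anki match the existing
        # notetype instead of creating a new one on import.
        return cli_model_id
    # Assumed range for a fresh random ID; the real script may pick differently.
    return random.randrange(1 << 30, 1 << 31)
```

With this shape, `resolve_model_id(1409094762)` always returns the ID passed on the command line, while `resolve_model_id()` returns a new value on every run.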
All materials downloaded from 50Languages are cached under the cache directory for re-use in subsequent
invocations of the script that involve the same source or destination language.
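The caching behaviour described above follows a common read-through pattern, sketched below. The helper name `cached_fetch`, the `fetch` callback, and the cache key layout are all hypothetical; the script's real cache structure may differ.

```python
from pathlib import Path
from typing import Callable


def cached_fetch(cache_dir: Path, key: str, fetch: Callable[[], bytes]) -> bytes:
    """Return the cached bytes for `key`, calling `fetch` only on a cache miss."""
    path = cache_dir / key
    if path.exists():
        # Cache hit: reuse the file saved by an earlier run.
        return path.read_bytes()
    data = fetch()
    # Create intermediate directories so keys may contain subfolders.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return data
```

Passing the download step in as a callback keeps the helper independent of any particular HTTP library, so a second call with the same key never touches the network.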
I have uploaded some decks generated by the script to AnkiWeb: https://ankiweb.net/shared/by-author/1493917421
- 50Languages's content is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 license (CC BY-NC-ND 3.0). See https://www.50languages.com/licence.php
