notes - temp

18 Apr

Ok I've tried to get xdg desktop portals to work with this. I think it's fundamentally too limiting. Summary:

if we use a Screenshot portal
- it always makes a new file in the pictures folder
- we can't select the window to screenshot by default
- cumbersome for the user
if we use a ScreenCast portal
- We could actually, and take a screenshot of the pipewire node
- But we can't get the window ID of the window that we're casting from the GNOME extension, which is super limiting
  - So the extension would have to guess what window you're casting, or the user would have to select it manually, which is really ass

17 Apr

Holy shit! How didn't I know about this!!

https://flatpak.github.io/xdg-desktop-portal/docs/window-identifiers.html https://flatpak.github.io/xdg-desktop-portal/docs/doc-org.freedesktop.impl.portal.Screenshot.html https://flatpak.github.io/xdg-desktop-portal/docs/doc-org.freedesktop.impl.portal.ScreenCast.html

New flow idea:

when we receive a new texthooker sentence,
request a screencast
user selects the window to attach to
we now have a way to screenshot that window (awesome)
we get the parent_window identifier and forward that to our extension for positioning purposes

14 Apr

I think my goals are a bit conflicted here. I want the app to be as simple as possible and min-config (see the GNOME stuff), but I want the platform to be as flexible as possible. For this reason, I went with the "anti-schema" of record kinds, instead of trying to standardize a single glossary format or whatever. If I take this to its logical extreme, the platform should be able to support:

any dictionary format (that's the point of the record kind stuff)
texthooker aggregation
AnkiConnect integration, and other flashcard apps potentially
collecting statistics on words learned/learning

But despite all these features, the apps should stay as simple to configure as possible. I think that's the key here - I don't mind more features, but I do mind more config. It's fine if you want to add a feature to see what words you've looked up the most, but it shouldn't require extra config.

In summary: state is cheap, config is expensive. Convention over configuration.

This gives me a good guideline and goal to follow for the project.

What's the original reason I started this project? I got annoyed at how:

my Memento and Yomitan dictionaries don't sync
- by extension, how I have to configure 2 different tools whenever I set up a PC again (and a new browser)
there are no good integrated sentence mining tools for Linux

I've ironed out a bunch of design issues, and AnkiConnect is like almost functioning. Most importantly, I've decided:

wordbase-engine no longer tracks the current profile - this is left as a responsibility of the app
I've figured out the scope of the different crates and the project as a whole - see paragraphs above.
wordbase and wordbase-engine are solid foundations to build on, but wordbase-desktop is a lost cause. I've just been hacking on top of it instead of trying to clean anything up. I want to definitely rewrite this without Relm4 at some point.
For some audios like NHK ones, we know the pitch position. We should move those audio buttons into the pitch reading pills.
scanning ようやく gives 漸く as the top result. ffs, it should be the READING first! we have to prioritise it somehow. back to messing with the lookup query
AI integration for understanding a sentence given context is insanely useful. This should 100% be a part of the normal workflow. (hmm..)
- gemma3-1b seems to be useless for this. so is 4b
ドアノブを引いたところで、俺は動きを止める。 - weird scanning fail with 引いたところで

Test session 4 - 13 Apr

I still haven't done the ankiconnect lmao. I've been procrastinating the stylesheet (but at least it's pretty now tho). This ACTUALLY has to be my priority now.
The desktop app is a piece of shit and a mess
- Where the fuck is all the state managed?!??!?!
- I don't think Relm4 is the right fit for this app, since so much of the state is stored outside of the component model.
- I want to rewrite this in raw GTK/Adwaita later, but for now I can keep hacking in features
IMO the engine shouldn't keep track of a "current profile". Leave this up to the individual app. (But it should still keep track of profiles)
If one of my goals is "keep it simple", then why did I add in texthooker support into the core app? This is a niche that only applies to Japanese visual novel language learners. I want to move this out to its own app later.
- Move it out of wordbase and wordbase-engine into... somewhere else
- Keep support for overlays in wordbase-integration - it's useful, and it means we don't need 2 separate GNOME extensions
- Write texthooker listen logic directly in wordbase-desktop (and wordbase-engine-cli? ehh idk)
Now that I have more dicts, lookups are starting to lag a bit. Currently, requesting a lookup "blocks" the async runtime. If we request a new lookup before the old one completes, it should stop the old lookup and no longer await it.
looking up 替える gives results for 変える first - wrong! I know why this happens tho. in the lookup, we sort by reading matches AND headword matches, then reading matches OR headword matches. this should be hm AND rm, hm, rm, everything else
- fuck, this is wrong! if we looked up like "きびす" (踵), we'd get results for 踵<かかと> first. that's wrong
- uuughhhh, maybe we detect if we have kana in the lemma and if so look up by reading first? and otherwise headword first?
- can't reproduce anymore
we should be able to spawn sub-popups from looking up in the popup. maybe? future goal.
the audio should really be in the sticky header, not the meta tag flow box. When pitch accent readings are split onto multiple lines, it's a bit harder to read. They should be kept together. (Frequencies don't matter as much who cares)
"（ふぅ…騒がしかった…）" - deinflects as <騒がしかっ>た. could we make this <騒がしかった>?
there's a lot of Forvo audio records which have a headword but no reading e.g. 薄情. this makes the forvo audio appear as a separate entry below. I think we can unify this record with the main record by doing this:
- if we have a record with a term which has a headword but no reading, it gets cloned into all the other "buckets" for the same headword
- i.e. 見る, 観る would contain the same records for みる
- this will probably suck if we do this to ALL record kinds. maybe only yomichan audio records?
what would be really cool is, if you click on a sub-query lookup, it opens as a new navigation page in GTK with a new webview
- recycle sub-webviews
- you can swipe left to go back to the previous page seamlessly, like in epiphany
wrong furigana generation - 黄色い声 (きいろいこえ)
- added test case - it's the same issue as before. I'll have to fix them together
interesting dictionary quirk: ぶいぶいいわせる - we have records for ブイブイ言わす with reading ぶいぶいいわせる - this doesn't match! and even yomitan fails to generate furigana in this case.
- its fallback is better than ours I think, we should use its strategy

Test session 3 - 12 Apr

Now that I've done a bunch of generic bug fixing and improvements, I want to do more targeted high-impact improvements

AnkiConnect!!!!!! (!!!)
The first-hover popup experience MUST be perfect, since it's like 90% of the popup use case
- it sometimes just doesn't show up. like, I know it scans, I can see the text highlight, but it doesn't appear on top for whatever reason. This has to work perfectly at all costs, even using hacks.
Terms like 此処等 are more common in kana as ここら, they should be displayed as such
- How does Memento/Yomitan determine what becomes reading-only?
When exiting the overlay, it presents for some reason? I think this is from a previous bugfix
"何かを念じるかのようだった。" - "念じる" has wrong order of lookup
"こんな図書室には似つかわしくないぐらい、専門的で高価そうな本ばかりだ。" - 高価 has wrong ordering. it's below 高い
Future goal: can we get some way of highlighting known/unknown words? I'd like to use AnkiConnect for this tracking ideally, since that gives us the best estimate of how well the user knows a word. Then we can mark new words in a different color in the overlay or something.
Future goal: I have a term like 履帯映像, I want to ask AI a question like "break down the kanji in this phrase". I should be able to do this integrated into the dictionary popup.

Test session 2 - 11 Apr

Test session 1 - 10 Apr

Misc

TODO:

local audio server - SORTA DONE
ankiconnect

quick test problems:

clicking off the overlay box, then it updates the sentence, and I start hovering back on - it doesn't immediately show popup box until I hover over a different word and back
when the app goes full screen, the overlay should appear at the top
dictionary popup should always appear above

audio server:

i query たべなかった
i get the term "食べる (たべる)"
i get audio bytes as a record

FUTURE ROADMAP:

I don't like how Term is hardcoded to a headword and reading. It's totally possible to have multiple readings for the same headword, i.e. a hiragana katakana and reading for the same kanji. And things like NHK16 Yomichan local audio dict use katakana readings for pronunciation. It'd be really useful to support that, but we just don't right now.

However I also don't want a reading_1, reading_2, reading_3 etc columns in the database. It's theoretically possible to do that, but that's really ugly.

I want to rewrite the Yomitan importer to take advantage of async. But how will it affect parsing performance? Since right now we can use multicore to parse all of the entries in parallel. But our bottleneck is inserting them into the database. Need to investigate more.

what happens we import a yomitan dictionary?

we import (expr 日本語 reading にほんご)
- we know for a fact that にほんご is a reading of 日本語
- we don't know for certain whether 日本語 is a headword, but it probably is
or: we import (expr 日本語 reading ())
- we don't know what 日本語, it could be a reading or a headword
- for now we'll treat it as a headword, but if we get evidence of the contrary (i.e. an entry where reading = 日本語), then we'll change our mind
or: we import (expr () reading にほんご)
- if we've already imported (日本語, にほんご) then this term already exists, skip
- else, we don't know if this is a headword or a reading yet. assume it's a headword for now

querying:

CREATE TABLE IF NOT EXISTS term ( text TEXT NOT NULL CHECK (text <> ''), record INTEGER REFERENCES record(id), headword INTEGER REFERENCES term(id), UNIQUE (text, record) );

if we search for X, our goal is to find any terms where term.text = X
if we find a bunch of terms, but all their headword = NULL, then we ASSUME that X is a headword, not a reading
if we find at least 1 term for which headword != NULL, then X is a READING of 1 or more headwords
- where headword = Y AND headword != NULL, pull in any terms where term.text = Y - and mark Y as the headword
TODO: what if those terms pulled in also have term.headword != NULL? should that be allowed?

OR: have a headword and term tables

super basic yomitan async benchmark:

async: 65.8 sec
sync: 74.6 sec

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

notes - temp

18 Apr

17 Apr

14 Apr

Test session 4 - 13 Apr

Test session 3 - 12 Apr

Test session 2 - 11 Apr

Test session 1 - 10 Apr

Misc

FilesExpand file tree

NOTES.md

Latest commit

History

NOTES.md

File metadata and controls

notes - temp

18 Apr

17 Apr

14 Apr

Test session 4 - 13 Apr

Test session 3 - 12 Apr

Test session 2 - 11 Apr

Test session 1 - 10 Apr

Misc