Prototype for an integrated content-based language learning environment. This doc may be out of date.
Warning
This is only intended for local use. There are zero security measures. The database schema may change at any time and break data created with previous versions.
Is this usable in its current state?
No, not yet. It technically has a functioning database and text reader, but the dictionary and translation integrations are quite primitive and the UI needs a lot of work. No clue how to package and distribute this thing yet.
Links
Disclaimer
An LLM wrote some of the non-critical code, and those parts may be low quality.
Roadmap
- language-agnostic nlp (see the sketch after this list)
  - text segmentation & tokenization
  - lemmatization
  - pos tagging
  - dependency parsing
  - arbitrary additional morphological features
- tracking known/learning terms
- phrase tracking and detection
- translation integration
- annotated text reader
- language-specific nlp
  - japanese — auto furigana (also sketched below)
  - japanese — inflection derivation chain
- reasonable ui
  - everything other than the text reader (80% there)
  - the text reader (50% there)
  - full ui consistency
- dictionary integration (post-first-release)
  - stardict kind of works
- srs (post-first-release)
- import data from lwt (planned sibling project)
- docker container (gotta be able to run it somehow ╮(╯▽╰)╭)
- make ux fun (post-first-release)
  - token status stats for documents
  - better filtering of doc/lang listings
  - term browser + editor outside doc
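To make the language-agnostic nlp items concrete, here is a minimal sketch of the annotations involved, using spaCy purely as a stand-in; the actual NLP service may use something else entirely.

```python
# Minimal sketch of the annotations the language-agnostic pipeline covers.
# spaCy is an illustration, not necessarily what this project uses.
# Setup: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The cats were sleeping on the mat.")

for sent in doc.sents:      # text segmentation
    for token in sent:      # tokenization
        print(
            token.text,     # surface form
            token.lemma_,   # lemmatization
            token.pos_,     # pos tagging
            token.dep_,     # dependency parsing
            token.morph,    # additional morphological features, e.g. Number=Plur
        )
```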
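Likewise for the japanese auto-furigana item: the core of it is pairing each chunk of text with its kana reading. A minimal sketch, assuming pykakasi; any reading-aware morphological analyzer (e.g. one backed by MeCab) could fill the same role.

```python
# Sketch of auto furigana: pair each text chunk with its kana reading.
# pykakasi is an assumption for illustration only.
import pykakasi

kks = pykakasi.kakasi()
for item in kks.convert("猫が魚を食べた"):
    print(item["orig"], "→", item["hira"])  # e.g. 猫 → ねこ
```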
Architecture
- Backend in Rust (Axum + Postgres)
- NLP Service in Python
- Frontend in Elm
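The backend and the NLP service presumably talk across some service boundary. A minimal sketch of what that contract could look like; FastAPI, the /analyze route, and every field name here are assumptions for illustration, not the project's actual interface.

```python
# Hypothetical shape of the Rust backend <-> Python NLP service boundary.
# FastAPI, the route, and the payload fields are all assumptions.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class AnalyzeRequest(BaseModel):
    lang: str   # e.g. "ja"
    text: str

class Token(BaseModel):
    surface: str
    lemma: str
    pos: str

@app.post("/analyze")
def analyze(req: AnalyzeRequest) -> list[Token]:
    # Placeholder: a real implementation would dispatch to a
    # per-language pipeline here.
    return [Token(surface=w, lemma=w.lower(), pos="X") for w in req.text.split()]
```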
Open questions
- how to handle lemmatization? should the lemma be used as the default form? how does the user manually assign a lemma? should lemma and reflexes be separate entries, and how would they be related in the database? (one possible shape is sketched below)
  - some ideas here
- how to integrate user-provided dictionaries?
- how to allow extensions? should there be support for custom nlp scripts?
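To make the lemma question concrete, here is one possible shape: lemma and reflexes stored as separate term rows, related by a nullable self-reference. This is just one idea, not a decision, and all names are made up.

```python
# One hypothetical answer to "should lemma and reflexes be separate entries?":
# separate rows, related by a nullable self-reference. Names are made up.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Term:
    id: int
    lang: str
    text: str                       # surface form, e.g. "ran"
    lemma_id: Optional[int] = None  # row id of the lemma ("run");
                                    # None if this row is itself a lemma
```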
Development
See the justfile.