[not prod ready yet but close] A content-based, NLP-enhanced integrated language learning environment emphasizing language exposure and active learning.

chaosarium/Influx

Influx

Prototype for an integrated content-based language learning environment. This doc may be out of date.

Warning

This is only intended for local use. There are zero security measures. The database schema may change at any time and break previous versions.

Is this usable in its current state?

No. Not yet. It technically has a functioning database and text reader, but the dictionary and translation integrations are quite primitive. The UI needs a lot of work. No clue how to package and distribute this thing.

Links

  • Phase I dev log here
  • Continuous dev log here
  • The concept here

Disclaimer

An LLM wrote some of the non-critical code, and that code might be bad.

Features

  • language-agnostic nlp
    • text segmentation & tokenization
    • lemmatization
    • pos tagging
    • dependency parsing
    • arbitrary additional morphological features
  • tracking known/learning terms
  • phrase tracking and detection
  • translation integration
  • annotated text reader
  • language-specific nlp
    • japanese — auto furigana
    • japanese — inflection derivation chain
  • reasonable ui
    • everything other than the text reader (80% there)
    • the text reader (50% there)
    • full ui consistency
  • dictionary integration (post-first-release)
    • stardict kind of works
  • srs (post-first-release)
  • import data from lwt (planned sibling project)
  • docker container (gotta be able to run it somehow ╮(╯▽╰)╭)
  • make ux fun (post-first-release)
    • token status stats for documents
    • better filtering of doc/lang listings
    • term browser + editor outside doc

Development notes

Architecture

  • Backend in Rust (Axum + Postgres)
  • NLP Service in Python
  • Frontend in Elm
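
As a rough illustration of what the Python NLP service might hand to the Rust backend, the sketch below serializes per-token annotations to JSON. The field names and the payload shape are assumptions that mirror the feature list (lemma, POS tag, dependency head, morphological features); the real wire format may differ.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class AnnotatedToken:
    """Hypothetical per-token annotation; not the project's real schema."""
    text: str
    lemma: str
    pos: str    # part-of-speech tag, e.g. "NOUN"
    head: int   # index of the syntactic head in the sentence, -1 for the root
    feats: dict = field(default_factory=dict)  # extra morphological features


def to_payload(tokens):
    """Serialize a list of annotated tokens to a JSON string."""
    return json.dumps({"tokens": [asdict(t) for t in tokens]}, ensure_ascii=False)
```

Keeping `feats` as an open dictionary is one way to support "arbitrary additional morphological features" without changing the schema for each language.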

Key issues to decide / address

  • how to handle lemmatization? should lemma be used as default? how does user manually assign lemma? should lemma and reflexes be separate entries? how to relate them in the database?
  • how to integrate user-provided dictionaries?
  • how to allow extensions? should there be support for custom nlp scripts?
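
For the lemmatization question, one option (a sketch only, nothing here is decided) is to store the lemma and each inflected form as separate entries, let a form point at its lemma, and fall back to the lemma's status when the form itself has not been marked. `TermEntry`, `effective_status`, and the status strings are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TermEntry:
    """Hypothetical term record keyed by its written form."""
    orthography: str
    status: str = "unmarked"     # e.g. "known", "learning", "unmarked"
    lemma: Optional[str] = None  # None means this entry is itself a lemma


def effective_status(form, entries):
    """Look up a form's status, falling back to its lemma if unmarked."""
    entry = entries.get(form)
    if entry is None:
        return "unmarked"
    if entry.status != "unmarked" or entry.lemma is None:
        return entry.status
    parent = entries.get(entry.lemma)
    return parent.status if parent is not None else "unmarked"
```

With this layout the user can still override a single inflected form: marking "ran" as "learning" takes precedence over whatever status "run" has.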

Running development server

See the justfiles.
