# Ideas

Collecting some thoughts, while working on other Oahpas, on how things can be improved.

Oahpa needs to be split into two separate services: a front-end that handles user interaction and the grouping of users into courses, so that instructors and students can interact with meta-data about their experience; and a separate back-end that only handles the generation of questions and answers. The back-end needs to be stateless, so that question/answers can be validated without relying on sessions.

For Morfa and Leksa this is easier, but Morfa-C will need to track which words are used and which question format. Sahka will need to track which dialogue, and the position in the dialogue, or something else, where the dialogue is held as a discrete unit from answer validation. More thinking required there.

## Testing

A more modular structure will make it easier to write automatic tests for generating exercises, so that we have a quick way of testing whether new code has broken any existing features. I strongly suggest using tests, but have no strong preference for test-driven development over writing tests after developing features; whichever option is most efficient.

Because the individual exercise types should function in a stateless way (e.g., the front-end should provide the exercise with a set of word prompts while a user is working with a set of exercises), it will be easier to write tests that target specific problems as they crop up. For instance, in univ_oahpa we have seen a database unicode error appear out of nowhere (affecting s / š), and it would be good to have at least one test for this to ensure that something this basic does not appear again.

## Configuration file

The general idea is one code base that can run all languages, with the only changes being made in a configuration file that describes exercise types and the language's grammatical and lexical rules.
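
As a rough illustration of the idea, the sketch below keeps all language differences in data and leaves the code language-agnostic. The keys (`drills`, `morfa`), the tag lists, and the helper `morfa_tags` are assumptions made up for this example; they are not taken from the checked-in configuration files, and a plain Python dict stands in for the parsed YAML.

```python
# Hypothetical per-language configurations. In practice these would be
# parsed from YAML files; the keys and tag lists here are illustrative only.
CONFIGS = {
    "fin": {
        "drills": ["leksa", "morfa"],
        "morfa": {"N": ["N+Sg+Gen", "N+Sg+Par"]},
    },
    "sme": {
        "drills": ["leksa", "morfa", "morfa_c"],
        "morfa": {"N": ["N+Sg+Acc", "N+Sg+Ill"]},
    },
}


def enabled_drills(lang, configs=CONFIGS):
    """Return the exercise types switched on for a language."""
    return list(configs[lang]["drills"])


def morfa_tags(lang, pos, configs=CONFIGS):
    """Return the tags Morfa may ask for, or [] if nothing is configured."""
    config = configs[lang]
    if "morfa" not in config["drills"]:
        return []
    return config["morfa"].get(pos, [])
```

The same two functions serve every language; adding a language means adding a configuration entry rather than editing code.
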
Examples are checked in as game_configuration_fin.yaml and game_configuration_sme.yaml, and at one point I did have the checked-in system able to run both (installed with separate install configuration files). Somewhere in this codebase there is also an example of how to make agreement in Morfa-C abstract enough to cover more than one language and several types of word agreement.

## Database

There are a lot of small issues with using MySQL that can be fixed, but fixing them takes time. Rather than spend this time, we should use PostgreSQL. It is also open source and free, and is not owned by Oracle.

## Drills

Drills should be separate modules from each other, with good and clear class inheritance. One of the problems in Oahpa 1.0 is that there is a mess of operations spread out across views, forms, models, and game objects. If you draw out a flowchart of the steps involved in generating a question and validating the answer to it, you quickly see how messy it can be.

### Drill validation

One of the consistent issues in validation of answers is that there is a lot of repeated code for the same kind of thing. Language-specific drill validation features, such as spell relaxing (ï -> i in South Sámi; accepting both ä/æ) and morphological feedback, need to be implemented in a general way that allows developers to create new game types without worrying about these kinds of things. Better class inheritance.

### Morfa and Contextual Morfa

The code for these drills is a lot more complex and contains language-specific rules that make it difficult to adapt to new languages. However, there is a limited set of features required for generating sentences from some kind of abstract (XML, YAML) representation:

* Semantic membership
* Syntactic ordering
* Morpho-syntactic agreement between constituents
* Selection of words for presentation of question and answers (pronoun + verb, demonstrative + noun)

(Am I missing any?)
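
The four features above could be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the `Word` class, the `QUESTION` layout, the semantic set names, and the tag strings are all hypothetical, a small Finnish word list stands in for the lexicon, and a Python dict stands in for the XML/YAML representation.

```python
import random
from dataclasses import dataclass


@dataclass
class Word:
    lemma: str
    semantic_sets: set
    forms: dict  # morphological tag -> wordform


# Hypothetical miniature lexicon; tag strings are illustrative.
LEXICON = [
    Word("minä", {"PRON_PERS"}, {"Pron+Pers+Sg1+Nom": "minä"}),
    Word("sinä", {"PRON_PERS"}, {"Pron+Pers+Sg2+Nom": "sinä"}),
    Word("tulla", {"MOVEMENT_V"}, {"V+Ind+Prs+Sg1": "tulen", "V+Ind+Prs+Sg2": "tulet"}),
    Word("mennä", {"MOVEMENT_V"}, {"V+Ind+Prs+Sg1": "menen", "V+Ind+Prs+Sg2": "menet"}),
]

# Hypothetical question definition: only semantic sets and tags are stored,
# never references to individual wordforms.
QUESTION = {
    "order": ["SUBJ", "MAINV"],              # syntactic ordering
    "elements": {
        "SUBJ": {"semantic_set": "PRON_PERS", "tag": "Pron+Pers+{pn}+Nom"},
        "MAINV": {"semantic_set": "MOVEMENT_V", "tag": "V+Ind+Prs+{pn}"},
    },
    "agreement": {"pn": ["Sg1", "Sg2"]},     # shared person/number feature
    "task": "MAINV",                         # the slot the user must inflect
}


def generate(question, lexicon, rng):
    """Return (prompt, accepted_answer) for one question definition."""
    # Agreement: choose one value per shared feature, used in every tag.
    features = {name: rng.choice(values)
                for name, values in question["agreement"].items()}
    slots = {}
    for name, spec in question["elements"].items():
        tag = spec["tag"].format(**features)
        # Semantic membership: words in the set that have the required form.
        candidates = [w for w in lexicon
                      if spec["semantic_set"] in w.semantic_sets and tag in w.forms]
        word = rng.choice(candidates)
        slots[name] = (word.lemma, word.forms[tag])
    task = question["task"]
    # Presentation: the task slot shows the lemma, other slots the inflected form.
    prompt = " ".join(f"({slots[n][0]})" if n == task else slots[n][1]
                      for n in question["order"])
    return prompt, slots[task][1]
```

Calling `generate(QUESTION, LEXICON, random.Random())` yields a pair such as a prompt with a pronoun and a bracketed verb lemma, plus the inflected verb as the accepted answer. Because the function takes everything it needs as arguments, it keeps no state between calls.
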
In any case, this set of requirements would be easy to represent in an abstract way, where administrative users can define rules of agreement for their contextual questions, providing a way for users to edit rules without drastically altering code.

For an exercise like (non-contextual) Morfa, this would be quite easy, because defining questions and answers only involves describing a set of tags for the wordforms that should be presented and accepted, and defining one or two words to accompany the question as additional context for the user. For Contextual Morfa there is a lot more context, but we get some things for free, such as the syntactic ordering implicit in the question text.

#### Installation

It should be possible to remove the dependency of Morfa-C on installing the lexicon without increasing load. One of the difficulties has often been that if you change a word in the lexicon and reinstall it by deleting the lexicon, the Morfa-C questions then need to be reinstalled as well, because in the current Oahpa there are many-to-many relationships to individual wordforms, which are nonetheless not always used. Morfa-C questions could be stored such that they only hold the semantic set and tag data, thus allowing words to be added to and removed from the lexicon without touching Morfa-C. This should improve debugging as well.

### External services used in Drill validation

Sahka and Numra are somewhat unique in that they do not validate answers using only rules found in Oahpa. Questions in Numra (and Klokka) are generated with the help of a separate FST, and answers in Sahka (and eventually Teksta) are validated with an FST and CG lookup server. There is nothing wrong with this from a software design standpoint; however, the way it is implemented now makes the dependencies difficult and unclear to new programmers. It would be nice to keep Sahka's validation lookup server under the same roof as the code for generating Sahka questions and answers.

## Lexicon

The lexicon needs to be easy to update and renew without too much interruption to a production instance of Oahpa. The lexicon should also be kept in a separate module from the drills.

## User logs

In currently running Oahpas, user interaction logs are included in the drills module, which is problematic because it makes them easier to delete accidentally. They need their own module.