In Unix, a directory can be referred to by an absolute or a relative path. No matter where I am, the command ls -l /home/trond/gt/ will always list the content of gt. Since the path starts with "/", it starts from the root directory, which is always the same. If I stand in my home directory, I can write ls -l gt (since my home directory contains gt), but if I stand in any other directory (e.g. gt itself, or sme, etc.), I cannot do that, since these directories do not contain a directory gt. Standing in sme (a daughter of gt), I must write ls -l ../ instead. In the same way, standing in your home directory, the command ls ../ lists the home directories of all users, and thus tells you the names of the other users on the system.
To move from one directory to another we use the cd (change directory) command. To get from your home directory to the Northern Sámi directory, type cd gt/sme. To get one directory up, type cd .. and to get two up, type cd ../.. (check where you are with the pwd command). One up and one down (from Northern to Lule Sámi) is cd ../smj.
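The moves described above can be tried out safely in a throwaway directory tree. This is only a sketch: the names gt, sme and smj come from the text, while the temporary directory (and the variable names) are assumptions made so that nothing in your real home directory is touched.

```shell
# Build a scratch tree that mimics the layout used in this text.
top=$(mktemp -d)
top=$(cd "$top" && pwd -P)   # resolve symlinks so the paths compare cleanly
mkdir -p "$top/gt/sme" "$top/gt/smj"

cd "$top"          # pretend this is the home directory
cd gt/sme          # relative path: two levels down
in_sme=$(pwd -P)

cd ../smj          # one up and one down
in_smj=$(pwd -P)

cd ../..           # two up: back to the starting point
back=$(pwd -P)

cd "$top/gt/sme"   # absolute path: works no matter where we stand
abs=$(pwd -P)

echo "$in_sme"
echo "$in_smj"
echo "$back"
echo "$abs"
```

The first and the last line printed are identical: the relative path gt/sme and the absolute path lead to the same place, but only the absolute one works from any starting point.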
To start a program (like emacs, lynx, lexc, twolc), type the name of the program. With lynx, it is convenient to start with lynx gt/doc/index.html (or lynx index.html, if you are already in the doc directory). From index.html you can find all the other files (such as this one).
Use the TAB key to complete names. If the directory name (or file name) you want to type has a unique beginning, type it and press TAB. In your home directory, type ls g and then press TAB. It will complete with "t/" (if you have no other directories or files that start with g). Then type s and press TAB: it gives a beep (there is more than one directory starting with "s"). Press TAB again, and it lists the 7 candidates. Then complete with me (if you want to go to Northern Sámi). You can do the same with file names as well.
Unix remembers all your previous commands. Pressing the up arrow gives you the previous command, the down arrow the next one. The command history documents what you have done until now. Here is a typical case: You type lookup -flags mbTT bin/sme.fst (because you want to check some Northern Sámi words). Nothing happens. Then you do pwd, and find yourself in your home directory instead of in gt/sme, where you thought you were (you get the answer "/home/trond" instead of "/home/trond/gt/sme"). Type cd gt/sme to get to the right place. Now, instead of typing the long lookup command again, hit the up arrow key three times (once for the cd command, once for the pwd command, and, voilà, on the third you see the lookup command again). Press enter, and this time it will work.
To list all lines in a file with a certain content we use the command grep. So, to list all lines that contain the string "Sg1" in the file sme-lex.txt, write grep Sg1 src/sme-lex.txt (in the gt/sme directory). If you want to search for whitespace (mellomrom, välilyönti) as well, enclose the search string in single quotes (' '): The command grep 'an K' src/sme-lex.txt gives all suffixes ending in "an" and leading to the K lexicon. Similarly, grep ' in leat' corp/ntunix (where "ntunix" is the name of the file containing the New Testament) gives all the lines containing the string " in leat" in that text. Note that grep 'in leat' corp/ntunix gives a different result (it matches many other strings as well, e.g. "geain leat"). Grep and other searches can be extended by REGULAR EXPRESSIONS. Thus, the symbol "." means "any one character", and the search string 'lea. ' matches lean, leat and any other 4-letter string beginning with lea, as long as it is followed by a space (this includes the word "lea" followed by two spaces). A list of regular expressions can be found below.
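The difference between these search strings is easiest to see on a small file. The sketch below uses a handful of made-up sample lines in a temporary file (they are not taken from corp/ntunix or the lexicon files), and the variable names are chosen only for this illustration. The option -c makes grep count the matching lines instead of printing them.

```shell
# A scratch file with a few invented test lines.
sample=$(mktemp)
printf '%s\n' \
  'mun in leat boahtan' \
  'geain leat beana' \
  'son lea  doppe' \
  'sii lean dat' \
  'leage buorre' > "$sample"

# With the leading space, only the free-standing "in" matches.
with_space=$(grep -c ' in leat' "$sample")

# Without it, "geain leat" matches as well.
without_space=$(grep -c 'in leat' "$sample")

# "." stands for any one character, including a space.
dot=$(grep -c 'lea. ' "$sample")

echo "$with_space $without_space $dot"   # prints: 1 2 4
rm -f "$sample"
```

Note that 'lea. ' requires a space after the fourth character, so a line ending in "leat" with nothing after it would not be counted.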
To see the content of a file, use the command less. Thus (in src/), to see the file punct-sme-lex.txt, type less punct-sme-lex.txt (less pu and then TAB will be enough, by the way). If the file is long (e.g. noun-sme-lex.txt), less gives you one screenful at a time, and you can go forward by hitting the space bar (mellomromstasten, välilyöntinappi). You return to the command line by pressing q.
Often, the output from grep and other commands is far too long for one screenful. Here we need to do more than one thing at a time, and we use the "|" symbol (English: pipe, Finnish: putki, Norwegian: also pipe, unfortunately). It is a VERY IMPORTANT symbol for all of us, and it means: I take what I get from the left and give it to the right. Thus, since the command grep ' in leat' corp/ntunix returns more than one screenful, we give the output of that command as input to the less command, by using the pipe. Write: grep ' in leat' corp/ntunix | less, and you get the result one screenful at a time. The command may be quite complex. Thus, to see the analysis of the New Testament, write (in sme):
cat corp/ntunix | preprocess --abbr=bin/abbr.txt | lookup -flags mbTT bin/sme.fst | less
The first command (cat corp/ntunix) just takes the text and feeds it to the pipe. The second command (preprocess --abbr=bin/abbr.txt) uses the program preprocess to separate commas etc. from the words, and to make the file into a list of words, one per line. Thus, the string "No, today." becomes
No
,
today
.
The third command, lookup -flags mbTT bin/sme.fst, uses the lookup program to analyse the input, by using the Northern Sámi parser sme.fst. The output, an analysed version of the New Testament, is given to the less command, one screenful at a time, and the whole process takes less than 3 seconds.
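To get a feeling for what happens between the pipes, the word-splitting step can be imitated with standard tools. This is NOT how preprocess works internally (it also handles abbreviations and much more); sed and tr are used here only as a rough stand-in, under the assumption that commas and full stops are the only punctuation in the input.

```shell
# Stand-in for the tokenising step: one word or punctuation mark per line.
tokens=$(printf 'No, today.\n' |
  sed 's/\([,.]\)/ \1 /g' |   # put spaces around each comma and full stop
  tr -s ' ' '\n' |            # turn every run of spaces into one line break
  sed '/^$/d')                # drop empty lines, if any

printf '%s\n' "$tokens"
```

Each program in the chain does one small thing, and the pipe hands the result on to the next one, just as in the long command above.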
The output can be manipulated further. To extract all adverbs, replace "less" above with the string grep '+Adv' | sort | uniq -c | less (to get them in alphabetical order, each with a count in front) or grep '+Adv' | sort | uniq -c | sort -nbr | less (to get them sorted according to frequency, the most frequent first).
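Here is the counting idiom on a tiny scale. The input lines below are invented and much simpler than real lookup output; they only serve to show what sort, uniq -c and sort -nr do to the lines that grep lets through. The final sed merely strips the padding that uniq -c puts in front of the counts.

```shell
# A scratch file with a few invented "analyses", one per line.
analysis=$(mktemp)
printf '%s\n' \
  'doppe+Adv' 'viessu+N+Sg+Nom' 'odne+Adv' 'doppe+Adv' \
  'dal+Adv' 'odne+Adv' 'doppe+Adv' > "$analysis"

# Alphabetical order, each distinct line with its count in front.
alpha=$(grep '+Adv' "$analysis" | sort | uniq -c | sed 's/^ *//')

# Sorted by frequency, most frequent first.
by_freq=$(grep '+Adv' "$analysis" | sort | uniq -c | sort -nr | sed 's/^ *//')

printf '%s\n' "$alpha"
echo ---
printf '%s\n' "$by_freq"
rm -f "$analysis"
```

The first list starts with "1 dal+Adv" (alphabetical), the second with "3 doppe+Adv" (most frequent). The noun line never shows up, since grep has filtered it out.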
The magic commands are sort (sort the file, one line at a time, alphabetically), uniq (delete repeated identical lines; only repeats directly after each other are removed, so run sort first), and rev (reverse the characters of each line, e.g. "every day" becomes "yad yreve"). Note how combining rev and sort gives a reverse wordlist, sorted according to last letter.
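A small experiment shows the three commands at work. The word list is made up for the occasion. Two things are worth noticing: uniq only removes a repeated line when it comes directly after its twin (which is why it is normally fed by sort), and a reverse wordlist needs rev twice, once before sorting and once after, to turn the words the right way round again. LC_ALL=C is set here only to make the sort order the same on every machine.

```shell
# rev reverses the characters of each line.
reversed=$(printf '%s\n' 'every day' | rev)

# Reverse wordlist: reverse, sort, reverse back.
by_ending=$(printf '%s\n' leat boahtit lean mannat | rev | LC_ALL=C sort | rev)

# uniq alone misses repeats that are not next to each other.
unsorted=$(printf '%s\n' a b a | uniq | wc -l | tr -d ' ')
sorted=$(printf '%s\n' a b a | sort | uniq | wc -l | tr -d ' ')

echo "$reversed"            # yad yreve
printf '%s\n' "$by_ending"  # lean, leat, mannat, boahtit (one per line)
echo "$unsorted $sorted"    # 3 2
```

In the reverse wordlist the words ending in -n come before those ending in -t, and the three words in -t are ordered by what precedes the t.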
Remember to type fg to get back to a program you have suspended, and not to open the same program again! Otherwise, you will end up with several parallel versions of the same program running. If you suspect you did just that, go to the command line (ctrl-Z) and type jobs. This should give you a list of the programs you have open in the same window at the same time (no more than one is recommended, unless you have a very tidy head).
If the man page is too cryptic, you can instead write info sort (or any command name). The info pages have much more text, stretching over several pages. The text is structured like the pages in a primitive web browser: TAB brings you to the next link, you follow the link by pressing enter, and when the top of the screen says Next:, Prev:, Up:Top, etc., you go there by pressing n, p, u, etc. You leave both man and info by pressing q.
Sometimes, but not always, the apropos command may also help. To get a list of commands related to e.g. sorting, type apropos sort. Searching around for help on a Unix system without knowing what to look for is seldom a good idea; you should therefore ask a local guru, and you should buy a reference book.