nowac (Norwegian Web as Corpus)

data info:
  NoWaC v 1.0
  This is the first version of NoWaC (Norwegian Web as Corpus), a large web-based corpus of Bokmål Norwegian currently containing about 700 million tokens.

retrieve date: 20130805
src:

note:
  The local version of nowac-1.1.words.freq is better than the src version due to a character correction.

  local version <, online version >
  6997c6997
  < 6873 à à prep
  ---
  > 6873 à ? prep
  13228c13228
  < 3218 à à subst_prop
  ---
  > 3218 à à subst_prop
  25842c25842
  < 1437 pà pà ukjent
  ---
  > 1437 pà pà ukjent
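
The line-by-line comparison in the note can be reproduced with a minimal sketch like the one below. It is not the procedure actually used for this note. The function names `read_lines` and `differing_lines` are hypothetical, and UTF-8 is only an assumed encoding, since the note does not state the encoding of either file.

```python
from itertools import zip_longest


def read_lines(path, encoding="utf-8"):
    """Read a frequency list into a list of lines without trailing newlines.

    The encoding is an assumption; undecodable bytes are replaced rather
    than raising, so that damaged characters show up in the comparison.
    """
    with open(path, encoding=encoding, errors="replace") as f:
        return [line.rstrip("\n") for line in f]


def differing_lines(local_lines, online_lines):
    """Return (line_number, local, online) for every position that differs.

    Line numbers are 1-based. If one list is shorter, the missing side
    is reported as None.
    """
    diffs = []
    pairs = zip_longest(local_lines, online_lines)
    for number, (local, online) in enumerate(pairs, start=1):
        if local != online:
            diffs.append((number, local, online))
    return diffs


if __name__ == "__main__":
    # Two one-line samples taken from the first hunk of the note above.
    local = ["6873 à à prep"]
    online = ["6873 à ? prep"]
    for number, a, b in differing_lines(local, online):
        print(f"{number}: < {a} | > {b}")
```

For the real files, the two lists would come from `read_lines` called on the local and the downloaded copy of nowac-1.1.words.freq. A positional comparison is enough here because the note reports only changed lines, not inserted or deleted ones.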