[Original review, Jul 24 2018]
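This month, three plotlines in my life collided. I know Swedish and Norwegian well, and I’d thought vaguely from time to time that I’d like to learn Icelandic too; I’ve always been a great admirer of Tolkien, and I knew he had been interested in Icelandic; and I have a couple of Icelandic friends. But none of this had ever come to anything. Last week, however, Jupiter aligned with Mars and I entered the Age of Aquarius. I’d just finished reading Tolkien: Maker of Middle-Earth, which has many striking passages in Icelandic, Old Norse and Old English, and our acquaintance K happened to be in Iceland. Fired with enthusiasm by Tolkien’s love of these obscure but wonderfully poetic languages, I asked K if she could possibly get me one or two Icelandic children’s books. I just don’t know how to thank her: she turned up with not one or two but half a dozen books, including my favourite, Le petit prince. I spent the next few days carrying it with me everywhere, snatching all opportunities to try to make sense out of it.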
For people who don’t know anything about Icelandic, it has the same ancestor as Swedish, Danish and Norwegian. A thousand years ago they were the same language. But the mainland languages have evolved at a normal rate, while Icelandic, on its faraway island, has changed relatively little; so if you speak Swedish or Norwegian, it’s like trying to read a language which for an English-speaker would be somewhere between Chaucer and Beowulf. You recognise a few of the words at once, others are more or less mangled, and still others are completely unfamiliar. The first impression is that it makes no sense at all. But I know Le petit prince, and I started trying to guess what word was what, just reading without looking anything up.
It was amazing to see how well this worked. For example, let me show you the following sentence:
Þar sem ég hafði aldrei teiknað kind dró ég upp fyrir hann aðra af þeim tveimur myndum sem ég var fær að gera: myndina af kyrkislöngunni utanverði.
The first time I saw this, there were only a couple of words I felt at all sure about. Upp and var must be the same words as in Swedish (“up” and “was”). I soon figured out that ég was “I” (it is the same word in some Norwegian dialects), að was att (“that”), and hann was han (“he/him”). The words mynd and kind weren’t like anything I recognised, but they were common, and having already come across them I realised they must be “drawing” and “sheep”. As I read the book for the second time, the other words gradually fell into place too, and after a while I could read it as sort-of-Swedish:
Då som jag hadde aldrig tecknad får drog jag upp för honom den-andra av dem två teckningarna som jag var för att göra: teckningen av pytonormen utifrån.
which I might render into sort-of-English as:
Then as I had never drawn sheep pulled I up for him the-second of the two drawings which I was able-to make: the-drawing of the-python from-outside.
I recalled that there was a sentence something like this near the beginning of the story: it all made sense.
How does it work? I’ve been reading deep learning theory, and it’s tempting to conceptualise it in terms of strengthening of neural pathways. I see a word I don’t know, and I think of some words it could be: aðra to a Swedish-speaker first looks like ådra, “vein”, and you only later think of andra, “second”. This word occurs quite often. “Vein” never makes any sense, but “second” often makes good sense. So the pathway for ådra never gets strengthened but the one for andra does, and after a while my eyes just start seeing it as andra. The same thing happened with numerous other words. As I’m sure many language geeks will attest, it is such a weird and interesting feeling to find the sense emerging from words which initially looked like gibberish! I’m sorry if I’ve gone into too much detail here, but I wanted to explain what I mean when I say it’s like doing drugs. You actually feel the text changing your state of consciousness.
Well, I’m hooked. Though so far, I’ve just barely started: the grammar is still a mystery to me. All the same, on my latest read-through I notice that the endings of nouns and verbs, which at first looked quite random, now seem to be displaying some recurrent patterns…
_________________________
[Update, Aug 6 2018]
I have been making efforts to understand in more quantitative terms what I’ve been doing here. First, I thought it would be a good exercise to try copying out the text of Litli prinsinn: this would force me to look carefully at every letter, and also give me a machine-readable version that I could analyse. I’m now about three-quarters of the way through (he has just said goodbye to the fox). I tried running my incomplete corpus, which contains about ten thousand words, through a script that Not and I developed last year.
The script is simple but quite useful. It counts frequencies for all the words in the corpus, then builds a hyperlinked concordance which shows me up to ten examples for each word. Every word is clickable, so I can take a word I’m unsure of in a sentence and see examples of that word in other contexts. There is a master index which lists all the words in descending frequency order. Here are the first 50 lines. The ‘Freq’ column gives the number of times the word occurs, and the ‘Cumul’ column gives the cumulative frequency:
All of these 50 words (to be exact, some of them are punctuation marks) are now very familiar to me, and as you can see they make up more than 50% of the text. I tried walking down the list to see when I stopped feeling confident. I can go as far as words with four or five occurrences, and I think I know what nearly all of them mean; that brings me up to about 400 words, and 75% of the total. When I look at words occurring two or three times, I start to feel uncertain, but I still think I know the majority of them. That gets me to 900 words and 86%. The 1600 words which only occur once are of course the hardest; but even here I feel I can guess a lot, perhaps a third to a half of them.
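For readers who want to try this on their own copied text, here is a minimal Python sketch of a frequency index with ‘Freq’ and ‘Cumul’ columns plus a ten-example concordance. It is not the script used for this review, which is not shown here; the function names (tokenize, frequency_index, concordance), the tokenisation rule and the two sample sentences are assumptions made purely for illustration, and the hyperlinking is left out.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens and single punctuation marks."""
    # In Python 3, \w matches Unicode letters such as þ, ð and æ.
    return re.findall(r"\w+|[^\w\s]", text.lower())


def frequency_index(tokens):
    """Return (word, freq, cumulative_percent) rows, most frequent first."""
    counts = Counter(tokens)
    total = sum(counts.values())
    rows, running = [], 0
    # Sort by descending frequency, then by code point for a stable order.
    for word, freq in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        running += freq
        rows.append((word, freq, 100.0 * running / total))
    return rows


def concordance(sentences, max_examples=10):
    """Map each word to at most max_examples sentences that contain it."""
    examples = {}
    for sentence in sentences:
        for word in set(tokenize(sentence)):
            bucket = examples.setdefault(word, [])
            if len(bucket) < max_examples:
                bucket.append(sentence)
    return examples


# Tiny stand-in corpus; a real run would read the whole copied text.
sentences = [
    "Hann hefir aldrei horft á stjörnu.",
    "En þú ert hreinn og þú kemur frá stjörnu.",
]
tokens = [t for s in sentences for t in tokenize(s)]

print(f"{'Word':10} {'Freq':>4} {'Cumul':>7}")
for word, freq, cumul in frequency_index(tokens)[:5]:
    print(f"{word:10} {freq:4} {cumul:6.1f}%")
```

Walking down the rows until the ‘Cumul’ column passes a given percentage gives the kind of coverage figures quoted above. Note that punctuation marks are counted as tokens here, as they are in the index described in the text.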
Copying out the text has sharpened my understanding of the grammar a good deal, and now I recognise quite a few endings. Though I’m still pretty hazy about the nouns. With multiple genders, multiple cases and marking for definiteness, there are many combinations, and I only know the most common ones.
It’s surprising that one can extract so much information from a tiny sample of just ten thousand words. I’ll see if I have the patience to finish this and then do Ævintýri Lísu í Undralandi as well…
___________________________________
[Update, Aug 8 2018]
I have finished copying out the text of Litli prinsinn; the file now contains about 14,200 words and about 3,050 unique words. I made a small improvement to our script, so that it now creates an alphabetical index as well. This is very useful for finding copying errors: if I see two words close together which are almost the same, that often means that one of them is an error. Tidying up my copied text is not as tedious as I thought it would be. It’s forcing me to look very carefully at everything and consolidate my extremely sketchy vocabulary.
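The eyeballing step can be partly automated. The sketch below is an assumption about how one might do it, not a description of the actual script: it sorts the unique words and flags neighbours in the index that differ by a single edit. The names edit_distance and suspicious_neighbours, the thresholds and the sample word list (including the made-up typo "adrei") are all hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, by dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def suspicious_neighbours(words, max_distance=1, window=3):
    """Yield pairs of near-identical words that sit close together
    in the alphabetical index."""
    # sorted() orders by Unicode code point, which is not true Icelandic
    # collation but is good enough to bring near-duplicates together.
    index = sorted(set(words))
    for i, word in enumerate(index):
        for other in index[i + 1 : i + 1 + window]:
            if edit_distance(word, other) <= max_distance:
                yield word, other


words = ["aldrei", "adrei", "stjarna", "stjörnu", "kind", "mynd"]
print(list(suspicious_neighbours(words)))
```

In a heavily inflected language many flagged pairs will be legitimate forms of the same word rather than typos, so the output is only a list of candidates to check by hand.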
I am sure there are still many errors left, but after this initial cleaning up pass I can look at my alphabetical index and get further on trying to understand the grammar. Here’s a section showing forms of the word stjarna, “star”, which occurs often in Litli prinsinn.
Some of these are compound nouns: for example, stjörnufræðingur, literally “star-ologist”, is “astronomer”, and stjarnfræðiþingi, “star-ology-thing”, is “astronomical congress”. But what are all the others, most of which look like inflected forms? I can click on any of them and get a hyperlinked page of examples. For example, let’s look at the page for stjörnu, which occurs 15 times:
I see that occurrences of stjörnu usually come after a preposition. For example, we have Hann hefir aldrei horft á stjörnu, “He has never looked at stjörnu”, or En þú ert hreinn og þú kemur frá stjörnu, “But you are pure and you come from stjörnu”. Most of the others are similar. Hm, looks like this is a dative singular? My suspicions are reinforced by the fact that Swedish used to have a dative; it disappeared long ago, but still survives in a couple of fixed expressions like till salu, “for sale”, which has this -u ending.
Still a great deal more grammar to figure out! There are some improvements to the script that I hope to add soon, and which might help…
___________________________________
[Update, Aug 12 2018]
I have added another little improvement to our script. It now creates a hyperlinked version of the original text, with the words colour-marked to show how frequently they occurred in the text you’ve read so far. The initial version uses four colours. Words are in black if they occur more than five times, blue if they occur four or five times, green if they occur two or three times, and red if they occur once. Here’s an example, the start of the visit to the Drunkard:
The colours let you see at a glance approximately how well I now understand the text. Look at the first paragraph:
Á þriðja hnettinum bjó drykkjumaður. Heimsóknin þangað var mjög stutt, en hún fyllti litla prinsinn miklu þunglyndi.
(At the-third planet lived drunkard. The-visit there was very ?short, but it filled the-little prince much ?depression)
Black words like hnettinum (“planet”, I think in the dative) and mjög (“very”) are quite familiar, and I am reasonably confident that I’ve guessed the green and blue ones correctly. Only two words, stutt (“short”?) and þunglyndi (“depression”?) are in red, and these are indeed the ones I feel least certain about. I’m pretty much guessing stutt from context. I’m more confident about þunglyndi, since I know from other examples that þung, cognate to Swedish tung, is “heavy”, lyndi is probably something related to Swedish lynne, “spirit”, and there is a Swedish word tungsint, “heavy-spirited/depressed”.
This was an easier passage than average, and usually there is more red. But it feels motivating to think that, as I copy out more text and process it through the script, the red tide should start to recede…
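The four-colour scheme described above is easy to express in code. Here is a minimal Python sketch, assuming a word-count table is already available; the names colour_for and colourise, the HTML span markup and the sample counts are illustrative assumptions rather than the actual script, and the hyperlinks to the concordance pages are omitted.

```python
import html
from collections import Counter


def colour_for(freq):
    """Map a word's frequency to one of the four colour bands."""
    if freq > 5:
        return "black"   # more than five occurrences
    if freq >= 4:
        return "blue"    # four or five occurrences
    if freq >= 2:
        return "green"   # two or three occurrences
    return "red"         # one occurrence (or, as an assumption, none yet)


def colourise(tokens, counts):
    """Wrap each token in an HTML span coloured by its frequency."""
    spans = []
    for token in tokens:
        colour = colour_for(counts[token.lower()])
        spans.append(
            f'<span style="color:{colour}">{html.escape(token)}</span>'
        )
    return " ".join(spans)


# Made-up counts for three words, purely to show the output format.
counts = Counter({"var": 40, "mjög": 12, "stutt": 1})
print(colourise(["var", "mjög", "stutt"], counts))
```

Recomputing the counts each time more text is added and regenerating the page is what would make the red gradually recede.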