Tuesday, March 23, 2010

Language, efficiency, etc.

Isn't that just a stirring title?

I've been having fun lately making a language, with two purposes in mind: beauty and efficiency. No, it's not a coincidence that I'm going for a language with both rhyme and reason.

However, a problem poses itself in trying to maximize efficiency and beauty at the same time. Right now, I'm looking at ways to maximize efficiency, and later I'll make sure that it leaves room for a beautiful language to be constructed. Efficiency by itself is difficult enough, though.

First, I'm going to create a measure of efficiency. The efficiency of a statement is the number of units of information it carries divided by its number of syllables. For example,

"The quick brown fox jumps over the lazy dog."
The - modifies "fox" to make it definite.
Quick - describes "fox."
Brown - describes "fox."
Fox - states subject of statement. In fox, a number of things are stored:
1. It's a fox.
2. There's only one.
3. By its placement in the sentence, it is the subject.
Jumps - states the action the fox takes. Also stores a lot of information:
1. The action is jumping.
2. One thing does it.
3. It's present tense.
4. It's third person.
5. It's active.
6. It's indicative.
Over - describes the action of jumping.
The - modifies "dog" to make it definite.
Lazy - describes "dog."
Dog - states what is being jumped over. It also stores a number of things:
1. It's a dog.
2. There's only one.
3. Since it's after a preposition, it is what is being jumped over.

Counting it up, we have 18 units of information and 11 syllables. So that's an efficiency of 18/11 (about 1.64), which isn't too shabby.
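To make the tally concrete, here is a minimal sketch in Python (my choice of language for illustration). The per-word counts are typed in by hand from the breakdown above; nothing in the code detects syllables or meaning on its own, and the names FOX_SENTENCE and efficiency are made up for this example.

```python
from fractions import Fraction

# (word, units of information, syllables), copied by hand from the breakdown
# above. These counts are assumptions taken from the post, not computed.
FOX_SENTENCE = [
    ("The", 1, 1),
    ("quick", 1, 1),
    ("brown", 1, 1),
    ("fox", 3, 1),
    ("jumps", 6, 1),
    ("over", 1, 2),
    ("the", 1, 1),
    ("lazy", 1, 2),
    ("dog", 3, 1),
]


def efficiency(words):
    """Units of information divided by syllables, kept as an exact fraction."""
    units = sum(u for _, u, _ in words)
    syllables = sum(s for _, _, s in words)
    return Fraction(units, syllables)


ratio = efficiency(FOX_SENTENCE)
print(ratio)  # 18/11
```

Keeping the result as a fraction makes it easy to check against the hand count before rounding it.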

English seems pretty efficient, right? Well, let's change it to

"The quick brown foxes are jumping over the lazy dogs."

We didn't change the number of units of info. "Fox" and "dog" became plural, and "jumps" expanded to "are jumping," which means the same thing, just with a helping verb and an ending. The efficiency falls to 18/14 (about 1.29). Which goes to show: the efficiency of English is variable. It depends on the words chosen, and we've got some pretty long words. The efficiency of "This is supercalifragilisticexpialidocious" is supercalifragilisticexpialidociously low: 8/16 = 0.5.
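The three sentences so far can be lined up side by side with a short Python sketch. The (units, syllables) totals are simply the ones counted in this post, entered by hand; the names SENTENCES and efficiency are only for illustration.

```python
from fractions import Fraction

# (units of information, syllables) totals as counted by hand in the post.
SENTENCES = {
    "The quick brown fox jumps over the lazy dog.": (18, 11),
    "The quick brown foxes are jumping over the lazy dogs.": (18, 14),
    "This is supercalifragilisticexpialidocious.": (8, 16),
}


def efficiency(units, syllables):
    """Units of information per syllable, as an exact fraction."""
    return Fraction(units, syllables)


for text, (units, syllables) in SENTENCES.items():
    value = efficiency(units, syllables)
    print(f"{float(value):.2f}  {text}")
# 1.64  The quick brown fox jumps over the lazy dog.
# 1.29  The quick brown foxes are jumping over the lazy dogs.
# 0.50  This is supercalifragilisticexpialidocious.
```

Same information, more syllables, lower number: the measure only ever punishes length.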

Anyways, I want a language that gives me a high efficiency. This means I shall eschew endings (they add syllables to add meaning, which doesn't help the efficiency much). The efficiency for Latin averages out to be something below 1 (that is, as far as I've experimented - feel free to find some highly efficient Latin texts!).

The problem with efficiency, though, is that when something is too efficient, it can lead to mass confusion! Let's say the word "Icar" means "beautiful girl," and "Ikhar" means "ugly girl." You wouldn't want to screw up in the middle of a compliment!
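One way to put a number on "too close for comfort" is edit distance: how many single-letter insertions, deletions, or substitutions turn one word into the other. This is a measure I'm borrowing for illustration, not part of the language's design, and spelling distance is only a stand-in for how alike two words actually sound. A minimal Python sketch (edit_distance is a hypothetical helper name):

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    # prev[j] holds the distance between the first i-1 letters of a
    # and the first j letters of b.
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(
                prev[j] + 1,         # delete a letter from a
                curr[j - 1] + 1,     # insert a letter into a
                prev[j - 1] + cost,  # substitute (or keep) a letter
            ))
        prev = curr
    return prev[-1]


print(edit_distance("Icar", "Ikhar"))  # 2
```

Two small slips separate the compliment from the insult. A language that packs a lot of meaning into few syllables will tend to have many pairs this close together.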

I soon realized that the issue relates to something else I've been working on: a "Natural Language Processor," which is a neat little programming language based on... well, language. You define nouns, conjunctions, verbs, etc. to do math stuff.

This requires me to study what language is all about: making statements. Obviously, you have the nouns, which you make statements about. In math these correspond to numbers and maps and tensors and whatnot. So, what are functions and operators (sin x, x + y, etc.)? The answer isn't intuitive: they're modifiers and conjunctions. In J, a programming language that acts like a Natural Language Processor, these are verbs, but I disagree. A verb is a statement of relation: even in "Bob hit Tom," the verb is a statement of relation! Bob is the hitter, Tom is the hit. The word "hit" simply states a relation through the action: the verb is not the action itself, in other words.

So I've decided to analyze the components of language in tandem with making this efficient language. Hopefully, it will help me realize how to organize the components to maximize the efficiency of the language, and life will be easier. :)

In making a language, there's a certain amount of reasoning and logic, but there's also the most important rule: have fun. Why am I creating this language? Certainly not to benefit anyone. I'm doing this purely for enjoyment. Language and math walk hand in hand, and each brings its own rhymes and reasons to the mix.
