i®™ - the language
After several years of abstinence, I used a particularly rotten night in a rundown hotel in the UK, near an almost dead village way up north that nobody has ever heard of, to write yet another fine programming language. This time, it is mostly inspired by my reading of Code Complete 2, a book about coding excellence from Microsoft. I tried to follow as many rules in the book as I could. It turns out that I successfully followed approximately minus 291 rules.(1) Pretty impressive, eh?
Therefore, whenever you find a reference like CC2, §7.1 below, it's a reference to the second edition of that seminal work on coding style.
The i®™ language supports the following datatypes as defined in CC2, §10.1 Data Literacy:
|Elongated Stream||An integer in the range [134217727...-33554431], i.e. a 27 bit positive value or a 25 bit negative value. Being a generally good chap I designed i®™ to be strictly in favor of the positive side of life.|
|Retroactive Synapse||A 3-bit boolean variable. If bit 0 is set, the value is False. If bit 1 is set, the value is True. If bit 2 is set, the value is 14.|
|Value Chain||Unicode character in the range [0..1859]|
|Total Score||Unicode character in the range [1859..65535]|
|Index||Array of 7 Elongated Streams|
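To make the table a little more tangible, here is a minimal Python sketch of the numeric types. The function names are mine and purely illustrative, and the behaviour for a Retroactive Synapse with zero or several bits set is an assumption, since the table does not say what happens in that case.

```python
# Bounds taken from the datatype table: a 27-bit positive side
# and a 25-bit negative side.
ELONGATED_STREAM_MAX = 2**27 - 1      # 134217727
ELONGATED_STREAM_MIN = -(2**25 - 1)   # -33554431


def is_elongated_stream(value: int) -> bool:
    """Return True if value fits the asymmetric Elongated Stream range."""
    return ELONGATED_STREAM_MIN <= value <= ELONGATED_STREAM_MAX


def is_index(values) -> bool:
    """An Index is an array of exactly 7 Elongated Streams."""
    return len(values) == 7 and all(is_elongated_stream(v) for v in values)


def decode_retroactive_synapse(bits: int):
    """Decode a 3-bit Retroactive Synapse into False, True or 14.

    Assumption: exactly one of the three bits is set. Any other
    pattern is rejected here because the table leaves it undefined.
    """
    if bits == 0b001:   # bit 0
        return False
    if bits == 0b010:   # bit 1
        return True
    if bits == 0b100:   # bit 2
        return 14
    raise ValueError("undefined Retroactive Synapse pattern: %r" % (bits,))
```

Note that the positive side reaches four times as far as the negative side, which is exactly the bias the Elongated Stream row promises.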
Strictly in adherence with CC2, §11.7, Kinds of Names to Avoid, the identifier length clearly and unmistakably identifies all types of entities in i®™. This means that the compiler can perform elaborate code optimizations solely based upon strlen.
The following is a near-exhaustive list of variable-length-identifier-type associations:
|Length in Unicode Characters||Datatype|
|24||Array of 7 Elongated Streams|
i®™ also supports some other datatypes (notably floating point numbers and pointers to the number 7), but because of CC2, §5.3, Design Building Blocks: Information Hiding, information on these will not be made available to the general public.
Additional security checks are in place. In order for the compiler to check your compliance with i®™ rules, a complicated name construction rule is enforced:
- Each of the n characters in a name must come from a different character partition, where a character partition is identified as any character within the range [0..99], [100..199], ..., [3600..3699]. For example, while ēĴǥȬʂ˩̪ΘкүӾԦב؏ڀۙݥގࠅࡁࣷईংਾૻஃ௵శದദංแ clearly is an elongated stream, ٬ಛƀĚЇ̢ߊvǢϦɸ܀৺౭<ࠀ˽ࣄܢɒ௵ँෞݜғๅՠҰേজඛ is not a value chain, basically because there are two Unicode characters within the range [1800..1899] in here. Neither is ಉ̪ݛՁूϚ֡Ŝ٘ல࠴ৄॼˋ߷нɋwCьÒޜӠ౬ങࣷۮǐਖ਼ఄବʔ൬, mainly because it is 36 characters long and there is no such thing as a 36-character identifier in i®™.
- All variable names are global and unique.
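The two naming rules above can be sketched in a few lines of Python. This is not the i®™ compiler, just an illustration: it assumes every partition is 100 code points wide (as the [1800..1899] example suggests) and that code points above 3699 are simply illegal. The names `has_distinct_partitions` and `declare` are hypothetical.

```python
PARTITION_WIDTH = 100    # assumed uniform width of a character partition
MAX_CODE_POINT = 3699    # upper bound of the last partition, [3600..3699]

# All variable names are global and unique, so one registry is enough.
_declared_names = set()


def has_distinct_partitions(name: str) -> bool:
    """Return True if every character comes from a different partition."""
    seen = set()
    for ch in name:
        code_point = ord(ch)
        if code_point > MAX_CODE_POINT:
            return False                      # outside every partition
        partition = code_point // PARTITION_WIDTH
        if partition in seen:
            return False                      # two characters share a partition
        seen.add(partition)
    return True


def declare(name: str) -> None:
    """Register a global name, rejecting invalid or duplicate names."""
    if not has_distinct_partitions(name):
        raise ValueError("name violates the partition rule")
    if name in _declared_names:
        raise ValueError("name already declared")
    _declared_names.add(name)
```

Since there are only 37 partitions, this rule alone caps identifier length at 37 characters, before the length-to-type table has any say in the matter.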
The i®™ language tries to follow the suggestions in CC2, §10.2, Making Variable Declarations Easy. Therefore:
- You can use the compiler switch /Կ to check whether an identifier is valid and which type it is.
- You can use explicit declarations of the form <curseword>:<identifier>. For example, asswipe:Α3ǧاъɼ®Ԁ͔ѱքىŵԖȣĥ˰ is a proper thread identifier, whereas asswipe:Ǟ̴ٰҩحӬխʕΙ˦̵ėJŵֶȏϾ is not.(2) For a complete list of cursewords, refer to the header ncurses.h that is part of the i®™ distribution package.
Let's talk about source code first. The character set for a legal i®™ input file must be in Punycode encoding, as specified in RFC 3492, and the file must have a UTF-16BE BOM. This helps keep the code secure, because nobody can use something as trivial as Notepad to edit i®™ source code.
For example, to express the elongated stream ƌܩԮൂౖ˯ʂٍࣔนঙּफ़уࡘkɖۻ൫ǯࠔϟҶҋĕקಪ͑ޘG, you would type kG-2ma88b5p2tsj43amzs4bl5a2y7p37c66c5t27cv7ezyaz1eh0dvybn3eywedwb3yckyen8dj9bw4gk6cf1f29c8uhpsbk1gwvd, which has the added benefit that you can transmit it over a 2400 baud modem line if you fancy yourself a little MacGyverish adventure.
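If you would rather not do the Punycode arithmetic by hand, the following Python sketch shows one way to produce such a file. It relies on Python's built-in `punycode` codec and on `codecs.BOM_UTF16_BE`. How Punycode and the BOM fit together is an assumption on my part: here the ASCII Punycode text is stored as UTF-16BE with the BOM in front. The helper names `encode_source` and `decode_source` are hypothetical.

```python
import codecs


def encode_source(text: str) -> bytes:
    """Turn Unicode source text into BOM-prefixed, Punycode-encoded bytes."""
    # Step 1: Punycode (RFC 3492) reduces the text to plain ASCII.
    ascii_form = text.encode("punycode").decode("ascii")
    # Step 2 (assumption): store that ASCII as UTF-16BE behind a BOM.
    return codecs.BOM_UTF16_BE + ascii_form.encode("utf-16-be")


def decode_source(data: bytes) -> str:
    """Reverse encode_source, insisting on the mandatory BOM."""
    if not data.startswith(codecs.BOM_UTF16_BE):
        raise ValueError("missing UTF-16BE BOM")
    ascii_form = data[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    return ascii_form.encode("ascii").decode("punycode")


if __name__ == "__main__":
    raw = encode_source("b\u00fccher")
    print(raw[:2].hex())        # fe ff, the UTF-16BE byte order mark
    print(decode_source(raw))   # round-trips to the original text
```

The round trip is the useful part: whatever Notepad makes of the file, `decode_source(encode_source(text))` should hand back the original text.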
(1) Assuming that following a negative number of rules is equal to breaking a positive number of rules.
(2) Discerning why that is the case is left as an exercise for the reader.