Monday, March 5, 2018

My original programming language, Expresso -- Explaining the core

Hi, this is HAZAMA. This is the second entry about Expresso.

First of all, I'll explain several features that stand out.
First, Expresso supports vectors and dictionaries as built-in types. They can be written as literals, and the compiler treats them as special types (although they are compiled to System.Collections.Generic.List and System.Collections.Generic.Dictionary respectively). Patterns don't support them yet, however, because of implementation problems.
Second, Expresso supports the intseq type, which is short for "integer sequence" and is a generator of integers like the xrange type in Python 2 or the Range type in Rust. Unfortunately it only handles 32-bit integers, but it will still be used frequently for counting up in for loops because Expresso doesn't support traditional C-style for loops. If you index into a vector or an array with an intseq, you get an iterator (an enumerator in .NET terms) that yields the elements at the indices in the integer sequence. This is called a slice.
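To make the intseq and slice behavior a bit more concrete, here is a rough sketch in Python rather than in Expresso. The names `IntSeq` and `slice_of`, the exclusive upper bound, and the explicit 32-bit check are all assumptions made for illustration. They are not the actual ExpressoIntegerSequence implementation.

```python
from typing import Iterator, List, TypeVar

T = TypeVar("T")

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


class IntSeq:
    """Hypothetical stand-in for an integer sequence (lower, exclusive upper, step)."""

    def __init__(self, lower: int, upper: int, step: int = 1) -> None:
        if step == 0:
            raise ValueError("step must not be zero")
        # Assumption: mimic the 32-bit limitation by rejecting larger values.
        for value in (lower, upper, step):
            if not INT32_MIN <= value <= INT32_MAX:
                raise OverflowError("this sketch only handles 32-bit integers")
        self.lower, self.upper, self.step = lower, upper, step

    def __iter__(self) -> Iterator[int]:
        # Values are produced lazily, one per iteration step.
        current = self.lower
        while (self.step > 0 and current < self.upper) or (
            self.step < 0 and current > self.upper
        ):
            yield current
            current += self.step


def slice_of(items: List[T], seq: IntSeq) -> Iterator[T]:
    """Lazily yield the elements of `items` at the indices produced by `seq`."""
    for index in seq:
        yield items[index]


# Counting up in a for loop, in place of a C-style for loop.
for i in IntSeq(0, 3):
    print(i)

# Indexing a list with an integer sequence gives a lazy "slice".
numbers = [10, 20, 30, 40, 50, 60]
print(list(slice_of(numbers, IntSeq(1, 6, 2))))  # [20, 40, 60]
```

The point of the sketch is that neither the sequence nor the slice builds an intermediate list. Elements are only produced when the loop asks for them.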
Third, Expresso also supports match statements like Rust's. This construct pattern-matches against values: it destructures objects and matches against literal values. Variable declaration statements now support pattern matching too, although only tuple patterns are accepted there.
One more thing I forgot to mention: you can omit the parameter types and the return type of a closure when they are obvious, for example when the closure is passed directly to a function or method. The intent is to make method chains easy to write once the methods in System.Linq.Enumerable can be called in the future, but I'm hesitant to implement extension methods. I suspect that doing so would require iterating through every defined type whenever a new type is defined.

Now that we know the features of Expresso that stand out, let's look at how Expresso works. Currently the compiler is written entirely in C#: the lexer, the parser, the analyzer and the code generator. The compiler outputs IL code, and the parser generator is also written in C#. These are the reasons why I chose C# as the implementation language (you only need to build expression trees in order to produce something that can be executed). I'm keen to implement the compiler in Expresso itself, but a lot of problems have to be solved first. One is how to split the parser from the analyzer. Suppose we keep using the parser written in C#: because the parser and the analyzer can't be separated, only the code generator would be left to rewrite, which would make little difference to how the compiler is written. Another is how to implement the intseq and slice types. The intseq type is compiled to the ExpressoIntegerSequence type, and a feature of the C# compiler makes that type easy to write. If I implemented it in Expresso, I would have to write ExpressoIntegerSequence in full without that convenience (the feature transforms source code into a state machine, so it would be no problem as long as I know how the compiler performs the transformation).
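To show what "transforming source code into a state machine" means, here is a small Python sketch. It assumes the C# feature in question is iterator methods, where the compiler turns a method containing yield statements into a class that remembers where to resume. Both `count_up` and `CountUpStateMachine` are hypothetical names, and the sketch only handles a positive step.

```python
from typing import Iterator


def count_up(lower: int, upper: int, step: int) -> Iterator[int]:
    """Generator form: the runtime keeps track of where to resume."""
    current = lower
    while current < upper:
        yield current
        current += step


class CountUpStateMachine:
    """Hand-written equivalent: the resume point lives in an explicit state field."""

    _START, _RUNNING, _DONE = 0, 1, 2

    def __init__(self, lower: int, upper: int, step: int) -> None:
        self._current = lower
        self._upper = upper
        self._step = step
        self._state = self._START

    def __iter__(self) -> "CountUpStateMachine":
        return self

    def __next__(self) -> int:
        if self._state == self._RUNNING:
            # Resume just after the previous yield: advance before testing again.
            self._current += self._step
        if self._state != self._DONE and self._current < self._upper:
            self._state = self._RUNNING
            return self._current
        # Once finished, stay finished on every later call.
        self._state = self._DONE
        raise StopIteration


print(list(count_up(0, 10, 3)))             # [0, 3, 6, 9]
print(list(CountUpStateMachine(0, 10, 3)))  # [0, 3, 6, 9]
```

Writing the second form by hand is what would be required in a language without generator support, which is why knowing the transformation matters.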
Even though there are some problems, I can say C# is a good language for writing a compiler: code generation is fairly easy, a parser generator is available, and it runs on multiple platforms by default. If you are interested in creating your own programming language, I recommend starting with something like a LISP interpreter rather than implementing a brand-new language right away. After you do that, you can easily see what the lexer, the parser and the interpreter are doing.
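As a taste of how small such a LISP interpreter can be, here is a minimal sketch in Python. It is not related to the Expresso code base, and the supported forms (`if`, `define`, `lambda` and a handful of arithmetic operators) are an arbitrary selection for illustration. The three stages are the function `tokenize` (lexer), `parse` (parser) and `evaluate` (interpreter).

```python
import operator
from typing import Any, Dict, List, Optional


def tokenize(source: str) -> List[str]:
    """Lexer: split the source text into parentheses and atoms."""
    return source.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens: List[str]) -> Any:
    """Parser: turn the token list into nested Python lists."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        expr = []
        while tokens and tokens[0] != ")":
            expr.append(parse(tokens))
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)  # discard the closing parenthesis
        return expr
    if token == ")":
        raise SyntaxError("unexpected ')'")
    try:
        return int(token)
    except ValueError:
        return token  # anything that is not an integer is a symbol


GLOBAL_ENV: Dict[str, Any] = {
    "+": operator.add, "-": operator.sub,
    "*": operator.mul, "<": operator.lt,
}


def evaluate(expr: Any, env: Dict[str, Any]) -> Any:
    """Interpreter: walk the parsed tree and compute a value."""
    if isinstance(expr, str):
        return env[expr]  # symbol lookup
    if not isinstance(expr, list):
        return expr  # number literal
    head = expr[0]
    if head == "if":
        _, test, then, otherwise = expr
        return evaluate(then if evaluate(test, env) else otherwise, env)
    if head == "define":
        _, name, value = expr
        env[name] = evaluate(value, env)
        return None
    if head == "lambda":
        _, params, body = expr
        # Arguments are bound in a copy of the defining environment.
        return lambda *args: evaluate(body, {**env, **dict(zip(params, args))})
    func = evaluate(head, env)
    return func(*[evaluate(arg, env) for arg in expr[1:]])


def run(source: str, env: Optional[Dict[str, Any]] = None) -> Any:
    env = dict(GLOBAL_ENV) if env is None else env
    return evaluate(parse(tokenize(source)), env)


env = dict(GLOBAL_ENV)
run("(define fact (lambda (n) (if (< n 2) 1 (* n (fact (- n 1))))))", env)
print(run("(fact 5)", env))  # 120
```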
Next, the grammar. It is not currently available in a printed format or anything similar, so if you need to know which construct creates which object, see the Coco parser specification. Some of the constructs specified there don't work at present because they are not implemented yet (namely comprehensions and interfaces). If you just want a feel for the grammar, see the files under cloned_directory/ExpressoTest/sources/. The parser can't recognize some of those files, but they will still show you what the grammar looks like.
As for documentation, I'm writing it in Markdown, in English only. It is located in cloned_directory/Expresso/Documentation/.
And this wraps up the entry. I'll write another entry when I have more information to share. See you again ;)
