Sunday, March 25, 2018

My original programming language, Expresso -- The import statement and interoperability with other .NET languages

Hello again. This is HAZAMA, with yet another blog post about Expresso.

In a previous blog post, I said that you can now use .NET as the standard library. I have since expanded the specification further, and Expresso now supports interoperability with other .NET languages via assemblies. I also revised the specification of the import statement, so I'll cover that as well.
Let's say we have the following C# source code:

//In TestInterface.cs
using System;
using System.Collections.Generic;

namespace InteroperabilityTest
{
    public interface TestInterface
    {
        void DoSomething();
        int GetSomeInt();
        List<int> GetIntList();
    }
}

// In InteroperabilityTest.cs
using System;
using System.Collections.Generic;

namespace InteroperabilityTest
{
    public class InteroperabilityTest : TestInterface
    {
        public void DoSomething()
        {
            Console.WriteLine("Hello from 'DoSomething'");
        }

        public List<int> GetIntList()
        {
            Console.WriteLine("GetIntList called");
            return new List<int>{1, 2, 3, 4, 5};
        }

        public int GetSomeInt()
        {
            Console.WriteLine("GetSomeInt called");
            return 100;
        }
    }
}

// In StaticTest.cs
using System;
using Expresso.Runtime.Builtins;

namespace InteroperabilityTest
{
    public class StaticTest
    {
        public static void DoSomething()
        {
            Console.WriteLine("Hello from StaticTest.DoSomething");
        }

        public static bool GetSomeBool()
        {
            Console.WriteLine("GetSomeBool called");
            return true;
        }

        public static ExpressoIntegerSequence GetSomeIntSeq()
        {
            Console.WriteLine("GetSomeIntSeq called");
            return new ExpressoIntegerSequence(1, 10, 1, true);
        }
    }
}

Assume that the DLL containing the above code is named InteroperabilityTest.dll and we'll write this Expresso code:

module main;


import InteroperabilityTest.{InteroperabilityTest, StaticTest} from "./InteroperabilityTest.dll" as {InteroperabilityTest, StaticTest};

def main()
{
    let t = InteroperabilityTest{};
    t.DoSomething();
    let i = t.GetSomeInt();
    let list = t.GetIntList();

    StaticTest.DoSomething();
    let flag = StaticTest.GetSomeBool();
    let seq = StaticTest.GetSomeIntSeq();

    println(i, list, flag, seq);
}

Then we'll see the Console.WriteLine output, followed by 100, [1, 2, 3, 4, 5, ...], true and [1..11:1] on the console. As you can see, although an FFI (Foreign Function Interface) generally lets you call only functions, here you can create instances and even call instance methods. Of course, this works because both sides run on the same runtime environment, the CLR. Likewise, once Expresso code is compiled to IL, other .NET languages such as C# can create instances and call methods on it via reflection. We take full advantage of the .NET environment, huh?
Added on Apr. 7, 2018: Properties could already be read ("get"), and now they can be set as well.
We are almost at complete interoperability with other .NET languages. Added on Apr. 8, 2018: Now you can refer to enums defined in IL code. That means we have achieved complete interoperability with other .NET languages, I suppose.
C# can interoperate with C++, so we could interoperate with C++ from Expresso by wrapping it in C#.

Did you notice that the import statement has changed? In the previous post, the import clause took a string, but now it looks more like the ones in Python or maybe Rust (Rust calls them use declarations, though). Written in EBNF, the import statement looks something like the following:

"import" ident [ "::" ( ident | '{' ident { ',' ident } '}' ) ] { '.' ( ident | '{' ident { ',' ident } '}' ) } [ "from" string_literal ] "as" ( ident | '{' ident { ',' ident } '}' ) ';'

This means that, unlike in Python, you can't omit the as clause. In Expresso, you have to alias imported names. Otherwise you would have to refer to them by names containing "::" or ".", and Expresso doesn't allow such identifiers. You use "::" when you refer to a type that belongs to a module in an .exs file. This is also true for other expressions.
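To make the grammar concrete, here is a minimal sketch of a recognizer for it, written in Python rather than in the real compiler's C#. The names `ImportStmt`, `tokenize` and `parse_import` are made up for illustration, and the tokenizer is far simpler than a real lexer; the point is only to show that the as clause is mandatory while the from clause is optional.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Union

# Hypothetical helper types; not part of the Expresso code base.
TOKEN = re.compile(r'\s*(::|[{},.;]|"[^"]*"|[A-Za-z_]\w*)')
IDENT = re.compile(r"[A-Za-z_]\w*")
KEYWORDS = {"import", "from", "as"}


@dataclass
class ImportStmt:
    path: List[Union[str, List[str]]]  # a name, or a {a, b} group of names
    source: Optional[str]              # the string in the from clause, if any
    aliases: List[str]                 # the names introduced by the as clause


def tokenize(text: str) -> List[str]:
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_import(text: str) -> ImportStmt:
    tokens = tokenize(text)
    pos = 0

    def peek() -> Optional[str]:
        return tokens[pos] if pos < len(tokens) else None

    def take(expected: Optional[str] = None) -> str:
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        pos += 1
        return tok

    def ident() -> str:
        tok = take()
        if not IDENT.fullmatch(tok) or tok in KEYWORDS:
            raise SyntaxError(f"expected an identifier, got {tok!r}")
        return tok

    def ident_or_group() -> Union[str, List[str]]:
        if peek() != "{":
            return ident()
        take("{")
        names = [ident()]
        while peek() == ",":
            take(",")
            names.append(ident())
        take("}")
        return names

    take("import")
    path: List[Union[str, List[str]]] = [ident()]
    if peek() == "::":
        take("::")
        path.append(ident_or_group())
    while peek() == ".":
        take(".")
        path.append(ident_or_group())
    source = None
    if peek() == "from":
        take("from")
        literal = take()
        if not literal.startswith('"'):
            raise SyntaxError("the from clause takes a string literal")
        source = literal.strip('"')
    if peek() != "as":
        raise SyntaxError("the as clause cannot be omitted")
    take("as")
    alias = ident_or_group()
    take(";")
    return ImportStmt(path, source, alias if isinstance(alias, list) else [alias])
```

Feeding it the import statement from the example yields the namespace followed by the group of two type names, the DLL path, and the two aliases; dropping the as clause raises a SyntaxError.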
On the other hand, when you refer to a type in an external assembly written in C#, you specify it starting from its namespace. In addition, you can't import variables or functions directly from such an assembly. This is because IL code doesn't allow you to define variables or functions at the assembly level.
The file name in the from clause is relative to the source file that contains the import statement. That means that, in the code above, InteroperabilityTest.dll is located in the same directory as the source file. The same rule applies when you refer to .exs source files.
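The relative-path rule can be pictured with a small Python sketch. The function name `resolve_from_clause` and the example paths are hypothetical; the compiler's actual lookup code is not shown in this post.

```python
from pathlib import PurePosixPath


def resolve_from_clause(source_file: str, from_path: str) -> str:
    """Resolve a from-clause path against the directory of the importing file."""
    base_dir = PurePosixPath(source_file).parent
    # Joining drops a leading "./", so the result sits next to the source file.
    return str(base_dir / from_path)


if __name__ == "__main__":
    print(resolve_from_clause("/proj/src/main.exs", "./InteroperabilityTest.dll"))
```

The working directory of the compiler plays no role in this model; only the location of the file containing the import statement matters.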

Now that we can interoperate with C#, I'm looking forward to writing the Expresso compiler itself in Expresso.

My original programming language, Expresso -- import and external assemblies

Hello, this is HAZAMA. I've lost count of how many posts I've written this month, and this one is about Expresso again.

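In a recent post I wrote that .NET can now be used as the standard library. This time I improved that specification further so that arbitrary external assemblies can be loaded and interoperated with. Along with that, I revised the import statement, whose specification had been vague, so I'll explain it briefly.
First, suppose we have the following C# code.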

//In TestInterface.cs
using System;
using System.Collections.Generic;

namespace InteroperabilityTest
{
    public interface TestInterface
    {
        void DoSomething();
        int GetSomeInt();
        List<int> GetIntList();
    }
}

// In InteroperabilityTest.cs
using System;
using System.Collections.Generic;

namespace InteroperabilityTest
{
    public class InteroperabilityTest : TestInterface
    {
        public void DoSomething()
        {
            Console.WriteLine("Hello from 'DoSomething'");
        }

        public List<int> GetIntList()
        {
            Console.WriteLine("GetIntList called");
            return new List<int>{1, 2, 3, 4, 5};
        }

        public int GetSomeInt()
        {
            Console.WriteLine("GetSomeInt called");
            return 100;
        }
    }
}

// In StaticTest.cs
using System;
using Expresso.Runtime.Builtins;

namespace InteroperabilityTest
{
    public class StaticTest
    {
        public static void DoSomething()
        {
            Console.WriteLine("Hello from StaticTest.DoSomething");
        }

        public static bool GetSomeBool()
        {
            Console.WriteLine("GetSomeBool called");
            return true;
        }

        public static ExpressoIntegerSequence GetSomeIntSeq()
        {
            Console.WriteLine("GetSomeIntSeq called");
            return new ExpressoIntegerSequence(1, 10, 1, true);
        }
    }
}

Assuming this assembly is named InteroperabilityTest.dll, we write the following Expresso code.

module main;


import InteroperabilityTest.{InteroperabilityTest, StaticTest} from "./InteroperabilityTest.dll" as {InteroperabilityTest, StaticTest};

def main()
{
    let t = InteroperabilityTest{};
    t.DoSomething();
    let i = t.GetSomeInt();
    let list = t.GetIntList();

    StaticTest.DoSomething();
    let flag = StaticTest.GetSomeBool();
    let seq = StaticTest.GetSomeIntSeq();

    println(i, list, flag, seq);
}

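Then, along with the Console.WriteLine output, it prints 100, [1, 2, 3, 4, 5, ...], true and [1..11:1]. As you can see, although an FFI (Foreign Function Interface) normally lets you call only functions, here you can also create instances and call instance methods. That's the benefit of sharing a runtime environment. Also, once the code is compiled to IL, it can be called from other languages such as C# using reflection. It's practically "invincible". This is where the strength of choosing .NET shows.
Added on Apr. 7, 2018: Property getters already worked, and today I implemented setters too, so properties are now fully supported. The only thing left for complete compatibility is support for enums, so we're one step away. Added on Apr. 8, 2018: IL enums can now be referenced as well. This should mean we have complete compatibility.
As far as I remember, C# can call C++ code, so by wrapping it in C#, C++ libraries should also be callable from Expresso.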

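Now, what has changed considerably compared to the earlier code is the import part. In the earlier code, a string came right after import, but after the revision it looks like a cross between Rust's and Python's imports. Written in EBNF, the import statement looks like this: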

"import" ident [ "::" ( ident | '{' ident { ',' ident } '}' ) ] { '.' ( ident | '{' ident { ',' ident } '}' ) } [ "from" string_literal ] "as" ( ident | '{' ident { ',' ident } '}' ) ';'

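In other words, unlike in Python, the as clause can't be omitted. This is because a name can't be imported under its original identifier. An imported identifier must always be given an alias. Otherwise it would have to be identified by a name containing "::" or ".", which Expresso's specification doesn't allow (identifiers containing "::" or "." can't be defined). In the import clause, you use "::" when specifying a type inside a module from an .exs file. The same goes for other expressions.
On the other hand, when loading a DLL generated from C#, specify the type name starting from the namespace. Also, when loading a DLL generated from C# or a similar language, you can't specify variable or function names as identifiers to import. This is due to the restriction that IL code can't define fields or methods at the assembly level.
The file name in the from clause is a path relative to the source file it is written in. That is, in the code above InteroperabilityTest.dll exists in the same directory as this source file. The behavior is the same when loading other .exs files.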

Now that C# code can be called freely from Expresso, I'd like to start thinking about self-hosting, or self-bootstrapping, or whatever it's called...

Saturday, March 17, 2018

My original programming language, Expresso -- The intseq type

Hi, this is HAZAMA. And this is the 5th blog post about Expresso this month.

I wasn't planning to write blog posts about specific types, but I noticed an interesting fact about this one, so here is a post about it.
The intseq type is a builtin type in Expresso, and it is similar to the xrange type in Python or the Range type in Rust. Frankly, it is a generator that produces integers in sequence, so the basic functionality is the same as that of the similar types in other programming languages. A range in Kotlin can be tested for inclusion with the in operator, but in Expresso you test inclusion through the Includes method instead. in is a keyword in Expresso as well, but it is only used on the right-hand side of for loops.

module main;


def main()
{
    for let i in 0..10 {    // start..end:step is a ternary operator. When step is omitted, it defaults to 1.
        println(i);
    }

    for let j in 9...0:-1 { // If you set step to a negative integer, the sequence goes in the negative direction. Note that we don't check whether it is valid (Added on Apr. 5, 2018: The compiler now checks validity when the expression consists only of literals. In other words, the compiler issues a warning if you write 0...9:-1 instead).
        println(j);
    }

    for let k in 0..10:3 {  // Of course, you can set it to more than 1.
        println(k);
    }
    let a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...]; // I wish we could write this as (0..10).select(|x| x);. Since Apr. 5, 2018, it can also be written as [0..10, ...];
    let a2 = a[2...4];
    println(a2);
}

The above code prints [2, 3, 4] as expected (the actual output will be something like [List<int>, 2...4:1] because it prints the result of calling ToString on the slice). The type of a2 is slice, which is also a builtin type. Rust has a slice type as well and allows the same notation, but as far as I know other programming languages don't, so I consider it a strong point of Expresso.
In Kotlin you can apparently create ranges of floats (a sequence of floats doesn't seem to make sense to me, so I may have this wrong), but in Expresso you can do so for ints only. Note that it might raise an exception when you try to create a range outside of the range of int (Added on Apr. 5, 2018: Actually the compiler complains, because a number outside of the int range is interpreted as a double and the intseq expression doesn't accept a double).
In addition, the range operator is one of the two ternary operators in Expresso. The other ternary operator is the conditional operator ( ? : ).
Added on Apr. 5, 2018: As shown in the comment on the initialization of a, you can now initialize a sequence with an intseq. Other languages such as Rust, Kotlin and Swift don't support this syntax, so as far as I know it's unique to Expresso. You can create an array with [0..10]; and of course with a step as well: [0..10:2];.
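For readers who know Python better than Expresso, the behavior described in this post can be approximated with a short sketch. It is only a model: `intseq` and `slice_by` are hypothetical helper names, the warning here is issued at run time (the Expresso compiler checks literal-only expressions at compile time), and I assume that ".." excludes the end while "..." includes it, as the loops and the a[2...4] example above suggest.

```python
import warnings
from typing import List, Sequence


def intseq(start: int, end: int, step: int = 1, inclusive: bool = False) -> range:
    """Model of start..end:step (exclusive) and start...end:step (inclusive)."""
    if not all(isinstance(v, int) for v in (start, end, step)):
        raise TypeError("an intseq only accepts ints")
    if step == 0:
        raise ValueError("step must not be zero")
    # Mirrors the literal-only consistency check, e.g. 0...9:-1.
    if (step > 0 and start > end) or (step < 0 and start < end):
        warnings.warn("the sign of step does not match the direction of the sequence")
    if inclusive:
        end += 1 if step > 0 else -1
    return range(start, end, step)


def slice_by(seq: range, items: Sequence[int]) -> List[int]:
    """Model of indexing a collection with an intseq, as in a[2...4]."""
    return [items[i] for i in seq]


if __name__ == "__main__":
    print(list(intseq(0, 10, 3)))                      # like 0..10:3
    print(list(intseq(9, 0, -1, inclusive=True)))      # like 9...0:-1
    a = list(intseq(0, 10))                            # like [0..10, ...]
    print(slice_by(intseq(2, 4, inclusive=True), a))   # like a[2...4]
```

Unlike this eager list, the real slice type is a lazy view over the underlying collection, which is why its ToString output mentions the source list and the range rather than the elements.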

Friday, March 16, 2018

My original programming language, Expresso -- The intseq type


Hello, this is HAZAMA. Undeterred as ever, this is another post about Expresso.

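I wasn't planning to write posts about specific types, but I noticed an interesting fact, so here we are.
Expresso's intseq type is a builtin type; thinking of Python's xrange or Rust's Range type will help you understand it quickly. Frankly, it's a generator that produces a sequence of integers, so its basic functionality is the same as in other languages. Kotlin's range seems a bit more powerful in that inclusion can be tested with an operator, whereas Expresso goes through a method called Includes. in is a keyword in Expresso too, but it can only be used on the right-hand side of for statements.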
module main;


def main()
{
    for let i in 0..10 {    // start..end:step is a ternary operator. When step is omitted, it is 1
        println(i);
    }

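    for let j in 9...0:-1 { // With a negative step, the sequence can go in the negative direction. Consistency is not checked (Added on Apr. 5, 2018: Consistency is now checked, but only when literals are used. Writing 0...9:-1 produces a warning. Nothing is checked when variables are involved)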
        println(j);
    }

    for let k in 0..10:3 {  // Of course, step can be greater than 1
        println(k);
    }
    let a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...]; // In the future I'd like this to be writable as (0..10).select(|x| x); (Added on Apr. 5, 2018: It can now be written as [0..10, ...];)
    let a2 = a[2...4];
    println(a2);
}

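The code above outputs [2, 3, 4] as expected (well, since it's the result of ToString() on the Slice type, it looks more like [List<int>, 2...4:1]). The type of a2 is slice, which is also a builtin type. Rust has a Slice type too, but no other language allows this notation (correction: Rust allows this notation as well), so that is the noteworthy part.
In Kotlin, ranges seem to be defined for floats too (a sequence of floats doesn't seem to make sense, so maybe I'm imagining it), but in Expresso they exist for int only. Anything outside that range will probably crash (only the type is examined, so an out-of-range value can't be reported as an explicit error and should fail at the conversion to int) (Added on Apr. 5, 2018: This was wrong. In fact, a value outside the int range is interpreted as a double, and the compiler complains that intseq doesn't accept a double).
By the way, the range operator is one of the two ternary operators in Expresso. The other is the well-known conditional operator ( ? : ).
Added on Apr. 5, 2018: As the comment on the declaration of a says, a sequence can now be initialized with an intseq. Surprisingly, Rust, Kotlin, Swift and others haven't adopted this notation, so as far as I can see it's a feature unique to Expresso. Writing [0..10]; gives you an array instead. Of course, you can also specify a step and write [0..10:2];. It's like a list comprehension with limited functionality.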

Thursday, March 15, 2018

My original programming language, Expresso -- About functions

Hi, this is HAZAMA. And this is another blog entry about Expresso this month. We'll be talking a bit about the grammar this time around.

Functions are one of the most common language constructs and Expresso, of course, has them. Specifically, Expresso has module-level functions and methods in class definitions. They are implemented as static methods of module classes and as ordinary methods of classes, respectively.
module main;


def test()
{
    let a = 10;
    return a + 10;
}

def test2(n (- int)
{
    return n + 10;
}

def test3(n (- int) -> int
{
    return n + 20;
}

def test4(n (- int) -> int
{
    if n >= 100 {
        return n;
    }else{
        return test4(n + 10);
    }
}

def main()
{
    let a = test();
    let b = test2(20);
    let c = test3(20);
    let d = test4(80);

    println(a, b, c, d);
}

This code snippet comes from the test code again. The return type of an Expresso function is inferred from the body when it is omitted, but I'm wondering whether to change that (in the code above, all the functions are inferred to return an int). In Kotlin, return types are inferred only when they are Unit (roughly Void), and I'm thinking Expresso should perhaps do the same.
Parameter types are inferred only when the parameters have default values (there are none in the code above). The functions in the code above, in order:

  1. takes no parameters, and the return type is implicit.
  2. takes a parameter, and the return type is implicit.
  3. takes a parameter, and the return type is explicit.
  4. takes a parameter, and the return type is explicit and it contains a recursive call.
I guess there are no notable differences from functions in other programming languages.
As an aside, although the parser can parse variadic parameters, the compiler doesn't emit any code for them, so you can't use variadic parameters yet. I added them for the print* functions, but I can't come up with many other use cases, so I haven't decided whether to implement them. Maybe I should, because doing so isn't that hard. Likewise, I haven't decided whether to implement so-called string interpolation. If it existed, it would replace the printFormat function. It's a difficult decision. (Added on Apr. 8, 2018: String interpolation has been implemented and the printFormat function has been removed.)

My original programming language, Expresso -- Functions

Hello, this is HAZAMA. As usual, this is a post about Expresso. This time I'll touch a little on the concrete grammar.

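Functions are one of the elements that any programming language has as a matter of course. Expresso has them too, and it distinguishes module functions from class methods. In the implementation, the former are static methods of a module class and the latter are ordinary class methods.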
module main;


def test()
{
    let a = 10;
    return a + 10;
}

def test2(n (- int)
{
    return n + 10;
}

def test3(n (- int) -> int
{
    return n + 20;
}

def test4(n (- int) -> int
{
    if n >= 100 {
        return n;
    }else{
        return test4(n + 10);
    }
}

def main()
{
    let a = test();
    let b = test2(20);
    let c = test3(20);
    let d = test4(80);

    println(a, b, c, d);
}

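This code is also taken from the test code. In Expresso, the return type of a function can always be omitted and is then inferred from the body, but I'm wondering whether to change this (in the example above, every function is interpreted as returning int). In Kotlin, inference applies only when the type is Unit (almost the same as Void), so I think Expresso should perhaps do the same.
Parameter types are inferred only when there is a default value (there are none in the example above).
Going through the example above from the top, the functions have:

  1. no parameters, return type omitted
  2. a parameter, return type omitted
  3. a parameter, return type explicit
  4. a parameter, return type explicit, with a recursive call
I don't think there is anything very different from other languages.
As an aside, functions taking variadic parameters can be parsed, but no code is generated for them, so they don't work yet. I introduced them for the print* family, figuring it was fine because C# has them too, but I can't think of many concrete use cases and am unsure what to do. The implementation itself isn't that hard, so maybe just putting them in is the right call... Likewise, I can't decide whether to introduce the feature commonly called string interpolation. With it, the printFormat function could be replaced. It's a real dilemma. (Added on Apr. 8, 2018: String interpolation has been implemented and the printFormat function has been removed)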

Tuesday, March 13, 2018

My original programming language, Expresso -- Tools are now ready

Hi, this is HAZAMA. And this is another blog entry for Expresso.

Until now, you could only call methods on a few types such as System.Math. Now programs can construct new instances of C# types, read properties, and resolve which overload of a method or constructor to call, so we can regard the .NET Framework as the standard library for Expresso, which is a big step forward. I also introduced the null literal in order to interact with the .NET Framework. I'm planning to prohibit the use of null literals in contexts that don't involve .NET. If you need to express the absence of a value in other contexts, you will be able to use the Option type in the standard library (I have no idea yet how it and other types in the standard library will be provided).


module main;

import "System.IO.File" as File;
import "System.IO.FileStream" as FileStream;
import "System.Text.UTF8Encoding" as UTF8Encoding;


def main()
{
    var writer (- FileStream;
    try{
        writer = File.OpenWrite("./some_text.txt");
        let bytes = UTF8Encoding{encoderShouldEmitUTF8Identifier: true}.GetBytes("This is to test writing a file");
        writer.Write(bytes, 0, bytes.Length);
    }
    finally{
        if writer != null {
            writer.Dispose();
        }
    }
}

The code above comes from the test code and, as you can see, you can now use the .NET Framework as if it were built into Expresso. You call a constructor by passing values in the order in which its parameters are defined. The compiler currently ignores the names of the arguments because it only checks that the argument types match the parameter types of constructors and methods, but I suppose it would be better to check both the types and the names (in the code above I used the name of the constructor parameter as the field name in the object creation expression). Note that when you create an instance of an Expresso type, you have to pass values in the order in which the fields are defined in the class definition. Even if you specify the names of the fields, they are simply ignored. If you need constructors that initialize a new instance with certain values, you can define factory functions that always pass the same values for the relevant fields. Note also that you can't define methods in a class definition that return the class itself or take it as a parameter (so you will define the factory functions in modules).
As you can see, because Expresso doesn't have a using statement as C# does, the body looks ugly. Maybe I should add a similar construct. In addition, I'm planning to make it so that mutating methods can't be called through immutable variables, as in Rust (in the code above, writer.write would be affected). And because Expresso doesn't have enums, you also can't write code that uses .NET's enums. I'm going to make Expresso's enums algebraic data types, so you likely won't be able to use C#'s enums directly through them, but I'll probably work something out because otherwise you can't write code that uses the full capability of the FileStream class. Finally, I converted the names of C#'s methods to camel case, so be careful not to call them like File.OpenWrite("./some_text.txt"). (Added on Apr. 8, 2018: This feature has been removed because it makes the language too complex)
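To show why a using-like construct would tidy this up, here is the same pattern sketched in Python, first with try/finally as in the Expresso code and then with Python's with statement, which plays the role of C#'s using. The function names and the file path are illustrative only.

```python
import os
import tempfile


def write_with_finally(path: str, text: str) -> None:
    """Manual cleanup, shaped like the Expresso example above."""
    writer = None
    try:
        writer = open(path, "wb")
        data = text.encode("utf-8")
        writer.write(data)
    finally:
        # The null check mirrors `if writer != null` in the Expresso code.
        if writer is not None:
            writer.close()


def write_with_context(path: str, text: str) -> None:
    """The same work with a scoped resource; close() is called automatically."""
    with open(path, "wb") as writer:
        writer.write(text.encode("utf-8"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        target = os.path.join(directory, "some_text.txt")
        write_with_context(target, "This is to test writing a file")
        print(os.path.getsize(target))
```

Both functions write the same bytes; the second simply moves the cleanup into the language construct, which is what a future Expresso equivalent of using would presumably do.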

Friday, March 9, 2018

My original programming language, Expresso -- Treating the .NET Framework as the standard library

Hello, this is HAZAMA. This is another post about Expresso.

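Until now only a limited set of classes such as System.Math could be used, but with an update the other day I made it possible to new up C# types, read properties, and resolve overloads of methods and constructors. With this, the whole .NET Framework can effectively be regarded as Expresso's standard library, which should be a great asset. I also introduced the null literal for interoperating with the .NET Framework. The null literal will be prohibited outside .NET-related contexts. To express the presence or absence of a value in Expresso contexts, I plan to provide a separate Option type (how Expresso's own standard library will be provided is undecided).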
module main;

import "System.IO.File" as File;
import "System.IO.FileStream" as FileStream;
import "System.Text.UTF8Encoding" as UTF8Encoding;


def main()
{
    var writer (- FileStream;
    try{
        writer = File.OpenWrite("./some_text.txt");
        let bytes = UTF8Encoding{encoderShouldEmitUTF8Identifier: true}.GetBytes("This is to test writing a file");
        writer.Write(bytes, 0, bytes.Length);
    }
    finally{
        if writer != null {
            writer.Dispose();
        }
    }
}

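I just copied the test code as is, but this is how .NET types can be put to use. A constructor is called by giving an object creation expression values of the constructor's parameter types, in order. Overload resolution for constructors and methods looks only at types; names aren't checked at the moment, so anything will do, but I'd like to add some validation there in the future (in the example above, I used the constructor's parameter name as the field name in the object creation expression). Note that when an object creation expression creates an Expresso class, the initial field values are assigned in the order in which the fields are defined. Be aware that even if you specify field names in the object creation expression, they currently aren't interpreted that way. If you want to supply suitable initial values, write a factory function. However, note that you can't write methods in a class that return the class itself or take it as a parameter (so it's best to make the factory function a module function).
Because there's no construct like C#'s using, the code ends up as this ungainly try/finally; I should probably add a new construct for this too. Also, I plan to change things so that methods that mutate the receiver can't be called through immutable variables (in this code, writer.write would be affected) (Added on Apr. 8, 2018: As of today this is implemented, but code from external assemblies isn't checked, so it doesn't produce an error.). Also, since there are currently no enums, you can't write code that uses .NET enums either. I plan to make Expresso's enums algebraic data types as in Rust, so even once they're introduced, .NET enums will most likely not be usable through them; but that would make it impossible to cover every way of creating a file, so I'll work something out (probably) (Added on Apr. 8, 2018: .NET enums can now be referenced. Separately from this, I plan to implement Expresso's enums as algebraic data types). Finally, .NET method names are converted to camel case to match Expresso. Take care not to write File.OpenWrite in the code above and then wonder why it doesn't work. (Added on Apr. 8, 2018: This feature has been removed because it made things too complex)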

Monday, March 5, 2018

My original programming language, Expresso -- Explaining the core

Hi, this is HAZAMA. And this is the second entry about Expresso.

First of all, I'll explain several features that stand out.
First, Expresso supports vectors and dictionaries as builtin types. They can be written in literal form and the compiler treats them as special types (although they are compiled to System.Collections.Generic.List and System.Collections.Generic.Dictionary respectively). Patterns don't support them yet, though, because of implementation problems.
Second, Expresso supports the intseq type, which is short for "integer sequence" and is a generator of integers like the xrange type in Python or the Range type in Rust. It only handles 32-bit integers, unfortunately, but it will be used frequently for counting in for loops because Expresso doesn't support the traditional C-style for loop. If you index into a vector or an array with an intseq, you get an iterator (an enumerator, in .NET terms) that yields the elements at the positions given by the integer sequence. This is called a slice.
Third, Expresso also supports match statements like Rust's. This construct pattern-matches against values: it destructures objects and matches against literal values. Variable declaration statements now support pattern matching as well, although only with tuple patterns.
Oh, I forgot to mention this, but you can omit the parameter types and the return type of a closure when they are obvious, for example when the closure is passed directly to a function or method. The idea is to make method chains easy to write once the methods in System.Linq.Enumerable can be called, but I'm hesitant to implement extension methods. I suspect they would force the compiler to iterate through every defined type whenever a new type is defined.

Now that we know the features of Expresso that stand out, let's look at how Expresso works. Currently the compiler is written entirely in C#: the lexer, the parser, the analyzer and the code generator. The binary the compiler outputs is IL code, and the parser generator is also written in C#. These are the reasons why I chose C# as the implementation language (you only need to build expression trees in order to produce something executable). I'm keen to implement the compiler in Expresso, but there are a lot of problems to solve first. For example, the parser and the analyzer can't really be separated, so if we keep using the parser written in C#, all that is left to rewrite is the code generator, and that would make little difference to how the compiler is written. How to implement the intseq and slice types is another problem. The intseq type is compiled to the ExpressoIntegerSequence type, which is easy to write thanks to a C# compiler feature, so an Expresso implementation would have to implement ExpressoIntegerSequence in full without that convenience (the feature transforms the source code into a state machine, so it would be no problem if I knew exactly how the compiler performs the transformation).
Even though there are some problems, code generation is fairly easy, there is a parser generator, and it runs on multiple platforms by default, so I can say C# is a good language for writing a compiler. If you are interested in creating your own programming language, I recommend starting by implementing a LISP interpreter rather than a whole new language. After you do that, you can easily see what the lexer, the parser and the interpreter are doing.
Next, the grammar. It is not currently available in printed form or anything similar, so if you need to know which construct creates which object, see the Coco parser specification. Some of the constructs specified there don't work at present because they are not implemented yet (namely, comprehensions and interfaces). If you want a quick feel for the grammar, see the files under cloned_directory/ExpressoTest/sources/. The parser can't recognize all of those files, but they will show you what the grammar is like.
As for documentation, I'm writing it in Markdown, in English only. It is located in cloned_directory/Expresso/Documentation/.
And this wraps up the entry. I'll write another entry when I have more information to share. See you again ;)

Friday, March 2, 2018

My original programming language, Expresso -- Introduction

Hi, I'm HAZAMA. This is the first entry in this blog written in English. It is the introduction to my original programming language, Expresso.
First of all, as a programmer, what is the one thing you really want to complete in your lifetime? I guess, and hope, that it's a programming language.

Expresso originated about four or five years ago. It now supports try and catch statements and can accomplish many tasks, so I've decided to release it as an alpha version. I'm not publishing an official web site or anything like that yet, though.
The language name, Expresso, was coined as a mixture of "expressive" and "Espresso", meaning that it's highly expressive and that it should be an easy-to-write programming language. There is a slogan as well: "Easy for beginners, elegant for enthusiasts". The slogan deliberately starts many of its words with "e", alluding to the Japanese phrase "e wo sagasu gengo", a language "seeking good things". The pronunciation of e is the same as that of a Japanese word for "good". So it's just a pun in Japanese.
Expresso aims to be an educational programming language, should it ever spread around the world. Hence the slogan. In other words, Expresso is going to be an easy-to-write programming language for beginners and yet an expressive programming language for enthusiasts.
You might be wondering why it aims to be a programming language for education. Here is the answer: I learned Pascal when I was a university student, and it was an old-fashioned language. And Wikipedia says it is a programming language for education (only in the Japanese version, as of writing this entry. The English version says "a language intended to encourage good programming practices using structured programming and data structuring", though). That's why I made my original programming language a language for education. Yes, I admit that Expresso won't be faster than Rust.

Its specification, which is currently vague about many things, describes a strictly typed, object-oriented programming language. Rust has had a big influence on Expresso, so it also has a lot of features from Rust.
Currently Expresso runs on the .NET environment only. The reasons for this are that you can set up a .NET environment rather easily and that it runs on multiple platforms. Oh yeah, and I like C# the most.

Before covering the traditional Hello world program in Expresso, we'll look at how to set up an Expresso environment. First, git clone the repository from GitHub, because we have no official web site. Then run "git submodule update --init" to resolve the dependencies. Next, make a directory named "test_executables" in cloned_directory/ExpressoTest/. Binaries are created in this directory when the tests run, so the tests won't run if it is missing. I should probably add this directory to the git repository.
Added on Apr. 7, 2018: Then build the InteroperabilityTest project and move the resulting dll file to /ExpressoTest/sources/for_unit_tests.
After that, if you are a Mac or Linux user, you should be able to open the solution file in your IDE and build and run the solution (note that you may have to change your IDE settings so that it automatically downloads missing projects with NuGet, because the solution now uses NuGet to download a dependency project). If you are a Windows user, you must get Coco, and you should probably write a batch file that automatically builds the parser with Coco. After downloading Coco from the "Coco/R for C#" section of the Coco/R web site, put it in cloned_directory/Expresso/. The Expresso project contains the core source files that power Expresso. Then write a batch file similar to cloned_directory/Expresso/parserCompile.sh. It would be better to modify the project settings to run the batch file automatically, because then the parser is generated and compiled whenever you run the build command.
Even though I have said a lot about setting up on Windows, I can't guarantee that it runs on Windows. So I recommend using Mac + Visual Studio or Linux + Xamarin Studio (I used the latter and am using the former now) (Added on Apr. 7, 2018: I found that the EmitterTests don't run on Windows because Mono on Mac and .NET on Windows provide different implementations.) (Added on Apr. 8, 2018: Now most of the EmitterTests run on Windows. But there are still some tests that issue errors I can't resolve).

OK, enough with the preparation :) Now let's get to a real program, the hello world program.

module main;
def main()
{
    println("Hello, world!");
}

Let's execute an Expresso program before inspecting it. Build the ExpressoConsole project and run mono exsc.exe hello_world.exs -o ./ -e hello_world. This should compile the source file to hello_world.exe. To run the executable, you need to have Expresso.dll and ExpressoRuntime.dll in the current directory, so copy them first and then run mono hello_world.exe to actually execute the Expresso program. Do you see the "Hello, world!" text on the console? You made it! You've successfully compiled and run an Expresso program!

In Expresso, one file corresponds to one module, as in Python. Every module has to be explicitly named. The program's entry point is the main function. At present the main function takes no parameters and returns nothing (you can write those but they will simply be ignored) (Added on Apr. 11, 2018: now the main function can take an args parameter and return an int.).
As you can see, functions and methods are defined with the def keyword. It is also a keyword in Python and Ruby. Because def introduces both functions and methods, a keyword derived from "function", like Rust's fn, would feel strange in Expresso. Rust has trait objects but no objects in the class-based sense, so that keyword is fine in Rust.
Even though it doesn't appear in this example, names precede types. The (- sign separates the name and the type. This sign is unique to Expresso (or it should be) and is derived from the mathematical ∈ sign. It takes two keystrokes on purpose: the idea is that programmers should have to write the types of variables explicitly as rarely as possible.
If you need to write the return type of a method or a function explicitly, you use the -> sign as in Rust. Because the return types of methods and functions are inferred from their bodies, you can omit them (in the above code, the return type is in fact void).
The println function that the main function calls is a builtin that calls the Console.WriteLine method. It takes a variable number of parameters and prints them out separated by commas. There are also the print function, which doesn't put a newline at the end, and the printFormat function, which takes a format string of the kind Console.WriteLine accepts as its first parameter (Added on Apr. 11, 2018: printFormat is no longer supported because I implemented string interpolation).
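The behavior of the print family described above can be modeled in a few lines of Python. This is only an approximation: `format_line`, `println` and `print_no_newline` are names invented for the sketch, the ", " separator is my assumption about what "separated by commas" looks like, and the real builtins format values through .NET rather than through Python's str.

```python
def format_line(*values: object) -> str:
    """Join any number of values with commas, as println is described to do."""
    return ", ".join(str(value) for value in values)


def println(*values: object) -> None:
    """Variadic print that ends the line."""
    print(format_line(*values))


def print_no_newline(*values: object) -> None:
    """Variadic print that does not put a newline at the end."""
    print(format_line(*values), end="")


if __name__ == "__main__":
    println("Hello, world!")
    println(20, 30, 40, 100)
```

Calling println with several arguments, as the later posts do with println(a, b, c, d), therefore produces a single comma-separated line.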

This finishes the introduction. In the next entry, we'll examine the features and how Expresso works.