Koga
In Koga, most of your typical language constructs (Byte, Int, If, While etc) are implemented using the language and not within the parser. You might consider this as a new assembly language since we define these constructs using things like bytes, addresses, and instructions. The examples here try to show roughly how these constructs are implemented, along with how they are used. The language is still young so there is a decent amount of verbosity. Some details may also be omitted.
- Byte and Int
- If
- While
- Switch
- Enums
- Pointer
- Array
- Strings
- Administrator
- References, Fields, and Methods
- Structs
- Hello world
- Concurrency
Introduction
There are a few things to understand before learning the syntax. This will be a very brief introduction that should make more sense as you read through this page. We'll work up to a "Hello world" example.
Documents
At the core of Koga are Documents, a data format that describes information related to your programs as abstractly as possible. These are analogous to JVM class files. A Document needs a globally unique name (e.g. x.y.z); there are no packages. Each document has information related to its function, e.g. methods and fields. There are currently four types:
- Host documents specify programs that have their own memory space i.e. a process. An example might be a Calculator program.
- Hosted documents specify a program that exists in a Host's memory space. An example might be an ArrayList or HashMap.
- Interface documents specify a table of methods for Hosted documents to implement. An example is a List.
- Protocol documents, similar to Interfaces, specify a table of protocol methods for Host documents to follow. A protocol method is different from an interface method because Hosts don't share memory space with each other. An example is a window system.
Language
Koga is more of an extendable language of languages than a language in and of itself.
Each source file starts by naming the parser it wants to use, e.g. "parser hosted;".
Each parser builds an object that implements one of two interfaces: Usable or Compilable.
- A Compilable is the simpler of the two. It has a compile(DocumentBuilder) method, the idea being that this builds a Document.
- A Usable is more complicated and has a few methods, all fairly similar to each other: declare, construct, and invoke. Each accepts a MethodBuilder. The idea is that these are used to build up a method, e.g. its instructions.
Byte and Int
Let's start by looking at how the Byte and Int are defined, and how you can use them.
parser machine;
core.Byte {
byte[1] val;
constructor(b8 imm) {
integer(ADD, II, LDA, val, IL, 0d0, AL, imm);
}
plus(b8 imm) {
integer(ADD, TI, val, val, imm);
}
}
parser machine;
core.Int {
byte[4] val;
constructor(b32 imm) {
integer(ADD, II, LDA, val, IL, 0d0, AL, imm);
}
constructor(Int copy) {
integer(ADD, TI, LDA, val, ADA, copy.val, IL, 0d0);
}
plus(b32 imm) {
integer(ADD, TI, LDA, val, LDA, val, AL, imm);
}
plus(Int in) {
integer(ADD, TT, LDA, val, LDA, val, ADA, in.val);
}
}
parser host;
usables {
core.Byte;
core.Int;
}
test.ByteAndIntTest {
main() {
Byte x 0;
Int y 0;
Byte z 0;
x + 10;
y + 25;
z plus(111);
y + y;
}
}
Let's start with the "machine" parser. This is like writing assembly in the style of a Java class. Looking at the Int source, it's like it has fields, constructors, and methods. It's got one field named "val" which is an address to a sequence of four bytes. The constructor and method statements here are defined in terms of machine instruction builders.
Some Document types have methods, and those methods have instructions. A compiled instruction has:
- a type e.g. integer (i) or jump (j)
- a subtype e.g. ADD or AND or OR
- a source type e.g. TI, TT.
- sources
An instruction could be written as integer(ADD, TI, 0, 0, 6). Note that an instruction and an instruction builder are different. This instruction says: add the literal value 6 to the value at address 0, and store the result at address 0.
The source type is how the instruction should interpret the source values. Each source potentially needs an input type, and these values are chained together into one string e.g. TI, TT. In the case of integer instructions, the first source is treated as a destination value and is implicitly an address. The current source types are:
- Immediate (I) - a direct value, no need to load anything
- Logician (L) - a register value from the small selection of registers e.g. task address, instruction address, object, table.
- Task (T) - a task relative location of a value to load
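To make the source types concrete, here is a minimal Python sketch of how an integer ADD instruction might be interpreted. The names load and execute_add, the 4-byte little-endian width, and the flat bytearray standing in for task memory are all assumptions for illustration, not Koga's actual logician. The Logician (L) source type is left out.

```python
def load(memory, source_type, value, width=4):
    """Interpret one source according to its source type letter."""
    if source_type == "I":  # Immediate: the value is used directly
        return value
    if source_type == "T":  # Task: the value is a task-relative address to load from
        return int.from_bytes(memory[value:value + width], "little")
    raise ValueError(f"unsupported source type {source_type!r}")


def execute_add(memory, source_types, dest, src1, src2, width=4):
    """Sketch of integer(ADD, <types>, dest, src1, src2); dest is always an address."""
    a = load(memory, source_types[0], src1, width)
    b = load(memory, source_types[1], src2, width)
    result = (a + b) % (1 << (8 * width))  # wrap like a fixed-width integer
    memory[dest:dest + width] = result.to_bytes(width, "little")


task_memory = bytearray(8)
execute_add(task_memory, "II", 0, 0, 5)  # store 0 + 5 at address 0
execute_add(task_memory, "TI", 0, 0, 6)  # add the literal 6 to the value at address 0
value_at_0 = int.from_bytes(task_memory[0:4], "little")
```

The second call mirrors integer(ADD, TI, 0, 0, 6): the first source is loaded from address 0, the second is used as a literal.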
The statements in these methods aren't instructions though, they are instruction builders. The difference is that the builder sources need to be resolved during compilation. Each source is instead two values: a resolve type and a string value. For example, "LDA, val" says to use the local data address for the field val, i.e. this usable should have a data field named val. If you had a method with just one Int, then val would compile to 0. If you had two Ints, the second Int's val would be at address 4, as the first Int's val uses bytes 0, 1, 2, and 3. The resolvable types are:
- Immediate literal (IL) - 0d1, 0b0110
- Argument literal (AL) - these are literal values passed in when invoking the method or constructor. In ByteAndIntTest, these would be 0, 10, 25, 111.
- Local data address (LDA) - The address of a local data field e.g. the val data in Int
- Argument data address (ADA) - the address of an argument's data field, as shown in the Int copy constructor
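The resolution step can be sketched in Python as follows. This is only an illustration of the idea: the functions resolve and build, the tuple encoding of a builder, and the dictionaries holding field addresses and argument values are assumptions, not the real compiler's data structures.

```python
def resolve(resolve_type, value, local_fields, arg_literals, arg_fields):
    """Turn one (resolve type, string value) pair into a concrete number."""
    if resolve_type == "IL":  # immediate literal such as 0d1 or 0b0110
        bases = {"0d": 10, "0b": 2}
        return int(value[2:], bases[value[:2]])
    if resolve_type == "AL":  # literal passed by the caller
        return arg_literals[value]
    if resolve_type == "LDA":  # address of one of this usable's data fields
        return local_fields[value]
    if resolve_type == "ADA":  # address of an argument's data field
        return arg_fields[value]
    raise ValueError(f"unknown resolve type {resolve_type!r}")


def build(builder, local_fields, arg_literals=None, arg_fields=None):
    """Resolve every source pair of a builder into a plain instruction tuple."""
    kind, subtype, source_types, *pairs = builder
    sources = [
        resolve(pairs[i], pairs[i + 1], local_fields, arg_literals or {}, arg_fields or {})
        for i in range(0, len(pairs), 2)
    ]
    return (kind, subtype, source_types, *sources)


# The Int constructor builder from the text, written as a tuple of strings.
INT_CONSTRUCTOR = ("integer", "ADD", "II", "LDA", "val", "IL", "0d0", "AL", "imm")

# Two Ints in one method: the first val lives at address 0, the second at 4.
first_int = build(INT_CONSTRUCTOR, {"val": 0}, {"imm": 7})
second_int = build(INT_CONSTRUCTOR, {"val": 4}, {"imm": 25})
```

Here second_int comes out as ("integer", "ADD", "II", 4, 0, 25), which matches the description of the second Int's val compiling to address 4.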
Let's analyse an example of an instruction builder.
constructor(b32 imm) {
integer(ADD, II, LDA, val, IL, 0d0, AL, imm);
}
- integer(...) - this defines the instruction type integer, you can also write i(...) for brevity.
- ADD - A subtype of the integer instruction
- II - The input type. II is for the two sources, both being of type immediate (I). This means the instruction contains the values to use in the add operation.
- We now describe three more values in order: the destination, the first source, and the second source. Each of these is described with two values, the resolve type and the resolve value.
- LDA, val - Here we say resolve the value val as a local data address (LDA). The result will be the destination of our addition, i.e. the address where we store the result.
- IL, 0d0 - Just like above, only this time we say to resolve 0d0 as a literal. This will then resolve to 0.
- AL, imm - Resolve the value imm as an argument literal. You can see these values in the constructor statements in ByteAndIntTest. The system parser resolves literals differently, so there the decimal value is implicit (you omit the 0d).
Now let's look at ByteAndIntTest. This uses the host parser, and you can see it declares the imported usables Int and Byte. The general structure and method statements should look familiar. Each statement type uses the Usable interface in different ways. Our first three statements are constructor statements: these call a constructor on the specified Usable and save a new variable with the specified name to its context, e.g. x, y, z. The last four statements are invoke statements: these look up a variable with the given name and then invoke the "plus" method on the usable (note that "+" is an alias for "plus"). Note also that you mutate the variable, you don't make a new one, so x + 5 is equivalent to x += 5 in other languages.
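As a rough model of what ByteAndIntTest computes, the Python sketch below mimics the in-place mutation. The Value class is hypothetical, and wrapping on overflow is an assumption; the source does not say how overflow behaves.

```python
class Value:
    """Hypothetical stand-in for a Byte (size 1) or an Int (size 4)."""

    def __init__(self, size, imm):
        self.size = size
        self.val = imm % (1 << (8 * size))

    def plus(self, other):
        # Mutates this variable in place, like "x + 10;" in Koga.
        amount = other.val if isinstance(other, Value) else other
        self.val = (self.val + amount) % (1 << (8 * self.size))
        return self  # Python-only convenience so calls can be chained


x = Value(1, 0)  # Byte x 0;
y = Value(4, 0)  # Int y 0;
z = Value(1, 0)  # Byte z 0;
x.plus(10)       # x + 10;
y.plus(25)       # y + 25;
z.plus(111)      # z plus(111);
y.plus(y)        # y + y;
```

After these statements x is 10, y is 50, and z is 111.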
If
Now we'll look at how some basic control flow is defined.
parser machine;
core.Boolean {
byte[1] val;
constructor(b1 imm) {
l(ADD, II, LDA, val, IL, 0d0, AL, imm);
}
}
parser machine;
usables {
core.Boolean;
}
core.If {
Addr end;
constructor(Boolean bool, Block block) {
conditionalBranch(EQ, TI, ADA, bool.val, IL, 0d0, after);
Block block;
jump(REL, I, end);
Addr after;
Addr end;
}
elseIf(Boolean bool, Block block) {
Position end;
conditionalBranch(EQ, TI, ADA, bool.val, IL, 0d0, after);
Block block;
jump(REL, I, end);
Addr after;
}
else(Block block) {
Position end;
Block block;
}
break() {
j(REL, I, end);
}
}
parser host;
usables {
core.Int;
core.Boolean;
core.If;
}
test.IfTest {
main() {
Int x 0;
Boolean a false;
Boolean b true;
If(a) {
x + 5 + 5;
} elseIf(b) {
x + 20;
} else {
x + 50;
};
}
}
If defines an address field instead of a data field previously seen in Int and Byte.
Addresses give a name to an instruction position; you can declare them and update them with the syntax "Addr addrName".
These can be scoped to the method or to the variable.
The If has an address named "end" that is kept after all of the If's instructions.
This is where an if/elseIf method that succeeds will jump after executing its block.
You can also position at an address with "Position addrName", and when you then append an instruction, the instruction is appended before the positioned address.
Blocks are a way to pass a sequence of instructions to the usables.
In our IfTest example, the block passed to the If constructor, "x + 5 + 5;", will have two integer add instructions.
Blocks are defined between braces, and when interleaving with other arguments, you would close those arguments with a closing parenthesis e.g. (a) { ... } instead of (a, { ... }).
These can be interleaved e.g. (a) { ... } (b) { ... }.
To append a block argument, you just write Block $blockName.
In the If example, we name it "block" and so we just write "Block block;".
Usable methods don't have return types; this allows you to continually chain methods, e.g. constructor into elseIf into else. It might seem unnatural to use a semicolon at the end of an if-elseif-else chain, but this does keep the syntax consistent.
When the SystemCompilable constructs an If, it still makes a new variable, however here it has no name so once you stop chaining methods you won't be able to use it again.
In most languages, you can implicitly use a break inside an If.
To do that in Koga, you would give the If a name, say i, and then your block code can do "i break();".
This is more verbose than normal, but it does make it simpler when there are multiple things you can break from e.g. inner loops.
Another thing to note is that the constructor has a Boolean parameter, but you could define multiple constructors with other parameters. You don't really have to stop there: Int itself can have an if method as shown immediately below. That's not to say whether you should or not, but it's possible. Remember, there aren't any "method invocations" happening here; we're just adding instructions to our method, the same as writing an if in Java.
Int x 5;
x if(5) { x + 10; };
x ifOdd { x + 1; };
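The address and position mechanics described for If can be sketched in Python. Everything here is a simplified assumption: the MethodBuilder class, the instruction tuples, the use of absolute rather than relative jumps, and the explicit "after0"/"after1" names that stand in for the method-scoped "after" address.

```python
class MethodBuilder:
    """Ordered instructions plus named address markers (stand-in for Addr/Position)."""

    def __init__(self):
        self.items = []
        self.position = None  # None means "append at the end"

    def append(self, item):
        # When positioned at an address, new items land just before it.
        index = len(self.items) if self.position is None else self.items.index(self.position)
        self.items.insert(index, item)

    def addr(self, name):
        self.append(("addr", name))

    def position_at(self, name):
        self.position = ("addr", name)


def add_branch(b, condition, block, after):
    b.append(("branch_if_zero", condition, after))  # skip the block when false
    for instruction in block:
        b.append(instruction)
    b.append(("jump", "end"))
    b.addr(after)


def build_if_test():
    b = MethodBuilder()
    add_branch(b, "a", [("add", "x", 5), ("add", "x", 5)], "after0")  # If(a) {...}
    b.addr("end")
    b.position_at("end")
    add_branch(b, "b", [("add", "x", 20)], "after1")                  # elseIf(b) {...}
    b.append(("add", "x", 50))                                        # else {...}
    return b.items


def run(items, variables):
    code, labels = [], {}
    for item in items:
        if item[0] == "addr":
            labels[item[1]] = len(code)  # a marker names the next instruction
        else:
            code.append(item)
    pc = 0
    while pc < len(code):
        op, *args = code[pc]
        pc += 1
        if op == "add":
            variables[args[0]] += args[1]
        elif op == "jump":
            pc = labels[args[0]]
        elif op == "branch_if_zero" and variables[args[0]] == 0:
            pc = labels[args[1]]
    return variables


result = run(build_if_test(), {"x": 0, "a": 0, "b": 1})
```

With a false and b true, as in IfTest, only the elseIf block runs and x ends at 20. Because elseIf and else are appended while positioned at "end", the "end" marker stays after every branch.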
While
While is pretty much the same as If, but let's have a look. We'll make some additions to Boolean and Int too.
Boolean {
...
set(Block block) {
context(PUSH);
Block block;
context(POP);
}
}
Int {
...
constructor(Int bool, Block block) {
cb(EQ, TI, ADA, bool.val, IL, 0d0, after);
Block block;
j(REL, I, end);
Addr after;
Addr end;
}
lessThan(b32 imm, Boolean dest) {
l(SLT, TI, ADA, dest.val, LDA, val, AL, imm);
}
}
parser machine;
usables {
core.Int;
}
core.While {
Addr start;
Addr end;
constructor(Int bool, Block loop) {
Addr start;
conditionalBranch(EQ, TI, ADA, bool.val, IL, 0d0, end);
Block loop;
jump(REL, I, start);
Addr end;
}
constructor loop(Block loop) {
Addr start;
Block loop;
jump(REL, I, start);
Addr end;
}
continue() {
jump(REL, I, start);
}
break() {
jump(REL, I, end);
}
}
parser host;
imports {
core.Int;
core.Boolean;
core.If;
core.While;
}
test.WhileWithBreakTest {
main() {
Int x 0;
Boolean y true;
While a (y) {
If(x) {
a break();
};
x + 1;
y = { x < 10; };
};
}
}
So far we've only used unnamed constructors; here we have a constructor named loop. Using it is straightforward: you write the constructor name when constructing the variable. Below is a While variable named w that uses the loop constructor (which accepts one block argument).
While w loop {
...
}
Something to note is the statement "y = { x < 10; };".
This runs the set method on Boolean y, which accepts one block.
Another way of writing this is "x lessThan(10, y);".
The problem is that lessThan operates on the Int, so we have nowhere good to write the result unless we overwrite the Int value.
This means we need a destination parameter, but then you end up with the syntax "x < (10, y);".
Were it possible to remove the single line semicolon, i.e. "y = { x < 10 };", I think the syntax would be acceptable.
Look at our new Boolean set method and you'll see context statements.
These add the Boolean as an implicit argument, so every invoke in the block will have that Boolean appended to its argument list.
This allows us to write "{ x < 10; }" and run the lessThan(b32 imm, Boolean dest) method.
This feature is still somewhat fresh and could do with some more thought.
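The implicit argument idea can be sketched in Python like this. Note the assumptions: in Koga this happens while instructions are being built, whereas the sketch does it at run time, and the Context class with its invoke helper is a hypothetical stand-in for the context(PUSH) and context(POP) statements.

```python
class Context:
    """Holds implicit arguments that get appended to every invoke."""

    def __init__(self):
        self.implicit = []

    def push(self, value):
        self.implicit.append(value)

    def pop(self):
        return self.implicit.pop()

    def invoke(self, method, *args):
        return method(*args, *self.implicit)


class Boolean:
    def __init__(self, val):
        self.val = val

    def set(self, ctx, block):
        ctx.push(self)  # context(PUSH)
        try:
            block()     # Block block;
        finally:
            ctx.pop()   # context(POP)


class Int:
    def __init__(self, val):
        self.val = val

    def less_than(self, imm, dest):
        dest.val = self.val < imm  # the result goes to the destination Boolean


ctx = Context()
x = Int(3)
y = Boolean(False)
# y = { x < 10; };  the block only passes 10, y is supplied implicitly
y.set(ctx, lambda: ctx.invoke(x.less_than, 10))
```

The block never mentions y, yet less_than receives it as its dest argument because set pushed it onto the context first.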
Switch
Switch introduces some new complications: each case needs to add a jump and the case block at different positions. We've already briefly seen how this can be handled using Position and Addr statements.
parser machine;
core.Switch {
Addr jumps;
Addr cases;
Addr end;
constructor(Int x) {
jump(REL, T, ADA, x.val);
Addr jumps;
Addr cases;
}
case(b8 label, Block block) {
Addr case;
Position jumps;
jump(REL, I, case);
Position cases;
Addr case;
Block block;
jump(REL, I, end);
Addr end;
}
}
parser host;
usables {
core.Byte;
core.Int;
core.Switch;
}
test.SwitchTest {
main() {
Int x 0;
Int y 2;
Switch (y)
case(0) {
x + 5;
} case(1) {
x + 10;
} case(2) {
x + 15;
};
}
}
This Switch is perhaps underdeveloped compared to how other languages handle switch. You don't need anything more than this though right?
The constructor uses an Int to jump into the jumps instructions which will then jump to a case. There isn't much in the way of checks for bad inputs right now e.g. switch(5) with only 2 cases. Each case will add a single jump statement to the end of the jump instructions and then the case block to the end of cases instructions. The input numbers are just for show (case(0), case(1)).
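The resulting instruction layout, as I read the description above, can be sketched in Python. The build_switch and run functions and the instruction tuples are assumptions for illustration; the sketch computes the final layout directly instead of modelling Position and Addr, and the case jumps use absolute indices.

```python
def build_switch(selector, blocks):
    """Lay out: one relative jump, then one jump per case, then the case blocks."""
    code = [("jump_relative", selector)]
    case_starts = []
    position = 1 + len(blocks)  # the first case block starts after the jump table
    for block in blocks:
        case_starts.append(position)
        position += len(block) + 1  # each block is followed by a jump to "end"
    end = position
    code += [("jump", start) for start in case_starts]
    for block in blocks:
        code += block + [("jump", end)]
    return code


def run(code, variables):
    pc = 0
    while pc < len(code):
        op, *args = code[pc]
        pc += 1
        if op == "jump_relative":
            pc += variables[args[0]]  # lands inside the jump table
        elif op == "jump":
            pc = args[0]
        elif op == "add":
            variables[args[0]] += args[1]
    return variables


SWITCH_TEST = build_switch("y", [[("add", "x", 5)], [("add", "x", 10)], [("add", "x", 15)]])
result = run(SWITCH_TEST, {"x": 0, "y": 2})
```

With y set to 2, as in SwitchTest, the relative jump lands on the third table entry and x ends at 15. As the text says, nothing here checks that the selector is within range.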
Enums
parser machineEnum;
literals {
Success 0;
Error 1;
Waiting 2;
}
core.Status {
byte[1] val;
}
parser host;
usables {
core.Int;
core.Status;
}
test.EnumTest {
main() {
Int x 0;
Status s (Error);
s match(Success) {
x plus(2);
} (Error) {
x plus(4);
} (Waiting) {
x plus(6);
};
}
}
Here's a new parser, machineEnum. This creates a Usable just like our other examples and so will be used inside a Host parser in the same way. The difference here though is that the functionality is mostly implemented within the Java code and the enum configuration is quite minimal.
There is a small new concept introduced here, the name argument type. You probably see that "Error" isn't an immediate, variable, or block. You'll see names pop up more later when dealing with objects and references. There's not too much you can do with them; what the enum does is match the name against its literal values.
As you can see, there isn't a "match" method defined anywhere in Status, nor a constructor. This is all implemented in the MachineEnumUsable using the literals and its field.
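A loose Python analogy for the match behaviour is shown below. The real logic lives in MachineEnumUsable and is not shown in this document, so the match function and the way arms are passed are purely hypothetical.

```python
# The literals block from core.Status.
LITERALS = {"Success": 0, "Error": 1, "Waiting": 2}


def match(value, arms):
    """Run the block of the first arm whose name resolves to the stored value."""
    for name, block in arms:
        if LITERALS[name] == value:
            block()
            return name
    return None


state = {"x": 0}
status = LITERALS["Error"]  # Status s (Error);
matched = match(status, [
    ("Success", lambda: state.update(x=state["x"] + 2)),
    ("Error", lambda: state.update(x=state["x"] + 4)),
    ("Waiting", lambda: state.update(x=state["x"] + 6)),
])
```

Only the Error arm runs, so x ends at 4.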
Pointer
Let's have a look at how pointers might be implemented. This example introduces generics, the memory instruction, register source type, and register resolvable type. Note this is just a basic implementation and generics are underdeveloped as you will see from the use of "Any".
parser machine;
core.Pointer<Usable T> {
byte[4] addr;
constructor(Any val) {
integer(ADD, LI, LDA, addr, R, task, ADA, val);
}
copyTo(Any val) {
memory(COPY, ITI, ADA, val, LDA, addr, LG, T);
}
copyFrom(Any val) {
memory(COPY, TII, LDA, addr, ADA, val, LG, T);
}
}
parser host;
usables {
core.Int;
core.Pointer;
}
test.PointerTest {
main() {
Int x 1;
Int y 2;
Int z 3;
Pointer<Int> p (x);
p -> y;
p <- z;
}
}
Looking at PointerTest, we construct the pointer at x, copy to y and copy from z. That method then ends with x:3, y:1, z:3.
We can see our first usage of the logician source type. We need to add the task address to the variable's task relative address to get the absolute address of the variable. If the task is located in the process at byte 100, and the Int is located relative to the task at byte 4, then the absolute address is 104. The logician (i.e. the processor) has this task address stored in its "registers". We can then store and use that value in memory instructions (or any instruction really).
You might be wondering why we don't need to calculate the absolute address of y or z. We don't really need to calculate it for x either in this case because we aren't using this pointer outside of the task. Memory instructions use the source type to infer information about the source values. If the source is an immediate value, it's inferred as a task relative address. If the source is a task relative value, it's inferred as an absolute address.
You might also wonder what "LG, T" is doing.
Here we use the resolve type LG (local generic).
If our pointer were Pointer<Int>, "LG, T" would resolve to 4, i.e. the size of the Int.
The third source in the memory instruction is the size of the copy, so we're saying copy 4 bytes.
Generic parameters have to be specified as a Usable generic or a Document generic. A Usable generic will allow accessing the size of the specific generic whereas a Document generic will use things like the document name.
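PointerTest can be traced with a small Python model of process memory. The memory layout (task at byte 100, x at task-relative byte 4), the little-endian encoding, and the helper names are assumptions chosen to match the 100 + 4 = 104 example; they are not taken from the real runtime.

```python
TASK = 100    # assumed absolute address of the task within the process
INT_SIZE = 4  # what "LG, T" resolves to for Pointer<Int>
process = bytearray(256)


def read_int(absolute):
    return int.from_bytes(process[absolute:absolute + INT_SIZE], "little")


def write_int(absolute, value):
    process[absolute:absolute + INT_SIZE] = value.to_bytes(INT_SIZE, "little")


def memory_address(source_type, value):
    if source_type == "I":  # immediate: treated as a task-relative address
        return TASK + value
    if source_type == "T":  # task value: holds an absolute address
        return read_int(TASK + value)
    raise ValueError(source_type)


def memory_copy(source_types, dest, src, size):
    d = memory_address(source_types[0], dest)
    s = memory_address(source_types[1], src)
    process[d:d + size] = process[s:s + size]


X, Y, Z, P = 4, 8, 12, 16          # task-relative addresses of x, y, z and p.addr
write_int(TASK + X, 1)             # Int x 1;
write_int(TASK + Y, 2)             # Int y 2;
write_int(TASK + Z, 3)             # Int z 3;
write_int(TASK + P, TASK + X)      # Pointer<Int> p (x);  task address + x's offset
memory_copy("ITI", Y, P, INT_SIZE) # p -> y;  copy from the pointed-at value into y
memory_copy("TII", P, Z, INT_SIZE) # p <- z;  copy z into the pointed-at value

final = (read_int(TASK + X), read_int(TASK + Y), read_int(TASK + Z))
```

The pointer holds the absolute address 104, and the method ends with x:3, y:1, z:3.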
Array
parser machine;
core.Array<Usable T> {
Byte[4] size;
Byte[4] start;
Byte[4] step;
Byte[0] data;
constructor() {}
constructor(b12 imm) {
i(ADD, II, LDA, size, IL, 0d0, AL, imm);
i(ADD, II, LDA, step, IL, 0d0, LG, T);
Byte[AL imm][LG T] data;
i(ADD, LI, LDA, start, R, task, LDA, data);
}
}
parser machine;
core.ArrayPointer<Usable T> {
byte[4] start;
byte[4] size;
byte[4] step;
byte[4] addr;
constructor(Array arr) {
integer(ADD, TI, LDA, start, ADA, arr.start, IL, 0d0);
integer(ADD, TI, LDA, size, ADA, arr.size, IL, 0d0);
integer(ADD, TI, LDA, step, ADA, arr.step, IL, 0d0);
integer(ADD, TI, LDA, addr, LDA, start, IL, 0d0);
}
index(b12 imm) {
byte[4] index;
integer(ADD, II, LDA, index, IL, 0d0, AL, imm);
integer(MUL, TT, LDA, index, LDA, index, LDA, step);
integer(ADD, TT, LDA, addr, LDA, start, LDA, index);
}
copyTo(Any val) {
memory(COPY, ITI, ADA, val, LDA, addr, LG, T);
}
copyFrom(Any val) {
memory(COPY, TII, LDA, addr, ADA, val, LG, T);
}
}
parser host;
usables {
core.Byte;
core.Int;
core.Array;
core.ArrayPointer;
}
test.ArrayPointerTest {
main() {
Int x 4;
Int y 5;
Int z 0;
Array<Int> arr (3);
ArrayPointer<Int> p (arr);
p copyFrom(x);
p index(2);
p copyFrom(y);
p copyTo(z);
}
}
Arrays are similar to pointers, in fact we even have an ArrayPointer. There is a "byte[0]" field that gets allocated in the constructor "Byte[AL imm][LG T] data;". This allocates space in the method (and thus has to be sized at compile time) using the input value and the size of the generic. In this case, 3 Ints (an Int being 4 bytes) leads to 12 bytes allocated on the method.
You can probably guess what the ArrayPointer is for. It currently copies all the array values too, to do things like bounds checking. I do hope to improve this in the future to not need to copy the values. You could also write an Array that comes with a pointer depending on use cases.
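The index arithmetic in ArrayPointer can be traced in Python. The Array and ArrayPointer classes below are simplified stand-ins: they keep the same fields as the Koga versions but store data in a plain bytearray, and little-endian encoding is an assumption.

```python
INT_SIZE = 4


class Array:
    def __init__(self, count, element_size):
        self.size = count
        self.step = element_size
        self.start = 0
        self.data = bytearray(count * element_size)  # Byte[AL imm][LG T] data;


class ArrayPointer:
    def __init__(self, arr):
        # Mirrors the constructor: copy start, size and step, then point at start.
        self.start, self.size, self.step = arr.start, arr.size, arr.step
        self.addr = self.start
        self.data = arr.data

    def index(self, i):
        self.addr = self.start + i * self.step  # MUL by step, then ADD start

    def copy_from(self, value):
        self.data[self.addr:self.addr + self.step] = value.to_bytes(self.step, "little")

    def copy_to(self):
        return int.from_bytes(self.data[self.addr:self.addr + self.step], "little")


x, y = 4, 5
arr = Array(3, INT_SIZE)  # Array<Int> arr (3);
p = ArrayPointer(arr)     # ArrayPointer<Int> p (arr);
p.copy_from(x)            # p copyFrom(x);  element 0 becomes 4
p.index(2)                # p index(2);
p.copy_from(y)            # p copyFrom(y);  element 2 becomes 5
z = p.copy_to()           # p copyTo(z);
```

Three Ints of 4 bytes each give the 12 allocated bytes mentioned above, and index(2) moves the pointer to byte offset 8.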
Strings
parser machine;
core.String {
byte[4] size;
byte[4] start;
byte[4] step;
constructor(Name const) {
Symbol(CONST, constSymbol, AL, const);
class(ADDR, LI, start, table, constSymbol);
class(SIZE, LI, size, table, constSymbol);
i(ADD, II, LDA, step, IL, 0d0, IL, 0d1);
}
constructor(b8[] const) {
class(ADDR, I, start, const);
class(SIZE, I, size, const);
integer(ADD, II, LDA, step, IL, 0d0, IL, 0d1);
}
equalTo(String in, Boolean dest) {
...
}
}
parser host;
usables {
core.Boolean;
core.String;
}
constants {
aConstName "hello";
}
test.StringEqualsTest {
main() {
String hi "hello";
String anotherHi (aConstName);
Boolean b { hi equalTo(anotherHi); };
}
}
Here we introduce constants and symbols. Host and Hosted documents can have constant values, usually for things like arrays or strings. These will be bytes stored in memory that you can reference.
Host and Hosted documents also have a symbol table/runtime table, and just like everything else, you manipulate this table using the language. Each symbol has two values at runtime, for example a hosted document method will have a size and address (how much space the method needs and the instruction address). Looking at some examples you might use runtime values to load the addr of a const, the size of a class, or the address and size of a method.
You can append symbols using the compiler statement "Symbol"; this will also add a literal argument to your method with the given name and the index of the symbol. You can then use the class (c) instruction builder and that new argument literal to add a runtime table load instruction. Looking at the String name constructor, we get the symbol table index for a const symbol with the input name, and store it in a literal argument named constSymbol. We then add instructions to load both the addr and size at that table index.
Strings are very similar to arrays, they have an addr and a size along with a step, though this is always one right now (one byte characters). The addr in a String could reference directly to the constant, or you might allocate space in the method and address it there, or you might allocate space in your process memory and address it there.
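The relationship between constants, the runtime table, and a String's fields can be sketched in Python. The Document and String classes, the storage of constants in a single bytearray, and the "literal0" name for the anonymous literal are all assumptions; only the idea of a table holding an address and a size per symbol comes from the text.

```python
class Document:
    """Hypothetical model: constant bytes plus a runtime table of (addr, size) pairs."""

    def __init__(self):
        self.memory = bytearray()
        self.table = []    # one (addr, size) entry per symbol
        self.symbols = {}  # symbol name -> table index

    def add_const(self, name, data):
        addr = len(self.memory)
        self.memory += data
        self.symbols[name] = len(self.table)
        self.table.append((addr, len(data)))
        return self.symbols[name]


class String:
    def __init__(self, document, symbol_index):
        # class(ADDR, ...) and class(SIZE, ...) load the two runtime values.
        self.start, self.size = document.table[symbol_index]
        self.step = 1  # one byte per character

    def read(self, document):
        return bytes(document.memory[self.start:self.start + self.size])


doc = Document()
literal = doc.add_const("literal0", b"hello")  # String hi "hello";
doc.add_const("aConstName", b"hello")          # constants { aConstName "hello"; }

hi = String(doc, literal)
another_hi = String(doc, doc.symbols["aConstName"])
```

The two Strings point at different addresses but read back the same five bytes.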
Administrator
The core of a process.
parser interface;
usables {
core.Int;
core.Pointer;
}
core.Administrator {
init();
exit();
allocate(Pointer p, Int size);
port(Pointer res);
task(Pointer idOut, Int objectAddr, Int objectTableAddr, Int methodAddr, Int methodSize);
group(Pointer idOut);
awaitTask(Int task);
awaitGroup(Int task);
transition(Int newState);
connect(Int instance, Int protocolMethodAddr, Pointer connOut);
listen(Pointer connOut);
send(Int instance);
}
parser host;
usables {
core.Int;
core.Pointer;
core.AdminRef;
}
documents {
util.SimpleAdministrator;
}
test.AllocatorTest {
main() {
AdminRef admin init();
Int size 124;
Int allocateOne 0;
Pointer<Int> allocateOneP (allocateOne);
admin allocate(allocateOneP size);
Int allocateTwo 0;
Pointer<Int> allocateTwoP (allocateTwo);
admin allocate(allocateTwoP size);
admin exit();
}
}
Here's a small introduction to the interface parser, which produces Documents of type Interface when compiled. We're defining perhaps the most important interface in Koga, the Administrator. It is still a work in progress, but I do quite like the idea.
Every Host Document specifies a concrete administrator document (remember that a process is created by a Host Document, so every process has an Administrator). An administrator, as the name suggests, is responsible for administrative things like memory space allocations, task creation/scheduling/completion, IPC etc. Maybe it could be described as a runtime. Take note that the administrator is also implemented using the language like most other things.
Symbols, as seen previously, allow us to use administrator methods from any Hosted document. The logician (i.e. the processor) will have the admin object address and its runtime table address stored in its registers. This idea allows library code to allocate, create tasks etc without knowing about the administrator in use and not have to pass around an Allocator object.
I haven't made it clear yet, Koga doesn't use traditional frame stack threads. The tasks (i.e. the methods) have space allocated only for that task. You might refer to them as stackless coroutines but I don't like that name. This poses a problem though, how can you allocate space for a new task, in particular space to invoke an administrator method? Every task actually has space for itself and also space to invoke an administrator method.
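To give a feel for what an administrator's allocate method might do, here is a deliberately naive Python sketch. It is not how util.SimpleAdministrator works (its implementation is not shown here); BumpAdministrator, its bump-pointer strategy, and the heap bounds are all hypothetical. In Koga the result would be written through the Pointer argument rather than returned.

```python
class BumpAdministrator:
    """Hypothetical allocator: hands out consecutive regions and never frees."""

    def __init__(self, heap_start, heap_size):
        self.next_free = heap_start
        self.heap_end = heap_start + heap_size

    def allocate(self, size):
        if self.next_free + size > self.heap_end:
            raise MemoryError("process memory exhausted")
        address = self.next_free
        self.next_free += size
        return address


admin = BumpAdministrator(heap_start=1024, heap_size=4096)  # AdminRef admin init();
size = 124                                                  # Int size 124;
allocate_one = admin.allocate(size)                         # admin allocate(allocateOneP size);
allocate_two = admin.allocate(size)                         # admin allocate(allocateTwoP size);
```

The two allocations in AllocatorTest would then receive distinct, non-overlapping regions.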
References, Fields, and Methods
It's time to start using a Document's methods and fields, not just Usables. Let's look at how this is implemented using the language.
parser machine;
Pointer<Usable T> {
byte[4] addr;
...
constructor(Reference r, Name field) {
symbol(FIELD, fieldSymbol, AG, r.R, AL, field);
class(ADDR, I, addr, fieldSymbol);
integer(ADD, TT, addr, addr, r.objectAddr);
}
...
}
parser machineReference;
core.Reference<Document R> {
byte[4] objectAddr;
byte[4] objectTable;
constructor this() {
logician(GET_OBJECT, objectAddr);
logician(GET_TABLE, objectTable);
}
constructor new() {
byte[4] objectSize;
symbol(CLASS, classSymbol, LG, R);
class(SIZE, I, objectSize, classSymbol);
logician(GET_TASK, objectAddr);
integer(ADD, TI, LDA, objectAddr, LDA, objectAddr, LDA, objectAddr);
admin(ALLOCATE, objectAddr, objectSize);
class(ADDR, I, objectTable, classSymbol);
}
invoke(Name methodName) {
byte[4] frameSize;
byte[4] methodAddr;
byte[4] newFrame;
logician(GET_TASK, newFrame);
integer(ADD, TI, LDA, newFrame, LDA, newFrame, LDA, newFrame);
symbol(METHOD, methodSymbol, LG, R, AL, methodName);
class(SIZE, I, frameSize, methodSymbol);
class(ADDR, I, methodAddr, methodSymbol);
byte[4] adminTaskMethod;
symbol(METHOD, adminTaskSymbol, IL, Administrator, IL, task);
class(ADDR, I, adminTaskMethod, adminTaskSymbol);
byte[4] adminTask;
logician(GET_ALT_TASK, adminTask);
memory(COPY, PA, LDA, adminTask, LDA, newFrame, LDS, newFrame);
integer(ADD, TI, LDA, adminTask, LDA, adminTask, LDS, newFrame);
memory(COPY, PA, LDA, adminTask, LDA, objectAddr, LDS, objectAddr);
integer(ADD, TI, LDA, adminTask, LDA, adminTask, LDS, objectAddr);
memory(COPY, PA, LDA, adminTask, LDA, objectTable, LDS, objectTable);
integer(ADD, TI, LDA, adminTask, LDA, adminTask, LDS, objectTable);
memory(COPY, PA, LDA, adminTask, LDA, methodAddr, LDS, methodAddr);
integer(ADD, TI, LDA, adminTask, LDA, adminTask, LDS, methodAddr);
memory(COPY, PA, LDA, adminTask, LDA, frameSize, LDS, frameSize);
logician(START_ADMIN, LDA, adminTaskMethod);
byte[4] frameDataAddr;
integer(ADD, TI, LDA, frameDataAddr, LDA, newFrame, IL, 0d0);
args();
byte[4] adminScheduleMethod;
symbol(METHOD, adminScheduleSymbol, IL, Administrator, IL, schedule);
class(ADDR, I, adminScheduleMethod, adminScheduleSymbol);
logician(GET_ALT_TASK, adminTask);
memory(COPY, PA, LDA, adminTask, LDA, newFrame, LDS, newFrame);
logician(START_ADMIN, LDA, adminScheduleMethod);
}
arg(Any a) {
memory(COPY, PA, LDA, frameDataAddr, ADA, a, ADS, a);
integer(ADD, TI, LDA, frameDataAddr, LDA, frameDataAddr, ADS, a);
}
}
parser host;
usables {
core.Int;
core.Reference;
core.Pointer;
core.AdminRef;
core.Task;
}
test.LocalVariableTest {
Int x;
main() {
Int y 10;
Int z 16;
Reference<LocalVariableTest> this this();
Pointer<Int> thisx (this x);
thisx <- z;
Pointer<Int> p (y);
this second(p);
}
second(Pointer<Int> r) {
Reference<LocalVariableTest> this this();
Pointer<Int> thisx (this x);
Int a 0;
thisx -> a;
r <- a;
Task t complete();
}
}
We can ignore what the method "second" is doing as it isn't really relevant, it's just some code, fairly verbose code in this case. We're just interested in calling the method.
The Reference code here is rather large and requires some focus and thought to look at. The first thing to note is that it uses a new parser named "machineReference". The Usable produced by a machineReference parser is different: running a method will run the invoke method using the name of the method as the argument. In our case "this second(p)" uses the Reference variable named "this" and runs "second(p)". This runs the Reference.invoke method with "second" as the name argument. All the instructions there are to set up and run the "LocalVariableTest.second" method.
Reference r this();
r second(arg1 arg2);
In this example, "r second(arg1 arg2);" is like doing "r invoke(second);".
You'll see an "arg(Any a)" method in Reference and an "args();" compiler instruction used in its invoke method.
This is how the arguments are iterated and copied over to the new task.
Looking at Reference invoke: it makes a symbol using the methodName argument. It uses this symbol index to add a table lookup instruction, getting the method address at runtime. It then invokes an admin instruction to create a new task and copies the arguments over to this task's memory space. Finally, it invokes an admin instruction to schedule the new task.
There is a field "x" on "LocalVariableTest" and there is a new Pointer constructor to get the pointer to an object's field. Objects have space allocated according to their size, so here LocalVariableTest has 4 bytes allocated; it's just the one Int x. There are no object header values or anything, though that could definitely change in the future. To get the field we need a field symbol for the field name, here x. This will be 0 at runtime since it's the first field. We then add the address of the object, which is stored in Reference.objectAddr. We now have a pointer to the field x that we can copy to and from.
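Two pieces of this section reduce to simple address arithmetic, sketched in Python below. The copy_args and field_address functions and the example addresses are hypothetical; they only model the arg(Any a) loop and the field pointer calculation described above.

```python
def copy_args(memory, frame_data_addr, args):
    """Copy each argument's bytes into the new task, advancing the cursor each time."""
    cursor = frame_data_addr
    for data in args:
        memory[cursor:cursor + len(data)] = data  # memory(COPY, ...) of the argument
        cursor += len(data)                       # integer(ADD, ...) by the argument size
    return cursor


def field_address(object_addr, field_offset):
    """Field symbol value (offset within the object) plus Reference.objectAddr."""
    return object_addr + field_offset


memory = bytearray(64)
new_frame = 16  # assumed address of the newly created task's data
arg1 = (10).to_bytes(4, "little")
arg2 = (16).to_bytes(4, "little")
end_of_args = copy_args(memory, new_frame, [arg1, arg2])

# LocalVariableTest has a single Int field x, so its offset is 0.
this_x = field_address(object_addr=40, field_offset=0)
```

Each argument lands directly after the previous one, which is why the invoke method keeps adding the argument size to frameDataAddr.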
Structs
Structs aren't too developed right now but a basic implementation can still be shown. As you might expect, these are implemented using a new parser.
parser structure;
usables {
core.Boolean;
core.While MyWhile;
core.If;
core.Byte;
core.Int;
}
core.LocalDate {
Int year;
Int month;
Int day;
constructor today() {
Int year 2025;
Int month 3;
Int day 6;
}
addDays(Int amount) {
amount + 2;
day + amount;
Boolean isTooHigh { day > 28; };
MyWhile (isTooHigh) {
month + 1;
day - 28;
isTooHigh = { day > 28; };
};
}
testing(Int amount, Block b) {
Boolean x true;
If tmp (x, b);
}
}
parser host;
usables {
core.Byte;
core.Int;
core.Boolean;
core.If;
core.While;
core.LocalDate;
core.Exit;
}
test.StructureTest {
main() {
LocalDate date today();
Int addDays 30;
date addDays(addDays);
date testing(addDays) {
addDays + 5;
};
Exit;
}
}
The idea of a struct is to have a Usable that's written in terms of other usables (instead of bytes, addresses, instructions etc). The benefits being that you inline the data and instructions. This example makes a LocalDate struct, with some nonsense methods just to test things are working. You can also see an example of changing an imported usable name (While -> MyWhile).
I think LocalDate is a good example to discuss structs. It makes sense to store its data inline; however, I don't think it necessarily makes sense to inline the instructions. A real LocalDate implementation of plusDays would be quite a lot of instructions, and using this all over a codebase could increase the instruction count quite considerably.
I do think there could be a use case for an inlined object such that the data is inlined but the instructions are still separate. I do think this should be quite easily possible in Koga though I've yet to prove it.
Hello world
Let's skip a bunch of stuff and look at hello world. There will be a client and server that talk to each other. The IPC functionality is pretty underdeveloped still, I wouldn't look at the code too much as it's quite janky right now.
parser protocol;
chatting.Chatting {
chat {
Byte[4096] read;
Byte[4096] write;
}
}
parser hosted;
usables {
core.Int;
core.Connection;
core.Reference;
core.Pointer;
core.Boolean;
core.AdminRef;
core.While;
core.OutputStream;
core.InputStream;
core.String;
core.AdminRef;
core.Task;
}
chatting.Chat {
InputStream in;
OutputStream out;
Int instance;
asClient(Connection connection) {
Reference<Chat> this this();
Int inAddr pageTwo(connection);
InputStream inStream (inAddr);
Pointer<InputStream> thisIn (this, in);
thisIn <- inStream;
Int outAddr pageOne(connection);
OutputStream outStream (outAddr);
Pointer<OutputStream> thisOut (this, out);
thisOut <- outStream;
Int instnc instance(connection);
Pointer<Int> thisInstance (this, instance);
thisInstance <- instnc;
Task t complete();
}
asServer(Connection connection) {
Reference<Chat> this this();
Int inAddr pageOne(connection);
InputStream inStream (inAddr);
Pointer<InputStream> thisIn (this, in);
thisIn <- inStream;
Int outAddr pageTwo(connection);
OutputStream outStream (outAddr);
Pointer<OutputStream> thisOut (this, out);
thisOut <- outStream;
Int instnc instance(connection);
Pointer<Int> thisInstance (this, instance);
thisInstance <- instnc;
Task t complete();
}
read(Pointer<String> outString) {
Reference<Chat> this this();
Pointer<InputStream> thisIn (this, in);
InputStream in;
thisIn -> in;
in wait();
String out;
out readFrom(in);
outString <- out;
thisIn <- in;
Task t complete();
}
write(String inString) {
Reference<Chat> this this();
Pointer<OutputStream> thisOut (this, out);
OutputStream out;
thisOut -> out;
Pointer<Int> thisInstance (this, instance);
Int instance;
thisInstance -> instance;
AdminRef admin ();
inString copyTo out;
admin send(instance);
thisOut <- out;
Task t complete();
}
}
parser host;
usables {
core.InputStream;
core.OutputStream;
core.Int;
core.Pointer;
core.AdminRef;
core.Boolean;
core.While;
core.String;
core.Exit;
core.Connection;
core.Reference;
}
documents {
util.SimpleAdministrator;
chatting.Chat;
chatting.Chatting;
}
test.TalkerTest {
main() {
AdminRef admin init();
Int instance 1;
Connection conn connect(instance, Chatting, chat);
Reference<Chat> chat new();
chat asClient(conn);
String str "hello server";
chat write(str);
String result;
Pointer<String> resultPtr (result);
chat read(resultPtr);
Exit;
}
}
parser host;
usables {
core.InputStream;
core.OutputStream;
core.AdminRef;
core.Int;
core.Boolean;
core.While;
core.If;
core.Array;
core.ArrayPointer;
core.Pointer;
core.Thread;
core.String;
core.Reference;
core.Task;
core.Connection;
core.Exit;
}
documents {
util.SimpleAdministrator;
chatting.Chat;
chatting.Chatting;
}
test.Server supports Chatting {
main() {
AdminRef admin init();
Connection conn;
Pointer<Connection> connPtr (conn);
admin listen(connPtr);
Reference<Server> t this();
t chat(conn);
Exit;
}
chat(Connection conn) {
Reference<Chat> chat new();
chat asServer(conn);
String str;
Pointer<String> strPtr (str);
chat read(strPtr);
String expected "hello server";
Boolean isGreeting { str == expected; };
If (isGreeting) {
String response "hello client";
chat write(response);
} else {
String response "huh";
chat write(response);
};
Task t complete();
}
}
Who said "hello world!" had to be one line? We start to look at IPC now. This language doesn't follow the Unix model, so there is no stdin or stdout.
I want the protocols between processes to be defined explicitly, as you can see with the parser named "protocol" and the earlier mention of Protocol documents. Processes talk with the system administrator (e.g. the kernel) to form connections with other processes. A connection in this case is some shared memory mapped into each process's memory space. A protocol defines how this connection will look, so in our case it will have two byte arrays named read and write. There is then a utility Hosted document (chatting.Chat above) that helps to use this protocol.
The Server uses its local administrator to listen for new connections. The local administrator will communicate with the system administrator. The Talker will use its local administrator to connect to the process with id 1, and then send "hello server". The Server, on receiving the connection, will read the input and respond accordingly, in our case with "hello client".
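The shared-memory picture can be modelled outside Koga. The following is a rough Python sketch, not Koga code: a single bytearray stands in for the mapped region, and the two sides simply swap which page they read from and write to, mirroring asClient and asServer above. The 2-byte length prefix and every Python name here are assumptions made for the example; the real framing is not described in this document.

```python
PAGE_SIZE = 4096  # matches Byte[4096] in the protocol document

class Connection:
    """Stands in for the shared region that a protocol describes."""
    def __init__(self, instance):
        self.instance = instance
        self.memory = bytearray(2 * PAGE_SIZE)

    def page_one(self):
        return memoryview(self.memory)[0:PAGE_SIZE]

    def page_two(self):
        return memoryview(self.memory)[PAGE_SIZE:2 * PAGE_SIZE]

class Chat:
    def __init__(self, inp, out):
        self.inp = inp
        self.out = out

    @classmethod
    def as_client(cls, conn):
        # Mirrors asClient: read from page two, write to page one.
        return cls(conn.page_two(), conn.page_one())

    @classmethod
    def as_server(cls, conn):
        # Mirrors asServer: read from page one, write to page two.
        return cls(conn.page_one(), conn.page_two())

    def write(self, text):
        data = text.encode("utf-8")
        if len(data) > PAGE_SIZE - 2:
            raise ValueError("message does not fit in one page")
        # Assumed framing: 2-byte little-endian length, then the payload.
        self.out[0:2] = len(data).to_bytes(2, "little")
        self.out[2:2 + len(data)] = data

    def read(self):
        length = int.from_bytes(self.inp[0:2], "little")
        return bytes(self.inp[2:2 + length]).decode("utf-8")

conn = Connection(instance=1)
client = Chat.as_client(conn)
server = Chat.as_server(conn)

client.write("hello server")
msg = server.read()
server.write("hello client")
reply = client.read()
```

Both Chat objects are views over the same bytes, so nothing is copied between the two sides. Only the roles of the two pages differ.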
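The listen/connect handshake described above can be sketched with two threads in Python. This is a simplified model under stated assumptions, not the Koga implementation. It collapses the local and system administrators into one hypothetical SystemAdministrator class, and it uses queues in place of the shared pages and the send notification.

```python
import queue
import threading

class Connection:
    """Two one-way channels stand in for the two shared pages."""
    def __init__(self):
        self.to_server = queue.Queue()
        self.to_client = queue.Queue()

class SystemAdministrator:
    """Pairs a connecting process with whoever listens under a given id."""
    def __init__(self):
        self._inboxes = {}
        self._lock = threading.Lock()

    def _inbox(self, instance):
        with self._lock:
            return self._inboxes.setdefault(instance, queue.Queue())

    def listen(self, instance):
        # Blocks until some process connects to this id.
        return self._inbox(instance).get(timeout=5)

    def connect(self, instance):
        conn = Connection()
        self._inbox(instance).put(conn)
        return conn

def server(admin, log):
    conn = admin.listen(1)
    message = conn.to_server.get(timeout=5)
    log.append(message)
    # Same decision as the If/else in test.Server above.
    conn.to_client.put("hello client" if message == "hello server" else "huh")

def talker(admin):
    conn = admin.connect(1)
    conn.to_server.put("hello server")
    return conn.to_client.get(timeout=5)

admin = SystemAdministrator()
server_log = []
server_thread = threading.Thread(target=server, args=(admin, server_log))
server_thread.start()
reply = talker(admin)
server_thread.join()
print(reply)
```

Because the inbox for an id is created on first use by either side, the order in which the server starts listening and the talker connects does not matter in this sketch.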
Concurrency
parser machine;
core.Seq {
constructor(Block try, Block catch) {
Addr fail;
Addr complete;
byte[4] status;
byte[4] statusAddr;
byte[4] statusParamAddr;
integer(ADD, RI, LDA, statusAddr, R, task, LDA, status);
~ status pointer is 20 bytes into the admin area
integer(ADD, RI, LDA, statusParamAddr, R, altTask, IL, 0d20);
byte[4] awaitTaskAddr;
integer(ADD, RI, LDA, awaitTaskAddr, R, altTask, IL, 0d0);
context(BLOCK, createTask) {
memory(COPY, PA, LDA, statusParamAddr, LDA, statusAddr, LDS, status);
byte[4] adminTaskMethod;
symbol(METHOD, adminTaskSymbol, IL, Administrator, IL, task);
class(ADDR, I, adminTaskMethod, adminTaskSymbol);
logician(START_ADMIN, LDA, adminTaskMethod);
};
context(BLOCK, taskReady) {
byte[4] adminScheduleMethod;
symbol(METHOD, adminScheduleSymbol, IL, Administrator, IL, awaitTask);
class(ADDR, I, adminScheduleMethod, adminScheduleSymbol);
integer(ADD, RI, LDA, awaitTaskAddr, R, altTask, IL, 0d0);
memory(COPY, PA, LDA, awaitTaskAddr, CL, task, IL, 0d4);
logician(START_ADMIN, LDA, adminScheduleMethod);
conditionalBranch(NEQ, TI, LDA, status, IL, 0d0, fail);
};
Block try;
jump(REL, I, complete);
Addr fail;
Block catch;
Addr complete;
}
}
Reference<Document R> {
...
invoke(Name methodName) {
byte[4] adminTask;
byte[4] frameSize;
byte[4] methodAddr;
byte[4] newTask;
integer(ADD, RI, LDA, adminTask, R, altTask, IL, 0d0);
integer(ADD, RI, LDA, newTask, R, task, LDA, newTask);
symbol(METHOD, methodSymbol, LG, R, AL, methodName);
class(SIZE, I, frameSize, methodSymbol);
class(ADDR, I, methodAddr, methodSymbol);
~ copy all the admin arguments
memory(COPY, PA, LDA, adminTask, LDA, newTask, LDS, newTask);
integer(ADD, TI, LDA, adminTask, LDA, adminTask, LDS, newTask);
memory(COPY, PA, LDA, adminTask, LDA, objectAddr, LDS, objectAddr);
integer(ADD, TI, LDA, adminTask, LDA, adminTask, LDS, objectAddr);
memory(COPY, PA, LDA, adminTask, LDA, objectTable, LDS, objectTable);
integer(ADD, TI, LDA, adminTask, LDA, adminTask, LDS, objectTable);
memory(COPY, PA, LDA, adminTask, LDA, methodAddr, LDS, methodAddr);
integer(ADD, TI, LDA, adminTask, LDA, adminTask, LDS, methodAddr);
memory(COPY, PA, LDA, adminTask, LDA, frameSize, LDS, frameSize);
integer(ADD, TI, LDA, adminTask, LDA, adminTask, LDS, frameSize);
~ use an implicit createTask body
~ otherwise get the status data address, copy to admin arg, and invoke admin.task
createTask {
byte[4] status;
integer(ADD, RI, LDA, status, R, task, LDA, status);
memory(COPY, PA, LDA, adminTask, LDA, status, LDS, status);
byte[4] adminTaskMethod;
symbol(METHOD, adminTaskSymbol, IL, Administrator, IL, task);
class(ADDR, I, adminTaskMethod, adminTaskSymbol);
logician(START_ADMIN, LDA, adminTaskMethod);
};
byte[4] frameDataAddr;
integer(ADD, TI, LDA, frameDataAddr, LDA, newTask, IL, 0d0);
args();
context(IMPLICIT, task, LDA, newTask);
taskReady {
byte[4] adminScheduleMethod;
symbol(METHOD, adminScheduleSymbol, IL, Administrator, IL, awaitTask);
class(ADDR, I, adminScheduleMethod, adminScheduleSymbol);
integer(ADD, RI, LDA, adminTask, R, altTask, IL, 0d0);
memory(COPY, PA, LDA, adminTask, LDA, newTask, LDS, newTask);
logician(START_ADMIN, LDA, adminScheduleMethod);
};
context(REMOVE, task);
}
}
parser host;
usables {
core.Int;
core.If;
core.Task;
core.Seq;
core.Boolean;
core.Pointer;
core.Reference;
core.AdminRef;
core.Exit;
}
documents {
util.SimpleAdministrator;
}
test.TryTest {
main() {
AdminRef admin init();
Int x 0;
Pointer<Int> ptr (x);
Reference<TryTest> this this();
Seq {
this second(ptr);
this second(ptr);
this second(ptr);
this second(ptr);
this second(ptr);
} {
x = 30;
};
Exit;
}
second(Pointer<Int> ptr) {
Int test 15;
Int x;
ptr -> x;
x if(test) {
Task f fail();
};
x + 5;
ptr <- x;
Task t complete();
}
}
I don't like async/await syntax, but what it does is fairly useful. I've tried to replicate all the usefulness without any of the ugliness. A major goal of the language is to neatly implement structured concurrency constructs using the language just like all the other constructs.
The initial Reference example I showed earlier was a bit of a lie. The Reference above has the more accurate invoke method. You can also see an example of a comment: just start the line with a ~, which represents the curvature of writing.
In this invoke method we try to invoke two blocks, createTask and taskReady, but we have some default instructions to use if these blocks don't exist. They might not exist because the blocks here aren't passed in as arguments. They are named implicits, looked up from the surrounding context. This allows the reference to be configured when scheduling new tasks. By default it awaits the created task. A construct can supply its own blocks instead to await a group of tasks, which allows constructs like the ones shown below.
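The "named implicit with a default" pattern can be approximated in Python with optional keyword arguments. This is only a loose analogy under stated assumptions. In Koga the blocks are found implicitly from the surrounding context, whereas here they are passed explicitly, and all names (invoke, create_task, task_ready, the task dictionary) are invented for the sketch.

```python
def default_create_task(fn, args):
    """Default createTask: record what to run and where its status goes."""
    return {"fn": fn, "args": args, "status": None}

def default_task_ready(task):
    """Default taskReady: run the task right away and await its result."""
    task["status"] = task["fn"](*task["args"])
    return task["status"]

def invoke(fn, *args, create_task=None, task_ready=None):
    """Use the caller's blocks when present, otherwise fall back to the defaults."""
    create = create_task or default_create_task
    ready = task_ready or default_task_ready
    task = create(fn, args)
    return ready(task)

def add(a, b):
    return a + b

# No blocks supplied: the task is created and awaited immediately.
result = invoke(add, 2, 3)

# A grouping construct supplies its own taskReady and collects tasks instead.
pending = []
deferred = invoke(add, 4, 5, task_ready=pending.append)
results = [default_task_ready(task) for task in pending]
```

The second call returns nothing useful by itself. The construct that supplied task_ready decides when and how the collected tasks are awaited, which is the hook that Seq, One and All rely on.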
usables {
...
core.Seq;
core.One;
core.All;
}
main() {
...
~ Runs sequentially, one after the other
~ Error thrown on a method failing
Seq {
this taskOne();
this taskTwo();
If (shouldDoTaskThree) {
this taskThree();
};
} catch {
~ Error handling here
...
};
~ Runs in parallel and continues after first one completed
~ Error thrown on all methods failing
One {
this taskOne();
this taskTwo();
If (shouldDoTaskThree) {
this taskThree();
};
} catch {
~ Error handling here
...
};
~ Runs in parallel and waits for all to complete
~ Error thrown on a single method failing
All {
this taskOne();
this taskTwo();
If (shouldDoTaskThree) {
this taskThree();
};
} catch {
~ error handling
...
};
Exit;
}
The idea still needs further development, but I do think it can work well for implementing useful concurrency constructs like All or One while keeping the syntax clean. I should note that there is no function colouring either: every method is invoked as a single task. This needs more explanation, which will be written elsewhere at another time.
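For readers who want the intended semantics spelled out, here is a small Python sketch of the three constructs using threads. It is a model of the behaviour described in the comments above, not of Koga's implementation. The names seq, one, all_ and TaskFailed are assumptions for this example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

class TaskFailed(Exception):
    """Stands in for a task that calls fail()."""

def seq(tasks, catch):
    """Run tasks one after another; the first failure runs catch and stops."""
    results = []
    for task in tasks:
        try:
            results.append(task())
        except TaskFailed as err:
            return catch(err)
    return results

def one(tasks, catch):
    """Run tasks in parallel and continue with the first success;
    catch runs only if every task fails."""
    pool = ThreadPoolExecutor(max_workers=len(tasks))
    try:
        errors = []
        for future in as_completed([pool.submit(t) for t in tasks]):
            try:
                return future.result()
            except TaskFailed as err:
                errors.append(err)
        return catch(errors)
    finally:
        pool.shutdown(wait=False)  # do not wait for the slower tasks

def all_(tasks, catch):
    """Run tasks in parallel and wait for all; any failure runs catch."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(t) for t in tasks]
    # Leaving the with block has waited for every task to finish.
    try:
        return [f.result() for f in futures]
    except TaskFailed as err:
        return catch(err)

def ok(value):
    return lambda: value

def bad():
    raise TaskFailed("boom")

calls = []
def tracked():
    calls.append("ran")
    return 3

seq_ok = seq([ok(1), ok(2)], catch=lambda err: "caught")
seq_fail = seq([ok(1), bad, tracked], catch=lambda err: "caught")
one_ok = one([bad, ok(2)], catch=lambda errs: "caught")
one_fail = one([bad, bad], catch=lambda errs: len(errs))
all_ok = all_([ok(1), ok(2)], catch=lambda err: "caught")
all_fail = all_([ok(1), bad], catch=lambda err: "caught")
```

The task functions are ordinary functions, with no async marker, which loosely mirrors the point about there being no function colouring.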
Conclusion
That concludes a short tour of the language in its current state. Hopefully you have a decent idea of how things work. I definitely omitted details, lied a bit, and used wrong or inconsistent terms all over, apologies!
There are more examples available in the resources folder, though without explanations.