SP-Lang: Category theory in the wild

Our technology uses a stream processing language called SP-Lang. SP-Lang is aimed at people who don’t program, with simplicity comparable to, e.g., spreadsheet macros or SQL. SP-Lang tries to do as much of the heavy lifting as possible for the user, transparently.

On the other hand, SP-Lang is compiled down to the machine code of the CPU for the best possible performance.

We recently encountered several interesting problems that demonstrate how seemingly abstract category theory finds practical application and helps us solve such problems sustainably.

Dictionary and type inference

A dictionary (or a map) is a type that composes two types into a pair: the type of the key, Tk, and the type of the value, Tv. These two types are composed into the dictionary type {Tk:Tv}.

Our goal is to avoid explicit type specifications as much as possible because our users don't understand the concept of types. This means that we need to infer types from clues in the code.
This is a challenge for a dictionary because its keys and values can be of several different types.

Note: A similar issue is found in the list type, which is the other container type.

In practice, this problem is illustrated by the following example:

{
    "Key1": "Hello!",
    "Key2": 123
}

What is the type of this dictionary? The first value is a string "Hello!" and the second is an integer 123.

Let's try to write a type signature for it:

{ str : ( str + si64 ) }

str is for a string type
si64 is for an integer type (Signed Integer 64bit)

The type ( str + si64 ) is a so-called "sum type", in the terminology of category theory. It declares that the value can be either a string or an integer: a string type PLUS an integer type. We added one type to another using the algebraic operator + (PLUS). This is why the sum type is called an algebraic type: it allows adding (PLUS or SUM) one type to another.

Note: The algebraic operator + (PLUS) is equivalent to | (OR) in this context. An alternative name for the sum type is "coproduct".

Type inference can construct this sum type from the type clues in the code itself.
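As a rough illustration of the idea, here is a minimal Python sketch that builds the signature above from a dictionary literal. It is not how SP-Lang is implemented: the function names (scalar_type, sum_type, infer_dict_type) and the mapping from Python values to str and si64 are assumptions made for this example only.

```python
# Hypothetical sketch; names and rules are illustrative, not SP-Lang internals.

def scalar_type(value):
    """Map a Python value to an SP-Lang-style type name (assumed mapping)."""
    if value is None:
        return "None"
    if isinstance(value, bool):
        raise TypeError("bool is not covered by this sketch")
    if isinstance(value, int):
        return "si64"
    if isinstance(value, str):
        return "str"
    raise TypeError(f"unsupported value: {value!r}")


def sum_type(values):
    """Collect the distinct types seen among the values and join them with +."""
    seen = []
    for value in values:
        name = scalar_type(value)
        if name not in seen:  # keep first-seen order, drop duplicates
            seen.append(name)
    if not seen:
        raise ValueError("no values, so there are no clues to infer from")
    if len(seen) == 1:
        return seen[0]  # a single type needs no sum
    return "( " + " + ".join(seen) + " )"


def infer_dict_type(literal):
    """Infer a { Tk : Tv } signature from the keys and values of a literal."""
    key_type = sum_type(literal.keys())
    value_type = sum_type(literal.values())
    return "{ " + key_type + " : " + value_type + " }"


signature = infer_dict_type({"Key1": "Hello!", "Key2": 123})
print(signature)  # { str : ( str + si64 ) }
```

Each distinct type seen among the values becomes one term of the sum, so a dictionary whose values are all integers would simply be typed as { str : si64 }.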

Type `any`

There is a particular type in SP-Lang: any. This type is a sum of all possible types in SP-Lang.

We can extend the above example:

{ str : any }

And the type signature is valid.

The any type is a fundamental building block for sum types, and it is very powerful. However, there is a downside: it is less efficient than scalar types because it adds some overhead at runtime in the compiled SP-Lang code.
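One way to picture any as "the sum of all types" is to model a sum type as a set of type names. The Python sketch below does that under loud assumptions: the universe of only three types, the names ALL_TYPES, ANY, type_name and accepts, and the set representation itself are all invented for illustration and say nothing about how SP-Lang represents types internally.

```python
# Hypothetical sketch: a sum type is modelled as a set of type names.
# The three-type universe is an assumption; SP-Lang has more types than this.
ALL_TYPES = frozenset({"str", "si64", "None"})
ANY = ALL_TYPES  # "any" is the sum of every type in the universe


def type_name(value):
    """Map a Python value to an SP-Lang-style type name (assumed mapping)."""
    if value is None:
        return "None"
    if isinstance(value, str):
        return "str"
    if isinstance(value, int) and not isinstance(value, bool):
        return "si64"
    raise TypeError(f"unsupported value: {value!r}")


def accepts(sum_type, value):
    """A value fits a sum type when its own type is one of the summed types."""
    return type_name(value) in sum_type


STR_OR_INT = frozenset({"str", "si64"})  # ( str + si64 )
values = ["Hello!", 123, None]

narrow = [accepts(STR_OR_INT, v) for v in values]
wide = [accepts(ANY, v) for v in values]

print(narrow)  # [True, True, False]
print(wide)    # [True, True, True]

# Adding any type to "any" changes nothing: the sum already contains it.
print((STR_OR_INT | ANY) == ANY)  # True
```

In this model the + operator is plain set union, which matches the reading of + as OR, and any accepts every value because nothing is left outside the sum.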

Tuples with `None` items

Another problem we encounter is that people want to use tuples as dictionary keys (that's ok), but some of the tuple members could be None.

It looks like this:

{
    ("One", 1): "Foo",
    ("Two", None): "Bar",
}

Notice the None in the second key of the dictionary.

Naively, you would type this dictionary as:

{ ( str, si64 ) : str }

... but this doesn't allow None in the tuple.
So the sum type comes to the rescue again:

{ ( str, ( si64 + None ) ) : str }

... and yes, None is a type. We state that the second member of the tuple can be either an integer or None: an integer PLUS (OR) None.
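The same inference can be sketched for tuple keys by summing the member types position by position. The Python code below is a hedged illustration, not SP-Lang's algorithm: scalar_type, sum_type and infer_tuple_type are hypothetical helpers, and the sketch assumes that all tuple keys have the same length.

```python
# Hypothetical sketch; names and rules are illustrative, not SP-Lang internals.

def scalar_type(value):
    """Map a Python value to an SP-Lang-style type name (assumed mapping)."""
    if value is None:
        return "None"
    if isinstance(value, str):
        return "str"
    if isinstance(value, int) and not isinstance(value, bool):
        return "si64"
    raise TypeError(f"unsupported value: {value!r}")


def sum_type(values):
    """Join the distinct types seen among the values with +."""
    seen = []
    for value in values:
        name = scalar_type(value)
        if name not in seen:  # keep first-seen order, drop duplicates
            seen.append(name)
    if len(seen) == 1:
        return seen[0]
    return "( " + " + ".join(seen) + " )"


def infer_tuple_type(tuples):
    """Infer a tuple type by summing member types position by position."""
    tuples = list(tuples)
    if not tuples:
        raise ValueError("no tuples, so there are no clues to infer from")
    arity = len(tuples[0])
    if any(len(t) != arity for t in tuples):
        raise ValueError("this sketch assumes tuples of equal length")
    # zip(*tuples) regroups the members by position: all firsts, all seconds...
    members = [sum_type(column) for column in zip(*tuples)]
    return "( " + ", ".join(members) + " )"


literal = {
    ("One", 1): "Foo",
    ("Two", None): "Bar",
}
key_type = infer_tuple_type(literal.keys())
signature = "{ " + key_type + " : " + sum_type(literal.values()) + " }"
print(signature)  # { ( str, ( si64 + None ) ) : str }
```

Only the second position needs a sum type here, because the first members of both keys are strings.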

For the sake of completeness, the any type can be employed here as well:

{ ( str, any ) : str }

Conclusion

SP-Lang has a wide spread of goals: very high performance on one side and ease of use for language users on the other. This means a lot of work that SP-Lang has to do on behalf of the user. In these specific cases, it is about the SP-Lang approach to types. SP-Lang is designed to avoid explicit type specifications. Regarding types, it gives the user a feeling similar to Python or SQL. Users can simply avoid them completely, so they have one less thing to worry about. On the other hand, because SP-Lang is compiled into machine code, types have to be explicitly known during compilation. Figuring out types at runtime slows the execution a lot (see Python).

To solve this, SP-Lang uses type inference. It is a powerful technique that uses clues provided by the user in the code to infer types automatically. Type inference feels a bit magical, and it is certainly a complex algorithm. The tricky part is how to get it right. "Right" means sustainable in this context: we want to build more language features on top of it.

And this is where category theory, algebraic types, and particularly the sum type helped us. They provide proof that we correctly analyze and solve the practical problems that we meet in the wild.

About the Author

Ales Teska

TeskaLabs’ founder and CEO, Ales Teska, is a driven innovator who proactively builds things and comes up with solutions to solve practical IT problems.
