Sea (a compiled language)
Hello everyone! This is my first post, so if it's boring I apologize :D
Just wanted to share a little compiled language I made called Sea. It's called Sea because, well, it has a lot in common with C. I chose to mimic C because this was my first compiled language: since C is so low-level, I figured I wouldn't have to do too much work translating to x86.
A quick overview of the language
In case any of you want to try it out, here's a quick overview of the syntax/semantics.
To start, we'll take the simple "hello, world" program:
Here we immediately see a few differences between C and Sea:
- Import is not a preprocessor routine, it's a keyword
- Functions are defined as <type> function <name>
- Functions automatically return their last expression — to suppress this, the last expression can be
- Arguments in functions are declared differently — instead of putting arguments between parentheses, they are written as null declarations on the following line:
Control flow (if, else, while) is implemented as a "postfix operator":
A for-loop can be emulated with a block scope and a
These control flow constructs can be extended — an excerpt taken from the stdlib:
$int represents the int in question, and $() represents execution of the block given to the control statement. We could "call" this like so:
Sea has support for ints, chars, pointers, strings (char*), and moderate support for floats. If you want to see more about the language, peruse
stdlib.sea and the
oldtests, though some of those are outdated) folders in the repl.
stdlib.sea uses what might be a confusing doc notation, so let me explain that here.
Sea comments behave just like C comments:
However, there is a
doc notation (used by SeaLib to generate simple HTML docs for a file). It is the following:
SeaLib translates that to the following:
And yes, that file would actually run (aside from saying "entry point main is not defined"). Doc headers are ignored (along with the line after) just like normal comments.
"That's neat — but why are you posting it here right now?"
Unfortunately, I'm going to abandon Sea (in favor of a newer language, Curta). Why? Two things:
- As the language became more complex, I noticed a few things I had done not-quite-nicely in the compiler were becoming difficult to manage. I was faced with either using bad tools or buying (re-coding) new tools.
- I was actually wrong about translating from a C-like language to x86 being easy. While the actual conversions themselves are not difficult, finding the proper x86 commands can be very difficult! For instance, there is no power function for the SSE floating point system — and all the tools I needed to make a power function were either nonexistent or difficult to implement.
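To give a feel for the power-function problem: the one case that needs nothing beyond multiplication is a non-negative integer exponent, which can be computed by repeated squaring. This is a sketch under that assumption, not what Sea did; `powInt` is a hypothetical name, and a general power function with fractional exponents would also need logarithm and exponential routines, which is the harder part.

```javascript
// Raise base to a non-negative integer exponent using only multiplication.
// Hypothetical helper for illustration; not part of Sea.
function powInt(base, exponent) {
  if (!Number.isInteger(exponent) || exponent < 0) {
    throw new RangeError("exponent must be a non-negative integer");
  }
  let result = 1;
  let factor = base;
  let remaining = exponent;
  while (remaining > 0) {
    if (remaining % 2 === 1) result *= factor; // low bit set: multiply it in
    factor *= factor;                          // square for the next bit
    remaining = Math.floor(remaining / 2);
  }
  return result;
}

console.log(powInt(2, 10));  // 1024
console.log(powInt(1.5, 3)); // 3.375
```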
So Sea will probably remain in its current state [for the rest of time].
In the aforementioned newer language (Curta), I am learning from my mistakes — it will compile to C++, with the added benefit that it will work on some embedded systems (most notably the Arduino). Stay tuned!
EDIT: Curta is now here: https://repl.it/talk/challenge/Curta-Lets-make-hard-things-easy/51820
Amazing job! Just a question, why and how did you use js? I thought most programming languages were made in c and c++...
@AbhayBhat Glad you asked! Many interpreted languages are made in C and C++. Why? Because C and C++ are fast, and interpreted languages need to be built on a fast language since they are inherently slow. However, a compiled language is inherently fast, because it's actually not running on JS — it's running on x86. The job of the JS is to convert the Sea programs to x86 assembly, and x86 is very very fast.
So, to answer your question: I used JS because JS is easy to code in, so I wouldn't be hindered by it (and performance of the host language doesn't really matter much for compiled languages).
As a side note: after completing a compiler, one usually rewrites the compiler's source in the language it compiles (this is called self-hosting). This is so that you get rid of the JS.
Note: Re-reading my post, it might be a little unclear. Here's the basic idea:
If my language was interpreted, then the JS would be the one running the language. JS is slow, so I don't want a slow thing running my language since my language will run even slower.
Since the language is compiled, JS doesn't actually run the language. Instead, JS just translates it into assembly, which is assembled into machine code, and that is super fast. Even though JS is slow, the JS doesn't actually do a whole lot, so it's acceptable to use it.
Runtime.getRuntime().exec just runs an external command (a Bash script, for example). Basically, you can use that to call the compiler or assembler depending on how you're making the compiler (either compiling directly to assembly or transpiling to another language which is then compiled). It's like the child_process.exec function in JS.
@AmazingMech2418 Ah ok, yes that would run the assembler (or the compiler of the non-assembly language if applicable, though I think that might be considered an assembler also in that case?). But the compiler is the program itself, so unless the program exec's itself (creating an infinite loop) you probably wouldn't use that to run the compiler.
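For anyone curious what launching an external tool from JS looks like, here is a minimal sketch using Node's `child_process.execFileSync`. The `runTool` helper is a made-up name, and the command being launched is Node itself as a stand-in so the example runs anywhere; a real compiler driver would pass the assembler's name and the generated file instead.

```javascript
const { execFileSync } = require("child_process");

// Run an external program and return what it printed to stdout.
// `runTool` is a hypothetical helper, not part of Sea.
function runTool(command, args) {
  return execFileSync(command, args, { encoding: "utf8" });
}

// Stand-in for an assembler: launch Node itself so the sketch is portable.
const output = runTool(process.execPath, ["-e", "console.log('assembled')"]);
console.log(output.trim()); // assembled
```

Note that the compiler only launches the next tool in the chain here; it never launches itself.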
This is incredible!
How do you make it a compiled programming language? Is it compiled due to the assembly or..?
I am very interested in finding out how to make a homemade programming language compiled!!
@targetfanttthat Epic! We need more compiled languages!
So, an interpreted language is run by the language it's written in. If I wrote an interpreted language in Python, the Python would execute the code.
A compiled language translates the code of your language into a lower-level language like x86 assembly or C. The compiler doesn't run the code — instead, it just translates from one language to another.
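To make the difference concrete, here is a toy sketch in JS. It assumes a made-up mini-language that only supports adding integers (like "1 + 2 + 3"); the `parse`, `interpret`, and `compile` names are hypothetical and this is not how Sea's compiler is structured. The interpreter computes the answer in JS, while the compiler only produces x86 assembly text (Intel syntax) that would compute the answer once assembled and run.

```javascript
// Toy language: integers joined by "+", e.g. "1 + 2 + 3".
// Hypothetical example, not Sea's actual compiler.

// Split the source into integer operands.
function parse(source) {
  return source.split("+").map((part) => Number.parseInt(part.trim(), 10));
}

// Interpreter: the host language (JS) computes the result itself.
function interpret(source) {
  return parse(source).reduce((sum, n) => sum + n, 0);
}

// Compiler: the host language only emits x86 assembly text.
// The emitted code leaves the result in eax; nothing is evaluated here.
function compile(source) {
  const [first, ...rest] = parse(source);
  const lines = [`mov eax, ${first}`];
  for (const n of rest) {
    lines.push(`add eax, ${n}`);
  }
  lines.push("ret");
  return lines.join("\n");
}

console.log(interpret("1 + 2 + 3")); // 6
console.log(compile("1 + 2 + 3"));
```

The second call prints three instructions and a `ret` rather than a number: the actual addition happens later, when the assembled program runs.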
Since you said you're interested in making a compiled language, I'd suggest starting here: https://norasandler.com/2017/11/29/Write-a-Compiler.html
It requires basic programming knowledge, but other than that it builds from the ground up!
@fuzzyastrocat That is where I have gotten stumped. I have tried and tried to find documentation on how to make my programming language a compiled one, but I just gave up because none of the documentation was very open or reliable, I guess you could say.
Thank you for that link though I am going to have a very fun night tonight, lol
@Lethdev2019 Well yes, because Python supports some higher-level libraries (like PLY) to help people out. But in this case you are still at a disadvantage because you lose control over many things. For example, you have no control over how the lexer picks up keywords or punctuation; you depend fully on a Python library.
Yes, I agree, Python is quicker and more optimizable, but in the end you really have control over only one part of your language: the syntax.
@Lethdev2019 Well yes, making the better use of, being, making the situation better and easier in any way possible.
To me, Python wouldn't be the best language to use because it is a high-level, dynamically typed, interpreted, garbage-collected language. All of that makes it slower in the end and less efficient at memory management, which is key for any programming language.
@Lethdev2019 Yes, but just because it was made in C does not mean that the language is going to be fast.
I am talking about why I think Python shouldn't be used to create a programming language. We all have our own opinions though!
I personally like the hands-on feeling of needing to allocate memory, work with low level stuff etc.
You may not, and you would rather just get straight into it without needing to worry about any of the stuff I need to worry about when using C.
Here are a few links to websites/a youtube video I was referencing to state my facts:
@targetfanttthat @Lethdev2019 I see both sides of your argument. As for py_compile, that compiles Python code, not custom language code. Sure, you can translate a custom language to Python, but as @targetfanttthat pointed out the only thing that changes is the "looks" of the language, not the functionality. And if you really do create a true translation of a different language into Python, you're basically just compiling to Python, so why not compile to a lower-level (and faster) language like C?
As for optimization: Yes, Python is a very optimizable language. But no matter how much you optimize it, it will still be an interpreted language, and interpreted languages are naturally slower. This is why Python is considered a slow language. I'm not saying Python is bad here: it has very good uses because of how high-level it is. But that same high-level-ness is what makes it slow, so for this particular instance it's not a good choice for a language in which you want performance.
However, if you don't care about performance, then making a language in Python is totally fine. I made my first language in Python, because I knew that Python would be easy to work with. However, that language was painfully slow compared to even Python, which is why I've moved on to making compiled languages (which compile to a fast, low-level language).
As for language builder libraries: This is a hotly debated subject. Some people say that these libraries are the best tools to use — others say they are terrible. In my opinion, there is no right answer here. If you just want to get it done fast (or easy), or if you're new to language building, you should use a language builder library. However, if you want full control over the lexing/parsing/etc process (like I did) or if you just want to make it yourself for the fun of it (also like I did), then you should build it from the ground up with no libraries. Neither way is wrong, just a matter of preference.
Pardon the wall of text — didn't realize how long this was until I typed it all out :P
I see some similarities between this and Haskell. Oh, quick question though, I'm still learning the very tiny basics of Haskell but I can't seem to find the
WHERE ARE THE VARIABLES
Like can I store the result of a function? For example:
@DynamicSquid Here's a little bit of advanced Haskell:
Say you wanted to be able to turn anything into an uppercase name. You could write a name function like so:
But wait a minute: notice how x is at the end of both name x and map toUpper x? That means we can get rid of it, i.e.,
But why? The best way to understand this is currying:
map, when called with 0 arguments (written out) creates a function that takes one argument
- ...that creates a function that takes one argument
- ...which takes the first argument and maps it over the second.
So by saying map toUpper, we're partially applying map. It doesn't have the last argument yet (the thing to map over), so it doesn't do anything. It's just a function that, when you give it the final argument, will then apply the map. It's a bit confusing at first, but I'd suggest researching it since this is something used extensively in Haskell. It's also the basis of point-free programming. To learn more about currying, check out the Wikipedia article.
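If it helps, the same idea can be sketched in JavaScript rather than Haskell. Everything here is an assumption for illustration: `map` is a hand-written curried helper over the characters of a string (not JS's built-in `Array.prototype.map`), and `toUpper` is a small stand-in for Haskell's function of the same name.

```javascript
// Hand-written curried map over the characters of a string (hypothetical helper).
// Calling map(f) does no work yet; it returns a function awaiting the text.
const map = (f) => (text) => [...text].map(f).join("");

// Stand-in for Haskell's toUpper on a single character.
const toUpper = (ch) => ch.toUpperCase();

// Argument written out, in the spirit of `name x = map toUpper x`.
const nameExplicit = (x) => map(toUpper)(x);

// Argument dropped: partially applying map already gives the function we want.
const name = map(toUpper);

console.log(nameExplicit("sea")); // SEA
console.log(name("sea"));         // SEA
```

Both versions behave identically; the second just never names the argument.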
What are some of the internal differences between Sea and C? Like when you think of cpp, it adds templates, references, etc so what does Sea add?
Regardless of features, cool language man!
@yekyam Answer: basically nothing.
"Wait, what?" Yes, in its current state, Sea adds hardly anything internally to C (the control flow extension system is new, though you could represent that in C code). Then why did I make it? Well, the key phrase is "current state": all the cool internal additions to C I had planned became too cumbersome to implement (due to problem 2 listed above).
However, you can expect some really cool (internal) features with my next language, Curta. So... stay tuned! :D
(and thanks for "cool language"!)
@yekyam That's rather difficult to determine. As Curta is an embedded systems language, its memory footprint is very, very low. However, I believe the answer would probably be garbage collection (though much less overhead than other garbage-collected languages).
Why am I using all this shifty wording? Because to properly explain it, I'd have to give away a good deal about the language, and that would ruin all the surprise :D
@yekyam If you're interested, curta is now here: https://repl.it/talk/challenge/Curta-Lets-make-hard-things-easy/51820
It's changed since my last post here but it still retains its purpose!
This is sick! I was trying to make a Lang, but I got a bit confused the minute I hit parsing which is embarrassing, ngl.
Can't wait for Curta!
@Highwayman Nice! Yeah, I think more languages need a system for easy documentation embedded in the code.
Well, the way it was constructed I was using many variables to keep track of compiler state (rather than using immutable types and keeping state locally via function calls). It's a bit difficult to explain, but it started to hinder me a bit. Additionally, the parser error system was not well-constructed, so sometimes errors would be very misleading.
So nothing went objectively wrong, it's just that I'm going to do things a bit differently the second time around.
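As a rough illustration of the two styles (not Sea's actual compiler code; all names here are hypothetical), compare a token reader that advances a shared variable with one that takes the current state as an argument and returns a new state alongside its result:

```javascript
// Style 1: a mutable variable shared by every helper.
let position = 0;
function nextTokenMutable(tokens) {
  return tokens[position++]; // side effect: advances the shared cursor
}

// Style 2: state is passed in and a new state is returned, never mutated.
function nextToken(tokens, state) {
  return {
    token: tokens[state.position],
    state: Object.freeze({ position: state.position + 1 }),
  };
}

// Sum all number tokens by threading state through each call.
function sumTokens(tokens) {
  let state = Object.freeze({ position: 0 });
  let total = 0;
  while (state.position < tokens.length) {
    const step = nextToken(tokens, state);
    total += Number(step.token);
    state = step.state;
  }
  return total;
}

console.log(sumTokens(["1", "2", "3"])); // 6
```

In the second style, every function's effect on the state is visible in its return value, which tends to make it easier to see where a wrong value came from.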
@Highwayman Hey! Since you were interested in Curta, here's the link: https://repl.it/talk/challenge/Curta-Lets-make-hard-things-easy/51820
It's still in development but I've gotten it to a state where it can be used for actual purposes.
@ironblockhd You're probably familiar with interpreted languages (Night being a good example here). Interpreted languages just run in the language you write them in, so if I made an interpreted language in JS, the JS would run the language. However, this is a compiled language, meaning that the JS just translates my language into assembly language. Why do this? Because x86 assembly is blazingly fast, meaning that my compiled language will have a large performance gain over an interpreted one.
TL;DR: Instead of directly running my language, I translate it into assembly. Then I run the assembly.
Hope that helps!