How To Make A True Coding Language: Part 1
I'm making this tutorial series because almost every language I've seen posted to repl talk doesn't use parsing algorithms, and I think it would be nice to see some that do. These languages typically come in 2 forms:
1) They use string splitting and regular expressions
Technically you can call this "parsing" or a language of some sort.
But you will very quickly discover you run into syntax limitations like having to have a separator for a lot of things.
2) They do nothing at all but define some classes or variables
I don't know how people get away with this and then tell you to calm down when someone calmly separates what it is from what it isn't, even when it is put in the best possible words so as not to directly attack the repl itself.
Which is why I have decided to create a tutorial on making a programming language in hopes people start making ones that don't have the above flaws.
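To make the limitation of the first form concrete, here is a small sketch of what splitting-based "parsing" does. The helper name split_tokens is made up for this illustration and is not taken from any repl discussed here.

```python
def split_tokens(source):
    # Naive approach: rely on whitespace as the separator between tokens.
    return source.split()

# Works only while the user puts a space around everything.
print(split_tokens("1 + 2 * 3"))            # ['1', '+', '2', '*', '3']

# Without the separators, the whole expression comes back as one piece.
print(split_tokens("1+2*3"))                # ['1+2*3']

# A string literal containing a space gets cut in half.
print(split_tokens('print "hello world"'))  # ['print', '"hello', 'world"']
```

This is why such languages end up requiring a separator between almost everything: the splitter has no idea where one token ends and the next begins.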
This tutorial is going to go bottom to top, using no dependencies at all, and will show the creation of everything from a lexer all the way up to a handmade recursive-descent parser!
The Lexer (or scanner, tokenizer, whatever you wish to call it)
Located in lexer.py
The other components will get their own files as they are created.
You do know that there is more than one way to make a programming language, right? Your type one "not programming languages" are actually programming languages, just without as many powerful functions. For example, you could create a LOLCODE interpreter using split functions and regular expressions, and LOLCODE is a programming language. It's not the most useful, but it's still a language. Same with Forth, which is even easier to create an interpreter for. Then, the type two "not programming languages" really aren't programming languages and are just dialects of known languages. However, Clojure is a dialect of Lisp and is considered a separate language, so why couldn't some of what you call "not programming languages" actually be programming languages? For example, the in-development THAIL programming language is really a dialect of Adapt (my programming language, which is also in development). Also, please stop arguing with everyone about the things you call "fake". There is still hard work put into them, just maybe not as much as a real OS or a full interpreted/compiled programming language.
@CSharpIsGud That's it, it ends there. Interfaces are for type checking, types are for type checking, what it calls "function overloading" is the stupidest implementation of function overloading I've ever seen, is just for type checking, templates are just for type checking.
All it is, is JS with types.
Dart has types too, and a whole different syntax, but there's more than syntax and types that makes a language unique. Until TS really branches off from JS, it's still just a dialect. It doesn't add anything new.
Whereas, like I said before, Dart has a whole browser dedicated to running it, parsing it, etc. Look at TypeScript: Deno, one of the only TypeScript runtimes that I've heard of, barely came out a few months ago.
@CSharpIsGud TypeScript has room for improvement. Let's look at its function overloading: it just sets up a pattern of types that a function can accept.
Yet, if it's compiled, can't the TS compiler rename the functions before compiling and separate them? I'm sure it could! I can, and I'm sure you can too, so why can't Microsoft? It wouldn't matter to the developer, since it should all become minified anyways.
@AmazingMech2418 I agree about this. @CSharpIsGud thinks that his way is right and that his way is the way to make a "real" coding language. There are plenty of ways to do something. It's common sense, @CSharpIsGud, that people have different beliefs and stuff! Right now you, @CSharpIsGud, are just being a legit party pooper.
@JaydenLiu1 Look, I'm done replying to ignorance. I never said my way is the right way, I said that they should use an actual algorithm instead of faking it with if statements.
There are plenty of parsing algorithms you can use that aren't what I use.
You can act like a chain of if statements is a language all you want, I don't really care (but it still makes me cringe, and I'm being literal with that statement).
Do you know why none of these if statement languages made it into the pl jam? Yeah, I think you do.
It's straight up useless trying to explain how a bunch of if statements or classes in Python isn't a language to the people that make them, because they are always ignorant and grasp at every straw they can to defend their "project" that took them 5 minutes and takes even less time for people to get bored of and not look back.
From calling anything that opposes them "hate" to blindly praising everyone who shares the same narrow mindset.
@JaydenLiu1 By the way, the python interpreter that got 80 upvotes, was done in probably less than 30 seconds if they typed fast.
The people that defended the above cycle farm are people like you who simply don't understand that you can't just make a bunch of if statements and call it a language.
@JaydenLiu1 So many pings from here.
Also, you are really only harming yourself by quoting him.
Some of his sentences might go a little overboard, but it's basically how things are.
If I didn't care about being swarmed by everyone who believes it takes a lot of effort to make a few if statements in a file I would also put the truth in as blunt words as possible.
@JaydenLiu1 Typically if you actually call out said things as not being what they said they are they will go crazy.
There are plenty of things they may or may not reply with but I can think of some probable ones and their counterparts:
"Dude, chill" ('chilling' is not necessary when you are already 'chill')
"Hater" (This defeats itself.)
Special: "I diDn'T sAy iT wAs A rEAl C0diNG lAnGuAGE!" ('coding language' implies that it is a coding language)
"It took a lot of effort for him to make that" (If it did, you wouldn't be commenting about how it can be done in less than 10-30 minutes)
"Stop." (If they can't actually think of any reason you shouldn't call out how easy it is to do what they did)
If you think about it, most of the "fake languages" are actually languages, they're just implemented very poorly. It's literally like me ragging on you for not separating the lexer and tokenizer because they're completely different.
Can you like not hate on people’s projects just because they don’t fit your idea of a coding language?
@LoganSpong Good idea with isalpha. However, by "standard syntax shared by most languages" under expressions I obviously meant stuff like
1 + 2 * 3
which most languages share.
Also, it's 97 lines because this is just the lexer and it's in Python.
If you look at my other langs, like my Python compiler, you will see it quickly rises into the 3 digit range.
@LoganSpong Mine uses classes, but I never said the C++ classes were classes in my own language. If you look, you will see that the compiler doesn't actually support python classes because I haven't gotten to parsing those yet
And obviously I have to make a program for the compiler to compile
@BobNeo @CSharpIsGud @CodeSalvageON Listen, heated discussions are not the reason repl.it was made. It was meant for making and sharing projects. It was meant for people that don't want to install a programming language on their computer that may be around 200 MB! Also, I do believe CONSTRUCTIVE criticism is good; however, the keyword is constructive. You don't need to create a post about how someone else's post is invalid and wrong. You can simply comment on their post suggesting the name be changed to something different. @LoganSpong's module is actually really good, and although he may have the description wrong, it can still be really helpful for developers. I am working with him on putting his module on PyPI and I hope to see it on there soon. Anyway, I don't mean to point fingers, harass, or anything like that. I am simply trying to put an end to this heated discussion.
Whether it is a "true" or a "fake" language (as you call them), both are useless in the sense that, barring exceptions, nobody will use them (except maybe for fun). I don't think we should blame people for making these "fake" languages or "true" languages, because both are very interesting to code. It is a question of skills: if you are skillful and experienced, then make a "real" language, but if you are a beginner or don't have a lot of time (whatever), code a "fake" language. Nothing bad with that.
Otherwise this tutorial sounds interesting :)
@JaydenLiu1 People call them fake because they don't have any real parsing done at all and can't do anything that a normal language can do easily.
And as to why everyone gets so annoyed every time someone posts one, it is because they probably made one that actually took longer than an hour to do. Meanwhile someone posts 30 minutes' worth of input if print print input print input if, and people will defend that like it's God's creation whenever someone realizes what it is. That's exactly what's wrong with them: they don't take much effort to make, some people have such a loose definition of what a language is around here that they don't have to care, and if someone does start to point out how it took less than 30 minutes to do, they don't have to worry, because most if not all of the people that bothered to even look at it are the ones that will just say they are "overreacting" or "hating".
Simply saying that "fake vs real" makes no sense itself makes no sense when there is a very obvious line between a programming language that uses a well-known parsing algorithm and a program that reads a list of predefined commands from a terminal (hint: the underlying language it's made in does all the work).
Here is a good example of what a "real" one looks like from DynamicSquid, specifically the syntax.
You can't just parse that with a bunch of input and if statements, and it functions like a language instead of a list of set in stone commands.
You know you are basically de-motivating everyone who tries to make a coding language. If someone hadn't read this para, they could have created an awesome language and could've gotten famous. @Spiered. Some people actually make good coding languages, and people actually use them. For example, DynamicSquid's languages are very good.
Also, @JaydenLiu1 if you say you made a programming language, it means you made a programming language. If you didn't put any work into it, they call it fake cause you didn't really make it that well. A fake programming language is just a programming language someone made which has no work put into it. Yours is called fake cause it has no work put into it. A real programming language is one that has work put into it. A programming language is real by default. If you said "I made a programming language", it means you made a real one. You cannot say you never said that you made a "real" programming language, as when you said you made a programming language, it became "real" by default.
I don't know if this made sense, I hope it did.
@TheForArkLD Technically you could call .split and regexp a parser by the definition of the word, but you run into limitations really fast. Note how you had to require multiple separators to split statements, parameters etc., and you would need an expression parsing algorithm like the shunting yard algorithm if you want stuff like
5 + 2 * 3 (including order of operations, of course)
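For the order-of-operations point, here is a minimal sketch of the shunting-yard algorithm mentioned above. It is not the approach this tutorial series builds (that is a recursive-descent parser), and it makes several assumptions: the names OPS, to_rpn and eval_rpn are made up for illustration, the input is already split into tokens, every operator is left-associative, and there is no error handling for malformed input.

```python
import operator

# Precedence and implementation for each binary operator (all left-associative).
OPS = {
    "+": (1, operator.add),
    "-": (1, operator.sub),
    "*": (2, operator.mul),
    "/": (2, operator.truediv),
}

def to_rpn(tokens):
    """Convert infix tokens to reverse Polish notation (shunting-yard)."""
    output, stack = [], []
    for tok in tokens:
        if tok in OPS:
            # Pop operators of higher or equal precedence before pushing this one.
            while stack and stack[-1] in OPS and OPS[stack[-1]][0] >= OPS[tok][0]:
                output.append(stack.pop())
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            # Flush everything back to the matching "(".
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()  # discard the "(" itself
        else:
            output.append(float(tok))  # assumption: anything else is a number
    while stack:
        output.append(stack.pop())
    return output

def eval_rpn(rpn):
    """Evaluate the RPN list produced by to_rpn."""
    stack = []
    for item in rpn:
        if isinstance(item, str):  # operators are the only strings left
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[item][1](left, right))
        else:
            stack.append(item)
    return stack[0]

print(to_rpn(["5", "+", "2", "*", "3"]))            # [5.0, 2.0, 3.0, '*', '+']
print(eval_rpn(to_rpn(["5", "+", "2", "*", "3"])))  # 11.0
```

The multiplication ends up before the addition in the output, so 2 * 3 is evaluated first. That ordering is exactly what a chain of splits cannot give you.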
I hate people saying "I hate fake coding languages" because it offends the creator of the coding language. Please stop saying "fake" or "hate" on people's projects because it's not really nice and it also makes the creator upset about something that he/she made over a long time span.
Also people should stop overreacting on just a project the creator did for fun. People have no point and make no sense. I mean, there’s no difference in the words ”real” and ”fake” when comparing two different coding languages in repl.it.
@JaydenLiu1 People say that, because they aren't languages.
If it's fake, then people will call it fake regardless of whether they hate it or not.
In case you have not seen, some people have literally just done this and called it a "coding language"
Luckily these kinds of things have sort of died down on replit now.
Simply calling it "hate" to call out what something is and is not does not change the fact that the above snippet is just python with a defined class.
@JaydenLiu1 There's a formal definition of a programming language that separates it from shells, markup languages, and others.
Formal definition of a programming language:
A programming language is a formal language comprising a set of instructions that produce various kinds of output. Programming languages are used in computer programming to implement algorithms. Most programming languages consist of instructions for computers.
Some languages are deemed fake because they do not follow these guidelines.
I'm really curious about making my own language, but I'm too busy now :(
The way I understand it is that you have an input
a += 7
And you have to split that up into characters
a, +, =, 7
And that part is called the lexer?
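That is close. The lexer is the part that breaks the input up, but it usually produces tokens rather than single characters, so `+=` comes out as one operator. Here is a minimal sketch of the idea. It is not the lexer.py from this tutorial: the function name tokenize, the token kinds NAME, NUMBER and OP, and the operator list are all assumptions made for this example.

```python
def tokenize(source):
    """Turn source text into a list of (kind, text) tokens."""
    # Longer operators come first so "+=" is matched before "+".
    operators = ["+=", "-=", "+", "-", "*", "/", "="]
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1  # whitespace separates tokens but is not a token itself
        elif ch.isalpha() or ch == "_":
            start = i
            while i < len(source) and (source[i].isalnum() or source[i] == "_"):
                i += 1
            tokens.append(("NAME", source[start:i]))
        elif ch.isdigit():
            start = i
            while i < len(source) and source[i].isdigit():
                i += 1
            tokens.append(("NUMBER", source[start:i]))
        else:
            for op in operators:
                if source.startswith(op, i):
                    tokens.append(("OP", op))
                    i += len(op)
                    break
            else:
                raise SyntaxError(f"unexpected character {ch!r} at position {i}")
    return tokens

print(tokenize("a += 7"))  # [('NAME', 'a'), ('OP', '+='), ('NUMBER', '7')]
print(tokenize("a+=7"))    # same tokens, no spaces required
```

Because the lexer walks the text one character at a time and decides for itself where each token ends, the spaces in the input become optional.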
Hey! Great tutorial/idea! Haven't seen one of these yet. (But don't insult other people's projects either...)
Ummm... I think you're going to be disappointed. And I feel like it's going to take some time to beat mat. I've been ahead of him once before, but lost the lead. I'm satisfied being second. @CodingCactus and @Vandesm14 are slowly creeping up (quickly in the case of CodingCactus). The cycle special is REDACTED. I'll delete that in one minute though :) @DynamicSquid
There. I changed the name of my post to: A collection of powerful functions. Like it now? I can also change it to: Some functions I made called Inspyre.
@LoganSpong Okay, I may or may not be able to help you, so instead I will give you the info on how to do it using my tutorial. You can also fork the project; make sure that you use the version control option on the left when done to make a GitHub repo. On top of this, replace the information in setup.py with the info you want.