How could I split this text at certain points outside of certain things?
hg0428

I need to make something that can do:

gettokens("'Hi,,,,','bye'", [','])
# Returns: (["'Hi", "'bye'"], [','])

gettokens("Hello: Bye, ABC:abc", [',', ':'])
# Returns: (["Hello", "Bye", "ABC", "abc"], [":", ",", ":"])

gettokens("\"hello world\", \"700x700+10+20\"", [","])
# Returns: (["\"hello world\"", "\"700x700+10+20\""], [","])

gettokens("Hel lo:Cow:: Bye, AB C: a bc", [',', ':'])
# Returns: (['Hello', 'Cow', 'Bye', 'ABC', 'abc'], [':', ':', ':', ',', ':'])

gettokens("\"tH+IS Is a string\"+'This+is also a string'* 8-\"Thi+s /st*rin-gs\"", ["+", "-", "*", "/"])
# Returns: (["\"tH+IS Is a string\"", "'This+is also a string'", "8", "\"Thi+s /st*rin-gs\""], ["+", "*", "-"])

I need this gettokens() function.

I have not been successful in making one, though, and neither has anyone else so far.

Please help, I need this.
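One way to sketch a gettokens like this is a single left-to-right character scan that tracks whether it is currently inside a quoted string: delimiters inside quotes are copied verbatim, delimiters outside quotes cut a token. This is only a sketch under my own assumptions (it strips surrounding whitespace and collapses runs of delimiters, and it does not reproduce every quirk of the expected outputs above, such as removing spaces inside tokens):

```python
def gettokens(text, delims):
    """Split text on single-character delimiters, but never inside a
    single- or double-quoted string. Runs of delimiters are collapsed
    and whitespace around tokens is stripped."""
    tokens, seps = [], []
    current = []           # characters of the token being built
    quote = None           # the quote char we are inside, or None
    for ch in text:
        if quote:                      # inside a quoted string: copy verbatim
            current.append(ch)
            if ch == quote:
                quote = None
        elif ch in ('"', "'"):         # opening quote
            quote = ch
            current.append(ch)
        elif ch in delims:             # delimiter outside quotes: cut a token
            token = ''.join(current).strip()
            if token:                  # skip empty splits from delimiter runs
                tokens.append(token)
                seps.append(ch)
            current = []
        else:
            current.append(ch)
    tail = ''.join(current).strip()
    if tail:
        tokens.append(tail)
    return tokens, seps
```

For instance, gettokens("Hello: Bye, ABC:abc", [',', ':']) gives (["Hello", "Bye", "ABC", "abc"], [":", ",", ":"]), matching the second example above.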

Comments
PattanAhmed

@hg0428
Good!

Coder100

But anyway, you can split by regexp.

import re

s = '63, foo, bar,,,,, apple'
# Split on one or more commas plus any trailing whitespace
arr = re.split(r',+\s*', s)
print(arr)
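A re.split can be made quote-aware with a lookahead: split on a comma only when an even number of double quotes follows it, i.e. when the comma is not inside a balanced double-quoted string. This is only a sketch; it assumes quotes are balanced and ignores single quotes and escapes.

```python
import re

text = '"hello, world", "a,b"'
# Split on commas, but only when the rest of the string contains an even
# number of double quotes -- meaning this comma is outside any quoted string.
pattern = r',(?=(?:[^"]*"[^"]*")*[^"]*$)'
parts = [p.strip() for p in re.split(pattern, text)]
print(parts)  # ['"hello, world"', '"a,b"']
```

The comma inside "hello, world" and the one inside "a,b" are left alone; only the comma between the two strings splits.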


hg0428

This does not work in most cases.


@Coder100

Coder100

of course it won't

@hg0428

Coder100

@hg0428
using a split will never work for all cases

Coder100

you're going to have to rethink how you do it.

@hg0428

hg0428

My whole language works with this.
I would have to redesign the lang

@Coder100

Coder100

@hg0428
no you don't

Coder100

only the lexer

@hg0428

Coder100

but of course, you will have to eventually if you are using split based lexing.

@hg0428

hg0428

My parser uses it too.

@Coder100

Coder100

A PARSER USES THE TOKENS YOU HAVE GENERATED. THERE IS ABSOLUTELY NO WAY IT PARSES ANY RAW TEXT WHATSOEVER.

@hg0428

hg0428

Some people look at my code and think it doesn't even have a lexer or parser, so my code is probably very different from what you are used to.

@Coder100

Coder100

well, you are most likely not making a real language if it has no parser

@hg0428

hg0428

It works, and it is real, even though the earlier version could not do things like add the return value of a function to a variable.
If I get this function, then it will be fully functional.


@Coder100

Coder100

Well, just wanting to say that no real language uses .split for lexing, and no language ever should use .split while parsing.

@hg0428

ApoorvSingal

google "How to make a lexer in py"
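Along those lines, here is a minimal regex-based lexer sketch (the token classes and names are my own illustration, not from any particular tutorial): quoted strings are matched as whole tokens, so delimiters inside them can never cause a split.

```python
import re

# Quoted strings are tried first, so anything between matching quotes
# becomes one STRING token and its inner +-*/:, characters are protected.
TOKEN_RE = re.compile(r'''
    (?P<STRING> "[^"]*" | '[^']*' )   # quoted string, either quote style
  | (?P<OP>     [+\-*/:,] )           # operator / delimiter
  | (?P<WORD>   [^\s+\-*/:,'"]+ )     # anything else (whitespace is skipped)
''', re.VERBOSE)

def lex(text):
    """Yield (token_type, token_text) pairs for text."""
    for m in TOKEN_RE.finditer(text):
        yield m.lastgroup, m.group()
```

For example, list(lex('"a+b" + c')) yields a STRING, an OP, and a WORD, with the + inside the quotes untouched.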

Coder100

No matter what, that's never going to work; suppose you had a string that contained the special character.
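To see why: a plain str.split has no idea it is inside a quoted string, so a delimiter inside the quotes tears the token apart.

```python
text = '"hello, world", "bye"'
# The comma inside the quotes splits just like the one outside them.
print(text.split(','))  # ['"hello', ' world"', ' "bye"']
```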

RohilPatel

You're right about that one.

@Coder100

hg0428

The past version of this function did not have that problem.

@Coder100

Coder100

Why would you lex by split?

RohilPatel

That's a fair point

@Coder100