[GoLUG] [parse] Line-izer preprocessor

Steve Litt slitt at troubleshooters.com
Thu Nov 16 23:10:36 EST 2023


Hi all,

In my opinion, in today's world of gigabytes of RAM, there's no reason
why operations shouldn't be done in-memory. Therefore, to make Lexical
Analysis *much* easier, I made a preprocessor that changes every space
and every newline to a printable string, so the entire program becomes
a single line. I did very little "helpful" stuff such as condensing
multiple spaces into one or inserting "blankline" symbols. The only
"helpful" things I did are:

* Inserted a bof symbol at the start and an eof symbol at the end,
  which I think reduces break logic in the Lexical Analysis

* Removed all trailing space from each line. I consider meaningful
  trailing space to be a crime against humanity and won't allow it.
  Notice that leading space is converted faithfully to symbols.

I'm pretty sure my preprocessor is excellent for any grammar not giving
meaning to trailing space. My preprocessor follows:

============================================
#!/usr/bin/python3
import sys

# Printable stand-ins for whitespace and file boundaries
nl = '@@nl$$'
sp = '@@sp$$'
bof = '@@bof$$'
eof = '@@eof$$'

def main():
    print(bof, end='')
    for line in sys.stdin:
        # Drop the newline and any trailing whitespace, then encode spaces
        line = line.rstrip().replace(' ', sp)
        print(line + nl, end='')
    print(eof)

if __name__ == '__main__':
    main()
============================================
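
To show what the output looks like, here is a minimal sketch of the same
transformation as a function that can be called on a string. The name
lineize and the uppercase constants are hypothetical, not part of the
script above, and splitlines() treats a few more characters as line
breaks than reading stdin does, so take it as an illustration only.

```python
# Same symbols as the script, renamed as constants for this sketch
NL = '@@nl$$'
SP = '@@sp$$'
BOF = '@@bof$$'
EOF = '@@eof$$'

def lineize(text):
    """Return text as one line (hypothetical helper mirroring the script)."""
    parts = [BOF]
    for line in text.splitlines():
        # Trailing whitespace is dropped; every remaining space is encoded
        parts.append(line.rstrip().replace(' ', SP) + NL)
    parts.append(EOF)
    return ''.join(parts)

# Two input lines; the first has trailing spaces, the second is indented
print(lineize("a b  \n  c\n"))
# prints @@bof$$a@@sp$$b@@nl$$@@sp$$@@sp$$c@@nl$$@@eof$$
```

The trailing spaces after "a b" are gone, while the two leading spaces
before "c" each became a symbol.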

In the preceding, notice that the symbols are ordinary variables set at
the top of the script, so they can easily be changed as desired.
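
As one possible way a Lexical Analysis step could consume the one-line
output, the sketch below splits it into symbols and the text between
them with a single regular expression. The tokenize function and the
SYMBOLS list are assumptions made for illustration, and the symbol list
is a parameter so changed symbols can be passed in.

```python
import re

# Default symbols, matching the ones in the preprocessor
SYMBOLS = ['@@bof$$', '@@eof$$', '@@nl$$', '@@sp$$']

def tokenize(one_line, symbols=SYMBOLS):
    """Split preprocessor output into symbols and text (hypothetical helper)."""
    # Escape each symbol ($ is special in a regex) and join as alternatives;
    # the capturing group makes re.split keep the symbols in the result
    pattern = '(' + '|'.join(re.escape(s) for s in symbols) + ')'
    # The preprocessor ends its output with a newline, so strip that first
    pieces = re.split(pattern, one_line.rstrip('\n'))
    # re.split yields empty strings between adjacent symbols; drop them
    return [p for p in pieces if p]

print(tokenize('@@bof$$x@@sp$$=@@sp$$1@@nl$$@@eof$$'))
# prints ['@@bof$$', 'x', '@@sp$$', '=', '@@sp$$', '1', '@@nl$$', '@@eof$$']
```

With different symbols, the same call would look like
tokenize(text, ['<B>', '<E>', '<N>', '<S>']).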

Hope you all like it.

SteveT

Steve Litt 

Autumn 2023 featured book: Rapid Learning for the 21st Century
http://www.troubleshooters.com/rl21


