From: Ricardo Kagawa <ricardo.kagawa@gmail.com>
To: ~johnnyrichard/olang-devel@lists.sr.ht
Cc: "builds.sr.ht" <builds@sr.ht>
Subject: Re: [olang/patches/.build.yml] build success
Date: Thu, 14 Mar 2024 01:29:09 -0300
Message-ID: <88cb1a82-809e-4db5-95cd-2bbe828d0166@gmail.com>
In-Reply-To: <CZOQWPXD2V2J.1V3XBHT3ZQUED@fra01>
>> This grammar adds the token SEMICOLON (';') for every statement. I know we
>> agreed make it optional, but the SEMICOLON makes the parser much more
>> convenient to implement.
>>
>> And this is the first topic I would like to discuss. Let me know if you
>> agree otherwise I can adapt the grammar to make SEMICOLON optional.
>
> (...) Therefore, I'm curious about your statement that using a
> semicolon makes the parser much more convenient to implement. Could you
> elaborate on this? Have you encountered any new considerations that might
> complicate the implementation?
My limited understanding is that the semicolon would indeed be more
convenient, as it would be a definitive end-of-statement symbol,
requiring no lookahead to resolve as such. The LF token could be
ambiguous on its own (between end-of-statement and white space), so
some lookahead would be required to resolve it.
But it should be alright, as long as the language remains context-free,
even if it becomes ambiguous, non-deterministic, or requires a long
lookahead. Ideally it should be deterministic for linear time
performance, but it seems there are parsers that can run close to linear
time in the average case, as long as the language remains close to
deterministic.
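To make the lookahead point concrete, here is a minimal sketch (not the
olang lexer) of deciding whether a token ends a statement. The token
list, the `endsStatement` name and the set of tokens that start a new
statement are assumptions made up for illustration; the `+` operator in
the usage is hypothetical too, since the grammar under discussion has no
binary operators yet.

```javascript
// Hypothetical: tokens that may begin a new statement or close a block.
const STARTS_NEW_STATEMENT = new Set(["return", "}"]);

// Decide whether tokens[i] terminates the current statement.
function endsStatement(tokens, i) {
  const token = tokens[i];
  // A semicolon is decided on the spot, with no lookahead.
  if (token === ";") return true;
  if (token !== "\n") return false;
  // A line break is ambiguous: skip any further line breaks and peek at
  // the next significant token before deciding.
  let j = i + 1;
  while (j < tokens.length && tokens[j] === "\n") j += 1;
  return j === tokens.length || STARTS_NEW_STATEMENT.has(tokens[j]);
}

const continued = ["return", "1", "\n", "+", "2"];
const separate = ["return", "1", "\n", "return", "2"];
const terminated = ["return", "1", ";"];

console.log(endsStatement(continued, 2));  // false: the line break is white space
console.log(endsStatement(separate, 2));   // true: the line break ends the statement
console.log(endsStatement(terminated, 2)); // true: no lookahead was needed
```

The semicolon branch returns immediately, while the line-break branch
has to scan ahead, which is the extra work described above.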
And I don't have a strong opinion on the semicolon issue, except that
it must be an option. But whatever we do, we must avoid the following
pitfall from JavaScript:
```javascript
example
;(x)
```
The semicolon is mandatory here, because otherwise `(x)` is handled as
an argument list, and `example` would be called as a function. That is,
it would be a multi-line statement, instead of two separate statements.
And why would anyone do this?
```javascript
const x = y.example
;(() => {
  console.log(x)
})()
```
Immediately invoked function expressions are a thing in JavaScript, and
it would not be uncommon to have some expression ending with an
identifier right before them.
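The pitfall can be reproduced directly. The sketch below runs both
spellings through `new Function` and records whether `example` ends up
being called; the `run` helper and the stub `example` are made up for
this illustration.

```javascript
// Run a snippet with a stub `example` and report the arguments it was
// called with (an empty list means it was never called).
function run(source) {
  const calls = [];
  const example = (arg) => { calls.push(arg); };
  new Function("example", "x", source)(example, 42);
  return calls;
}

// No semicolon: the line break is ignored and this parses as example(x).
const withoutSemicolon = run("example\n(x)");
// Leading semicolon: two separate expression statements, no call.
const withSemicolon = run("example\n;(x)");

console.log(withoutSemicolon.length); // 1
console.log(withSemicolon.length);    // 0
```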
>> The grammar was made by using a EBNF evaluator tool[1].
>>
>> [1]: https://mdkrajnak.github.io/ebnftest/
>
> I would add this link at the markdown, so then people can play with it.
I would make an even stronger argument for including the link in the
docs. A good language specification also states which grammar notation
is used to write the specification itself. And EBNF in particular is
not consistently standardized in practice (several incompatible variants
are in use), so you really need to specify which EBNF variant you are
using.
The link should thus be good enough to refer to the EBNF implementation
used in this specification, although a permanent (version locked) link
would be better.
----
As for my revision of the grammar:
- Separated rules into sections.
- Added optional white space around the program.
- You don't actually need non-terminal symbols for keywords. Especially
if you are including the keyword in the symbol name.
- You don't need non-terminal symbols for symbols either, unless you
have a more "semantic" name for it. There should not be another
"semicolon" besides `;`, for example.
- In Johnny's version the function name is a single identifier. I don't
know why Carlos's version made it multiple. I have made it single
again.
- In Johnny's version the space before the return type is optional. I
don't know why Carlos's version made it mandatory. I have made it
optional again.
- Replaced `<identifier>` in `<function-definition>` with
`<function-name>` to express that this identifier is the name of the
declared function. Then, `<function-name>` is just `<identifier>`.
- Renamed `<fn-args>` to `<function-parameters>`, since parameters are
the variables in a function declaration, while arguments are the
values bound to those variables during function calls.
- Replaced `<type>` with `<return-type>` in `<function-definition>` to
express that this type identifier is the return type of the function.
Then, `<return-type>` is just `<type>`.
- Replaced `<block>` in `<function-definition>` with `<function-body>` to
express that this block is the body of the declared function.
- Reworked `<block>`, `<statement>` and `<end-of-statement>` to allow
for:
  - A single statement followed by an optional end-of-statement;
  - A statement list with a mandatory end-of-statement between statements;
  - The statements themselves could also be made optional, but I did not
    do that in this version, as there is currently no `void` return type.
- Replaced `<number>` in `<return-statement>` with `<expression>` to
prepare for other kinds of expressions in the future. The only allowed
expression is still an integer literal, though.
- Renamed `<number>` to `<integer>`, and reworked it to actually
represent decimal integer literals. Leading zeros are now forbidden,
but a lone zero digit is still allowed.
- Reworked `<identifier>` to better express that it starts with
`<alpha>` or underline, followed by zero or more `<alpha>`, `<digit>`
or underline.
- Removed `_` from `<alpha>` to better reflect the name (as underline is
not an alphabetic character).
- Renamed `<space>` to `<ws>` to avoid ambiguity with the character
U+0020 Space, and made it a one-or-more list. Also introduced `<ows>`
for "optional white space". Shorter names were preferred here because
these symbols in particular are used very frequently.
- Also introduced `<line-break>` as either LF, CR or CRLF. Otherwise the
CRLF sequence would be parsed as two separate line breaks. Not that it
would matter that much, except maybe for mapping line numbers.
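The line-number remark in the last item can be checked with a small
sketch. It uses JavaScript regular expressions as a stand-in for the
lexer, which is an assumption made only for illustration, and
`countLineBreaks` is a hypothetical helper. Regex alternation is tried
left to right, so `\r\n` has to be listed first here; that says nothing
about how the EBNF tool resolves the same choice.

```javascript
// Count how many times the given line-break pattern matches in the text.
function countLineBreaks(text, lineBreak) {
  return (text.match(lineBreak) ?? []).length;
}

// Three lines, each terminated by CRLF.
const source = "fn main(): u32 {\r\n  return 0\r\n}\r\n";

// CR and LF treated as independent line breaks.
const perCharacter = countLineBreaks(source, /\r|\n/g);
// CRLF treated as one line break, as in <line-break>.
const perSequence = countLineBreaks(source, /\r\n|\n|\r/g);

console.log(perCharacter); // 6
console.log(perSequence);  // 3
```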
```
(* Entry Point *)
<program> ::= <ows> <function-definition> <ows>
(* Functions *)
<function-definition> ::= 'fn' <ws> <function-name> <ows>
    <function-parameters> <ows> ':' <ows> <return-type> <ows> <function-body>
<function-name> ::= <identifier>
<function-parameters> ::= '(' <ows> ')'
<return-type> ::= <type>
<function-body> ::= <block>
(* Statements *)
<block> ::= '{' <ows> <statement> <ows>
    (<end-of-statement> <ows> <statement> <ows>)* <end-of-statement>? <ows> '}'
<end-of-statement> ::= ';' | <line-break>
<statement> ::= <return-statement>
<return-statement> ::= 'return' <ws> <expression>
(* Expressions *)
<expression> ::= <integer>
(* Identifiers *)
<type> ::= 'u32'
<identifier> ::= (<alpha> | '_') (<alpha> | <digit> | '_')*
(* Literals *)
<integer> ::= <integer-base10>
<integer-base10> ::= #'[1-9]' <digit>* | '0'
(* Utilities *)
<ws> ::= <white-space>+
<ows> ::= <white-space>*
<white-space> ::= <linear-space> | <line-break>
<line-break> ::= '\n' | '\r' | '\r\n'
<linear-space> ::= #'[ \t]'
<alpha> ::= #'[a-zA-Z]'
<digit> ::= #'[0-9]'
```
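Since this version of the grammar has no recursive rules yet, it can be
transcribed rule by rule into a single regular expression for quick
experiments. This is only a sketch for trying inputs, not a proposed
parser: the constant names and the `isProgram` helper are made up here,
and the backtracking regex engine stands in for the ambiguity between
white space and `<end-of-statement>`.

```javascript
// Utilities
const lineBreak = String.raw`(?:\r\n|\n|\r)`;
const whiteSpace = String.raw`(?:[ \t]|${lineBreak})`;
const ws = `${whiteSpace}+`;
const ows = `${whiteSpace}*`;

// Identifiers and literals
const identifier = "[a-zA-Z_][a-zA-Z0-9_]*";
const integer = "(?:[1-9][0-9]*|0)";

// Statements
const endOfStatement = `(?:;|${lineBreak})`;
const statement = `return${ws}${integer}`;
const block = String.raw`\{${ows}${statement}${ows}` +
  String.raw`(?:${endOfStatement}${ows}${statement}${ows})*${endOfStatement}?${ows}\}`;

// Functions and entry point
const functionDefinition =
  String.raw`fn${ws}${identifier}${ows}\(${ows}\)${ows}:${ows}u32${ows}${block}`;
const program = new RegExp(`^${ows}${functionDefinition}${ows}$`);

// Hypothetical helper: does the text match <program>?
function isProgram(text) {
  return program.test(text);
}

console.log(isProgram("fn main(): u32 {\n  return 0\n}\n"));        // true
console.log(isProgram("fn main(): u32 { return 0; return 1 }"));    // true
console.log(isProgram("fn main(): u32 { return 0 return 1 }"));     // false
console.log(isProgram("fn main(): u32 { return 007 }"));            // false
console.log(isProgram("fn main(): u32 {}"));                        // false
```

The third input fails because a plain space is not an
`<end-of-statement>`, the fourth because of the leading zeros, and the
last because at least one statement is mandatory.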
Further discussion:
- Is the language going to support Unicode? If so, `<alpha>` could use
the _L:Letter_ Unicode category instead of being limited to
`[a-zA-Z]`. But the EBNF tool does not support Unicode categories in
its regular expressions (it does not support flags). Also don't
forget to rename it to `<letter>` in that case.
- It would help developers in non-English speaking countries, but it
could be difficult to work with multi-byte characters and Unicode
normalization.
- There are more linear space and line break characters than the ones
included here, even within ASCII, although they are not all that
important. Even more in Unicode (some under _Cc:Other/control_,
others under _Z:Separator_). Should we support them?
- The function definition could accept a single expression as an
alternative to its `<block>`, similar to Kotlin.
- The integer literal could include optional underline separators for
readability. Just need to be careful not to start with underline, to
avoid ambiguity with identifiers.
- I guess we don't have to support the full set of Unicode digits, since
we don't know if these digits would even be decimal in the first
place. The numbering system could be very different from our own, so
it is likely not feasible to support them.
- I have not checked whether this syntax avoids the JavaScript edge case
I mentioned in the beginning. I might check that next time (I'm still
not sure how).
- It might seem strange that I included semantic non-terminals here,
despite having removed non-terminals for symbols and keywords. I can't
say for sure, since this is my first time trying this style, but I
suspect that besides making the language specification easier to
understand, the important bits to hook into in the parser will be
around these symbols. That is, it could simplify some work on the
parser.
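Two of the points above can be tried out in JavaScript, whose regular
expressions do support Unicode categories through the `u` flag. The
patterns below are assumptions for illustration, not proposed grammar
rules: `unicodeIdentifier` swaps `<alpha>` for the Letter category, and
`integerWithSeparators` is one possible reading of the separator idea,
allowing a single underline only between digits.

```javascript
// <identifier> with <alpha> replaced by the Unicode Letter category.
const unicodeIdentifier = /^[\p{L}_][\p{L}0-9_]*$/u;

// The same word, spelled with a precomposed letter and with a base
// letter followed by a combining accent.
const composed = "caf\u00e9";
const decomposed = "cafe\u0301";

console.log(unicodeIdentifier.test(composed));   // true
console.log(unicodeIdentifier.test(decomposed)); // false: U+0301 is a mark, not a letter
console.log(composed === decomposed);            // false
console.log(composed.normalize("NFC") === decomposed.normalize("NFC")); // true

// Decimal integer with optional single underline separators between digits.
const integerWithSeparators = /^(?:0|[1-9](?:_?[0-9])*)$/;

console.log(integerWithSeparators.test("1_000_000")); // true
console.log(integerWithSeparators.test("_1000"));     // false: would read as an identifier
```

The first half shows why normalization matters: two spellings that look
identical compare as different strings, and only one of them passes the
Letter-based rule unless the input is normalized first.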