public inbox for ~johnnyrichard/olang-devel@lists.sr.ht
From: "Carlos Maniero" <carlos@maniero.me>
To: "Ricardo Kagawa" <ricardo.kagawa@gmail.com>,
	<~johnnyrichard/olang-devel@lists.sr.ht>
Subject: Re: [RFC PATCH olang v1] docs: create zero programming language specification
Date: Sun, 17 Mar 2024 12:41:52 -0300
Message-ID: <CZW518JU62AW.30AP7RM0T8DNZ@maniero.me>
In-Reply-To: <11b1f29a-7a4a-4b46-9376-98bd52c9edd4@gmail.com>

I'd like to begin by echoing Johnny's words of thanks. It's truly
fantastic to see more people getting involved in making olang a
remarkable language.

It's also quite refreshing to have someone on board who is well-versed
in the theory behind creating a programming language. I'm certain that
I, along with others, will learn a great deal from you.

I'll start with a question from your first reply:

> My limited understanding is that the semicolon would indeed be more
> convenient, as it would be a definitive end-of-statement symbol,
> requiring no lookahead to resolve as such. The LF token could be
> ambiguous on its own (between end-of-statement and white space), so
> some lookahead would be required to resolve it.

I had a hard time understanding why the "LF token could be ambiguous on
its own", but now I get it. Before, I was only thinking about the
function body, where, to me, a blank line is just an empty statement,
the same as a sequence of semicolons in C. But I wasn't considering that
there are many other places where the programmer could add a line
break, like in function declarations:

  fn
  main()
  :
  u32
  {
    return 0
  }

Even though the code above is ugly AF, it's still syntactically correct.
And that made me realize why it's more convenient to have an
end-of-statement token. I'm just putting this out here in case someone
else has the same question. Although, it seems like we are all on the same
page about wanting to make the language more user-friendly, even if it
means giving the parser a bit of a hard time.

About the "some lookahead would be required to resolve it": we
definitely added some of that to the parser, but on a closer look, we
could easily get rid of it by replacing the function:

  static void
  skip_line_feeds(lexer_t *lexer)
  {
      token_t token;
      lexer_peek_next(lexer, &token);

      while (token.kind == TOKEN_LF) {
          lexer_next_token(lexer, &token);
          lexer_peek_next(lexer, &token);
      }
  }

With:

  static void
  next_non_lf_token(lexer_t *lexer, token_t *token)
  {
      do {
          lexer_next_token(lexer, token);
      } while (token->kind == TOKEN_LF);
  }

I'm sure there may be some corner cases where it cannot be applied, but
I think it reduces backtracking in most cases.
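
To make the difference concrete, here is a minimal sketch that compares
both helpers on the same input. The lexer_t, token_t, token kinds and
the two parse_fn_header_* functions below are simplified stand-ins made
up for illustration (the lexer just walks a pre-tokenized array); they
are not olang's actual definitions. The "peeks" counter only exists to
show how often the lookahead is used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified token kinds; not olang's real enum. */
typedef enum { TOKEN_FN, TOKEN_IDENTIFIER, TOKEN_LF, TOKEN_EOF } token_kind_t;

typedef struct {
    token_kind_t kind;
} token_t;

/* Hypothetical lexer over a pre-tokenized array ending in TOKEN_EOF. */
typedef struct {
    const token_kind_t *kinds;
    size_t cursor;
    size_t peeks; /* number of lookahead calls, for comparison only */
} lexer_t;

/* Consume and return the current token; stays on TOKEN_EOF at the end. */
static void
lexer_next_token(lexer_t *lexer, token_t *token)
{
    assert(lexer != NULL && token != NULL);
    token->kind = lexer->kinds[lexer->cursor];
    if (token->kind != TOKEN_EOF) {
        lexer->cursor++;
    }
}

/* Return the current token without consuming it. */
static void
lexer_peek_next(lexer_t *lexer, token_t *token)
{
    assert(lexer != NULL && token != NULL);
    token->kind = lexer->kinds[lexer->cursor];
    lexer->peeks++;
}

/* Lookahead-based helper, as in the first snippet above. */
static void
skip_line_feeds(lexer_t *lexer)
{
    token_t token;
    lexer_peek_next(lexer, &token);

    while (token.kind == TOKEN_LF) {
        lexer_next_token(lexer, &token);
        lexer_peek_next(lexer, &token);
    }
}

/* Consuming helper, as in the second snippet above. */
static void
next_non_lf_token(lexer_t *lexer, token_t *token)
{
    do {
        lexer_next_token(lexer, token);
    } while (token->kind == TOKEN_LF);
}

/* Accepts `fn identifier` with any number of LFs around, using peeks. */
int
parse_fn_header_peeking(lexer_t *lexer)
{
    token_t token;

    skip_line_feeds(lexer);
    lexer_next_token(lexer, &token);
    if (token.kind != TOKEN_FN) {
        return 0;
    }

    skip_line_feeds(lexer);
    lexer_next_token(lexer, &token);
    return token.kind == TOKEN_IDENTIFIER;
}

/* Same grammar, but without any lookahead. */
int
parse_fn_header_consuming(lexer_t *lexer)
{
    token_t token;

    next_non_lf_token(lexer, &token);
    if (token.kind != TOKEN_FN) {
        return 0;
    }

    next_non_lf_token(lexer, &token);
    return token.kind == TOKEN_IDENTIFIER;
}
```

Both functions should accept the same input and leave the cursor in the
same place; only the first one needs to peek. One corner case I can
think of: wherever an LF actually terminates a statement, the parser
still needs to see that LF, so skipping it unconditionally would
probably not work there.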


> - Function body now accepts a single expression.
> ...
> - `\v` (vertical tab) and `\f` (form feed) included as line breaks for
>    completeness over ASCII (based on `\s` regex class, which agrees with
>    Unicode properties over the ASCII range).
> - Integer literals can now include underlines as separators.
> ...
> - Introducing hexadecimal integer literals.
> ...

From your last email, I'm totally up for discussing these topics. But I
reckon we might want to split them into new threads to avoid trying to
hash out the entire language in one go.

What do you all think about us trying to:

1. Nail down the current state of the language, leaving new features for
   later.
2. Figure out how we're gonna document new features.
3. Kick off a new thread for each feature we're thinking of adding.

Sound good?
