From: ricardo_kagawa@disroot.org
To: ~johnnyrichard/olang-devel@lists.sr.ht
Cc: carlos@maniero.me
Subject: Re: [RFC SPEC] Primitive data types and arrays
Date: Wed, 24 Apr 2024 13:23:32 -0300
Message-ID: <20240424162332.13360-1-ricardo_kagawa@disroot.org>
In-Reply-To: <D0NKZ7DCTUI9.227U4AS27K31J@maniero.me>

> About arrays, Johnny has suggested talking about arrays in a different
> thread. I'm just waiting for us to conclude this discussion, and then
> I'll start another thread to define Olang's array specification. But we
> brought excellent points, and maybe we should define pointers before
> arrays.

My point was not that you should include pointers in the language, but
that you should not match C arrays as you have specified. Of course,
you can include pointers if that was your plan all along, but I would
rather you did not.

> > Obviously, `boolean` can be either `true` or `false`, but what should
> > that mean? If `boolean` is mapped to `u8`, then zero and non-zero?
> 
> IMO, true should be 1 and false 0 in a way that *1 == true* is true and
> *2 == true* is false. Control flow structures may accept anything, not
> just booleans, and may apply the non-zero approach you described, but we
> can discuss this in their own RFC (that does not exist yet).

I have my issues regarding that, but let's wait for that new thread.
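
For reference when that thread starts, the two readings of "true"
mentioned above can be sketched in C. This is only an illustration
using C's <stdbool.h>, not a proposal for Olang's semantics, and the
function names are made up for the example.

```c
#include <stdbool.h>

/* Strict reading: a value equals true only if it is exactly 1. */
bool equals_true(int value)
{
    return value == true; /* true expands to 1 in <stdbool.h> */
}

/* "Non-zero" reading: any value other than 0 counts as true,
 * which is what C control flow (if, while) does. */
bool is_truthy(int value)
{
    return value != 0;
}
```

Under the strict reading, 2 is not equal to true, while under the
non-zero reading it still takes the "true" branch of a condition.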

> > But the real question is what would `char` be? If the language should
> > support Unicode properly, then `char` would represent a _code unit_
> > rather than a "character", which could be considered a misnomer. Since
> > Unicode uses variable-length characters, a Unicode character might be
> > difficult to represent as just `char`.
> > 
> > If no Unicode support is planned, then `char` as `u8` is good enough to
> > represent characters in 7-bit ASCII encoding.
> 
> I'll be honest with you: everything you said makes a lot of sense, and
> making a char a u8 seems to enforce a Western Eurocentrism in Olang.
> But I confess that I never stopped to learn more about Unicode.
>
> While I think we should support a 32-bit Unicode char, I don't want to
> make every char a u32; I would rather keep support for ASCII encoding.

This is exactly how I feel. Unicode would be a lot more complex to deal
with than plain ASCII, and totally overkill if you don't have plans to
support non-ASCII characters as primitives. But if you do have plans to
support it, it might be better to at least avoid making assumptions that
could make it difficult to transition to it later.
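
To make the "code unit" point concrete, here is a minimal sketch in C
of encoding a single Unicode code point as UTF-8. It assumes `u8` maps
to `uint8_t`; the function name `utf8_encode` is hypothetical and not
part of any Olang proposal. One code point takes one to four code
units, so a single `u8` cannot hold every character.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8.
 * Returns the number of bytes written to out (1 to 4), or 0 if the
 * value is not encodable (a surrogate, or above U+10FFFF). */
size_t utf8_encode(uint32_t cp, uint8_t out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {                    /* ASCII range: one byte */
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp < 0x800) {                   /* two bytes */
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)   /* surrogates are not characters */
        return 0;
    if (cp < 0x10000) {                 /* three bytes */
        out[0] = (uint8_t)(0xE0 | (cp >> 12));
        out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (uint8_t)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {               /* four bytes */
        out[0] = (uint8_t)(0xF0 | (cp >> 18));
        out[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (uint8_t)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For example, U+0041 ("A") encodes to the single byte 0x41, while
U+20AC (the euro sign) needs the three bytes 0xE2 0x82 0xAC.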

> IMO, we should either postpone specifying a char right now or assume
> that a char at this point represents an ASCII char and start a new RFC
> about unicode where we may define something like an unicode char.

My intent was actually to make you postpone the definition of the `char`
type until you have considered this carefully enough. You don't have to
decide that right now, but you also don't have to define the `char` type
right now either.

But if you do intend to support Unicode as `char`, then I would not make
it something separate from ASCII, as Unicode is a superset of ASCII. Not
a problem if you intend to support Unicode as a separate library (as in
C), but I feel it would be weird to have both ASCII and Unicode as
primitives if you already have ASCII included in Unicode.
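
The superset relation can be seen in a small C sketch, assuming text is
stored as UTF-8 (one possible encoding, not something decided for
Olang). The function names are hypothetical. Every ASCII string is
already valid UTF-8, and for such strings the number of code points
equals the number of bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count code points in a NUL-terminated string, assuming it holds
 * valid UTF-8. Continuation bytes have the bit pattern 10xxxxxx and
 * are skipped; every other byte starts a new code point. */
size_t utf8_count_code_points(const char *s)
{
    assert(s != NULL);
    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}

/* Returns 1 if every code point takes exactly one byte, which is
 * the case only for pure ASCII text; returns 0 otherwise. */
int utf8_is_ascii_only(const char *s)
{
    return utf8_count_code_points(s) == strlen(s);
}
```

With this, "olang" is five bytes and five code points, while the
two-byte sequence 0xC3 0xA9 (U+00E9) counts as a single code point.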

> BTW, you seem well versed in Unicode theory, would you like to
> propose a mechanism to deal with Unicode?

I am not that well versed; I just have user-level knowledge of
Unicode. What I would propose, however, is to look at languages that
natively support Unicode, like JS. More precisely, do not just copy
what they do, but also look at what they did wrong and try to do better.

In C, `char` is often assumed to be ASCII (the standard does not
actually require that, but most implementations use an ASCII-compatible
character set) and Unicode seems to be supported through the standard
library (I have never used Unicode in C, but I suspect it is related to
"wide chars", at least).
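
As a hedged illustration of the code unit issue in C itself, the sketch
below uses the C11 types `char16_t` and `char32_t` from <uchar.h>. It
assumes a toolchain that provides that header and encodes `u"..."`
literals as UTF-16 and `U"..."` literals as UTF-32 (true for GCC and
Clang on common platforms). The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <uchar.h>

/* Number of 16-bit code units before the terminating zero. */
size_t utf16_code_units(const char16_t *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

/* Number of 32-bit code units before the terminating zero. */
size_t utf32_code_units(const char32_t *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}
```

A code point outside the Basic Multilingual Plane, such as U+1F600,
takes two UTF-16 code units (a surrogate pair) but one UTF-32 code
unit. This is the kind of detail worth checking in languages whose
strings are sequences of UTF-16 code units, as in JS.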

> > Also, there are three other types that might be interesting, if I may
> > suggest: `never` (from TypeScript [1]), `unit` (from functional-like
> > languages [2]) and `null` (from ECMAScript specs [3]).
> 
> They seem to be very specific; we may want to wait until we find a
> use for them.

Yeah, I am not suggesting that you include these right now (or at all),
just that you take them into consideration. I don't know where you are
planning to go with your language's design, as details are still
lacking at this point.
