From mboxrd@z Thu Jan 1 00:00:00 1970
From: ricardo_kagawa@disroot.org
To: ~johnnyrichard/olang-devel@lists.sr.ht
Cc: carlos@maniero.me
Subject: Re: [RFC SPEC] Primitive data types and arrays
Date: Wed, 24 Apr 2024 13:23:32 -0300
Message-ID: <20240424162332.13360-1-ricardo_kagawa@disroot.org>

> About arrays, Johnny has suggested to talk about arrays in a different
> thread, I'm just waiting us to conclude this discussion and I'll start
> another thread to define Olang's array specification. But we brought
> excellent points, and maybe we should define pointers before arrays.

My point was not that you should include pointers in the language, but
that you should not match C arrays as you have specified. Of course, you
can include pointers if that was your plan all along, but I would rather
you did not.

> > Obviously, `boolean` can be either `true` or `false`, but what should
> > that mean? If `boolean` is mapped to `u8`, then zero and non-zero?
>
> IMO, true should be 1 and false 0 in a way that *1 == true* is true and
> *2 == true* is false.
> Control flow structures may accept anything not
> just booleans and may apply the non-zero approach you described, but we
> can discuss this on their own RFC (that does not exists yet).

I have my issues regarding that, but let's wait for that new thread.

> > But the real question is what would `char` be? If the language should
> > support Unicode properly, then `char` would represent a _code unit_
> > rather than a "character", which could be considered a misnomer. Since
> > Unicode uses variable-length characters, a Unicode character might be
> > difficult to represent as just `char`.
> >
> > If no Unicode support is planned, then `char` as `u8` is good enough to
> > represent characters in 7-bit ASCII encoding.
>
> I'll be honest with you, It makes a lot of sense all you said, making a
> char a u8 seems to enforce an Western-Eurocentrism in Olang. But I
> confess that I never stopped to learn more about unicode.
>
> At the same time I think we should support a 32-bit sized unicode char,
> I don't wanna make all chars an u32 keeping the support to ASCII encoding.

This is exactly how I feel, except I would stick to ASCII. Unicode
would be a lot more complex to deal with, and totally overkill if you
don't have plans to support non-ASCII characters as primitives. But if
you do have plans to support it, it might be better to at least avoid
making assumptions that could make it difficult to transition to it
later.

> IMO, we should either postpone specifying a char right now or assume
> that a char at this point represents an ASCII char and start a new RFC
> about unicode where we may define something like an unicode char.

My intent was actually to make you postpone the definition of the `char`
type until you have considered this carefully enough. You don't have to
decide that right now, but you also don't have to define the `char` type
right now either.

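To make the "code unit" point above a bit more concrete, here is a
minimal sketch in Python (not Olang). The `code_units` helper is
hypothetical, made up for this example only; it counts how many
fixed-size units a single character needs in each Unicode encoding form.

```python
# Hypothetical helper for illustration only, not part of any Olang proposal.
# Code unit sizes in bytes: UTF-8 uses 1, UTF-16 uses 2, UTF-32 uses 4.
UNIT_SIZE = {"utf-8": 1, "utf-16-le": 2, "utf-32-le": 4}


def code_units(text: str, encoding: str) -> int:
    """Return how many code units `text` occupies in `encoding`."""
    return len(text.encode(encoding)) // UNIT_SIZE[encoding]


# One ASCII letter, one Latin-1 letter, the euro sign, and an emoji
# outside the Basic Multilingual Plane.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(
        f"U+{ord(ch):04X}",
        "utf-8:", code_units(ch, "utf-8"),
        "utf-16:", code_units(ch, "utf-16-le"),
        "utf-32:", code_units(ch, "utf-32-le"),
    )
```

Only the 32-bit form needs exactly one unit for every code point here.
With an 8-bit or 16-bit `char`, some characters span several units, so
such a `char` would really be a code unit. ASCII text is unaffected: it
takes one unit per character in all three forms.
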
But if you do intend to support Unicode as `char`, then I would not make
it something separate from ASCII, as Unicode is a superset of ASCII. Not
a problem if you intend to support Unicode as a separate library (as in
C), but I feel it would be weird to have both ASCII and Unicode as
primitives if you already have ASCII included in Unicode.

> BTW, you seem well versed on the unicode theory, would you like to
> purpose a mechanism to deal with unicode?

I am not that well versed, I just have a user-level knowledge of
Unicode. What I would propose, however, is to look at languages that
natively support Unicode, like JS. More precisely, not just copy what
they do, but also look at what they did wrong and try to do better. In
C, `char` is assumed ASCII (it is not actually, but sort of can be) and
Unicode seems to be supported through a standard library (I have never
used Unicode in C, but I suspect it is related to "wide chars", at
least).

> > Also, there are three other types that might be interesting, if I may
> > suggest: `never` (from TypeScript [1]), `unit` (from functional-like
> > languages [2]) and `null` (from ECMAScript specs [3]).
>
> They seems to be very specific, we may wanna to wait until we find an
> use for them.

Yeah, I am not suggesting that you include these right now (or at all),
just that you take them into consideration. I don't know where you are
planning to go with your language's design, as details are still lacking
at this point.