Re: [RFC] Namespaces in OLANG

public inbox for ~johnnyrichard/olang-devel@lists.sr.ht

From: "Carlos Maniero" <carlos@maniero.me>
To: "Johnny Richard" <johnny@johnnyrichard.com>
Cc: <~johnnyrichard/olang-devel@lists.sr.ht>
Subject: Re: [RFC] Namespaces in OLANG
Date: Thu, 28 Mar 2024 10:41:20 -0300
Message-ID: <D05FCY3GDZLY.1S4HYGDXSMS7C@maniero.me>
In-Reply-To: <sfltybztpierqixv6gbsemr7wpl4ych4upkqjhd3cyqy6hhgce@fhcz6em7vu6n>

> Thank you very much for providing this insightful reading material.
>
> However, if you're confident in the direction proposed, I
> won't obstruct progress.

I believe it's crucial that we take the necessary time to thoroughly
define the right approach. No need to rush this process. I just wanna
make sure we can clearly outline the direction we want for olang from a
user-experience perspective.

By discussing these subjects, I believe we can clearly write down the
goal of the language, which is still subjective.

> > 2. Full Compatibility with C:
> > 
> >    It is completely compatible with C!
> > 
> >      int olang_core_math__add(int, int);
> > 
> >    If you think it is ugly to call a function that way directly, you can create
> >    macros in C to improve the readability, but completely optional.
> > 
> >      #define NSMATH(name) olang_core_math__##name
>
> I think this is too ugly and very hacky.  I would prefer to call
> olang_core_math_add instead.

You quoted the entire text (which I truncated), do you mean the macro is
hacky? Or everything? The macro is just a suggestion.

You mentioned that you would prefer to use *olang_core_math_add*. Did you
mean *olang_core_math__add*, or are you opposed to the use of double
underscores to separate the namespace from the identifier?
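
To make the question concrete, here is a minimal C sketch of both call
styles. It assumes olang emits the symbol *olang_core_math__add* as
proposed; the C definition below is only a stand-in so the example links
on its own, and *add_three* is a hypothetical caller.

```c
/* Stand-in for the symbol olang would emit for olang.core.math::add.
 * Defined in C here only so that the example links by itself. */
int olang_core_math__add(int a, int b)
{
    return a + b;
}

/* Optional convenience macro from the proposal. */
#define NSMATH(name) olang_core_math__##name

/* Hypothetical caller: NSMATH(add) expands to olang_core_math__add. */
int add_three(int a, int b, int c)
{
    return NSMATH(add)(NSMATH(add)(a, b), c);
}
```

Calling *olang_core_math__add(1, 2)* directly and calling
*NSMATH(add)(1, 2)* resolve to the same symbol; the macro only saves
typing.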

> > 3. Manual namespacing is inconvenient: 
> > 
> >    You don't need to manually namespace every function, at the cost of
> >    starting every single file with a *ns* statement.
>
> If we keep managing names manually, we already have the *1* and *2* for
> free.  So, the only benefit of namespacing would be to avoid the
> inconvenience of adding it manually.

That's partially correct. I mentioned points 1 and 2 because most modern
system languages, such as C++ and Rust, mangle names to avoid conflicts.
However, I made a mistake by not including this solution in the
Alternatives section.

> > An important observation of the *ns* usage is that it must match the directory
> > structure. The path of a file that declares the namespace *olang.core.math*
> > must end with *olang/core/math.ol*. This requirement is needed for future
> > import resolution.
> > 
> > Alternatives:
> > -------------
> > 
> > 1. Automatically create namespaces based on the filename:
>
> I know we haven't nicely written down the goal of the language, but I
> prefer being explicit and avoiding convention over configuration.

Agree! That's why namespaced files are great \o/

> > 2. Manual namespaces: ...
> > 
> > Conclusion
> > ----------
> > 
> > In my opinion, the introduction of a namespace statement offers numerous
> > benefits:
> > 
> > - It aids in resolving function name conflicts.
> > - It facilitates deterministic code generation while maintaining compatibility
> >   with C.
>
> The current suggestion doesn't fully solve compatibility with C.  We
> have to provide a way of calling a C function from olang code without
> namespacing (in case of namespace being mandatory).

Good catch! In my opinion, we should follow C's approach on this matter.

  extern fn pow(base: u32, power: u32)

In this case, the extern identifier matches exactly with the assembly
symbol. Please note that the extern statement is merely a semantic tool;
it does not generate any code.
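
For comparison, this is how the same idea looks in C itself. It is only
a sketch of the behavior I have in mind for olang's *extern*, not a
description of anything implemented; *distance* is a hypothetical
caller.

```c
/* Declaration only: the compiler emits no code for this line.
 * The name must match the linker symbol exactly; here it resolves
 * to abs from the C standard library. */
extern int abs(int);

/* Hypothetical caller that uses the external symbol by its exact name. */
int distance(int a, int b)
{
    return abs(a - b);
}
```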

Do you think that translating namespaces between C and olang is
necessary? In our arena implementation, all functions have the *arena_*
prefix. By using *extern* in the way I'm proposing, we will call these
functions in olang by their exact names. For instance, *arena_alloc*
will be invoked as *arena_alloc*, not just *alloc*. In my opinion, this
is acceptable since it is an external function.

> > - It simplifies the resolution of imports.
>
> I would suggest to not go much further with import resolution (unless
> you already want to define modules).  Perhaps we could have namespace
> doing nothing else than namespacing...
>

If by "modules" you are referring to the file level, and not to
something like packages or libraries, then that is exactly what I want
to define! Influenced by Clojure, I recommended calling it a
"namespace". However, I believe that naming it 'mod' or 'module' is more
suitable for its purpose.

  mod olang.core.math

  fn add(a: u32, b: u32) {
    return a + b
  }

> > These advantages come with the minor stipulation of initiating all files with a
> > namespace statement, which, in my view, is a small price to pay for the
> > benefits gained.
>
> I'm not keen on the idea of enforcing strict adherence to the folder
> structure.
>
> How about we introduce a namespace block instead? Within this
> block, everything would automatically have the namespace added as a
> prefix. This could offer more flexibility while still maintaining
> organization.

Don't you think that in practice almost every single file will be
namespaced? C++ follows this pattern: look at this Qt mirror [1], with
6k files, all namespaced. They even created a macro to facilitate the
work.

[1] https://github.com/search?q=repo%3Aradekp%2Fqt+%2FQT_BEGIN_NAMESPACE%5Cn%2F&type=code&p=1

> I think module has a different meaning.  If you want to have modules,
> for sure we have to discuss import resolution.  IMHO namespace shouldn't
> do anything else than namespacing.

I believe you're right. It's almost impossible to discuss modules
without bringing up imports. To me, the way C handles this is one of the
most painful things in my life (hehe).

The main issue is that I never know where something is coming from,
which is especially painful when I'm trying to replicate something I've
already done. This also often leads to unused includes over time
because it's hard to determine if an include is actually being used.

If we abandon modules and just go with C++-like namespaces, I believe
we may have to endure C's painful include system. This is because the
language won't have control over function names. The way the include
system is designed sends a message to developers that including a file
is akin to concatenating all the definitions into a single file. Still,
I think it would be OK to have named imports even if the language
doesn't control the names, but they would be just a semantic tool.

Named Imports
-------------

  mod myprog

  import olang.core.math

  fn main(): u32 {
    return olang.core.math::sum(1, 2)
  }

We could even associate an identifier with it.

  mod myprog

  import olang.core.math as ocm

  fn main(): u32 {
    return ocm::sum(1, 2)
  }

Note that there is no actual difference between mangling and my module
proposal, except that modules generate deterministic and friendly names
that can easily be used from C. Code generation is also easy: to
generate the assembly symbol of *olang.core.math::sum* we can just
replace dots with underscores and the double colon with double
underscores.
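
A minimal C sketch of that translation rule follows. The function name
*olang_symbol_name* and its signature are hypothetical, chosen only for
illustration; nothing like this exists in the compiler yet.

```c
#include <stddef.h>

/* Hypothetical helper: writes the assembly symbol for a qualified olang
 * name such as "olang.core.math::sum" into out.
 * Rule from the proposal: '.' becomes '_' and "::" becomes "__"
 * (each ':' maps to one '_').
 * Returns 0 on success, -1 if out is too small. */
int olang_symbol_name(const char *qualified, char *out, size_t out_size)
{
    size_t j = 0;

    if (out_size == 0)
        return -1;

    for (size_t i = 0; qualified[i] != '\0'; i++) {
        /* Keep one byte free for the terminating NUL. */
        if (j + 1 >= out_size)
            return -1;
        char c = qualified[i];
        out[j++] = (c == '.' || c == ':') ? '_' : c;
    }
    out[j] = '\0';
    return 0;
}
```

With a large enough buffer, "olang.core.math::sum" becomes
"olang_core_math__sum", which is the name a C caller would declare.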

External Linkage
----------------

We probably don't want to make every function global for external
linkage, so we may need a visibility keyword. To me, everything that
can be imported should also be available for external linkage, even if
we decide not to generate an object file per module. I would
recommend the usage of *export* or *pub*. I like *export* better.

Note that this is required no matter how we decide to handle imports.

No mangling
-----------

But what if I really need something to have the exact name? Let's say you
wanna integrate with a bootloader that is integrated in the link step and
expects a symbol called *kmain* to jump into?

Ok, I admit, in that case namespaces are gonna be a pain in the ass. But
the good news is that these cases are exceptional: you don't need this
for your entire application but only for a few functions.

I was wondering whether we could have a *global* keyword, where
everything that is global keeps its own name and always has public
visibility.

  mod myprog

  import olang.core.math as ocm

  global fn main(): u32 {
    return ocm::sum(1, 2)
  }

If we decide that we want both *mod* and *ns* in the language, we
could even have a global ns.

  mod myprog

  import olang.core.math as ocm

  ns global {
    fn main(): u32 {
      return ocm::sum(1, 2)
    }
  }

Summarizing
-----------

We have a few options on the table.

1. Use the names the way they are. (C approach)

Pros:
- Simple, no magic, it is what it is.
- Easy to produce debug info since the assembly symbol will be the
  function name.

Cons:
- More challenging to keep the code free of name conflicts in large
  codebases.
- Requires developers to manually namespace functions.

2. Use mangled names (C++, Rust approach)

Pros:
- Keep the code out of naming conflicts

Cons:
- Non-deterministic names
- Since the function name is usually non-deterministic, you are required
  to use a no-mangle statement to integrate with C.
- More debug info is required.

3. Use modules (Zig approach (I think))

Pros:
- Keep the code out of naming conflicts
- Deterministic assembly symbols permit integration with C without any
  magic; you just need to follow the convention ns__fn.

Cons:
- If you really need to have a specific name for your function you're
  gonna need a no-mangle approach.
- More debug info is required.
- It is not entirely free of name conflicts: you can force a conflict by
  creating a function whose name starts with double underscores, which
  is not recommended in C, since these names are reserved.
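
To illustrate the last con, here is a small C sketch. It assumes the
translation rule from this proposal (dots to underscores, double colon
to double underscores) and that olang would accept the hypothetical
names *a::__b* and *a__::b*; both helper functions are made up for this
example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper applying the proposed rule: '.' and ':' both
 * become '_', so "::" becomes "__". Returns 0 on success, -1 if out
 * is too small. */
static int symbol_name(const char *qualified, char *out, size_t out_size)
{
    size_t len = strlen(qualified);

    if (len + 1 > out_size)
        return -1;
    for (size_t i = 0; i < len; i++)
        out[i] = (qualified[i] == '.' || qualified[i] == ':')
                     ? '_' : qualified[i];
    out[len] = '\0';
    return 0;
}

/* Returns 1 if both qualified names translate to the same symbol,
 * 0 if they differ, -1 if a name does not fit the internal buffers. */
int symbols_collide(const char *first, const char *second)
{
    char a[128];
    char b[128];

    if (symbol_name(first, a, sizeof a) != 0 ||
        symbol_name(second, b, sizeof b) != 0)
        return -1;
    return strcmp(a, b) == 0;
}
```

Under these assumptions, *symbols_collide("a::__b", "a__::b")* reports a
collision, since both names translate to *a____b*.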

I haven't talked much about Zig, but here goes a fun fact: Zig uses
dots in its names, which solves the name conflict I described above,
since you cannot create an identifier that contains a dot. But it also
makes C integration non-viable without an ABI.

Thread overview: 7+ messages
2024-03-24  2:46 Carlos Maniero
2024-03-27 18:39 ` Johnny Richard
2024-03-28 13:41   ` Carlos Maniero [this message]
2024-04-06 16:51     ` Johnny Richard
2024-04-07 20:49       ` Carlos Maniero
2024-04-07 20:58         ` Carlos Maniero
2024-04-08  2:45           ` Carlos Maniero
