* [PATCH olang v1 0/1] deref operation returning value
@ 2024-10-19 14:10 Carlos Maniero
2024-10-19 14:10 ` [PATCH olang v2 1/1] codegen: x64: deref returns pointer value Carlos Maniero
2024-10-19 14:14 ` [PATCH olang v1 0/1] deref operation returning value Carlos Maniero
0 siblings, 2 replies; 5+ messages in thread
From: Carlos Maniero @ 2024-10-19 14:10 UTC (permalink / raw)
To: ~johnnyrichard/olang-devel; +Cc: Carlos Maniero
We discussed on IRC the possibility of adding some extra information to the
deref operation to support this feature, but there are two reasons why I
opted to keep this in codegen:
1) Modifying the AST would require changing the unary data structure
from a struct to a union, but no other operator would benefit from
this, since only the deref needs extra information.
2) The assign operation needs to identify whether its LHS is a deref,
since the mov instruction differs from a simple ref: the simple ref
will mov the expression result to an RBP offset, while the deref will mov
the expression result to the pointer location.
Also, once we have the IR, this logic could easily fit into the layer that
will build the IR.
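To make reason 2 concrete, here is a minimal sketch of the two store
shapes. It is not the code in this patch: the helper names are
hypothetical, and it assumes the value to store is in %rax and, for the
deref case, that the target address was previously loaded into %rcx.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format the store for "a = expr", where the local
 * variable lives at a fixed offset below %rbp. */
static int
format_store_to_local(char *buf, size_t len, long rbp_offset)
{
    assert(buf != NULL && rbp_offset > 0);
    return snprintf(buf, len, "mov %%rax, -%ld(%%rbp)", rbp_offset);
}

/* Hypothetical helper: format the store for "*a = expr". The address held
 * by the pointer is assumed to be in %rcx already, so the value is
 * written to the memory %rcx points at, not to a stack slot. */
static int
format_store_through_pointer(char *buf, size_t len)
{
    assert(buf != NULL);
    return snprintf(buf, len, "mov %%rax, (%%rcx)");
}
```

The only difference is the destination operand, which is why the assign
codegen has to know whether its LHS is a plain ref or a deref.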
V2:
- Fix code style
- Allows multiple pointer dereference
Carlos Maniero (1):
codegen: x64: deref returns pointer value
src/codegen_x86_64.c | 88 +++++++++++++++++++++++++---
src/parser.c | 11 ++--
tests/olc/0038_pointers_deref.ol | 24 ++++++++
tests/olc/0039_pointer_of_pointer.ol | 55 +++++++++++++++++
4 files changed, 164 insertions(+), 14 deletions(-)
create mode 100644 tests/olc/0038_pointers_deref.ol
create mode 100644 tests/olc/0039_pointer_of_pointer.ol
base-commit: 2dbf9a9896e5778535bd3dc1d5069a762d3b94fa
--
2.46.1
* [PATCH olang v2 1/1] codegen: x64: deref returns pointer value
2024-10-19 14:10 [PATCH olang v1 0/1] deref operation returning value Carlos Maniero
@ 2024-10-19 14:10 ` Carlos Maniero
2024-10-19 14:11 ` [olang/patches/.build.yml] build success builds.sr.ht
2024-10-19 14:14 ` [PATCH olang v1 0/1] deref operation returning value Carlos Maniero
1 sibling, 1 reply; 5+ messages in thread
From: Carlos Maniero @ 2024-10-19 14:10 UTC (permalink / raw)
To: ~johnnyrichard/olang-devel; +Cc: Carlos Maniero
Deref is context dependent. When doing an assignment:
*a = 1
the deref codegen is expected to return the pointer location, so
that the assignment binop is able to store a new value at that
location.
On the other hand, when performing:
return *a
deref is expected to actually return the value stored at the pointer location.
At this point, we don't have an easy way to get the operation result
type, so there is some tricky logic to determine whether the unary expression
returns a pointer or a value, i.e.: when dereferencing "var b: u16**"
once (*b) the result size is 8 bytes as it returns another pointer, but
by dereferencing it twice (**b) we return 2 bytes, since at this
point we are returning a u16.
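The size rule above can be sketched in isolation. This is only an
illustration under stated assumptions: the toy_* types and functions are
hypothetical stand-ins, not olang's type_t or type_to_bytes, and
pointers are assumed to be 8 bytes wide as on x86-64.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TOY_TYPE_PRIMITIVE, TOY_TYPE_PTR } toy_type_kind_t;

/* Hypothetical, simplified type description. */
typedef struct toy_type
{
    toy_type_kind_t kind;
    size_t size;                    /* used by primitives only */
    const struct toy_type *pointee; /* used by pointers only */
} toy_type_t;

static size_t
toy_type_to_bytes(const toy_type_t *type)
{
    /* Assumption: every pointer is 8 bytes (x86-64). */
    return type->kind == TOY_TYPE_PTR ? 8 : type->size;
}

/* Size in bytes of the value produced by applying `derefs` dereference
 * operators to a variable of type `type`. Each deref peels one pointer
 * level off the type. */
static size_t
toy_deref_result_size(const toy_type_t *type, size_t derefs)
{
    assert(derefs > 0);

    while (derefs-- > 0) {
        assert(type->kind == TOY_TYPE_PTR && "cannot deref a non-pointer");
        type = type->pointee;
    }

    return toy_type_to_bytes(type);
}
```

With a u16** variable, one deref yields a u16* (8 bytes) and two derefs
yield a u16 (2 bytes), matching the example in the paragraph above.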
Signed-off-by: Carlos Maniero <carlos@maniero.me>
---
src/codegen_x86_64.c | 88 +++++++++++++++++++++++++---
src/parser.c | 11 ++--
tests/olc/0038_pointers_deref.ol | 24 ++++++++
tests/olc/0039_pointer_of_pointer.ol | 55 +++++++++++++++++
4 files changed, 164 insertions(+), 14 deletions(-)
create mode 100644 tests/olc/0038_pointers_deref.ol
create mode 100644 tests/olc/0039_pointer_of_pointer.ol
diff --git a/src/codegen_x86_64.c b/src/codegen_x86_64.c
index deb7e24..4213571 100644
--- a/src/codegen_x86_64.c
+++ b/src/codegen_x86_64.c
@@ -52,6 +52,8 @@ typedef enum x86_64_register_type
REG_R15
} x86_64_register_type_t;
+typedef size_t size_in_bytes_t;
+
/**
* Arch/ABI arg1 arg2 arg3 arg4 arg5 arg6 arg7 Notes
* ──────────────────────────────────────────────────────────────
@@ -76,6 +78,14 @@ codegen_x86_64_put_stack_offset(codegen_x86_64_t *codegen,
static size_t
codegen_x86_64_get_stack_offset(codegen_x86_64_t *codegen, symbol_t *symbol);
+static size_in_bytes_t
+codegen_x86_64_emit_unary_deref_address(codegen_x86_64_t *codegen,
+ ast_unary_op_t *unary_op);
+
+static size_in_bytes_t
+codegen_x86_64_emit_unary_deref_value(codegen_x86_64_t *codegen,
+ ast_unary_op_t *unary_op);
+
static size_t
type_to_bytes(type_t *type);
@@ -126,8 +136,6 @@ codegen_x86_64_get_next_label(codegen_x86_64_t *codegen)
return ++codegen->label_index;
}
-typedef size_t size_in_bytes_t;
-
static size_in_bytes_t
codegen_x86_64_emit_expression(codegen_x86_64_t *codegen, ast_node_t *expr_node)
{
@@ -619,7 +627,8 @@ codegen_x86_64_emit_expression(codegen_x86_64_t *codegen, ast_node_t *expr_node)
AST_UNARY_DEREFERENCE &&
"unsupported assignment lhs");
- codegen_x86_64_emit_expression(codegen, bin_op.lhs);
+ codegen_x86_64_emit_unary_deref_address(
+ codegen, &bin_op.lhs->as_unary_op);
fprintf(codegen->out, " push %%rax\n");
@@ -679,12 +688,8 @@ codegen_x86_64_emit_expression(codegen_x86_64_t *codegen, ast_node_t *expr_node)
return 8;
}
case AST_UNARY_DEREFERENCE: {
- // FIXME: support dereference of dereference (**)
- assert(unary_op.expr->kind == AST_NODE_REF &&
- "unsupported unary expression for dereference (*)");
-
- return codegen_x86_64_emit_expression(codegen,
- unary_op.expr);
+ return codegen_x86_64_emit_unary_deref_value(codegen,
+ &unary_op);
}
default: {
assert(0 && "unsupported unary operation");
@@ -829,6 +834,71 @@ codegen_x86_64_emit_if(codegen_x86_64_t *codegen, ast_if_stmt_t if_stmt)
fprintf(codegen->out, ".L%ld:\n", end_else_label);
}
+static size_in_bytes_t
+codegen_x86_64_emit_unary_deref_address(codegen_x86_64_t *codegen,
+ ast_unary_op_t *unary_op)
+{
+ assert(unary_op->kind == AST_UNARY_DEREFERENCE);
+
+ if (unary_op->expr->kind == AST_NODE_UNARY_OP) {
+ // dive into the AST until it finds a ref
+ size_in_bytes_t size = codegen_x86_64_emit_unary_deref_address(
+ codegen, &unary_op->expr->as_unary_op);
+
+ fprintf(codegen->out, " mov (%%rax), %%rax\n");
+
+ return size;
+ }
+
+ assert(unary_op->expr->kind == AST_NODE_REF);
+ return codegen_x86_64_emit_expression(codegen, unary_op->expr);
+}
+
+static size_in_bytes_t
+codegen_x86_64_emit_unary_deref_value(codegen_x86_64_t *codegen,
+ ast_unary_op_t *unary_op)
+{
+ codegen_x86_64_emit_unary_deref_address(codegen, unary_op);
+
+ ast_node_t *expr = unary_op->expr;
+
+ size_t deref_levels = 0;
+
+ while (expr->kind != AST_NODE_REF) {
+ assert(expr->kind == AST_NODE_UNARY_OP &&
+ expr->as_unary_op.kind == AST_UNARY_DEREFERENCE);
+
+ expr = expr->as_unary_op.expr;
+ deref_levels++;
+ }
+
+ ast_ref_t ref = expr->as_ref;
+
+ symbol_t *symbol = scope_lookup(ref.scope, ref.id);
+
+ type_t *t_result = symbol->type;
+
+ assert(t_result->kind == TYPE_PTR);
+
+ // FIXME: Identifies the operation result type based on how many times the
+ // deref operator was used. The semantics should provide the result
+ // type of an operation, so then the codegen does not need to care
+ // about it.
+ while (deref_levels-- > 0) {
+ t_result = t_result->as_ptr.type;
+
+ assert(t_result->kind == TYPE_PTR);
+ }
+
+ size_in_bytes_t deref_size = type_to_bytes(t_result->as_ptr.type);
+
+ fprintf(codegen->out,
+ " mov (%%rax), %s\n",
+ get_reg_for(REG_ACCUMULATOR, deref_size));
+
+ return deref_size;
+}
+
static size_t
type_to_bytes(type_t *type)
{
diff --git a/src/parser.c b/src/parser.c
index d26f266..386f60b 100644
--- a/src/parser.c
+++ b/src/parser.c
@@ -538,13 +538,12 @@ parser_parse_type(parser_t *parser)
return NULL;
}
- token_t ptr_token;
+ type_t *type = type_new_unknown(parser->arena, token.value);
+ token_t ptr_token;
lexer_peek_next(parser->lexer, &ptr_token);
- type_t *type = type_new_unknown(parser->arena, token.value);
-
- if (ptr_token.kind == TOKEN_STAR) {
+ while (ptr_token.kind == TOKEN_STAR) {
if (!skip_expected_token(parser, TOKEN_STAR)) {
return NULL;
}
@@ -553,7 +552,9 @@ parser_parse_type(parser_t *parser)
ptr_id.size =
ptr_token.value.chars - token.value.chars + ptr_token.value.size;
- return type_new_ptr(parser->arena, ptr_id, type);
+ type = type_new_ptr(parser->arena, ptr_id, type);
+
+ lexer_peek_next(parser->lexer, &ptr_token);
}
return type;
diff --git a/tests/olc/0038_pointers_deref.ol b/tests/olc/0038_pointers_deref.ol
new file mode 100644
index 0000000..cef91b2
--- /dev/null
+++ b/tests/olc/0038_pointers_deref.ol
@@ -0,0 +1,24 @@
+# Copyright (C) 2024 olang maintainers
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+fn main(): u32 {
+ var a: u64 = 0
+ var b: u64* = &a
+ return *b
+}
+
+# TEST test_compile(exit_code=0)
+
+# TEST test_run_binary(exit_code=0)
diff --git a/tests/olc/0039_pointer_of_pointer.ol b/tests/olc/0039_pointer_of_pointer.ol
new file mode 100644
index 0000000..e98178a
--- /dev/null
+++ b/tests/olc/0039_pointer_of_pointer.ol
@@ -0,0 +1,55 @@
+# Copyright (C) 2024 olang maintainers
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+fn main(): u32 {
+ var a: u32 = 1
+ var b: u32* = &a
+ var c: u32** = &b
+ var d: u32 = 41
+
+ **c = 42
+
+ if a != 42 {
+ return 1
+ }
+
+ if a != *b {
+ return 2
+ }
+
+ if *b != **c {
+ return 3
+ }
+
+ if b != &a {
+ return 4
+ }
+
+ if *c != &a {
+ return 5
+ }
+
+ *c = &d
+
+ if *b != 41 {
+ return 6
+ }
+
+ return 0
+}
+
+# TEST test_compile(exit_code=0)
+
+# TEST test_run_binary(exit_code=0)
--
2.46.1
* [olang/patches/.build.yml] build success
2024-10-19 14:10 ` [PATCH olang v2 1/1] codegen: x64: deref returns pointer value Carlos Maniero
@ 2024-10-19 14:11 ` builds.sr.ht
0 siblings, 0 replies; 5+ messages in thread
From: builds.sr.ht @ 2024-10-19 14:11 UTC (permalink / raw)
To: Carlos Maniero; +Cc: ~johnnyrichard/olang-devel
olang/patches/.build.yml: SUCCESS in 33s
[deref operation returning value][0] v2 from [Carlos Maniero][1]
[0]: https://lists.sr.ht/~johnnyrichard/olang-devel/patches/55544
[1]: mailto:carlos@maniero.me
✓ #1353678 SUCCESS olang/patches/.build.yml https://builds.sr.ht/~johnnyrichard/job/1353678
* Re: [PATCH olang v1 0/1] deref operation returning value
2024-10-19 14:10 [PATCH olang v1 0/1] deref operation returning value Carlos Maniero
2024-10-19 14:10 ` [PATCH olang v2 1/1] codegen: x64: deref returns pointer value Carlos Maniero
@ 2024-10-19 14:14 ` Carlos Maniero
1 sibling, 0 replies; 5+ messages in thread
From: Carlos Maniero @ 2024-10-19 14:14 UTC (permalink / raw)
To: Carlos Maniero, ~johnnyrichard/olang-devel
Oops! I messed up the patchset cover letter subject. This is meant to be
v2.
The patch per se is right.
Sorry :(
* [PATCH olang v1 0/1] deref operation returning value
@ 2024-10-17 2:48 Carlos Maniero
0 siblings, 0 replies; 5+ messages in thread
From: Carlos Maniero @ 2024-10-17 2:48 UTC (permalink / raw)
To: ~johnnyrichard/olang-devel; +Cc: Carlos Maniero
We discussed on IRC the possibility of adding some extra information to the
deref operation to support this feature, but there are two reasons why I
opted to keep this in codegen:
1) Modifying the AST would require changing the unary data structure
from a struct to a union, but no other operator would benefit from
this, since only the deref needs extra information.
2) The assign operation needs to identify whether its LHS is a deref,
since the mov instruction differs from a simple ref: the simple ref
will mov the expression result to an RBP offset, while the deref will mov
the expression result to the pointer location.
Also, once we have the IR, this logic could easily fit into the layer that
will build the IR.
Carlos Maniero (1):
codegen: x64: deref returns pointer value
src/codegen_x86_64.c | 61 +++++++++++++++++++++++++++++++++++++-------
1 file changed, 52 insertions(+), 9 deletions(-)
--
2.46.1
Code repositories for project(s) associated with this public inbox
https://git.johnnyrichard.com/olang.git