From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp0.migadu.com ([2001:41d0:403:58f0::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms1.migadu.com with LMTPS id YMrTONQ/Gmd7PgEAqHPOHw:P1 (envelope-from ) for ; Thu, 24 Oct 2024 14:38:45 +0200 Received: from aspmx1.migadu.com ([2001:41d0:403:58f0::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp0.migadu.com with LMTPS id YMrTONQ/Gmd7PgEAqHPOHw (envelope-from ) for ; Thu, 24 Oct 2024 14:38:45 +0200 X-Envelope-To: patches@johnnyrichard.com Authentication-Results: aspmx1.migadu.com; dkim=pass header.d=lists.sr.ht header.s=20240113 header.b=Gb5ztMjt; dkim=pass header.d=maniero.me header.s=hostingermail1 header.b=UVGg3DHD; spf=pass (aspmx1.migadu.com: domain of lists@sr.ht designates 46.23.81.152 as permitted sender) smtp.mailfrom=lists@sr.ht; dmarc=pass (policy=none) header.from=maniero.me; arc=pass ("mailchannels.net:s=arc-2022:i=1") ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=johnnyrichard.com; s=key1; t=1729773524; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:list-id: list-unsubscribe:list-subscribe:list-post:dkim-signature; bh=1CsXNbhBuf5xmNpk36pZ9TxkGXpwFlcvhU4cAieWuds=; b=oKVExJx5QRUqzZUkzeTFwMDqng7fqgbJJQbvMpsa/hqKzOKqOwHyAPaSBa+szme35BOFok A+pr8caV8GT9IjU7qWVq8mgGvg+8C3sptuUh3vMTOXH/YgvjnNSFw7asB4ekBdEng8Qb8L /qOgo/BYegd0AL/8QbrhBTIZq3SPZanJ9bEFjzAnRU1hQbAQaI6dN3lw907RBaLwpMZve9 02EIZqEUhB/OuyBXlxBiE+MMWIObVscrOCen+yglB+N9IL2xhG+WythkEb+xubzctp16Q5 kHDbAlc5BGKPgLt70+OPnxrmcraCvNDKdUQDWpsjUy7UHn1bqIaFAEMQeY1vIg== ARC-Authentication-Results: i=2; aspmx1.migadu.com; dkim=pass header.d=lists.sr.ht header.s=20240113 header.b=Gb5ztMjt; dkim=pass header.d=maniero.me header.s=hostingermail1 header.b=UVGg3DHD; spf=pass (aspmx1.migadu.com: domain of lists@sr.ht 
designates 46.23.81.152 as permitted sender) smtp.mailfrom=lists@sr.ht; dmarc=pass (policy=none) header.from=maniero.me; arc=pass ("mailchannels.net:s=arc-2022:i=1") ARC-Seal: i=2; s=key1; d=johnnyrichard.com; t=1729773524; a=rsa-sha256; cv=pass; b=O9PkSUPm+Hu8TEPk9gJPAb9EeNdWH62jJO+wbpZBMK2XUV7Cpggu8Hz260UpzrWcnvSPL/ I21luk1PzhToN+JE2NYmFKVmO7lVccS8++CVSh+qBS68XXT9j3Zwk+224MUhl/biL1m2HT i2ii2Yec4gRsII4oVB8JzrV+9ZcL8/a/H5Jg6i6456GfUX+R+kDIINlcbDld5XI0abkt6/ ffTegEay719phP9LfQNcvL9VdEke8nAlp0fcht6wun7dOR0LrRBF5mxq63dDuY/PrVr2M/ +v1HYpWyAb88Sz5eAJmU9K6uqIKNv3+hzZIeqeC7fO6KpyBegcrioxrwJxMmyw== Received: from mail-a.sr.ht (mail-a.sr.ht [46.23.81.152]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id A583478F15 for ; Thu, 24 Oct 2024 14:38:44 +0200 (CEST) DKIM-Signature: a=rsa-sha256; bh=ki8P5SUO9/MRw6dXpPoIXpZqiFhBVlRTyBjogCMrx5s=; c=simple/simple; d=lists.sr.ht; h=From:To:Cc:Subject:In-Reply-To:References:Date:List-Unsubscribe:List-Subscribe:List-Archive:List-Post:List-ID; q=dns/txt; s=20240113; t=1729773524; v=1; b=Gb5ztMjtH6TXXWIbZky8JO5vOfCod6R4C/8idu/g/NqOGbkt7hX+nfl1TH0YmA+/1sEKtvgs QVpN229+jLfr0BOfsLNR1tLeHGi2nx2QvzQkT18EYlOKkAp7V7iNEgVwYC0+XE/RNXsT1wdwPcF nVJO1+/BvTzZQlMxuX7JqPNlbFjHX8XD3I2T9TqyaNJJAc/RQSDGyST2Emfw6OqlNhM7/coiOt8 1yyP/B7m++qFOardJkyOf1I6k2/++cueousjPf/E5c+KqZ9IzjJhxZyN9sLFRZBNKD8V5fxqHtk M+UNT4MEwEc1dhn3h+zTEb5xpkceRaPAB1g8Ui5WdjnBg== Received: from lists.sr.ht (unknown [46.23.81.154]) by mail-a.sr.ht (Postfix) with ESMTPSA id 5B18220357 for ; Thu, 24 Oct 2024 12:38:44 +0000 (UTC) Received: from dwarf.ash.relay.mailchannels.net (dwarf.ash.relay.mailchannels.net [23.83.222.53]) by mail-a.sr.ht (Postfix) with ESMTPS id 1BD4D20347 for <~johnnyrichard/olang-devel@lists.sr.ht>; Thu, 24 Oct 2024 12:38:42 +0000 (UTC) X-Sender-Id: hostingeremail|x-authuser|carlos@maniero.me 
Received: from relay.mailchannels.net (localhost [127.0.0.1]) by relay.mailchannels.net (Postfix) with ESMTP id 6D91C323822 for <~johnnyrichard/olang-devel@lists.sr.ht>; Thu, 24 Oct 2024 12:38:41 +0000 (UTC) Received: from uk-fast-smtpout10.hostinger.io (100-99-180-172.trex-nlb.outbound.svc.cluster.local [100.99.180.172]) (Authenticated sender: hostingeremail) by relay.mailchannels.net (Postfix) with ESMTPA id B017E325DBB for <~johnnyrichard/olang-devel@lists.sr.ht>; Thu, 24 Oct 2024 12:38:40 +0000 (UTC) ARC-Seal: i=1; s=arc-2022; d=mailchannels.net; t=1729773521; a=rsa-sha256; cv=none; b=YDkHa8R62NGcigBJC8RMDHMLye18bXCUKVQ+2Td0pYedrXqtEWzemUJINirqDc1I1vp1E2 m0arV/ioh6neWN4FQfO73EGnnlq71aFXUsLglLPMvg59piJmk1/KMuG7iumjNJHvcrh3CC 68fXRRqfrWlFiqx3aNlg9EO03Bd8ACL7zoOJ/niDKvH/ImDZ3gIiMwIOFEiR5RfP4JWkTH B5rv+wh8bEemLbxyQD44okG+DwsskvX3eB0hq5XpvP8DlprHS7YLYl/3wh2W+TglDHMIz3 soF+4N42zD/MvxVlI58q6zzdWVOuamJgnOSNXIe6E2bVhhNxd46Y+AWdB22RVw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=mailchannels.net; s=arc-2022; t=1729773521; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=1CsXNbhBuf5xmNpk36pZ9TxkGXpwFlcvhU4cAieWuds=; b=vXJf7vgam0P0EnxQ92VDwLFjT71uUrVNnaGwpVmvp11tNu5xE6adZ9I1CGlft7/qyQ/Ay1 A5PqBImLP+5d0Isqe89dUQkN7JGBXNPByUdd6za4uoTo6G+kXYoCwVIMkNoJ61gMbAwdL0 YURWA2tKKRq9VSIn9dJj9vsEvjT93J2AS9qpn8kZPzivN8eGWaFc2b/uO7gNeHjTBVfCjN xqi04HBdM1qAJtHmi7YYmIxSDDIw8MutvHuzbE7EFFV3q5mIBpp6CoTw0nK6gUio1PMCVj cXYpcbRDcu7gtmbtRK4rqEJ1Y5FuS0d8vOf/b9Kg1WW4X/EWoFd/iNmHUIuJMQ== ARC-Authentication-Results: i=1; rspamd-7767f6b98-lcqmv; auth=pass smtp.auth=hostingeremail smtp.mailfrom=carlos@maniero.me X-Sender-Id: hostingeremail|x-authuser|carlos@maniero.me X-MC-Relay: Neutral X-MailChannels-SenderId: hostingeremail|x-authuser|carlos@maniero.me X-MailChannels-Auth-Id: hostingeremail 
X-Eyes-Blushing: 144bf48a5a4f4b6e_1729773521383_874937555 X-MC-Loop-Signature: 1729773521383:2135337360 X-MC-Ingress-Time: 1729773521383 Received: from uk-fast-smtpout10.hostinger.io (uk-fast-smtpout10.hostinger.io [145.14.155.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384) by 100.99.180.172 (trex/7.0.2); Thu, 24 Oct 2024 12:38:41 +0000 From: Carlos Maniero DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=maniero.me; s=hostingermail1; t=1729773518; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=1CsXNbhBuf5xmNpk36pZ9TxkGXpwFlcvhU4cAieWuds=; b=UVGg3DHDUTFPZ46QybrpIH8xJN50pamx4Hq9UqlPrL+zx+SpXYU7U16vkErXuIAMFFo4SR fQwjuGO11DL2xxHxaFJfWuBxLfGLSq5rGMpmO3QZ2rUZ4CDHceX0DH5Lwt3QEFQT2Wp2Us MKvMn69NAFt5kKUNIE6Nayk9oBMbk6izjGFksZMoh3dhBT1cdie4X6Ubi0qWNHqMQg4Jns EmjnDiKCqDLcfDyCKlX/FwoTjF4hhDfKZ/N2WZh7Wwr19z0b/Amdx959KvrLXYFEYtevW6 Z+mWUujB4zNFUvmI7rxrm7EsRXGYWiNHFb5XTFX06drVDbDDj3DdepXdbdTwOw== To: ~johnnyrichard/olang-devel@lists.sr.ht Cc: Carlos Maniero Subject: [PATCH olang v1 2/6] semantics: resolve variable symbols Message-ID: <20241024123825.120390-3-carlos@maniero.me> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20241024123825.120390-1-carlos@maniero.me> References: <20241024123825.120390-1-carlos@maniero.me> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Date: Thu, 24 Oct 2024 12:38:38 +0000 (UTC) X-CM-Analysis: v=2.4 cv=TcnEtgQh c=1 sm=1 tr=0 ts=671a3fce a=WwxFCuf3mf1fs3oSi6/dng==:117 a=WwxFCuf3mf1fs3oSi6/dng==:17 a=MKtGQD3n3ToA:10 a=1oJP67jkp3AA:10 a=LqnkzvhNAY-f5Nek8_gA:9 a=BXDaF_L80NY05PYiAFlV:22 X-CM-Envelope: MS4xfDkBPDXV91C7k5M3RYcbuwlrEMlWwpxmnQlMzWEGc0sHAwlbmnsD6z2fdNdtMoLC/+/MwBKR+WcfsJgt832/Tf62KpoKloe5YcZ2f2PbqKgly2BIY5Zz Y6GoHUy8W00OVmk1smBLmvPF0Cg9irCiL/L6PGVN2iqTx/RmfBiI02tWW8hvL3IDE+8xAoQvqS8oytj0MmQSKDJzU7a7qQqyD7Y/Zwhk+qFz76fOi1GG925h X-AuthUser: carlos@maniero.me 
X-Sourcehut-Patchset-Status: UNKNOWN List-Unsubscribe: List-Subscribe: List-Archive: Archived-At: List-Post: List-ID: ~johnnyrichard/olang-devel <~johnnyrichard/olang-devel.lists.sr.ht> Sender: ~johnnyrichard/olang-devel <~johnnyrichard/olang-devel@lists.sr.ht> X-Migadu-Country: NL X-Migadu-Flow: FLOW_IN X-Migadu-Scanner: mx11.migadu.com X-Migadu-Spam-Score: -1.54 X-Spam-Score: -1.54 X-Migadu-Queue-Id: A583478F15 X-TUID: mZUgW1XgD7GF

This is a first step toward semantic checking. It adds the resolved
symbol to ast_id_t. It does not yet check whether types match, nor does
it print helpful errors.

Why add the symbol to ast_id_t?
-------------------------------

The semantic checker must resolve each symbol in order to perform type
checking and to verify that the reference can be resolved. Checking
that the symbol exists and then throwing the found symbol away wastes
computation. Additionally, storing the symbol on the id removes
complexity from codegen, since symbol lookups are no longer required
there.

Why create the resolve_symbols function?
----------------------------------------

Looking at resolve_symbols and populate_scope, you may wonder why there
are two functions that traverse the entire AST. The main reason is that
olang allows reference before definition. If something such as a
function call refers to a function before it is defined, a checker that
registers and resolves symbols in a single traversal will fail to find
it. It is therefore required to first populate the scope, registering
the symbols for the entire AST, and only afterwards resolve the
references.
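The two-pass idea described above can be sketched in isolation. This is
only an illustration under stated assumptions: every toy_* name is
hypothetical and is not part of olang, the AST is reduced to a flat
array of definitions and references sharing one scope, and unresolved
references are counted instead of asserted on as the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for symbol_t, scope_t and ast_id_t. */
typedef struct toy_symbol
{
    const char *name;
} toy_symbol_t;

#define TOY_SCOPE_CAP 16

typedef struct toy_scope
{
    toy_symbol_t symbols[TOY_SCOPE_CAP];
    size_t len;
} toy_scope_t;

typedef struct toy_id
{
    const char *name;
    toy_scope_t *scope;
    toy_symbol_t *symbol; /* NULL until the id is registered or resolved */
} toy_id_t;

typedef enum toy_kind
{
    TOY_DEF,
    TOY_REF
} toy_kind_t;

typedef struct toy_node
{
    toy_kind_t kind;
    toy_id_t id;
} toy_node_t;

/* Linear search by name; returns NULL when the name is not registered. */
static toy_symbol_t *
toy_scope_lookup(toy_scope_t *scope, const char *name)
{
    for (size_t i = 0; i < scope->len; ++i) {
        if (strcmp(scope->symbols[i].name, name) == 0) {
            return &scope->symbols[i];
        }
    }
    return NULL;
}

/* Pass 1: attach the scope to every id and register every definition. */
static void
toy_populate_scope(toy_scope_t *scope, toy_node_t *nodes, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        nodes[i].id.scope = scope;

        if (nodes[i].kind == TOY_DEF) {
            assert(scope->len < TOY_SCOPE_CAP);
            toy_symbol_t *symbol = &scope->symbols[scope->len++];
            symbol->name = nodes[i].id.name;
            nodes[i].id.symbol = symbol;
        }
    }
}

/*
 * Pass 2: look each reference up once and cache the result on the id.
 * Returns the number of references that could not be resolved.
 */
static size_t
toy_resolve_symbols(toy_node_t *nodes, size_t count)
{
    size_t unresolved = 0;

    for (size_t i = 0; i < count; ++i) {
        if (nodes[i].kind != TOY_REF) {
            continue;
        }

        nodes[i].id.symbol =
            toy_scope_lookup(nodes[i].id.scope, nodes[i].id.name);

        if (nodes[i].id.symbol == NULL) {
            ++unresolved;
        }
    }
    return unresolved;
}
```

With a reference placed before its definition, the reference stays
unresolved after toy_populate_scope and is resolved by
toy_resolve_symbols, because by then every definition is registered.
A later stage can read id.symbol directly instead of repeating the
lookup.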
Signed-off-by: Carlos Maniero
---
 src/ast.h            |   1 +
 src/codegen_x86_64.c |  30 ++++-------
 src/type_checker.c   | 123 ++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 132 insertions(+), 22 deletions(-)

diff --git a/src/ast.h b/src/ast.h
index c55ecf1..2181849 100644
--- a/src/ast.h
+++ b/src/ast.h
@@ -72,6 +72,7 @@ typedef struct ast_id
 {
     string_view_t name;
     scope_t *scope;
+    symbol_t *symbol;
 } ast_id_t;
 
 typedef struct ast_translation_unit
diff --git a/src/codegen_x86_64.c b/src/codegen_x86_64.c
index c7122be..1758902 100644
--- a/src/codegen_x86_64.c
+++ b/src/codegen_x86_64.c
@@ -143,12 +143,10 @@ codegen_x86_64_emit_expression(codegen_x86_64_t *codegen, ast_node_t *expr_node)
         case AST_NODE_REF: {
             ast_ref_t ref = expr_node->as_ref;
 
-            symbol_t *symbol = scope_lookup(ref.id.scope, ref.id.name);
-            assert(symbol);
-
-            size_t offset = codegen_x86_64_get_stack_offset(codegen, symbol);
+            size_t offset =
+                codegen_x86_64_get_stack_offset(codegen, ref.id.symbol);
 
-            size_t bytes = type_to_bytes(symbol->type);
+            size_t bytes = type_to_bytes(ref.id.symbol->type);
 
             fprintf(codegen->out,
                     "    mov -%ld(%%rbp), %s\n",
@@ -597,17 +595,14 @@ codegen_x86_64_emit_expression(codegen_x86_64_t *codegen, ast_node_t *expr_node)
                     switch (bin_op.lhs->kind) {
                         case AST_NODE_REF: {
                             ast_ref_t ref = bin_op.lhs->as_ref;
-                            scope_t *scope = ref.id.scope;
-
-                            symbol_t *symbol = scope_lookup(scope, ref.id.name);
-                            assert(symbol);
 
                             size_t offset = codegen_x86_64_get_stack_offset(
-                                codegen, symbol);
+                                codegen, ref.id.symbol);
 
                             codegen_x86_64_emit_expression(codegen, bin_op.rhs);
 
-                            size_t type_size = type_to_bytes(symbol->type);
+                            size_t type_size =
+                                type_to_bytes(ref.id.symbol->type);
                             fprintf(codegen->out,
                                     "    mov %s, -%ld(%%rbp)\n",
                                     get_reg_for(REG_ACCUMULATOR, type_size),
@@ -668,11 +663,8 @@ codegen_x86_64_emit_expression(codegen_x86_64_t *codegen, ast_node_t *expr_node)
 
                     ast_ref_t ref = unary_op.operand->as_ref;
 
-                    symbol_t *symbol = scope_lookup(ref.id.scope, ref.id.name);
-                    assert(symbol);
-
                     size_t offset =
-                        codegen_x86_64_get_stack_offset(codegen, symbol);
+                        codegen_x86_64_get_stack_offset(codegen, ref.id.symbol);
 
                     fprintf(
                         codegen->out, "    lea -%ld(%%rbp), %%rax\n", offset);
@@ -722,16 +714,12 @@ codegen_x86_64_emit_block(codegen_x86_64_t *codegen, ast_block_t *block)
 
             case AST_NODE_VAR_DEF: {
                 ast_var_definition_t var_def = node->as_var_def;
-                scope_t *scope = var_def.id.scope;
-
-                symbol_t *symbol = scope_lookup(scope, var_def.id.name);
-                assert(symbol);
 
-                size_t type_size = type_to_bytes(symbol->type);
+                size_t type_size = type_to_bytes(var_def.id.symbol->type);
                 codegen->base_offset += type_size;
 
                 codegen_x86_64_put_stack_offset(
-                    codegen, symbol, codegen->base_offset);
+                    codegen, var_def.id.symbol, codegen->base_offset);
 
                 if (var_def.value) {
                     codegen_x86_64_emit_expression(codegen, var_def.value);
diff --git a/src/type_checker.c b/src/type_checker.c
index a2ffdd6..daccecf 100644
--- a/src/type_checker.c
+++ b/src/type_checker.c
@@ -23,6 +23,9 @@
 static void
 populate_scope(checker_t *checker, scope_t *scope, ast_node_t *ast);
 
+static void
+resolve_symbols(checker_t *checker, ast_node_t *ast);
+
 checker_t *
 checker_new(arena_t *arena)
 {
@@ -98,6 +101,7 @@ checker_check(checker_t *checker, ast_node_t *ast)
     scope_t *scope = scope_new(checker->arena);
 
     populate_scope(checker, scope, ast);
+    resolve_symbols(checker, ast);
 
     // TODO: traverse the ast tree to verify semantics
 }
@@ -105,12 +109,27 @@ checker_check(checker_t *checker, ast_node_t *ast)
 static void
 register_id(checker_t *checker, scope_t *scope, ast_id_t *id, type_t *type)
 {
-    id->scope = scope;
     symbol_t *symbol = symbol_new(checker->arena, id->name, type);
 
+    id->scope = scope;
+    id->symbol = symbol;
+
     scope_insert(scope, symbol);
 }
 
+static void
+resolve_id(checker_t *checker, ast_id_t *id)
+{
+    assert(checker);
+
+    symbol_t *symbol = scope_lookup(id->scope, id->name);
+
+    // FIXME: assert types and print a friendly error message
+    assert(symbol);
+
+    id->symbol = symbol;
+}
+
 static void
 populate_scope(checker_t *checker, scope_t *scope, ast_node_t *ast)
 {
@@ -244,3 +263,105 @@ populate_scope(checker_t *checker, scope_t *scope, ast_node_t *ast)
             return;
     }
 }
+
+static void
+resolve_symbols(checker_t *checker, ast_node_t *ast)
+{
+    assert(checker);
+
+    switch (ast->kind) {
+        case AST_NODE_TRANSLATION_UNIT: {
+            list_item_t *item = list_head(ast->as_translation_unit.decls);
+
+            while (item != NULL) {
+                resolve_symbols(checker, (ast_node_t *)item->value);
+                item = list_next(item);
+            }
+            return;
+        }
+
+        case AST_NODE_FN_DEF: {
+            if (ast->as_fn_def.block != NULL) {
+                resolve_symbols(checker, ast->as_fn_def.block);
+            }
+            return;
+        }
+
+        case AST_NODE_FN_CALL: {
+            list_item_t *item = list_head(ast->as_fn_call.args);
+
+            while (item != NULL) {
+                resolve_symbols(checker, (ast_node_t *)item->value);
+                item = list_next(item);
+            }
+
+            return;
+        }
+
+        case AST_NODE_IF_STMT: {
+            resolve_symbols(checker, ast->as_if_stmt.cond);
+            resolve_symbols(checker, ast->as_if_stmt.then);
+
+            if (ast->as_if_stmt._else) {
+                resolve_symbols(checker, ast->as_if_stmt._else);
+            }
+
+            return;
+        }
+
+        case AST_NODE_WHILE_STMT: {
+            resolve_symbols(checker, ast->as_while_stmt.cond);
+            resolve_symbols(checker, ast->as_while_stmt.then);
+
+            return;
+        }
+
+        case AST_NODE_BINARY_OP: {
+            ast_binary_op_t bin_op = ast->as_bin_op;
+
+            resolve_symbols(checker, bin_op.lhs);
+            resolve_symbols(checker, bin_op.rhs);
+            return;
+        }
+
+        case AST_NODE_UNARY_OP: {
+            ast_unary_op_t unary_op = ast->as_unary_op;
+
+            resolve_symbols(checker, unary_op.operand);
+            return;
+        }
+
+        case AST_NODE_RETURN_STMT: {
+            ast_return_stmt_t return_stmt = ast->as_return_stmt;
+
+            resolve_symbols(checker, return_stmt.value);
+            return;
+        }
+
+        case AST_NODE_BLOCK: {
+            ast_block_t block = ast->as_block;
+
+            list_item_t *item = list_head(block.nodes);
+
+            while (item != NULL) {
+                resolve_symbols(checker, (ast_node_t *)item->value);
+                item = list_next(item);
+            }
+
+            return;
+        }
+
+        case AST_NODE_VAR_DEF: {
+            resolve_symbols(checker, ast->as_var_def.value);
+            return;
+        }
+
+        case AST_NODE_REF: {
+            resolve_id(checker, &ast->as_ref.id);
+            return;
+        }
+        case AST_NODE_LITERAL:
+        case AST_NODE_UNKNOWN:
+            return;
+    }
+}
-- 
2.46.1