From mboxrd@z Thu Jan 1 00:00:00 1970
From: Carlos Maniero
To: ~johnnyrichard/olang-devel@lists.sr.ht
Cc: Carlos Maniero
Subject: [PATCH olang v2 2/6] semantics: resolve variable symbols
Message-ID: <20241031031302.136553-3-carlos@maniero.me>
X-Mailer: git-send-email 2.46.1
In-Reply-To: <20241031031302.136553-1-carlos@maniero.me>
References: <20241031031302.136553-1-carlos@maniero.me>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Date: Thu, 31 Oct 2024 03:13:13 +0000 (UTC)
X-Sourcehut-Patchset-Status: UNKNOWN
List-ID: ~johnnyrichard/olang-devel <~johnnyrichard/olang-devel.lists.sr.ht>
Sender: ~johnnyrichard/olang-devel <~johnnyrichard/olang-devel@lists.sr.ht>

This is a first step toward semantic checking. It adds the resolved
symbol to ast_ident_t. It does not yet check whether types match, nor
does it print helpful errors.

Why add the symbol to ast_ident_t?
----------------------------------

The semantic pass has to resolve each symbol anyway, both to perform
type checking and to verify that the reference can be resolved.
Checking that the symbol exists and then throwing the found symbol away
would be wasted computation. Additionally, storing the symbol on the
identifier removes complexity from codegen, since its scope_lookup
calls are no longer required.

Why create the resolve_symbols function?
----------------------------------------

Looking at resolve_symbols and populate_scope, you may wonder why there
are two functions that traverse the entire AST. The main reason is that
olang allows reference before definition. If something such as a
function call refers to a function before it is defined, a checker that
resolved symbols while still registering them would fail. It is
therefore required to first populate the scope, registering the symbols
for the entire AST, and only afterwards resolve the references.
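For illustration only (this is not part of the patch): a minimal,
self-contained sketch of the two-pass idea described above. It assumes a
flat list of nodes and a single fixed-size scope instead of olang's real
AST and scope_t. Every name in it (node_t, ident_t, the toy_ functions,
MAX_SYMBOLS) is hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for olang's symbol and ident types. */
typedef struct symbol {
    const char *name;
} symbol_t;

typedef struct ident {
    const char *name;
    symbol_t *symbol; /* cached by the passes below; NULL until resolved */
} ident_t;

typedef enum { NODE_DEF, NODE_REF } node_kind_t;

typedef struct node {
    node_kind_t kind;
    ident_t ident;
} node_t;

#define MAX_SYMBOLS 16

typedef struct scope {
    symbol_t symbols[MAX_SYMBOLS];
    size_t len;
} scope_t;

/* Linear search by name; returns NULL when the name is not registered. */
symbol_t *
toy_scope_lookup(scope_t *scope, const char *name)
{
    for (size_t i = 0; i < scope->len; ++i) {
        if (strcmp(scope->symbols[i].name, name) == 0) {
            return &scope->symbols[i];
        }
    }
    return NULL;
}

/* Pass 1: register every definition, wherever it appears in the input. */
void
toy_populate_scope(scope_t *scope, node_t *nodes, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        if (nodes[i].kind != NODE_DEF) {
            continue;
        }
        assert(scope->len < MAX_SYMBOLS);
        symbol_t *symbol = &scope->symbols[scope->len++];
        symbol->name = nodes[i].ident.name;
        nodes[i].ident.symbol = symbol;
    }
}

/*
 * Pass 2: resolve every reference and cache the symbol on the identifier,
 * so later stages never need to look it up again.
 * Returns the number of references that could not be resolved.
 */
size_t
toy_resolve_symbols(scope_t *scope, node_t *nodes, size_t count)
{
    size_t unresolved = 0;

    for (size_t i = 0; i < count; ++i) {
        if (nodes[i].kind != NODE_REF) {
            continue;
        }
        nodes[i].ident.symbol = toy_scope_lookup(scope, nodes[i].ident.name);
        if (nodes[i].ident.symbol == NULL) {
            unresolved++;
        }
    }
    return unresolved;
}
```

Because every definition is registered before any reference is resolved,
a NODE_REF that precedes its NODE_DEF in the input still ends up pointing
at the right symbol. Unlike resolve_id in the patch, this sketch counts
unresolved references instead of asserting. That is a choice made for the
example, not a proposal for the checker.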
Signed-off-by: Carlos Maniero
---
 src/ast.c              |   2 +-
 src/ast.h              |   3 +-
 src/codegen_x86_64.c   |  34 ++++-------
 src/pretty_print_ast.c |   2 +-
 src/type_checker.c     | 125 ++++++++++++++++++++++++++++++++++++++++-
 5 files changed, 137 insertions(+), 29 deletions(-)

diff --git a/src/ast.c b/src/ast.c
index 7bb5277..e6b518e 100644
--- a/src/ast.c
+++ b/src/ast.c
@@ -106,7 +106,7 @@ ast_new_node_var_def(arena_t *arena,
     node_var_def->loc = loc;
     ast_var_definition_t *var_def = &node_var_def->as_var_def;

-    var_def->id.name = id;
+    var_def->ident.name = id;
     var_def->type = type;
     var_def->value = value;

diff --git a/src/ast.h b/src/ast.h
index 33496a8..9d128d7 100644
--- a/src/ast.h
+++ b/src/ast.h
@@ -72,6 +72,7 @@ typedef struct ast_ident
 {
     string_view_t name;
     scope_t *scope;
+    symbol_t *symbol;
 } ast_ident_t;

 typedef struct ast_translation_unit
@@ -108,7 +109,7 @@ typedef struct ast_fn_call
 typedef struct ast_var_definition
 {
     AST_NODE_HEAD;
-    ast_ident_t id;
+    ast_ident_t ident;
     type_t *type;
     ast_node_t *value;
 } ast_var_definition_t;
diff --git a/src/codegen_x86_64.c b/src/codegen_x86_64.c
index b695cc0..9b101f8 100644
--- a/src/codegen_x86_64.c
+++ b/src/codegen_x86_64.c
@@ -143,12 +143,10 @@ codegen_x86_64_emit_expression(codegen_x86_64_t *codegen, ast_node_t *expr_node)
         case AST_NODE_REF: {
             ast_ref_t ref = expr_node->as_ref;

-            symbol_t *symbol = scope_lookup(ref.ident.scope, ref.ident.name);
-            assert(symbol);
-
-            size_t offset = codegen_x86_64_get_stack_offset(codegen, symbol);
+            size_t offset =
+                codegen_x86_64_get_stack_offset(codegen, ref.ident.symbol);

-            size_t bytes = type_to_bytes(symbol->type);
+            size_t bytes = type_to_bytes(ref.ident.symbol->type);

             fprintf(codegen->out,
                     "    mov -%ld(%%rbp), %s\n",
@@ -597,18 +595,14 @@ codegen_x86_64_emit_expression(codegen_x86_64_t *codegen, ast_node_t *expr_node)
                     switch (bin_op.lhs->kind) {
                         case AST_NODE_REF: {
                             ast_ref_t ref = bin_op.lhs->as_ref;
-                            scope_t *scope = ref.ident.scope;
-
-                            symbol_t *symbol =
-                                scope_lookup(scope, ref.ident.name);
-                            assert(symbol);

                             size_t offset = codegen_x86_64_get_stack_offset(
-                                codegen, symbol);
+                                codegen, ref.ident.symbol);

                             codegen_x86_64_emit_expression(codegen, bin_op.rhs);

-                            size_t type_size = type_to_bytes(symbol->type);
+                            size_t type_size =
+                                type_to_bytes(ref.ident.symbol->type);
                             fprintf(codegen->out,
                                     "    mov %s, -%ld(%%rbp)\n",
                                     get_reg_for(REG_ACCUMULATOR, type_size),
@@ -669,12 +663,8 @@ codegen_x86_64_emit_expression(codegen_x86_64_t *codegen, ast_node_t *expr_node)

                     ast_ref_t ref = unary_op.operand->as_ref;

-                    symbol_t *symbol =
-                        scope_lookup(ref.ident.scope, ref.ident.name);
-                    assert(symbol);
-
-                    size_t offset =
-                        codegen_x86_64_get_stack_offset(codegen, symbol);
+                    size_t offset = codegen_x86_64_get_stack_offset(
+                        codegen, ref.ident.symbol);

                     fprintf(
                         codegen->out, "    lea -%ld(%%rbp), %%rax\n", offset);
@@ -724,16 +714,12 @@ codegen_x86_64_emit_block(codegen_x86_64_t *codegen, ast_block_t *block)

             case AST_NODE_VAR_DEF: {
                 ast_var_definition_t var_def = node->as_var_def;
-                scope_t *scope = var_def.id.scope;
-
-                symbol_t *symbol = scope_lookup(scope, var_def.id.name);
-                assert(symbol);

-                size_t type_size = type_to_bytes(symbol->type);
+                size_t type_size = type_to_bytes(var_def.ident.symbol->type);
                 codegen->base_offset += type_size;

                 codegen_x86_64_put_stack_offset(
-                    codegen, symbol, codegen->base_offset);
+                    codegen, var_def.ident.symbol, codegen->base_offset);

                 if (var_def.value) {
                     codegen_x86_64_emit_expression(codegen, var_def.value);
diff --git a/src/pretty_print_ast.c b/src/pretty_print_ast.c
index 9c0e607..06b93f1 100644
--- a/src/pretty_print_ast.c
+++ b/src/pretty_print_ast.c
@@ -288,7 +288,7 @@ ast_node_to_pretty_print_node(ast_node_t *ast, arena_t *arena)
             char name[256];
             sprintf(name,
                     "Var_Definition ",
-                    SV_ARG(var.id.name),
+                    SV_ARG(var.ident.name),
                     SV_ARG(var.type->id));
             node->name = (char *)arena_alloc(arena,
                                              sizeof(char) * (strlen(name) + 1));
diff --git a/src/type_checker.c b/src/type_checker.c
index abf7ae9..081034d 100644
--- a/src/type_checker.c
+++ b/src/type_checker.c
@@ -23,6 +23,9 @@
 static void
 populate_scope(checker_t *checker, scope_t *scope, ast_node_t *ast);

+static void
+resolve_symbols(checker_t *checker, ast_node_t *ast);
+
 checker_t *
 checker_new(arena_t *arena)
 {
@@ -98,6 +101,7 @@ checker_check(checker_t *checker, ast_node_t *ast)

     scope_t *scope = scope_new(checker->arena);
     populate_scope(checker, scope, ast);
+    resolve_symbols(checker, ast);

     // TODO: traverse the ast tree to verify semantics
 }
@@ -108,12 +112,27 @@ register_id(checker_t *checker,
             ast_ident_t *ident,
             type_t *type)
 {
-    ident->scope = scope;
     symbol_t *symbol = symbol_new(checker->arena, ident->name, type);

+    ident->scope = scope;
+    ident->symbol = symbol;
+
     scope_insert(scope, symbol);
 }

+static void
+resolve_id(checker_t *checker, ast_ident_t *id)
+{
+    assert(checker);
+
+    symbol_t *symbol = scope_lookup(id->scope, id->name);
+
+    // FIXME: assert types and print a friendly error message
+    assert(symbol);
+
+    id->symbol = symbol;
+}
+
 static void
 populate_scope(checker_t *checker, scope_t *scope, ast_node_t *ast)
 {
@@ -231,7 +250,7 @@ populate_scope(checker_t *checker, scope_t *scope, ast_node_t *ast)
             type_resolve(ast->as_var_def.type);

             register_id(
-                checker, scope, &ast->as_var_def.id, ast->as_var_def.type);
+                checker, scope, &ast->as_var_def.ident, ast->as_var_def.type);

             populate_scope(checker, scope, ast->as_var_def.value);
             return;
@@ -247,3 +266,105 @@ populate_scope(checker_t *checker, scope_t *scope, ast_node_t *ast)
             return;
     }
 }
+
+static void
+resolve_symbols(checker_t *checker, ast_node_t *ast)
+{
+    assert(checker);
+
+    switch (ast->kind) {
+        case AST_NODE_TRANSLATION_UNIT: {
+            list_item_t *item = list_head(ast->as_translation_unit.decls);
+
+            while (item != NULL) {
+                resolve_symbols(checker, (ast_node_t *)item->value);
+                item = list_next(item);
+            }
+            return;
+        }
+
+        case AST_NODE_FN_DEF: {
+            if (ast->as_fn_def.block != NULL) {
+                resolve_symbols(checker, ast->as_fn_def.block);
+            }
+            return;
+        }
+
+        case AST_NODE_FN_CALL: {
+            list_item_t *item = list_head(ast->as_fn_call.args);
+
+            while (item != NULL) {
+                resolve_symbols(checker, (ast_node_t *)item->value);
+                item = list_next(item);
+            }
+
+            return;
+        }
+
+        case AST_NODE_IF_STMT: {
+            resolve_symbols(checker, ast->as_if_stmt.cond);
+            resolve_symbols(checker, ast->as_if_stmt.then);
+
+            if (ast->as_if_stmt._else) {
+                resolve_symbols(checker, ast->as_if_stmt._else);
+            }
+
+            return;
+        }
+
+        case AST_NODE_WHILE_STMT: {
+            resolve_symbols(checker, ast->as_while_stmt.cond);
+            resolve_symbols(checker, ast->as_while_stmt.then);
+
+            return;
+        }
+
+        case AST_NODE_BINARY_OP: {
+            ast_binary_op_t bin_op = ast->as_bin_op;
+
+            resolve_symbols(checker, bin_op.lhs);
+            resolve_symbols(checker, bin_op.rhs);
+            return;
+        }
+
+        case AST_NODE_UNARY_OP: {
+            ast_unary_op_t unary_op = ast->as_unary_op;
+
+            resolve_symbols(checker, unary_op.operand);
+            return;
+        }
+
+        case AST_NODE_RETURN_STMT: {
+            ast_return_stmt_t return_stmt = ast->as_return_stmt;
+
+            resolve_symbols(checker, return_stmt.value);
+            return;
+        }
+
+        case AST_NODE_BLOCK: {
+            ast_block_t block = ast->as_block;
+
+            list_item_t *item = list_head(block.nodes);
+
+            while (item != NULL) {
+                resolve_symbols(checker, (ast_node_t *)item->value);
+                item = list_next(item);
+            }
+
+            return;
+        }
+
+        case AST_NODE_VAR_DEF: {
+            resolve_symbols(checker, ast->as_var_def.value);
+            return;
+        }
+
+        case AST_NODE_REF: {
+            resolve_id(checker, &ast->as_ref.ident);
+            return;
+        }
+        case AST_NODE_LITERAL:
+        case AST_NODE_UNKNOWN:
+            return;
+    }
+}
-- 
2.46.1