From mboxrd@z Thu Jan 1 00:00:00 1970
From: Carlos Maniero
To: ~johnnyrichard/olang-devel@lists.sr.ht
Cc: Carlos Maniero
Subject: [PATCH olang v3] arena: optimization: ensure alignment memory access
Date: Wed, 28 Feb 2024 11:25:34 -0300
Message-Id: <20240228142534.1810524-1-carlos@maniero.me>
X-Mailer: git-send-email 2.34.1
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This commit changes the pointers returned by *arena_alloc* to always be
16-byte aligned. Unaligned data structures can have a noticeable impact
on performance. Take the example below:

    int main() {
        void *pointer = malloc(1024);
        int offset = 0;
        long long* data = pointer + offset;

        for (int i = 0; i < INT_MAX; i++) {
            *data += i;
        }

        printf("result = %lld", *data);
    }

These are the execution times on my machine:

    +----------+----------------+
    | Offset   | Execution time |
    +----------+----------------+
    | 0 bytes  | 0m1.655s       |
    | 1 byte   | 0m2.286s       |
    | 2 bytes  | 0m2.282s       |
    | 4 bytes  | 0m1.716s       |
    | 8 bytes  | 0m1.712s       |
    | 16 bytes | 0m1.665s       |
    +----------+----------------+

The reason for the performance degradation is given in Intel's manual [1]:

> To improve the performance of programs, data structures (especially
> stacks) should be aligned on natural boundaries whenever possible. The
> reason for this is that the processor requires two memory accesses to
> make an unaligned memory access; aligned accesses require only one
> memory access.

A double quadword has a 16-byte natural boundary, which is the largest
natural boundary among the data types covered in [1] for the x86_64
architecture. All other natural boundaries are smaller powers of two, so
any of those types is also aligned when 16-byte alignment is used.

You can learn more about memory alignment in Drake and Berg's "Unaligned
Memory Accesses" article [2].
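To make the power-of-two argument concrete, here is a minimal sketch of one
way to compute the padding that rounds an allocation size up to the next
multiple of 16. The names sketch_padding, sketch_next_offset and the SKETCH_*
macros are hypothetical and exist only for this illustration. The sketch
assumes the arena region itself starts on a 16-byte boundary and that offsets
start at zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration only. */
#define SKETCH_ALIGNMENT 16u
#define SKETCH_ALIGNMENT_MASK (SKETCH_ALIGNMENT - 1u)

/* The mask trick below only works when the alignment is a power of two. */
static_assert((SKETCH_ALIGNMENT & SKETCH_ALIGNMENT_MASK) == 0,
              "alignment must be a power of two");

/* Number of bytes needed to round `bytes` up to a multiple of 16.
 * The subtraction is done in unsigned arithmetic, so it wraps around for
 * bytes > 16 and the mask still yields the right remainder. */
static size_t
sketch_padding(size_t bytes)
{
    return (SKETCH_ALIGNMENT - bytes) & SKETCH_ALIGNMENT_MASK;
}

/* Offset of the next allocation, given the current offset and the size
 * of the allocation being served at that offset. */
static size_t
sketch_next_offset(size_t offset, size_t bytes)
{
    return offset + bytes + sketch_padding(bytes);
}
```

With this arithmetic a 1-byte allocation is followed by 15 bytes of padding
and a 16-byte allocation by none, so every offset stays a multiple of 16 as
long as the first one is.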

[1]: Intel® 64 and IA-32 Architectures Software Developer’s Manual,
     Combined Volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D, and 4,
     Chapter 4.1.1 "Alignment of Words, Doublewords, Quadwords, and
     Double Quadwords"
     https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html
[2]: https://www.kernel.org/doc/html/next/_sources/core-api/unaligned-memory-access.rst.txt

Signed-off-by: Carlos Maniero
---
v3:
 - Include more details in the commit message explaining why it matters.
 - Enforce 16-byte alignment on all platforms.
 - Remove duplicated test scenario (overflow test).

 src/arena.c             | 16 +++++++++++---
 src/arena.h             |  3 +++
 tests/unit/arena_test.c | 48 +++++++++++++++++++++++++++++++++++------
 3 files changed, 58 insertions(+), 9 deletions(-)

diff --git a/src/arena.c b/src/arena.c
index ae33e6a..ad2e535 100644
--- a/src/arena.c
+++ b/src/arena.c
@@ -28,14 +28,18 @@ arena_new(size_t size)
     return arena;
 }
 
+static uint8_t
+arena_padding(size_t bytes);
+
 void *
-arena_alloc(arena_t *arena, size_t size)
+arena_alloc(arena_t *arena, size_t bytes)
 {
-    if ((arena->offset + size) > arena->size) {
+    if ((arena->offset + bytes) > arena->size) {
         return NULL;
     }
     void *pointer = arena->region + arena->offset;
-    arena->offset += size;
+    arena->offset += bytes + arena_padding(bytes);
+
     return pointer;
 }
 
@@ -51,3 +55,9 @@ arena_free(arena_t *arena)
     arena->size = 0;
     free(arena->region);
 }
+
+static uint8_t
+arena_padding(size_t bytes)
+{
+    return (ARENA_ALIGNMENT_BYTES - bytes) & ARENA_ALIGNMENT_BYTES_MASK;
+}
diff --git a/src/arena.h b/src/arena.h
index 157165c..03fd803 100644
--- a/src/arena.h
+++ b/src/arena.h
@@ -19,6 +19,9 @@
 #include
 #include
 
+#define ARENA_ALIGNMENT_BYTES 16
+#define ARENA_ALIGNMENT_BYTES_MASK (ARENA_ALIGNMENT_BYTES - 1)
+
 typedef struct arena
 {
     size_t offset;
diff --git a/tests/unit/arena_test.c b/tests/unit/arena_test.c
index 13f406f..a471572 100644
--- a/tests/unit/arena_test.c
+++ b/tests/unit/arena_test.c
@@ -19,14 +19,14 @@
 #include "munit.h"
 
 static MunitResult
-arena_test(const MunitParameter params[], void *user_data_or_fixture)
+arena_alloc_test(const MunitParameter params[], void *user_data_or_fixture)
 {
-    arena_t arena = arena_new(sizeof(int) * 2);
+    arena_t arena = arena_new(ARENA_ALIGNMENT_BYTES * 2);
 
-    int *a = arena_alloc(&arena, sizeof(int));
+    uint8_t *a = arena_alloc(&arena, sizeof(uint8_t));
     *a = 1;
 
-    int *b = arena_alloc(&arena, sizeof(int));
+    uint8_t *b = arena_alloc(&arena, sizeof(uint8_t));
     *b = 2;
 
     munit_assert_int(*a, ==, 1);
@@ -34,7 +34,7 @@ arena_test(const MunitParameter params[], void *user_data_or_fixture)
 
     arena_release(&arena);
 
-    int *c = arena_alloc(&arena, sizeof(int));
+    uint8_t *c = arena_alloc(&arena, sizeof(uint8_t));
     *c = 3;
 
     munit_assert_int(*c, ==, 3);
@@ -49,7 +49,43 @@ arena_test(const MunitParameter params[], void *user_data_or_fixture)
     return MUNIT_OK;
 }
 
-static MunitTest tests[] = { { "/arena_test", arena_test, NULL, NULL, MUNIT_TEST_OPTION_NONE, NULL },
+static MunitResult
+arena_padding_test(const MunitParameter params[], void *user_data_or_fixture)
+{
+    arena_t arena = arena_new(512);
+
+    // Allocated bytes is < ARENA_ALIGNMENT_BYTES
+    uint8_t *a = arena_alloc(&arena, sizeof(uint8_t));
+    uint8_t *b = arena_alloc(&arena, sizeof(uint8_t));
+
+    munit_assert_int((b - a) % ARENA_ALIGNMENT_BYTES, ==, 0);
+    munit_assert_int(b - a, ==, ARENA_ALIGNMENT_BYTES);
+
+    arena_release(&arena);
+
+    // Allocated bytes is == ARENA_ALIGNMENT_BYTES
+    a = arena_alloc(&arena, ARENA_ALIGNMENT_BYTES);
+    b = arena_alloc(&arena, sizeof(uint8_t));
+
+    munit_assert_int((b - a) % ARENA_ALIGNMENT_BYTES, ==, 0);
+    munit_assert_int(b - a, ==, ARENA_ALIGNMENT_BYTES);
+
+    arena_release(&arena);
+
+    // Allocated bytes is > ARENA_ALIGNMENT_BYTES
+    a = arena_alloc(&arena, ARENA_ALIGNMENT_BYTES + 1);
+    b = arena_alloc(&arena, sizeof(uint8_t));
+
+    munit_assert_int((b - a) % ARENA_ALIGNMENT_BYTES, ==, 0);
+    munit_assert_int(b - a, ==, ARENA_ALIGNMENT_BYTES * 2);
+
+    arena_free(&arena);
+
+    return MUNIT_OK;
+}
+
+static MunitTest tests[] = { { "/arena_alloc_test", arena_alloc_test, NULL, NULL, MUNIT_TEST_OPTION_NONE, NULL },
+                             { "/arena_padding_test", arena_padding_test, NULL, NULL, MUNIT_TEST_OPTION_NONE, NULL },
                              { NULL, NULL, NULL, NULL, MUNIT_TEST_OPTION_NONE, NULL } };
 
 static const MunitSuite suite = { "/arena", tests, NULL, 1, MUNIT_SUITE_OPTION_NONE };
-- 
2.34.1
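
The padding test in this patch checks the distance between two returned
pointers. A complementary check, sketched here, looks at the absolute address
instead. It assumes the region base comes from C11 aligned_alloc; the helpers
sketch_is_aligned and sketch_region_new are hypothetical and not part of this
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical name for illustration only. */
#define SKETCH_ALIGNMENT 16u

static_assert((SKETCH_ALIGNMENT & (SKETCH_ALIGNMENT - 1u)) == 0,
              "alignment must be a power of two");

/* Returns 1 when `pointer` sits on a 16-byte boundary, 0 otherwise. */
static int
sketch_is_aligned(const void *pointer)
{
    return ((uintptr_t)pointer & (SKETCH_ALIGNMENT - 1u)) == 0;
}

/* Allocates a region whose base address is 16-byte aligned.
 * The size is rounded up first because C11 aligned_alloc expects a size
 * that is a multiple of the alignment. Release the region with free(). */
static void *
sketch_region_new(size_t size)
{
    size_t mask = (size_t)SKETCH_ALIGNMENT - 1u;
    size_t rounded = (size + mask) & ~mask;
    return aligned_alloc(SKETCH_ALIGNMENT, rounded);
}
```

If the base is aligned and every offset is a multiple of 16, each pointer
handed out satisfies sketch_is_aligned, which is a stricter property than the
pointer distances alone can show.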