From: Carlos Maniero
To: ~johnnyrichard/olang-devel@lists.sr.ht
Cc: Carlos Maniero
Subject: [PATCH olang] docs: add script to generate man pages from docstring
Message-ID: <20241028020233.69521-2-carlos@maniero.me>
X-Mailer: git-send-email 2.46.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Date: Mon, 28 Oct 2024 02:06:29 +0000 (UTC)
X-Sourcehut-Patchset-Status: PROPOSED

There is no initial intention to make these manuals public; they are
only meant to support the core developers.

Signed-off-by: Carlos Maniero
---
The Makefile rule was inspired by test/olc/Makefile and supports
parallelism.

If you have manpath on your system, you can use the following
configuration:

echo "MANDATORY_MANPATH /the_location_to_/olang/docs/man/" >> ~/.manpath

 Makefile              |   7 +
 scripts/gen-docstring | 704 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 711 insertions(+)
 create mode 100755 scripts/gen-docstring

diff --git a/Makefile b/Makefile
index 58526ac..2d021d2 100644
--- a/Makefile
+++ b/Makefile
@@ -20,6 +20,7 @@ BUILDDIR := build
 
 SRCS := $(wildcard $(SRCDIR)/*.c)
 HEADERS := $(wildcard $(SRCDIR)/*.h)
+DEV_MAN_PAGES := $(patsubst %, %.man, $(HEADERS))
 OBJS := $(patsubst $(SRCDIR)/%.c, $(BUILDDIR)/%.o, $(SRCS))
 
 .PHONY: all
@@ -115,6 +116,12 @@ docs:
 docs-dist:
 	$(MAKE) -C docs dist
 
+.PHONY: dev-man-docs
+dev-man-docs: $(DEV_MAN_PAGES)
+
 $(BUILDDIR)/%.o: $(SRCDIR)/%.c
 	@$(CC) $(CFLAGS) -c $< -o $@
 	@printf 'CC\t%s\n' '$@'
+
+$(SRCDIR)/%.man: $(SRCDIR)/%
+	@./scripts/gen-docstring $< `basename $<` docs/man/
diff --git a/scripts/gen-docstring b/scripts/gen-docstring
new file mode 100755
index 0000000..ec3afcb
--- /dev/null
+++ b/scripts/gen-docstring
@@ -0,0 +1,704 @@
+#!/usr/bin/env python3
+
+# Copyright (C) 2024 Carlos Maniero
+# Copyright (C) 2024 Johnny Richard
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <https://www.gnu.org/licenses/>.
+
+# TODO: Create a man page for the text below:
+"""
+gen-docstring - Parses docstrings into man pages.
+
+USAGE:
+    ./scripts/gen-docstring file_name include_name man_output_dir
+
+RESTRICTIONS:
+
+    - It works for macro definitions, typedef struct/union/enum, and functions
+    - It may not work with definitions that use macros
+    - It does not work well with inner structs/unions
+
+SUPPORTED FEATURES:
+
+    BASIC DOCSTRING:
+
+        /**
+         * This is the man page title
+         */
+        typedef struct my_struct
+        {
+        } my_struct_t;
+
+        This will generate a man page for my_struct_t.
+
+    SECTIONS:
+
+        All sections must start with a capital letter and must be followed
+        by a line of dashes.
+
+        /**
+         * This is the man page title
+         *
+         * This is my description
+         * .B it supports groff syntax
+         *
+         * This is my section
+         * ------------------
+         *
+         * This is my description
+         * .B it supports groff syntax
+         */
+
+        All the content placed right after the man page title belongs to the
+        description section if no section is defined.
+
+    GROUPS:
+
+        You may want to group many docstrings into a single man page. To do
+        so, use group_name(3), where 3 is the man level.
+
+        /**
+         * my_group(3) - This is the man page title
+         *
+         * Description of the first item
+         */
+        ...
+
+        /**
+         * my_group(3) - This text will be ignored
+         *
+         * Description of the second item
+         */
+        ...
+
+        The script will generate a single file named after group_name; the other
+        declarations will be linked to that file.
+
+        The title of the docstring whose declaration name matches group_name
+        will be used as the man page title.
+"""
+
+import os
+import sys
+import re
+from enum import Enum
+from collections import namedtuple
+
+Ctx = namedtuple('Ctx', 'filename include_name man_output_dir')
+
+ctx = None
+
+Token = namedtuple('Token', 'kind value location')
+
+token_rules = [
+    docstr_rule    := r'(?P<DOCSTR>/\*\*.*?\*/)',
+    bcomment_rule  := r'(?P<BCOMMENT>/\*.*?\*/)',
+    comment_rule   := r'(?P<COMMENT>//[^\n]*)',
+    keyword_rule   := r'(?P<KEYWORD>typedef|struct|union|enum)[\s]',
+    macro_rule     := r'(?P<MACRO>#[a-zA-Z_][a-zA-Z0-9_]*)',
+    string_rule    := r'(?P<STRING>"([^"\\]*(\\.[^"\\])*)*")',
+    hex_rule       := r'(?P<HEX>0[xX][0-9a-fA-F]+)',
+    octal_rule     := r'(?P<OCTAL>0[oO]?[0-7]+)',
+    binary_rule    := r'(?P<BINARY>0[bB]?[0-1]+)',
+    decimal_rule   := r'(?P<DECIMAL>\d+)',
+    char_rule      := r'(?P<CHAR>\'[\w ]\')',
+    comma_rule     := r'(?P<COMMA>,)',
+    dot_rule       := r'(?P<DOT>\.)',
+    arrow_rule     := r'(?P<ARROW>->)',
+    lshift_rule    := r'(?P<LSHIFT><<)',
+    rshift_rule    := r'(?P<RSHIFT>>>)',
+    semicolon_rule := r'(?P<SEMICOLON>;)',
+    eq_rule        := r'(?P<EQ>=)',
+    star_rule      := r'(?P<STAR>\*)',
+    plus_rule      := r'(?P<PLUS>\+)',
+    slash_rule     := r'(?P<SLASH>/)',
+    and_rule       := r'(?P<AND>&)',
+    pipe_rule      := r'(?P<PIPE>\|)',
+    tilde_rule     := r'(?P<TILDE>~)',
+    bang_rule      := r'(?P<BANG>!)',
+    dash_rule      := r'(?P<DASH>-)',
+    lbrace_rule    := r'(?P<LBRACE>{)',
+    rbrace_rule    := r'(?P<RBRACE>})',
+    lparen_rule    := r'(?P<LPAREN>\()',
+    rparen_rule    := r'(?P<RPAREN>\))',
+    ident_rule     := r'(?P<IDENT>[a-zA-Z_][a-zA-Z0-9_]*)',
+    ws_rule        := r'(?P<WS>\s+)',
+]
+
+tokenizer_pattern = re.compile('|'.join(token_rules), re.DOTALL)
+
+def tokenize(code):
+    pos = 0
+    tokens = []
+    while pos < len(code):
+        match = tokenizer_pattern.match(code, pos)
+        if match:
+            token_kind = match.lastgroup
+
+            if token_kind != 'WS':
+                tokens.append(Token(token_kind, match.group(token_kind), match.span()))
+            pos = match.end()
+        else:
+            tokens.append(Token('UNKNOWN', code[pos], (pos, pos + 1)))
+            pos += 1
+    return tokens
+
+
+re_section_name = r'[A-Z].*$'
+re_group = r'(([a-zA-Z_]\w+)\((.*?)\))'
+re_group_title = r'^(?P<group_name>([a-zA-Z_]\w+))\((?P<man_level>\d)\)( - (?P<title>.*))?'
+
+
+Field = namedtuple('Field', 'name type')
+EnumField = namedtuple('EnumField', 'name value')
+
+
+class Section:
+    def __init__(self, name, contents):
+        self.name = name
+        self.contents = contents
+        self.subsections = []
+
+
+NodeKind = Enum('NodeKind', ['FUNCTION', 'TYPEDEF_STRUCT', 'TYPEDEF_UNION', 'TYPEDEF_ENUM', 'MACRO'])
+
+
+class DocString:
+    def __init__(self, comment_lines, code_node):
+        self.comment_lines = comment_lines
+        self.code_node = code_node
+
+    def get_name(self):
+        return self.code_node.get_name()
+
+    def get_group_name(self):
+        if self.is_group():
+            return re.match(re_group_title, self.comment_lines[0]).group('group_name')
+        return self.get_name()
+
+    def is_group(self):
+        return re.match(re_group_title, self.comment_lines[0])
+
+    def get_man_level(self):
+        if self.is_group():
+            return re.match(re_group_title, self.comment_lines[0]).group('man_level')
+        return 3
+
+    def is_entry_doc(self):
+        return self.get_group_name() == self.get_name()
+
+    def get_title(self):
+        if self.is_group():
+            return re.match(re_group_title, self.comment_lines[0]).group('title')
+        return self.comment_lines[0]
+
+    def get_description(self):
+        description = ""
+        for line in self.comment_lines[2:]:
+            if line.startswith('='):
+                break
+            elif len(description) > 0 and description[-1] != '\n':
+                description += ' '
+            description += line + '\n'
+        return description.strip()
+
+    def get_sections(self):
+        sections = []
+
+        section_name = None
+        section_contents = ""
+
+        lines = self.comment_lines[2:]
+        index = 0
+
+        while len(lines) > index:
+            line = lines[index]
+            next_line = None
+
+            if len(lines) > index + 1:
+                next_line = lines[index + 1]
+
+            if re.match(re_section_name, line) and (next_line and re.match('-+', next_line)):
+                if section_name:
+                    sections.append(Section(section_name, section_contents[1:-1]))
+                    section_name = None
+                    section_contents = ""
+                section_name = line.strip().upper()
+                index += 1
+            elif index == 0:
+                section_name = 'DESCRIPTION'
+                section_contents += line + '\n'
+            elif section_name:
+                section_contents += line + '\n'
+
+            index += 1
+
+        if section_name:
+            sections.append(Section(section_name, section_contents))
+
+        return sections
+
+
+class DocStringFile:
+    def __init__(self, docstrings):
+        self.docstrings = docstrings
+
+    def get_name(self):
+        return self.docstrings[0].get_group_name()
+
+    def get_man_level(self):
+        return self.docstrings[0].get_man_level()
+
+    def get_group_name(self):
+        names = []
+
+        for docstring in self.docstrings:
+            names.append(docstring.get_name())
+        return ", ".join(names)
+
+    def is_group(self):
+        return len(self.docstrings) > 1
+
+    def get_title(self):
+        return self.docstrings[0].get_title()
+
+    def get_description(self):
+        return self.docstrings[0].get_description()
+
+    def get_filepath(self):
+        man_level = self.get_man_level()
+        filename = f'{self.get_name()}.{man_level}'
+        return os.path.join(ctx.man_output_dir, f'man{man_level}', filename)
+
+    def get_links(self):
+        man_level = self.get_man_level()
+        to = os.path.join(f'man{man_level}', f'{self.get_name()}.{man_level}')
+
+        for docstring in self.docstrings:
+            if docstring.is_entry_doc():
+                continue
+            filename = f'{docstring.get_name()}.{man_level}'
+            yield (os.path.join(ctx.man_output_dir, f'man{man_level}', filename), to)
+
+    def get_sections(self):
+        if not self.is_group():
+            return self.docstrings[0].get_sections()
+
+        sections = []
+
+        for docstring in self.docstrings:
+            for section in docstring.get_sections():
+                parent_section = next(filter(lambda x: x.name == section.name, sections), None)
+
+                if not parent_section:
+                    parent_section = Section(section.name, '')
+                    sections.append(parent_section)
+
+                parent_section.subsections.append(section)
+                section.name = docstring.get_name()
+
+        return sections
+
+
+class MacroNode:
+    def __init__(self, tokens, file_contents):
+        self.tokens = tokens
+        self.kind = NodeKind.MACRO
+        self.file_contents = file_contents
+
+    def get_name(self):
+        return self.tokens[1].value
+
+    def get_contents(self):
+        start_location = self.tokens[0].location[0]
+
+        index = 0
+
+        while start_location + index + 1 < len(self.file_contents):
+            cur_char = self.file_contents[start_location + index]
+            next_char = self.file_contents[start_location + index + 1]
+
+            if next_char == '\n' and cur_char != '\\':
+                break
+
+            index += 1
+
+        return self.file_contents[start_location:start_location+index+1]
+
+
+class TypedefNode:
+    def __init__(self, tokens, kind):
+        self.tokens = tokens
+        self.kind = kind
+
+    def get_name(self):
+        tokens = self.tokens
+
+        # FIXME: support typedef without brackets
+        while len(tokens) > 0 and tokens[0].value != '{':
+            tokens = tokens[1:]
+        else:
+            tokens = tokens[1:]
+
+        open_brackets = 1
+
+        while len(tokens) > 0 and open_brackets != 0:
+            if tokens[0].value == '{':
+                open_brackets += 1
+            elif tokens[0].value == '}':
+                open_brackets -= 1
+            tokens = tokens[1:]
+
+        if len(tokens) == 0:
+            raise Exception("could not find the typedef name")
+
+        return tokens[0].value
+
+    def get_declaration(self):
+        final_index = 0
+
+        tokens = self.tokens
+        while tokens[0].value != '{':
+            tokens = tokens[1:]
+            final_index += 1
+
+        return " ".join([token.value for token in self.tokens[0:final_index]])
+
+    def get_fields(self):
+        tokens = self.tokens
+        while tokens[0].value != '{':
+            tokens = tokens[1:]
+        else:
+            tokens = tokens[1:]
+
+        while tokens[0].value != '}':
+            # FIXME: support inner declarations properly
+            #        It is working but with a poor representation
+            end_index = 1
+
+            level = 0
+            while tokens[end_index].value != ';' or level > 0:
+                if tokens[end_index].value == '{':
+                    level += 1
+                elif tokens[end_index].value == '}':
+                    level -= 1
+                end_index += 1
+
+            token_name = tokens[end_index - 1].value
+            token_type = " ".join([token.value for token in tokens[0:end_index - 1]])
+
+            yield Field(token_name, token_type)
+
+            tokens = tokens[end_index + 1:]
+
+
+class TypedefEnumNode(TypedefNode):
+    def tokens_to_field(self, value_tokens):
+        if len(value_tokens) > 2:
+            return EnumField(value_tokens[0], " ".join(value_tokens[2:]))
+        return EnumField(value_tokens[0], None)
+
+    def get_fields(self):
+        tokens = self.tokens
+        while tokens[0].value != '{':
+            tokens = tokens[1:]
+        else:
+            tokens = tokens[1:]
+
+        value_tokens = []
+
+        while True:
+            # Emit the pending field on ',' or '}', skipping the empty run
+            # left behind by a trailing comma.
+            if tokens[0].value in (',', '}'):
+                if value_tokens:
+                    yield self.tokens_to_field(value_tokens)
+                value_tokens = []
+            else:
+                value_tokens.append(tokens[0].value)
+
+            if tokens[0].value == '}':
+                break
+
+            tokens = tokens[1:]
+
+
+
+class FunctionNode:
+    def __init__(self, tokens):
+        self.tokens = tokens
+        self.kind = NodeKind.FUNCTION
+
+    def get_ret_type(self):
+        end_index = 0
+
+        while self.tokens[end_index].value != '(':
+            end_index += 1
+
+        return " ".join([token.value for token in self.tokens[0:end_index - 1]])
+
+    def get_name(self):
+        tokens = self.tokens
+
+        while len(tokens) > 1 and tokens[1].value != '(':
+            tokens = tokens[1:]
+
+        return tokens[0].value
+
+    def get_fields(self):
+        tokens = self.tokens
+        while tokens[0].value != '(':
+            tokens = tokens[1:]
+        else:
+            tokens = tokens[1:]
+
+        while True:
+            end_index = 1
+
+            while tokens[end_index].value != ',' and tokens[end_index].value != ')':
+                end_index += 1
+
+
+            token_name = tokens[end_index - 1].value
+
+            # FIXME: use the file contents to get this range
+            token_type = " ".join([token.value for token in tokens[0:end_index - 1]])
+
+            yield Field(token_name, token_type)
+
+            if tokens[end_index].value == ')':
+                break
+
+            tokens = tokens[end_index + 1:]
+
+
+def get_code_node(tokens, file):
+    if tokens[0].kind == 'MACRO':
+        return MacroNode(tokens, file)
+    # FIXME: allow structures without typedef
+    if tokens[0].value == 'typedef' and tokens[1].value == 'union':
+        return TypedefNode(tokens, NodeKind.TYPEDEF_UNION)
+    if tokens[0].value == 'typedef' and tokens[1].value == 'struct':
+        return TypedefNode(tokens, NodeKind.TYPEDEF_STRUCT)
+    if tokens[0].value == 'typedef' and tokens[1].value == 'enum':
+        return TypedefEnumNode(tokens, NodeKind.TYPEDEF_ENUM)
+    return FunctionNode(tokens)
+
+
+def group_docstring_into_files(docstrings):
+    files = []
+
+    docstrings = sorted(docstrings, key=lambda x: x.is_entry_doc(), reverse=True)
+
+    for docstring in docstrings:
+        file = next(filter(lambda x: x.get_name() == docstring.get_group_name(), files), None)
+
+        if docstring.is_group() and file:
+            file.docstrings.append(docstring)
+        else:
+            files.append(DocStringFile([docstring]))
+
+    return files
+
+
+def extract_comment(comment):
+    lines = comment.splitlines()
+
+    comment_lines = []
+
+    for line in lines[1:]:
+        if line.strip().startswith("*/"):
+            break
+
+        comment_lines.append(line.strip()[2:])
+
+    return comment_lines
+
+
+def extract_docstring(tokens, lines):
+    docstrings = []
+
+    while len(tokens) > 0:
+        token = tokens[0]
+        if token.kind == 'DOCSTR':
+            docstrings.append(
+                DocString(
+                    extract_comment(token.value),
+                    get_code_node(tokens[1:], lines)
+                )
+            )
+
+        tokens = tokens[1:]
+
+    return group_docstring_into_files(docstrings)
+
+def man_print_fn_synopsis(docstring):
+    groff_lines = []
+
+    groff_lines.append(".nf")
+    fields = list(docstring.code_node.get_fields())
+    paren = f"{docstring.code_node.get_ret_type()} {docstring.code_node.get_name()}("
+    paren_len = len(paren)
+    post = ","
+    for index, field in enumerate(fields):
+        if index == len(fields) - 1:
+            post = ");"
+
+        groff_lines.append(f".BI \"{paren}{field.type} \" {field.name} {post}")
+
+        paren = " " * paren_len
+
+    groff_lines.append(".fi")
+    return groff_lines
+
+
+def man_print_typedef(docstring):
+    groff_lines = []
+    groff_lines.append(f'.B "{docstring.code_node.get_declaration()}"')
+    groff_lines.append('.br')
+    groff_lines.append('.B "{"')
+
+    for index, field in enumerate(docstring.code_node.get_fields()):
+        groff_lines.append('.br')
+        if field.type:
+            groff_lines.append(f'.BI " {field.type} " "{field.name}";')
+        else:
+            groff_lines.append(f'.B " {field.name};"')
+
+    groff_lines.append('.br')
+    groff_lines.append(f'.B "}} {docstring.code_node.get_name()};"')
+    return groff_lines
+
+
+def man_print_enum_synopisis(docstring):
+    groff_lines = []
+    groff_lines.append(f'.B "{docstring.code_node.get_declaration()}"')
+    groff_lines.append('.br')
+    groff_lines.append('.B "{"')
+
+    for index, field in enumerate(docstring.code_node.get_fields()):
+        groff_lines.append('.br')
+        if field.value:
+            groff_lines.append(f'.BR " {field.name} " "= {field.value}",')
+        else:
+            groff_lines.append(f'.B " {field.name},"')
+
+    groff_lines.append('.br')
+    groff_lines.append(f'.B "}} {docstring.code_node.get_name()};"')
+
+    return groff_lines
+
+def man_print_macro(docstring):
+    contents = docstring.code_node.get_contents().replace('\\', '\\\\').splitlines()
+    return [
+        '.nf',
+        '.B ' + contents[0],
+    ] + contents[1:] + ['.fi']
+
+
+def ascii_to_groff(text):
+    text = re.sub(r'^ *<code>', '.EX', text, flags=re.M)
+    text = re.sub(r'^ *</code>', '.EE', text, flags=re.M)
+    text = re.sub(r'<b>(.*?)</b>', r'\\fB\1\\fR', text, flags=re.M)
+    text = re.sub(re_group, r'\\fB\2\\fR(\3)', text, flags=re.M)
+    text = re.sub(r'<i>(.*?)</i>', r'\\fI\1\\fR', text, flags=re.M)
+
+    groff_lines = []
+    ascii_lines = text.splitlines()
+
+    line_index = 0
+
+    while len(ascii_lines) > line_index:
+        ascii_line = ascii_lines[line_index]
+
+        if ascii_line.startswith('@'):
+            groff_lines.append('.TP')
+
+            field_name = ascii_line.split(':')[0][1:]
+            groff_lines.append(f'.I {field_name}')
+
+            description = ":".join(ascii_line.split(':')[1:]).strip()
+
+            while len(ascii_lines) > line_index + 1 and ascii_lines[line_index + 1].startswith(' '):
+                description += ' ' + ascii_lines[line_index + 1].strip()
+                line_index += 1
+            groff_lines.append(description)
+
+        else:
+            groff_lines.append(ascii_line)
+
+        line_index += 1
+
+    return groff_lines
+
+
+def generate_docs():
+    with open(ctx.filename) as file:
+        file_contents = file.read()
+
+    for doc_file in extract_docstring(tokenize(file_contents), file_contents):
+        groff_lines = []
+
+        groff_lines.append(f".TH {doc_file.get_name()} {doc_file.get_man_level()} {doc_file.get_name()} \"\" \"Olang Hacker's manual\"")
+        groff_lines.append(".SH NAME")
+        groff_lines.append(f"{doc_file.get_group_name()} \\- {doc_file.get_title()}")
+
+        groff_lines.append(".SH SYNOPSIS")
+
+        groff_lines.append(f".B #include <{ctx.include_name}>")
+        groff_lines.append(".P")
+
+        for index, docstring in enumerate(doc_file.docstrings):
+            if index:
+                groff_lines.append('.P')
+            node = docstring.code_node
+            if node.kind == NodeKind.FUNCTION:
+                groff_lines += man_print_fn_synopsis(docstring)
+            elif node.kind in [NodeKind.TYPEDEF_STRUCT, NodeKind.TYPEDEF_UNION]:
+                groff_lines += man_print_typedef(docstring)
+            elif node.kind == NodeKind.MACRO:
+                groff_lines += man_print_macro(docstring)
+            elif node.kind == NodeKind.TYPEDEF_ENUM:
+                groff_lines += man_print_enum_synopisis(docstring)
+
+        for section in doc_file.get_sections():
+            groff_lines.append(f'.SH {section.name}')
+            groff_lines += ascii_to_groff(section.contents)
+
+            for subsection in section.subsections:
+                groff_lines.append(f'.SS {subsection.name}')
+                groff_lines += ascii_to_groff(subsection.contents)
+
+        print(f'MAN\t{doc_file.get_filepath()}')
+        os.makedirs(os.path.dirname(doc_file.get_filepath()), exist_ok=True)
+        file = open(doc_file.get_filepath(), "w")
+        file.write("\n".join(groff_lines))
+        file.close()
+
+        for (link, to) in doc_file.get_links():
+            print(f'MAN\t{link}')
+            os.makedirs(os.path.dirname(link), exist_ok=True)
+            file = open(link, "w")
+            file.write(f".so {to}")
+            file.close()
+
+
+if __name__ == "__main__":
+    if len(sys.argv) != 4:
+        print(f'USAGE:\n\t{sys.argv[0]} file include_name man_output_dir')
+        sys.exit(1)
+
+    ctx = Ctx(*sys.argv[1:])
+
+    generate_docs()

base-commit: f87fb371a0105a458be07bd3f269bb45da913d16
-- 
2.46.1
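
Note for reviewers: the docstring conventions described in the script's
header (title line, optional group_name(N) prefix, sections underlined
with dashes) can be tried out in isolation. The snippet below is only a
minimal sketch under stated assumptions, not the code from the patch:
parse_docstring, GROUP_TITLE and the sample comment are hypothetical
names made up for illustration, and the section handling is simplified
(everything before the first heading is filed under DESCRIPTION).

```python
import re

# Hypothetical pattern mirroring the "group_name(N) - title" convention.
GROUP_TITLE = re.compile(
    r'^(?P<group_name>[a-zA-Z_]\w+)\((?P<man_level>\d)\)( - (?P<title>.*))?'
)


def parse_docstring(comment):
    """Split a /** ... */ comment into (group, man_level, title, sections)."""
    # Skip the opening "/**" line, stop at "*/", and drop the "* " prefix.
    lines = []
    for raw in comment.splitlines()[1:]:
        stripped = raw.strip()
        if stripped.startswith('*/'):
            break
        lines.append(stripped[2:])

    match = GROUP_TITLE.match(lines[0])
    if match:
        group = match.group('group_name')
        level = int(match.group('man_level'))
        title = match.group('title')
    else:
        # Assumption: ungrouped docstrings default to man level 3.
        group, level, title = None, 3, lines[0]

    sections = {}
    current = 'DESCRIPTION'
    body = lines[2:]  # skip the title and the blank line after it
    i = 0
    while i < len(body):
        line = body[i]
        following = body[i + 1] if i + 1 < len(body) else ''
        # A heading starts with a capital letter and is underlined with dashes.
        if re.match(r'[A-Z]', line) and re.fullmatch(r'-+', following):
            current = line.strip().upper()
            i += 2
            continue
        sections.setdefault(current, []).append(line)
        i += 1

    joined = {name: '\n'.join(text).strip() for name, text in sections.items()}
    return group, level, title, joined


example = """/**
 * my_group(3) - This is the man page title
 *
 * Description of the first item
 *
 * Return value
 * ------------
 *
 * Nothing.
 */"""

print(parse_docstring(example))
```

Keeping the heading test as "capitalized line followed by a dash line"
means ordinary sentences that start with a capital letter stay in the
current section, which is the reason the underline is required.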