AST

抽象语法树是目前编译器中最复杂的部分，它是编译器的中间表示，也是编译器的核心。本节将介绍AST的设计和实现。

AST的设计

基本上，所有源代码中的基础单位都会对应抽象语法树中的一个节点。抽象语法树有很多类型的节点，他们可能会相互引用。

所有的节点都必须实现Node trait，这个trait定义了节点的基本行为。

#[enum_dispatch]
pub trait Node: RangeTrait + FmtTrait + PrintTrait {
    fn emit<'a, 'b>(
        &mut self,
        ctx: &'b mut Ctx<'a>,
        builder: &'b BuilderEnum<'a, '_>,
    ) -> NodeResult;
}

/// print trait to report the AST node structure
#[enum_dispatch]
pub trait PrintTrait {
    fn print(&self, tabs: usize, end: bool, line: Vec<bool>);
}

一般来说，所有的节点都需要加入NodeEnum中，并且使用#[node]proc macro进行修饰。同时需要在fmt.rs中加入format相关的函数。

如果你想学习如何添加新的节点，可以先从简单的节点开始，比如BoolConstNode

#[node]
pub struct BoolConstNode {
    pub value: bool,
}

你可能注意到了，Nodetrait继承了RangeTrait，这个trait定义了节点的位置信息。

#[enum_dispatch]
pub trait RangeTrait {
    fn range(&self) -> Range;
}

一般来说，RangeTrait的实现通过#[node]宏来自动生成，你不需要手动实现它。

Node接口中的print函数用于打印节点的信息，它会被用于调试。print打印的结果和tree的输出非常像，你需要用一些工具函数来格式化输出。以ifnode的print函数为例：

#![allow(unused)]
fn main() {
    fn print(&self, tabs: usize, end: bool, mut line: Vec<bool>) {
        deal_line(tabs, &mut line, end);
        tab(tabs, line.clone(), end);
        println!("IfNode");
        self.cond.print(tabs + 1, false, line.clone());
        if let Some(el) = &self.els {
            self.then.print(tabs + 1, false, line.clone());
            el.print(tabs + 1, true, line.clone());
        } else {
            self.then.print(tabs + 1, true, line.clone());
        }
    }
}

emit函数是生成llvm代码的核心，它会调用llvm api构造自己对应的llvm ir。在编译的时候，最上层节点的emit会被调用，该函数会递归的调用自己的子节点的emit函数，最终生成整个程序的llvm ir。
下方是ifnode的emit函数：

    fn emit<'a, 'b>(
        &mut self,
        ctx: &'b mut Ctx<'a>,
        builder: &'b BuilderEnum<'a, '_>,
    ) -> NodeResult {
        let cond_block = builder.append_basic_block(ctx.function.unwrap(), "if.cond");
        let then_block = builder.append_basic_block(ctx.function.unwrap(), "if.then");
        let else_block = builder.append_basic_block(ctx.function.unwrap(), "if.else");
        let merge_block = builder.append_basic_block(ctx.function.unwrap(), "if.after");
        builder.build_unconditional_branch(cond_block);
        ctx.position_at_end(cond_block, builder);
        let cond_range = self.cond.range();

        let (gen_def, skip_body) = cond_pre_process(
            &mut self.cond,
            ctx,
            builder,
            then_block,
            else_block,
            cond_range,
        );

        // emit the else logic into the then block
        ctx.position_at_end(then_block, builder);
        let mut then_terminator = TerminatorEnum::None;
        // emit the code inside a child context because it belongs to a sub-block
        let mut child = ctx.new_child(self.then.range().start, builder);
        if !skip_body || matches!(builder, BuilderEnum::NoOp(_)) {
            if let Some(mut def) = gen_def {
                def.emit(&mut child, builder)?;
            }
            then_terminator = self.then.emit(&mut child, builder)?.get_term();
        }

        if then_terminator.is_none() {
            // there is no terminator(like return, yield and so forth) in the statement
            // create an unconditional branch to merge block to finish off the "then" block
            builder.build_unconditional_branch(merge_block);
        }

        // emit the else logic into the else block
        ctx.position_at_end(else_block, builder);
        let terminator = if let Some(el) = &mut self.els {
            let mut child = ctx.new_child(el.range().start, builder);
            let else_terminator = el.emit(&mut child, builder)?.get_term();
            if else_terminator.is_none() {
                // create an unconditional branch only if no terminator is detected
                // otherwise, the code to be executed might be the others instead of merge block
                // for example, if there is a 'return' statement in the if-then-else clause,
                // it won't execute the merge block as it returns directly
                builder.build_unconditional_branch(merge_block);
            }

            if then_terminator.is_return() && else_terminator.is_return() {
                TerminatorEnum::Return
            } else {
                TerminatorEnum::None
            }
        } else {
            builder.build_unconditional_branch(merge_block);
            TerminatorEnum::None
        };

        ctx.position_at_end(merge_block, builder);
        if terminator.is_return() {
            builder.build_unconditional_branch(merge_block);
        }
        ctx.emit_comment_highlight(&self.comments[0]);

        NodeOutput::default().with_term(terminator).to_result()
    }

emit函数的第一个参数是节点自身，第二个参数是编译上下文。编译上下文中会包含一些需要透传的信息，比如符号表，lsp参数等，第三个参数是builder，用于生成中间代码。目前builder只有llvm的实现。

emit函数的返回值比较复杂，它是一个Result枚举类型，它的Ok类型中包含一个Option的PLValue--这是代表该节点的运算结果，一般只有表达式节点有这个值，statement节点这里会返回None，除此之外还包含一个Option的PLType，这是代表该节点返回值的类型，最后一个是一个TerminatorEnum，用于分析某个路径是否有终结语句（break，continue，return，panic等）。返回值的Err类型是一个PLDiag，这个类型是用于报告错误的，包含了所有的错误信息。一般来说，在返回之前错误就会被加到编译上下文中，所以调用者不需要对其进行任何处理。

打印AST结构

plc命令行工具有打印ast的功能，你可以使用plc xxx.pi --printast命令来打印ast结构。
下方是一个ast打印结果的样例：

...
file: /Users/bobli/src/pivot-lang/test/sub/mod.pi
ProgramNode
 └─ FuncDefNode
     ├─ id: name
     ├─ TypeNameNode
     │   └─ ExternIdNode
     │       └─ VarNode: void
     └─ StatementsNode
         └─ RetNode
file: /Users/bobli/src/pivot-lang/test/mod2.pi
ProgramNode
 ├─ UseNode
 │   ├─ VarNode: sub
 │   └─ VarNode: mod
 ├─ FuncDefNode
 │   ├─ id: test_mod
 │   ├─ TypedIdentifierNode
 │   │   ├─ id: args
 │   │   └─ TypeNameNode
 │   │       └─ ExternIdNode
 │   │           └─ VarNode: i64
 │   ├─ TypeNameNode
 │   │   └─ ExternIdNode
 │   │       └─ VarNode: void
 │   └─ StatementsNode
 │       └─ RetNode
 └─ StructDefNode
     ├─ id: Mod2
     └─ TypedIdentifierNode
         ├─ id: y
         └─ TypeNameNode
             └─ ExternIdNode
                 └─ VarNode: bool
...

Pivot Lang

AST

AST的设计

打印AST结构