"Let’s Build A Simple Interpreter" Study Notes (Part 2)

These notes are based on the tutorial Let’s Build A Simple Interpreter from Ruslan’s Blog. The original series writes an interpreter for Pascal in Python; in these notes I write the interpreter in Rust instead.

1 Core Concepts

1.1 Lexeme

Token      Sample lexemes
INTEGER    342, 9, 0, 17, 1
PLUS       +
MINUS      -

A lexeme is the sequence of characters that forms a token.
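To make the token/lexeme distinction concrete, here is a minimal sketch that pairs each token kind with the lexeme it was built from. The names `Kind` and `lexemes` are hypothetical and exist only for this illustration; the types used by the actual interpreter appear in section 3.

```rust
// Hypothetical names for illustration only; not the types used in section 3.
#[derive(Debug, PartialEq)]
enum Kind {
    Integer,
    Plus,
    Minus,
}

/// Splits `input` into (kind, lexeme) pairs, ignoring whitespace.
/// Returns `None` if an unsupported character is encountered.
fn lexemes(input: &str) -> Option<Vec<(Kind, String)>> {
    let mut out = Vec::new();
    let mut chars = input.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c.is_ascii_digit() {
            // A run of consecutive digits forms a single INTEGER lexeme.
            let mut lexeme = String::new();
            while let Some(&d) = chars.peek() {
                if !d.is_ascii_digit() {
                    break;
                }
                lexeme.push(d);
                chars.next();
            }
            out.push((Kind::Integer, lexeme));
        } else if c == '+' {
            chars.next();
            out.push((Kind::Plus, "+".to_string()));
        } else if c == '-' {
            chars.next();
            out.push((Kind::Minus, "-".to_string()));
        } else {
            return None;
        }
    }
    Some(out)
}

fn main() {
    // Each printed line shows a token kind next to its lexeme.
    for (kind, lexeme) in lexemes("342 - 17").unwrap() {
        println!("{:?} {:?}", kind, lexeme);
    }
}
```

Many different lexemes (342, 9, 0, ...) map to the same kind INTEGER, while PLUS and MINUS each have exactly one lexeme.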

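2 Problem

Upgrade the interpreter from the previous part. Building on that version, it needs to support the following extensions:

  1. Ignore whitespace anywhere in the input
  2. Recognize multi-digit integers
  3. Subtract two positive integers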
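Before looking at the full implementation, the three requirements can be sketched as three small functions. This is a standalone illustration under my own assumptions, not the structure used in section 3: the helper names `skip_whitespace`, `read_integer`, and `eval` are hypothetical, and errors are returned as `Result` values rather than raised as panics.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// Requirement 1: skip any run of whitespace.
fn skip_whitespace(chars: &mut Peekable<Chars<'_>>) {
    while chars.peek().map_or(false, |c| c.is_whitespace()) {
        chars.next();
    }
}

/// Requirement 2: read a multi-digit, non-negative integer.
fn read_integer(chars: &mut Peekable<Chars<'_>>) -> Result<i32, String> {
    skip_whitespace(chars);
    let mut digits = String::new();
    while let Some(&c) = chars.peek() {
        if !c.is_ascii_digit() {
            break;
        }
        digits.push(c);
        chars.next();
    }
    digits
        .parse::<i32>()
        .map_err(|_| format!("expected an integer, found {:?}", digits))
}

/// Requirement 3: evaluate `INTEGER (+|-) INTEGER`.
fn eval(input: &str) -> Result<i32, String> {
    let mut chars = input.chars().peekable();
    let left = read_integer(&mut chars)?;
    skip_whitespace(&mut chars);
    let op = chars.next().ok_or("expected an operator")?;
    let right = read_integer(&mut chars)?;
    skip_whitespace(&mut chars);
    if chars.next().is_some() {
        return Err("unexpected trailing input".to_string());
    }
    match op {
        '+' => Ok(left + right),
        '-' => Ok(left - right),
        other => Err(format!("unexpected operator {:?}", other)),
    }
}

fn main() {
    for input in ["3+4", " 12 -  5 ", "7 -"] {
        println!("{:?} => {:?}", input, eval(input));
    }
}
```

The sketch skips the token layer entirely, which keeps it short; the implementation in the next section keeps the lexer and the parser separate, as the tutorial does.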

3 Implementation

3.1 interpreter/token.rs

// interpreter/token.rs

#[derive(Debug, PartialEq, Eq, Clone)]
pub enum TokenType {
    Integer,
    Plus,
    Minus,
    Eof,
}

#[derive(Debug, PartialEq, Eq, Clone)]
pub struct Token {
    pub token_type: TokenType,
    pub value: Option<String>,
}

impl Token {
    pub fn new(token_type: TokenType, value: Option<String>) -> Self {
        Self { token_type, value }
    }
}

3.2 interpreter/mod.rs

// interpreter/mod.rs
pub mod token;
pub use token::*;

#[derive(Debug, PartialEq, Eq, Clone)]
pub struct Interpreter {
    text: String,
    pos: usize,
    current_token: Option<Token>,
    current_char: Option<char>,
}

impl Interpreter {
    pub fn new() -> Self {
        Interpreter {
            text: String::new(),
            pos: 0,
            current_token: None,
            current_char: None,
        }
    }

    /// Aborts with a parse error; the `!` return type means it never returns.
    pub fn error(&self, message: &str) -> ! {
        panic!("Error parsing input: {}", message);
    }

    /// Moves to the next character. `nth` yields `None` once `pos` is past
    /// the last character, which marks the end of input.
    pub fn advance(&mut self) {
        self.pos += 1;
        self.current_char = self.text.chars().nth(self.pos);
    }

    /// Consumes a run of consecutive digits and returns it as one lexeme.
    pub fn integer(&mut self) -> String {
        let mut result = String::new();
        while let Some(c) = self.current_char {
            if c.is_digit(10) {
                result.push(c);
                self.advance();
            } else {
                break;
            }
        }
        result
    }

    /// Lexer: returns the next token from the input.
    pub fn get_next_token(&mut self) -> Token {
        // Skip whitespace
        while let Some(c) = self.current_char {
            if c.is_whitespace() {
                self.advance();
            } else {
                break;
            }
        }

        // End of input
        let c = match self.current_char {
            Some(c) => c,
            None => return Token::new(TokenType::Eof, None),
        };

        if c.is_digit(10) {
            return Token::new(TokenType::Integer, Some(self.integer()));
        }

        let token_type = match c {
            '+' => TokenType::Plus,
            '-' => TokenType::Minus,
            _ => self.error("Unexpected character"),
        };
        // `c` was captured before advancing, so the token stores the operator
        // itself rather than the character that follows it.
        self.advance();
        Token::new(token_type, Some(c.to_string()))
    }

    /// Checks that the current token has the expected type, then moves on
    /// to the next token.
    pub fn eat(&mut self, token_type: TokenType) {
        if let Some(ref current_token) = self.current_token {
            if current_token.token_type == token_type {
                self.current_token = Some(self.get_next_token());
            } else {
                self.error("Unexpected token");
            }
        } else {
            self.error("Unexpected end of input");
        }
    }

    /// Parser and evaluator for `INTEGER PLUS INTEGER` or `INTEGER MINUS INTEGER`.
    pub fn expr(&mut self, text: String) -> i32 {
        self.text = text;
        self.pos = 0;
        self.current_char = self.text.chars().nth(self.pos);
        self.current_token = Some(self.get_next_token());

        let left = self.current_token.clone().unwrap();
        self.eat(TokenType::Integer);

        let op = self.current_token.clone().unwrap();
        if op.token_type == TokenType::Plus {
            self.eat(TokenType::Plus);
        } else if op.token_type == TokenType::Minus {
            self.eat(TokenType::Minus);
        } else {
            self.error("Unexpected operator");
        }

        let right = self.current_token.clone().unwrap();
        self.eat(TokenType::Integer);

        if let (Some(left_val), Some(right_val)) = (left.value, right.value) {
            let left_int: i32 = left_val.parse().unwrap();
            let right_int: i32 = right_val.parse().unwrap();
            if op.token_type == TokenType::Minus {
                return left_int - right_int;
            } else {
                return left_int + right_int;
            }
        }
        self.error("Invalid expression")
    }
}
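One detail of the listing above is worth a note: `pos` is used as a character index with `chars().nth(pos)`, which walks the string from the start on every call, and `String::len()` counts bytes rather than characters. For short ASCII input neither matters. As an alternative, the sketch below collects the characters into a `Vec<char>` once so that lookups are O(1) and lengths are counted in characters. `Cursor` is a hypothetical type for illustration, not part of the interpreter above.

```rust
/// Hypothetical cursor over pre-collected characters (illustration only).
struct Cursor {
    chars: Vec<char>,
    pos: usize,
}

impl Cursor {
    fn new(text: &str) -> Self {
        Cursor {
            chars: text.chars().collect(),
            pos: 0,
        }
    }

    /// Current character, or `None` once the input is exhausted. O(1) lookup.
    fn current(&self) -> Option<char> {
        self.chars.get(self.pos).copied()
    }

    fn advance(&mut self) {
        self.pos += 1;
    }

    /// True when every character has been consumed.
    fn at_end(&self) -> bool {
        self.pos >= self.chars.len()
    }
}

fn main() {
    let mut cursor = Cursor::new("1 + 2");
    while let Some(c) = cursor.current() {
        print!("[{}]", c);
        cursor.advance();
    }
    println!();
    // Byte length and character count differ for non-ASCII input.
    println!("{} bytes, {} chars", "é".len(), "é".chars().count());
}
```

The trade-off is one extra allocation per input line in exchange for constant-time indexing.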


3.3 main.rs

// main.rs

pub mod interpreter;
use interpreter::Interpreter;
use std::io::Write;

fn main() {
    loop {
        let mut interpreter = Interpreter::new();
        let mut input = String::new();
        print!("calc> ");
        // Flush so the prompt is visible before blocking on stdin.
        std::io::stdout().flush().unwrap();
        // `read_line` returns Ok(0) at end of input (e.g. Ctrl-D): exit the loop.
        if std::io::stdin().read_line(&mut input).unwrap() == 0 {
            break;
        }
        let result = interpreter.expr(input.trim().to_string());
        println!("{}", result);
    }
}

http://localhost/2025/09/24/interpreter-2/

Author: Zero'F_Fa

Posted on: 2025-09-24

Updated on: 2025-10-21
