Python Recursion: Writing a Recursive Descent Parser

two circular pieces of braided material sit side by side on a white surface, one with black squares and the other has a red stripe

Note: this page has been created with the use of AI. Please take caution, and note that the content of this page does not necessarily reflect the opinion of Cratecode.

When dealing with math expressions, we often need a way to evaluate them and obtain the result. One effective and interesting approach is to use a recursive descent parser. In this guide, you'll learn how to write a recursive descent parser in Python to evaluate math expressions.

What is a Recursive Descent Parser?

A recursive descent parser is a top-down parser that starts with the highest level grammar rule and recursively applies lower-level grammar rules until it reaches the terminal symbols (tokens). It's called "recursive" because it uses recursive functions to traverse the grammar, and "descent" because it works from the top down.

To illustrate the concept, let's consider we have a simple grammar for arithmetic expressions:

expression -> term + expression | term - expression | term term -> factor * term | factor / term | factor factor -> ( expression ) | number

Here, expression, term, and factor are non-terminal symbols, while number and the operators are terminal symbols.

Writing a Recursive Descent Parser in Python

Let's build a recursive descent parser in Python step by step.

1. Tokenization

First, we need to tokenize the input math expression. Tokenization is the process of breaking down the input string into tokens, which are the smallest meaningful units in the expression.

import re def tokenize(expression): token_pattern = r"(\d+|[()+\-*/])" tokens = re.findall(token_pattern, expression) return tokens

2. Recursive Parsing Functions

We'll implement the parsing functions based on the grammar rules defined earlier. Each non-terminal symbol will have its own corresponding function.

def parse_factor(tokens): token = tokens.pop(0) if token == "(": result = parse_expression(tokens) tokens.pop(0) # Discard ")" return result else: return float(token) def parse_term(tokens): left = parse_factor(tokens) while tokens and tokens[0] in "*/": op = tokens.pop(0) right = parse_factor(tokens) if op == "*": left *= right else: left /= right return left def parse_expression(tokens): left = parse_term(tokens) while tokens and tokens[0] in "+-": op = tokens.pop(0) right = parse_term(tokens) if op == "+": left += right else: left -= right return left

3. Putting It All Together

Now, we can create a function called evaluate_expression that combines tokenization and parsing to evaluate the given math expression.

def evaluate_expression(expression): tokens = tokenize(expression) result = parse_expression(tokens) return result

4. Testing the Parser

Let's test our recursive descent parser with some sample expressions:

print(evaluate_expression("10+20-30")) # Output: 0.0 print(evaluate_expression("3*(4+5)")) # Output: 27.0 print(evaluate_expression("6/2+1")) # Output: 4.0

Congratulations! You've just created a recursive descent parser in Python to evaluate math expressions. You can now extend the grammar rules and parsing functions to handle more complex expressions or even other types of languages.

Similar Articles