How does working with ASTs relate to pattern matching? Well, a function to determine whether (to a reasonable approximation) an arbitrary AST node represents the symbol collections.deque might have looked something like this, before pattern matching…
import ast

# This obviously won't work if the symbol is imported with an alias
# in the source code we're inspecting
# (e.g. "from collections import deque as d").
# But let's not worry about that here :-)

def node_represents_collections_dot_deque(node: ast.AST) -> bool:
    """Determine if *node* represents 'deque' or 'collections.deque'"""
    return (
        isinstance(node, ast.Name) and node.id == "deque"
    ) or (
        isinstance(node, ast.Attribute)
        and isinstance(node.value, ast.Name)
        and node.value.id == "collections"
        and node.attr == "deque"
    )
But in Python 3.10, pattern matching allows an elegant destructuring syntax:
import ast

def node_represents_collections_dot_deque(node: ast.AST) -> bool:
    """Determine if *node* represents 'deque' or 'collections.deque'"""
    match node:
        case ast.Name("deque"):
            return True
        case ast.Attribute(ast.Name("collections"), "deque"):
            return True
        case _:
            return False
I know which one I prefer.
For some, though, this still isn’t enough – and Michael “Sully” Sullivan is one of them. At the Python Language Summit 2023, Sullivan shared ideas for where pattern matching could go next.
Sullivan’s contention is that, while pattern matching provides elegant syntactic sugar in simple cases such as the one above, our ability to chain destructurings using pattern matching is currently fairly limited. For example, say we want to write a function inspecting Python AST that takes an ast.FunctionDef node and identifies whether the node represents a synchronous function with exactly two parameters, both of them annotated as accepting integers. The function would behave so that the following holds true:
>>> import ast
>>> source = "def add_2(number1: int, number2: int): pass"
>>> node = ast.parse(source).body[0]
>>> type(node)
<class 'ast.FunctionDef'>
>>> is_function_taking_two_ints(node)
True
With pre-pattern-matching syntax, we might have written such a function like this:
def is_int(node: ast.AST | None) -> bool:
    """Determine if *node* represents 'int' or 'builtins.int'"""
    return (
        isinstance(node, ast.Name) and node.id == "int"
    ) or (
        isinstance(node, ast.Attribute)
        and isinstance(node.value, ast.Name)
        and node.value.id == "builtins"
        and node.attr == "int"
    )

def is_function_taking_two_ints(node: ast.FunctionDef) -> bool:
    """Determine if *node* represents a function that accepts two ints"""
    args = node.args.posonlyargs + node.args.args
    return len(args) == 2 and all(is_int(node.annotation) for node in args)
If we wanted to rewrite this using pattern matching, we could possibly do something like this:
def is_int(node: ast.AST | None) -> bool:
    """Determine if *node* represents 'int' or 'builtins.int'"""
    match node:
        case ast.Name("int"):
            return True
        case ast.Attribute(ast.Name("builtins"), "int"):
            return True
        case _:
            return False

def is_function_taking_two_ints(node: ast.FunctionDef) -> bool:
    """Determine if *node* represents a function that accepts two ints"""
    match node.args.posonlyargs + node.args.args:
        case [ast.arg(), ast.arg()] as arglist:
            return all(is_int(arg.annotation) for arg in arglist)
        case _:
            return False
That leaves a lot to be desired, however! The is_int() helper function can be rewritten in a much cleaner way. But integrating it into is_function_taking_two_ints() is… somewhat icky! The code feels harder to understand than before, whereas the goal of pattern matching is to improve readability.
Something like this, (ab)using metaclasses, gets us a lot closer to what it feels pattern matching should be like. By using __instancecheck__, one of Python’s hooks for customising isinstance() logic, it’s possible to rewrite our is_int() helper function as a class, meaning we can seamlessly integrate it into our is_function_taking_two_ints() function in a very expressive way:
import abc
import ast

class PatternMeta(abc.ABCMeta):
    def __instancecheck__(cls, inst: object) -> bool:
        return cls.match(inst)

class Pattern(metaclass=PatternMeta):
    """Abstract base class for types representing 'abstract patterns'"""

    @staticmethod
    @abc.abstractmethod
    def match(node) -> bool:
        """Subclasses must override this method"""
        raise NotImplementedError

class int_node(Pattern):
    """Class representing AST patterns signifying `int` or `builtins.int`"""

    @staticmethod
    def match(node) -> bool:
        match node:
            case ast.Name("int"):
                return True
            case ast.Attribute(ast.Name("builtins"), "int"):
                return True
            case _:
                return False

def is_function_taking_two_ints(node: ast.FunctionDef) -> bool:
    """Determine if *node* represents a function that accepts two ints"""
    match node.args.posonlyargs + node.args.args:
        case [
            ast.arg(annotation=int_node()),
            ast.arg(annotation=int_node()),
        ]:
            return True
        case _:
            return False
This is still hardly ideal, however – that’s a lot of boilerplate we’ve had to introduce to our helper function for identifying int annotations! And who wants to muck about with metaclasses?
[Figure: a slide from Sullivan’s talk]
A __match__ made in heaven?
Sullivan proposes that we make it easier to write helper functions for pattern matching, such as the example above, without having to resort to custom metaclasses. Two competing approaches were presented for discussion.
The first idea – a __match__ special method – is perhaps the easier of the two to immediately grasp, and appeared in early drafts of the pattern matching PEPs. (It was eventually removed from the PEPs in order to reduce the scope of the proposed changes to Python.) The proposal is that any class could define a __match__ method that could be used to customise how match statements apply to the class. Our is_function_taking_two_ints() case could be rewritten like so:
class int_node:
    """Class representing AST patterns signifying `int` or `builtins.int`"""

    # The __match__ method is understood by Python to be a static method,
    # even without the @staticmethod decorator,
    # similar to __new__ and __init_subclass__
    def __match__(node) -> ast.Name | ast.Attribute | None:
        match node:
            case ast.Name("int"):
                # Successful matches can return custom objects,
                # that can be bound to new variables by the caller
                return node
            case ast.Attribute(ast.Name("builtins"), "int"):
                return node
            case _:
                # Return `None` to indicate that there was no match
                return None

def is_function_taking_two_ints(node: ast.FunctionDef) -> bool:
    """Determine if *node* represents a function that accepts two ints"""
    match node.args.posonlyargs + node.args.args:
        case [
            ast.arg(annotation=int_node()),
            ast.arg(annotation=int_node()),
        ]:
            return True
        case _:
            return False
The second idea is more radical: the introduction of some kind of new syntax (perhaps reusing Python’s -> operator) that would allow Python coders to “apply” functions during pattern matching. With this proposal, we could rewrite is_function_taking_two_ints() like so:
def is_int(node: ast.AST | None) -> bool:
    """Determine if *node* represents 'int' or 'builtins.int'"""
    match node:
        case ast.Name("int"):
            return True
        case ast.Attribute(ast.Name("builtins"), "int"):
            return True
        case _:
            return False

def is_function_taking_two_ints(node: ast.FunctionDef) -> bool:
    """Determine if *node* represents a function that accepts two ints"""
    match node.args.posonlyargs + node.args.args:
        # Proposed syntax only: "is_int -> True" is not valid Python today
        case [
            ast.arg(annotation=is_int -> True),
            ast.arg(annotation=is_int -> True),
        ]:
            return True
        case _:
            return False
The reception in the room to Sullivan’s ideas was positive; the consensus seemed to be that there was clearly room for improvement in this area. Brandt Bucher, author of the original pattern matching implementation in Python 3.10, concurred that this kind of enhancement was needed. Łukasz Langa, meanwhile, said he’d received many queries from users of other programming languages such as C#, asking how to tackle this kind of problem.
The proposal for a __match__ special method follows a pattern common in Python’s data model, where double-underscore “dunder” methods are overridden to provide a class with special behaviour. As such, it will likely be less jarring, at first glance, to those new to the idea. Attendees of Sullivan’s talk seemed, broadly, to slightly prefer the __match__ proposal, and Sullivan himself said he thought it “looked prettier”.
Jelle Zijlstra argued that the __match__ dunder would provide an elegant symmetry between the construction and destruction of objects. Brandt Bucher, meanwhile, said he thought the usability improvements weren’t significant enough to merit new syntax.
Nonetheless, the alternative proposal for new syntax also has much to recommend it. Sullivan argued that having dedicated syntax to express the idea of “applying” a function during pattern matching was more explicit. Mark Shannon agreed, noting the similarity between this idea and features in the Haskell programming language. “This is functional programming,” Shannon argued. “It feels weird to apply OOP models to this.”
Addendum: pattern-matching resources and recipes
In the meantime, while we wait for a PEP, there are plenty of innovative uses of pattern matching springing up in the ecosystem. For further reading/watching/listening, I recommend:
That’s a lot of words which may or may not mean very much to you – but consider, for example, using the ast module to parse Python source code. If you’re unfamiliar with the ast module: the module provides tools that enable you to compile Python source code into an “abstract syntax tree” (AST) representing the code’s structure. The Python interpreter itself converts Python source code into an AST in order to understand how to run that code – but parsing Python source code using ASTs is also a common task for linters, such as plugins for flake8 or pylint. In the following example, ast.parse() is used to parse the source code x = 42 into an ast.Module node, and ast.dump() is then used to reveal the tree-like structure of that node in a human-readable form:
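A minimal sketch of such a session, written as a script rather than copied from the original article. The indent argument to ast.dump() is available from Python 3.9, and the exact text printed differs between Python versions, so no expected output is shown here.

```python
import ast

# Parse the source code into an ast.Module node
tree = ast.parse("x = 42")
print(type(tree))

# Pretty-print the nested structure of the tree
print(ast.dump(tree, indent=2))

# The single statement is an assignment whose value is the constant 42
assignment = tree.body[0]
print(type(assignment).__name__, assignment.value.value)
```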