PSyIR: the PSyclone Intermediate Representation

The PSyIR is at the heart of PSyclone. It represents both existing code and PSyKAl DSLs, at both the PSy- and kernel-layer levels. A PSyIR tree may be constructed from scratch (in Python) or by processing existing source code using a frontend. Transformations act on the PSyIR and, ultimately, the generated code is produced by one of the PSyIR’s backends.

PSyIR Nodes

The PSyIR consists of classes whose instances can be connected together to form a tree that represents computation in a syntax-independent way. These classes all inherit from the Node base class and, as a result, PSyIR instances are often referred to collectively as ‘PSyIR nodes’.

At present, PSyIR classes fall into two groups. Language-level nodes are supported by the PSyIR backends and can therefore be translated directly to code. Higher-level nodes are additional nodes that each domain can insert; they must implement a lower_to_language_level method that converts them to an equivalent representation using only language-level nodes, which then permits code to be generated for them.
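As a rough illustration of this split, the sketch below uses toy stand-in classes. Node, Statement and ZeroField are hypothetical and are not PSyclone's own classes; only the method name lower_to_language_level is taken from the text above. The higher-level node replaces itself with an equivalent language-level node, after which every node in the tree is one that a backend could emit.

```python
class Node:
    """Toy tree node (a stand-in, not psyclone.psyir.nodes.Node)."""

    def __init__(self, children=()):
        self.children = list(children)

    def lower_to_language_level(self):
        # A language-level node only has to lower its children.
        self.children = [child.lower_to_language_level()
                         for child in self.children]
        return self


class Statement(Node):
    """Language-level node: a backend could emit its text directly."""

    def __init__(self, text):
        super().__init__()
        self.text = text


class ZeroField(Node):
    """Hypothetical higher-level (domain) node: 'set a field to zero'."""

    def __init__(self, name):
        super().__init__()
        self.name = name

    def lower_to_language_level(self):
        # Replace this node with an equivalent language-level node.
        return Statement(f"{self.name}(:) = 0.0")


tree = Node([Statement("x = 1"), ZeroField("f")])
tree.lower_to_language_level()
print([child.text for child in tree.children])
```

After lowering, the tree contains only Statement instances, which is the property a backend relies on.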

The rest of this document describes only the language-level nodes, but as all nodes inherit from the same base classes, the methods described here are applicable to all PSyIR nodes.

Available language-level nodes

Text Representation

When developing a transformation script it is often necessary to examine the structure of the PSyIR. All nodes in the PSyIR have the view method that provides a text-representation of that node and all of its descendants. If the termcolor package is installed (see Getting Going) then colour highlighting is used as part of the output string. For instance, part of the Schedule constructed for the second NEMO example is rendered as:

[Image: _images/schedule_with_indices.png]

Note that in this view, only those nodes which are children of Schedules have their indices shown. This means that nodes representing e.g. loop bounds or the conditional part of if statements are not indexed. For the example shown, the PSyIR node representing the if(l_hst) code would be reached by schedule.children[14].if_body.children[1] or, using the shorthand notation (see below), schedule[14].if_body[1] where schedule is the overall parent Schedule node (omitted from the above image).
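The indexing rule described above can be mimicked with a toy tree. The classes below are simplified stand-ins, not PSyclone's implementation, and the output format is invented for this sketch. It only shows that indices are attached to the children of a Schedule and to nothing else.

```python
class Node:
    """Toy tree node; a stand-in, not psyclone.psyir.nodes.Node."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def view(self, depth=0, index=None):
        """Return an indented text rendering of this node and its descendants."""
        label = self.name if index is None else f"{index}: {self.name}"
        lines = ["    " * depth + label]
        for idx, child in enumerate(self.children):
            # Only the children of a Schedule are shown with their index.
            child_index = idx if isinstance(self, Schedule) else None
            lines.append(child.view(depth + 1, child_index))
        return "\n".join(lines)


class Schedule(Node):
    """Toy container for a sequence of statements."""


schedule = Schedule("Schedule", [
    Node("Assignment", [Node("Reference[a]"), Node("Literal[1]")]),
    Node("If", [Node("Reference[l_hst]"),
                Schedule("Schedule", [Node("Call")])]),
])
print(schedule.view())
```

In the printed tree, the Assignment, If and Call lines carry an index because their parents are Schedules, whereas the condition Reference[l_hst] does not.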

One problem with the view method is that its output can become very large for big ASTs and is hard to read for users unfamiliar with the PSyIR. An alternative is the debug_string method, which generates a text representation with Fortran-like syntax. Higher-level constructs that have not yet been lowered to language level are embedded in this output as < node > expressions.

Tree Navigation

Each PSyIR node provides several ways to navigate the AST. These can be categorised as homogeneous navigation methods (available in all nodes) and heterogeneous, or semantic, navigation methods (which differ depending on the node type). The homogeneous methods must be used for generic code navigation that should work regardless of context. However, when the context is known, we recommend using the semantic methods because they make the code more readable.

The homogeneous navigation methods are:

Node.children()
Returns:

the immediate children of this Node.

Return type:

List[psyclone.psyir.nodes.Node]

Node.siblings()
Returns:

list of sibling nodes, including self.

Return type:

List[psyclone.psyir.nodes.Node]

Node.parent()
Returns:

the parent node.

Return type:

psyclone.psyir.nodes.Node or NoneType

Node.root()
Returns:

the root node of the PSyIR tree.

Return type:

psyclone.psyir.nodes.Node

Node.walk()[source]

Recurse through the PSyIR tree and return all objects that are an instance of ‘my_type’, which is either a single class or a tuple of classes. In the latter case all nodes are returned that are instances of any classes in the tuple. The recursion into the tree is stopped if an instance of ‘stop_type’ (which is either a single class or a tuple of classes) is found. This can be used to avoid analysing e.g. inlined kernels, or as performance optimisation to reduce the number of recursive calls. The recursion into the tree is also stopped if the (optional) ‘depth’ level is reached.

Parameters:
  • my_type (type | Tuple[type, ...]) – the class(es) for which the instances are collected.

  • stop_type (Optional[type | Tuple[type, ...]]) – class(es) at which recursion is halted (optional).

  • depth (Optional[int]) – the depth value the instances must have (optional).

Returns:

list with all nodes that are instances of my_type starting at and including this node.

Return type:

List[psyclone.psyir.nodes.Node]
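The effect of my_type and stop_type can be seen in a small model of the traversal. This is a sketch with toy classes, not PSyclone's source; the optional depth argument is left out for brevity and the class names Loop, Kernel and Assignment are used only as labels.

```python
class Node:
    """Toy tree node; a stand-in, not psyclone.psyir.nodes.Node."""

    def __init__(self, children=()):
        self.children = list(children)

    def walk(self, my_type, stop_type=None):
        """Collect this node and its descendants that match my_type."""
        found = [self] if isinstance(self, my_type) else []
        # Do not descend below a node that matches stop_type.
        if stop_type and isinstance(self, stop_type):
            return found
        for child in self.children:
            found += child.walk(my_type, stop_type)
        return found


class Loop(Node):
    pass


class Kernel(Node):
    pass


class Assignment(Node):
    pass


inner = Assignment()
kern = Kernel([inner])
outer = Assignment()
root = Loop([outer, kern])

print(len(root.walk(Assignment)))                    # 2
print(len(root.walk(Assignment, stop_type=Kernel)))  # 1
print(len(root.walk((Loop, Kernel))))                # 2
```

With stop_type=Kernel the assignment inside the kernel is never visited, which is the "avoid analysing inlined kernels" use described above.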

Node.get_sibling_lists()[source]

Recurse through the PSyIR tree and return lists of Nodes that are instances of ‘my_type’ and are immediate siblings. Here ‘my_type’ is either a single class or a tuple of classes. In the latter case all nodes are returned that are instances of any classes in the tuple. The recursion into the tree is stopped if an instance of ‘stop_type’ (which is either a single class or a tuple of classes) is found.

Parameters:
  • my_type (type | Tuple[type, ...]) – the class(es) for which the instances are collected.

  • stop_type (Optional[type | Tuple[type, ...]]) – class(es) at which recursion is halted (optional).

Returns:

list of lists, each of which containing nodes that are instances of my_type and are immediate siblings, starting at and including this node.

Return type:

List[List[psyclone.psyir.nodes.Node]]

Node.ancestor()[source]

Search back up the tree and check whether this node has an ancestor that is an instance of the supplied type. If it does then we return it otherwise we return None. An individual (or tuple of) (sub-) class(es) to ignore may be provided via the excluding argument. If include_self is True then the current node is included in the search. If limit is provided then the search ceases if/when the supplied node is encountered. If shared_with is provided, then the ancestor search will find an ancestor of both this node and the node provided as shared_with if such an ancestor exists.

Parameters:
  • my_type (type | Tuple[type, ...]) – class(es) to search for.

  • excluding (Optional[type | Tuple[type, ...]]) – (sub-)class(es) to ignore or None.

  • include_self (bool) – whether or not to include this node in the search.

  • limit (Optional[psyclone.psyir.nodes.Node]) – an optional node at which to stop the search.

  • shared_with (Optional[psyclone.psyir.nodes.Node]) – an optional node which must also have the found node as an ancestor.

Returns:

First ancestor Node that is an instance of any of the requested classes or None if not found.

Return type:

Optional[psyclone.psyir.nodes.Node]

Raises:
  • TypeError – if excluding is provided but is not a type or tuple of types.

  • TypeError – if limit is provided but is not an instance of Node.
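The interplay of include_self and limit can be sketched with a toy model of the upward search. The classes are stand-ins rather than PSyclone's own, and the excluding and shared_with arguments are omitted to keep the example short.

```python
class Node:
    """Toy tree node; a stand-in, not psyclone.psyir.nodes.Node."""

    def __init__(self, children=()):
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self

    def ancestor(self, my_type, include_self=False, limit=None):
        """Return the closest ancestor of type my_type, or None."""
        node = self if include_self else self.parent
        while node is not None:
            if isinstance(node, my_type):
                return node
            if node is limit:
                # The search ceases once the limit node has been examined.
                break
            node = node.parent
        return None


class Loop(Node):
    pass


class Schedule(Node):
    pass


class Assignment(Node):
    pass


stmt = Assignment()
inner = Loop([stmt])
sched = Schedule([inner])
outer = Loop([sched])

print(stmt.ancestor(Loop) is inner)                      # True
print(inner.ancestor(Loop) is outer)                     # True
print(inner.ancestor(Loop, include_self=True) is inner)  # True
print(stmt.ancestor(Schedule, limit=inner))              # None
```

The last call returns None because the search stops at inner before it reaches the Schedule.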

Node.scope()

Some nodes (e.g. Schedule and Container) allow symbols to be scoped via an attached symbol table. This property returns the closest ScopingNode node including self.

Returns:

the closest ancestor ScopingNode node.

Return type:

psyclone.psyir.nodes.ScopingNode

Raises:

SymbolError – if there is no ScopingNode ancestor.

Node.path_from()[source]

Finds the path in the PSyIR tree between ancestor and this node and returns a list containing the path.

The result of this method can be used to find the node from its ancestor, for example:

>>> index_list = node.path_from(ancestor)
>>> cursor = ancestor
>>> for index in index_list:
...     cursor = cursor.children[index]
>>> assert cursor is node

Parameters:

ancestor (psyclone.psyir.nodes.Node) – an ancestor node of self to find the path from.

Raises:

ValueError – if ancestor is not an ancestor of self.

Returns:

a list of child indices representing the path between ancestor and self.

Return type:

List[int]
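One plausible way to compute such a path is to climb from the node to the ancestor, recording the child index at each step, and then reverse the result. The following is a sketch using a toy Node class, not PSyclone's implementation.

```python
class Node:
    """Toy tree node; a stand-in, not psyclone.psyir.nodes.Node."""

    def __init__(self, children=()):
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self

    def path_from(self, ancestor):
        """Return the child indices leading from ancestor down to self."""
        path = []
        node = self
        while node is not ancestor:
            if node.parent is None:
                raise ValueError("the supplied node is not an ancestor")
            # Record where this node sits among its siblings.
            path.append(next(idx for idx, child
                             in enumerate(node.parent.children)
                             if child is node))
            node = node.parent
        path.reverse()
        return path


leaf = Node()
middle = Node([Node(), leaf])
root = Node([Node(), Node(), middle])

index_list = leaf.path_from(root)
print(index_list)  # [2, 1]

# Follow the indices back down from the ancestor.
cursor = root
for index in index_list:
    cursor = cursor.children[index]
print(cursor is leaf)  # True
```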

In addition to the navigation methods, nodes also have homogeneous methods to interrogate their location and surrounding nodes.

Node.immediately_precedes()[source]
Parameters:

node – the node to compare it to.

Returns:

whether this node immediately precedes the given node.

Return type:

bool

Node.immediately_follows()[source]
Parameters:

node – the node to compare it to.

Returns:

whether this node immediately follows the given node.

Return type:

bool

Node.position()

Find a Node’s position relative to its parent Node (starting with 0 if it does not have a parent).

Returns:

relative position of a Node to its parent

Return type:

int

Node.abs_position()

Find a Node’s absolute position in the tree (starting with 0 if it is the root). Needs to be computed dynamically from the starting position (0) as its position may change.

Returns:

absolute position of a Node in the tree.

Return type:

int

Raises:

InternalError – if the absolute position cannot be found.

Node.sameParent()[source]
Returns:

True if node_2 has the same parent as this node, False otherwise.

Return type:

bool
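The difference between the relative and absolute positions is easiest to see on a small tree. The sketch below uses a toy Node class and assumes that the absolute position is the index of the node in a depth-first, pre-order walk from the root; that ordering is an assumption of this example rather than something stated above.

```python
class Node:
    """Toy tree node; a stand-in, not psyclone.psyir.nodes.Node."""

    def __init__(self, children=()):
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self

    def preorder(self):
        """Yield this node and then all of its descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.preorder()

    @property
    def position(self):
        """Index of this node among its parent's children (0 for the root)."""
        if self.parent is None:
            return 0
        return next(idx for idx, child in enumerate(self.parent.children)
                    if child is self)

    @property
    def abs_position(self):
        """Index of this node in a pre-order walk starting at the root."""
        root = self
        while root.parent is not None:
            root = root.parent
        # Recomputed on every call because the tree may have changed.
        return next(idx for idx, node in enumerate(root.preorder())
                    if node is self)


a1, a2, b = Node(), Node(), Node()
a = Node([a1, a2])
root = Node([a, b])

print(a2.position, a2.abs_position)  # 1 3
print(b.position, b.abs_position)    # 1 4
print(a1.parent is a2.parent)        # True
```

Both a2 and b are the second child of their parent, so they share a relative position but differ in absolute position.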

The semantic navigation methods are:

  • Schedule:

    subscript operator for indexing the statements (children) inside the Schedule, e.g. sched[3] or sched[2:4].

  • Assignment:
    Assignment.lhs()
    Returns:

    the child node representing the Left-Hand Side of the assignment.

    Return type:

    psyclone.psyir.nodes.Node

    Raises:

    InternalError – Node has fewer children than expected.

    Assignment.rhs()
    Returns:

    the child node representing the Right-Hand Side of the assignment.

    Return type:

    psyclone.psyir.nodes.Node

    Raises:

    InternalError – Node has fewer children than expected.

  • IfBlock:
    IfBlock.condition()

    Return the PSyIR Node representing the conditional expression of this IfBlock.

    Returns:

    IfBlock conditional expression.

    Return type:

    psyclone.psyir.nodes.Node

    Raises:

    InternalError – If the IfBlock node does not have the correct number of children.

    IfBlock.if_body()

    Return the Schedule executed when the IfBlock evaluates to True.

    Returns:

    Schedule to be executed when IfBlock evaluates to True.

    Return type:

    psyclone.psyir.nodes.Schedule

    Raises:

    InternalError – If the IfBlock node does not have the correct number of children.

    IfBlock.else_body()

    If available return the Schedule executed when the IfBlock evaluates to False, otherwise return None.

    Returns:

    Schedule to be executed when IfBlock evaluates to False, if it doesn’t exist returns None.

    Return type:

    psyclone.psyir.nodes.Schedule or NoneType

  • Loop:
    Loop.loop_body()
    Returns:

    the PSyIR Schedule with the loop body statements.

    Return type:

    psyclone.psyir.nodes.Schedule

  • WhileLoop:
    WhileLoop.condition()

    Return the PSyIR Node representing the conditional expression of this WhileLoop.

    Returns:

    WhileLoop conditional expression.

    Return type:

    psyclone.psyir.nodes.Node

    Raises:

    InternalError – If the WhileLoop node does not have the correct number of children.

    WhileLoop.loop_body()

    Return the Schedule executed when the WhileLoop condition is True.

    Returns:

    Schedule to be executed when WhileLoop condition is True.

    Return type:

    psyclone.psyir.nodes.Schedule

    Raises:

    InternalError – If the WhileLoop node does not have the correct number of children.

  • Array nodes (e.g. ArrayReference, ArrayOfStructuresReference):
    ArrayReference.indices()

    Supports semantic-navigation by returning the list of nodes representing the index expressions for this array reference.

    Returns:

    the PSyIR nodes representing the array-index expressions.

    Return type:

    list of psyclone.psyir.nodes.Node

    Raises:

    InternalError – if this node has no children or if they are not valid array-index expressions.

  • RegionDirective:
    RegionDirective.dir_body()
    Returns:

    the Schedule associated with this directive.

    Return type:

    psyclone.psyir.nodes.Schedule

    Raises:

    InternalError – if this node does not have a Schedule as its first child.

    RegionDirective.clauses()
    Returns:

    the Clauses associated with this directive.

    Return type:

    List of psyclone.psyir.nodes.Clause

  • Nodes representing accesses of data within a structure (e.g. StructureReference, StructureMember):
    StructureReference.member()
    Returns:

    the PSyIR child representing the accessor component.

    Return type:

    psyclone.psyir.nodes.Member

    Raises:

    InternalError – if the first child of this node is not an instance of Member.
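The semantic accessors can be thought of as named views over the generic children list. The sketch below uses toy classes, and the child ordering (condition, if-body, optional else-body) is an assumption made for the example. It shows why schedule[0].if_body[1] is easier to read than the equivalent chain of children lookups.

```python
class Node:
    """Toy tree node; a stand-in, not psyclone.psyir.nodes.Node."""

    def __init__(self, children=()):
        self.children = list(children)


class Schedule(Node):
    def __getitem__(self, index):
        # sched[3] and sched[2:4] index the statements directly.
        return self.children[index]


class IfBlock(Node):
    @property
    def condition(self):
        return self.children[0]

    @property
    def if_body(self):
        return self.children[1]

    @property
    def else_body(self):
        # The else branch is optional.
        return self.children[2] if len(self.children) == 3 else None


first, second = Node(), Node()
ifblock = IfBlock([Node(), Schedule([first, second])])
schedule = Schedule([ifblock])

# Semantic navigation and homogeneous navigation reach the same node.
print(schedule[0].if_body[1] is second)                        # True
print(schedule.children[0].children[1].children[1] is second)  # True
print(ifblock.else_body)                                       # None
```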

DataTypes

The PSyIR supports the following datatypes: ScalarType, ArrayType, StructureType, UnresolvedType, UnsupportedType and NoType. These datatypes are used when creating instances of DataSymbol, RoutineSymbol and Literal (although note that NoType may only be used with a RoutineSymbol). UnresolvedType and UnsupportedType are both used when processing existing code. The former is used when a symbol is being imported from some other scope (e.g. via a USE statement in Fortran) that hasn’t yet been resolved and the latter is used when an unsupported form of declaration is encountered.

More information on each of these various datatypes is given in the following subsections.

Scalar DataType

A Scalar datatype consists of an intrinsic and a precision.

The intrinsic can be one of INTEGER, REAL, BOOLEAN and CHARACTER.

The precision can be UNDEFINED, SINGLE, DOUBLE, an integer value specifying the precision in bytes, or a DataSymbol (see Section Symbols and Symbol Tables) that contains precision information. Note that UNDEFINED, SINGLE and DOUBLE allow the precision to be set by the system, so it may differ between architectures. For example:

>>> char_type = ScalarType(ScalarType.Intrinsic.CHARACTER,
...                        ScalarType.Precision.UNDEFINED)
>>> int_type = ScalarType(ScalarType.Intrinsic.INTEGER,
...                       ScalarType.Precision.SINGLE)
>>> bool_type = ScalarType(ScalarType.Intrinsic.BOOLEAN, 4)
>>> symbol = DataSymbol("rdef", int_type, initial_value=4)
>>> scalar_type = ScalarType(ScalarType.Intrinsic.REAL, symbol)

For convenience PSyclone predefines a number of scalar datatypes:

  • REAL_TYPE, INTEGER_TYPE, BOOLEAN_TYPE and CHARACTER_TYPE, all of which have precision set to UNDEFINED;

  • REAL_SINGLE_TYPE, REAL_DOUBLE_TYPE, INTEGER_SINGLE_TYPE and INTEGER_DOUBLE_TYPE;

  • REAL4_TYPE, REAL8_TYPE, INTEGER4_TYPE and INTEGER8_TYPE.

Array DataType

An Array datatype itself has another datatype (or DataTypeSymbol) specifying the type of its elements and a shape. The shape can have an arbitrary number of dimensions. Each dimension captures what is known about its extent. It is necessary to distinguish between four cases:

| Description | Entry in shape list |
|---|---|
| An array has a static extent known at compile time. | ArrayType.ArrayBounds containing integer Literal values |
| An array has an extent defined by another symbol or (constant) PSyIR expression. | ArrayType.ArrayBounds containing Reference or Operation nodes |
| An array has a definite extent which is not known at compile time but can be queried at runtime. | ArrayType.Extent.ATTRIBUTE |
| It is not known whether an array has memory allocated to it in the current scoping unit. | ArrayType.Extent.DEFERRED |

where ArrayType.ArrayBounds is a namedtuple with lower and upper members holding the lower- and upper-bounds of the extent of a given array dimension.

The distinction between the last two cases is as follows. In the former, the extents are known but are kept internally with the array (for example an assumed-shape array in Fortran). In the latter, the array has not yet been allocated any memory (for example the declaration of an allocatable array in Fortran), so the extents may not have been defined yet.
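The four kinds of shape entry can be told apart programmatically. The sketch below is a toy model only: ArrayBounds and Extent are local stand-ins for the PSyclone classes of the same name, plain integers stand in for Literal nodes and strings stand in for Reference or Operation nodes. The four entries are placed in one list purely for comparison.

```python
from collections import namedtuple
from enum import Enum

# Local stand-ins for ArrayType.ArrayBounds and ArrayType.Extent.
ArrayBounds = namedtuple("ArrayBounds", ["lower", "upper"])


class Extent(Enum):
    DEFERRED = 1
    ATTRIBUTE = 2


def describe(dim):
    """Say what is known about the extent of one array dimension."""
    if dim is Extent.ATTRIBUTE:
        return "extent known only at runtime"
    if dim is Extent.DEFERRED:
        return "possibly not yet allocated"
    if isinstance(dim.lower, int) and isinstance(dim.upper, int):
        return f"static extent of {dim.upper - dim.lower + 1}"
    return f"extent defined by {dim.lower}:{dim.upper}"


shape = [ArrayBounds(1, 10), ArrayBounds(1, "n"),
         Extent.ATTRIBUTE, Extent.DEFERRED]
for dim in shape:
    print(describe(dim))
```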

For example:

>>> array_type = ArrayType(REAL4_TYPE, [5, 10])

>>> n_var = DataSymbol("n", INTEGER_TYPE)
>>> array_type = ArrayType(INTEGER_TYPE, [Reference(n_var),
...                                       Reference(n_var)])

>>> array_type = ArrayType(REAL8_TYPE, [ArrayType.Extent.ATTRIBUTE,
...                                     ArrayType.Extent.ATTRIBUTE])

>>> array_type = ArrayType(BOOLEAN_TYPE, [ArrayType.Extent.DEFERRED])

Structure DataType

A Structure datatype consists of a dictionary of components where the name of each component is used as the corresponding key. Each component is stored as a named tuple with name, datatype and visibility members.

For example:

# Shorthand for a scalar type with REAL_KIND precision
SCALAR_TYPE = ScalarType(ScalarType.Intrinsic.REAL, REAL_KIND)

# Structure-type definition
GRID_TYPE = StructureType.create([
    ("dx", SCALAR_TYPE, Symbol.Visibility.PUBLIC),
    ("dy", SCALAR_TYPE, Symbol.Visibility.PUBLIC)])

GRID_TYPE_SYMBOL = DataTypeSymbol("grid_type", GRID_TYPE)

# A structure-type containing other structure types
FIELD_TYPE_DEF = StructureType.create(
    [("data", ArrayType(SCALAR_TYPE, [10]), Symbol.Visibility.PUBLIC),
     ("grid", GRID_TYPE_SYMBOL, Symbol.Visibility.PUBLIC),
     ("sub_meshes", ArrayType(GRID_TYPE_SYMBOL, [3]),
      Symbol.Visibility.PUBLIC),
     ("flag", INTEGER4_TYPE, Symbol.Visibility.PUBLIC)])

Unsupported DataType

If a PSyIR frontend encounters an unsupported declaration then the corresponding Symbol is given UnsupportedType. The text of the original declaration is stored in the type object and is available via the declaration property.

NoType

NoType represents the empty type, equivalent to void in C. It is currently only used to describe a RoutineSymbol that has no return type (such as a Fortran subroutine).

Symbols and Symbol Tables

Some PSyIR nodes have an associated Symbol Table (psyclone.psyir.symbols.SymbolTable) which keeps a record of the Symbols (psyclone.psyir.symbols.Symbol) specified and used within them.

Symbol Tables can be nested (i.e. a node with an attached symbol table can be an ancestor or descendant of another node with an attached symbol table). If the same symbol name is used in a hierarchy of symbol tables then the symbol within the symbol table attached to the closest ancestor node is in scope. By default, symbol tables are aware of other symbol tables and will return information about relevant symbols from all symbol tables.
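The "closest ancestor wins" rule can be illustrated with a minimal model of nested scopes. The Scope class below is hypothetical and much simpler than PSyclone's SymbolTable; in particular its add method only checks the local scope, which is a simplification made for this sketch.

```python
class Scope:
    """Toy nested scope; a stand-in, not psyclone's SymbolTable."""

    def __init__(self, parent=None):
        self.parent = parent
        self.symbols = {}

    def add(self, name, value):
        """Add a symbol to this scope only (simplified check)."""
        if name in self.symbols:
            raise KeyError(f"'{name}' is already in this scope")
        self.symbols[name] = value

    def lookup(self, name):
        """Search this scope first, then each enclosing scope in turn."""
        scope = self
        while scope is not None:
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        raise KeyError(f"'{name}' not found in any scope")


module = Scope()
routine = Scope(parent=module)
module.add("n", "module-level n")
module.add("dx", "module-level dx")
routine.add("n", "routine-level n")

print(routine.lookup("n"))   # routine-level n
print(routine.lookup("dx"))  # module-level dx
print(module.lookup("n"))    # module-level n
```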

The SymbolTable has the following interface:

class psyclone.psyir.symbols.SymbolTable(node=None, default_visibility=Visibility.PUBLIC)[source]

Encapsulates the symbol table and provides methods to add new symbols and look up existing symbols. Nested scopes are supported and, by default, the add and lookup methods take any ancestor symbol tables into consideration (ones attached to nodes that are ancestors of the node that this symbol table is attached to). If the default visibility is not specified then it defaults to Symbol.Visibility.PUBLIC.

Parameters:
  • node (Optional[psyclone.psyir.nodes.Schedule | psyclone.psyir.nodes.Container]) – reference to the Schedule or Container to which this symbol table belongs.

  • default_visibility – optional default visibility value for this symbol table, if not provided it defaults to PUBLIC visibility.

Raises:

TypeError – if node argument is not a Schedule or a Container.

Where each element is a Symbol with an immutable name:

class psyclone.psyir.symbols.Symbol(name, visibility=Visibility.PUBLIC, interface=None)[source]

Generic Symbol item for the Symbol Table and PSyIR References. It has an immutable name label because it must always match with the key in the SymbolTable. If the symbol is private then it is only visible to those nodes that are descendants of the Node to which its containing Symbol Table belongs.

Parameters:
  • name (str) – name of the symbol.

  • visibility (psyclone.psyir.symbols.Symbol.Visibility) – the visibility of the symbol.

  • interface (Optional[ psyclone.psyir.symbols.symbol.SymbolInterface]) – optional object describing the interface to this symbol (i.e. whether it is passed as a routine argument or accessed in some other way). Defaults to psyclone.psyir.symbols.AutomaticInterface

Raises:

TypeError – if the name is not a str.

There are several Symbol sub-classes to represent different labeled entities in the PSyIR. At the moment the available symbols are:

  • class psyclone.psyir.symbols.ContainerSymbol(name, **kwargs)[source]

    Symbol that represents a reference to a Container. The reference is lazily evaluated: the Symbol is created without parsing and importing the referenced container, which can instead be imported when needed.

    Parameters:
    • name (str) – name of the symbol.

    • wildcard_import (bool) – if all public Symbols of the Container are imported into the current scope. Defaults to False.

    • is_intrinsic (bool) – if the module is an intrinsic import. Defaults to False.

    • kwargs (unwrapped dict.) – additional keyword arguments provided by psyclone.psyir.symbols.Symbol.

  • class psyclone.psyir.symbols.DataSymbol(name, datatype, is_constant=False, initial_value=None, **kwargs)[source]

    Symbol identifying a data element. It contains information about: the datatype, the shape (in column-major order) and the interface to that symbol (i.e. Local, Global, Argument).

    Parameters:
    • name (str) – name of the symbol.

    • datatype (psyclone.psyir.symbols.DataType) – data type of the symbol.

    • is_constant (bool) – whether this DataSymbol is a compile-time constant (default is False). If True then an initial_value must also be provided.

    • initial_value (Optional[item of TYPE_MAP_TO_PYTHON | psyclone.psyir.nodes.Node]) – sets a fixed known expression as an initial value for this DataSymbol. If is_constant is True then this Symbol will always have this value. If the value is None then this symbol does not have an initial value (and cannot be a constant). Otherwise it can receive PSyIR expressions or Python intrinsic types available in the TYPE_MAP_TO_PYTHON map. By default it is None.

    • kwargs (unwrapped dict.) – additional keyword arguments provided by psyclone.psyir.symbols.TypedSymbol

  • class psyclone.psyir.symbols.DataTypeSymbol(name, datatype, visibility=Visibility.PUBLIC, interface=None)[source]

    Symbol identifying a user-defined type (e.g. a derived type in Fortran).

    Parameters:
    • name (str) – the name of this symbol.

    • datatype (psyclone.psyir.symbols.DataType) – the type represented by this symbol.

    • visibility (psyclone.psyir.symbols.Symbol.Visibility) – the visibility of this symbol.

    • interface (psyclone.psyir.symbols.SymbolInterface) – the interface to this symbol.

  • class psyclone.psyir.symbols.IntrinsicSymbol(name, intrinsic, **kwargs)[source]

    Symbol identifying a callable intrinsic routine.

    Parameters:
    • name (str) – name of the symbol.

    • intrinsic (psyclone.psyir.nodes.IntrinsicCall.Intrinsic) – the intrinsic enum describing this Symbol.

    • kwargs (unwrapped dict.) – additional keyword arguments provided by psyclone.psyir.symbols.TypedSymbol

    TODO #2541: currently the name and the intrinsic should match. Really only the name is needed, with all of the Intrinsic signature information living inside the IntrinsicSymbol class.

  • class psyclone.psyir.symbols.RoutineSymbol(name, datatype=None, **kwargs)[source]

    Symbol identifying a callable routine.

    Parameters:
    • name (str) – name of the symbol.

    • datatype (psyclone.psyir.symbols.DataType) – data type of the symbol. Defaults to NoType().

    • kwargs (unwrapped dict.) – additional keyword arguments provided by psyclone.psyir.symbols.TypedSymbol

  • class psyclone.psyir.symbols.GenericInterfaceSymbol(name, routines, **kwargs)[source]

    Symbol identifying a generic interface that maps to a number of different callable routines.

    Parameters:
    • name (str) – name of the interface.

    • routines (list[tuple[ psyclone.psyir.symbols.RoutineSymbol, bool]]) – the routines that this interface provides access to.

    • kwargs (unwrapped dict.) – additional keyword arguments provided by psyclone.psyir.symbols.TypedSymbol

See the reference guide for the full API documentation of the SymbolTable and the Symbol types.

Symbol Interfaces

Each symbol has a Symbol Interface with the information about how the variable data is provided into the local context. The currently available Interfaces are:

  • class psyclone.psyir.symbols.AutomaticInterface[source]

    The symbol is declared without attributes. Its data will live during the local context.

  • class psyclone.psyir.symbols.DefaultModuleInterface[source]

    The symbol contains data declared in a module scope without additional attributes.

  • class psyclone.psyir.symbols.ImportInterface(container_symbol, orig_name=None)[source]

    Describes the interface to a Symbol that is imported from an external PSyIR container. The symbol can be renamed on import and, if so, its original name in the Container is specified using the optional ‘orig_name’ argument.

    Parameters:
    • container_symbol (psyclone.psyir.symbols.ContainerSymbol) – symbol representing the external container from which the symbol is imported.

    • orig_name (Optional[str]) – the name of the symbol in the external container before it is renamed, or None (the default) if it is not renamed.

    Raises:

    TypeError – if the orig_name argument is an unexpected type.

  • class psyclone.psyir.symbols.ArgumentInterface(access=None)[source]

    Captures the interface to a Symbol that is accessed as a routine argument.

    Parameters:

    access (psyclone.psyir.symbols.ArgumentInterface.Access) – specifies how the argument is used in the Schedule

  • class psyclone.psyir.symbols.StaticInterface[source]

    The symbol contains data that is kept alive through the execution of the program.

  • class psyclone.psyir.symbols.CommonBlockInterface[source]

    A symbol that is declared in the local scope but acts as a global that can be accessed by any scope referencing the same CommonBlock name.

  • class psyclone.psyir.symbols.UnresolvedInterface[source]

    We have a symbol but we don’t know where it is declared.

  • class psyclone.psyir.symbols.UnknownInterface[source]

    We have a symbol with a declaration but PSyclone does not support its attributes.

  • class psyclone.psyir.symbols.PreprocessorInterface[source]

    The symbol exists in the file through compiler macros or preprocessor directives.

    Note that this is different from UnresolvedInterface because the backend will not check whether there are import statements that could bring the symbol into scope.

Creating PSyIR

Symbol names

PSyIR symbol names can be specified by a user. For example:

var_name = "my_name"
symbol_table = SymbolTable()
data = DataSymbol(var_name, REAL_TYPE)
symbol_table.add(data)
reference = Reference(data)

However, the SymbolTable add() method will raise an exception if a user tries to add a symbol with the same name as a symbol already existing in the symbol table.

Alternatively, the SymbolTable also provides the new_symbol() method (see Section Symbols and Symbol Tables for more details) that creates a symbol with a new name that is distinct from any existing names in the symbol table. By default the generated name is the value of the PSYIR_ROOT_NAME variable specified in the DEFAULT section of the PSyclone config file, followed by an optional “_” and an integer. For example, the following code:

from psyclone.psyir.symbols import SymbolTable
symbol_table = SymbolTable()
for i in range(0, 3):
    var_name = symbol_table.new_symbol().name
    print(var_name)

gives the following output:

psyir_tmp
psyir_tmp_0
psyir_tmp_1

As the root name (psyir_tmp in the example above) is specified in PSyclone’s config file it can be set to whatever the user wants.

Note

The particular format used to create a unique name is the responsibility of the SymbolTable class and may change in the future.

A user might want to create a name that has some meaning in the context in which it is used e.g. idx for an index, i for an iterator, or temp for a temperature field. To support more readable names, the new_symbol() method allows the user to specify a root name as an argument to the method which then takes the place of the default root name. For example, the following code:

from psyclone.psyir.symbols import SymbolTable
symbol_table = SymbolTable()
for i in range(0, 3):
    var_name = symbol_table.new_symbol(root_name="something").name
    print(var_name)

gives the following output:

something
something_0
something_1
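The naming pattern seen in these outputs can be reproduced with a few lines of plain Python. This is a sketch of one scheme that yields the same sequence, not the algorithm used by the SymbolTable class (whose format, as noted above, may change); new_name is a hypothetical helper.

```python
def new_name(existing, root_name="psyir_tmp"):
    """Return a name based on root_name that is not yet in `existing`."""
    name = root_name
    suffix = 0
    # Append "_0", "_1", ... until the name no longer clashes.
    while name in existing:
        name = f"{root_name}_{suffix}"
        suffix += 1
    existing.add(name)
    return name


names = set()
print([new_name(names) for _ in range(3)])
# ['psyir_tmp', 'psyir_tmp_0', 'psyir_tmp_1']

names.add("idx")               # a name chosen directly by the user
print(new_name(names, "idx"))  # idx_0
```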

By default, new_symbol() creates generic symbols, but often the user will want to specify a Symbol subclass with some given parameters. The new_symbol() method accepts a symbol_type parameter to specify the subclass. Arguments for the constructor of that subclass may be supplied as keyword arguments. For example, the following code:

from psyclone.psyir.symbols import SymbolTable, DataSymbol, REAL_TYPE
symbol_table = SymbolTable()
symbol_table.new_symbol(root_name="something",
                        symbol_type=DataSymbol,
                        datatype=REAL_TYPE,
                        is_constant=True,
                        initial_value=3.0)

declares a symbol named “something” of REAL_TYPE datatype where the is_constant and initial_value arguments will be passed to the DataSymbol constructor.

An example of using the new_symbol() method can be found in the PSyclone examples/psyir directory.

Nodes

PSyIR nodes are connected together via parent and child methods provided by the Node base class.

These nodes can be created in isolation and then connected together. For example:

assignment = Assignment()
literal = Literal("0.0", REAL_TYPE)
reference = Reference(symbol)
assignment.children = [reference, literal]

However, as connections get more complicated, creating the correct connections can become difficult to manage and error prone. Further, in some cases children must be collected together within a Schedule (e.g. for IfBlock, Loop and WhileLoop).

To reduce this complexity, each of the Kernel-layer nodes that contains other nodes has a static create method which helps construct the PSyIR using a bottom-up approach. Using this method, the above example becomes:

literal = Literal("0.0", REAL_TYPE)
reference = Reference(symbol)
assignment = Assignment.create(reference, literal)

Creating the PSyIR to represent a complicated access of a member of a structure is best performed using the create() method of the appropriate Reference subclass. For a relatively straightforward access such as (the Fortran) field1%region%nx, this would be:

from psyclone.psyir.nodes import StructureReference
fld_sym = symbol_table.lookup("field1")
ref = StructureReference.create(fld_sym, ["region", "nx"])

where symbol_table is assumed to be a pre-populated Symbol Table containing an entry for “field1”.

A more complicated access involving arrays of structures such as field1%sub_grids(idx, 1)%nx would be constructed as:

from psyclone.psyir.symbols import INTEGER_TYPE
from psyclone.psyir.nodes import StructureReference, Reference, Literal
idx_sym = symbol_table.lookup("idx")
fld_sym = symbol_table.lookup("field1")
ref = StructureReference.create(fld_sym,
    [("sub_grids", [Reference(idx_sym), Literal("1", INTEGER_TYPE)]),
     "nx"])

Note that the list of quantities passed to the create() method now contains a 2-tuple in order to describe the array access.

More examples of using this approach can be found in the PSyclone examples/psyir directory.

Comparing PSyIR nodes

The == (equality) operator for PSyIR nodes performs a specialised equality check to compare the value of each node. This is also useful when comparing entire subtrees since the equality operator automatically recurses through the children and compares each child with the appropriate equality semantics, e.g.

# Is the loop upper bound expression exactly the same?
if loop1.stop_expr == loop2.stop_expr:
    print("Same upper bound!")

The equality operator will handle expressions like my_array%my_field(:3) with the derived type fields and the range components automatically, but it cannot handle symbolically equivalent fields, i.e. my_array%my_field(:3) != my_array%my_field(:2+1).

Annotations and code comments are ignored in the equality comparison since they don’t alter the semantic meaning of the code. So these two statements compare as equal:

a = a + 1
a = a + 1 !Increases a by 1

Sometimes one needs to check whether two references point to the very same node instance rather than to nodes of equal value. For this, Python provides the is operator, e.g.

# Is the self instance part of this routine?
is_here = any(node is self for node in routine.walk(Node))

Additionally, PSyIR nodes cannot be used as map (dictionary) keys or similar. The easiest way to work around this is to use the id of the node as the key:

node_map = {}
node_map[id(mynode)] = "element"
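
The difference between ==, is and id()-based keys can be illustrated with a small stand-alone sketch. ToyNode is a hypothetical stand-in and not a PSyclone class; it only mimics the behaviour described above (value-based equality that recurses through the children). In Python, a class that defines __eq__ without also defining __hash__ is unhashable, and the sketch assumes that this is the reason such objects cannot act as dictionary keys.

```python
class ToyNode:
    """Hypothetical stand-in for a PSyIR node (not part of PSyclone)."""

    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)

    def __eq__(self, other):
        # Value-based comparison. Comparing the two lists with == applies
        # this same method to each pair of children, so it recurses.
        if type(self) is not type(other):
            return NotImplemented
        return self.value == other.value and self.children == other.children


tree1 = ToyNode("+", [ToyNode("a"), ToyNode("1")])
tree2 = ToyNode("+", [ToyNode("a"), ToyNode("1")])

same_value = tree1 == tree2      # equivalent subtrees
same_instance = tree1 is tree2   # but two distinct objects

# Defining __eq__ without __hash__ makes the instances unhashable, so
# they cannot be used directly as dictionary keys.
try:
    {tree1: "element"}
    hashable = True
except TypeError:
    hashable = False

# Keying on id() distinguishes instances even when they compare equal.
node_map = {id(tree1): "first", id(tree2): "second"}
```

Note that an id() value is only guaranteed to be unique while the object is alive, so a map built this way should not outlive the nodes it refers to.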

Modifying the PSyIR

Once we have a complete PSyIR AST there are two ways to modify its contents and/or structure: by applying transformations (see next section Transformations), or by calling PSyIR API methods directly. This section describes some of the methods that the PSyIR classes provide to modify the PSyIR AST in a consistent way (e.g. without breaking its many internal references). Some complete examples of modifying the PSyIR can be found in the PSyclone examples/psyir/modify.py script.

The rest of this section introduces examples of the available direct PSyIR modification methods.

Renaming symbols

The symbol table provides the method rename_symbol() which, given a symbol and an unused name, renames the symbol. The renaming affects all the references to that symbol in the PSyIR AST. For example, the PSyIR representing the following Fortran code:

subroutine work(psyir_tmp)
    real, intent(inout) :: psyir_tmp
    psyir_tmp=0.0
end subroutine

could be modified by the following PSyIR statements:

tmp_symbol = symbol_table.lookup("psyir_tmp")
symbol_table.rename_symbol(tmp_symbol, "new_variable")

which would result in the following Fortran output code:

subroutine work(new_variable)
    real, intent(inout) :: new_variable
    new_variable=0.0
end subroutine
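
Why a single rename reaches every reference can be sketched with a toy model. The classes below (ToySymbol, ToyReference and ToySymbolTable) are hypothetical and are not PSyclone's implementation; the sketch assumes that each reference holds the symbol object itself rather than a copy of its name, so that changing the name in one place is seen everywhere.

```python
class ToySymbol:
    """Hypothetical symbol: just a name."""

    def __init__(self, name):
        self.name = name


class ToyReference:
    """Hypothetical reference that stores the symbol object, not its name."""

    def __init__(self, symbol):
        self.symbol = symbol

    @property
    def name(self):
        # The name is always read from the shared symbol object.
        return self.symbol.name


class ToySymbolTable:
    """Hypothetical symbol table mapping names to symbol objects."""

    def __init__(self):
        self._symbols = {}

    def add(self, symbol):
        self._symbols[symbol.name] = symbol

    def lookup(self, name):
        return self._symbols[name]

    def rename_symbol(self, symbol, new_name):
        # The new name must not already be in use.
        if new_name in self._symbols:
            raise KeyError(f"'{new_name}' is already in use")
        del self._symbols[symbol.name]
        symbol.name = new_name
        self._symbols[new_name] = symbol


table = ToySymbolTable()
table.add(ToySymbol("psyir_tmp"))
tmp_symbol = table.lookup("psyir_tmp")
refs = [ToyReference(tmp_symbol), ToyReference(tmp_symbol)]

table.rename_symbol(tmp_symbol, "new_variable")
names_after = [ref.name for ref in refs]
```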

Specialising symbols

The Symbol class provides the method specialise() which, given a subclass of Symbol, changes the Symbol instance to the specified subclass. If the subclass has any additional properties then these need to be set explicitly. For example:

from psyclone.psyir.symbols import Symbol, RoutineSymbol
symbol = Symbol("name")
symbol.specialise(RoutineSymbol)
# symbol is now a RoutineSymbol

This method is useful as it allows the class of a symbol to be changed without affecting any references to it.
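
The benefit of changing the class in place, rather than creating a new symbol, can be shown with a stand-alone sketch. ToySymbol and ToyRoutineSymbol are hypothetical classes, and reassigning __class__ is only one way Python allows this to be done; it is an assumption here, not a statement about how PSyclone implements specialise().

```python
class ToySymbol:
    """Hypothetical generic symbol (not part of PSyclone)."""

    def __init__(self, name):
        self.name = name

    def specialise(self, subclass):
        # Only allow specialisation to a subclass of ToySymbol.
        if not issubclass(subclass, ToySymbol):
            raise TypeError("subclass must derive from ToySymbol")
        # Change the class of this very object; its identity is unchanged.
        self.__class__ = subclass


class ToyRoutineSymbol(ToySymbol):
    """Hypothetical specialised symbol."""


symbol = ToySymbol("name")
# Something elsewhere in the tree that already refers to the symbol.
holder = {"symbol": symbol}

symbol.specialise(ToyRoutineSymbol)
```

Because the object itself is modified, anything that already held a reference to it (holder in the sketch) sees the specialised symbol without having to be updated.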

Replacing PSyIR nodes

In certain cases one might want to replace a node in a PSyIR tree with another node. All nodes provide the replace_with() method to replace the node and its descendants with another given node and its descendants.

node.replace_with(new_node)

When the node being replaced is part of a named context (in Calls or Operations), the name of the argument is preserved by default. For example, if call is the PSyIR node representing the Fortran statement:

call named_subroutine(name1=1)

then:

call.children[0].replace_with(Literal('2', INTEGER_TYPE))

will result in:

call named_subroutine(name1=2)

This behaviour can be changed with the keep_name_in_context parameter.

call.children[0].replace_with(
    Literal('3', INTEGER_TYPE),
    keep_name_in_context=False
)

will become:

call named_subroutine(3)

Detaching PSyIR nodes

Sometimes we may wish to detach a PSyIR subtree from the root tree without deleting it altogether, since it may later be re-inserted at another location. To achieve this, all nodes provide the detach method:

tmp = node.detach()

Copying nodes

Copying a PSyIR node and its children is often useful in order to avoid repeating the creation of similar PSyIR subtrees. The original and the copied subtrees can then be modified independently, without one altering the other. Note that this is not equivalent to the copy or deepcopy functionality provided by the Python copy module. The copy method performs a bespoke copy operation where some components of the tree, like children, are recursively copied, while others, like the top-level parent reference, are not.

new_node = node.copy()
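
The detach and copy semantics described in this and the preceding section can be sketched with a toy tree. ToyNode is a hypothetical stand-in, not a PSyclone class. The sketch also assumes that a node can have only one parent at a time, which is why a subtree has to be detached or copied before it is inserted elsewhere.

```python
class ToyNode:
    """Hypothetical stand-in for a PSyIR node (not part of PSyclone)."""

    def __init__(self, name, children=()):
        self.name = name
        self.parent = None
        self.children = []
        for child in children:
            self.add_child(child)

    def add_child(self, child):
        # Assumption of this sketch: a node has at most one parent.
        if child.parent is not None:
            raise ValueError("node already has a parent; detach or copy it")
        child.parent = self
        self.children.append(child)

    def detach(self):
        """Remove this node and its subtree from the parent; return it."""
        if self.parent is not None:
            siblings = self.parent.children
            # Remove by identity, not by equality.
            self.parent.children = [c for c in siblings if c is not self]
            self.parent = None
        return self

    def copy(self):
        """Copy this node and, recursively, its children.

        The parent link of the top-level copy is deliberately not copied.
        """
        return ToyNode(self.name, [child.copy() for child in self.children])


leaf = ToyNode("leaf")
branch = ToyNode("branch", [leaf])
root = ToyNode("root", [branch])

# The copy is independent of the original and has no parent.
branch_copy = branch.copy()
branch_copy.children[0].name = "changed"

# Detaching keeps the subtree intact so it can be re-inserted elsewhere.
tmp = branch.detach()
other_root = ToyNode("other_root")
other_root.add_child(tmp)
```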

Named arguments

The Call node (and its sub-classes) supports named arguments.

Named arguments can be set or modified via the create(), append_named_arg(), insert_named_arg() or replace_named_arg() methods.

If an argument is inserted directly (via the children list) then it is assumed that this is not a named argument. If the top node of an argument is replaced by removing and inserting a new node then it is assumed that this argument is no longer a named argument. If it is replaced using the replace_with method, the keep_name_in_context argument of that method selects the desired behaviour (it defaults to True). If arguments are re-ordered then the names follow the re-ordering.

The names of named arguments can be accessed via the argument_names property. This list has one entry per argument; each entry is either a name or None (if the argument is not named).

The PSyIR does not constrain which arguments are specified as being named and what those names are. It is the developer’s responsibility to make sure that these names are consistent with any intrinsics that will be generated by the back-end. In the future, it is expected that the PSyIR will know about the number and type of arguments expected by Operation nodes, beyond simply being unary, binary or nary.

One restriction that Fortran has (but the PSyIR does not) is that all named arguments should be at the end of the argument list. If this is not the case then the Fortran backend writer will raise an exception.
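
Since the names are exposed as a list containing a name or None for each argument, the Fortran restriction can be checked before code generation. The helper below is a hypothetical sketch that works on a plain Python list of that shape; it is not the check performed by the Fortran backend.

```python
def named_args_are_last(argument_names):
    """Return True if no positional argument follows a named one.

    argument_names is a list with one entry per argument: a string for
    a named argument or None for a positional one.
    """
    seen_named = False
    for name in argument_names:
        if name is not None:
            seen_named = True
        elif seen_named:
            # A positional argument after a named one breaks the rule.
            return False
    return True


valid = named_args_are_last([None, None, "name1"])
invalid = named_args_are_last(["name1", None])
```

In a script, the list passed to such a helper would be the value of the argument_names property of the Call node being examined.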