Dependency Analysis Tools#

PSyIR nodes provide multiple methods to reason about the dependencies and lifetimes of the symbols used in a PSyIR tree. These are described in the PSyIR dependency analysis methods section.

These methods currently use two distinct implementations: the new variable access API (which also provides the DefinitionUseChains and the Loop Dependency Tools for deeper analysis) and the older PSyKAl halo exchange dependency analysis. There is some overlap between the two, and it is expected that the old PSyKAl dependency analysis will be integrated with the variable access API in the future (see #1148).

PSyIR Dependency Analysis Methods#

Node.reference_accesses()[source]
Return type:

VariablesAccessMap

Returns:

a map of all the symbols accessed inside this node. The keys are Signatures (unique identifiers for a symbol and its structure accessors) and the values are AccessSequence instances (each a sequence of AccessTypes).

Reference.previous_accesses()[source]
Returns:

the nodes accessing the same symbol directly before this reference. It can be multiple nodes if the control flow diverges and there are multiple possible accesses.

Return type:

List[psyclone.psyir.nodes.Node]

Reference.next_accesses()[source]
Returns:

the nodes accessing the same symbol directly after this reference. It can be multiple nodes if the control flow diverges and there are multiple possible accesses.

Return type:

List[psyclone.psyir.nodes.Node]

Reference.escapes_scope(scope, visited_nodes=None)[source]

Whether the symbol lifetime continues after the given scope. For example, given the following Fortran code:

    do i=1,10
      a = 1
      b = 2
      c = 3
    end do
    b = 4
    call mysub(a, b)
end subroutine

The values of ‘b’, and of ‘c’ if it is local, end their lifetime at the end of the loop scope (they are not re-used afterwards). The values of ‘a’, and of ‘c’ if it is global, may be used later, so they “escape the scope”.

Parameters:
  • scope (Node) – the given scope that we evaluate.

  • visited_nodes (Optional[set]) – a set of nodes already visited, this is necessary because the dependency chains may contain cycles. Defaults to an empty set.

Return type:

bool

Returns:

whether the symbol lifetime continues after the given scope.

Reference.enters_scope(scope, visited_nodes=None)[source]

Whether the symbol lifetime starts before the given scope. For example, given the following Fortran code:

do i=1,10
  a = 1
  if (b>3) c = 1
end do

‘a’ does not enter the scope: even if it had a value before the loop scope, it is assigned a new value inside it. However, the values of ‘b’ and ‘c’ enter the scope, because there is a path on which they keep the value the symbol had before the scope.

Parameters:
  • scope (Node) – the given scope that we evaluate.

  • visited_nodes (Optional[set]) – a set of nodes already visited, this is necessary because the dependency chains may contain cycles. Defaults to an empty set.

Return type:

bool

Returns:

whether the symbol lifetime starts before the given scope.
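
The rule behind enters_scope can be illustrated with a deliberately simplified model. The sketch below is not PSyclone's implementation: the Access record, its conditional flag and the enters_scope_sketch function are hypothetical names, and the scope is assumed to be a flat list of accesses in execution order rather than a PSyIR tree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """One access inside the scope (hypothetical helper, not a PSyclone class)."""
    var: str
    kind: str                  # "read" or "write"
    conditional: bool = False  # True if the statement may be skipped


def enters_scope_sketch(var, scope):
    """Return True if the value `var` had before `scope` may survive into it."""
    accessed = False
    for acc in scope:
        if acc.var != var:
            continue
        accessed = True
        if acc.kind == "read":
            return True        # the incoming value is read
        if not acc.conditional:
            return False       # overwritten on every path before any read
        # Conditional write: the incoming value survives on the other path.
    return accessed


# Accesses of the loop body above: a = 1 ; if (b>3) c = 1
body = [
    Access("a", "write"),
    Access("b", "read"),
    Access("c", "write", conditional=True),
]
result = {name: enters_scope_sketch(name, body) for name in "abc"}
print(result)
```

In this model ‘a’ is rejected at its unconditional write, ‘b’ is accepted at its read, and ‘c’ is accepted because its only write is conditional.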

Loop.independent_iterations(test_all_variables=False, signatures_to_ignore=None, dep_tools=None)[source]

This function analyses a loop in the PSyIR to see whether its iterations are independent.

Parameters:
  • test_all_variables (bool) – if True, all variable accesses are tested for independence; otherwise the analysis stops at the first variable access that is found not to be independent.

  • signatures_to_ignore (Optional[ List[psyclone.core.Signature]]) – list of signatures for which to skip the access checks.

  • dep_tools (Optional[ psyclone.psyir.tools.DependencyTools]) – an optional instance of DependencyTools so that the caller can access any diagnostic messages detailing why the loop iterations are not independent.

Returns:

True if the loop iterations are independent, False otherwise.

Return type:

bool
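
As a rough intuition for what "independent iterations" means, the sketch below checks a much-simplified condition. It is an assumption-laden toy, not the algorithm used by Loop.independent_iterations: the function name and the (array, kind, offset) encoding are hypothetical, only one-dimensional accesses of the form `i + offset` are modelled, and scalars are ignored.

```python
def independent_iterations_sketch(accesses):
    """Return the sorted names of arrays that carry a dependence between
    loop iterations in this toy model.

    `accesses` is a list of (array_name, kind, offset) tuples, where the
    array is indexed with `i + offset` for loop variable `i` and `kind`
    is "read" or "write".
    """
    offsets = {}
    written = set()
    for name, kind, offset in accesses:
        offsets.setdefault(name, set()).add(offset)
        if kind == "write":
            written.add(name)
    # A written array that is touched at more than one offset is accessed
    # by different iterations at the same element.
    return sorted(name for name in written if len(offsets[name]) > 1)


# a(i) = b(i-1) + b(i+1): b is only read, so nothing is carried.
stencil = [("a", "write", 0), ("b", "read", -1), ("b", "read", 1)]
# a(i) = a(i-1) + 1: iteration i reads what iteration i-1 wrote.
recurrence = [("a", "write", 0), ("a", "read", -1)]

independent = independent_iterations_sketch(stencil)
carried = independent_iterations_sketch(recurrence)
print(independent, carried)
```

An empty result corresponds to a loop whose iterations are independent in this model; a non-empty result names the arrays that prevent it.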

PSyKAl analysis methods:

Node.dag(file_name='dag', file_format='svg')[source]

Create a dag of this node and its children, write it to file and return the graph object.

Parameters:
  • file_name (str) – name of the file to create.

  • file_format (str) – format of the file to create. (Must be one recognised by Graphviz.)

Returns:

the graph object or None (if ‘graphviz’ is not found).

Return type:

graphviz.Digraph or NoneType

Raises:

GenerationError – if the specified file format is not recognised by Graphviz.

Node.backward_dependence()[source]

Returns the closest preceding Node that this Node has a direct dependence with or None if there is not one. Only Nodes with the same parent as self are returned. Nodes inherit their descendants’ dependencies. The reason for this is that for correctness a node must maintain its parent if it is moved. For example a halo exchange and a kernel call may have a dependence between them but it is the loop body containing the kernel call that the halo exchange must not move beyond i.e. the loop body inherits the dependencies of the routines within it.

Node.forward_dependence()[source]

Returns the closest following Node that this Node has a direct dependence with or None if there is not one. Only Nodes with the same parent as self are returned. Nodes inherit their descendants’ dependencies. The reason for this is that for correctness a node must maintain its parent if it is moved. For example a halo exchange and a kernel call may have a dependence between them but it is the loop body containing the kernel call that the halo exchange must not move beyond i.e. the loop body inherits the dependencies of the routines within it.

PSyKAl Dependence Analysis#

Dependence Analysis in PSyclone produces ordering constraints between instances of the Argument class within a PSyIR tree.

The Argument class is used to specify the data being passed into and out of instances of the Kern class, HaloExchange class and GlobalSum class (and their subclasses).

As an illustration consider the following invoke:

invoke(           &
    kernel1(a,b), &
    kernel2(b,c))

where the metadata for kernel1 specifies that the 2nd argument is written to and the metadata for kernel2 specifies that the 1st argument is read.

In this case the PSyclone dependence analysis will determine that there is a flow dependence between the second argument of kernel1 and the first argument of kernel2 (a read after a write).

Information about arguments is aggregated to the PSyIR node level (kernel1 and kernel2 in this case) and then on to the parent loop node resulting in a flow dependence (a read after a write) between a loop containing kernel1 and a loop containing kernel2. This dependence is used to ensure that a transformation is not able to move one loop before or after the other in the PSyIR schedule (as this would cause incorrect results).

Dependence analysis is implemented in PSyclone to support functionality such as adding and removing halo exchanges, parallelisation and moving nodes in a PSyIR schedule. Dependencies between nodes in a PSyIR schedule can be viewed as a DAG using the dag() method within the Node base class.
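
The read-after-write reasoning for the invoke above can be sketched in a few lines. This is a toy model, not PSyclone code: flow_dependences is a hypothetical function, kernels are plain tuples, and the accesses assumed for a and c (read and write respectively) are not stated in the text and are chosen only to complete the example.

```python
def flow_dependences(kernels):
    """Return (writer, reader, argument) triples for every read that
    follows a write of the same argument.

    `kernels` is a list of (kernel_name, {argument: "read" | "write"})
    tuples in schedule order.
    """
    deps = []
    last_writer = {}
    for name, args in kernels:
        # Reads are matched against writes made by earlier kernels.
        for arg, access in args.items():
            if access == "read" and arg in last_writer:
                deps.append((last_writer[arg], name, arg))
        # Writes are recorded afterwards, for use by later kernels.
        for arg, access in args.items():
            if access == "write":
                last_writer[arg] = name
    return deps


invoke = [
    ("kernel1", {"a": "read", "b": "write"}),   # access of `a` is assumed
    ("kernel2", {"b": "read", "c": "write"}),   # access of `c` is assumed
]
deps = flow_dependences(invoke)
print(deps)
```

The single dependence found on b is what prevents the loop containing kernel2 from being moved before the loop containing kernel1.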

DataAccess Class#

The DataAccess class is at the core of PSyclone data dependence analysis. It takes an instance of the Argument class on initialisation and provides methods to compare this instance with other instances of the Argument class. The class is used to determine two main things: overlap and coverage.

Overlap#

Overlap specifies whether accesses specified by two instances of the Argument class access the same data or not. If they do access the same data their accesses are deemed to overlap. The best way to explain the meaning of overlap is with an example:

Consider a one dimensional array called A of size 4 (A(4)). If one instance of the Argument class accessed the first two elements of array A and another instance of the Argument class accessed the last two elements of array A then they would both be accessing array A but their accesses would not overlap. However, if one instance of the Argument class accessed the first three elements of array A and another instance of the Argument class accessed the last two elements of array A then their accesses would overlap as they are both accessing element A(3).
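
The general notion of overlap in this example can be written down directly. The sketch below is illustrative only: overlaps_sketch is a hypothetical function and the (name, set of indices) encoding is an assumption made for the example, not how Argument instances store their accesses.

```python
def overlaps_sketch(first, second):
    """Return True if two accesses touch at least one common element.

    Each access is a (array_name, set_of_element_indices) tuple.
    """
    same_array = first[0] == second[0]
    return same_array and bool(first[1] & second[1])


# First two elements versus last two elements of A(4): no common element.
disjoint = overlaps_sketch(("A", {1, 2}), ("A", {3, 4}))
# First three elements versus last two elements: both access A(3).
shared = overlaps_sketch(("A", {1, 2, 3}), ("A", {3, 4}))
print(disjoint, shared)
```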

Having explained the idea of overlap in its general sense, in practice PSyclone currently assumes that any two instances of the Argument class that access data with the same name will always overlap and does no further analysis (apart from halo exchanges and vectors, which are discussed below). The reason for this is that nearly all accesses to data associated with an instance of the Argument class start at index 1 and end at the number of elements, dofs or some halo depth. The exceptions are halo exchanges, which only access the halo, and boundary conditions, which only access a subset of the data. However, these subset accesses are currently not captured in metadata, so PSyclone must assume subset accesses do not exist.

If there is a field vector associated with an instance of an Argument class then all of the data in its vector indices are assumed to be accessed when the argument is part of a Kern or a GlobalSum. However, in contrast, a HaloExchange only acts on a single index of a field vector. Therefore there is one halo exchange per field vector index. For example:

InvokeSchedule[invoke='invoke_0_testkern_stencil_vector_type', dm=True]
... HaloExchange[field='f1', type='region', depth=1, check_dirty=True]
... HaloExchange[field='f1', type='region', depth=1, check_dirty=True]
... HaloExchange[field='f1', type='region', depth=1, check_dirty=True]
... Loop[type='',field_space='w0',it_space='cells', upper_bound='cell_halo(1)']
... ... CodedKern testkern_stencil_vector_code(f1,f2) [module_inline=False]

In the above PSyIR schedule, the field f1 is a vector field and the CodedKern testkern_stencil_vector_code is assumed to access data in all of the vector components. However, there is a separate HaloExchange for each component. This means that halo exchanges accessing the same field but different components do not overlap, but each halo exchange does overlap with the loop node. The current implementation of the overlaps() method deals with field vectors correctly.

Coverage#

The concept of coverage naturally follows from the discussion in the previous section.

Again consider a one dimensional array called A of size 4 (A(4)). If one instance (that we will call the source) of the Argument class accessed the first 3 elements of array A (i.e. elements 1 to 3) and another instance of the Argument class accessed the first two elements of array A then their accesses would overlap as they are both accessing elements A(1) and A(2) and elements A(1) and A(2) would be covered. However, access A(3) for the source Argument class would not yet be covered. If a subsequent instance of the Argument class accessed the 2nd and 3rd elements of array A then all of the accesses (A(1), A(2) and A(3)) would now be covered so the source argument would be deemed to be covered.
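
The array example above can be traced with a small model of coverage. This is a sketch under stated assumptions, not the DataAccess class: CoverageSketch is a hypothetical class that works on sets of element indices, and its method names are chosen only to echo the coverage vocabulary used in this section.

```python
class CoverageSketch:
    """Track which elements of a source access have been covered."""

    def __init__(self, source_indices):
        self._source = set(source_indices)
        self._covered = set()

    def update_coverage(self, indices):
        """Record another access; only elements of the source count."""
        self._covered |= self._source & set(indices)

    @property
    def covered(self):
        """True once every element of the source access has been seen."""
        return self._covered == self._source

    def reset_coverage(self):
        """Forget all recorded accesses so the instance can be re-used."""
        self._covered.clear()


access = CoverageSketch({1, 2, 3})      # the source accesses A(1) to A(3)
access.update_coverage({1, 2})
after_first = access.covered            # A(3) is still missing
access.update_coverage({2, 3})
after_second = access.covered           # A(1), A(2) and A(3) all seen
print(after_first, after_second)
```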

In PSyclone the above situation occurs when a vector field is accessed in a kernel and also requires halo exchanges e.g.:

InvokeSchedule[invoke='invoke_0_testkern_stencil_vector_type', dm=True]
   HaloExchange[field='f1', type='region', depth=1, check_dirty=True]
   HaloExchange[field='f1', type='region', depth=1, check_dirty=True]
   HaloExchange[field='f1', type='region', depth=1, check_dirty=True]
   Loop[type='',field_space='w0',it_space='cells', upper_bound='cell_halo(1)']
      CodedKern testkern_stencil_vector_code(f1,f2) [module_inline=False]

In this case the PSyIR loop node needs to know about all 3 halo exchanges before its access is fully covered. This functionality is implemented by passing instances of the Argument class to the DataAccess class update_coverage() method and testing the access.covered property until it returns True.

# this example is for a field vector 'f1' of size 3
# f1_index[1,2,3] are halo exchange accesses to vector indices [1,2,3] respectively
access = DataAccess(f1_loop)
access.update_coverage(f1_index1)
result = access.covered  # will be False
access.update_coverage(f1_index2)
result = access.covered  # will be False
access.update_coverage(f1_index3)
result = access.covered  # will be True
access.reset_coverage()

Note the reset_coverage() method can be used to reset internal state so the instance can be re-used (but this is not used by PSyclone at the moment).

The way in which halo exchanges are placed means that it is not possible for two halo exchanges with the same index to depend on each other in a schedule. As a result, an exception is raised if this situation is found.

Notice there is no concept of read or write dependencies here. Read or write dependencies are handled by classes that make use of the DataAccess class i.e. the _field_write_arguments() and _field_read_arguments() methods, both of which are found in the Arguments class.

Variable Accesses#

When using PSyclone with generic Fortran code, it is not possible to rely on pre-defined kernel information to determine dependencies between loops. So an additional, somewhat lower-level API has been implemented that can be used to determine variable accesses (READ, WRITE etc.), which is based on the PSyIR information. The only exception to this is if a kernel is called, in which case the metadata for the kernel declaration will be used to determine the variable accesses for the call statement. The information about all variable usage of a PSyIR node or a list of nodes can be gathered by creating an object of type psyclone.core.VariablesAccessMap. This class uses a Signature object to keep track of the variables used.

Signature#

A signature can be thought of as a tuple that consists of the variable name and structure members used in an access - called components. For example, an access like a(1)%b(k)%c(i,j) would be stored with a signature (a, b, c), giving three components a, b, and c. A simple variable such as a is stored as a one-element tuple (a, ), having a single component.

class psyclone.core.Signature(variable, sub_sig=None)[source]

Given a variable access of the form a(i,j)%b(k,l)%c, the signature of this access is the tuple (a,b,c). For a simple scalar variable a the signature would just be (a,). The signature is the key used in VariablesAccessMap. In order to make sure two different signature objects containing the same variable can be used as a key, this class implements __hash__ and other special functions. The constructor also supports appending an existing signature to this new signature using the sub_sig argument. This is used in StructureReference to assemble the overall signature of a structure access.

Parameters:
  • variable (str or tuple of str or list of str) – the variable that is accessed.

  • sub_sig (psyclone.core.Signature) – a signature that is to be added to this new signature.

__eq__(other)[source]

Required in order to use a Signature instance as a key. Compares two objects (one of which might not be a Signature).

__hash__()[source]

This returns a hash value that is independent of the instance. I.e. two instances with the same signature will have the same hash key.

__lt__(other)[source]

Required to sort signatures. It just compares the tuples.

property is_structure
Returns:

True if this signature represents a structure.

Return type:

bool

to_language(component_indices=None, language_writer=None)[source]

Converts this signature with the provided indices to a string in the selected language.

TODO #1320 This subroutine can be removed when we stop supporting strings - then we can use a PSyIR writer for the ReferenceNode to provide the right string.

Parameters:
Raises:

InternalError – if the number of components in this signature is different from the number of indices in component_indices.

property var_name
Returns:

the actual variable name, i.e. the first component of the signature.

Return type:

str
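
To show why hashing and comparison matter for a dictionary key, here is a minimal stand-in for the behaviour described above. SignatureSketch is a hypothetical class written for illustration; it is not psyclone.core.Signature and only assumes what the documentation states (tuple of components, an optional sub_sig appended on construction, equality, hashing and ordering by tuple).

```python
class SignatureSketch:
    """Tuple-like identifier for a variable access (illustrative only)."""

    def __init__(self, variable, sub_sig=None):
        # A plain string is a single component; a tuple or list gives several.
        parts = (variable,) if isinstance(variable, str) else tuple(variable)
        if sub_sig is not None:
            parts += sub_sig.components
        self.components = parts

    def __eq__(self, other):
        # The other object might not be a signature at all.
        if not isinstance(other, SignatureSketch):
            return False
        return self.components == other.components

    def __hash__(self):
        # Depends only on the components, not on the instance.
        return hash(self.components)

    def __lt__(self, other):
        return self.components < other.components

    @property
    def is_structure(self):
        return len(self.components) > 1

    @property
    def var_name(self):
        return self.components[0]


# Two distinct instances with the same components address the same entry.
accesses = {SignatureSketch(("a", "b", "c")): ["READ"]}
same = SignatureSketch("a", SignatureSketch(("b", "c")))
found = accesses[same]
ordered = sorted([SignatureSketch("b"), SignatureSketch(("a", "c"))])
print(found, [sig.components for sig in ordered])
```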

AccessType#

An individual access to a Signature is described by an instance of the AccessType enumeration:

class psyclone.core.access_type.AccessType(*values)[source]

A simple enum-class for the various valid access types.

CALL = 8

A symbol representing a routine is called.

CONSTANT = 10

Access data that cannot be redefined during execution, therefore, it is available at compile-time and can be used for type properties such as kinds or dimensions.

INC = 4

Incremented from more than one cell column (see the LFRic API section of the User Guide).

INQUIRY = 9

The property/ies of a symbol is/are queried but the data it represents is not accessed (e.g. ‘var’ in SIZE(var, dim=1)).

READ = 1

Data associated with the symbol is read.

READINC = 5

Read before incrementing. Requires that the outermost halo be clean (see the LFRic API section of the User Guide).

READWRITE = 3

Data associated with the symbol is both read and written (e.g. is passed to a routine with intent(inout)).

SUM = 6

Is the output of a SUM reduction.

UNKNOWN = 7

This is used internally to indicate unknown access type of a variable, e.g. when a variable is passed to a subroutine and the access type of this variable in the subroutine is unknown. TODO #2863 - VariablesAccessMap does not currently consider UNKNOWN accesses and it should!

WRITE = 2

Data associated with the symbol is written.

static all_read_accesses()[source]
Returns:

A list of all access types that involve reading an argument in some form.

Return type:

List of psyclone.core.access_type.AccessType

static all_write_accesses()[source]
Returns:

A list of all access types that involve writing to an argument in some form.

Return type:

List of psyclone.core.access_type.AccessType

api_specific_name()[source]

This convenience function returns the name of the type in the current API. E.g. in the lfric API, WRITE –> “gh_write”. If no mapping is available then the generic name is returned.

Return type:

str

Returns:

The API specific name.

static from_string(access_string)[source]

Convert a string (e.g. “read”) into the corresponding AccessType enum value (AccessType.READ).

Parameters:

access_string (str) – Access type as a string.

Returns:

Corresponding AccessType enum.

Return type:

psyclone.core.access_type.AccessType

Raises:

ValueError – if access_string is not a valid access type.

static get_valid_reduction_modes()[source]
Returns:

A list of valid reduction access modes.

Return type:

List of psyclone.core.access_type.AccessType

static get_valid_reduction_names()[source]
Returns:

A list of valid reduction access names.

Return type:

List of strings.

static non_data_accesses()[source]
Returns:

all access types that do not touch any data associated with a symbol.

Return type:

list[psyclone.core.AccessType]
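
The string conversion described for from_string can be pictured with a small enumeration. The sketch below is not the PSyclone class: AccessTypeSketch is a hypothetical name, only three of the members listed above are reproduced, and looking the name up in upper case is an assumption about how the conversion might work.

```python
from enum import Enum


class AccessTypeSketch(Enum):
    """A three-member subset of the access types listed above."""
    READ = 1
    WRITE = 2
    READWRITE = 3

    @staticmethod
    def from_string(access_string):
        """Convert e.g. "read" into AccessTypeSketch.READ."""
        try:
            return AccessTypeSketch[access_string.upper()]
        except KeyError as err:
            raise ValueError(
                f"Unknown access type '{access_string}'") from err


converted = AccessTypeSketch.from_string("read")
print(converted)
```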

VariablesAccessMap#

The VariablesAccessMap class is used to store information about all accesses in a region of code. To collect access information, call the reference_accesses() method of any Node in the code region of interest. It returns the accesses for that PSyIR as a VariablesAccessMap dictionary.

Node.reference_accesses()[source]
Return type:

VariablesAccessMap

Returns:

a map of all the symbols accessed inside this node. The keys are Signatures (unique identifiers for a symbol and its structure accessors) and the values are AccessSequence instances (each a sequence of AccessTypes).

class psyclone.core.VariablesAccessMap[source]

This dictionary stores AccessSequence instances indexed by their signature.

__str__()[source]

Gives a shortened visual representation of all variables and their access mode

add_access(signature, access_type, node, component_indices=None)[source]

Adds access information for the variable with the given signature. If the component_indices parameter is not an instance of ComponentIndices, it is used to construct an instance. Therefore it can be None, a list or a list of lists of PSyIR nodes. In the case of a list of lists, this will be used unmodified to construct the ComponentIndices structures. If it is a simple list, it is assumed that it contains the indices used in accessing the last component of the signature. For example, for a%b with component_indices=[i,j], it will create [[], [i,j]] as component indices, indicating that no index is used in the first component a. If the access is supposed to be for a(i)%b(j), then the component_indices argument must be specified as a list of lists, i.e. [[i], [j]].

Parameters:
property all_data_accesses: List[Signature]
Returns:

all Signatures in this instance that have a data access (i.e. the data associated with them is read or written).

property all_signatures
Returns:

all signatures contained in this instance, sorted (in order to make test results reproducible).

Return type:

List[psyclone.core.Signature]

has_read_write(signature)[source]

Checks if the specified variable signature has at least one READWRITE access (which is typically only used in a function call).

Parameters:

signature (psyclone.core.Signature) – signature of the variable

Returns:

True if the specified variable name has (at least one) READWRITE access.

Return type:

bool

Raises:

KeyError if the signature cannot be found.

is_called(signature)[source]
Parameters:

signature (Signature) – signature of the variable.

Return type:

bool

Returns:

True if the specified variable is called at least once.

is_read(signature)[source]

Checks if the specified variable signature is read at least once.

Parameters:

signature (psyclone.core.Signature) – signature of the variable

Return type:

bool

Returns:

True if the specified variable name is read (at least once).

Raises:

KeyError if the signature cannot be found.

is_written(signature)[source]

Checks if the specified variable signature is written at least once.

Parameters:

signature (psyclone.core.Signature) – signature of the variable.

Returns:

True if the specified variable is written (at least once).

Return type:

bool

Raises:

KeyError if the signature name cannot be found.

update(other_access_map)[source]

Updates this dictionary with the entries in the provided VariablesAccessMap. If there are repeated signatures, the provided values are appended to the existing sequence of accesses.

Parameters:

other_access_map (psyclone.core.VariablesAccessMap) – the other VariablesAccessMap instance.

This class collects information for each variable used in the tree starting with the given node. Use the update() method to combine two VariablesAccessMap objects into one. It is up to the user to keep track of which statements (PSyIR nodes) a given VariablesAccessMap instance holds information about. If the PSyIR tree is modified, existing VariablesAccessMap instances become invalid, so storing them is not recommended.
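
The merging behaviour of update() can be illustrated with a plain dictionary. This is a hedged sketch, not the PSyclone class: AccessMapSketch is hypothetical, signatures are represented by tuples of strings, access types by strings and nodes by statement labels.

```python
class AccessMapSketch(dict):
    """Map a signature to the ordered list of its accesses (toy model)."""

    def add_access(self, signature, access_type, node):
        self.setdefault(signature, []).append((access_type, node))

    def update(self, other):
        # Repeated signatures keep their existing accesses; the accesses
        # from `other` are appended after them.
        for signature, sequence in other.items():
            self.setdefault(signature, []).extend(sequence)

    def is_written(self, signature):
        # Raises KeyError if the signature is unknown.
        return any(kind in ("WRITE", "READWRITE")
                   for kind, _ in self[signature])


first = AccessMapSketch()               # accesses of "a = b"
first.add_access(("b",), "READ", "stmt1")
first.add_access(("a",), "WRITE", "stmt1")

second = AccessMapSketch()              # accesses of "b = 2"
second.add_access(("b",), "WRITE", "stmt2")

first.update(second)
merged_b = first[("b",)]
print(merged_b, first.is_written(("b",)))
```

After the merge, the entry for b holds the read from the first statement followed by the write from the second, which is why the order in which maps are combined matters.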

AccessSequence#

The values of a VariablesAccessMap are AccessSequence instances, each of which contains the sequence of accesses to a given variable. When add_access() is called on a VariablesAccessMap instance for a variable it has not seen before, a new AccessSequence instance is created, which in turn stores all accesses to that variable.

class psyclone.core.AccessSequence(signature)[source]

This class stores a list with all accesses to one variable.

Parameters:

signature (psyclone.core.Signature) – signature of the variable.

add_access(access_type, node, component_indices=None)[source]

Adds access information to this variable.

Parameters:
  • access_type (AccessType) – the type of access (READ, WRITE, ….)

  • node (Node) – Node in PSyIR in which the access happens.

  • component_indices (Union[list[list[Node]], ComponentIndices, None]) – indices used for each component of the access.

property all_read_accesses
Returns:

a list with all AccessInfo data for this variable that involve reading this variable.

Return type:

List[psyclone.core.AccessInfo]

property all_write_accesses
Returns:

a list with all AccessInfo data for this variable that involve writing this variable.

Return type:

List[psyclone.core.AccessInfo]

change_read_to_write()[source]

This function is only used when analysing an assignment statement. All variables on the LHS are first identified and recorded as READ. This function is then called to change the assigned-to variable on the LHS from READ to WRITE. Since the LHS is stored in a separate AccessSequence instance, it is guaranteed that there is only one READ entry for the variable (although there may be INQUIRY accesses for array bounds).

Raises:

InternalError – if there is an access that is not READ or INQUIRY or there is > 1 READ access.

has_data_access()[source]
Return type:

bool

Returns:

True if there is an access of the data associated with this signature (as opposed to a call or an inquiry), False otherwise.

has_indices(index_variable=None)[source]

Checks whether any access to this variable has an index. If the optional index_variable is provided, only indices involving the given variable are considered.

Parameters:

index_variable (str) – only consider index expressions that involve this variable.

Return type:

bool

Returns:

True if any of the accesses has an index.

has_read_write()[source]

Checks if this variable has at least one READWRITE access.

Returns:

True if this variable has at least one READWRITE access.

Return type:

bool

is_called()[source]
Return type:

bool

Returns:

whether or not any accesses of this variable represent a call.

is_read()[source]
Return type:

bool

Returns:

True if this variable is read (at least once).

is_read_only()[source]

Checks if this variable is always read, and never written.

Return type:

bool

Returns:

True if this variable is read only.

is_written()[source]
Return type:

bool

Returns:

True if this variable is written (at least once).

is_written_first()[source]
Return type:

bool

Returns:

True if this variable is written in the first data access (which indicates that this variable is not an input variable for a kernel).

property signature
Returns:

the signature for which the accesses are stored.

Return type:

psyclone.core.Signature

str_access_summary()[source]
Return type:

str

Returns:

a string of the access types, with duplicates removed.

property var_name
Returns:

the name of the variable whose access info is managed.

Return type:

str
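
The queries above all reduce to simple scans over an ordered list of accesses. The sketch below is a simplified illustration, not the AccessSequence class: the two functions are hypothetical, access types are plain strings, and only READ and WRITE are treated as data accesses, with INQUIRY and CALL skipped.

```python
NON_DATA = ("INQUIRY", "CALL")  # assumed here to be the non-data accesses


def is_written_first(accesses):
    """True if the first data access in the sequence is a WRITE."""
    data = [kind for kind in accesses if kind not in NON_DATA]
    return bool(data) and data[0] == "WRITE"


def is_read_only(accesses):
    """True if there is at least one data access and all of them are READ."""
    data = [kind for kind in accesses if kind not in NON_DATA]
    return bool(data) and all(kind == "READ" for kind in data)


# An array whose bounds are queried, then assigned, then read.
temporary = ["INQUIRY", "WRITE", "READ"]
# A variable that is only ever read.
coefficient = ["READ", "READ"]

print(is_written_first(temporary), is_read_only(coefficient))
```

A variable for which is_written_first holds does not need its previous value, which is the sense in which it is "not an input variable".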

AccessInfo#

The class AccessSequence is an ordered list of psyclone.core.AccessInfo instances to store all accesses to a single variable. A new instance of AccessInfo is appended to the list whenever add_access() is called.

class psyclone.core.AccessInfo(access_type, node, component_indices=None)[source]

This class stores information about an access to a variable (the node where it happens and the type of access, and the index accessed if available).

Parameters:
  • access_type (AccessType) – the access type.

  • node (Node) – Node in PSyIR in which the access happens.

  • component_indices (Union[list[list[Node]], ComponentIndices, None]) – indices used in the access, defaults to None.

property access_type: AccessType
Returns:

the access type.

change_read_to_write()[source]

This changes the access mode from READ to WRITE. This is used for processing assignment statements, where the LHS is first considered to be READ, and which is then changed to be WRITE.

Raises:

InternalError – if the variable originally does not have READ access.

property component_indices

This property returns the list of indices used for each component as an instance of ComponentIndices. For example, a(i)%b(j,k)%c will return an instance of ComponentIndices representing [ [i], [j, k], [] ]. In the case of a simple scalar variable such as a, the component_indices will represent [ [] ].

Returns:

the indices used in this access for each component.

Return type:

psyclone.core.component_indices.ComponentIndices

property description: str
Returns:

a textual description of this access for use in error messages.

has_indices()[source]
Return type:

bool

Returns:

whether any of the access components uses an index.

is_any_read()[source]
Return type:

bool

Returns:

whether this access represents a read of any kind.

is_any_write()[source]
Return type:

bool

Returns:

whether this access represents a write of any kind.

property is_data_access: bool
Returns:

whether or not this access is to the data associated with a signature (i.e. is not just an inquiry-type access).

property node
Returns:

the PSyIR node at which this access happens.

Return type:

psyclone.psyir.nodes.Node

Indices#

The AccessInfo class stores the original PSyIR node that contains the access, but it also stores the indices used in a simplified form, which makes it easier to analyse dependencies without having to analyse a PSyIR tree for details. The indices are stored in the ComponentIndices object that each access has, which can be accessed using the component_indices property of an AccessInfo object.

class psyclone.core.ComponentIndices(indices=None)[source]

This class stores index information for variable accesses. It stores one index list for each component of a variable, e.g. for a(i)%b(j) it would store [ [i], [j] ]. Even for scalar accesses an empty list is stored, so a would have the component indices [ [] ], and a%b would have [ [], [] ]. Each member of this list of lists is the PSyIR node describing the array expression used.

As a shortcut, the indices parameter can be None or an empty list (which creates the component indices [[]], i.e. a scalar access), or a single list l (which creates the component indices [l], i.e. a single-component variable that uses all the indices in l as array indices).

TODO #845 - the constructor should check that the things it is passed are PSyIR nodes. Currently it is sometimes given strings.

Parameters:

indices (None, [], a list or a list of lists of psyclone.psyir.nodes.Node) – the indices from which to create this object.

Raises:
  • InternalError – if the indices parameter is not None, a list or a list of lists.

  • InternalError – if the indices parameter is a list, and some but not all members are a list.

__getitem__(indx)[source]

Allows this class to be used like a dictionary. If indx is an integer, the list of indices for the specified component is returned. If indx is a tuple (as returned from iterate), it will return the PSyIR of the index for the specified component at the specified dimension.

Returns:

either the list of indices for a component, or the index PSyIR node for the specified tuple.

Return type:

list of psyclone.psyir.nodes.Node, or psyclone.psyir.nodes.Node

Raises:

IndexError – if a tuple is given and one of the indices is outside of the valid range.

__len__()[source]
Returns:

the number of components in this class.

Return type:

int

get_subscripts_of(set_of_vars)[source]

This function returns a flat list of which variables from the given set of variables are used in each subscript. For example, the access a(i+i2)%b(j*j+k,k)%c(l,5) would have the component_indices [[i+i2], [j*j+k,k], [l,5]]. If the set of variables is (i,j,k,l), then get_subscripts_of would return [{i},{j,k},{k},{l},{}]. Note that i2 is not reported because it is not in the given set.

Parameters:

set_of_vars (Set[str]) – set with name of all variables.

Returns:

a list of sets with all variables used in the corresponding array subscripts as strings.

Return type:

List[Set[str]]

has_indices()[source]
Return type:

bool

Returns:

whether any of the access components uses an index

property indices_lists
Returns:

the component indices list of lists.

Return type:

list of list of psyclone.psyir.nodes.Node

iterate()[source]

Allows iterating over all component indices. It returns a tuple with two elements, the first one indicating the component, the second the dimension for which the index is. The return tuple can be used in a dictionary access (see __getitem__) of this object.

Returns:

a tuple of the component index and index.

Return type:

tuple(int, int)

The ComponentIndices class provides an array-like accessor for the internal data structure. Use len(component_indices) to get the number of components for which array indices are stored. The information can be accessed using array subscript syntax, e.g. component_indices[0] returns the list of array indices used in the first component. You can also use a 2-tuple to select a component and a dimension at the same time, e.g. component_indices[(0,1)], which returns the index used in the second dimension of the first component.

ComponentIndices provides an easy way to iterate over all indices using its iterate() method, which returns all valid 2-tuples of component index and dimension index. For example:

# access_info is an AccessInfo instance and contains one access. This
# could be as simple as `a(i,j)`, but also something more complicated
# like `a(i+2*j)%b%c(k, l)`.
for indx in access_info.component_indices.iterate():
    # indx is a 2-tuple of (component_index, dimension_index)
    psyir_index = access_info.component_indices[indx]

# Using enumerate:
for count, indx in enumerate(access_info.component_indices.iterate()):
    psyir_index = access_info.component_indices[indx]
    # fortran writer converts a PSyIR node to Fortran:
    print(f"Index-id {count} of 'a(i,j)': {fortran_writer(psyir_index)}")
Index-id 0 of 'a(i,j)': i
Index-id 1 of 'a(i,j)': j
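
The accessor behaviour described above (len, subscripting with an integer or a 2-tuple, and iterate()) can be modelled with a small stand-in class. This is a minimal sketch only: ToyComponentIndices is a hypothetical class that stores strings instead of PSyIR nodes and is not the PSyclone implementation.

```python
class ToyComponentIndices:
    """Hypothetical stand-in for illustration (not the PSyclone class).
    Stores one list of index expressions per component."""

    def __init__(self, indices):
        self._indices = indices

    def __len__(self):
        # Number of components, including those without indices.
        return len(self._indices)

    def __getitem__(self, key):
        # A 2-tuple selects (component, dimension); an int selects
        # the whole list of indices of one component.
        if isinstance(key, tuple):
            component, dimension = key
            return self._indices[component][dimension]
        return self._indices[key]

    def iterate(self):
        # Yield every valid (component, dimension) pair in order.
        for component, dims in enumerate(self._indices):
            for dimension in range(len(dims)):
                yield (component, dimension)


# String form of the indices of a(i+2*j)%b%c(k, l); `b` has no index.
toy = ToyComponentIndices([["i+2*j"], [], ["k", "l"]])
all_keys = list(toy.iterate())
all_indices = [toy[key] for key in all_keys]
```

Note that a component without indices (b in this example) still counts towards the length, but contributes no tuples to the iteration.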

To find out details about an index expression, you can either analyse the tree (e.g. using walk), or use the variable access functionality again. Below is an example that shows how this is done to determine if an array expression contains a reference to a given variable specified as a signature in the variable index_variable. The variable access_info is an instance of AccessInfo and contains the information about one reference. The function reference_accesses is used to analyse the index expression. Typically, this code would be wrapped in an outer loop over all accesses.

index_variable = Signature("i")
# access_info contains the access information for a single
# reference, e.g. `a(i+2*j)%b%c(k, l)`. Loop over all
# individual index expressions ("i+2*j", then "k" and "l"
# in the example above).
for indx in access_info.component_indices.iterate():
    index_expression = access_info.component_indices[indx]

    # Collect the variable accesses in the index expression
    # using `reference_accesses`:
    accesses = index_expression.reference_accesses()

    # Then test if the index variable is used. Note that
    # the keys of `accesses` are signatures, as is `index_variable`
    if index_variable in accesses:
        # The index variable is used as an index
        # at the specified location.
        print(f"Index '{index_variable}' is used.")
        break
else:
    print(f"Index '{index_variable}' is not used.")

Access Examples#

Below we show a simple example of how to use this API. This is from the psyclone.psyir.nodes.OMPParallelDirective, and it is used to determine a list of all the scalar variables that must be declared as thread-private. Note that this code does not handle the usage of first-private declarations.

result = set()
var_accesses = omp_directive.reference_accesses()
for signature in var_accesses.all_signatures:
    if signature.is_structure:
        # Looking up structures in the symbol table is
        # more complicated, so ignore them for this example.
        continue
    var_name = str(signature)
    symbol = symbol_table.lookup(var_name)
    # Ignore variables that are arrays, we only look at scalar ones.
    # The `is_array_access` function will take information from
    # the access information as well as from the symbol table
    # into account.
    access_sequence = var_accesses[signature]
    if symbol.is_array_access(access_info=access_sequence):
        # It's not a scalar variable, so it will not be private
        continue

    # If a scalar variable is only accessed once, it is either a coding
    # error or a shared variable - anyway it is not private
    if len(access_sequence) == 1:
        continue

    # We have at least two accesses. If the first one is a write,
    # assume the variable should be private:
    if access_sequence[0].access_type == AccessType.WRITE:
        print("Private variable", var_name)
        result.add(var_name.lower())
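
The heuristic used above (skip arrays, skip variables with a single access, treat a first-access write as private) can be tried out in isolation. The following is a simplified sketch under stated assumptions: ToyAccessType and guess_private_scalars are hypothetical names, accesses are given as a plain dictionary, and the set of array names stands in for the symbol-table lookup.

```python
from enum import Enum


class ToyAccessType(Enum):
    """Hypothetical stand-in for the access types used above."""
    READ = "read"
    WRITE = "write"


def guess_private_scalars(accesses, array_names):
    """Toy version of the heuristic: `accesses` maps a variable name
    to its ordered list of access types."""
    private = set()
    for name, sequence in accesses.items():
        if name in array_names:
            # Arrays are never considered private here.
            continue
        if len(sequence) == 1:
            # A single access: coding error or shared variable.
            continue
        if sequence[0] is ToyAccessType.WRITE:
            private.add(name.lower())
    return private


R, W = ToyAccessType.READ, ToyAccessType.WRITE
toy_accesses = {
    "tmp": [W, R],   # written first, then read: private
    "a": [W, R],     # an array: skipped
    "n": [R, R],     # only read: shared
    "x": [W],        # a single access: skipped
}
private_vars = guess_private_scalars(toy_accesses, array_names={"a"})
```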

The next, hypothetical example shows how the VariablesAccessMap class can be used iteratively. Assume that you have a function can_be_parallelised that determines if the given variable accesses can be parallelised, and the aim is to determine the largest consecutive block of statements that can be executed in parallel. The accesses of one statement at a time are added until we find accesses that would prevent parallelisation:

# Create an empty instance to store accesses
accesses = VariablesAccessMap()
list_of_parallelisable_statements = []
for next_statement in statements:
    # Merge the variable accesses of the next statement into
    # the existing accesses:
    accesses.update(next_statement.reference_accesses())
    # Stop when the next statement can not be parallelised
    # together with the previous accesses:
    if not can_be_parallelised(accesses):
        break
    list_of_parallelisable_statements.append(next_statement)

print(f"The first {len(list_of_parallelisable_statements)} statements can "
      f"be parallelised.")
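
To make the accumulation pattern concrete, here is a self-contained toy version. Everything in it is an assumption made for illustration: statements are reduced to lists of read and written variable names, the accumulated accesses are a plain dictionary, and can_be_parallelised uses a deliberately simple rule (a variable written by one statement must not be accessed by any other statement).

```python
def add_accesses(accesses, stmt_id, reads, writes):
    """Record the accesses of one statement in the shared dictionary."""
    for name in reads:
        accesses.setdefault(name, []).append((stmt_id, "read"))
    for name in writes:
        accesses.setdefault(name, []).append((stmt_id, "write"))


def can_be_parallelised(accesses):
    """Toy rule: reject if a variable is written by one statement
    and accessed by a different statement."""
    for entries in accesses.values():
        writers = {stmt for stmt, kind in entries if kind == "write"}
        users = {stmt for stmt, _ in entries}
        if writers and len(users) > 1:
            return False
    return True


# Each statement is (variables read, variables written).
statements = [
    (["b"], ["a"]),   # a = b
    (["b"], ["c"]),   # c = b
    (["a"], ["d"]),   # d = a, reads the `a` written by statement 0
]

accesses = {}
parallelisable = []
for stmt_id, (reads, writes) in enumerate(statements):
    add_accesses(accesses, stmt_id, reads, writes)
    # Stop at the first statement that conflicts with earlier ones.
    if not can_be_parallelised(accesses):
        break
    parallelisable.append(stmt_id)
```

The first two statements only share the read-only variable b, so they are accepted. The third reads a, which the first statement writes, so the block ends there.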

Note

There is a certain overlap in the dependency analysis code and the variable access API. More work on unifying those two approaches will be undertaken in the future. Also, when calling reference_accesses() for an LFRic or GOcean kernel, the variable access mode for parameters is taken from the kernel metadata, not from the actual kernel source code.

Loop Dependency Tools#

PSyclone contains a class that builds upon the data-dependency functionality to provide useful tools for dependency analysis. It especially provides messages for the user to indicate why parallelisation was not possible. It uses SymPy internally to compare expressions symbolically.

class psyclone.psyir.tools.dependency_tools.DependencyTools(loop_types_to_parallelise=None)[source]

This class provides some useful dependency tools, allowing a user to overwrite/modify functions depending on the application. It includes a messaging system where functions can store messages that might be useful for the user to see.

Parameters:

loop_types_to_parallelise (Optional[List[str]]) – A list of loop types that will be considered for parallelisation. An example loop type might be ‘lat’, indicating that only loops over latitudes should be parallelised. The list of supported loop types is specified in the PSyclone config file. This can be used, for example, to exclude 1-dimensional loops.

Raises:

TypeError – if an invalid loop type is specified.

can_loop_be_parallelised(loop, test_all_variables=False, signatures_to_ignore=None)[source]

This function analyses a loop in the PSyIR to see if it can be safely parallelised.

Parameters:
  • loop (psyclone.psyir.nodes.Loop) – the loop node to be analysed.

  • test_all_variables (bool) – if True, it will test if all variable accesses can be parallelised, otherwise it will stop after the first variable is found that can not be parallelised.

  • signatures_to_ignore (Optional[List[psyclone.core.Signature]]) – list of signatures for which to skip the access checks.

Returns:

True if the loop can be parallelised.

Return type:

bool

Raises:

TypeError – if the supplied node is not a Loop.

can_loops_be_fused(loop1, loop2)[source]

Function that verifies if two loops can be fused.

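Parameters:
  • loop1 (psyclone.psyir.nodes.Loop) – the first loop.

  • loop2 (psyclone.psyir.nodes.Loop) – the second loop.
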
Returns:

whether the loops can be fused or not.

Return type:

bool

get_all_messages()[source]

Returns all messages that have been stored by the last function the user has called.

Returns:

a list of all messages.

Return type:

List[str]

Note

PSyclone provides ReplaceInductionVariableTrans, a transformation that can improve the ability of the dependency analysis to provide useful information. It is recommended to run this transformation on a copy of the tree, since the transformation might prevent other optimisations. For example, it will set the values of removed variables at the end of the loop, which can prevent loop fusion etc. from working as expected.

An example of how to use this class is shown below. (Note that this is just for demonstration purposes: in reality the validate method of OMPLoopTrans will also use the dependence analysis to check that the transformation is safe.) It takes a list of statements (i.e. nodes in the PSyIR), and adds ‘OMP DO’ directives around loops that can be parallelised:

parallel_loop = OMPLoopTrans()
dt = DependencyTools()

for statement in loop_statements:
    if isinstance(statement, Loop):
        # Check if there is a variable dependency that might
        # prevent this loop from being parallelised:
        if dt.can_loop_be_parallelised(statement):
            parallel_loop.apply(statement)
        else:
            # Print all messages from the dependency analysis
            # as feedback for the user:
            for message in dt.get_all_messages():
                print(message)
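
The message-collecting pattern used by this class can be sketched independently of PSyclone. The toy class below is hypothetical and far simpler than DependencyTools: it assumes every array access has the form name(i + offset) for a loop over i, and it conservatively rejects any array that is written and accessed with more than one offset. It does not use SymPy.

```python
class ToyDependencyTools:
    """Hypothetical illustration of a checker with a message store."""

    def __init__(self):
        self._messages = []

    def get_all_messages(self):
        # Messages of the most recent check only.
        return list(self._messages)

    def can_loop_be_parallelised(self, accesses):
        """`accesses` is a list of (array_name, offset, is_write)
        tuples describing accesses name(i + offset) in a loop over i."""
        self._messages = []  # discard messages of earlier calls
        offsets = {}
        written = set()
        for name, offset, is_write in accesses:
            offsets.setdefault(name, set()).add(offset)
            if is_write:
                written.add(name)
        for name in sorted(written):
            if len(offsets[name]) > 1:
                # Different iterations touch the same element.
                self._messages.append(
                    f"Array '{name}' is written and accessed with "
                    f"different offsets {sorted(offsets[name])}.")
        return not self._messages


tools = ToyDependencyTools()
# a(i) = a(i-1) + b(i): loop-carried dependency on `a`
carried = tools.can_loop_be_parallelised(
    [("a", 0, True), ("a", -1, False), ("b", 0, False)])
carried_messages = tools.get_all_messages()
# a(i) = b(i-1) + b(i+1): `b` is only read, so no conflict
independent = tools.can_loop_be_parallelised(
    [("a", 0, True), ("b", -1, False), ("b", 1, False)])
```

Clearing the message list at the start of each check mirrors the documented behaviour of get_all_messages(), which reports the messages of the last function called.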

DefinitionUseChain#

PSyclone also provides a DefinitionUseChain class, which can search for forward and backward dependencies for a given Reference inside a region of code. This implementation differs from the DependencyTools in that it is control-flow aware, so it can find multiple dependencies for a single Reference in a given Routine or scope.

This is primarily used to implement the Reference.next_accesses and Reference.previous_accesses functions, but can be used directly as follows:

chain = DefinitionUseChain(reference)
accesses = chain.find_forward_accesses()
# accesses contains Nodes that are dependent on reference
accesses[0].....

By default the dependencies will be searched for in the containing Routine.

Limitations#

At the moment the DefinitionUseChain assumes that any control-flow branch might not be taken, i.e. code inside a Loop or If statement is not guaranteed to execute. Dependencies inside such code will be found, but they will not stop the search from continuing further into the tree. Additionally, GOTO statements are not supported and, if found, will cause an exception to be raised.
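
The effect of this assumption can be illustrated with a toy forward search. This is a sketch of one simple reading of the limitation, not PSyclone's algorithm: ToyAccess and toy_forward_accesses are hypothetical, the code is flattened into a list of accesses, and each access carries a flag saying whether it sits inside conditional code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyAccess:
    """Hypothetical record of one access to a variable."""
    name: str
    label: str
    conditional: bool = False  # True if inside a Loop or If


def toy_forward_accesses(accesses, start):
    """Collect the accesses to the same variable that may directly
    follow accesses[start]."""
    target = accesses[start].name
    found = []
    for access in accesses[start + 1:]:
        if access.name != target:
            continue
        found.append(access.label)
        # A conditional access might not execute, so keep searching.
        # An unconditional access always executes and ends the search.
        if not access.conditional:
            break
    return found


program = [
    ToyAccess("a", "a = 1"),
    ToyAccess("a", "if (c) a = 2", conditional=True),
    ToyAccess("b", "if (c) b = 0", conditional=True),
    ToyAccess("a", "x = a"),
    ToyAccess("a", "a = 3"),
]
next_accesses = toy_forward_accesses(program, 0)
```

In this toy model the conditional write is reported but does not end the search, so the unconditional read x = a is reported as well, while the later a = 3 is not.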