Module Manager

PSyclone uses a ModuleManager to locate the files that contain Fortran modules. This object acts as the top-level interface to the code making up a program and may be used to obtain the PSyIR for each Container (Fortran module) in that code. It is used by the Container import interface and by Kernel Extraction (PSyKE). The latter uses it to discover all of the source files required to make a standalone driver.

The ModuleManager is a singleton which must be obtained via ModuleManager.get(). Having obtained the instance, it may be used to search for a particular module via the get_module_info method:

ModuleManager.get_module_info(module_name)

This function returns the ModuleInfo for the specified module.

Parameters:

module_name (str) – Name of the module.

Return type:

Optional[ModuleInfo]

Returns:

object describing the requested module or None if the manager has been configured to ignore this module.

Raises:

FileNotFoundError – if the module_name is found neither in the cached data nor in the search path.

Any directory supplied via the PSyclone command-line option -d (see The psyclone command) is added to the ModuleManager as a recursive search path. Internally, the ModuleManager uses caching to avoid repeatedly searching directories, and it only accesses search paths as required. For example, if the first search path is sufficient to find all modules requested during the lifetime of the module manager, no other search path will ever be accessed. The caching also means that the ModuleManager will not detect any new files created during its lifetime.

Rather than rely on any particular naming convention to identify which Fortran source file contains a given module, the ModuleManager measures the ‘similarity’ between the target module name and the base name (i.e. the filename stripped of any path and suffix) of each source file to identify likely candidates. The standard Python difflib.SequenceMatcher.ratio method is used to obtain the similarity score. If the score is above a certain threshold (currently set to 0.7 in the ModuleManager class) then the file is read, its contents are cached within a FileInfo object, and a regular expression is used to determine whether or not it actually contains the target module. This approach is designed to minimise IO activity, which can be very costly on the types of shared filesystem common on HPC resources. The use of a FileInfo object also decouples the concept of a file from that of a module, since a file can contain more than one module.
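The candidate selection described above can be sketched with the standard library alone. This is a minimal illustration, not the ModuleManager code: the function names candidate_files and contains_module are hypothetical, the lower-casing of both names is an assumption, and the regular expression is a simplified stand-in for whatever pattern PSyclone actually uses.

```python
import os
import re
from difflib import SequenceMatcher

# Threshold quoted in the text; everything else here is illustrative.
SIMILARITY_THRESHOLD = 0.7


def candidate_files(module_name, filenames, threshold=SIMILARITY_THRESHOLD):
    """Return the filenames whose base name resembles module_name."""
    candidates = []
    for filename in filenames:
        # Strip any path and suffix to obtain the base name.
        base_name = os.path.splitext(os.path.basename(filename))[0]
        # Assumption: compare case-insensitively, as Fortran names are.
        score = SequenceMatcher(None, module_name.lower(),
                                base_name.lower()).ratio()
        if score > threshold:
            candidates.append(filename)
    return candidates


def contains_module(source, module_name):
    """Check (with a simplified regex) whether source defines module_name."""
    pattern = r"^\s*module\s+" + re.escape(module_name) + r"\b"
    flags = re.IGNORECASE | re.MULTILINE
    return re.search(pattern, source, flags) is not None
```

Only the files returned by candidate_files would need to be opened and checked with contains_module, which is how such a scheme keeps the number of file reads low.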

The ModuleManager will return a ModuleInfo object to make information about a module available. Similar to the ModuleManager, a ModuleInfo object relies heavily on caching to avoid repeatedly reading a source file or parsing it. The side effect is that changes to a source file during the lifetime of the ModuleManager will not be reflected in its information.

The ModuleManager also provides a static function that will sort a list of module dependencies, so that compiling the modules in this order (or adding them in this order to a file) will allow compilation, i.e. any module will only depend on previously defined modules:

ModuleManager.sort_modules(module_dependencies)

This function sorts the given modules so that all dependencies of a module come before the module that needs them. The input is a dictionary whose keys are the modules to be sorted, and the value for each module is the set of modules on which it depends.

Parameters:

module_dependencies (dict[str, set[str]]) – the modules to be sorted as keys, each mapped to the set of its dependencies.

Return type:

list[str]

Returns:

the sorted list of modules.
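The ordering that sort_modules is described as producing is a topological sort. The following is a minimal sketch of that idea and not the PSyclone implementation: the name sort_modules_sketch is hypothetical, and both the decision to ignore dependencies that are not keys of the dictionary and the decision to raise an error on a circular dependency are assumptions made for this example.

```python
def sort_modules_sketch(module_dependencies):
    """Order modules so that each one follows all of its dependencies."""
    # Work on a copy. Assumption: dependencies that are not themselves
    # keys (e.g. external libraries) and self-references are ignored.
    todo = {
        mod: {dep for dep in deps
              if dep in module_dependencies and dep != mod}
        for mod, deps in module_dependencies.items()
    }
    ordered = []
    while todo:
        # Modules with no unresolved dependencies can be emitted now.
        # Sorting makes the result reproducible.
        ready = sorted(mod for mod, deps in todo.items() if not deps)
        if not ready:
            # Assumption: treat a circular dependency as an error.
            raise ValueError(f"Circular dependency among: {sorted(todo)}")
        for mod in ready:
            ordered.append(mod)
            del todo[mod]
        # The emitted modules no longer block anything else.
        for deps in todo.values():
            deps.difference_update(ready)
    return ordered
```

Writing the modules to a single file, or compiling them, in the returned order means that every module is defined before it is used.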

Once a ModuleInfo has been obtained, its primary role is to provide access to the PSyIR of the Container representing the module:

ModuleInfo.get_psyir()

Returns the PSyIR representation of this module. This is based on the fparser tree (see get_parse_tree), and the information is cached. If the PSyIR must be modified, it needs to be copied, otherwise the modified tree will be returned from the cache in the future.

If the conversion to PSyIR fails then None is returned.

Returns:

PSyIR representing this module.

Return type:

psyclone.psyir.nodes.Container | NoneType

Raises:

InternalError – if the named Container (module) does not exist in the PSyIR.

However, it also provides methods (get_used_module_names, get_used_symbols_from_modules) for interrogating the parse tree, which can be useful if the source code cannot be represented in PSyIR.

An example usage of the ModuleManager and ModuleInfo objects, which prints the filenames of all modules used in tl_testkern_mod:

import os

from psyclone.parse.module_manager import ModuleManager

mod_manager = ModuleManager.get()
# Add the path to the PSyclone LFRic example codes:
mod_manager.add_search_path("../../src/psyclone/tests/test_files/"
                            "lfric")

testkern_info = mod_manager.get_module_info("tl_testkern_mod")

used_mods = testkern_info.get_used_module_names()
# Sort the modules so we get a reproducible output ordering
used_mods_list = sorted(used_mods)
for module_name in used_mods_list:
    mod_info = mod_manager.get_module_info(module_name)
    print("Module:", module_name, os.path.basename(mod_info.filename))

This prints:

Module: argument_mod argument_mod.f90
Module: constants_mod constants_mod.f90
Module: fs_continuity_mod fs_continuity_mod.f90
Module: kernel_mod kernel_mod.f90

FileInfo

FileInfo is a class that is used to store information about Fortran files.

This information can include:

  • The source code itself

  • The fparser tree information

  • The PSyIR tree information

All of this information is gathered in a single class because this also allows it to be cached (see the next section).

Caching

The ModuleManager and FileInfo support caching of the fparser tree representation of a source file. (Support for caching the PSyIR is planned.)

This caching must be explicitly enabled on the ModuleManager:

ModuleManager.get().cache_active = True

Most of the time taken to generate the PSyIR is currently spent creating the fparser tree. Consequently, caching the fparser tree leads to significant speed-ups when reading and parsing the source code of modules.

Default cache file locations

By default, the cache file is stored alongside the source file and has the same name, with the file extension replaced by .psycache. For example, the cache file for the source file foo.f90 will be called foo.psycache.

(Global) cache file folder

To avoid storing cache files together with source code files, a cache path can be provided to the module manager:

mod_manager = ModuleManager.get()
mod_manager.cache_active = True
mod_manager.cache_path = "/tmp/my_cache_path"

A cache file name will then be created based on the hash of each source code file. The combination of the provided cache_path and this cache file name is used as the storage location.

Note that the cache path directory must already exist.
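The two naming schemes (next to the source file by default, or hash-based inside a global cache folder) can be sketched as follows. This is only an illustration: the helper name cache_file_path is hypothetical, and the choice of SHA-256, the use of the .psycache suffix inside the global folder and the exact file-name format are assumptions that are not confirmed by the text above.

```python
import hashlib
import os


def cache_file_path(source_path, source_text, cache_path=None):
    """Return an illustrative cache file location for a source file."""
    if cache_path is None:
        # Default: same directory and base name as the source file,
        # with the extension replaced by .psycache.
        return os.path.splitext(source_path)[0] + ".psycache"
    # Global cache folder: derive the name from a hash of the source.
    # Assumption: SHA-256 and this name format are for illustration.
    checksum = hashlib.sha256(source_text.encode("utf-8")).hexdigest()
    return os.path.join(cache_path, checksum + ".psycache")
```

Deriving the name from the content rather than from the file name avoids clashes between source files that share a name but live in different directories.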

Caching algorithm

The caching algorithm to obtain the fparser tree OR PSyIR is briefly described as follows:

  • If fparser tree / PSyIR was read before: RETURN fparser tree or PSyIR

  • If source code is not yet read:

    • Read the content of the file

    • Create the source’s checksum.

  • Read cache file if it exists:

    • If the checksum of the cache is the same as the one of the source:

      • load the fparser tree / PSyIR from the cache file and RETURN fparser tree or PSyIR

  • Create the fparser tree / PSyIR from the source code

  • Save cache file IF it was not loaded before:

    • Update cache information

    • Store to cache file

  • RETURN fparser tree or PSyIR
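The steps above can be sketched as a small class. This is a minimal sketch under stated assumptions, not the FileInfo implementation: the class name CachedSource, the parse callable passed to it, the use of pickle as the storage format and the use of SHA-256 as the checksum are all hypothetical choices made for this example.

```python
import hashlib
import os
import pickle


class CachedSource:
    """Illustrative stand-in for a file object with a cached parse tree."""

    def __init__(self, filename, parse):
        self.filename = filename
        self._parse = parse  # hypothetical callable: source text -> tree
        self._source = None
        self._checksum = None
        self._tree = None

    def get_tree(self):
        # If the tree was obtained before, return it directly.
        if self._tree is not None:
            return self._tree
        # Read the source and create its checksum if not yet done.
        if self._source is None:
            with open(self.filename, "r", encoding="utf-8") as handle:
                self._source = handle.read()
            self._checksum = hashlib.sha256(
                self._source.encode("utf-8")).hexdigest()
        # Read the cache file if it exists and use it if checksums match.
        cache_file = os.path.splitext(self.filename)[0] + ".psycache"
        if os.path.isfile(cache_file):
            with open(cache_file, "rb") as handle:
                cached = pickle.load(handle)
            if cached["checksum"] == self._checksum:
                self._tree = cached["tree"]
                return self._tree
        # Otherwise create the tree from the source code.
        self._tree = self._parse(self._source)
        # The tree was not loaded from the cache, so store it there.
        with open(cache_file, "wb") as handle:
            pickle.dump({"checksum": self._checksum, "tree": self._tree},
                        handle)
        return self._tree
```

Because the checksum is stored with the tree, a source file that has changed since the cache was written is simply parsed again and its cache file overwritten.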