HWLOC & NETLOC
The Portable Hardware Locality (hwloc) software package provides a portable abstraction (across OS, versions, architectures, ...) of the hierarchical topology of modern architectures, including NUMA memory nodes, sockets, shared caches, cores and simultaneous multithreading. It also gathers various system attributes such as cache and memory information as well as the locality of I/O devices such as network interfaces, InfiniBand HCAs or GPUs. It primarily aims at helping applications with gathering information about modern computing hardware so as to exploit it accordingly and efficiently.
hwloc also includes the Network Locality (netloc) subproject, which focuses on detecting network topologies, thereby assembling both server-internal and network topology into a global map of the entire cluster or supercomputer.
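To make the idea of a hierarchical topology concrete, here is a minimal sketch of a toy topology tree in Python. It is not the hwloc API: the `TopoObj` class, the `build_machine` and `common_ancestor` helpers, and the Machine / Package / L3 / Core / PU layout are all assumptions chosen for illustration. The sketch shows how the deepest object shared by two processing units indicates how "close" they are.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(eq=False)  # identity-based equality: each object is a unique tree node
class TopoObj:
    """One node of a toy hardware topology tree (hypothetical, not an hwloc type)."""
    kind: str
    index: int
    parent: Optional["TopoObj"] = field(default=None, repr=False)
    children: List["TopoObj"] = field(default_factory=list, repr=False)

    def add(self, kind: str, index: int) -> "TopoObj":
        """Create a child object and attach it below this one."""
        child = TopoObj(kind, index, parent=self)
        self.children.append(child)
        return child

    def ancestors(self) -> List["TopoObj"]:
        """Return this object followed by its ancestors, up to the root."""
        chain, node = [], self
        while node is not None:
            chain.append(node)
            node = node.parent
        return chain


def build_machine(packages: int = 2, cores_per_package: int = 2,
                  pus_per_core: int = 2) -> Tuple[TopoObj, List[TopoObj]]:
    """Build an assumed layout: Machine > Package > L3 cache > Core > PU."""
    machine = TopoObj("Machine", 0)
    pus: List[TopoObj] = []
    core_id = 0
    for p in range(packages):
        cache = machine.add("Package", p).add("L3", p)
        for _ in range(cores_per_package):
            core = cache.add("Core", core_id)
            core_id += 1
            for _ in range(pus_per_core):
                pus.append(core.add("PU", len(pus)))
    return machine, pus


def common_ancestor(a: TopoObj, b: TopoObj) -> TopoObj:
    """Deepest object containing both a and b; deeper means closer together."""
    seen = {id(node) for node in a.ancestors()}
    for node in b.ancestors():
        if id(node) in seen:
            return node
    raise ValueError("objects belong to different trees")


machine, pus = build_machine()
for i, j in [(0, 1), (0, 2), (0, 4)]:
    print(f"PU{i} and PU{j} share a {common_ancestor(pus[i], pus[j]).kind}")
```

In this toy layout, two hardware threads of the same core meet at the Core object, two cores of the same package meet at the shared L3 cache, and units in different packages only meet at the Machine root. An application can use this kind of information to decide which threads should share data.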
Profiling & Measuring:
MAQAO (Modular Assembly Code Quality Analyzer and Optimizer) is developed by UVSQ. See link.
UVSQ will study enhancements of the MAQAO tool to support GPUs and to integrate it into the SILC measurement system. UVSQ will also actively support the application developers from WP4 in analyzing, porting and optimizing their codes.
Not developed by COLOC, but by an open-source community led by Jülich Supercomputing Center (JSC), with which COLOC has strong relationships.
Scalasca, developed by JSC, is a toolset for analyzing the performance behavior of parallel applications and identifying opportunities for optimization. See link.
Not developed by COLOC, but by an open-source community led by the Technical University of Dresden (ZIH), with which COLOC has strong relationships.
VampirTrace, developed by ZIH (Technische Universität Dresden, Center for Information Services and High Performance Computing), is a run-time library and toolset for instrumentation and tracing of software applications using OTF (Open Trace Format). The traces can be visualized by the Vampir and Scalasca tools. See link.
FOISOL is a massively parallel direct linear equation solver, parallelised with MPI and OpenMP, developed to handle the huge systems of equations typically obtained with the finite element method (FEM). FOISOL automatically selects a multilevel domain decomposition that minimizes the number of operations and spreads the workload as evenly as possible among the available MPI processes at each stage of the solution process. It provides a close estimate of the core memory and disk storage needed. It also reports the time to solve the given problem on the actual hardware configuration with the solution strategy chosen by FOISOL. The estimator checks that the actual problem is solvable with the available resources. FOISOL is an out-of-core solver: it adapts the solution strategy to the available core memory, thus securing a solution at the expense of time.
Input is provided in terms of element matrices and assembled right-hand sides. FOISOL can handle a large number of right-hand sides in parallel. Equation systems with millions of unknowns can be solved efficiently on an off-the-shelf modern PC, and COLOC should make it possible to further improve the efficiency of FOISOL, which has the potential to solve problems at least one order of magnitude larger than comparable freely available solvers can.
COLOC also gives FOI, in cooperation with COLOC partners, the opportunity to study, analyse, and remedy other problems, such as I/O bottlenecks that cause latencies in the dynamic scheduling of computational resources in advanced, massively parallelised multi-level solution procedures.
MUMPS (MUltifrontal Massively Parallel sparse direct Solver) is a software application for the solution of large sparse systems of linear algebraic equations on distributed memory parallel computers. It is a public domain implementation of the multifrontal method supported by CERFACS, IRIT-ENSEEIHT, and INRIA.
Although the evolution of MUMPS was not formally included in the work programme of the COLOC project, close cooperation has proved mutually profitable because of the type of problems to be solved in the COLOC application use cases.
The SLURM Workload Manager (formerly known as Simple Linux Utility for Resource Management), or Slurm for short, is a free and open-source job scheduler for the Linux kernel used by many of the world's supercomputers and computer clusters. It provides three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (typically a parallel job such as MPI) on a set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending jobs.
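The three functions above can be illustrated with a toy scheduler. This is a minimal sketch, not Slurm code and not its scheduling policy: the `Job` and `ToyScheduler` names are hypothetical, allocation is exclusive only, and the queue is strict first-in first-out, so a large job at the head of the queue blocks smaller jobs behind it.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List


@dataclass
class Job:
    name: str
    nodes_needed: int


class ToyScheduler:
    """Hypothetical FIFO scheduler with exclusive node allocation."""

    def __init__(self, node_names: List[str]) -> None:
        self.total = len(node_names)
        self.free: List[str] = list(node_names)   # nodes not allocated to any job
        self.pending: Deque[Job] = deque()        # queue of waiting jobs
        self.running: Dict[str, List[str]] = {}   # job name -> allocated nodes

    def submit(self, job: Job) -> None:
        """Queue a job, then start whatever can start (contention arbitration)."""
        if job.nodes_needed > self.total:
            raise ValueError(f"{job.name} requests more nodes than the cluster has")
        self.pending.append(job)
        self._dispatch()

    def finish(self, name: str) -> None:
        """Release the nodes of a completed job and re-examine the queue."""
        self.free.extend(self.running.pop(name))
        self._dispatch()

    def _dispatch(self) -> None:
        # Strict FIFO: only the head of the queue is considered.
        while self.pending and self.pending[0].nodes_needed <= len(self.free):
            job = self.pending.popleft()
            # Resource allocation: take the first free nodes exclusively.
            self.running[job.name] = [self.free.pop(0)
                                      for _ in range(job.nodes_needed)]


sched = ToyScheduler(["n0", "n1", "n2", "n3"])
sched.submit(Job("A", 3))   # starts immediately on three nodes
sched.submit(Job("B", 2))   # must wait: only one node is free
sched.submit(Job("C", 1))   # would fit, but waits behind B in this FIFO sketch
print("running:", sched.running, "pending:", [j.name for j in sched.pending])
sched.finish("A")           # frees three nodes; B and C can now start
print("running:", sched.running, "pending:", [j.name for j in sched.pending])
```

A production scheduler uses far richer policies (priorities, backfilling, shared nodes, time limits); the sketch only separates the three roles of allocating nodes, tracking running work, and managing the pending queue.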
Main enhancements to be carried out by the COLOC project include the design, the development, and the validation of:
- a standardized interface to enable SLURM to interact with applications for improving process placement algorithms using topology and data locality aware models created by HWLOC.
- a PMI2/x interface to improve communication between SLURM and MPI.
- an extension to exploit LUSTRE IO statistics.
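As an illustration of topology-aware process placement, the following minimal sketch maps processes to cores so that the pairs exchanging the most data land on cores that share a cache. It is a simple greedy heuristic chosen for illustration, not the algorithm developed in COLOC or implemented in SLURM; the `greedy_placement` function, the traffic matrix and the core grouping are all hypothetical.

```python
from typing import Dict, List, Sequence


def greedy_placement(traffic: Sequence[Sequence[float]],
                     core_groups: Sequence[Sequence[int]]) -> Dict[int, int]:
    """Map process ranks to core ids.

    traffic[i][j] is the (assumed) data volume sent from process i to j.
    core_groups lists the cores that share a cache, e.g. [[0, 1], [2, 3]].
    """
    n = len(traffic)
    free = [list(group) for group in core_groups]
    if sum(len(group) for group in free) < n:
        raise ValueError("not enough cores for all processes")

    # All process pairs, heaviest communication first.
    pairs = sorted(
        ((traffic[i][j] + traffic[j][i], i, j)
         for i in range(n) for j in range(i + 1, n)),
        key=lambda item: -item[0],
    )

    placement: Dict[int, int] = {}
    for volume, i, j in pairs:
        if volume == 0 or i in placement or j in placement:
            continue
        # Find a cache group that still has room for both processes.
        group = next((g for g in free if len(g) >= 2), None)
        if group is None:
            break
        placement[i] = group.pop(0)
        placement[j] = group.pop(0)

    # Processes not placed as part of a pair take the remaining cores.
    leftovers = [core for group in free for core in group]
    for rank in range(n):
        if rank not in placement:
            placement[rank] = leftovers.pop(0)
    return placement


# Assumed example: ranks 0 and 2 talk a lot, as do ranks 1 and 3.
traffic = [
    [0, 1, 100, 0],
    [1, 0, 0, 80],
    [100, 0, 0, 0],
    [0, 80, 0, 0],
]
print(greedy_placement(traffic, core_groups=[[0, 1], [2, 3]]))
```

A naive rank-order placement would put ranks 0 and 1 on the same cache even though they barely communicate. In practice, the core grouping would be derived from a topology description such as the one hwloc provides, and the traffic matrix from application knowledge or measurements.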
LUSTRE (Linux cluster)
The Lustre file system is an open-source, parallel file system that supports the requirements of leadership-class HPC and enterprise environments worldwide.
The Lustre Community website lustre.org supports the open source community — developers, admins, and users — providing downloadable Lustre releases, documentation, development tree access, issue reporting, working groups, mailing lists, and more.
As an active member of this community, Bull will contribute its experience and knowledge to COLOC and will build on COLOC work to evolve LUSTRE to better manage data locality and data storage.