2020 International Workshop on Software Engineering for Computational Science

June 3-5, 2020

Held in Conjunction with The International Conference on Computational Science


Virtual Workshop

This workshop is online and asynchronous.

If you would like to participate in the discussion, please watch the videos, read the paper (linked from the paper title), and add your question as suggested text in the Google documents below.

The authors will respond to your comment as soon as possible. Please be aware that the authors may be in a different time zone and may not be able to respond to your comments immediately.

The discussion will continue, virtually, through June 12.

Lessons learned in a decade of research software engineering GPU applications

Ben van Werkhoven, Willem Jan Palenstijn and Alessio Sclocco

After years of using Graphics Processing Units (GPUs) to accelerate scientific applications in fields as varied as tomography, computer vision, climate modeling, digital forensics, geospatial databases, particle physics, radio astronomy, and localization microscopy, we noticed a number of technical, socio-technical, and non-technical challenges that Research Software Engineers (RSEs) may run into. While some of these challenges, such as managing different programming languages within a project, or having to deal with different memory spaces, are common to all software projects involving GPUs, others are more typical of scientific software projects. Among these challenges we include changing resolutions or scales, maintaining an application over time and making it sustainable, and evaluating both the obtained results and the achieved performance. In this paper, we present the challenges and lessons learned from research software engineering GPU applications.

Unit Tests of Scientific Software: A Study on SWMM

Zedong Peng, Xuanyi Lin and Nan Niu

Testing helps assure software quality by executing a program and uncovering bugs. Scientific software developers often find it challenging to carry out systematic and automated testing due to reasons like inherent model uncertainties and complex floating point computations. We report in this paper a manual analysis of the unit tests written by the developers of the Storm Water Management Model (SWMM). The results show that the 1,458 SWMM tests have a 54.0% code coverage and an 82.4% user manual coverage. We also observe a "getter-setter-getter" testing pattern from the SWMM unit tests. Based on these results, we offer insights to improve test development and coverage.
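
As a rough illustration of what a "getter-setter-getter" test might look like, here is a minimal Python sketch. It assumes the pattern means reading a property, changing it, and reading it again to confirm the change. The `ToySubcatchment` class and its methods are hypothetical stand-ins invented for this example; they are not part of SWMM or its API.

```python
class ToySubcatchment:
    """Hypothetical model object used only for illustration (not SWMM code)."""

    def __init__(self, width=10.0):
        self._width = width

    def get_width(self):
        return self._width

    def set_width(self, value):
        if value <= 0:
            raise ValueError("width must be positive")
        self._width = value


def getter_setter_getter(obj, new_value):
    """Read a property, change it, and read it again."""
    before = obj.get_width()   # getter: record the initial value
    obj.set_width(new_value)   # setter: change the value
    after = obj.get_width()    # getter: confirm the change took effect
    return before, after


before, after = getter_setter_getter(ToySubcatchment(width=10.0), 25.0)
assert before == 10.0
assert after == 25.0
```

A test of this shape mainly checks that a value round-trips through the interface; it does not by itself exercise the numerical model behind it.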

NUMA-Awareness as a Plug-In for an Eventify-based Fast Multipole Method

Laura Morgenstern, David Haensel, Andreas Beckmann and Ivo Kabadshow

Following the trend towards Exascale, today's supercomputers consist of increasingly complex and heterogeneous compute nodes. To exploit the performance of these systems, research software in HPC needs to keep up with the rapid development of hardware architectures. Since manual tuning of software to each and every architecture is neither sustainable nor viable, we aim to tackle this challenge through appropriate software design. In this article, we aim to improve the performance and sustainability of FMSolvr, a parallel Fast Multipole Method for Molecular Dynamics, by adapting it to Non-Uniform Memory Access architectures in a portable and maintainable way. The parallelization of FMSolvr is based on Eventify, an event-based tasking framework we co-developed with FMSolvr. We describe a layered software architecture that enables the separation of the Fast Multipole Method from its parallelization. The focus of this article is on the development and analysis of a reusable NUMA module that improves performance while keeping both layers separated to preserve maintainability and extensibility. By means of the NUMA module we introduce diverse NUMA-aware data distribution, thread pinning and work stealing policies for FMSolvr. During the performance analysis the modular design of the NUMA module was advantageous since it facilitates combination, interchange and redesign of the developed policies. The performance analysis reveals that the run-time of FMSolvr is reduced by 21% from 1.48 ms to 1.16 ms through these policies.

Boosting Group-level Synergies by Using a Shared Modeling Framework

Yunus Sevinchan, Benjamin Herdeanu, Harald Mack, Lukas Riedel and Kurt Roth

Modern software engineering has established sophisticated tools and workflows that enable distributed development of high-quality software. Here, we present our experiences in adopting these workflows to collectively develop, maintain, and use research software, specifically: a modeling framework for complex and evolving systems. We exemplify how sharing this modeling framework within our research group helped convey software engineering best practices, fostered cooperation, and boosted synergies. Together, these experiences illustrate that the adoption of modern software engineering workflows is feasible in the dynamically changing academic context, and how these practices facilitate reliability, reproducibility, reusability, and sustainability of research software, ultimately improving the quality of the resulting scientific output.

Testing Research Software: A Case Study

Nasir Eisty, Danny Perez, Jeffrey Carver, J. David Moulton and Hai Ah Nam

Background: The increasing importance of software for the conduct of various types of research raises the necessity of proper testing to ensure correctness. The unique characteristics of research software produce challenges in the testing process that require attention. Aims: Therefore, the goal of this paper is to share the experience of implementing a testing framework using a statistical approach for a specific type of research software, i.e., non-deterministic software. Method: Using the ParSplice research software project as a case, we implemented a testing framework based on a statistical testing approach called the Multinomial Test. Results: Using the new framework, we were able to test the ParSplice project and demonstrate correctness in a situation where traditional methodical testing approaches were not feasible. Conclusions: This study opens up the possibilities of using statistical testing approaches for research software that can overcome some of the inherent challenges involved in testing non-deterministic research software.
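
To make the idea of statistically testing non-deterministic output more concrete, the following Python sketch implements a generic exact multinomial test. It is not the ParSplice testing framework, and all function names are hypothetical. It assumes that each run of the software ends in one of a few discrete outcomes whose expected probabilities are known, and it enumerates every possible count vector, so it is only practical for a small number of runs and outcomes.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Probability of observing exactly `counts` under outcome probabilities `probs`."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)
    return coef * prod(p ** c for p, c in zip(probs, counts))


def compositions(n, k):
    """Yield every tuple of k non-negative integers that sums to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


def exact_multinomial_p_value(observed, probs):
    """Sum the probabilities of all outcomes no more likely than the observed one."""
    p_obs = multinomial_pmf(observed, probs)
    n, k = sum(observed), len(observed)
    total = 0.0
    for counts in compositions(n, k):
        p = multinomial_pmf(counts, probs)
        if p <= p_obs * (1 + 1e-9):  # small tolerance for floating point ties
            total += p
    return total


# Illustrative data: 10 runs, two equally likely outcomes, all runs ended in the first.
p_value = exact_multinomial_p_value((10, 0), (0.5, 0.5))
assert p_value < 0.05  # the observed counts are unlikely under the expected distribution
```

In a test suite built this way, a small p-value would flag that the observed outcome frequencies disagree with the expected distribution, while the significance threshold remains a choice for the tester.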

APE: A Command-Line Tool and API for Automated Workflow Composition

Vedran Kasalica and Anna-Lena Lamprecht

Automated workflow composition is bound to take the work with scientific workflows to the next level. On top of today's comprehensive eScience infrastructure, it enables the automated generation of possible workflows for a given specification. However, functionality for automated workflow composition tends to be integrated with one of the many available workflow management systems, and is thus difficult or impossible to apply in other environments. Therefore we have developed APE (the Automated Pipeline Explorer) as a command-line tool and API for automated composition of scientific workflows. APE is easily configured to a new application domain by providing it with a domain ontology and semantically annotated tools. It can then be used to synthesize purpose-specific workflows based on a specification of the available workflow inputs, desired outputs and possibly additional constraints. The workflows can further be transformed into executable implementations and/or exported into standard workflow formats. In this paper we describe APE v1.0 and discuss lessons learned from applications in bioinformatics and geosciences.

Last Updated on June 1, 2020 by Jeffrey Carver