IO Challenges for Human Brain Atlasing Using Deep Learning Methods - An In-Depth Analysis

2019 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), pages 291-298. (February 2019)
DOI: 10.1109/EMPDP.2019.8671630

Abstract

The use of Deep Learning methods has been identified as a key opportunity for enabling the processing of extreme-scale scientific datasets. Feeding data into compute nodes equipped with several high-end GPUs at a sufficiently high rate is a known challenge. Facilitating the processing of these datasets thus requires the ability to store petabytes of data as well as to access the data with very high bandwidth. In this work, we look at two Deep Learning use cases for cytoarchitectonic brain mapping. These applications are very challenging for the underlying IO system. We present an in-depth analysis of their IO requirements and performance. Both applications are limited by IO performance, as the training processes often have to wait several seconds for new training data. Both applications read random patches from a collection of large HDF5 datasets or TIFF files, which results in many small non-consecutive accesses to the parallel file systems. By using a chunked data format or storing temporary copies of the required patches, the IO performance can be improved significantly. This leads to a decrease of the total runtime of up to 80 percent.
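
As a rough illustration of why random patch reads produce many small non-consecutive accesses, and why a chunked layout can help, the following sketch counts the read requests needed for one square patch under two storage layouts. It is not taken from the paper: the function names, the 2D row-major model, and the assumption that each chunk is stored contiguously and fetched as a whole are all illustrative simplifications.

```python
import random


def row_major_reads(patch, width):
    """Contiguous runs needed for a patch x patch region of a row-major image.

    Each patch row is a separate run unless the patch spans the full width,
    in which case consecutive rows are adjacent on disk.
    """
    return 1 if patch == width else patch


def chunked_reads(x, y, patch, chunk):
    """Chunks touched by the patch at (x, y) when the image is stored
    as chunk x chunk tiles (assumption: one read request per chunk)."""
    cols = (x + patch - 1) // chunk - x // chunk + 1
    rows = (y + patch - 1) // chunk - y // chunk + 1
    return cols * rows


def average_reads(width, height, patch, chunk, samples=1000, seed=0):
    """Average read count per random patch for both layouts."""
    rng = random.Random(seed)
    total_rows = 0
    total_chunks = 0
    for _ in range(samples):
        # Pick a random top-left corner so the patch stays inside the image.
        x = rng.randrange(width - patch + 1)
        y = rng.randrange(height - patch + 1)
        total_rows += row_major_reads(patch, width)
        total_chunks += chunked_reads(x, y, patch, chunk)
    return total_rows / samples, total_chunks / samples


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration.
    rows_avg, chunks_avg = average_reads(100_000, 100_000, patch=256, chunk=256)
    print(f"row-major: {rows_avg:.1f} reads/patch, chunked: {chunks_avg:.1f} reads/patch")
```

Under this simplified model, a patch no larger than the chunk size touches at most four chunks, whereas the row-major layout needs one small read per patch row. The chunked reads fetch more bytes than the patch itself, so the trade-off in practice depends on the file system and chunk size.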
