%0 Conference Paper
%1 Simmhan:sciencecloud:2010
%A Simmhan, Yogesh
%A Ramakrishnan, Lavanya
%B International Workshop on Scientific Cloud Computing (ScienceCloud)
%D 2010
%I ACM
%K azure, cloud, eScience, hpc, management, msr, paper peer resource reviewed, scheduling, short workflows
%P 445-450
%R 10.1145/1851476.1851541
%T Comparison of resource platform selection approaches for scientific workflows
%U http://ceng.usc.edu/~simmhan/pubs/simmhan-sciencecloud-2010.pdf
%X Cloud computing is increasingly considered as an additional computational
resource platform for scientific workflows. The cloud offers an opportunity
to scale out applications from desktops and local cluster resources.
Each platform has different properties (e.g., queue wait times in
high performance systems, virtual machine startup overhead in clouds)
and characteristics (e.g., custom environments in cloud) that make
choosing from these diverse resource platforms for a workflow execution
a challenge for scientists. Scientists are often faced with deciding
resource platform selection trade-offs with limited information on
the actual workflows. While many workflow planning methods have explored
resource selection or task scheduling, these methods often require
fine-scale characterization of the workflow that is onerous for a
scientist. In this paper, we describe our early exploratory work
in using blackbox characteristics for a cost-benefit analysis of
using different resource platforms. In our blackbox method, we use
only limited high-level information on the workflow length, width,
and data sizes. The length and width are indicative of the workflow
duration and parallelism. We compare the effectiveness of this approach
to other resource selection models using two exemplar scientific
workflows on desktop, local cluster, HPC center, and cloud platforms.
Early results suggest that the blackbox model often makes the same
resource selections as a more fine-grained whitebox model. We believe
the simplicity of the blackbox model can help inform a scientist
on the applicability of a new resource platform, such as cloud resources,
even before porting an existing workflow.
%@ 978-1-60558-942-8
@inproceedings{Simmhan:sciencecloud:2010,
abstract = {Cloud computing is increasingly considered as an additional computational
resource platform for scientific workflows. The cloud offers an opportunity
to scale out applications from desktops and local cluster resources.
Each platform has different properties (e.g., queue wait times in
high performance systems, virtual machine startup overhead in clouds)
and characteristics (e.g., custom environments in cloud) that make
choosing from these diverse resource platforms for a workflow execution
a challenge for scientists. Scientists are often faced with deciding
resource platform selection trade-offs with limited information on
the actual workflows. While many workflow planning methods have explored
resource selection or task scheduling, these methods often require
fine-scale characterization of the workflow that is onerous for a
scientist. In this paper, we describe our early exploratory work
in using blackbox characteristics for a cost-benefit analysis of
using different resource platforms. In our blackbox method, we use
only limited high-level information on the workflow length, width,
and data sizes. The length and width are indicative of the workflow
duration and parallelism. We compare the effectiveness of this approach
to other resource selection models using two exemplar scientific
workflows on desktop, local cluster, HPC center, and cloud platforms.
Early results suggest that the blackbox model often makes the same
resource selections as a more fine-grained whitebox model. We believe
the simplicity of the blackbox model can help inform a scientist
on the applicability of a new resource platform, such as cloud resources,
even before porting an existing workflow.},
acmid = {1851541},
added-at = {2014-08-13T04:08:36.000+0200},
author = {Simmhan, Yogesh and Ramakrishnan, Lavanya},
biburl = {https://www.bibsonomy.org/bibtex/2f0a367c024128f84b3ce5f8efe99641b/simmhan},
booktitle = {International Workshop on Scientific Cloud Computing (ScienceCloud)},
doi = {10.1145/1851476.1851541},
interhash = {2783c4182fa1b06ee1b37f3c0fa35aad},
intrahash = {f0a367c024128f84b3ce5f8efe99641b},
isbn = {978-1-60558-942-8},
keywords = {azure, cloud, eScience, hpc, management, msr, paper peer resource reviewed, scheduling, short workflows},
location = {Chicago, Illinois},
month = {June},
numpages = {6},
owner = {Simmhan},
pages = {445--450},
publisher = {ACM},
series = {High Performance Distributed Computing (HPDC)},
timestamp = {2014-08-13T04:08:36.000+0200},
title = {Comparison of resource platform selection approaches for scientific workflows},
url = {http://ceng.usc.edu/~simmhan/pubs/simmhan-sciencecloud-2010.pdf},
year = 2010
}