Abstract

Skyline queries are a well-known technique for explorative retrieval, multi-objective optimization problems, and personalization tasks in databases. They are widely acclaimed for their intuitive query formulation mechanisms. However, when operating on incomplete datasets, skyline query processing is severely hampered and often has to resort to error-prone heuristics. Unfortunately, incomplete datasets are a frequent phenomenon due to the widespread use of automated information extraction and aggregation. In this paper, we evaluate and compare various established heuristics for adapting skylines to incomplete datasets, focusing specifically on the error they impose on the skyline result. Building upon these results, we argue for improving the skyline result quality by employing crowd-enabled databases. These allow dynamic outsourcing of some database operators to human workers, thereby enabling the elicitation of missing values at runtime. Unfortunately, each crowd-sourcing operation incurs monetary and query runtime costs. Therefore, our main contribution is a sophisticated error model that allows us to concentrate specifically on those tuples which are highly likely to be error-prone, while relying on established heuristics for safer tuples. This technique of focused crowd-sourcing allows us to strike a perfect balance between costs and result quality.
