Selecting a well-performing algorithm for a given task or dataset can be time-consuming and tedious, but is crucial for the successful day-to-day business of developing new AI & ML applications. Algorithm Selection (AS) mitigates this through a meta-model leveraging meta-information about previous tasks. However, most of the available AS methods are error-prone because they characterize a task by either cheap-to-compute properties of the dataset or evaluations of cheap proxy algorithms, called landmarks. In this work, we extend the classical AS data setup to include multi-fidelity information and empirically demonstrate how meta-learning on algorithms’ learning behaviour allows us to exploit cheap test-time evidence effectively and combat myopia significantly. We further postulate a budget-regret trade-off w.r.t. the selection process. Our new selector MASIF is able to jointly interpret online evidence on a task in the form of varying-length learning curves without any parametric assumption by leveraging a transformer-based encoder. This opens up new possibilities for guided rapid prototyping in data science on cheaply observed partial learning curves.
