Interlinking knowledge bases is widely recognized as an important but challenging problem. A significant amount of research has been undertaken to provide solutions to this problem with varying degrees of automation and user involvement. In this paper, we present a two-stage experiment for the creation of gold standards that act as benchmarks for several interlinking algorithms. In the first stage, the gold standards are generated through a manual validation process, highlighting the role of users. In the second stage, using the gold standards obtained from the first, we assess the performance of human evaluators as well as supervised interlinking algorithms. We evaluate our approach on several data interlinking tasks with respect to precision, recall, and F-measure. Additionally, we perform a qualitative analysis of the types of errors made by humans and machines.
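As an illustration of the evaluation setup, the sketch below computes precision, recall, and F-measure for a set of predicted links against a gold standard of links. This is not code from the paper; the link sets and entity identifiers are hypothetical, and links are modeled simply as pairs of entity IDs.

```python
# Illustrative sketch (not the paper's implementation): evaluating a set of
# predicted entity links against a gold standard of correct links.
# Entity identifiers such as "a:1" are hypothetical placeholders.

def evaluate_links(predicted: set, gold: set) -> tuple:
    """Return (precision, recall, f_measure) for predicted vs. gold links."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure

# Hypothetical gold standard and algorithm output.
gold = {("a:1", "b:1"), ("a:2", "b:2"), ("a:3", "b:3"), ("a:4", "b:4")}
predicted = {("a:1", "b:1"), ("a:2", "b:2"), ("a:5", "b:9")}

p, r, f = evaluate_links(predicted, gold)
# 2 of 3 predictions are correct (p = 2/3); 2 of 4 gold links found (r = 1/2).
```

The same three metrics can be applied both to an interlinking algorithm's output and to links produced by human evaluators, which is what allows the two to be compared against a common gold standard.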