R2 is a measure of how much better (R2 > 0) a model fits the data than simply predicting the mean of the output variable.
The article states that adding new input variables will always increase the R2 value. This is true when the model is fit and R2 is computed on the same data. However, as soon as there is a training/test split, this does not (and should not) hold; consider, for example, a very strong outlier in the test data.
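To make this concrete, here is a minimal sketch (using numpy; the data, seed, and helper names are made up for illustration) showing that adding a pure-noise feature can never lower the in-sample R2 of an ordinary least squares fit, while on held-out test data there is no such guarantee:

```python
import numpy as np

rng = np.random.default_rng(0)

def r2(y, y_pred):
    # R^2 = 1 - SSE / SST: improvement over always predicting the mean of y
    ss_res = np.sum((y - y_pred) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

def ols(X, y):
    # ordinary least squares with an intercept column; returns coefficients
    Xi = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xi, y, rcond=None)
    return beta

def predict(X, beta):
    return np.column_stack([np.ones(len(X)), X]) @ beta

n = 200
x = rng.normal(size=(n, 1))
y = 2.0 * x[:, 0] + rng.normal(scale=0.5, size=n)
noise = rng.normal(size=(n, 1))   # feature unrelated to y
X_big = np.hstack([x, noise])

# In-sample: the noise feature can only increase (never decrease) R^2,
# because OLS on the larger feature set can at worst ignore it
r2_base = r2(y, predict(x, ols(x, y)))
r2_big = r2(y, predict(X_big, ols(X_big, y)))
assert r2_big >= r2_base - 1e-9

# Train/test split: on held-out data the extra feature need not help
tr, te = slice(0, 150), slice(150, None)
r2_test_base = r2(y[te], predict(x[te], ols(x[tr], y[tr])))
r2_test_big = r2(y[te], predict(X_big[te], ols(X_big[tr], y[tr])))
# r2_test_big may well fall below r2_test_base -- no guarantee either way
```

The in-sample inequality holds because the larger model nests the smaller one: the extra coefficient can always be set to zero, so the residual sum of squares cannot increase.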
## Note on adjusted R2
"Adjusted R2 does not have the same interpretation as R2—while R2 is a measure of fit, adjusted R2 is instead a comparative measure of suitability of alternative nested sets of explanators. As such, care must be taken in interpreting and reporting this statistic. Adjusted R2 is particularly useful in the feature selection stage of model building."
This means that the **adjusted R2** is basically for fitting models (not necessarily for prediction) where the model should be kept as simple as possible, i.e., it implements a form of Occam's razor. This is similar to the general behavior of Bayesian model comparison, where a better fit achieved by adding variables is weighed against the increased complexity of the model (see MixedTrails).
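As a small illustration, this is the standard adjusted R2 formula, with sample size `n` and `p` explanators (the numbers below are made up): holding the raw fit fixed, adding variables lowers the adjusted score, which is exactly the Occam's-razor penalty described above.

```python
def adjusted_r2(r2, n, p):
    # Standard adjustment: penalise R^2 by the number of explanators p
    # (n = number of observations). The score drops when added variables
    # do not improve the fit enough to justify the extra complexity.
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Same raw fit (R^2 = 0.9), but the larger model scores lower:
print(adjusted_r2(0.9, n=100, p=5))   # ~0.895
print(adjusted_r2(0.9, n=100, p=10))  # ~0.889
```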