J. Werfel, X. Xie, and H. Seung. In, MIT Press, (2003)
Discussion of learning curves for stochastic gradient descent.
Besides gradient-based approaches, the paper briefly describes (with additional references) the weight perturbation and node perturbation approaches.
J. Aslam and M. Frost. Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, page 449--450. Toronto, Canada, ACM Press, (July 2003)
D. Lin. ICML '98: Proceedings of the Fifteenth International Conference on Machine Learning, page 296--304. San Francisco, CA, USA, Morgan Kaufmann Publishers Inc., (1998)
D. Bollegala, Y. Matsuo, and M. Ishizuka. WSDM '09: Proceedings of the Second ACM International Conference on Web Search and Data Mining, page 104--113. New York, NY, USA, ACM, (2009)
C. Cattuto, D. Benz, A. Hotho, and G. Stumme. The Semantic Web - ISWC 2008, Proc. Intl. Semantic Web Conference 2008, volume 5318 of LNAI, page 615--631. Heidelberg, Springer, (2008)