We have started to see this port issue crop up over the past 2 months as our unit testing has grown.
The latest spate of problems was, however, fixed simply by configuring our Linux instances with a higher limit on open file descriptors.
On one of the systems showing this local port issue, we noticed that the default limit for open files was a mere 2,000.
[joakim@lapetus jetty]$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 10.04 LTS
[joakim@lapetus jetty]$ ulimit -n
So we updated /etc/security/limits.conf to raise the soft limit to 20,000 (with a hard limit of 40,000), and after a reboot this solved our bad local port issues.
[joakim@lapetus jetty]$ grep nofile /etc/security/limits.conf
# - nofile - max number of open files
* hard nofile 40000
* soft nofile 20000
[joakim@lapetus jetty]$ ulimit -a | grep -i file
core file size (blocks, -c) 0
file size (blocks, -f) unlimited
open files (-n) 20000
file locks (-x) unlimited
This new ulimit helped our unit testing on the systems having issues. Our analysis shows that our aggressive unit testing, which repeatedly starts and stops a Jetty server (on a system-assigned port, requested with the special port 0), consumes sockets faster than they can be recycled back under the "open files" ulimit. This eventually caused all of our unit tests to fail with a local port of "-1". See examples of the error messages at https://bugs.eclipse.org/bugs/show_bug.cgi?id=310634
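To check whether a test run is really approaching the limit, the number of descriptors a process holds can be compared with that process's own limit. The following is a minimal sketch, assuming a Linux system with /proc mounted. The fd_usage helper name is hypothetical and not part of Jetty or our build, and the example samples the current shell, so substitute the PID of the JVM running the tests.

```shell
#!/bin/bash
# fd_usage PID: print "open/soft-limit" for a running process.
# Hypothetical helper; relies on the Linux /proc filesystem.
fd_usage() {
    local pid=$1
    local open limit
    # Each entry in /proc/PID/fd is one open descriptor (files, pipes, sockets).
    open=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l)
    # In /proc/PID/limits the 4th field of "Max open files" is the soft limit.
    limit=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
    echo "$open/$limit"
}

# Sample this shell's own usage; replace $$ with the test JVM's PID.
fd_usage $$
```

Running this repeatedly while the tests execute should show the first number climbing toward the second if descriptors are being exhausted.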
Having a loop that keeps attempting to start the server until it gets a valid port number (as seen in the Hadoop codebase) will not help once the "open files" ulimit is exhausted. The best choice we've been able to come up with is to simply increase the "open files" ulimit.
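Where editing limits.conf is not an option, the soft limit can also be raised for just the shell that launches the tests, since an unprivileged process may raise its soft limit up to the hard limit. This is only a sketch and not what we did on our systems: the variable name want is hypothetical, 20000 is simply the soft limit from our limits.conf above, and the change affects only this shell and the processes started from it.

```shell
#!/bin/bash
# Desired soft limit for this session (assumption: same value as our limits.conf).
want=20000

# The hard limit is the ceiling an unprivileged process may raise its soft limit to.
hard=$(ulimit -H -n)

if [ "$hard" = "unlimited" ] || [ "$hard" -ge "$want" ]; then
    ulimit -S -n "$want"
else
    # The hard limit is lower than requested; settle for as much as is allowed.
    ulimit -S -n "$hard"
fi

echo "soft nofile limit now: $(ulimit -S -n)"
```

Any test run started from this shell afterwards inherits the new soft limit.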