[PL #3916] Bug in Proper...
Mark Huang via RT
devel at planet-lab.org
Tue Jan 25 10:46:24 EST 2005
Email Recipients (see http://www.planet-lab.org/Support)
Requestor: justin at cs.arizona.edu
Ticket Ccs: mlhuang at cs.princeton.edu, smuir at cs.princeton.edu, stork at cs.arizona.edu, vivek at cs.princeton.edu
> When we call Proper on the production nodes sometimes there is no
> response (the call waits forever for a response that will presumably
> never come).
> This is currently happening on planetlab1.cs.purdue.edu (with
> arizona_stork) in case that aids in troubleshooting... We are happy
> to do whatever we can to help find this bug...
This may be related to the loopback TCP problem that Vivek (and others)
are having on nodes. I am pulling my hair out trying to figure out what
the problem could be; every lead that I've followed up on has gone cold.
In your code, can you time out the connect() request after a few (>=5)
seconds and retry forever? If your connect() request eventually
succeeds, then you're probably seeing the same thing Vivek is seeing.
Unfortunately, it won't bring us closer to a solution since Vivek's test
case is easy to reproduce already, but it should just be a matter of
time before we can track it down.
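
For illustration, the timeout-and-retry loop suggested above might look
roughly like the following. This is only a sketch in Python, not code
from Proper or Stork: the function name connect_with_retry, the pause
between attempts, and the max_attempts parameter are hypothetical, and
the host and port that Proper listens on are left as arguments because
they are not stated in this ticket.

```python
import socket
import time


def connect_with_retry(host, port, timeout=5.0, pause=1.0, max_attempts=None):
    """Connect to (host, port), giving up on each attempt after `timeout` seconds.

    Retries forever when max_attempts is None (the behavior suggested in
    the ticket). Returns (connected_socket, number_of_attempts).
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            # create_connection applies `timeout` to the connect() call itself.
            sock = socket.create_connection((host, port), timeout=timeout)
            return sock, attempts
        except OSError:
            # Covers both a timed-out connect() and an outright refusal.
            time.sleep(pause)
    raise TimeoutError("no connection after %d attempts" % attempts)
```

If a call such as connect_with_retry("127.0.0.1", port) eventually
returns with more than one attempt, that would be consistent with the
loopback TCP symptom described above. Logging the attempt count and the
elapsed time per attempt may help when comparing against Vivek's test
case.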