[Planetlab-users] recvfrom blocking on DNS (in China)
Emil Sit
sit+planetlab at MIT.EDU
Mon May 9 10:40:19 EDT 2005
Has anyone observed nodes blocking in recvfrom on a socket to a DNS
server for very long periods of time? For example:
> mit_dht at lzu2.6planetlab.edu.cn:~
> [0] $ strace -p 20267
> Process 20267 attached - interrupt to quit
> recvfrom(4, <unfinished ...>
> mit_dht at lzu2.6planetlab.edu.cn:~
> [0] $ lsof -p 20267
> COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
[...]
> wget 20267 mit_dht 3w REG 253,2 33 3361614 /tmp/sfs-0.8pre-1.i386.sum
> wget 20267 mit_dht 4u IPv4 573940260 UDP lzu2.6planetlab.edu.cn:38300->CS.NIC.EDU.CN:domain
The wget manpage suggests that wget does not enforce a DNS lookup
timeout of its own and relies on whatever timeout the system libraries
apply; ltrace indicates that it is just calling gethostbyname. Doesn't
the system library time out DNS lookups?
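As a possible workaround while this is unresolved, here is a minimal
sketch of bounding a blocking lookup from the caller's side. It is not
what wget or stork do; the function name resolve_with_deadline, its
deadline_s parameter, and the injectable resolver argument are all
hypothetical names chosen for illustration. The lookup runs in a daemon
thread, and the caller stops waiting once the deadline passes.

```python
import queue
import socket
import threading


def resolve_with_deadline(hostname, deadline_s=5.0, resolver=socket.gethostbyname):
    """Return the address for hostname, or None if no answer arrives in time.

    Hypothetical helper: gethostbyname takes no timeout argument, so the
    blocking call is moved to a worker thread and the caller waits on a queue.
    """
    result = queue.Queue(maxsize=1)

    def worker():
        try:
            result.put(("ok", resolver(hostname)))
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            result.put(("error", exc))

    # Daemon thread: a lookup that never returns will not keep the process alive.
    threading.Thread(target=worker, daemon=True).start()

    try:
        status, value = result.get(timeout=deadline_s)
    except queue.Empty:
        # The worker is still stuck in the lookup; it is abandoned, not cancelled.
        return None
    if status == "error":
        raise value
    return value
```

Calling resolve_with_deadline("CS.NIC.EDU.CN", deadline_s=10.0) would
return an address string, return None after ten seconds, or raise
socket.gaierror if the resolver reports a failure. Note that the stuck
thread (and its socket) stays around until the process exits, so this
only hides the hang from the caller; it does not explain it.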
I've seen several other nodes in that state, including:
pku2.6planetlab.edu.cn
cut1.6planetlab.edu.cn
xmu1.6planetlab.edu.cn
zju1.6planetlab.edu.cn
uestc2.6planetlab.edu.cn
Some of those nodes are trying to run stork to install packages;
others are just running wget. Logging into those nodes suggests that
they can successfully resolve via DNS (for example, the lsof output
was able to show the name CS.NIC.EDU.CN).
I've posted to support [PL #5672] and Mark H suggested I inquire
more generally.