[Planetlab-devel] node manager

Daniel Hokka Zakrisson dhokka at CS.Princeton.EDU
Tue Dec 11 13:48:26 EST 2007


Hi Thierry,

Quoting Thierry Parmentelat <thierry.parmentelat at sophia.inria.fr>:
> I'm moving here the thread that started on the cvs ml
>
> The issues I have with nm are as follows - from this list you'll see 
> that I could use some help :-)
>
> - when I started digging yesterday, the external symptom was that 
> slices did not get created; at least I could not enter the node 
> through ssh
>
> - there were a lot of issues related to conf_files, so upon faiyaz's 
> suggestion, and in an attempt to focus on slice creation, I have 
> commented out all modules but 'sm' in nm.py
>
> - I had made the indentation change because I was seeing this kind of 
> message, which suggested there was something wrong with calling 
> set_ipaddresses_config. I haven't committed that back yet, but here 
> is what I am now getting with the original version
>
> Tue Dec 11 14:41:53 2007: operation on ts_slicetest1 failed. 
> Traceback (most recent call last):
>  File "/usr/share/NodeManager/accounts.py", line 168, in _run
>    cmd[0](*cmd[1:])
>  File "/usr/share/NodeManager/accounts.py", line 129, in _ensure_created
>    if not isinstance(self._acct, next_class): self._acct = next_class(rec)
>  File "/usr/share/NodeManager/sliver_vs.py", line 65, in __init__
>    self.configure(rec)
>  File "/usr/share/NodeManager/sliver_vs.py", line 83, in configure
>    self.set_resources()
>  File "/usr/share/NodeManager/sliver_vs.py", line 175, in set_resources
>    self.set_ipaddresses_config(self.rspec['ip_addresses'])
>  File "/usr/lib/python2.5/site-packages/vserver.py", line 230, in 
> set_ipaddresses_config
>    self.set_ipaddresses(addresses)
>  File "/usr/lib/python2.5/site-packages/vserver.py", line 219, in 
> set_ipaddresses
>    vserverimpl.netremove(self.ctx, "all")
> OSError: [Errno -22] Unknown error 4294967274
>
> and I have no clue what this -22 error actually means

That's EINVAL (invalid argument). It would be interesting to know what 
causes it, i.e. whether it comes from the kernel or from userspace.
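
For reference, the odd number in the message is just -22 seen as an 
unsigned 32-bit integer, and 22 is EINVAL on Linux. A quick way to check 
that, as a sketch only (decode_errno is a name I made up here, it is not 
part of vserver.py):

```python
import errno
import os

def decode_errno(value):
    """Return (signed_value, symbolic_name) for an errno as printed in a traceback."""
    # Values >= 2**31 are negative numbers reinterpreted as unsigned 32-bit.
    if value >= 2 ** 31:
        value -= 2 ** 32
    # errno.errorcode maps the positive errno number to its symbolic name.
    name = errno.errorcode.get(abs(value), "UNKNOWN")
    return value, name

signed, name = decode_errno(4294967274)
print("%d %s (%s)" % (signed, name, os.strerror(abs(signed))))
```

This only tells us what the code means, not which layer returned it.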

> - as I wrote in a commit log this morning, generally speaking I think 
> sliver_vs.set_resources should be much more conservative and protect 
> all calls to the underlying Vserver object in try/except clauses. In 
> the example above, an error down below causes control to come up to 
> accounts._run, which breaks the logic: the thread queues don't get 
> polled and all is broken.
>
> - generally speaking, it's really hard to understand where error 
> messages actually end up
> I'm also trying to improve this on the fly, but that's really not perfect yet
> the error messages directly printed from vserver.py somehow get lost 
> in daemon mode.
> for instance, I need to run without the -d option to see the 
> following kind of errors, which are not too useful either
> Unexpected error with getrlimit for context 501
> Unexpected error with getrlimit for context 500
> Unexpected error with setrlimit for running context 500
> Unexpected error with getrlimit for context 500
> Unexpected error with setrlimit for running context 500
> ...

You're right, VServer.__init__ ought to take another argument that is 
used for logging.
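
Something along these lines, as a rough sketch only. VServerSketch, 
guarded and the log argument are placeholder names for illustration, 
not the actual vserver.py code, and the real constructor takes other 
arguments that I have left out:

```python
import sys

class VServerSketch(object):
    """Hypothetical stand-in for VServer; it only shows the logging argument."""

    def __init__(self, name, log=None):
        self.name = name
        # Fall back to stderr when the caller does not supply a log function.
        self.log = log or (lambda msg: sys.stderr.write(msg + "\n"))

    def guarded(self, what, func, *args):
        """Call func(*args); log an OSError instead of letting it propagate."""
        try:
            func(*args)
            return True
        # "as" needs Python 2.6 or later; on 2.5 write "except OSError, e:".
        except OSError as e:
            self.log("%s: %s failed: %s" % (self.name, what, e))
            return False

# Usage: the caller passes its own log function, here a list's append.
messages = []
vs = VServerSketch("ts_slicetest1", log=messages.append)

def failing():
    raise OSError(22, "Invalid argument")

ok = vs.guarded("set_ipaddresses_config", failing)
```

That way the node manager could pass in its own logger and the messages 
would no longer depend on where stdout or stderr point in daemon mode. 
The same wrapper also covers your earlier point about protecting the 
calls in set_resources.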

> - and on the same track, some other errors turn out to get logged ... 
> in the slice's /var/log/boot.log
> took me a while to figure out
> [in slice] # cat /var/log/boot.log
> Tue Dec 11 14:21:16 2007: starting the virtual server ts_slicetest1
> Traceback (most recent call last):
>  File "/usr/lib/python2.5/site-packages/vserver.py", line 447, in start
>  File "/usr/lib/python2.5/site-packages/vserver.py", line 396, in __prep
> TypeError: argument 1 must be string, not file

Whoops, my bad. Should be fixed now.

> - I have messed around a lot and no longer know quite where I 
> stand
> but this morning I was in a situation where authorized_keys was 
> actually getting created with some correct value, but that was not 
> visible from the slice; that is, /home/<slice>/.ssh/authorized_keys 
> was OK but after I entered the slice I could not see any .ssh - but 
> maybe here I'm doing something wrong. What's supposed to be the magic 
> that allows the slice to see a file in /home again?
>
> -- Thierry

Daniel


