[Planetlab-users] node availability benchmarks
knauer at jf.intel.com
Fri Jun 18 14:55:58 EDT 2004
In an E-mail message, Larry Peterson wrote:
>On Jun 18, 2004, at 1:36 PM, Neil Spring wrote:
>> I'd change:
>> "visible": I don't understand the value in reporting a metric based on
>> being able to ssh into a non-vserver slice. who has one of these but
>Visible is not intended to equal usable. It just says we've eliminated one
>source of problems: those introduced by bad network configurations.
I see that visibility might be useful to the Princeton crew, specifically
because it can find network problems, but also mainly because those guys
know which ones SHOULD be pingable and which shouldn't.
To a PlanetLab user, though, visibility isn't really useful at all. In
fact, it might be WORSE than useless in that it's tantalizing without
offering any assurance of satisfaction. :)
>> "usable": to me should be a number of machines I should have no
>> problems with. If you keep a list of "usable" machines and I find one
>> to be unusable, I should be able to file a trouble ticket with high
This is key, IMHO. I would break it into two parts, though; usable to
an existing application and usable to a new one.
The former, as Larry mentions elsewhere, is not useful to PL Central
because they can't know what every application needs (and an app's
inability to phone home is not necessarily PLC's fault). It is
probably best maintained by each researcher (e.g. as we do with
PEPR/Trumpet in idsl_tbh -- we periodically check application and node
health ourselves). If something breaks along the way, after making
sure it isn't the experiment itself, then PL Central's other tests
should come to the rescue.
The latter, though, is something we don't have, and something that I
as a user would find REALLY helpful. If I want to make a new slice,
or add a node to my current slice, I'd love to have a list of
candidate nodes that I can reasonably expect to instantiate my service
on. This is especially true compared to what we have now, which is
"oversubscribe and hope you get enough for what you really wanted",
which is both inelegant and not good for the environment overall.
This facet of "usable" would include (by definition) reachability by
SSH, ability to instantiate a new sliver, and perhaps even ability to
install a painfully-simple "phone home" application. Implicit in that
are some (many?) of Mic's tests, I suppose, but those are ancillary to
just being able to add the darn thing to my slice and run something on
it. With the added constraint of time (arbitrarily chosen amount),
this also would implicitly test things like correct operation of node
manager, correct operation of "dialback to PLC", availability of
resources (CPU load, disk, memory), and so forth.
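As a rough illustration of this facet of "usable", the checks could be run in order under a time budget, with a node dropped from the candidate list at the first failure. This is only a sketch under stated assumptions, not anything PlanetLab provides: `check_node`, `candidate_nodes`, `NodeReport`, and the probe names are all hypothetical, and the real probes (SSH login, sliver creation, phone-home install) are left as caller-supplied functions.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical probe type: takes a hostname, returns True if the step worked.
Probe = Callable[[str], bool]


@dataclass
class NodeReport:
    hostname: str
    passed: List[str] = field(default_factory=list)
    failed_at: Optional[str] = None  # name of the first step that failed
    elapsed: float = 0.0             # seconds spent so far

    @property
    def usable(self) -> bool:
        return self.failed_at is None


def check_node(
    hostname: str,
    probes: Sequence[Tuple[str, Probe]],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> NodeReport:
    """Run probes in order; stop at the first failure or budget overrun."""
    report = NodeReport(hostname)
    start = clock()
    for name, probe in probes:
        try:
            ok = probe(hostname)
        except Exception:
            ok = False  # a crashing probe counts as a failed step
        report.elapsed = clock() - start
        # Exceeding the time budget is treated the same as a failed step.
        if not ok or report.elapsed > budget_seconds:
            report.failed_at = name
            break
        report.passed.append(name)
    return report


def candidate_nodes(
    hostnames: Sequence[str],
    probes: Sequence[Tuple[str, Probe]],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Return the hosts that passed every probe within the budget."""
    return [
        h for h in hostnames
        if check_node(h, probes, budget_seconds, clock).usable
    ]
```

A caller would pass something like `[("ssh", try_ssh), ("sliver", try_create_sliver), ("phone_home", try_phone_home)]`, where each function is whatever the site actually uses for that step. Keeping the probes ordered and stopping at the first failure means the report also says which step broke, which is the kind of detail a trouble ticket would need.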
In summary, while there are any number of great, useful diagnostic
tests for the folks who have to keep the system running, the main test
we're lacking is whether a user should even bother with a particular
node or not. And THAT metric, as disappointing as it may be to start,
is probably worthy of focus.
Rob Knauerhase [knauer at jf.intel.com] Intel Labs / PlanetLab SRP
"I have been a happy man ever since January 1, 1990, when I no longer had an
email address. I'd used email since about 1975, and it seems to me that
15 years of email is plenty for one lifetime." -- Donald Knuth