[Planetlab-users] concerning the behavior of
planetlab2.cs.virginia.edu in the last two weeks
Livia Maria Rodrigues Sampaio
livia at dsc.ufcg.edu.br
Wed Aug 16 10:17:20 EDT 2006
Hello all,
I have been running some experiments with a distributed application (implemented by myself) on PlanetLab, using 5 nodes including planetlab2.cs.virginia.edu. In the last two weeks the performance of my application degraded. Since that performance depends on the workload of planetlab2.cs.virginia.edu, I suppose this node slowed down for some reason. I would like more information about this workload variability because it is very important to my performance studies. In particular, if you have also been using this node in your experiments, you can contribute to my studies.
The objective of my experiments is to analyze the performance of a distributed application, using the execution time as the performance metric. In one of the experiments I used messages of 32KB, and the execution time of the application was around 1 second. However, from July 29th to August 13th the execution times increased to 180 seconds. Presently, August 16th, the execution times are again around 1 second. The variability in the performance of the application is associated with the workload of the node planetlab2.cs.virginia.edu. It seems that the node was "fast" and became "slow" for some period. This is not surprising on PlanetLab, as pointed out in the paper "Fixing the Embarrassing Slowness of OpenDHT on PlanetLab". However, I would like more information about the slowness of this specific node.
In order to shed some light on this topic I started investigating two possibilities: variability in the network load (to/from the node) or in the processing load (on the node). For the first case, I collected some traceroute data in the periods when the node was "fast" and when it was "slow", trying to find a communication bottleneck. I configured traceroute to send messages of 1KB. The traceroute data describe the transmission of a message from each of the 5 nodes considered in my experiments to the same destination. I did not notice anything strange in the logs. Consequently, I suppose the variability in the performance of my application was due to intensive use of the resources (e.g. CPU, memory) of planetlab2.cs.virginia.edu by a number of applications running on this node, during the observed period, concurrently with my own application. In particular, memory-bound applications can cause more swap-out of the data being used. Moreover, applications with high priority can use more CPU cycles than the others.
Another possibility is intensive use of bandwidth, which causes the available bandwidth to be shared fairly among slices. In this case, a slice on the node may not be able to reach the bandwidth limit specified for it. I suppose this becomes more critical as the number of applications running in different slices on the same node increases.
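For future runs, the processing-load hypothesis could be checked by recording node load next to each measured execution time. The following is only a minimal sketch, assuming a Linux node that exposes /proc/loadavg and /proc/meminfo; the function names and the CSV layout are hypothetical and are not part of any PlanetLab tool.

```python
import time


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg content."""
    fields = text.split()
    return tuple(float(x) for x in fields[:3])


def parse_meminfo(text):
    """Return a dict of /proc/meminfo entries (values in kB where applicable)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def sample_line(loadavg_text, meminfo_text, now=None):
    """Format one CSV record: timestamp, 1-min load, MemFree kB, SwapFree kB."""
    now = time.time() if now is None else now
    load1 = parse_loadavg(loadavg_text)[0]
    mem = parse_meminfo(meminfo_text)
    # -1 marks a field that was missing from the meminfo text.
    return "%d,%.2f,%d,%d" % (
        now, load1, mem.get("MemFree", -1), mem.get("SwapFree", -1))


def read_and_sample():
    """Read the two /proc files on the local node and return one CSV record."""
    with open("/proc/loadavg") as f:
        load_text = f.read()
    with open("/proc/meminfo") as f:
        mem_text = f.read()
    return sample_line(load_text, mem_text)
```

Calling read_and_sample() once before and once after each run of the application, and appending the result to a log, would give a record of load and free memory that can later be compared against the execution times.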
I would appreciate it a lot if you could help me on this topic in the following ways:
1. Has anybody run applications on planetlab2.cs.virginia.edu during the period from July 29th to August 13th? Is your application memory bound, or does it require high bandwidth?
2. Is it possible to get information about the CPU usage or memory consumption of a PlanetLab node during some period in the past, in order to identify the behavior of the applications competing with my own for the resources of planetlab2.cs.virginia.edu?
I am looking forward to hearing from you...
Lívia