[PL #2899] CKRM memory controller hangs

Marc E. Fiuczynski via RT devel at planet-lab.org
Wed Nov 3 14:35:19 EST 2004


Email Recipients (see http://www.planet-lab.org/Support)
       Requestor: acb at cs.princeton.edu
       Ticket Ccs: acb at cs.princeton.edu, frankeh at watson.ibm.com, mef at cs.princeton.edu, sekharan at us.ibm.com

==================================================

Chandra,

We are testing precisely the case where we run low on memory.  Our workload consists of a bunch of vservers (with a one-to-one mapping to classes), and memory is always fully utilized across these vservers.  That is why the shrink_cache and refill_inactive_zone code gets invoked.

Marc


> -----Original Message-----
> From: devel-community-bounces at planet-lab.org
> [mailto:devel-community-bounces at planet-lab.org]On Behalf Of
> sekharan at us.ibm.com via RT
> Sent: Wednesday, November 03, 2004 2:27 PM
> To: acb at CS.Princeton.EDU
> Subject: Re: [PL #2899] CKRM memory controller hangs
> 
> 
> 
> On Wed, Nov 03, 2004 at 02:17:17PM -0500, Marc E. Fiuczynski via RT wrote:
> > 
> > Hi Chandra & Hubertus,
> > 
> > The two functions shrink_class and shrink_classes are ifdef'd
> > out.  But the underlying functions -- refill_inactive_zone and
> > shrink_caches -- are still called by other code (shrink_zone) in
> > vmscan.c.  The loop in shrink_zone is essentially identical to
> > the one in your shrink_class function, and the irq's are off too
> > long within the internal loop.
> 
> But this code should not get executed unless you are running low on
> memory. Is that what is happening?
> 
> > 
> > Maybe it is sufficient to move the "redo:" label inside these
> > functions to just before the spin_lock_irq(), and then explicitly
> > unlock right before doing the goto back to the redo label, thereby
> > letting interrupts through. Thoughts?
> 
> This sounds like a valid solution. Can you try it?
> 
> chandra
> 
> _______________________________________________
> Devel-community mailing list
> Devel-community at lists.planet-lab.org
> http://lists.planet-lab.org/mailman/listinfo/devel-community
> 
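
The "redo:" change proposed in the quoted message can be sketched in userspace as follows. This is only an illustration of the locking pattern, not the actual vmscan.c code: fake_spin_lock_irq(), fake_spin_unlock_irq(), scan_with_redo() and the counters are hypothetical stand-ins, and the batch size is an assumed parameter. The point is that the label sits before the lock and the lock is dropped before each goto, so no single hold covers more than one batch of work.

```c
#include <assert.h>

/* Hypothetical userspace stand-ins for spin_lock_irq()/spin_unlock_irq().
 * They do no real locking; they only record how much work is done while
 * "interrupts" are off. */
static int irqs_off = 0;
static int held_steps = 0;      /* work units done in the current hold */
static int max_held_steps = 0;  /* longest single hold seen so far */

static void fake_spin_lock_irq(void)
{
    irqs_off = 1;
    held_steps = 0;
}

static void fake_spin_unlock_irq(void)
{
    if (held_steps > max_held_steps)
        max_held_steps = held_steps;
    irqs_off = 0;
}

/* Process nr_pages units of work, at most `batch` units per lock hold. */
static int scan_with_redo(int nr_pages, int batch)
{
    int scanned = 0;

    assert(batch > 0);          /* a zero batch would never make progress */

redo:
    fake_spin_lock_irq();       /* the label sits before the lock ... */
    for (int i = 0; i < batch && scanned < nr_pages; i++) {
        scanned++;              /* stand-in for the per-page work */
        held_steps++;
    }
    if (scanned < nr_pages) {
        fake_spin_unlock_irq(); /* ... so unlock before jumping back */
        goto redo;
    }
    fake_spin_unlock_irq();
    return scanned;
}
```

Calling scan_with_redo(100, 32) processes all 100 units in four holds, and no single hold covers more than 32 units. In the real kernel the gap between unlock and relock is where pending interrupts would be serviced.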




