Ensuring That SGE Does Not Oversubscribe Processors or Memory

SGE, by default, will happily oversubscribe processors when multiple queues target the same nodes (nice one, SGE). Furthermore, even when jobs specify a memory limit, each job can stay under its own limit while the combined memory usage of the jobs assigned to a machine exceeds that machine’s memory. This can exhaust the memory of the entire node and send it off into limbo (cunning, SGE, very cunning). The solution to both of these issues is to define processors and memory as consumable resources at the host level.
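
To see why host-level consumables help, here is a minimal sketch of the bookkeeping in Python. It is a toy model, not SGE's scheduler code: the `Job` and `Host` classes, the `try_schedule` and `per_job_check_only` functions, and the 8-slot, 16 GB host are all made up for illustration. The idea is that a consumable is a per-host budget that each running job draws down, so a job is admitted only if what is left covers its request.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    slots: int      # processors requested (hypothetical field)
    mem_gb: float   # memory requested, in GB (hypothetical field)


@dataclass
class Host:
    """Toy model of a host whose slots and memory are consumable budgets."""
    slots: int
    mem_gb: float
    running: list = field(default_factory=list)

    def free_slots(self):
        return self.slots - sum(j.slots for j in self.running)

    def free_mem_gb(self):
        return self.mem_gb - sum(j.mem_gb for j in self.running)

    def try_schedule(self, job):
        """Admit the job only if the host's remaining budget covers it."""
        if job.slots <= self.free_slots() and job.mem_gb <= self.free_mem_gb():
            self.running.append(job)
            return True
        return False


def per_job_check_only(host, jobs):
    """No consumables: each job is compared to the host in isolation."""
    return [j for j in jobs if j.slots <= host.slots and j.mem_gb <= host.mem_gb]


jobs = [Job("a", 4, 6.0), Job("b", 4, 6.0), Job("c", 4, 6.0)]

# Without a shared budget, all three jobs pass their individual checks.
naive = per_job_check_only(Host(slots=8, mem_gb=16.0), jobs)
print(sum(j.mem_gb for j in naive))  # 18.0 GB promised on a 16 GB host

# With consumable accounting, job "c" is held back until capacity frees up.
host = Host(slots=8, mem_gb=16.0)
admitted = [j.name for j in jobs if host.try_schedule(j)]
print(admitted)
```

In this sketch the third job is refused on both counts: the first two jobs have already claimed all 8 slots and 12 of the 16 GB.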
