I'm running the loadbalancer on a cluster with a master and 5 compute 
nodes (it was started with a master and 1 compute node).  It correctly 
detected that it should remove nodes and removed them from SGE's 
execution host list, but the nodes were still in the cluster 
(listclusters shows them).  I then killed the loadbalancer and tried 
removing them manually via "removenode".  This resulted in:
Remove 5 nodes from cluster5(y/n)? y
 >>> Running plugin elasticip.ElasticIPSetup
 >>> Running plugin schrowscoreconfigurator.SchrodingerConfiguratorPlugin
 >>> Running plugin starcluster.plugins.sge.SGEPlugin
 >>> Removing node006 from SGE
!!! ERROR - Error occured while running plugin 
'starcluster.plugins.sge.SGEPlugin':
!!! ERROR - remote command 'source /etc/profile && qconf -de node006'
!!! ERROR - failed with status 1:
!!! ERROR - denied: execution host "node006" does not exist
So I forcibly removed them.  When I do that, I get messages like this 
for each node:
 >>> Terminating node: node006 (i-d16e4815)
 >>> Running plugin elasticip.ElasticIPSetup
 >>> Running plugin schrowscoreconfigurator.SchrodingerConfiguratorPlugin
 >>> Running plugin starcluster.plugins.sge.SGEPlugin
 >>> Removing node005 from SGE
!!! ERROR - Error occured while running plugin 
'starcluster.plugins.sge.SGEPlugin':
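The mismatch could be confirmed by comparing the nodes that listclusters 
reports against SGE's execution host list (the output of "qconf -sel" on 
the master).  Below is a minimal sketch of that comparison in Python.  The 
helper name and the sample data are made up for illustration, and it 
assumes SGE lists hosts under the same short names that StarCluster uses:

```python
def hosts_missing_from_sge(cluster_nodes, qconf_sel_output):
    """Return the cluster nodes that SGE does not list as execution hosts.

    cluster_nodes:     node aliases as shown by listclusters (assumed input)
    qconf_sel_output:  raw text printed by `qconf -sel`, one host per line
    """
    # Build the set of hosts SGE knows about, ignoring blank lines.
    sge_hosts = {
        line.strip()
        for line in qconf_sel_output.splitlines()
        if line.strip()
    }
    # Keep the original node order so the result is easy to read.
    return [node for node in cluster_nodes if node not in sge_hosts]


# Hypothetical sample data, not taken from the real cluster.
cluster_nodes = ["node001", "node005", "node006"]
qconf_sel_output = "node001\n"

print(hosts_missing_from_sge(cluster_nodes, qconf_sel_output))
```

Any node in the result is still in the cluster but already gone from SGE, 
which is the state in which "qconf -de" fails with "does not exist".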
Has anyone experienced this?  If so, what is causing this?
Herc
Received on Thu Dec 10 2015 - 01:00:03 EST