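	Symptom:
	  Shortly after the host is added, it enters the non-responsive state, and the events bar at the bottom of the page reports a network-related error.
	Analysis:
	 The VDSM error log is as follows: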
	Reactor thread::INFO::2018-02-09 15:58:40,982::protocoldetector::72::ProtocolDetector.AcceptorImpl::(handle_accept) Accepting connection from 192.168.1.211:57037 
	Reactor thread::DEBUG::2018-02-09 15:58:41,039::protocoldetector::82::ProtocolDetector.Detector::(__init__) Using required_size=11 
	Reactor thread::INFO::2018-02-09 15:58:41,041::protocoldetector::118::ProtocolDetector.Detector::(handle_read) Detected protocol stomp from 192.168.1.211:57037 
	Reactor thread::INFO::2018-02-09 15:58:41,041::stompreactor::101::Broker.StompAdapter::(_cmd_connect) Processing CONNECT request 
	Reactor thread::DEBUG::2018-02-09 15:58:41,042::stompreactor::470::protocoldetector.StompDetector::(handle_socket) Stomp detected from ('192.168.1.211', 57037) 
	JsonRpc (StompReactor)::INFO::2018-02-09 15:58:41,042::stompreactor::128::Broker.StompAdapter::(_cmd_subscribe) Subscribe command received 
	JsonRpc (StompReactor)::INFO::2018-02-09 15:58:41,043::stompreactor::128::Broker.StompAdapter::(_cmd_subscribe) Subscribe command received 
	jsonrpc.Executor/3::DEBUG::2018-02-09 15:58:42,045::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {} 
	jsonrpc.Executor/3::DEBUG::2018-02-09 15:58:42,046::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'Host.ping' in bridge with True 
	jsonrpc.Executor/3::DEBUG::2018-02-09 15:58:42,047::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.setupNetworks' in bridge with {u'bondings': {}, u'networks': {u'hcimgmt': {u'nic': u'eno1', u'mtu': u'1500', u'bootproto': u'dhcp', u'STP': u'no', u'bridged': u'true', u'defaultRoute': True}}, u'options': {u'connectivityCheck': u'true', u'connectivityTimeout': 120}} 
	Reactor thread::INFO::2018-02-09 15:58:46,299::protocoldetector::72::ProtocolDetector.AcceptorImpl::(handle_accept) Accepting connection from 127.0.0.1:47125 
	Reactor thread::DEBUG::2018-02-09 15:58:46,311::protocoldetector::82::ProtocolDetector.Detector::(__init__) Using required_size=11 
	Reactor thread::INFO::2018-02-09 15:58:46,311::protocoldetector::118::ProtocolDetector.Detector::(handle_read) Detected protocol xml from 127.0.0.1:47125 
	Reactor thread::DEBUG::2018-02-09 15:58:46,312::bindingxmlrpc::1302::XmlDetector::(handle_socket) xml over http detected from ('127.0.0.1', 47125) 
	BindingXMLRPC::INFO::2018-02-09 15:58:46,312::xmlrpc::73::vds.XMLRPCServer::(handle_request) Starting request handler for 127.0.0.1:47125 
	Thread-13::INFO::2018-02-09 15:58:46,312::xmlrpc::84::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:47125 started 
	Thread-13::DEBUG::2018-02-09 15:58:46,314::utils::671::root::(execCmd) /bin/taskset --cpu-list 0-5 /usr/libexec/vdsm/hooks/after_get_all_vm_stats/10_fakevmstats (cwd None) 
	Thread-13::DEBUG::2018-02-09 15:58:46,440::utils::689::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0 
	Thread-13::INFO::2018-02-09 15:58:46,440::hooks::98::root::(_runHooksDir) 
	Thread-13::INFO::2018-02-09 15:58:46,443::xmlrpc::92::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:47125 stopped 
	Reactor thread::INFO::2018-02-09 15:58:46,550::protocoldetector::72::ProtocolDetector.AcceptorImpl::(handle_accept) Accepting connection from 127.0.0.1:47126 
	Reactor thread::DEBUG::2018-02-09 15:58:46,561::protocoldetector::82::ProtocolDetector.Detector::(__init__) Using required_size=11 
	Reactor thread::INFO::2018-02-09 15:58:46,562::protocoldetector::118::ProtocolDetector.Detector::(handle_read) Detected protocol xml from 127.0.0.1:47126 
	Reactor thread::DEBUG::2018-02-09 15:58:46,562::bindingxmlrpc::1302::XmlDetector::(handle_socket) xml over http detected from ('127.0.0.1', 47126) 
	BindingXMLRPC::INFO::2018-02-09 15:58:46,563::xmlrpc::73::vds.XMLRPCServer::(handle_request) Starting request handler for 127.0.0.1:47126 
	Thread-14::INFO::2018-02-09 15:58:46,563::xmlrpc::84::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:47126 started 
	Thread-14::INFO::2018-02-09 15:58:46,567::xmlrpc::92::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:47126 stopped 
	Reactor thread::INFO::2018-02-09 15:59:01,459::protocoldetector::72::ProtocolDetector.AcceptorImpl::(handle_accept) Accepting connection from 127.0.0.1:47145 
	Reactor thread::DEBUG::2018-02-09 15:59:01,471::protocoldetector::82::ProtocolDetector.Detector::(__init__) Using required_size=11 
	Reactor thread::INFO::2018-02-09 15:59:01,471::protocoldetector::118::ProtocolDetector.Detector::(handle_read) Detected protocol xml from 127.0.0.1:47145 
	Reactor thread::DEBUG::2018-02-09 15:59:01,472::bindingxmlrpc::1302::XmlDetector::(handle_socket) xml over http detected from ('127.0.0.1', 47145) 
	BindingXMLRPC::INFO::2018-02-09 15:59:01,472::xmlrpc::73::vds.XMLRPCServer::(handle_request) Starting request handler for 127.0.0.1:47145 
	Thread-15::INFO::2018-02-09 15:59:01,472::xmlrpc::84::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:47145 started 
	Thread-15::DEBUG::2018-02-09 15:59:01,474::utils::671::root::(execCmd) /bin/taskset --cpu-list 0-5 /usr/libexec/vdsm/hooks/after_get_all_vm_stats/10_fakevmstats (cwd None) 
	Thread-15::DEBUG::2018-02-09 15:59:01,598::utils::689::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0 
	Thread-15::INFO::2018-02-09 15:59:01,599::hooks::98::root::(_runHooksDir) 
	Thread-22::INFO::2018-02-09 16:00:47,608::xmlrpc::84::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:47267 started 
	Thread-22::DEBUG::2018-02-09 16:00:47,610::utils::671::root::(execCmd) /bin/taskset --cpu-list 0-5 /usr/libexec/vdsm/hooks/after_get_all_vm_stats/10_fakevmstats (cwd None) 
	Thread-22::DEBUG::2018-02-09 16:00:47,735::utils::689::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0 
	Thread-22::INFO::2018-02-09 16:00:47,735::hooks::98::root::(_runHooksDir) 
	Thread-22::INFO::2018-02-09 16:00:47,737::xmlrpc::92::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:47267 stopped 
	jsonrpc.Executor/3::ERROR::2018-02-09 16:00:49,825::API::1677::vds::(_rollback) connectivity check failed 
	Traceback (most recent call last): 
	  File "/usr/share/vdsm/API.py", line 1675, in _rollback 
	    yield rollbackCtx 
	  File "/usr/share/vdsm/API.py", line 1527, in setupNetworks 
	    supervdsm.getProxy().setupNetworks(networks, bondings, options) 
	  File "/usr/share/vdsm/supervdsm.py", line 50, in __call__ 
	    return callMethod() 
	  File "/usr/share/vdsm/supervdsm.py", line 48, in <lambda> 
	    **kwargs) 
	  File "<string>", line 2, in setupNetworks 
	  File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod 
	    raise convert_to_error(kind, result) 
	ConfigNetworkError: (10, 'connectivity check failed') 
	Reactor thread::INFO::2018-02-09 16:01:02,749::protocoldetector::72::ProtocolDetector.AcceptorImpl::(handle_accept) Accepting connection from 127.0.0.1:47286 
	Reactor thread::DEBUG::2018-02-09 16:01:02,762::protocoldetector::82::ProtocolDetector.Detector::(__init__) Using required_size=11 
	Reactor thread::INFO::2018-02-09 16:01:02,762::protocoldetector::118::ProtocolDetector.Detector::(handle_read) Detected protocol xml from 127.0.0.1:47286 
	Reactor thread::DEBUG::2018-02-09 16:01:02,763::bindingxmlrpc::1302::XmlDetector::(handle_socket) xml over http detected from ('127.0.0.1', 47286) 
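	Root cause:
	  1. An unstable network on the machine can contribute to the failure;
	  2. VDSM first receives a Host.ping command and then a Host.setupNetworks command (15:58:42 in the log, with connectivityTimeout set to 120);
	  3. If no further Host.ping command arrives within 120 seconds after Host.setupNetworks, the connectivity check fails and the network change is rolled back (the ERROR at 16:00:49 in the log).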
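The 120-second window can be checked against the log excerpt above. The following is a minimal sketch that only does timestamp arithmetic on two lines copied from that log; the helper names `parse_vdsm_timestamp` and `elapsed_seconds` are made up for this illustration and are not part of VDSM.

```python
from datetime import datetime

# Timestamps copied from the log excerpt above.
SETUP_NETWORKS_CALLED = "2018-02-09 15:58:42,047"      # Calling 'Host.setupNetworks'
CONNECTIVITY_CHECK_FAILED = "2018-02-09 16:00:49,825"  # (_rollback) connectivity check failed
CONNECTIVITY_TIMEOUT = 120  # seconds, the 'connectivityTimeout' option in the request


def parse_vdsm_timestamp(text):
    """Parse the 'YYYY-MM-DD HH:MM:SS,mmm' timestamp used in the log lines."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S,%f")


def elapsed_seconds(start_text, end_text):
    """Return the number of seconds between two log timestamps."""
    delta = parse_vdsm_timestamp(end_text) - parse_vdsm_timestamp(start_text)
    return delta.total_seconds()


if __name__ == "__main__":
    elapsed = elapsed_seconds(SETUP_NETWORKS_CALLED, CONNECTIVITY_CHECK_FAILED)
    print("elapsed: %.3f s, timeout: %d s" % (elapsed, CONNECTIVITY_TIMEOUT))
```

The gap between the two log lines is about 127.8 seconds, slightly more than the 120-second timeout, which is consistent with the check waiting out the full timeout before failing. The extra few seconds are presumably spent applying and rolling back the configuration, but the log excerpt does not show this directly.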
	Code analysis:
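	import errno
	import os
	import time

	# `constants` below is VDSM's own constants module.
	def clientSeen(timeout):
	    start = time.time()
	    while timeout >= 0:
	        try:
	            # The loop starts with timeout=120 and checks once per second
	            # whether P_VDSM_CLIENT_LOG (/var/run/vdsm/client.log) has been
	            # modified since `start`. Handling Host.ping updates this file.
	            # If no Host.ping arrives, the loop runs out after about 120
	            # seconds and the function returns False.
	            if os.stat(constants.P_VDSM_CLIENT_LOG).st_mtime > start: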
	                return True 
	        except OSError as e: 
	            if e.errno == errno.ENOENT: 
	                pass  # P_VDSM_CLIENT_LOG is not yet there 
	            else: 
	                raise 
	        time.sleep(1) 
	        timeout -= 1 
	    return False 
	def _check_connectivity(connectivity_check_networks, networks, bondings, 
	                        options, logger): 
	    if utils.tobool(options.get('connectivityCheck', True)): 
	        logger.debug('Checking connectivity...') 
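	        # If clientSeen() returned False (no Host.ping seen within the
	        # timeout), the networks are rolled back and an exception is
	        # raised, so creating the bridge fails.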
	        if not clientSeen(_get_connectivity_timeout(options)): 
	            logger.info('Connectivity check failed, rolling back') 
	            for network in connectivity_check_networks: 
	                # If the new added network was created on top of 
	                # existing bond, we need to keep the bond on rollback 
	                # flow, else we will break the new created bond. 
	                _delNetwork(network, force=True, 
	                            implicitBonding=networks[network]. 
	                            get('bonding') in bondings) 
	            raise ConfigNetworkError(ne.ERR_LOST_CONNECTION, 
	                                     'connectivity check failed') 
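The file-timestamp mechanism can be tried outside VDSM. The sketch below is not VDSM code: `client_seen`, `touch`, `demo` and the `poll_interval` parameter are hypothetical names, the marker file lives in a temporary directory instead of /var/run/vdsm/client.log, and the loop uses a deadline rather than a decrementing counter. It only illustrates that the check succeeds when something updates the file after the check starts, and fails otherwise.

```python
import errno
import os
import tempfile
import threading
import time


def client_seen(path, timeout, poll_interval=1.0):
    """Return True if `path` is modified after this call starts, else False."""
    start = time.time()
    deadline = start + timeout
    while True:
        try:
            if os.stat(path).st_mtime > start:
                return True
        except OSError as e:
            if e.errno != errno.ENOENT:
                raise  # a missing file just means "not touched yet"
        if time.time() >= deadline:
            return False
        time.sleep(poll_interval)


def touch(path):
    """Stand-in for what a ping does here: create the file and bump its mtime."""
    now = time.time()
    with open(path, "a"):
        pass
    os.utime(path, (now, now))


def demo():
    """Run the check once without a ping and once with a delayed ping."""
    path = os.path.join(tempfile.mkdtemp(), "client.log")

    # Case 1: nobody touches the file, so the check times out.
    no_ping = client_seen(path, timeout=0.3, poll_interval=0.05)

    # Case 2: a "ping" touches the file 0.1 s after the check starts.
    timer = threading.Timer(0.1, touch, args=(path,))
    timer.start()
    with_ping = client_seen(path, timeout=2.0, poll_interval=0.05)
    timer.join()
    return no_ping, with_ping


if __name__ == "__main__":
    print(demo())
```

In the failing scenario described here, the host is in the position of case 1: nothing updates the file during the timeout, so the check reports failure and the rollback path runs.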
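	Solution:
	  1. Remove the host in the web UI and add it again; the second attempt succeeds;
	  2. Analyze the ovirt-engine code to find out why the Host.ping command was not sent to the host.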
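While investigating point 2, it may help to watch on the host whether the timestamp file is being updated at all. The helper below is a small diagnostic sketch, not part of VDSM or ovirt-engine; the function name `seconds_since_last_touch` is hypothetical, and the default path is the one quoted in the code analysis above.

```python
import errno
import os
import time

CLIENT_LOG = "/var/run/vdsm/client.log"  # path quoted in the code analysis above


def seconds_since_last_touch(path=CLIENT_LOG, now=None):
    """Return the age of `path`'s mtime in seconds, or None if it does not exist."""
    if now is None:
        now = time.time()
    try:
        return now - os.stat(path).st_mtime
    except OSError as e:
        if e.errno == errno.ENOENT:
            return None
        raise


if __name__ == "__main__":
    age = seconds_since_last_touch()
    if age is None:
        print("%s does not exist yet" % CLIENT_LOG)
    else:
        print("%s last modified %.1f s ago" % (CLIENT_LOG, age))
```

If the reported age keeps growing while the host is being added, that would suggest the ping is not reaching VDSM during the connectivity check window.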
(Editor: IT)
    
