Adding a host to oVirt fails

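Symptom:
  Shortly after the host is added, it goes into the non-responsive state, and the events pane at the bottom reports network-related errors.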
 
Analysis:
 The error log from vdsm is as follows:
 
Reactor thread::INFO::2018-02-09 15:58:40,982::protocoldetector::72::ProtocolDetector.AcceptorImpl::(handle_accept) Accepting connection from 192.168.1.211:57037
Reactor thread::DEBUG::2018-02-09 15:58:41,039::protocoldetector::82::ProtocolDetector.Detector::(__init__) Using required_size=11
Reactor thread::INFO::2018-02-09 15:58:41,041::protocoldetector::118::ProtocolDetector.Detector::(handle_read) Detected protocol stomp from 192.168.1.211:57037
Reactor thread::INFO::2018-02-09 15:58:41,041::stompreactor::101::Broker.StompAdapter::(_cmd_connect) Processing CONNECT request
Reactor thread::DEBUG::2018-02-09 15:58:41,042::stompreactor::470::protocoldetector.StompDetector::(handle_socket) Stomp detected from ('192.168.1.211', 57037)
JsonRpc (StompReactor)::INFO::2018-02-09 15:58:41,042::stompreactor::128::Broker.StompAdapter::(_cmd_subscribe) Subscribe command received
JsonRpc (StompReactor)::INFO::2018-02-09 15:58:41,043::stompreactor::128::Broker.StompAdapter::(_cmd_subscribe) Subscribe command received
jsonrpc.Executor/3::DEBUG::2018-02-09 15:58:42,045::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {}
jsonrpc.Executor/3::DEBUG::2018-02-09 15:58:42,046::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'Host.ping' in bridge with True
jsonrpc.Executor/3::DEBUG::2018-02-09 15:58:42,047::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.setupNetworks' in bridge with {u'bondings': {}, u'networks': {u'hcimgmt': {u'nic': u'eno1', u'mtu': u'1500', u'bootproto': u'dhcp', u'STP': u'no', u'bridged': u'true', u'defaultRoute': True}}, u'options': {u'connectivityCheck': u'true', u'connectivityTimeout': 120}}
Reactor thread::INFO::2018-02-09 15:58:46,299::protocoldetector::72::ProtocolDetector.AcceptorImpl::(handle_accept) Accepting connection from 127.0.0.1:47125
Reactor thread::DEBUG::2018-02-09 15:58:46,311::protocoldetector::82::ProtocolDetector.Detector::(__init__) Using required_size=11
Reactor thread::INFO::2018-02-09 15:58:46,311::protocoldetector::118::ProtocolDetector.Detector::(handle_read) Detected protocol xml from 127.0.0.1:47125
Reactor thread::DEBUG::2018-02-09 15:58:46,312::bindingxmlrpc::1302::XmlDetector::(handle_socket) xml over http detected from ('127.0.0.1', 47125)
BindingXMLRPC::INFO::2018-02-09 15:58:46,312::xmlrpc::73::vds.XMLRPCServer::(handle_request) Starting request handler for 127.0.0.1:47125
Thread-13::INFO::2018-02-09 15:58:46,312::xmlrpc::84::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:47125 started
Thread-13::DEBUG::2018-02-09 15:58:46,314::utils::671::root::(execCmd) /bin/taskset --cpu-list 0-5 /usr/libexec/vdsm/hooks/after_get_all_vm_stats/10_fakevmstats (cwd None)
Thread-13::DEBUG::2018-02-09 15:58:46,440::utils::689::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0
Thread-13::INFO::2018-02-09 15:58:46,440::hooks::98::root::(_runHooksDir)
Thread-13::INFO::2018-02-09 15:58:46,443::xmlrpc::92::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:47125 stopped
Reactor thread::INFO::2018-02-09 15:58:46,550::protocoldetector::72::ProtocolDetector.AcceptorImpl::(handle_accept) Accepting connection from 127.0.0.1:47126
Reactor thread::DEBUG::2018-02-09 15:58:46,561::protocoldetector::82::ProtocolDetector.Detector::(__init__) Using required_size=11
Reactor thread::INFO::2018-02-09 15:58:46,562::protocoldetector::118::ProtocolDetector.Detector::(handle_read) Detected protocol xml from 127.0.0.1:47126
Reactor thread::DEBUG::2018-02-09 15:58:46,562::bindingxmlrpc::1302::XmlDetector::(handle_socket) xml over http detected from ('127.0.0.1', 47126)
BindingXMLRPC::INFO::2018-02-09 15:58:46,563::xmlrpc::73::vds.XMLRPCServer::(handle_request) Starting request handler for 127.0.0.1:47126
Thread-14::INFO::2018-02-09 15:58:46,563::xmlrpc::84::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:47126 started
Thread-14::INFO::2018-02-09 15:58:46,567::xmlrpc::92::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:47126 stopped
Reactor thread::INFO::2018-02-09 15:59:01,459::protocoldetector::72::ProtocolDetector.AcceptorImpl::(handle_accept) Accepting connection from 127.0.0.1:47145
Reactor thread::DEBUG::2018-02-09 15:59:01,471::protocoldetector::82::ProtocolDetector.Detector::(__init__) Using required_size=11
Reactor thread::INFO::2018-02-09 15:59:01,471::protocoldetector::118::ProtocolDetector.Detector::(handle_read) Detected protocol xml from 127.0.0.1:47145
Reactor thread::DEBUG::2018-02-09 15:59:01,472::bindingxmlrpc::1302::XmlDetector::(handle_socket) xml over http detected from ('127.0.0.1', 47145)
BindingXMLRPC::INFO::2018-02-09 15:59:01,472::xmlrpc::73::vds.XMLRPCServer::(handle_request) Starting request handler for 127.0.0.1:47145
Thread-15::INFO::2018-02-09 15:59:01,472::xmlrpc::84::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:47145 started
Thread-15::DEBUG::2018-02-09 15:59:01,474::utils::671::root::(execCmd) /bin/taskset --cpu-list 0-5 /usr/libexec/vdsm/hooks/after_get_all_vm_stats/10_fakevmstats (cwd None)
Thread-15::DEBUG::2018-02-09 15:59:01,598::utils::689::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0
Thread-15::INFO::2018-02-09 15:59:01,599::hooks::98::root::(_runHooksDir)
 
Thread-22::INFO::2018-02-09 16:00:47,608::xmlrpc::84::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:47267 started
Thread-22::DEBUG::2018-02-09 16:00:47,610::utils::671::root::(execCmd) /bin/taskset --cpu-list 0-5 /usr/libexec/vdsm/hooks/after_get_all_vm_stats/10_fakevmstats (cwd None)
Thread-22::DEBUG::2018-02-09 16:00:47,735::utils::689::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0
Thread-22::INFO::2018-02-09 16:00:47,735::hooks::98::root::(_runHooksDir)
Thread-22::INFO::2018-02-09 16:00:47,737::xmlrpc::92::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:47267 stopped
jsonrpc.Executor/3::ERROR::2018-02-09 16:00:49,825::API::1677::vds::(_rollback) connectivity check failed
Traceback (most recent call last):
  File "/usr/share/vdsm/API.py", line 1675, in _rollback
    yield rollbackCtx
  File "/usr/share/vdsm/API.py", line 1527, in setupNetworks
    supervdsm.getProxy().setupNetworks(networks, bondings, options)
  File "/usr/share/vdsm/supervdsm.py", line 50, in __call__
    return callMethod()
  File "/usr/share/vdsm/supervdsm.py", line 48, in <lambda>
    **kwargs)
  File "<string>", line 2, in setupNetworks
  File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod
    raise convert_to_error(kind, result)
ConfigNetworkError: (10, 'connectivity check failed')
Reactor thread::INFO::2018-02-09 16:01:02,749::protocoldetector::72::ProtocolDetector.AcceptorImpl::(handle_accept) Accepting connection from 127.0.0.1:47286
Reactor thread::DEBUG::2018-02-09 16:01:02,762::protocoldetector::82::ProtocolDetector.Detector::(__init__) Using required_size=11
Reactor thread::INFO::2018-02-09 16:01:02,762::protocoldetector::118::ProtocolDetector.Detector::(handle_read) Detected protocol xml from 127.0.0.1:47286
Reactor thread::DEBUG::2018-02-09 16:01:02,763::bindingxmlrpc::1302::XmlDetector::(handle_socket) xml over http detected from ('127.0.0.1', 47286)
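Root cause:
  1. An unstable network on the machine can trigger this;
 
  2. vdsm first receives a Host.ping command, and then a Host.setupNetworks command;
 
  3. If no further Host.ping command arrives within 120 seconds after Host.setupNetworks (the connectivityTimeout value in the request), the connectivity check fails and this problem occurs.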
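
The timing in point 3 can be checked against the log itself. The following is a minimal sketch, not part of vdsm: it takes three lines copied from the log excerpt above, counts the Host.ping calls seen after Host.setupNetworks, and measures how long it took until the failure was logged. The helper names parse_time and summarize are made up for this illustration. Note that the excerpt above is partial, so a full vdsm log should be fed in for a real diagnosis.

```python
import re
from datetime import datetime

# Three lines copied verbatim from the log excerpt above.
LOG_LINES = [
    "jsonrpc.Executor/3::DEBUG::2018-02-09 15:58:42,045::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {}",
    "jsonrpc.Executor/3::DEBUG::2018-02-09 15:58:42,047::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.setupNetworks' in bridge with {u'bondings': {}, u'networks': {u'hcimgmt': {u'nic': u'eno1', u'mtu': u'1500', u'bootproto': u'dhcp', u'STP': u'no', u'bridged': u'true', u'defaultRoute': True}}, u'options': {u'connectivityCheck': u'true', u'connectivityTimeout': 120}}",
    "jsonrpc.Executor/3::ERROR::2018-02-09 16:00:49,825::API::1677::vds::(_rollback) connectivity check failed",
]

TIMESTAMP = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})")


def parse_time(line):
    """Extract the timestamp of a log line, or None if it has none."""
    match = TIMESTAMP.search(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")


def summarize(lines):
    """Return (Host.ping calls after setupNetworks, seconds until the failure)."""
    setup_time = None
    failure_time = None
    pings_after_setup = 0
    for line in lines:
        stamp = parse_time(line)
        if stamp is None:
            continue  # traceback lines carry no timestamp
        if "Calling 'Host.setupNetworks'" in line:
            setup_time = stamp
            pings_after_setup = 0
        elif "Calling 'Host.ping'" in line and setup_time is not None:
            pings_after_setup += 1
        elif "connectivity check failed" in line and setup_time is not None:
            failure_time = stamp
            break
    elapsed = None
    if setup_time is not None and failure_time is not None:
        elapsed = (failure_time - setup_time).total_seconds()
    return pings_after_setup, elapsed


pings, elapsed = summarize(LOG_LINES)
print("Host.ping calls after setupNetworks:", pings)
print("seconds from setupNetworks to failure:", elapsed)
```

For these lines the gap is a little under 128 seconds, which is consistent with a 120-second wait plus the time spent applying and rolling back the network configuration.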
 
Code analysis:
import errno
import os
import time

# `constants` comes from vdsm itself; this function is an excerpt from vdsm.

def clientSeen(timeout):
    start = time.time()
    while timeout >= 0:
        try:
            # timeout is 120 here. Each pass checks whether P_VDSM_CLIENT_LOG
            # (/var/run/vdsm/client.log) has been modified since the loop
            # started; handling a Host.ping call updates this file.
            if os.stat(constants.P_VDSM_CLIENT_LOG).st_mtime > start:
                return True
        except OSError as e:
            if e.errno == errno.ENOENT:
                pass  # P_VDSM_CLIENT_LOG is not yet there
            else:
                raise
        time.sleep(1)
        timeout -= 1
    # No Host.ping arrived during the roughly 120-second wait.
    return False
 
def _check_connectivity(connectivity_check_networks, networks, bondings,
                        options, logger):
    if utils.tobool(options.get('connectivityCheck', True)):
        logger.debug('Checking connectivity...')
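        # If clientSeen() returns False (no Host.ping within the timeout),
        # the new networks are removed and an exception is raised, so
        # creating the bridge fails.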
        if not clientSeen(_get_connectivity_timeout(options)):
            logger.info('Connectivity check failed, rolling back')
            for network in connectivity_check_networks:
                # If the new added network was created on top of
                # existing bond, we need to keep the bond on rollback
                # flow, else we will break the new created bond.
                _delNetwork(network, force=True,
                            implicitBonding=networks[network].
                            get('bonding') in bondings)
            raise ConfigNetworkError(ne.ERR_LOST_CONNECTION,
                                     'connectivity check failed')
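
To see the clientSeen logic in isolation, here is a minimal, self-contained simulation. It is a sketch under assumptions, not vdsm code: client_seen, touch and run_demo are hypothetical names, the marker file lives in a temporary directory instead of /var/run/vdsm/client.log, and a timer thread stands in for the Host.ping handler that refreshes the file.

```python
import os
import tempfile
import threading
import time


def client_seen(path, timeout, poll_interval=0.05):
    """Return True if `path` is modified after this call starts, within `timeout` seconds."""
    start = time.time()
    deadline = start + timeout
    while time.time() <= deadline:
        try:
            if os.stat(path).st_mtime > start:
                return True
        except FileNotFoundError:
            pass  # marker file does not exist yet
        time.sleep(poll_interval)
    return False


def touch(path):
    """Stand-in for the ping handler: create the marker file and refresh its mtime."""
    with open(path, "a"):
        pass
    now = time.time()
    os.utime(path, (now, now))


def run_demo(ping_delay, timeout):
    """Run one check; ping_delay=None means no ping ever arrives."""
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "client.log")
        timer = None
        if ping_delay is not None:
            timer = threading.Timer(ping_delay, touch, args=(marker,))
            timer.start()
        try:
            return client_seen(marker, timeout)
        finally:
            if timer is not None:
                timer.cancel()
                timer.join()


if __name__ == "__main__":
    print("ping arrives in time:", run_demo(ping_delay=0.2, timeout=2.0))
    print("no ping at all:", run_demo(ping_delay=None, timeout=0.3))
```

The first run returns True because the marker file is touched before the deadline. The second returns False, which is the branch that makes _check_connectivity roll back and raise ConfigNetworkError.
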
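Solution:
  1. Remove the host in the web UI and then add it again; the second attempt succeeds.
 
  2. Still to do: analyze the ovirt-engine code to find out why the Host.ping command was not sent to the host.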
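
While investigating item 2, it may help to watch on the host whether the marker file is being refreshed at all. The following is a small diagnostic sketch based only on the path quoted in the code comment above. seconds_since_modified is a hypothetical helper, not a vdsm API, and the path may differ between vdsm versions.

```python
import os
import time

# Path quoted in the clientSeen code comment; treat it as an assumption.
CLIENT_LOG = "/var/run/vdsm/client.log"


def seconds_since_modified(path, now=None):
    """Return the age of `path` in seconds, or None if it does not exist."""
    if now is None:
        now = time.time()
    try:
        return now - os.stat(path).st_mtime
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    age = seconds_since_modified(CLIENT_LOG)
    if age is None:
        print("%s not found" % CLIENT_LOG)
    else:
        print("%s last modified %.1f s ago" % (CLIENT_LOG, age))
```

If the reported age keeps growing while the engine is adding the host, no request that updates the file is reaching vdsm, which points at the engine side or the network path rather than at the host's network setup.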
 

 
