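A Production Database Outage Caused by the Oracle RAC Heartbeat Network
Database Oracle
A faulty network cable on the heartbeat (private interconnect) link between the nodes made the heartbeat network unstable, so the RAC nodes could no longer communicate with each other. The cluster hit a split-brain condition and the Oracle services were stopped. To protect consistency and integrity, a RAC cluster that loses its heartbeat network treats the situation as split brain and forcibly terminates Oracle instances.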

I. Problem Description

Environment:

Node          SID     db_name   software_version   Notes
172.16.2.22   hdls1   HDLS      11.2.0.4           RAC node
172.16.2.23   hdls2   HDLS      11.2.0.4           RAC node

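Cause of the incident:

The heartbeat network between the two nodes became abnormal and caused a RAC split brain, which terminated the Oracle instance processes running on the nodes and brought the database service down.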

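II. Process

1. Time: 16:45, fault reported and handling started

Checks showed that the Oracle instance processes on both nodes had terminated and the database could not be reached.

2. Time: 17:25, node 23 restored

Node 23 (172.16.2.23) was restored first so that business jobs could keep running while the fault on node 22 (172.16.2.22) was investigated. Further work waited until the jobs had finished.

  • After node 22 was rebooted, the database service on node 23 returned to normal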
reboot -f
  • Check the database service status on node 23
crs_stat -t 
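
When the `crs_stat -t` listing is long, it can help to filter it mechanically. The following is a minimal sketch, not part of the original procedure: it assumes the whitespace-separated five-column layout (Name, Type, Target, State, Host) and flags resources whose State differs from their Target. The function names and the third sample row (a database resource that should be ONLINE but is OFFLINE) are hypothetical.

```python
def parse_crs_stat(text):
    """Parse `crs_stat -t` style output into a list of dicts."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the dashed separator, and the header row.
        if len(parts) < 4 or parts[0] == "Name":
            continue
        name, rtype, target, state = parts[:4]
        host = parts[4] if len(parts) > 4 else ""  # OFFLINE rows carry no host
        rows.append({"name": name, "type": rtype, "target": target,
                     "state": state, "host": host})
    return rows


def mismatched(rows):
    """Names of resources whose State does not match their Target."""
    return [r["name"] for r in rows if r["target"] != r["state"]]


# Two rows in the layout crs_stat -t prints, plus one made-up failure row.
SAMPLE = """\
Name Type Target State Host
------------------------------------------------------------
ora.DATA.dg ora....up.type ONLINE ONLINE hdls01
ora.gsd ora.gsd.type OFFLINE OFFLINE
ora.hdls.db ora....se.type ONLINE OFFLINE
"""

if __name__ == "__main__":
    for name in mismatched(parse_crs_stat(SAMPLE)):
        print("needs attention:", name)
```

Comparing State against Target, rather than looking for OFFLINE alone, keeps resources that are OFFLINE on purpose (such as `ora.gsd` here) out of the report.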

3. Analysis of node 22

EVMD log

2022-09-06 22:37:17.970: [GIPCHTHR][3844073216]gipchaWorkerCreateInterface: created remote interface for node 'hdls02', haName 'fe0a-b4a2-f838-ac00', inf 'udp://11.0.0.23:19879'
2022-09-06 22:37:17.970: [GIPCHGEN][3844073216]gipchaWorkerAttachInterface: Interface attached inf 0x7f21c4021fa0 { host 'hdls02', haName 'fe0a-b4a2-f838-ac00', local 0x7f21c402b1b0, ip '11.0.0.23:19879', subnet '11.0.0.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x6 }
2022-09-06 22:37:17.970: [GIPCXCPT][3844073216]gipchaLowerRecv: message from unrecognized node 'udp://11.0.0.23:19879', hdr 0x7f21c002bf68 { len 80, seq 0, type gipchaHdrTypeAck (3), lastSeq 1, lastAck 0, minAck 2, flags 0x1, srcLuid 24d64699-7050de6f, dstLuid 6678805d-500d8712, msgId 1 }, ret gipcretFail (1)
2022-09-06 22:37:17.970: [GIPCHALO][3844073216]gipchaLowerCallback: EXCEPTION[ ret gipcretFail (1) ] error while processing req 0x7f21e51fbe60 { type gipcreqtypeRecv, endp 0000000000001950, ret gipcretSuccess, local 'udp://11.0.0.22:18417', peer 'udp://11.0.0.23:19879', buf 0x7f21c002bf68, len 10240, olen 80 }, hctx 0x21a9430 [0000000000000010] { gipchaContext : host 'hdls01', name '5e52-0b6f-5d73-b878', luid '6678805d-00000000', numNode 1, numInf 1, usrFlags 0x0, flags 0x5 }
2022-09-06 22:37:17.971: [ CRSCCL][3835287296]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
2022-09-06 22:37:17.971: [GIPCHALO][3844073216]gipchaLowerSend: deffering startup of hdr 0x7f21c001f2d8 { len 232, seq 0, type gipchaHdrTypeSend (1), lastSeq 0, lastAck 0, minAck 0, flags 0x0, srcLuid 00000000-00000000, dstLuid 00000000-00000000, msgId 0 }, node 0x7f21c4013650 { host 'hdls02', haName 'fe0a-b4a2-f838-ac00', srcLuid 6678805d-500d8712, dstLuid 00000000-00000000 numInf 1, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [1 : 1], createTime 7114604, sentRegister 1, localMonitor 0, flags 0x4 }
2022-09-06 22:37:17.981: [ CRSCCL][3835287296]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
2022-09-06 22:37:17.991: [ CRSCCL][3835287296]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
2022-09-06 22:37:18.001: [ CRSCCL][3835287296]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
2022-09-06 22:37:18.011: [ CRSCCL][3835287296]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
2022-09-06 22:37:18.021: [ CRSCCL][3835287296]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
2022-09-06 22:37:18.031: [ CRSCCL][3835287296]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
2022-09-06 22:37:18.035: [ CRSCCL][3835287296]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
2022-09-06 22:37:18.045: [ CRSCCL][3835287296]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
2022-09-06 22:37:18.055: [ CRSCCL][3835287296]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
2022-09-06 22:37:18.060: [ CRSCCL][3835287296]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
2022-09-06 22:37:18.070: [ CRSCCL][3835287296]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
2022-09-06 22:37:18.075: [ CRSCCL][3835287296]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
2022-09-06 22:37:18.079: [ CRSCCL][3835287296]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
2022-09-06 22:37:18.087: [ CRSCCL][3835287296]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
2022-09-06 22:37:18.097: [ CRSCCL][3835287296]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
2022-09-06 22:37:18.107: [ CRSCCL][3835287296]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
2022-09-06 22:37:18.117: [ CRSCCL][3835287296]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
2022-09-06 22:37:18.127: [ CRSCCL][3835287296]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
2022-09-06 22:37:18.972: [GIPCHALO][3844073216]gipchaLowerProcessAcks: ESTABLISH finished for node 0x7f21c4013650 { host 'hdls02', haName 'fe0a-b4a2-f838-ac00', srcLuid 6678805d-500d8712, dstLuid 24d64699-7050de6f numInf 1, contigSeq 2, lastAck 2, lastValidAck 0, sendSeq [1 : 1], createTime 7114604, sentRegister 1, localMonitor 0, flags 0x20c }
2022-09-06 22:37:18.972: [GIPCHALO][3844073216]gipchaLowerProcessWaitQ: triggering deffered startup of msg 0x7f21c001f2d8 { len 232, seq 0, type gipchaHdrTypeSend (1), lastSeq 0, lastAck 0, minAck 0, flags 0x0, srcLuid 00000000-00000000, dstLuid 00000000-00000000, msgId 0 }, node 0x7f21c4013650 { host 'hdls02', haName 'fe0a-b4a2-f838-ac00', srcLuid 6678805d-500d8712, dstLuid 24d64699-7050de6f numInf 1, contigSeq 2, lastAck 2, lastValidAck 0, sendSeq [2 : 2], createTime 7114604, sentRegister 1, localMonitor 0, flags 0x208 }
2022-09-06 22:37:18.973: [GIPCXCPT][3844073216]gipchaInternalResolve: failed to resolve ret gipcretKeyNotFound (36), host 'hdls01', port '13b5-9956-d0b9-0552', hctx 0x21a9430 [0000000000000010] { gipchaContext : host 'hdls01', name '5e52-0b6f-5d73-b878', luid '6678805d-00000000', numNode 1, numInf 1, usrFlags 0x0, flags 0x5 }, ret gipcretKeyNotFound (36)
2022-09-06 22:37:18.973: [GIPCHGEN][3844073216]gipchaResolveF [gipcmodGipcResolve : gipcmodGipc.c : 815]: EXCEPTION[ ret gipcretKeyNotFound (36) ] failed to resolve ctx 0x21a9430 [0000000000000010] { gipchaContext : host 'hdls01', name '5e52-0b6f-5d73-b878', luid '6678805d-00000000', numNode 1, numInf 1, usrFlags 0x0, flags 0x5 }, host 'hdls01', port '13b5-9956-d0b9-0552', flags 0x0
2022-09-06 22:37:18.973: [ CRSCCL][3835287296]No connection to peer(2, 18699304) will retry send msgid=0. rc=9
2022-09-06 22:37:18.973: [GIPCXCPT][3844073216]gipchaInternalResolve: failed to resolve ret gipcretKeyNotFound (36), host 'hdls01', port '0277-7bff-f7af-8073', hctx 0x21a9430 [0000000000000010] { gipchaContext : host 'hdls01', name '5e52-0b6f-5d73-b878', luid '6678805d-00000000', numNode 1, numInf 1, usrFlags 0x0, flags 0x5 }, ret gipcretKeyNotFound (36)
2022-09-06 22:37:18.973: [GIPCHGEN][3844073216]gipchaResolveF [gipcmodGipcResolve : gipcmodGipc.c : 815]: EXCEPTION[ ret gipcretKeyNotFound (36) ] failed to resolve ctx 0x21a9430 [0000000000000010] { gipchaContext : host 'hdls01', name '5e52-0b6f-5d73-b878', luid '6678805d-00000000', numNode 1, numInf 1, usrFlags 0x0, flags 0x5 }, host 'hdls01', port '0277-7bff-f7af-8073', flags 0x0
2022-09-06 22:37:18.973: [ CRSCCL][3837388544]clsCclNewConn: added new conn to tempConList: newPeerCon = bc007ba0
2022-09-06 22:37:18.973: [ CRSCCL][3837388544]PNC: Disconnecting conn from node (2,18699304).
2022-09-06 22:37:18.973: [ CRSCCL][3837388544]PNC: Keeping our connection to node (2,18699304).
2022-09-06 22:37:18.973: [GIPCHAUP][3844073216]gipchaUpperDisconnect: initiated discconnect umsg 0x7f21c0010c20 { msg 0x7f21c002dc88, ret gipcretRequestPending (15), flags 0x2 }, msg 0x7f21c002dc88 { type gipchaMsgTypeDisconnect (5), srcCid 00000000-00001a62, dstCid 00000000-000005da }, endp 0x7f21c0016c00 [0000000000001a62] { gipchaEndpoint : port 'EVMDMAIN2_1/2b61-5a3a-b3b0-633e', peer 'hdls02:d60b-3fd7-4897-3466', srcCid 00000000-00001a62, dstCid 00000000-000005da, numSend 0, maxSend 100, groupListType 2, hagroup 0x21d35a0, usrFlags 0x4000, flags 0x21c }
2022-09-06 22:37:18.973: [ CRSCCL][3837388544]ConnAccepted from Peer:msgTag= 0xcccccccc version= 0 msgType= 4 msgId= 0 msglen = 0 clschdr.size_clscmsgh= 88 src= (2, 18699304) dest= (1, 4294793640)
2022-09-06 22:37:18.973: [GIPCXCPT][3844073216]gipchaUpperProcessDisconnect: dropping Disconnect to unknown msg 0x7f21c0036a68 { type gipchaMsgTypeDisconnect (5), srcCid 00000000-000005da, dstCid 00000000-00001a62 }, node 0x7f21c4013650 { host 'hdls02', haName 'fe0a-b4a2-f838-ac00', srcLuid 6678805d-500d8712, dstLuid 24d64699-7050de6f numInf 1, contigSeq 7, lastAck 5, lastValidAck 6, sendSeq [6 : 6], createTime 7114604, sentRegister 1, localMonitor 0, flags 0x208 }, ret gipcretFail (1)
2022-09-06 22:37:18.973: [GIPCHAUP][3844073216]gipchaUpperProcessDisconnect: EXCEPTION[ ret gipcretFail (1) ] error during DISCONNECT processing for node 0x7f21c4013650 { host 'hdls02', haName 'fe0a-b4a2-f838-ac00', srcLuid 6678805d-500d8712, dstLuid 24d64699-7050de6f numInf 1, contigSeq 7, lastAck 5, lastValidAck 6, sendSeq [6 : 6], createTime 7114604, sentRegister 1, localMonitor 0, flags 0x208 }
2022-09-06 22:37:18.973: [GIPCHAUP][3844073216]gipchaUpperCallbackDisconnect: completed DISCONNECT ret gipcretSuccess (0), umsg 0x7f21c0010c20 { msg 0x7f21c002dc88, ret gipcretSuccess (0), flags 0x2 }, msg 0x7f21c002dc88 { type gipchaMsgTypeDisconnect (5), srcCid 00000000-00001a62, dstCid 00000000-000005da }, hendp 0x7f21c0016c00 [0000000000001a62] { gipchaEndpoint : port 'EVMDMAIN2_1/2b61-5a3a-b3b0-633e', peer 'hdls02:d60b-3fd7-4897-3466', srcCid 00000000-00001a62, dstCid 00000000-000005da, numSend 0, maxSend 100, groupListType 2, hagroup 0x21d35a0, usrFlags 0x4000, flags 0x21c }
2022-09-06 22:37:18.984: [ EVMD][3964278592] Authorization database built successfully.
2022-09-06 22:37:19.042: [ CLSE][3964278592]clse_get_auth_loc: Returning default authloc: /oracle/grid/crs_1/auth/evm/hdls01
2022-09-06 22:37:23.254: [ GIPCNET][3844073216]gipcmodNetworkProcessSend: [network] failed send attempt endp 0x7f21c001ecd0 [0000000000001950] { gipcEndpoint : localAddr 'udp://11.0.0.22:18417', remoteAddr '', numPend 5, numReady 1, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj 0x7f21c001eae0, sendp 0x7f21c0011550flags 0x3, usrFlags 0x4000 }, req 0x7f21c0011250 [0000000000001c19] { gipcSendRequest : addr 'udp://11.0.0.23:19879', data 0x7f21b0013448, len 1384, olen 0, parentEndp 0x7f21c001ecd0, ret gipcretEndpointNotAvailable (40), objFlags 0x0, reqFlags 0x2 }
2022-09-06 22:37:23.254: [ GIPCNET][3844073216]gipcmodNetworkProcessSend: slos op : sgipcnValidateSocket
2022-09-06 22:37:23.255: [ GIPCNET][3844073216]gipcmodNetworkProcessSend: slos dep : Invalid argument (22)
2022-09-06 22:37:23.255: [ GIPCNET][3844073216]gipcmodNetworkProcessSend: slos loc : address not
2022-09-06 22:37:23.255: [ GIPCNET][3844073216]gipcmodNetworkProcessSend: slos info: addr '11.0.0.22:18417', len 1384, buf 0x7f21b0013448, cookie 0x7f21c0011250
2022-09-06 22:37:23.255: [GIPCXCPT][3844073216]gipcInternalSendSync: failed sync request, ret gipcretEndpointNotAvailable (40)
2022-09-06 22:37:23.255: [GIPCXCPT][3844073216]gipcSendSyncF [gipchaLowerInternalSend : gipchaLower.c : 846]: EXCEPTION[ ret gipcretEndpointNotAvailable (40) ] failed to send on endp 0x7f21c001ecd0 [0000000000001950] { gipcEndpoint : localAddr 'udp://11.0.0.22:18417', remoteAddr '', numPend 5, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj 0x7f21c001eae0, sendp 0x7f21c0011550flags 0x3, usrFlags 0x4000 }, addr 0x7f21c00183a0 [0000000000001a10] { gipcAddress : name 'udp://11.0.0.23:19879', objFlags 0x0, addrFlags 0x1 }, buf 0x7f21b0013448, len 1384, flags 0x0
2022-09-06 22:37:23.255: [GIPCHGEN][3844073216]gipchaInterfaceFail: marking interface failing 0x7f21c4021fa0 { host 'hdls02', haName 'fe0a-b4a2-f838-ac00', local 0x7f21c402b1b0, ip '11.0.0.23:19879', subnet '11.0.0.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 4, flags 0x6 }
2022-09-06 22:37:23.255: [GIPCHALO][3844073216]gipchaLowerInternalSend: failed to initiate send on interface 0x7f21c4021fa0 { host 'hdls02', haName 'fe0a-b4a2-f838-ac00', local 0x7f21c402b1b0, ip '11.0.0.23:19879', subnet '11.0.0.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 4, flags 0x86 }, hctx 0x21a9430 [0000000000000010] { gipchaContext : host 'hdls01', name '5e52-0b6f-5d73-b878', luid '6678805d-00000000', numNode 1, numInf 1, usrFlags 0x0, flags 0x5 }
2022-09-06 22:37:23.255: [GIPCHGEN][3844073216]gipchaInterfaceDisable: disabling interface 0x7f21c402b1b0 { host '', haName '5e52-0b6f-5d73-b878', local (nil), ip '11.0.0.22:18417', subnet '11.0.0.0', mask '255.255.255.0', mac 'ec-c0-1b-08-5e-b6', ifname 'eno3', numRef 0, numFail 1, idxBoot 0, flags 0x10d }
2022-09-06 22:37:23.255: [GIPCHGEN][3844073216]gipchaInterfaceDisable: disabling interface 0x7f21c4021fa0 { host 'hdls02', haName 'fe0a-b4a2-f838-ac00', local 0x7f21c402b1b0, ip '11.0.0.23:19879', subnet '11.0.0.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 4, flags 0x86 }
2022-09-06 22:37:23.255: [GIPCHALO][3844073216]gipchaLowerCleanInterfaces: performing cleanup of disabled interface 0x7f21c4021fa0 { host 'hdls02', haName 'fe0a-b4a2-f838-ac00', local 0x7f21c402b1b0, ip '11.0.0.23:19879', subnet '11.0.0.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 4, flags 0xa6 }
2022-09-06 22:37:23.255: [GIPCHGEN][3844073216]gipchaInterfaceReset: resetting interface 0x7f21c4021fa0 { host 'hdls02', haName 'fe0a-b4a2-f838-ac00', local 0x7f21c402b1b0, ip '11.0.0.23:19879', subnet '11.0.0.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 4, flags 0xa6 }
2022-09-06 22:37:23.372: [GIPCHDEM][3844073216]gipchaWorkerCleanInterface: performing cleanup of disabled interface 0x7f21c402b1b0 { host '', haName '5e52-0b6f-5d73-b878', local (nil), ip '11.0.0.22:18417', subnet '11.0.0.0', mask '255.255.255.0', mac 'ec-c0-1b-08-5e-b6', ifname 'eno3', numRef 0, numFail 0, idxBoot 0, flags 0x12d }
2022-09-06 22:37:23.372: [GIPCHTHR][3844073216]gipchaWorkerCreateInterface: created remote interface for node 'hdls02', haName 'fe0a-b4a2-f838-ac00', inf 'udp://11.0.0.23:19879'
2022-09-06 22:37:23.373: [GIPCXCPT][3841971968]gipchaDaemonProcessRecv: dropping unrecognized daemon request 17, hctx 0x21a9430 [0000000000000010] { gipchaContext : host 'hdls01', name '5e52-0b6f-5d73-b878', luid '6678805d-00000000', numNode 1, numInf 0, usrFlags 0x0, flags 0x5 }, ret gipcretFail (1)
2022-09-06 22:37:23.373: [GIPCHDEM][3841971968]gipchaDaemonProcessRecv: EXCEPTION[ ret gipcretFail (1) ] exception processing requset type 17, hctx 0x21a9430 [0000000000000010] { gipchaContext : host 'hdls01', name '5e52-0b6f-5d73-b878', luid '6678805d-00000000', numNode 1, numInf 0, usrFlags 0x0, flags 0x5 }
2022-09-06 22:37:27.377: [GIPCHDEM][3841971968]gipchaDaemonInfRequest: sent local interfaceRequest, hctx 0x21a9430 [0000000000000010] { gipchaContext : host 'hdls01', name '5e52-0b6f-5d73-b878', luid '6678805d-00000000', numNode 1, numInf 0, usrFlags 0x0, flags 0x1 } to gipcd
2022-09-06 22:37:28.916: [GIPCHALO][3844073216]gipchaLowerProcessNode: no valid interfaces found to node for 5660 ms, node 0x7f21c4013650 { host 'hdls02', haName 'fe0a-b4a2-f838-ac00', srcLuid 6678805d-500d8712, dstLuid 24d64699-7050de6f numInf 1, contigSeq 9, lastAck 20, lastValidAck 9, sendSeq [21 : 27], createTime 7114604, sentRegister 1, localMonitor 0, flags 0x8 }
2022-09-06 22:37:32.598: [GIPCHDEM][3841971968]gipchaDaemonInfRequest: sent local interfaceRequest, hctx 0x21a9430 [0000000000000010] { gipchaContext : host 'hdls01', name '5e52-0b6f-5d73-b878', luid '6678805d-00000000', numNode 1, numInf 0, usrFlags 0x0, flags 0x1 } to gipcd
2022-09-06 22:37:32.612: [GIPCHGEN][3841971968]gipchaNodeAddInterface: adding interface information for inf 0x7f21c4024d60 { host '', haName '5e52-0b6f-5d73-b878', local (nil), ip '11.0.0.22', subnet '11.0.0.0', mask '255.255.255.0', mac 'ec-c0-1b-08-5e-b6', ifname 'eno3', numRef 0, numFail 0, idxBoot 0, flags 0x1 }
2022-09-06 22:37:33.409: [GIPCHTHR][3844073216]gipchaWorkerCreateInterface: created local interface for node 'hdls01', haName '5e52-0b6f-5d73-b878', inf 'udp://11.0.0.22:23405'
2022-09-06 22:37:33.409: [GIPCHGEN][3844073216]gipchaWorkerAttachInterface: Interface attached inf 0x7f21c4021fa0 { host 'hdls02', haName 'fe0a-b4a2-f838-ac00', local 0x7f21c4024d60, ip '11.0.0.23:19879', subnet '11.0.0.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 4, flags 0x6 }
2022-09-07 00:25:13.978: [GIPCHGEN][3841971968]gipchaInterfaceFail: marking interface failing 0x7f21c4024d60 { host '', haName '5e52-0b6f-5d73-b878', local (nil), ip '11.0.0.22:23405', subnet '11.0.0.0', mask '255.255.255.0', mac 'ec-c0-1b-08-5e-b6', ifname 'eno3', numRef 1, numFail 0, idxBoot 0, flags 0xd }
2022-09-07 00:25:14.397: [GIPCHGEN][3844073216]gipchaInterfaceFail: marking interface failing 0x7f21c4021fa0 { host 'hdls02', haName 'fe0a-b4a2-f838-ac00', local 0x7f21c4024d60, ip '11.0.0.23:19879', subnet '11.0.0.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 4, flags 0x6 }
2022-09-07 00:25:15.398: [GIPCHGEN][3844073216]gipchaInterfaceDisable: disabling interface 0x7f21c4024d60 { host '', haName '5e52-0b6f-5d73-b878', local (nil), ip '11.0.0.22:23405', subnet '11.0.0.0', mask '255.255.255.0', mac 'ec-c0-1b-08-5e-b6', ifname 'eno3', numRef 0, numFail 1, idxBoot 0, flags 0x18d }
2022-09-07 00:25:15.398: [GIPCHGEN][3844073216]gipchaInterfaceDisable: disabling interface 0x7f21c4021fa0 { host 'hdls02', haName 'fe0a-b4a2-f838-ac00', local 0x7f21c4024d60, ip '11.0.0.23:19879', subnet '11.0.0.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 4, flags 0x86 }
2022-09-07 00:25:15.398: [GIPCHALO][3844073216]gipchaLowerCleanInterfaces: performing cleanup of disabled interface 0x7f21c4021fa0 { host 'hdls02', haName 'fe0a-b4a2-f838-ac00', local 0x7f21c4024d60, ip '11.0.0.23:19879', subnet '11.0.0.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 4, flags 0xa6 }
2022-09-07 00:25:15.398: [GIPCHGEN][3844073216]gipchaInterfaceReset: resetting interface 0x7f21c4021fa0 { host 'hdls02', haName 'fe0a-b4a2-f838-ac00', local 0x7f21c4024d60, ip '11.0.0.23:19879', subnet '11.0.0.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 4, flags 0xa6 }
2022-09-07 00:25:16.399: [GIPCHDEM][3844073216]gipchaWorkerCleanInterface: performing cleanup of disabled interface 0x7f21c4024d60 { host '', haName '5e52-0b6f-5d73-b878', local (nil), ip '11.0.0.22:23405', subnet '11.0.0.0', mask '255.255.255.0', mac 'ec-c0-1b-08-5e-b6', ifname 'eno3', numRef 0, numFail 0, idxBoot 0, flags 0x1ad }

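The log above shows that heartbeat communication between the two nodes was failing: neither node could obtain information about its peer, which led to the Oracle instance processes being terminated. In particular, gipchaInterfaceFail and gipchaInterfaceDisable records appear for the local private interface eno3 (11.0.0.22) and for the peer address 11.0.0.23, first at 22:37:23 and again between 00:25:13 and 00:25:15.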
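
Picking the interface failures out of a long EVMD log by eye is tedious. As a rough sketch (the regular expression is an assumption fitted to the line layout seen in the excerpt above, not an official log format), the script below extracts the timestamp, function name and IP address of every `gipchaInterfaceFail` and `gipchaInterfaceDisable` record. The sample lines are abridged from the excerpt.

```python
import re
from collections import Counter

# Matches e.g. "2022-09-07 00:25:13.978: [GIPCHGEN][3841971968]gipchaInterfaceFail: ... ip '11.0.0.22:23405'"
EVENT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"\[\s*\w+\]\[\d+\](?P<func>gipchaInterface(?:Fail|Disable)):"
    r".*?ip '(?P<ip>[\d.]+)"
)


def interface_events(lines):
    """Yield (timestamp, function, ip) for each interface fail/disable record."""
    for line in lines:
        m = EVENT_RE.match(line)
        if m:
            yield m.group("ts"), m.group("func"), m.group("ip")


# Abridged lines from the log excerpt; the last one is an unrelated record.
SAMPLE = [
    "2022-09-07 00:25:13.978: [GIPCHGEN][3841971968]gipchaInterfaceFail: marking interface failing 0x7f21c4024d60 { host '', ip '11.0.0.22:23405', ifname 'eno3' }",
    "2022-09-07 00:25:14.397: [GIPCHGEN][3844073216]gipchaInterfaceFail: marking interface failing 0x7f21c4021fa0 { host 'hdls02', ip '11.0.0.23:19879', ifname '' }",
    "2022-09-07 00:25:15.398: [GIPCHGEN][3844073216]gipchaInterfaceDisable: disabling interface 0x7f21c4024d60 { host '', ip '11.0.0.22:23405', ifname 'eno3' }",
    "2022-09-06 22:37:18.984: [ EVMD][3964278592] Authorization database built successfully.",
]

if __name__ == "__main__":
    counts = Counter((func, ip) for _, func, ip in interface_events(SAMPLE))
    for (func, ip), n in sorted(counts.items()):
        print(f"{func:24s} {ip:12s} {n}")
```

Grouping by function and IP makes it easy to see whether failures are reported for the local interface, the peer, or both.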

System log

The system log shows the eno3 heartbeat interface cycling continuously between DOWN and UP, so the link was unstable.
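
Link flapping of this kind can be quantified by counting state changes per interface. The sketch below is illustrative only: the article does not reproduce the system log, so the message format ("<ifname>: NIC Link is Up/Down"), the driver name and the sample lines are all assumptions and need to be adapted to what the kernel actually logs on the host.

```python
import re

# Assumed kernel message shape: "... eno3: NIC Link is Down"
LINK_RE = re.compile(r"(?P<ifname>\w+): (?:NIC )?Link is (?P<state>Up|Down)")


def count_flaps(lines, ifname):
    """Count Up->Down and Down->Up transitions logged for one interface."""
    last, flaps = None, 0
    for line in lines:
        m = LINK_RE.search(line)
        if not m or m.group("ifname") != ifname:
            continue
        state = m.group("state")
        if last is not None and state != last:
            flaps += 1
        last = state
    return flaps


# Made-up sample lines, not taken from the incident.
SAMPLE = [
    "kernel: igb 0000:01:00.2 eno3: NIC Link is Down",
    "kernel: igb 0000:01:00.2 eno3: NIC Link is Up 1000 Mbps Full Duplex",
    "kernel: igb 0000:01:00.0 eno1: NIC Link is Up 1000 Mbps Full Duplex",
    "kernel: igb 0000:01:00.2 eno3: NIC Link is Down",
    "kernel: igb 0000:01:00.2 eno3: NIC Link is Up 1000 Mbps Full Duplex",
]

if __name__ == "__main__":
    print("eno3 transitions:", count_flaps(SAMPLE, "eno3"))
```

A healthy interconnect port should show no transitions at all during normal operation, so any non-zero count on the heartbeat interface is worth investigating at the cable, NIC or switch port.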

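4. Time: 22:30, the listener on node 23 (the only running node) hung

Because of the heartbeat network fault, the two nodes could not communicate normally. At 22:30 the instance on node 23 was interrupted, and at 23:38 the database service on node 23 recovered.

5. Time: 00:30, after the jobs finished, the heartbeat Cat 6 cable was replaced

  • Once the business jobs had finished, the heartbeat cable was replaced with a new Cat 6 cable.
  • Starting the database service on node 22 was then attempted and succeeded.
srvctl start instance -d HDLS -i hdls1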

6. Time: 00:30, database back to normal

  • Listener status
[grid@hdls01 ~]$ srvctl status listener
Listener LISTENER is enabled
Listener LISTENER is running on node(s): hdls01,hdls02
  • Database instance status
[grid@hdls01 ~]$ ps -ef |grep -Ei "ora_"
oracle 13648 1 0 15:05 ? 00:00:00 ora_w002_hdls1
oracle 20745 1 0 00:48 ? 00:00:08 ora_pmon_hdls1
oracle 20747 1 0 00:48 ? 00:00:02 ora_psp0_hdls1
oracle 20749 1 0 00:48 ? 00:01:46 ora_vktm_hdls1
oracle 20754 1 0 00:48 ? 00:00:00 ora_gen0_hdls1
oracle 20756 1 0 00:48 ? 00:00:08 ora_diag_hdls1
oracle 20758 1 0 00:48 ? 00:00:03 ora_dbrm_hdls1
oracle 20760 1 0 00:48 ? 00:00:01 ora_ping_hdls1
oracle 20762 1 0 00:48 ? 00:00:00 ora_acms_hdls1
oracle 20764 1 0 00:48 ? 00:02:48 ora_dia0_hdls1
oracle 20766 1 0 00:48 ? 00:01:36 ora_lmon_hdls1
oracle 20768 1 0 00:48 ? 00:00:17 ora_lmd0_hdls1
oracle 20770 1 0 00:48 ? 00:01:17 ora_lms0_hdls1
oracle 20774 1 0 00:48 ? 00:01:18 ora_lms1_hdls1
oracle 20778 1 0 00:48 ? 00:01:14 ora_lms2_hdls1
oracle 20782 1 0 00:48 ? 00:01:14 ora_lms3_hdls1
oracle 20786 1 0 00:48 ? 00:01:14 ora_lms4_hdls1
oracle 20790 1 0 00:48 ? 00:00:00 ora_rms0_hdls1
oracle 20792 1 0 00:48 ? 00:00:01 ora_lmhb_hdls1
oracle 20794 1 0 00:48 ? 00:00:14 ora_mman_hdls1
oracle 20796 1 0 00:48 ? 00:00:03 ora_dbw0_hdls1
oracle 20798 1 0 00:48 ? 00:00:03 ora_dbw1_hdls1
oracle 20800 1 0 00:48 ? 00:00:03 ora_dbw2_hdls1
oracle 20802 1 0 00:48 ? 00:00:03 ora_dbw3_hdls1
oracle 20804 1 0 00:48 ? 00:00:03 ora_dbw4_hdls1
oracle 20806 1 0 00:48 ? 00:00:03 ora_dbw5_hdls1
oracle 20808 1 0 00:48 ? 00:00:03 ora_dbw6_hdls1
oracle 20810 1 0 00:48 ? 00:00:03 ora_dbw7_hdls1
oracle 20812 1 0 00:48 ? 00:00:03 ora_dbw8_hdls1
oracle 20814 1 0 00:48 ? 00:00:03 ora_dbw9_hdls1
oracle 20816 1 0 00:48 ? 00:00:03 ora_dbwa_hdls1
oracle 20818 1 0 00:48 ? 00:00:03 ora_dbwb_hdls1
oracle 20820 1 0 00:48 ? 00:01:30 ora_lgwr_hdls1
oracle 20822 1 0 00:48 ? 00:00:34 ora_ckpt_hdls1
oracle 20824 1 0 00:48 ? 00:00:15 ora_smon_hdls1
oracle 20826 1 0 00:48 ? 00:00:00 ora_reco_hdls1
oracle 20828 1 0 00:48 ? 00:00:00 ora_rbal_hdls1
oracle 20830 1 0 00:48 ? 00:00:00 ora_asmb_hdls1
oracle 20832 1 0 00:48 ? 00:00:45 ora_mmon_hdls1
oracle 20834 1 0 00:48 ? 00:01:01 ora_mmnl_hdls1
oracle 20838 1 0 00:48 ? 00:00:00 ora_d000_hdls1
oracle 20840 1 0 00:48 ? 00:00:00 ora_mark_hdls1
oracle 20842 1 0 00:48 ? 00:00:00 ora_s000_hdls1
oracle 20899 1 0 00:48 ? 00:00:27 ora_lck0_hdls1
oracle 20901 1 0 00:48 ? 00:00:01 ora_rsmn_hdls1
oracle 20916 1 0 00:48 ? 00:00:06 ora_o000_hdls1
oracle 21003 1 0 00:48 ? 00:00:00 ora_arc0_hdls1
oracle 21005 1 0 00:48 ? 00:00:00 ora_arc1_hdls1
oracle 21007 1 0 00:48 ? 00:00:01 ora_arc2_hdls1
oracle 21009 1 0 00:48 ? 00:00:00 ora_arc3_hdls1
oracle 21053 1 0 00:48 ? 00:00:06 ora_o001_hdls1
oracle 21251 1 0 00:48 ? 00:00:25 ora_nsa2_hdls1
oracle 21253 1 0 00:48 ? 00:00:06 ora_o002_hdls1
oracle 21264 1 0 00:48 ? 00:00:00 ora_gtx0_hdls1
oracle 21268 1 0 00:48 ? 00:00:01 ora_rcbg_hdls1
oracle 21274 1 0 00:48 ? 00:00:00 ora_qmnc_hdls1
oracle 21296 1 0 00:48 ? 00:00:00 ora_q000_hdls1
oracle 21349 1 0 00:48 ? 00:00:04 ora_cjq0_hdls1
oracle 22074 1 0 00:49 ? 00:00:00 ora_q002_hdls1
oracle 26225 1 0 00:53 ? 00:00:00 ora_smco_hdls1
oracle 54283 1 0 15:56 ? 00:00:00 ora_j000_hdls1
oracle 71251 1 0 16:17 ? 00:00:00 ora_w001_hdls1
oracle 72429 1 0 16:18 ? 00:00:00 ora_pz99_hdls1
oracle 72500 1 0 16:18 ? 00:00:00 ora_j001_hdls1
grid 74000 71958 0 16:19 pts/1 00:00:00 grep --color=auto -Ei ora_
[grid@hdls01 ~]$
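
Rather than reading the whole `ps` listing, the key background processes can be checked automatically. This is a minimal sketch under stated assumptions: the `REQUIRED` set is a selection made for this example (core instance processes plus `lmon` for RAC), not an official list, and the function names are hypothetical. The sample lines are copied from the listing above.

```python
import re

# Assumed minimum set of background processes for a usable RAC instance.
REQUIRED = {"pmon", "smon", "lgwr", "ckpt", "dbw0", "lmon"}


def running_processes(ps_output, sid):
    """Return the Oracle background process names found for one SID."""
    pattern = re.compile(r"\bora_(\w+?)_" + re.escape(sid) + r"$")
    names = set()
    for line in ps_output.splitlines():
        m = pattern.search(line.strip())
        if m:
            names.add(m.group(1))
    return names


def missing_processes(ps_output, sid):
    """Return the required processes that do not appear in the ps output."""
    return REQUIRED - running_processes(ps_output, sid)


SAMPLE = """\
oracle 20745 1 0 00:48 ? 00:00:08 ora_pmon_hdls1
oracle 20766 1 0 00:48 ? 00:01:36 ora_lmon_hdls1
oracle 20796 1 0 00:48 ? 00:00:03 ora_dbw0_hdls1
oracle 20820 1 0 00:48 ? 00:01:30 ora_lgwr_hdls1
oracle 20822 1 0 00:48 ? 00:00:34 ora_ckpt_hdls1
oracle 20824 1 0 00:48 ? 00:00:15 ora_smon_hdls1
grid 74000 71958 0 16:19 pts/1 00:00:00 grep --color=auto -Ei ora_
"""

if __name__ == "__main__":
    missing = missing_processes(SAMPLE, "hdls1")
    print("missing:", sorted(missing) if missing else "none")
```

Anchoring the pattern on the SID at the end of the line also ignores the `grep` process itself, which otherwise shows up in the listing.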

7. RAC status check after the fix

  • Check the RAC cluster services
[grid@hdls01 ~]$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.ARCHLOG.dg ora....up.type ONLINE ONLINE hdls01
ora.DATA.dg ora....up.type ONLINE ONLINE hdls01
ora....ER.lsnr ora....er.type ONLINE ONLINE hdls01
ora....N1.lsnr ora....er.type ONLINE ONLINE hdls02
ora.OCRVT.dg ora....up.type ONLINE ONLINE hdls01
ora.asm ora.asm.type ONLINE ONLINE hdls01
ora.cvu ora.cvu.type ONLINE ONLINE hdls02
ora.gsd ora.gsd.type OFFLINE OFFLINE
ora.hdls.db ora....se.type ONLINE ONLINE hdls01
ora....SM1.asm application ONLINE ONLINE hdls01
ora....01.lsnr application ONLINE ONLINE hdls01
ora.hdls01.gsd application OFFLINE OFFLINE
ora.hdls01.ons application ONLINE ONLINE hdls01
ora.hdls01.vip ora....t1.type ONLINE ONLINE hdls01
ora....SM2.asm application ONLINE ONLINE hdls02
ora....02.lsnr application ONLINE ONLINE hdls02
ora.hdls02.gsd application OFFLINE OFFLINE
ora.hdls02.ons application ONLINE ONLINE hdls02
ora.hdls02.vip ora....t1.type ONLINE ONLINE hdls02
ora....network ora....rk.type ONLINE ONLINE hdls01
ora.oc4j ora.oc4j.type ONLINE ONLINE hdls01
ora.ons ora.ons.type ONLINE ONLINE hdls01
ora.scan1.vip ora....ip.type ONLINE
  • Check the database
[grid@hdls01 ~]$ srvctl status listener 
Listener LISTENER is enabled
Listener LISTENER is running on node(s): hdls01,hdls02
SQL> select name,status from v$datafile;

NAME STATUS
-------------------------------------------------------------------------------- -------
+DATA/hdls/datafile/system.306.1100288753 SYSTEM
+DATA/hdls/datafile/sysaux.264.1100288753 ONLINE
+DATA/hdls/datafile/undotbs1.263.1100288753 ONLINE
+DATA/hdls/datafile/users.260.1100288753 ONLINE
+DATA/hdls/datafile/undotbs2.277.1100288833 ONLINE
+DATA/hdls/oauser01.dbf ONLINE
+DATA/hdls/hdls2001.dbf ONLINE
+DATA/hdls/oa01.dbf ONLINE
+DATA/hdls/hdls01.dbf ONLINE
+DATA/hdls/hdls02.dbf ONLINE
+DATA/hdls/hdls03.dbf ONLINE
+DATA/hdls/hdls04.dbf ONLINE
+DATA/hdls/hdls05.dbf ONLINE
+DATA/hdls/hdls06.dbf ONLINE
+DATA/hdls/hdls07.dbf ONLINE
+DATA/hdls/hdls08.dbf ONLINE
+DATA/hdls/hdls09.dbf ONLINE
+DATA/hdls/hdls10.dbf ONLINE
+DATA/hdls/others01.dbf ONLINE
+DATA/hdls/others02.dbf ONLINE
+DATA/hdls/others03.dbf ONLINE
+DATA/hdls/indx01.dbf ONLINE
+DATA/hdls/indx02.dbf ONLINE
+DATA/hdls/indx03.dbf ONLINE
+DATA/hdls/indx04.dbf ONLINE
+DATA/hdls/indx05.dbf ONLINE
+DATA/hdls/indx06.dbf ONLINE
+DATA/hdls/indx07.dbf ONLINE
+DATA/hdls/hdls131701.dbf ONLINE
+DATA/hdls/hdls131702.dbf ONLINE
+DATA/hdls/hdls131703.dbf ONLINE
+DATA/hdls/hdls131704.dbf ONLINE
+DATA/hdls/hdls131705.dbf ONLINE
+DATA/hdls/hdls131706.dbf ONLINE
+DATA/hdls/cdc01.dbf ONLINE

35 rows selected.

SQL>
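
With many datafiles, a quick filter over the `v$datafile` output highlights anything that is not in a healthy state. The sketch below assumes the two-column SQL*Plus layout shown above and ASM paths that start with "+"; it treats ONLINE and SYSTEM as healthy and reports everything else. The third sample row (status RECOVER) is hypothetical and only there to show a hit.

```python
HEALTHY = {"ONLINE", "SYSTEM"}


def unhealthy_datafiles(sqlplus_output):
    """Return (name, status) pairs whose STATUS is not ONLINE or SYSTEM."""
    bad = []
    for line in sqlplus_output.splitlines():
        parts = line.split()
        # Data rows have exactly two fields and a path beginning with "+".
        if len(parts) == 2 and parts[0].startswith("+"):
            name, status = parts
            if status not in HEALTHY:
                bad.append((name, status))
    return bad


SAMPLE = """\
NAME STATUS
-------------------------------------------------------------------------------- -------
+DATA/hdls/datafile/system.306.1100288753 SYSTEM
+DATA/hdls/datafile/sysaux.264.1100288753 ONLINE
+DATA/hdls/example01.dbf RECOVER

3 rows selected.
"""

if __name__ == "__main__":
    for name, status in unhealthy_datafiles(SAMPLE):
        print(f"{status:8s} {name}")
```

An empty result corresponds to the situation in the listing above, where every file is either SYSTEM or ONLINE.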

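III. Summary

  1. A faulty network cable on the heartbeat link between the nodes made the heartbeat network unstable, so the RAC nodes could not communicate normally, a split brain occurred, and the Oracle services were stopped. To protect consistency and integrity, a RAC cluster that loses its heartbeat network treats the situation as split brain and forcibly terminates Oracle instances.
  2. After the heartbeat Cat 6 cable was replaced, the database returned to normal.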
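
The mechanism in point 1 can be pictured as a timeout on missed heartbeats. The following is a deliberately simplified sketch, not Oracle's implementation: it only shows the idea that a peer whose last network heartbeat is older than a threshold is considered unreachable. The function name is hypothetical and the 30-second threshold is an assumption for illustration; Oracle Clusterware exposes the real value as the CSS misscount setting, which can be read with `crsctl get css misscount`.

```python
def unreachable_peers(last_heartbeat, now, misscount=30.0):
    """Return peers whose last network heartbeat is older than `misscount` seconds.

    last_heartbeat maps node name -> time (in seconds) of the last heartbeat seen.
    """
    return sorted(node for node, ts in last_heartbeat.items()
                  if now - ts > misscount)


if __name__ == "__main__":
    # hdls02 was last heard from 35 seconds ago, which exceeds the threshold.
    seen = {"hdls01": 100.0, "hdls02": 65.0}
    print(unreachable_peers(seen, now=100.0))
```

With a flapping cable, heartbeats arrive intermittently, so a node can cross this threshold repeatedly, which is consistent with the repeated interruptions described in the timeline.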
Editor: 姜华  Source: 今日头条