Congestion on ADS: how to diagnose the cause and prevent recurrence

Hi, can you comment on the congestion we saw on our system this afternoon? How can we diagnose the root cause? Can we use rrdump (or any other tool) to diagnose this after the event?

[Attached images: cpu2.jpg, swap1.jpg]

ADS1 log

<rmds.1.ads.3.itemThread: Notice: Tue Jul 19  15:12:57 2022> Suspended requests due to transport congestion.<END>
<rmds.1.ads.3.itemThread.3.rrcpTransport.: Warning: Tue Jul 19 15:12:57 2022> RRCP STATUS MSG: RRCP_CONG_BEGIN: possible congestion sets in<END>
<rmds.1.ads: Notice: Tue Jul 19 15:12:57 2022> Suspended requests due to transport congestion.<END>
<rmds.1.ads.rrcpTransport.1.sinkSide.rrcp.transmissionBus: Warning: Tue Jul 19 15:12:57 2022> RRCP STATUS MSG: RRCP_CONG_BEGIN: possible congestion sets in<END>
<rmds.1.ads.2.itemThread: Notice: Tue Jul 19 15:12:57 2022> Suspended requests due to transport congestion.<END>
<rmds.1.ads.2.itemThread.2.rrcpTransport.: Warning: Tue Jul 19 15:12:57 2022> RRCP STATUS MSG: RRCP_CONG_BEGIN: possible congestion sets in<END>
<rmds.1.ads.1.itemThread: Notice: Tue Jul 19 15:12:57 2022> Suspended requests due to transport congestion.<END>
<rmds.1.ads.1.itemThread.1.rrcpTransport.: Warning: Tue Jul 19 15:12:57 2022> RRCP STATUS MSG: RRCP_CONG_BEGIN: possible congestion sets in<END>
<rmds.1.ads.1.itemThread: Notice: Tue Jul 19 15:12:59 2022> Resumed requests that were suspended due to transport congestion.<END>
<rmds.1.ads.1.itemThread.1.rrcpTransport.: Warning: Tue Jul 19 15:12:59 2022> RRCP STATUS MSG: RRCP_CONG_END: congestion ends<END>
<rmds.1.ads: Notice: Tue Jul 19 15:12:59 2022> Resumed requests that were suspended due to transport congestion.<END>
<rmds.1.ads.rrcpTransport.1.sinkSide.rrcp.transmissionBus: Warning: Tue Jul 19 15:12:59 2022> RRCP STATUS MSG: RRCP_CONG_END: congestion ends<END>
<rmds.1.ads.3.itemThread: Notice: Tue Jul 19 15:12:59 2022> Resumed requests that were suspended due to transport congestion.<END>
<rmds.1.ads.2.itemThread: Notice: Tue Jul 19 15:12:59 2022> Resumed requests that were suspended due to transport congestion.<END>
<rmds.1.ads.2.itemThread.2.rrcpTransport.: Warning: Tue Jul 19 15:12:59 2022> RRCP STATUS MSG: RRCP_CONG_END: congestion ends<END>
<rmds.1.ads.3.itemThread.3.rrcpTransport.: Warning: Tue Jul 19 15:12:59 2022> RRCP STATUS MSG: RRCP_CONG_END: congestion ends<END>

ADS2.log

<rmds.1.ads: Notice: Tue Jul 19  15:12:58 2022> Suspended requests due to transport congestion.<END>
<rmds.1.ads.rrcpTransport.1.sinkSide.rrcp.transmissionBus: Warning: Tue Jul 19 15:12:58 2022> RRCP STATUS MSG: RRCP_CONG_BEGIN: possible congestion sets in<END>
<rmds.1.ads.2.itemThread: Notice: Tue Jul 19 15:12:58 2022> Suspended requests due to transport congestion.<END>
<rmds.1.ads.2.itemThread.2.rrcpTransport.: Warning: Tue Jul 19 15:12:58 2022> RRCP STATUS MSG: RRCP_CONG_BEGIN: possible congestion sets in<END>
<rmds.1.ads.3.itemThread: Notice: Tue Jul 19 15:12:58 2022> Suspended requests due to transport congestion.<END>
<rmds.1.ads.3.itemThread.3.rrcpTransport.: Warning: Tue Jul 19 15:12:58 2022> RRCP STATUS MSG: RRCP_CONG_BEGIN: possible congestion sets in<END>
<rmds.1.ads.1.itemThread: Notice: Tue Jul 19 15:12:58 2022> Suspended requests due to transport congestion.<END>
<rmds.1.ads.1.itemThread.1.rrcpTransport.: Warning: Tue Jul 19 15:12:58 2022> RRCP STATUS MSG: RRCP_CONG_BEGIN: possible congestion sets in<END>
<rmds.1.ads.1.itemThread: Notice: Tue Jul 19 15:12:59 2022> Resumed requests that were suspended due to transport congestion.<END>
<rmds.1.ads.1.itemThread.1.rrcpTransport.: Warning: Tue Jul 19 15:12:59 2022> RRCP STATUS MSG: RRCP_CONG_END: congestion ends<END>
<rmds.1.ads.2.itemThread: Notice: Tue Jul 19 15:12:59 2022> Resumed requests that were suspended due to transport congestion.<END>
<rmds.1.ads.2.itemThread.2.rrcpTransport.: Warning: Tue Jul 19 15:12:59 2022> RRCP STATUS MSG: RRCP_CONG_END: congestion ends<END>
<rmds.1.ads.3.itemThread: Notice: Tue Jul 19 15:12:59 2022> Resumed requests that were suspended due to transport congestion.<END>
<rmds.1.ads.3.itemThread.3.rrcpTransport.: Warning: Tue Jul 19 15:12:59 2022> RRCP STATUS MSG: RRCP_CONG_END: congestion ends<END>
<rmds.1.ads: Notice: Tue Jul 19 15:12:59 2022> Resumed requests that were suspended due to transport congestion.<END>
<rmds.1.ads.rrcpTransport.1.sinkSide.rrcp.transmissionBus: Warning: Tue Jul 19 15:12:59 2022> RRCP STATUS MSG: RRCP_CONG_END: congestion ends<END>

Client logs included:

2022-07-19 15:14:25.564 [INFO ] [ElektronDispatcher-Consumer] c.m.t.c.m.p.PubSubMarketDataProvider - StatusEvent{subject=BPOD.PG.US, state=UNKNOWN, statusMsg=A24: Moving item to new server which has lower server Id.}

rrcp.sink.log

Tue Jul 19 15:12:57 2022
RRCP SINK: DEBUG: [/local/jenkins/workspace/RTDS35XCore_Release/OS/OL7-64/ssl/RrcpDaemon/Engine/rrcpE_Node.c,Node_reassemblyQdequeue(),557]
Node-@0x9801b0d0, 33702 (122.130.192.10): BC-stream msg gap, expecting 0x3bcb, got 0x3bd1:
DATA(BC) Msg-@0xac742bd8, 0x9fae: with 2 of 3 Pkts
Tue Jul 19 15:12:57 2022
RRCP SINK: DEBUG: [/local/jenkins/workspace/RTDS35XCore_Release/OS/OL7-64/ssl/RrcpDaemon/Engine/rrcpE_Node.c,Node_discardIncompleteMsgs(),642]
discarding incomplete msg on reassemblyQ-@0x9801b198:
DATA(BC) Msg-@0xac742bd8, 0x9fae: with 2 of 3 Pkts


We are running EMA 3.6 with ADS 3.5.

Best Answer

  • zoya faberov
    Answer ✓

    Hello @duncan_kerr ,

    I would first check whether there was a wider network issue (congestion, a bottleneck, or a similar interruption of connectivity) at the time the congestion appears in the ADS logs, because such an issue would affect ADS heavily. Your organization's network administrators may be a good contact to ascertain whether this was a wider issue.

    A wider network issue seems the more likely explanation in this case because, from what I see, both ADSs experienced congestion at the same time.
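
    To check the "same time on both ADSs" observation after the event, the two logs can be compared programmatically. The following is only a minimal sketch, not a Refinitiv tool: it assumes the `<component: Severity: timestamp> message<END>` line layout seen in the excerpts above and an English locale for the day and month names, and the function names `congestion_window` and `windows_overlap` are hypothetical helpers introduced for illustration.

```python
import re
from datetime import datetime

# Assumed line layout, taken from the log excerpts in this thread:
# <component: Severity: Tue Jul 19 15:12:57 2022> message<END>
LINE_RE = re.compile(
    r"<(?P<component>[^:]+): (?P<severity>\w+): (?P<ts>[^>]+)> (?P<msg>.*)<END>"
)

def congestion_window(log_text):
    """Return (first RRCP_CONG_BEGIN, last RRCP_CONG_END) or None if incomplete."""
    begins, ends = [], []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip blank lines and anything in another format
        # Collapse repeated spaces (e.g. "Jul 19  15:12:57") before parsing.
        ts = datetime.strptime(" ".join(m["ts"].split()), "%a %b %d %H:%M:%S %Y")
        if "RRCP_CONG_BEGIN" in m["msg"]:
            begins.append(ts)
        elif "RRCP_CONG_END" in m["msg"]:
            ends.append(ts)
    if not begins or not ends:
        return None
    return min(begins), max(ends)

def windows_overlap(a, b):
    """True if two (start, end) windows share at least one instant."""
    return a[0] <= b[1] and b[0] <= a[1]
```

    Running `congestion_window` on the text of each ADS log and passing the two results to `windows_overlap` shows whether the congestion periods coincide. Overlapping windows on independent hosts point toward a shared cause such as the network, while non-overlapping windows would suggest looking at each host individually.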