(ASCEND) SAP storms by Max4000

To: ascend-users@bungi.com
Subject: (ASCEND) SAP storms by Max4000
From: "Drew Payn" <drewpayn@unn.unisys.com>
Date: Wed, 19 Aug 1998 18:11:41 +1000
Sender: owner-ascend-users@max.bungi.com


We have had one Max4000 running quite smoothly for a few months. However, we
started to experience Ethernet network degradation at different times of the
day. The network utilisation would jump from around 8% to 28%, stay there
for about 10-15 minutes and then drop back to 8% (for about 30 mins).  We are
using a switch as the backbone with a few 10M hubs cascaded off it.  Our
Max4000 is connected to one of these 10M hubs.  We have 3 routers (2 Cisco
and 1 Timeplex) that are directly connected to the switch.

     Looking at the switch's statistics, I noticed that one port was
showing 92% brdcst/unicst for a 5 min period; all other ports were around 0
to 6%.  This port goes to a 10M hub which has the Max connected. I managed to
connect an analyser to this 10M hub whilst one of these storms was
occurring. I had to configure the analyser with the 10BaseT cable
disconnected, as otherwise the analyser would take too long to configure: its
Ethernet port was in promiscuous mode and the traffic was taking up its
processing time.

     I had the 10BaseT cable connected to the analyser for a very short
time (~5 secs) and captured around 6000 packets.  Looking at the capture,
every packet had the Max4000 as a source or destination address.  All
packets were IPX SAP. It seems that the Max was going through a cycle of
sending out SAP update requests, after which the network was inundated with
responses, mainly from a Cisco and a Timeplex router, with the Max as the
destination address in these packets.
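
     As a rough sanity check on those capture figures, the sketch below
turns a packet count and capture time into a packet rate and an approximate
share of a 10M segment. The frame size is an assumption (the capture's
actual frame sizes are not given above), and the function and parameter
names are only illustrative.

```python
def wire_utilisation(packets, seconds, frame_bytes, link_bps=10_000_000):
    """Return (packets per second, fraction of the link used).

    frame_bytes is an assumed average Ethernet frame size; it is not a
    figure taken from the capture.
    """
    # Each Ethernet frame also occupies 8 bytes of preamble and a
    # 12-byte inter-frame gap on the wire.
    overhead_bytes = 20
    pps = packets / seconds
    bits_per_second = pps * (frame_bytes + overhead_bytes) * 8
    return pps, bits_per_second / link_bps


# ~6000 packets in ~5 seconds, assuming minimum-size (64-byte) frames.
pps, util = wire_utilisation(packets=6000, seconds=5, frame_bytes=64)
print(f"{pps:.0f} packets/s, {util:.1%} of a 10M segment")
```

     Larger assumed frame sizes give a proportionally higher figure, so
this is only a lower bound on what the storm adds to the segment.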

     This was the status of the Max at the time:

sh uptime
system uptime: up 97 days, 1 hours, 21 minutes, 7 seconds

sh rev
SYDAM system revision:  febk.m40 6.0.2


     We pinned the timing of these storms down to a WAN link
going up and down on another router two hops away.  We upgraded the Max to
ver 6.0.8 (this had a fix in it for the E1 PRI we are using) to no avail.
The problem seemed to disappear when the WAN link was fixed, i.e. once the
network was stable.
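
     A minimal sketch of how this kind of timing match could be checked
from two lists of timestamps: storm start times and link up/down events.
The timestamps below are hypothetical, made up for illustration only, and
the two-minute window is an arbitrary assumption.

```python
from datetime import datetime, timedelta


def storms_explained_by_flaps(storm_starts, flap_times,
                              window=timedelta(minutes=2)):
    """Split storm start times into those that begin within `window`
    after some link flap, and those that do not."""
    matched, unmatched = [], []
    for storm in storm_starts:
        near = any(timedelta(0) <= storm - flap <= window
                   for flap in flap_times)
        (matched if near else unmatched).append(storm)
    return matched, unmatched


# Hypothetical timestamps, not taken from any real log.
flaps = [datetime(1998, 8, 19, 9, 0, 0),
         datetime(1998, 8, 19, 11, 30, 0)]
storms = [datetime(1998, 8, 19, 9, 0, 40),
          datetime(1998, 8, 19, 11, 31, 10),
          datetime(1998, 8, 19, 14, 5, 0)]

matched, unmatched = storms_explained_by_flaps(storms, flaps)
print(len(matched), "storms follow a flap;", len(unmatched), "do not")
```

     Any storm left in the unmatched list would suggest a trigger other
than the WAN link.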

     We have since installed another Max4000 in another location, this time
in a stub network (1 Timeplex router only).  This Max4000 is also on
v6.0.8, and it now seems to be causing SAP storms on that network.

     We raised a ticket for this with Pacific Asia Ascend (the Maxes
and I are in Australia) in early July 98, but they have not come back with
any solution.

     Users on this new Max are now complaining that their (TCP/IP)
sessions hang every now and then, but jump back into action after a
minute or two.  Looking at the utilisation on the LAN, we get peaks of 20%
LAN util occurring erratically; however, these do seem to coincide with the
hangs. LAN util is 1% between these peaks.  My guess is that the session
hangs are caused by the processor spending time trying to process the SAP
storms.  We have a Telebit NetBlazer NB40i on the same LAN, and those users
also say they experience these hangs.

     Can anyone shed any light on this, or suggest where to start looking?

     Drew


