College dorm Wi-Fi issues

Recently I was asked to look at a local college that is experiencing issues in the evening with wireless users. In this example the complaints from students who live in the dorm are vague and lacking in detail. They range from “Wifi goes out” to “Videos buffer” and “It’s just so slooooow”. With so little information to go on, we decided to go over and check out the spectrum to see what was going on.

First I talked with the admins to find out about the building, its construction, bandwidth/networking, and how students use the dorm wireless. From our conversation I found that the dorm has dedicated internet of around 400 Mbps, and monitoring shows it busy but not maxed out. Students are working throughout the day in their dorms, and complaints pretty much only come in at night. Construction is new, but with lots of brick, so the signal should not propagate too far (this is important for later). The building, which is 5 floors, stands by itself with few sources of wifi interference outside of its internal APs.

We then went on site and did a passive survey with spectrum analysis to check coverage, interference, and whether the spectrum was busy. Wifi coverage was great, with lots of APs to cover the needed space, maybe too many APs (we will get to that later). When conducting the spectrum analysis: WHOA! It’s not busy, but look at the signals that are seen. Below is a screenshot of the spectrum using Ekahau.

eevator 3

My first thought was that’s a lot of APs! I looked at our layout and we only had about 6 APs in this area, but I see many more than that. So the idea that the building materials (brick) would attenuate the signal enough that we would not see APs on other floors was incorrect. This gives a lot of credit to on-site surveys before installing APs: building materials might be different, and react differently, than we think. I was seeing APs 2 floors above with usable signal. Another thing to notice is that there is plenty of open spectrum in the 5 GHz range, so why are all the channels in use close to each other and at the beginning of the spectrum? This client uses Ruckus wireless, and after checking documentation and asking vendor reps, it seems like it might be a firmware thing. Also notice that we have a lot of APs on the same channel that can hear each other, which causes contention on the channel.

This shows the importance of always doing a site survey and checking spectrum to see what AP coverage is.

To help resolve the issues we created a channel plan to statically set channels and spread them throughout the spectrum, and also lowered the channel width to 20 MHz since we have so much AP coverage between floors. By doing these two things we were able to decrease channel contention and give our users a better wireless experience.
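
As a rough illustration of spreading static channels across the band, here is a minimal Python sketch. It is not the plan we actually deployed: the AP names are made up, the channel list is a US-style set of 20 MHz channels (what is allowed depends on regulatory domain and DFS policy), and a real plan also has to account for which APs can hear each other between floors.

```python
from itertools import cycle

# 20 MHz channel numbers in the 5 GHz band (assumed US-style list;
# the usable set depends on regulatory domain and DFS policy).
CHANNELS_5GHZ_20MHZ = [
    36, 40, 44, 48, 52, 56, 60, 64,
    100, 104, 108, 112, 116, 120, 124, 128, 132, 136, 140,
    149, 153, 157, 161, 165,
]


def build_channel_plan(ap_names, channels=CHANNELS_5GHZ_20MHZ):
    """Assign channels round-robin so reuse only starts once every channel is taken."""
    return dict(zip(ap_names, cycle(channels)))


# Hypothetical AP names: 5 floors with 6 APs each.
aps = [f"floor{floor}-ap{n}" for floor in range(1, 6) for n in range(1, 7)]
plan = build_channel_plan(aps)

for ap in aps[:3]:
    print(ap, plan[ap])
```

With 24 channels and 30 APs, only the last six APs reuse a channel, and those are the ones to place as far as possible from the APs they share a channel with.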

One additional setting we modified was to block broadcast/multicast from clients. When doing a packet capture we also noticed that around 30 percent of the traffic was just mDNS, so we created ACLs to block this traffic.

 

 

Cisco – Event Management to enable backup interface

I work with a lot of sites that are multi-homed or have a backup connection that, even though it’s a backup, still needs to be used at the same time for load balancing. Recently I worked on a site where a very slow MPLS link was being moved to backup only, and a new SD-WAN link had been installed as primary. The MPLS link will go away after we’re sure the SD-WAN option works well.

The only time we want this link to become active is if SD-WAN fails. My first thought was no problem, just use a routing protocol and a static route: the routing protocol will be our preferred method, and we can raise the admin distance of the static default route to make it a backup. There is one problem in this scenario. The MPLS uses dynamic routing (EIGRP), and many other locations still use the MPLS for primary or secondary connections, so we cannot even have this link up (admin state) or the MPLS will advertise the network.

So, to recap: what I need is a preferred route to our SD-WAN, and if SD-WAN fails, bring up the backup connection’s admin state and move traffic to it. Then, of course, automatically fix everything if SD-WAN comes back online. No problem! Cisco’s Event Manager to the rescue. Cisco’s Embedded Event Manager (EEM), to quote Cisco, is a distributed and customized approach to event detection and recovery offered directly in a Cisco IOS device. EEM offers the ability to monitor events and take informational, corrective, or any desired EEM action when the monitored events occur or when a threshold is reached. An EEM policy is an entity that defines an event and the actions to be taken when that event occurs. When creating EEM scripts you have two options, Tcl or CLI; in this case I am using just CLI.

Using EEM on the 3850 core I was able to detect if the default route learned via OSPF was removed from the routing table. If that happened, EEM would run “no shut” on my MPLS uplink and EIGRP would take over. If the default route was learned again through OSPF once things were corrected, EEM would run “shut” on my MPLS uplink. This worked extremely well in combination with link detection on the Fortigate and OSPF default route redistribution.
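
The decision logic is small enough to write out. The Python below is only a sketch of that logic for illustration (the function name and arguments are made up, and nothing here talks to a switch); the real work is done by the EEM applets.

```python
from typing import Optional

DEFAULT_ROUTE = "0.0.0.0/0"


def mpls_uplink_command(network: str, event_type: str, protocol: str) -> Optional[str]:
    """Return the command the MPLS uplink should receive, or None to do nothing."""
    if network != DEFAULT_ROUTE or protocol.upper() != "OSPF":
        return None  # not the route we are tracking
    if event_type == "remove":
        return "no shutdown"  # SD-WAN default route lost: bring MPLS up
    if event_type == "add":
        return "shutdown"  # SD-WAN default route is back: park MPLS again
    return None


for event in ("remove", "add"):
    print(event, "->", mpls_uplink_command(DEFAULT_ROUTE, event, "ospf"))
```

Each branch corresponds to one applet: a route removal event brings the uplink up, and a route add event shuts it again.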

The Fortigate is my SD-WAN device and the default gateway for the network. I am using link detection to test HTTP access to Google. If my WAN interface cannot get a response from google.com in my set time (5 attempts, with 5 seconds between each attempt) then the Fortigate removes the default route from its routing table. When this happens the route is no longer redistributed into OSPF, so it is removed from the Cisco core as well. EEM sees this event and runs my list of commands. Below shows the layout, and the code to get this going. Interface Gig 1/0/24 is my MPLS uplink.

Layout2

First I made sure that the MPLS interface was shut down, OSPF was up, and I was receiving the redistributed default route from the Fortigate.

config t

event manager applet MPLS-UP
event routing network 0.0.0.0/0 type remove protocol OSPF
action 1 cli command "enable"
action 2 cli command "config t"
action 3 cli command "int gig 1/0/24"
action 4 cli command "no shut"
action 5 cli command "exit"

event manager applet MPLS-DOWN
event routing network 0.0.0.0/0 type add protocol OSPF
action 1 cli command "enable"
action 2 cli command "config t"
action 3 cli command "int gig 1/0/24"
action 4 cli command "shut"
action 5 cli command "exit"

You can check status and history of events by using the show event manager commands.

show

During a failure of my ISP, everything worked great. The default route was removed from OSPF, which caused an event that EEM matched. It then enabled the MPLS interface, and all routes, including the default, were learned via EIGRP over the MPLS. When internet was restored, the MPLS interface was shut down and all traffic started flowing over SD-WAN again.

 

Cisco – Flex links

Cisco Flex Links give the ability to have a layer 2 redundant connection, or a pair of connections configured as an EtherChannel, as the backup for a primary link. This is an active/passive setup: if the primary connection’s link status goes down, the Flex Link becomes active, and when the primary comes back it goes into standby mode and does not take back primary functionality unless told to do so with preemption commands. STP is disabled automatically on Flex Links, so there is no need to bother with Portfast.
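
To make the active/standby behaviour concrete, here is a toy Python model of a Flex Link pair. It is a simplified sketch of the behaviour described above, not Cisco’s implementation; the class and method names are invented for the example.

```python
class FlexLinkPair:
    """Toy model of an active/standby Flex Link pair."""

    def __init__(self, primary, backup, preempt=False):
        self.primary = primary
        self.backup = backup
        self.preempt = preempt
        self.link_up = {primary: True, backup: True}
        self.active = primary  # the port currently forwarding traffic

    def set_link(self, port, up):
        """Record a link state change and work out which port forwards."""
        self.link_up[port] = up
        other = self.backup if port == self.primary else self.primary
        if not up:
            if port == self.active and self.link_up[other]:
                self.active = other  # fail over to the standby port
        elif not self.link_up[self.active]:
            self.active = port  # nothing was forwarding, so use this port
        elif self.preempt and port == self.primary:
            self.active = self.primary  # only happens with preemption


pair = FlexLinkPair("Fa0/1", "Fa0/12")
pair.set_link("Fa0/1", False)
print("after primary fails:", pair.active)
pair.set_link("Fa0/1", True)
print("after primary returns:", pair.active)
```

Without preemption the backup port keeps forwarding after the primary returns, which is why the restored primary sits in standby.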

Flex Links have been in the code for a long while, and I’m not sure how I missed them. I have needed this functionality before when connecting to a backup firewall, or some device that Spanning Tree would have issues with. This option gives a great way to have a backup link configured if you just need it to become active when something happens to the primary link. In the scenario below I have a Fortigate firewall and a Cisco 3560 switch. Port Fa0/1 is my primary and goes to Port 1 of the FGT, and Port Fa0/12 is the backup port for this link.

Layout

image

Switch config:

interface FastEthernet0/1
description “Connected to Fortigate”
switchport trunk encapsulation dot1q
switchport mode trunk
switchport backup interface Fa0/12
spanning-tree portfast

interface FastEthernet0/12
description “Connected to Fortigate – Backup”
switchport trunk encapsulation dot1q
switchport mode trunk

Below shows the status of the Flex Link.

status-primary

Notice that the primary state is active and up. Now, I will cause a physical port state change by unplugging the interface and see how many pings/time it takes to failover.

After unplugging the primary connection, the link light of port 12 instantly came on and went green, and I didn’t even lose a ping to my switch. All MAC addresses/uplinks moved over to the backup port with no loss. Below shows the status afterwards. Notice that the backup state is up, not the primary. After plugging the primary port back in, its light is amber like a blocked port, and both interfaces show a port status of up, even though the primary port at this point is not forwarding traffic.

status

 

 

 

Fortinac PXE DHCP boot options

FortiNAC is built on top of CentOS and is a great product. Recently I needed the default or isolation vlan to support PXE booting as well as isolation. This way, if a computer is being imaged, we don’t have to worry about hard coding ports with vlans, etc. This is important because the NAC cannot look at the client prior to the OS install.

These settings were added to dhcpd.conf, and they would work for any implementation running dhcpd, not just FortiNAC.

Below is the conf that works.

# Sample /etc/dhcpd.conf

authoritative;
log-facility local6;
ddns-update-style none;
allow bootp;
allow booting;
class "authenticated_clients"
{
match pick-first-value (option dhcp-client-identifier, hardware);
}
# Isolation Scope ISOL_Isolation_blackhole
subnet 172.16.172.0 netmask 255.255.252.0 {
range 172.16.172.10 172.16.175.200;
default-lease-time 28800;
max-lease-time 86400;
option domain-name "blackhole.local";
option domain-name-servers 172.16.172.254;
option broadcast-address 172.16.175.255;
option routers 172.16.172.1;
# Replace PXE-SERVER-IP with the address of your PXE server.
next-server PXE-SERVER-IP;
# NOTICE the doubled backslashes in the file path. I was missing this, and of course it could not find the file.
filename "SMSBoot\\x64\\file1.efi";
}

The next-server and filename options were the ones needed to push the PXE settings over.
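
Since the backslash escaping is what tripped me up, here is a small Python helper that builds the filename line from a normal Windows-style path. This is just an illustrative sketch; the function name is made up and it only handles backslashes.

```python
def dhcpd_filename_statement(boot_path: str) -> str:
    """Build a dhcpd.conf filename line, doubling each backslash in the path."""
    escaped = boot_path.replace("\\", "\\\\")
    return f'filename "{escaped}";'


# A raw string keeps the single backslashes of the real path.
print(dhcpd_filename_statement(r"SMSBoot\x64\file1.efi"))
```

The printed line has every backslash doubled, matching the working filename statement in the config.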

Restart the service after saving the configuration to dhcpd.conf

802.11 Finding the MCS rate in Wireshark

One great way to troubleshoot wireless issues that clients might be having is to perform a packet capture or use some kind of wireless analysis tool. In this case I have an Ekahau Sidekick with Capture. Before this tool I used Ubuntu, Wireshark, and a compatible wireless adapter to capture packets. The MCS, or Modulation and Coding Scheme, value indicates a lot about the client. The MCS index is really a reference number given to a combination of the number of spatial streams, the modulation type and the coding rate. Analyzing the MCS helps when we need to troubleshoot the common problem of “it’s slow” while everything looks good.

I look at the MCS rate when troubleshooting client problems for a lot of reasons. Since the MCS rate is a reference number describing how much data can be sent over the wireless medium, it gives us an idea of the wireless environment. If the spectrum has lots of contention/interference, or the client has lots of retries or a low RSSI, the client will drop to a lower modulation scheme and therefore a lower MCS rate.

Let’s take a look at an 802.11ac packet in Wireshark and break down what each section means. The MCS is located in the Data frame under the 802.11 radio information section.

packet1

This packet is a Data frame. You will notice the MCS rates only show up on data packets, and there are good reasons for this: most management/control frames are sent at the lowest data rate possible so they can be understood by clients that are far away, or when the network is very congested.

In this example we are looking at an 802.11ac frame, and we see right away the MCS index is 7. Looking at the MCS chart, that means 64-QAM modulation with a coding rate of 5/6. The 5/6 is the FEC (forward error correction) coding rate, given in fraction form. 5/6 equals 0.8333..., which means that about 83% of the bits sent carry real data and the rest is error-correction overhead. 5/6 is the highest coding rate.

We can see that the host has 2 spatial streams and is using a short guard interval and a 40 MHz channel, and the data rate is 300 Mbps, which is the highest possible for MCS 7 with these settings. Notice the RSSI is -44 and we are still only using 64-QAM; you really have to be close and have a clear spectrum to achieve 256-QAM.
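
The 300 figure can be reproduced from the pieces Wireshark shows. Below is a minimal Python sketch using the standard 802.11ac (VHT) numbers for data subcarriers, bits per symbol and symbol time; the function and table names are my own. It does not check whether a given MCS/width/stream combination is actually allowed by the standard.

```python
from fractions import Fraction

# VHT (802.11ac) MCS index -> (bits per subcarrier, FEC coding rate)
VHT_MCS = {
    0: (1, Fraction(1, 2)), 1: (2, Fraction(1, 2)), 2: (2, Fraction(3, 4)),
    3: (4, Fraction(1, 2)), 4: (4, Fraction(3, 4)), 5: (6, Fraction(2, 3)),
    6: (6, Fraction(3, 4)), 7: (6, Fraction(5, 6)), 8: (8, Fraction(3, 4)),
    9: (8, Fraction(5, 6)),
}

# Data subcarriers per channel width in MHz
DATA_SUBCARRIERS = {20: 52, 40: 108, 80: 234, 160: 468}


def vht_rate_mbps(mcs, width_mhz, streams, short_gi):
    """Data rate in Mbps for a VHT MCS index, channel width and stream count."""
    bits, coding = VHT_MCS[mcs]
    # OFDM symbol time in microseconds: 3.6 with short GI, 4.0 with long GI
    symbol_us = Fraction(36, 10) if short_gi else Fraction(4)
    return float(DATA_SUBCARRIERS[width_mhz] * bits * coding * streams / symbol_us)


# The frame from the capture: MCS 7, 40 MHz, 2 spatial streams, short GI
print(vht_rate_mbps(7, 40, 2, True))
```

Swapping in MCS 9 with the same settings shows what 256-QAM would have bought this client.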

Below is the table of MCS rates for 802.11ac/n, courtesy of WLANPROS.com.

802.11n-and-802.11ac-MCS-Chart-2019-1536x1187

Ruckus Zonedirector – Request Error

Today I had a strange issue with a ZoneDirector 3k and thought I would share in case anyone else runs into it. All of a sudden the web interface started showing this error:

Request Error

Error logged at the server

Enable EjsErrors to see errors in the browser.

SSH, ping, and AP functionality all still worked great. First we tried rebooting the ZD, but this had no effect. Next we tried restoring the certificate, and this was the fix.

First SSH into the ZD with proper user/pass

Ruckus> enable
Ruckus# config
Ruckus (config)# certificate
Ruckus (config-certificate)# restore

This will then reboot the ZD, and the APs will connect back. I had a loss in AP connections (clients dropped), and then they all reconnected.

 

Dell S4810 – Getting all Vlans assigned to a port

I thought this might be helpful to share with anyone looking to quickly pull all vlans assigned to a port on a Dell S4810 switch. I think this command works in most FTOS switches.

In this example I need to get all vlans assigned to a port-channel regardless of tagged or untagged. I can always look through the config and see what ports are tagged/untagged on the vlan but this can be a real pain.

So let’s see the better option. My port channel is 128. Below shows the output of the “show interface switchport” command, run against port-channel 128.

S4810-SW1#show int switchport port 128

Codes: U – Untagged, T – Tagged
x – Dot1x untagged, X – Dot1x tagged
G – GVRP tagged, M – Trunk, H – VSN tagged
i – Internal untagged, I – Internal tagged, v – VLT untagged, V – VLT tagged

Name: Port-channel 128
Description: “Etherchannel to 6060 for internal”
802.1QTagged: Hybrid
Vlan membership:
Q Vlans
T 5,10,12,14-15,20,22,40,150,254,1337

Native VlanId: 999.

S4810-SW1#
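
If you need the tagged list in a script, for example to compare two switches, the compressed VLAN string is easy to expand. This is a small Python sketch; the function name is my own and it assumes the comma and dash format shown in the output above.

```python
from typing import List


def expand_vlan_list(text: str) -> List[int]:
    """Expand a string like '5,10,14-15' into [5, 10, 14, 15]."""
    vlans = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            vlans.extend(range(start, end + 1))
        else:
            vlans.append(int(part))
    return vlans


tagged = expand_vlan_list("5,10,12,14-15,20,22,40,150,254,1337")
print(len(tagged), tagged)
```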

Fortigate 6.0 Adding and removing IPs from Quarantine list

Starting in 5.4.1 you could “Quarantine” an IP address. This means that the quarantined host cannot communicate through the firewall.

There are many different parts of the firewall that can quarantine an IP address. For example, the AV and IPS can both automatically quarantine an IP if it meets a defined violation.

In 6.0 you can view the IPs that have been quarantined by going to Monitor > Quarantine. From here you can see what IPs are blocked, and for what reason. As you can see in the image below, 5.188.86.10 has been blocked for 26 days by an admin. If an admin blocks an IP address (as we will see) it shows up with “Administrative” as the source. The other IPs have been blocked by the IPS engine. The image below shows the monitor section.

quarentine

So, let’s say that you look into Fortiview and see that a remote IP is sending/receiving a ton of bandwidth and you want to make sure that stops. In this example let’s quarantine the IP 67.247.21.7.

Say I was looking through Fortiview and found an issue that makes me want to block the above IP. You can just right-click the IP you would like to block and select “quarantine”. When you do this, a dialog pops up and asks how long you would like to block it for.

block

The above shows that it will ban the IP from communication for the given period of time.

So, let’s say we want to remove an IP address that has been quarantined. No problem, just go to Monitor > Quarantine, click on the IP and delete that individual entry, or click to delete all entries.

delete-block

You can modify how long, and for what reason, the IPS/AV quarantines an address within the policy. The AV settings are in the CLI of the AV policy under “nac-quar”. Something to note: sources are not quarantined by default.

FGT’s entry on configuring AV settings: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-security-profiles-54/Antivirus/Quarantine%20or%20Source%20IP%20ban.htm

Ruckus ICX 7250 VRF setup/config

This entry details the config for setting up and deploying VRFs on a Ruckus ICX 7250. Recently I had a client with a new ISP that gave them a Customer WAN /30 subnet, then routed their Customer LAN subnet (public usable addresses) to the customer side of the /30. The customer did not want any extra equipment installed, like a router to handle the WAN routing, so the next best thing was to split the Ruckus 7250 switch into a WAN/LAN router. One switch to rule them all! The VRF feature is in Ruckus’s Layer 3 Premium feature set, so a license will be needed. In this scenario the 7250 is the local gateway for all Vlans, so it handles local LAN routing and is also the Internet router.

Of course there are a lot of problems with the following design, like a single point of failure, but it’s a small site with one 48-port switch, a Fortigate firewall and a cloud VoIP SD-WAN router. The purpose of this design is to allow the VoIP SD-WAN solution to sit outside the firewall, so we used the 7250 for both LAN and WAN routing, and it worked well. If the ISP had not required a customer routing device, we would have just set up an Internet Vlan, set the Fortigate/INSpeed to public IPs, and placed them in that vlan. But the ISP is requiring a routing device in this instance.

Here is the design.

vrf-icx

Config:

I think the ICX series supported VRFs when it was running Brocade firmware, but I would recommend upgrading to Ruckus’s ICX firmware, version SPR08080 or greater. Of course the device has to be running the routing firmware, not the switching code.

First let’s enable the VRF, and increase the number of routes.

system-max ip-route-default-vrf 9000
system-max ip-route-vrf 500
system-max ip6-route-vrf 500

These commands enable the VRF functionality, and the switch will need a reboot for them to take effect.

Next we can start configuring our VRF. In this case the ISP /30 is 12.5.110.0/30, so .1 will be the ISP and .2 will be us, and the Customer LAN subnet is 1.1.1.0/29. I will set up the routes for the VRF, then the Vlan interfaces, and apply the addresses. There is a keyword in the VE config to make sure it’s associated with a given VRF. Within the VRF config you need to specify the Route Distinguisher (rd), which only matters locally.

vrf INTERNET-VRF
rd 11:11
ip router-id 12.5.110.2
address-family ipv4
ip route 0.0.0.0/0 12.5.110.1
exit-address-family
exit-vrf

! My WAN Vlan for the Fortigate WAN and SD-WAN router WAN interfaces. The Customer LAN subnet goes here.
vlan 300 name INTERNET-VRF by port
untagged ethe 1/1/19 ethe 2/1/23
router-interface ve 300
spanning-tree 802-1w
spanning-tree 802-1w priority 4094
!
! /30 ISP network
vlan 400 name ISP-VRF by port
untagged ethe 1/1/24
router-interface ve 400
!

interface ve 400
! This is the command to associate the VE to the VRF
vrf forwarding INTERNET-VRF
ip address 12.5.110.2/30

interface ve 300
! This is the command to associate the VE to the VRF
vrf forwarding INTERNET-VRF
ip address 1.1.1.2/29
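
As a quick sanity check on the two prefixes used in the VE config above (12.5.110.0/30 and 1.1.1.0/29), Python’s ipaddress module can list the usable hosts in each. This is just a convenience sketch and is not part of the switch config.

```python
import ipaddress

isp_link = ipaddress.ip_network("12.5.110.0/30")
customer_lan = ipaddress.ip_network("1.1.1.0/29")

for name, net in (("ISP /30", isp_link), ("Customer LAN /29", customer_lan)):
    hosts = list(net.hosts())
    print(f"{name}: {net} usable {hosts[0]} to {hosts[-1]} ({len(hosts)} hosts)")
```

The /30 only has room for the ISP side and our side, while the /29 leaves six usable addresses for the switch VE, the Fortigate and the SD-WAN router.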

Here is a subset of my user config, Vlan 40. This is where most of the desktops go, and the gateway, in this case 10.6.40.1/22, lives on the switch in the default VRF.

vlan 40 name Computers by port
untagged ethe 1/1/1 to 1/1/18 ethe 1/1/21 ethe 2/1/1 to 2/1/18 ethe 2/1/22
router-interface ve 40
spanning-tree 802-1w
spanning-tree 802-1w priority 4094
!
!
show run int ve 40
interface ve 40
ip address 10.6.40.1 255.255.252.0
ip helper-address 1 10.6.10.10
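
The mask on ve 40 is written in dotted form, so here is a short Python check of the prefix length it corresponds to. Again, this is only a convenience sketch using the standard ipaddress module.

```python
import ipaddress

gateway = ipaddress.ip_interface("10.6.40.1/255.255.252.0")
print(gateway.with_prefixlen)  # gateway address with its prefix length
print(gateway.network)         # the connected network this VE creates
```

The network it prints should match the connected route for ve 40 in the routing table.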

That’s it! A show ip route of the default VRF (switching VRF) shows:

#show ip route
Total number of IP routes: 9
Type Codes - B:BGP D:Connected O:OSPF R:RIP S:Static; Cost - Dist/Metric
BGP Codes - i:iBGP e:eBGP
OSPF Codes - i:Inter Area 1:External Type 1 2:External Type 2
    Destination        Gateway      Port     Cost  Type  Uptime
1   0.0.0.0/0          10.6.254.2   ve 254   1/1   S     1d17h   (this is the Fortigate)
2   10.6.0.0/22        DIRECT       ve 1     0/0   D     1d17h
3   10.6.10.0/24       DIRECT       ve 10    0/0   D     21h4m
4   10.6.40.0/22       DIRECT       ve 40    0/0   D     1d17h
5   10.6.100.0/24      DIRECT       ve 100   0/0   D     1d18h
6   10.6.254.0/24      DIRECT       ve 254   0/0   D     1d17h
7   172.16.6.0/29      DIRECT       ve 650   0/0   D     1m5s
8   192.168.6.0/24     DIRECT       ve 1     0/0   D     1d17h
9   192.168.100.0/24   172.16.6.1   ve 650   1/1   S     1m4s

But if we specifically show the INTERNET-VRF routes:

#show ip route vrf INTERNET-VRF
Total number of IP routes: 3
Type Codes - B:BGP D:Connected O:OSPF R:RIP S:Static; Cost - Dist/Metric
BGP Codes - i:iBGP e:eBGP
OSPF Codes - i:Inter Area 1:External Type 1 2:External Type 2
    Destination      Gateway        Port     Cost  Type  Uptime
1   0.0.0.0/0        12.116.193.1   ve 400   1/1   S     21h4m
2   12.5.110.0/30    DIRECT         ve 300   0/0   D     20h57m
3   1.1.1.0/29       DIRECT         ve 400   0/0   D     21h5m

And there we have it, the device is now in two VRFs, the default and INTERNET-VRF, with specific interfaces assigned to the latter. If you want to test pinging from that VRF specifically you can use the following command:

#ping vrf INTERNET-VRF 8.8.8.8
Sending 1, 16-byte ICMP Echo to 8.8.8.8, timeout 5000 msec, TTL 64
Type Control-c to abort
Reply from 8.8.8.8 : bytes=16 time=1ms TTL=122
Success rate is 100 percent (1/1), round-trip min/avg/max=1/1/1 ms.