Logs and Monitoring – CompTIA Network+ N10-009 – 3.2

The network never sleeps. In this video, you’ll learn how to monitor the network using flow data, protocol analysis, syslog, SIEMs, and more.


On your network, you’re probably using routers, switches, firewalls, and many other network infrastructure devices. And all of these devices are running 24 hours a day, seven days a week, 365 days a year. We’ll need to monitor these devices to make sure that they remain up and running. And we may want to query those devices to determine what type of load they’re seeing, perhaps log file information, and other details from each individual device. This allows us to see if people are able to authenticate properly to the device, if there are any security concerns, or how much CPU or network utilization is occurring.

Many times, the network administration team will have a central console, and that console will roll up all of this different information from these many different sources and put everything on one single screen for analysis. One of these data points gathered by this network management station could come from a service known as NetFlow. NetFlow is a summary of statistics based on the flows of traffic that are traversing the network.

NetFlow has a standard way to collect this information from these devices, and there are many different NetFlow products available from many different manufacturers. NetFlow usually starts with a probe that you would put onto your network. This probe may be sitting in the middle of a connection using a physical tap, or may be using a switched port analyzer interface on a switch to be able to gather that information. Raw packets are being sent to this NetFlow probe, and that NetFlow probe is compiling statistics based on all of the packets going across your network.

The summary of all of that information is then sent down to a single NetFlow collector. This means you don’t have to go to separate NetFlow probes to be able to gather all of this detail. You go to the single collector and you can create all of your reports from that one central place. The collector is not only keeping a database of all of this information over time, but often the collector will also include a report generator. This might allow real-time reports and summary views on your console, or you may be able to configure long-term reports to be generated every day.
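To make the probe-to-collector flow more concrete, here’s a minimal sketch of what a collector does when a NetFlow export packet arrives. This assumes NetFlow version 5, one common fixed-format version; the 24-byte header layout below follows the published v5 format, and the sample packet is hand-built for illustration.

```python
import struct

def parse_netflow_v5_header(data: bytes) -> dict:
    """Parse the 24-byte NetFlow v5 export packet header.

    A collector receives these packets over UDP (commonly port 2055)
    and uses 'count' to know how many 48-byte flow records follow.
    """
    (version, count, sys_uptime, unix_secs, unix_nsecs,
     flow_sequence, engine_type, engine_id,
     sampling) = struct.unpack("!HHIIIIBBH", data[:24])
    return {
        "version": version,           # should be 5 for NetFlow v5
        "count": count,               # flow records in this packet (1-30)
        "sys_uptime_ms": sys_uptime,  # exporter uptime in milliseconds
        "unix_secs": unix_secs,       # export timestamp
        "flow_sequence": flow_sequence,
    }

# Build a sample header the way an exporter would, then parse it back.
sample = struct.pack("!HHIIIIBBH", 5, 2, 360000, 1700000000, 0, 42, 0, 0, 0)
hdr = parse_netflow_v5_header(sample)
```

A real collector would then loop over `hdr["count"]` flow records, each carrying source/destination addresses, ports, and byte counts, and write them into its database.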

This is a NetFlow console, and the current screen is showing a NetFlow traffic analyzer summary. You can see that it contains top 10 conversations, which are ingress and egress data for all of these devices. This is over the last hour.

You can also see a top 10 endpoints on your network. This is listed by IP address or the name of the device. Here are the sources of that NetFlow data. And you also have traffic analyzer events that are coming from those NetFlow devices.

NetFlow can also summarize information by port number so that you can see how much application traffic is traversing the network. These are the top five applications on this network. Looks like SSL is the largest amount of traffic on this network. And then there’s SQL server, other web traffic, NetFlow UDP traffic, and port zero traffic.

All of that NetFlow information is effectively metadata that was created based on the packets going across the network. But you could also view the raw packets themselves using a protocol analyzer. This allows you to see the exact bits and bytes sent across the network from one device to another.

Protocol analyzers can collect frames from a wired or wireless network and present this information in a form that is easy to read. Instead of viewing all of the hexadecimal data, you can view a summary of these packets going back and forth and then drill down into the details associated with each packet. If you’re trying to find unknown traffic or you’re working to troubleshoot a slow application, you can view all of the conversations taking place over the network using this view inside of your protocol analyzer.

And in some organizations, this data is also collected and stored over a long time frame. This usually includes a very large array of storage drives. And every packet from the network is being stored on those drives for later analysis.
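To show what a protocol analyzer is doing under the hood, here’s a small sketch that decodes the first 14 bytes of a raw Ethernet II frame into human-readable fields. The sample frame is hand-built for illustration; a real analyzer would capture frames from a live interface and decode every layer above this one as well.

```python
import struct

def parse_ethernet_header(frame: bytes) -> dict:
    """Decode the Ethernet II header: destination MAC (6 bytes),
    source MAC (6 bytes), and EtherType (2 bytes)."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])

    def fmt(mac: bytes) -> str:
        return ":".join(f"{b:02x}" for b in mac)

    return {"dst": fmt(dst), "src": fmt(src), "ethertype": hex(ethertype)}

# A hand-built frame: broadcast destination, EtherType 0x0800 (IPv4)
frame = bytes.fromhex("ffffffffffff" "aabbccddeeff" "0800") + b"payload"
info = parse_ethernet_header(frame)
```

This is the same transformation the analyzer’s summary view performs: raw bits and bytes in, readable protocol fields out.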

Once we begin gathering all of this data from these many different data sources, we could start building a picture of what a normal day on the network might look like. We refer to this as a network performance baseline. And this can be a very useful set of statistics, especially when we begin troubleshooting the network. If you’re trying to determine if a large amount of utilization on the network is normal, you simply look at your baseline to know what the normal amount of utilization would be during a standard workday. You can then compare that to the value you’re seeing now and then determine if that’s better or worse than what would normally occur on the network.

We can also drill down into these baselines to see what normal performance might be for an individual device, for example. So you’re able to get a very detailed view across many different devices on your network by simply compiling this data into a central baseline. If your organization is using a management console or a SIEM– this would be a Security Information and Event Manager– then you may already be collecting log files and other useful statistics that you can use to build out a performance baseline. This management console, or SIEM console, may be able to look through the data and identify and alert when certain anomalies occur. Or it may be able to present normal baselines for an extended period of time so that you can see when things happen to change over a normal workday.

That management console, or SIEM console, is collecting log files from many different devices. There’s log files inside of your switches, your routers, your firewalls, your servers, and practically any other device you would connect to the network. There is a standard way to transfer log files from all of these devices, even though each one of those devices may be manufactured by a different company. This standard for transferring log files is known as syslog.

You’ll usually see the syslog configuration inside of your switch or router. And you’ll tell the switch or router where to send all of its log files using that standard syslog transfer. This is commonly sent to a SIEM, which is a Security Information and Event Manager. And usually we are storing that on a large drive array inside of that SIEM.
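On a Cisco IOS device, for example, that configuration might look like the lines below. This is a hedged illustration: the IP address is a placeholder, and the exact commands vary by vendor and OS version.

```
! Cisco IOS-style example; exact syntax varies by vendor and version
logging host 10.1.1.50          ! IP address of the SIEM / syslog server
logging trap warnings           ! send severity 4 (warning) and above
logging source-interface Vlan1  ! source IP used for the syslog packets
```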

That SIEM is identifying what device sent that log file. It’ll identify what’s known as a facility code, which is the program that originally created that log file. And it will also include a severity level for that line within the log file. So you should look at the configurations in your enterprise switches, routers, and other devices to see what options you might have for sending that syslog information to your SIEM.
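The facility code and severity level travel together in a single number at the front of each syslog message. Per the syslog standard (RFC 5424), PRI = facility × 8 + severity, so the SIEM splits them apart like this; the sample message below is illustrative.

```python
def parse_syslog_pri(message: str) -> dict:
    """Split the <PRI> value at the start of a syslog message into
    its facility code and severity level (PRI = facility * 8 + severity)."""
    end = message.index(">")
    pri = int(message[1:end])
    severities = ["emergency", "alert", "critical", "error",
                  "warning", "notice", "informational", "debug"]
    return {
        "facility": pri // 8,      # e.g. 23 = local7, common for network gear
        "severity": pri % 8,
        "severity_name": severities[pri % 8],
        "text": message[end + 1:],
    }

# <189> = facility 23 (local7), severity 5 (notice)
event = parse_syslog_pri("<189>%SYS-5-CONFIG_I: Configured from console")
```

So a single `<189>` prefix tells the SIEM both which program generated the message and how urgent it is.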

If you were to log in to that security information and event manager, you would find a large amount of information available to you. It’s not just page after page of this log information. It’s also all of that information rolled up into reports that you can then use to make decisions.

For example, there might be a dashboard on the SIEM that gives you a real-time status of information about your network. You could see alerts and alarms if there happens to be a large number of failed authentications. Or if a device happens to go offline, you may see that in the dashboard of your SIEM.

Most SIEMs also have very advanced reporting capabilities, so you may be able to go back in time and start comparing and contrasting different statistics across multiple devices, all using that large database that you’ve compiled within the SIEM. So you might be able to track application usage when it starts on a server. You can track that traffic as it traverses a switch, watch that traffic as it goes through a firewall, and then analyze how much utilization that’s taking on your internet connection.

And since you’ve now stored all of this log file information over an extended period of time, you might be able to use it for forensics. This will allow you to go back in time and track when someone may have authenticated to the network, and then which services they may have used when they were logged in. This log file information can be sent from nearly any device. This could be a NetFlow collector, it might be a server that has syslog capabilities, or it might be a switch, a router, or a firewall that’s currently running in your infrastructure.

One of the things you may find when you first turn on a SIEM and start sending it syslog data is that it will be receiving a large amount of information. Some of this information might be useful for that individual device, but it may not be necessary to store that for a long period of time within your SIEM. So you may want to change some sensitivity settings on the SIEM to be able to perhaps only capture warning or urgent information, and perhaps not store informational details within the SIEM database.
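That filtering decision hinges on one detail that trips people up: syslog severities run from 0 (emergency) down to 7 (debug), so a lower number means a more urgent message. A sketch of the filter, with illustrative events:

```python
def keep_event(severity: int, threshold: int = 4) -> bool:
    """Syslog severities run 0 (emergency) through 7 (debug), so a LOWER
    number is MORE urgent. Keep anything at warning (4) or more severe."""
    return severity <= threshold

events = [
    {"severity": 6, "msg": "interface counters cleared"},   # informational
    {"severity": 4, "msg": "duplex mismatch detected"},     # warning
    {"severity": 2, "msg": "power supply failure"},         # critical
]
stored = [e for e in events if keep_event(e["severity"])]
```

With a threshold of warning and above, the informational event is dropped and only the warning and critical events are written to the SIEM database.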

Here’s an example of an ad hoc data query made on a SIEM. You can see that we’re doing a query here for anything that starts with “fail”– so this might be “failure,” or “failed,” or the word “fail” itself– and any records that include the word “password.” So we’re looking for situations where password authentication has failed or has not completed successfully.

And you can see the series of statistics that are gathered. This creates a single chart across the top. And then we can view individual event records.
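The same query logic can be expressed as a pair of regular expressions: one matching any word starting with “fail,” the other matching “password.” The log records below are illustrative stand-ins for what a SIEM would search.

```python
import re

# Mirror the wildcard query: fail* AND password
pattern_fail = re.compile(r"\bfail\w*", re.IGNORECASE)
pattern_pw = re.compile(r"\bpassword\b", re.IGNORECASE)

records = [
    "An account failed to log on: unknown user name or bad password",
    "Authentication failure for user hax0r: bad password",
    "Password changed successfully for user alice",
    "Interface Gi0/1 changed state to up",
]
matches = [r for r in records if pattern_fail.search(r) and pattern_pw.search(r)]
```

Only the first two records satisfy both conditions; a successful password change or an interface event never makes it into the result set.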

For example, these appear to be event records from Windows Server where there is a login failure. It says, “unknown user name or bad password.” The username used for that authentication was Hax0r. And you can see it’s in the ACMETECH domain, and it has a login type and other lines associated with that record.

We can now track this over a long period of time to see, is this normal, or have there been instances during the day when there’s been a large number of failed authentications? This might be able to identify times during the day when there would be a brute force attack or a password spraying attack. And then we can use additional security tools to track that closer or to prevent that event from occurring again.
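Spotting a brute force or password spraying attempt usually comes down to bucketing those failures by time and looking for a spike. A minimal sketch with illustrative timestamps– a quiet day with a burst of failures at 14:00:

```python
from collections import Counter
from datetime import datetime

def failures_per_hour(timestamps: list) -> Counter:
    """Bucket failed-authentication timestamps by hour so that a spike
    (a possible brute force or password spraying attempt) stands out."""
    buckets = Counter()
    for ts in timestamps:
        hour = datetime.fromisoformat(ts).strftime("%Y-%m-%d %H:00")
        buckets[hour] += 1
    return buckets

# Two scattered failures, then 15 failures packed into the 14:00 hour
fails = (["2024-05-01T09:12:00", "2024-05-01T11:40:00"]
         + [f"2024-05-01T14:{m:02d}:00" for m in range(0, 30, 2)])
counts = failures_per_hour(fails)
spikes = {h: n for h, n in counts.items() if n >= 10}
```

One or two failures an hour is a normal workday; fifteen in a single hour is the kind of anomaly worth investigating with additional security tools.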

If you’ve ever had to administer a switch, a router, a firewall, or any of these other infrastructure devices, you know that you would normally SSH into the device, perform a series of commands at the command prompt, and then disconnect. And if you need to make that change again on a different switch or router, you would need to SSH to that additional device and make those changes again. This is something that can be relatively tedious, and almost impossible to manage in large environments where you may have hundreds or even thousands of these devices.

For that reason, we need some way to automate this process, and we can do that by using API integration. This allows us to have a central management station communicate directly to a switch, a router, a firewall, or some other device using an Application Programming Interface, or API. This allows us to create applications, scripts, and other programs that can communicate directly to that device without using the SSH console or communicating to that device over a web console.

This means that we can effectively communicate with the native language of that device. So we can send commands, make configuration changes, or query information on that device without having to interactively log in or sit at an SSH prompt and look for that information. And if you’re managing those hundreds or thousands of devices, you can make those changes automatically with one click of the button using this application programming interface.
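Here’s a sketch of what that API-driven change might look like. The endpoint path, JSON body, and token are all hypothetical– real devices expose vendor-specific APIs such as RESTCONF or Arista eAPI– but the pattern of building one authenticated request per device and looping is the same.

```python
import json
import urllib.request

def build_config_request(device_ip: str, token: str, hostname: str):
    """Build a REST API call that would set a device's hostname.
    The URL path and body here are hypothetical examples; consult
    your device's API documentation for the real endpoints."""
    url = f"https://{device_ip}/api/v1/config/hostname"   # hypothetical endpoint
    body = json.dumps({"hostname": hostname}).encode()
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",   # token auth instead of an SSH login
        },
    )

# One loop can now stage the same change for every device in the list
devices = ["10.1.1.1", "10.1.1.2", "10.1.1.3"]
staged = [build_config_request(ip, "example-token", f"sw-{i}")
          for i, ip in enumerate(devices, 1)]
# each request would then be sent with urllib.request.urlopen(...)
```

Scaling from three devices to three thousand is just a longer list, which is exactly why API integration beats interactive SSH sessions in large environments.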

A lot of the monitoring features we discussed in this video needed the raw packets from the network to be able to gather those statistics. One of the ways to gather those packets is using port mirroring. This allows you to use an existing switch and redirect traffic from one port on the switch to a packet gathering device that you have connected on a different port on that switch.

We also use port mirroring for security features such as an IDS or IPS. And we may also use this for performance monitoring. For example, we might connect a NetFlow collector to one of these port mirrors. Some switches even support the ability to mirror traffic from one switch to another. So we could have a switch in the data center with our monitoring systems, and be able to gather packets from a network switch that may be in another building or another location.

The port mirror may also be referred to as a switched port analyzer or SPAN. Or we may have a physical network tap that’s connected somewhere between two devices. If we are using a switch, we’re simply taking a copy of this data and sending that copy to a third-party device, such as an IPS or NetFlow collector.
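On a Cisco switch, a basic SPAN session might be configured with the lines below. This is a hedged, Cisco IOS-style illustration with placeholder interface names; the switch-to-switch mirroring described above is what Cisco calls RSPAN, which carries the mirrored traffic across a dedicated VLAN.

```
! Cisco IOS-style SPAN example; syntax varies by vendor and platform
monitor session 1 source interface GigabitEthernet0/1 both
monitor session 1 destination interface GigabitEthernet0/24
```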

So when a user is communicating to a server, they first are communicating through a switch. So a copy of that data is sent to our IPS, and the original copy of that data is sent on to the server. We can do this for single interfaces or multiple interfaces or VLANs on the same switch, so that we’re able to see all of the traffic that is available on that switch.