11-05-2018 07:02 AM
Occasionally our monitoring system receives SNMP error traps reporting that the NXDOMAIN response rate is >80% for a specific DNS member. I am trying to find the source of the issue, as it keeps recurring.
The first problem is that when the issue occurs, it is usually very short (just a few minutes).
I was expecting to see spikes in the "DNS Replies Trend" for nxdomain replies during the time we get the SNMP error traps, but the graphs don't show >80% of nxdomain responses at any time.
So I cannot reproduce the issue with that dashboard.
Is it possible that the issue was not present long enough to be properly visible in the graph (it seems the data points are 10-minute snapshots)?
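To illustrate why a short burst could disappear from the graph, here is a minimal sketch. It assumes each data point aggregates response counts over a 10-minute interval, which is an assumption about the dashboard and not confirmed behaviour. The traffic numbers are hypothetical.

```python
def bucket_nxdomain_ratio(per_minute):
    """Return the NXDOMAIN ratio over a whole bucket.

    per_minute is a list of (total_responses, nxdomain_responses) tuples,
    one tuple per minute in the bucket.
    """
    total = sum(t for t, _ in per_minute)
    nxdomain = sum(n for _, n in per_minute)
    return nxdomain / total if total else 0.0


# Hypothetical traffic: 1000 responses per minute,
# 10% NXDOMAIN at baseline, 90% NXDOMAIN during a 3-minute burst.
baseline = (1000, 100)
burst = (1000, 900)
window = [baseline] * 7 + [burst] * 3  # one assumed 10-minute bucket

peak = max(n / t for t, n in window)       # worst single minute
averaged = bucket_nxdomain_ratio(window)   # what the bucket would show

print(f"peak per-minute ratio:  {peak:.0%}")
print(f"10-minute bucket ratio: {averaged:.0%}")
```

Under these assumptions the worst minute is well above the 80% trap threshold, while the 10-minute bucket stays far below it. That would be consistent with a trap firing without a matching spike in the trend graph.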
I then checked the "DNS Top NXDOMAIN / NOERROR (no data)" dashboard and set the date/time range to when the alarm occurred. I see some FQDNs that I would like to check out further.
How can I drill down further to see which clients were responsible for the queries that resulted in NXDOMAIN responses?
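As a sketch of the kind of drill-down meant here: if per-query records were available as a CSV export, the clients behind the suspicious FQDNs could be counted as below. The column names `client_ip`, `fqdn` and `rcode` are hypothetical and do not come from any Infoblox export format. The sample rows are made up.

```python
import csv
import io
from collections import Counter


def top_nxdomain_clients(rows, fqdns, limit=10):
    """Count NXDOMAIN responses per client for the given FQDNs.

    rows is an iterable of dicts with the hypothetical keys
    'client_ip', 'fqdn' and 'rcode'.
    """
    # Normalise names so a trailing dot or different case still matches.
    wanted = {f.lower().rstrip(".") for f in fqdns}
    counts = Counter()
    for row in rows:
        name = row["fqdn"].lower().rstrip(".")
        if row["rcode"].upper() == "NXDOMAIN" and name in wanted:
            counts[row["client_ip"]] += 1
    return counts.most_common(limit)


# Made-up sample data standing in for an exported report.
sample = io.StringIO(
    "client_ip,fqdn,rcode\n"
    "10.0.0.5,missing.example.com.,NXDOMAIN\n"
    "10.0.0.9,missing.example.com,NXDOMAIN\n"
    "10.0.0.5,MISSING.example.com,NXDOMAIN\n"
    "10.0.0.7,www.example.com,NOERROR\n"
    "10.0.0.7,other.example.com,NXDOMAIN\n"
)

result = top_nxdomain_clients(csv.DictReader(sample), ["missing.example.com"])
for client, hits in result:
    print(client, hits)
```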
11-08-2018 08:44 AM
I would suggest looking into the Data Connector VM. It's pretty useful if you have the Reporting appliance space to grab all the DNS query data in your environment. The best report it populates is 'DNS Domains Queried by Client': you can search on client IP, domain name, and, like all the other reports, a specific time range.
My recommendation if you're going to use the Data Connector VM: read the entire install/setup PDF thoroughly and browse the User Guide while you're at it. The setup isn't the most straightforward. Also keep an eye on your daily Reporting License Usage (if that matters in your licensing setup).
02-01-2019 03:19 PM
You mentioned keeping an eye on your daily Reporting License Usage. My daily reporting license is 5GB and my daily average was 0.3GB, but once I added the Data Connector, daily usage exceeded the 5GB limit. This kept repeating until reporting stopped completely after 5 occurrences. I was working with an Infoblox SE, who provided a temporary 30GB license to help resolve the issue, but time ran out before the issue could be resolved, even after turning off the reporting service on all the DNS servers listed in the Grid and leaving it on for just the reporting server, one DNS server and the Grid Master (GM). I am currently running on the 5GB daily license with a daily average of 0.3GB and the reporting service turned on for all 20+ servers. I attended a webinar this week and asked whether adding a Data Connector would affect the daily reporting license, and the person answering said that it shouldn't.
I have not enabled "forward" on the Data Connector yet because I'm currently not receiving data for 'DNS Domains Queried by Client' and have a case open with Support. I'm trying to resolve some of the same issues that devnull09 is trying to address.
2 weeks ago
Adding the data connector will absolutely increase your reporting data usage; the data connector is passing all of that data TO the reporting appliance database, hence the increase. There are multiple ways to try to tweak exactly what is passed from the data connector, but I haven't found one that succeeds in my environment (I have 40+ appliances) at keeping me under my 20GB reporting limit. Recently an SE advised that an unlimited license could be used to increase the reporting limit, but we'd have to switch to the new license agreement (I forget what it's called).
- In the Grid DNS Properties --> Logging --> Advanced tab --> Limit / Exclude following domains
- You can set filters on the data connector itself to only transfer data from specific Grid members; look up the command: data.destination.global.filters.member
From the 3.0 User Guide (pg 7):
"The Data Collector collects DNS query data from the NIOS Grid and forwards this data to the NIOS reporting server..."