12-23-2015 03:25 AM
I have just deployed DNS Traffic Control (DTC). It's working well, but there doesn't appear to be a dashboard widget. We ran some tests and made one of the DTC Servers unreachable from one of the members running DTC (we used a VLAN ACL on the switch port to drop traffic to the DTC Server). The IP was removed from the answer set and we got a warning on the traffic management page, but unless you were watching that page constantly you wouldn't know there was a problem.
Also, when there is a problem, it's difficult to tell which member is having the reachability issue. We have 3 members all doing DTC checks for the same zone. If one member suffers an issue, the only way to find out which one is to go into LBDN Visualisation and then use the "Filter Node Status" menu to check each individual member. It's not too bad with only 3 members, but in more complex environments this will be quite cumbersome.
Basically, I think there needs to be more information on the traffic management page about which member is suffering from the problem, and there needs to be a dashboard widget so people can see the status of their LBDNs at a glance.
...unless I have missed something? :-)
12-23-2015 03:31 AM
Also, it would be nice to see the HTTP return code in the healthcheck status syslog entry. At the moment I have set my healthcheck monitor to accept any response because I am not entirely sure what is coming back, but I would like to see whether I am actually getting a 200 back from the server rather than a 404.
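In the meantime, one way to see what the server returns is to request the healthcheck URL by hand from a host that sits on the same network path as the member. Below is a minimal sketch using only the Python standard library. It is not an Infoblox tool, the function names are made up for illustration, and the URL in the comment is a placeholder.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET request to `url`.

    Note: urlopen follows redirects, so a 301/302 shows up as the
    status of the final page, not the redirect itself.
    Connection failures (urllib.error.URLError) are left to propagate.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx, but the code is still the server's answer.
        return err.code


def status_accepted(code: int, accepted=range(200, 300)) -> bool:
    """True when `code` is in the accepted set (2xx by default)."""
    return code in accepted


# Example with a placeholder URL (replace with the real healthcheck target):
#   code = fetch_status("http://server.example.com/health")
#   print(code, "accepted" if status_accepted(code) else "unexpected")
```

Once you know which code comes back, the monitor can be tightened from "accept any response" to the specific code you expect.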
12-31-2015 10:23 AM
Product Management is looking for feedback on what should be included in an update to the DTC functionality and I have passed this along - thanks!
01-02-2016 05:09 PM
Thanks very much for your input. I recently joined Infoblox, and I'm the Product Manager for DTC. I think adding a widget to the Dashboard makes sense, and will open an RFE to address that.
Also, I just started working with the Reporting team to provide requirements for the DTC reports. In particular, I want to allow a hierarchical view of DTC status. For example, if an LBDN has a Server that has a problem, the LBDN will show an error. Then, you will be able to drill down and see which Pool has the problem and then again to see which Server.
Please do continue to provide input on how we can improve DTC. I'm very excited about the opportunity, and you'll see new features coming soon.
01-03-2016 07:10 AM
Thanks Alan, it's really nice to know that someone is listening! :-)
"For example, if an LBDN has a Server that has a problem, the LBDN will show an error. Then, you will be able to drill down and see which Pool has the problem and then again to see which Server."
Will this also show which grid member is experiencing the problem? If multiple members are doing healthchecks from different geographical locations, it's possible that not all of them will be exhibiting the same problem.
02-04-2016 03:55 PM
Hi Paul. Sounds like I have a similar setup. When you blocked access from one polling member, did you get a warning or did the server get removed?
We have 3 members in a group individually polling a server, but when one member's access was blocked the server was removed from the pool. There is some confusion about whether this should be the case, so it would be interesting to know what you saw.
We will have several hundred servers being polled from 3 members, and if one member got isolated on its polling interface it could wipe out all DTC records.
02-04-2016 05:07 PM
My memory is a bit rusty already, but I think it only removed the DTC server from the LBDN answers being served by the particular member that we blocked; the other two members continued to offer the DTC server in their LBDN answers. When we checked the LBDN hierarchy visualisation thing, one member was red and the other two were green.
So I "think" in your case, if one member gets isolated, it will only affect users querying that member; the other members will be unaffected if they are still able to poll successfully.
But don't quote me on that! :-)
I'm not sure why you saw the DTC server get removed from the pool; it sounds like something else is going on there.
02-04-2016 11:59 PM
Thanks for the quick reply. A ticket's been raised, but I can see reasons for and against.
Say we have two servers, X and Y, in a pool using global availability load balancing, and three members, NS1, NS2 and NS3, polling. The idea is that server X is primary and server Y is backup.
If NS3's health monitor to X fails but the NS1 and NS2 health monitors to X are fine, you could have an inconsistent state where NS1 and NS2 reply with server X's IP yet NS3 replies with server Y's. What we saw was that X was removed from the pool for all members (red error with a comment that one monitor failed).
The problem comes if, for any reason, NS3 becomes isolated on its polling interface yet still talks to the grid (via a separate management interface, for example). It loses its health monitors for both X and Y, and if that forces NS1 and NS2 to align, both servers get removed from the pool and there is no DTC answer.
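To make the two possible behaviours concrete, here is a toy model in Python. It is only a sketch of the reasoning above, not how DTC is actually implemented: the function names and the "aligned" rule (a server counts as up only if every member's monitor succeeds) are my assumptions, based on what we observed.

```python
from typing import Dict, Optional, Set

# health[member][server] is True when that member's monitor reaches the server.
Health = Dict[str, Dict[str, bool]]

# Global availability: answer with the first available server in this order.
PRIORITY = ["X", "Y"]


def first_available(up: Set[str]) -> Optional[str]:
    """Pick the highest-priority server that is up, or None if none is."""
    for server in PRIORITY:
        if server in up:
            return server
    return None


def answers_independent(health: Health) -> Dict[str, Optional[str]]:
    """Each member answers from its own monitor results only."""
    return {
        member: first_available({s for s, ok in checks.items() if ok})
        for member, checks in health.items()
    }


def answers_aligned(health: Health) -> Dict[str, Optional[str]]:
    """Assumed rule: a server is up only if ALL members can reach it."""
    up = {
        s for s in PRIORITY
        if all(checks.get(s, False) for checks in health.values())
    }
    choice = first_available(up)
    return {member: choice for member in health}


# Case 1: only NS3's monitor to X fails.
partial = {
    "NS1": {"X": True, "Y": True},
    "NS2": {"X": True, "Y": True},
    "NS3": {"X": False, "Y": True},
}

# Case 2: NS3 is isolated on its polling interface and reaches nothing.
isolated = {
    "NS1": {"X": True, "Y": True},
    "NS2": {"X": True, "Y": True},
    "NS3": {"X": False, "Y": False},
}

for name, case in (("partial", partial), ("isolated", isolated)):
    print(name, "independent:", answers_independent(case))
    print(name, "aligned:    ", answers_aligned(case))
```

In this model the independent rule gives the inconsistent X/X/Y answers in case 1 but keeps NS1 and NS2 serving X in case 2, while the aligned rule is consistent in case 1 (everyone answers Y) but leaves no answer at all in case 2.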
Hopefully we will find out what the correct function/switch is.