|
to TSI Gabe
Re: Google DNS versus ours
A nifty tool for those interested in benchmarking DNS servers more from their connection(s), and/or comparing more alternatives to TSI's own:
» www.grc.com/dns/benchmark.htm
I haven't run it yet, myself, but I will once I eventually get my network setup all sorted out. |
|
Guspaz Guspaz MVM join:2001-11-05 Montreal, QC
1 recommendation |
to TSI Marc
said by TSI Marc:Google has that 8.8.8.8 block anycasted.. it's a routing trick.. that makes all the routers think that IP is really close but in fact there are servers everywhere with the same IP...
There's no reason TSI couldn't anycast the DNS server IPs so that the same DNS IPs are used anywhere in TSI's territory :P Of course, that's kind of pointless since the vast majority of people use DHCP/PPPoE to automatically set the DNS servers anyhow. Most of the people setting theirs by hand are DNS ricers :P |
|
TSI Marc Premium Member join:2006-06-23 Chatham, ON |
TSI Marc
Premium Member
2012-Oct-11 8:58 pm
well.. there is one good reason.. and well, it's that it would take a bit of lifting to do it and make sure it's done right and for what? not much value..
we'd have to move everything else off that class C... |
|
|
said by nbinont:
said by TSI Gabe:K...honestly this is going to be hard to reproduce now that we are running other DNS servers...I guess keep me posted if it happens again and I'll take another look. Since right now I'm able to resolve those domains just fine. Thanks!
I'll follow up and try to reproduce it.
Well, it seems like whatever Gabe did over the last week has fixed it for me! I verified that it was still acting up a few days ago (and it was), but tonight it seems to be resolving correctly the first time.
I waited for the cached entry to expire in TSI's DNS server, then asked it to resolve the site again. Last week it would fail a few times before finally getting something for the cache, and then be good until the cache expired again. Tonight, after the cache expired it worked the first time. Waited for the cache to expire again (30 min expiry in this case), and tried again. Success again!
I assume I must be on one of Gabe's new DNS servers - and they seem to be working well! Guess I'll have to go update my review... |
|
|
Good news, time to change things up on my setup a bit and see for myself. |
|
|
mlord join:2006-11-05 Kanata, ON |
to nbinont
You do realize that only the originator (and TSI) can read threads in TSI Direct, right? Not the rest of us, so posting links to those threads doesn't help anyone here. |
|
TSI Marc Premium Member join:2006-06-23 Chatham, ON |
TSI Marc
Premium Member
2012-Oct-11 11:29 pm
said by mlord:You do realize that only the originator (and TSI) can read threads in TSI Direct, right? Not the rest of us, so posting links to those threads doesn't help anyone here.
yeah hehe AkFubar pointed that out to me too.. hehe I didn't realise at the time.. looks like nbinont's issue is solved now too. so it's all good |
|
koitsu MVM join:2002-07-16 Mountain View, CA Humax BGW320-500
|
to TSI Gabe
I'm not a Teksavvy customer, but I have a question -- one which after scouring the Internet (assuming what you're using is namebench) nobody has asked. What on earth does the vertical axis represent? It just says "%". Percentage of what? Packet loss? DNS query rejection (NXDOMAIN, or other error)? Basically that graph doesn't mean anything unless it's explained 1) where the data is coming from, 2) the exact test being used, and 3) what each axis represents.
For example, namebench.py can send 250 queries to a DNS server. That's nice -- is that 250 concurrent lookups per second? Is that 250 queries total and then it graphs the response time? If the latter, then shouldn't the X axis be query number and the Y axis be response time (in milliseconds), with the visual results being a "scatter graph" followed by a line drawn which indicates the average median? Surely I can't be the only one questioning what on earth that thing is actually showing.
Otherwise, if I take it to mean "percentage of queries and how long they took to be answered", it looks to me like TSI's servers are taking between 60-200ms about 70% of the time, and 10-20ms the remaining 30% of the time. While comparatively, Google's servers are taking between 60-200ms about 80% of the time, and 20-30ms the remaining 20% of the time. And to me, that isn't impressive (if anything the results should be the opposite -- first-time query should be slow, but subsequent queries for the same NS/A/PTR/etc. should return almost instantly due to record caching, assuming all recursive records involved don't have stupid TTLs like 1 second).
Let me show you what actual kernel developers working on UDP stacks tend to graph when it comes to nameserver performance:
» people.freebsd.org/~kris ··· d-pt.png
» people.freebsd.org/~kris ··· gige.png
» people.freebsd.org/~kris ··· pt-2.png
» people.freebsd.org/~kris ··· -nsd.png
Welcome to why just blindly dumping "pretty pictures" isn't helpful without concise (and precise) documentation alongside. |
|
TSI Marc Premium Member join:2006-06-23 Chatham, ON |
TSI Marc
Premium Member
2012-Oct-11 11:58 pm
I'm sure Gabe will chime in but I think it's pretty straight forward what the graph says...
85-90% of queries take 10ms to return a request and all requests always take less than 200ms...
your graphs show queries per second and load.. we're highlighting how quickly a query is returned not how many it can return which is also an important stat no doubt but given we have 4 servers.. load is less of an issue for us. |
|
koitsu MVM join:2002-07-16 Mountain View, CA Humax BGW320-500
|
koitsu
MVM
2012-Oct-12 12:07 am
I don't find this graph straight-forward in any way, shape or form. "85-90% of all queries take 10ms to get a result". Okay, that's because you look at the graph and see that the point where the graph "shoots off horizontally" starts at 85%, with the vertical axis being at 10ms, correct? That's the only way I can see how you reached that conclusion. Except if you apply the same logic to the data shown on the right side of the graph, you could safely say that 97% of all queries took 200ms to get a result...
The following graph (X axis = duration, Y axis = nameserver IP) makes perfect sense but doesn't really provide any hard data, though as I said, that one does make sense. It's the first graph that doesn't. |
|
|
to NytOwl
said by NytOwl:A nifty tool for those interested in benchmarking DNS servers more from their connection(s), and/or comparing more alternatives to TSI's own:
»www.grc.com/dns/benchmark.htm
I haven't run it yet, myself, but I will once I eventually get my network setup all sorted out.
This is a great program and for me it helped decrease ping a bit in online game play. For me it had shown that TSI's DNS servers were second only to Rogers'. (I haven't tested it in a couple of days) I still use OpenDNS though because of their Web Filters and overall better security. |
|
TSI Marc Premium Member join:2006-06-23 Chatham, ON |
to koitsu
it's a simple graph...
x axis = time in ms, y axis = % of queries..
if 100 queries were sent, order the results from shortest to longest time, put a dot for each one at its position along the y axis and at how much time it took on the x axis, and that's the distribution you would get. |
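Editor's sketch of the construction Marc describes, assuming a plain list of per-query response times in milliseconds. The function name `response_distribution` and the sample timings are hypothetical; this is not how namebench itself is implemented, only an illustration of "sort the results, then plot percent of queries against time".

```python
def response_distribution(times_ms):
    """Return (time_ms, percent_of_queries) points, ordered by response time.

    Each point says: this percentage of queries was answered in time_ms or less.
    """
    ordered = sorted(times_ms)
    n = len(ordered)
    return [(t, 100.0 * (i + 1) / n) for i, t in enumerate(ordered)]


# Hypothetical sample: mostly fast (cached) answers plus a few slow lookups.
sample = [9, 10, 8, 10, 9, 150, 10, 9, 60, 10]
points = response_distribution(sample)

for t, pct in points:
    print(f"{t:>4} ms  {pct:5.1f}%")
```

Plotting these points with time on the x axis and percent on the y axis gives a curve of the same shape as the chart under discussion: a steep rise where most queries cluster, then a long flat tail for the slow ones.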
|
TSI Marc |
to koitsu
and no axis of evil |
|
koitsu MVM join:2002-07-16 Mountain View, CA Humax BGW320-500
|
to TSI Marc
Marc, politely: I've had two other senior systems engineers (like myself) look at the graph. Both of them are equally as perplexed, and in the same way I am. I'll let Gabe respond from here on out, but I'll explain more verbosely:
What you've described *makes sense* (as in conceptually what you want is doable), but what your first graph actually shows doesn't jibe with what you claim the results are -- and it's because of the type of graph being used + how the data is being graphed. (Readers should note I ABSOLUTELY believe Teksavvy's claims that their nameservers take ~9ms on average vs. Google's 20-30ms. And the reason for that is quite honestly network round trip time between TekSavvy customer and Google's DNS servers, also taking into consideration authoritative nameservers on the Internet who do not work with large EDNS packets (this adds time to the response)).
I believe the data you have is confusing because you're using a line graph rather than a scatter graph or scatter plot. Honestly what should be happening under the hood:
Loop iteration #1:
1. Issue 100 DNS queries and keep track of the response time of each query. Query types will vary (different zones, TLDs, A vs. NS vs. PTR etc.), and response times will vary (some will be cached results, some won't be -- those which aren't should be much higher in response time)
2. Get an average response time: add up all 100 query response times, divide by 100. Result: average response time of 100 queries.
3. Graph result on Y axis, with Y axis label "average response time (in ms) of 100 DNS queries". X axis should be incremental based on time, or simply an incrementing variable ($loopcount++).
Loop iteration #2: repeat steps 1/2/3, except in step 3, the X axis location should be further to the right than before, so that you can draw a line from iteration plot data point #1 to iteration plot data point #2.
The resulting graph would look roughly something like this.
The first loop iteration -- assuming all the nameservers it's querying have *no cached records* -- should be very slow (high response times due to recursive, non-cached lookups). The 2nd loop iteration should be much faster (cached results), the 3rd as well, etc. etc... The 2nd to Nth results should be "roughly" all within the same amount of time -- however, this greatly depends on the data set being measured (more specifically: what the per-record TTL is of something being resolved, or the SOA TTL associated with that record's zone).
If you were to take all the graphed averages (how many depends on how many loop iterations you let things run for -- it matters! If just one loop, then the results are worthless!) and put them in their own data set, you could then graph those using a bar graph or bar chart, where each bar would represent response time sections, e.g. 0-10ms, 11-20ms, 21-30ms, etc. and let people see what the "general average" response time is for everything. This is akin (mostly) to the 2nd graph you listed in your post (the blue horizontal bars), except with more granularity.
And trust me, I am quite familiar with data/metrics graphing -- I wrote all of what you see there, sans the dygraphs library, and have had to write an entire code base (all perl + dealing with the mess that is RRDTool) to graph VirtualHost bandwidth usage on Apache (using no third-party modules). Not trying to troll or give you a headache, mate! |
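Editor's sketch of the loop koitsu outlines above: one average per iteration, then the averages grouped into response-time buckets. Everything here is an assumption for illustration. `make_fake_resolver` is a hypothetical stand-in that returns a fixed "slow" time on the first lookup of a name and a fixed "fast" time afterwards; it ignores TTL expiry and does no real DNS traffic. Buckets are half-open 10 ms ranges rather than the exact 0-10/11-20 split in the post.

```python
from collections import Counter


def make_fake_resolver(miss_ms=80.0, hit_ms=9.0):
    """Hypothetical resolver model: slow on first lookup, fast once cached."""
    cache = set()

    def query(name):
        if name in cache:
            return hit_ms
        cache.add(name)
        return miss_ms

    return query


def run_iteration(query_fn, names):
    """Steps 1 and 2: time every query, return the mean response time in ms."""
    times = [query_fn(name) for name in names]
    return sum(times) / len(times)


def bucket_label(avg_ms, width=10):
    """Map an average onto a bucket such as '0-10ms' (lower bound inclusive)."""
    low = int(avg_ms // width) * width
    return f"{low}-{low + width}ms"


names = [f"host{i}.example.com" for i in range(100)]  # placeholder hostnames
query = make_fake_resolver()

# Step 3, repeated: one data point per loop iteration.
averages = [run_iteration(query, names) for _ in range(5)]
histogram = Counter(bucket_label(a) for a in averages)

for i, avg in enumerate(averages, start=1):
    print(f"iteration {i}: {avg:.1f} ms average")
print(dict(histogram))
```

With this toy model the first iteration is slow and every later one is fast, which is the shape koitsu expects; a real run would swap `query` for a function that times an actual lookup.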
|
TSI Gabe Router of Packets Premium Member join:2007-01-03 Gatineau, QC |
TSI Gabe
Premium Member
2012-Oct-12 6:12 am
The graph is being generated by a tool called namebench, I believe it's Google themselves that released it. This isn't something I created. |
|
TSI Gabe |
TSI Gabe
Premium Member
2012-Oct-12 6:13 am
I understand what you are saying though, there are more details the namebench report spews out that are missing here and I didn't necessarily want to publish them for fear of releasing internal network info. |
|
koitsu MVM join:2002-07-16 Mountain View, CA Humax BGW320-500
|
koitsu
MVM
2012-Oct-12 6:38 am
Understood. And yeah, in my original/first reply, I linked to the namebench site -- their graphs are identical in layout (see "Response Distribution Chart"), meaning the use of a line plot model.
I had a 4th colleague of mine (better educated than myself, especially in mathematics) look at the graphs as well, and he agrees the presentation model is incorrect for the kind of data being plotted (not that the data itself is wrong!). There are better presentation/layout models (scatter, etc.) that would present the information in a way that makes more sense, but that's not your fault -- it's the fault of namebench. Although since it uses the Google Chart API, the HTTP arguments could be changed to refer to a different model.
The part that shocks me the most is that namebench was written by a pair of Google employees. I'm surprised that someone would write such a useful tool then completely botch the visual representation part. "It's open source, so go fix it, koitsu!" Yeah, and it's Python; I'd rather swallow hot coals.
Anyway, thanks for chiming in and clarifying a bit, TSI Gabe, very much appreciated! |
|
AkFubar Admittedly, A Teksavvy Fan join:2005-02-28 Toronto CAN. 1 edit |
to TSI Gabe
Congrats Gabe/Marc et al. Internet access seems much more snappy here on new page loads. Cheers! |
|
2 edits |
to TSI Gabe
Question... On the "NAMEBENCH" tool...
Some of the DNS servers I tested are coming back with the message "Unable to get uncached results for: namebench2802998020.wordpress.com. ...".
They are then excluded from the results rankings, although some raw response times are still posted by namebench.
What exactly does that message mean, and what is its significance in regards to those servers? edit1: And are there any steps to take to eliminate this situation? Flushing DNS cache, etc.?
edit2: NEVER MIND!!! Duh! |
|
MaynardKrebs We did it. We heaved Steve. Yipee. Premium Member join:2009-06-17 |
to TSI Gabe
Gabe, You might want to invest in a copy of this bible » www.edwardtufte.com/tuft ··· oks_vdqi |
|
TSI Marc Premium Member join:2006-06-23 Chatham, ON |
to koitsu
said by koitsu:Marc, politely: I've had two other senior systems engineers (like myself) look at the graph. Both of them are equally as perplexed, and in the same way I am.
...
Not trying to troll or give you a headache, mate!
Hey no worries didn't mean to come off like that.. I'm a mechanical engineer and built and ran our network for 10 years.. the graph makes sense to me, I just assumed it did to others too. All good though man, appreciate the feedback, I know it's coming from a good place. I'm happy we were able to tweak a bit more performance on this front. Seems we're all excited about that |
|
Teddy Boom k kudos Received Premium Member join:2007-01-29 Toronto, ON
1 recommendation |
to koitsu
said by koitsu:Except if you apply the same logic to the data shown on the right side of the graph, you could safely say that 97% of all queries took 200ms to get a result...
It is essentially a Cumulative Distribution Function: » en.wikipedia.org/wiki/Cu ··· function
The right side of the graph says that 97% of all queries took less than 200ms to get a result. |
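Editor's sketch of how a point on a cumulative distribution is read, to make the "took 200ms" versus "took 200ms or less" distinction concrete. The helper `percent_within` and the sample timings are hypothetical and chosen only to resemble the percentages quoted in the thread. The comparison is inclusive (`<=`), which is the usual CDF convention.

```python
def percent_within(times_ms, limit_ms):
    """Percentage of queries answered in limit_ms or less (the CDF at limit_ms)."""
    return 100.0 * sum(1 for t in times_ms if t <= limit_ms) / len(times_ms)


# Hypothetical timings: 85 fast answers, a handful of slower ones, 3 very slow.
sample = [9] * 85 + [60] * 7 + [150] * 5 + [400] * 3

print(percent_within(sample, 10))    # share answered within 10 ms
print(percent_within(sample, 200))   # share answered within 200 ms
# Share that actually needed more than 100 ms but no more than 200 ms:
print(percent_within(sample, 200) - percent_within(sample, 100))
```

In this made-up sample the curve reaches 97% at 200 ms, yet only 5% of the queries took anywhere near that long; the rest of the 97% were already counted further to the left.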
|
1 edit |
to TSI Gabe
Interesting and thanks for the information....
However, it is still always better to use a 3rd party DNS service anyway (even at a slight ms hit) because:
1) Privacy reasons. You want to fragment your services as much as possible.
2) Larger DNS providers like OpenDNS have larger CDN networks in place than a local ISP running a few Akamai servers.
Just my 2 cents.... |
|
|
interesting point, I'd add to that you can also use DNScrypt with opendns
Here's the thread I posted about it: » DNSCrypt for Teksavvy users?
Gabe is there any chance you could implement DNScrypt with the new DNS servers? I'm hoping to make use of the new Tek DNS servers Thanks! |
|
|
to TSI Gabe
Hello TSI Gabe,
Could you please tell us whether TSI DNS servers properly respond to DNS SRV queries as needed for correct SIP support? My VoIP provider (Callcentric) is currently experiencing major issues, and one of their suggestions to mitigate the impact was to use 3rd party DNS servers, as some don't properly resolve DNS SRV queries. This is copied from their status update page:
Update 10/19 - DNS SRV
We have received reports that some users are having problems using our DNS SRV based servers.
This is specifically because the new list of servers returned is not properly parsed by some DNS servers due to the size of the information, or not being returned at all. As such, we recommend using different DNS servers in your router and/or device if you are experiencing problems.
We have tested the servers below and found that they resolve our SRV records properly:
xxx.xxx.xx.x xxx.xxx.xx.x |
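Editor's sketch for anyone wanting to test a resolver's SRV handling by hand, using only the Python standard library. It builds a DNS SRV query in RFC 1035 wire format and checks a reply for the truncation (TC) flag. Two loud assumptions: the hostname `_sip._udp.example.com` is a placeholder, not Callcentric's actual record, and the idea that truncated UDP replies are what "the size of the information" refers to is a guess, not something the status update confirms.

```python
import struct

SRV = 33  # DNS record type number for SRV (RFC 2782)


def build_query(name, qtype=SRV, query_id=0x1234):
    """Build a minimal DNS query packet with recursion desired."""
    # Header: id, flags (RD set), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label prefixed by its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class 1 = IN


def is_truncated(response):
    """True if the TC bit is set, meaning the answer did not fit in the reply."""
    (flags,) = struct.unpack("!H", response[2:4])
    return bool(flags & 0x0200)


packet = build_query("_sip._udp.example.com")  # placeholder hostname
print(len(packet), packet.hex())
```

To use it against a real server, the packet would be sent over UDP to port 53 of the resolver under test (for example with `socket.sendto`) and the reply passed to `is_truncated`; a truncated reply means the client has to retry over TCP, which some devices may not do.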
|
|
n3k0
Anon
2012-Oct-21 7:22 am
+1
Same problem on my end. Had to use DNS servers supplied on CC's webpage, within my device, to resolve.
Using Teksavvy's DNS servers resulted in registration failure on my device. |
|
|
OTIS3
Member
2012-Oct-21 9:45 am
I was having lots of trouble with Callcentric as well. Both their recommended OpenDNS and Teksavvy were failing. I had to disable these two options to get it working again.
Use DNS SRV DNS SRV Auto Prefix |
|
|
They say it should work with or without DNS SRV, and mine is still using it and registers properly. Teksavvy's DNS weren't working with DNS SRV set to yes. When I changed to their recommended DNS servers in my 3102 it started working.
Can we have a detailed answer from Teksavvy on this one? |
|
mlord join:2006-11-05 Kanata, ON |
to TSI Marc
I've been re-testing TSI DNS since this thread began, and thus far it hasn't failed on any sites for us (a record for TSI DNS here), and seems plenty quick enough now.
So TSI is now number one on the "Forwarders" list for our local DNS. Good stuff, guys! |
|
TSI Gabe Router of Packets Premium Member join:2007-01-03 Gatineau, QC |
TSI Gabe
Premium Member
2012-Oct-21 12:52 pm
I can take a look but it would be really useful if you guys could provide me with a hostname to test against. |
|