

Plenary Session
4 November 2014
11 a.m.

CHAIR: Welcome to the second block of the Plenary Sessions. Our next speaker's presentation is entitled Anycast on a shoestring, and I recommend that you, as an audience, not focus so much on the actual product being demonstrated through this presentation, but on the, I dare say, brilliant deployment strategies that are applicable to a much larger problem space than just Anycast. This should be highly inspirational. Ladies and gentlemen, give it up for Nat Morris.

NAT MORRIS: This is a presentation about an Anycast network that I built. It's Anycast on a shoestring, though I wanted to call it Anycast for shits and giggles.

This is a little bit about me. I am Nat and I live in west Wales with my wife Clare and our dog. I work at Cumulus Networks, and I also do some consultancy.

So, I am guessing a lot of people here are familiar with Anycast. It's a simple concept: you announce the same prefix from multiple locations, you get the benefits of BGP, and the traffic will go to the closest location. You can use it for load balancing and to keep the impact on the service local.

In summary, I really fancied deploying a DNS Anycast service, and I was motivated by two really great presentations I had seen online. The first one was Bill Woodcock's, on best practices in DNS Anycast service provision. The second one, which kind of gave me the kick to do it, was Dave Knight's dense Anycast deployment presentation, where he talks about what they did at ICANN, deploying many Anycast boxes using Puppet. I wanted to gain experience of doing automated deployments. I had a think about whether I could build a reasonable size Anycast network without spending too much, and about what I wanted to offer. I thought I could offer a secondary DNS service. It needed to be IPv4 and IPv6 to keep Martin Levy happy. It was going to be a free service with no SLA and no revenue. And the critical point for this was: does my wife Clare need to find out? It couldn't cost several thousand dollars a year to run.

So these were kind of my requirements. It had to be separate from my existing management network. I had a spare /24 and a spare /48, and then I went to the RIPE website and filled out the forms to request a new ASN. I needed to build a kind of framework to do the automation, and that's what I'm going to talk through in these slides.

I wanted to play with a whole bunch of new tools, message queues. I open‑sourced everything; you can have a look.

This is basically an overview of what the network looks like. I have Anycast nodes around the place and I announce the same /24 and the same /48 from every location. There is normally a link net to the service provider, or to the upstream, like a /30 or a /64 between my Anycast node and them. I just have a BGP session with them: I don't accept any routes, I set the default to point to them and I just announce my prefixes. I do the same everywhere, the same subnets being announced.
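
(As a rough illustration of that per‑node BGP policy, here is a minimal Python sketch that renders a Quagga‑style bgpd fragment: announce the Anycast prefix and accept nothing back from the upstream. The AS numbers, addresses and prefix are placeholder assumptions, and in reality the nodes are templated through Puppet rather than like this.)

    # Illustrative only: placeholder ASNs, addresses and prefix.
    NODE = {
        "local_as": 64511,
        "upstream_ip": "198.51.100.1",
        "upstream_as": 64496,
        "anycast_v4": "192.0.2.0/24",   # the /48 would need its own IPv6 address family
    }

    def bgpd_conf(node):
        lines = [
            "router bgp {local_as}".format(**node),
            " network {anycast_v4}".format(**node),
            " neighbor {upstream_ip} remote-as {upstream_as}".format(**node),
            " neighbor {upstream_ip} prefix-list NONE in".format(**node),      # accept no routes
            " neighbor {upstream_ip} prefix-list ANYCAST out".format(**node),  # announce only ours
            "!",
            "ip prefix-list NONE seq 5 deny any",
            "ip prefix-list ANYCAST seq 5 permit {anycast_v4}".format(**node),
        ]
        return "\n".join(lines)

    print(bgpd_conf(NODE))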

To manage it, originally I was doing everything over SSH, and this was becoming a bit of a pain because I needed to use TSIG to protect the zone transfers from my transfer host. So what I did was set up kind of an overlay network using OpenVPN: I have a hub, a management box, and all my Anycast nodes connect back to it, and I have RFC 1918 space and additional loopbacks. All my management traffic, all the beanstalk message traffic and all my zone transfers are sent down these OpenVPN management tunnels, and that makes life a little bit easier.

This is kind of how the DNS zone transfers work. I have a transfer host; people allow transfers from their masters to that box. From there, it does its own transfers out to all the Anycast nodes. It also supports NOTIFY, so if someone sends a notification of a zone change to the AXFR box, the transfer takes place and it gets pushed out in a couple of seconds.

I wanted to make managing and looking after all the metadata about the nodes easy. I was going to use Postgres, but it seemed hip and trendy to use JSON for everything now, so I used a JSON document store; I just put in JSON objects to describe everything. This is how I store the detail about a zone: I have the account that it's on, the zone itself and the master where I pull the zone from, where it's really stored. On the right‑hand side there's an example of how I describe an Anycast location. This one is a box in Edinburgh; I store the ISO country code and the FQDN, so I can put it on a nice map, and then I describe the BGP session as well in JSON. If I want an additional attribute for a node, I just add it in; there is no schema, it's pretty free form. This is a box I have got with Charlie at Fluency.
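
(To make that concrete, here is a minimal Python sketch of what such documents might look like and how they could be written to RethinkDB, the JSON document store named a little later in the talk. All field names, addresses and AS numbers here are illustrative assumptions, not the actual schema.)

    import rethinkdb as r   # classic RethinkDB Python driver

    zone_doc = {
        "account": "example-customer",      # who the zone belongs to
        "zone": "example.org",              # the zone itself
        "master": "192.0.2.10",             # where the zone is really stored / pulled from
    }

    node_doc = {
        "name": "edinburgh1",               # hypothetical node name
        "country": "GB",                    # ISO country code, for the map
        "fqdn": "edinburgh1.example.net",
        "bgp": {                            # the BGP session, also described in JSON
            "peer": "198.51.100.1",
            "peer_as": 64496,
            "local_as": 64511,
        },
        # no fixed schema: extra per-node attributes can simply be added
    }

    conn = r.connect("localhost", 28015)
    r.db("anycast").table("zones").insert(zone_doc).run(conn)
    r.db("anycast").table("nodes").insert(node_doc).run(conn)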

So, how was I going to build the application?
Well, a lot of the DNS services I have seen and built in the past for companies are run with cron jobs and rsync, and that's not great nowadays, because when someone adds a zone I don't want them to have to wait 15 minutes for the cron job to kick in. I started using beanstalk, which is a message queue. I have a public API in Python and Flask, and when users make API requests the data gets put into the RethinkDB database straight away, but also a message gets put into the beanstalk queue. Then, running on all the servers around the place, I have a Python service that subscribes to the message queue. As soon as a zone is added the message gets into the queue, the services on all the distributed nodes see the little message, and they instantly update the DNS server configuration and load the new zone. It can make changes really quickly: you can just add a zone on the API and within four or five seconds it's in place on all the boxes. I also store a lot of data in Redis as well, that's a really neat key value store, and I cache a lot of values in there so I don't have to keep writing stuff to the database layer.
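
(The node‑side worker in that pipeline could look roughly like the sketch below, written against the beanstalkc client; the tube name, message format and reload command are assumptions for illustration, not the actual implementation.)

    import json
    import subprocess
    import beanstalkc   # beanstalkd client; tube and message layout here are assumed

    conn = beanstalkc.Connection(host="mgmt-hub.example.net", port=11300)
    conn.watch("zone-changes")

    SNIPPET = 'zone "%(zone)s" { type slave; masters { %(master)s; }; file "db.%(zone)s"; };\n'

    while True:
        job = conn.reserve()              # blocks until a message arrives
        msg = json.loads(job.body)        # e.g. {"action": "add", "zone": "...", "master": "..."}
        if msg.get("action") == "add":
            with open("/etc/bind/zones.conf", "a") as f:
                f.write(SNIPPET % msg)
            subprocess.call(["rndc", "reconfig"])   # pick up the new zone without a restart
        job.delete()                      # acknowledge the message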

So, next I was thinking: where can I host the nodes? I had to fit into that magic price point of 1,000 dollars a year to operate it, which kind of ruled out co‑lo or dedicated servers, and I needed a really wide reach. I thought virtual machines could be perfect for this; maybe I could swap with friends or look at low cost hardware. I started looking for virtual machines. The first company I came across was Mythic Beasts, a hosting company in the UK. They were kind of Linux savvy, and it was £7 a month for my first box, which was kind of perfect. I took a bit of a gamble on this one; I hadn't spoken to them about BGP before. I opened a ticket and in the ticket I just wrote, oh, I have a /24 and a /48 of PI, can you set me up with a BGP session? And I pointed them at my AS in the RIPE database and the AS‑SET. They replied yeah, here are the BGP sessions, it's ready. I was like, wow, this is super easy. The quick success of that drove me on to look for other VM providers.

I stumbled across this website called Low End Box, which is for VMs that cost less than $10 a month. I had to have a good look through the providers, because there's a mix of virtualisation technologies out there. A lot of them are OpenVZ‑based, so they were containers, and I wasn't keen on that; I wanted to run my own kernel. I needed to find a provider that could do KVM, Xen or VMware, because I was running a mix of Quagga and BIRD. I looked on Low End Talk, the message board, and Googled for VPSs with BGP sessions. One of the first ones I found was a company called anyNode in Detroit. This time I didn't want to be so risky, so I messaged them first to ask if they support BGP, and they said yes. That box was $80 a year for a VM with BGP; that's the order that came through. My budget was starting to get used up, but the cost was still low. At the time I thought $80 was a bargain for a box with a BGP session. I kept on searching online, and the next one I found was an Indian bargain: this is a VPS provider in India advertising a VPS for $120 a year. I thought that was good, so I was going to click next on the shopping cart and type in my credit card details, but I Googled them and found a discount code, a recurring $72 a year off, so the VPS dropped in price and the total price for the whole year was $40. That wasn't just for the BGP session; that was the VPS with a BGP session.

So, after I started, a couple of weeks in I had these two nodes and things were starting to grow. I put up a VPS on my existing KVM server, and I had the box with Mythic Beasts, the one in Detroit and the box in India. Because of all the provisioning and the JSON and the message queue and stuff, things were easy to deploy; it took about ten minutes. I had to get the root password, install Debian, install Puppet and the magic happened: everything was provisioned, and it was making life a lot easier. So I wanted to kind of scale up, and some people had heard about what I was doing; I had lots of offers from people and I got in touch with friends that I knew had networks, so by the end of October 2013, this time last year, I had seven nodes. Simon from Freemax in Germany got in touch, like hey, it could be cool if we could swap VMs. It was kind of growing.

The next thing is that there were lots of people who said they could help me out, but they couldn't necessarily host a VM. I was thinking, how could I do this with super low cost hardware? There were various reasons why people couldn't host VMs: maybe the network engineering teams didn't have access to the VM hosts that were there, they were maybe run by separate teams, they might have had no VM infrastructure at all, or in some cases people were running things like OpenStack Neutron, or the VMs were actually routed by the hypervisor with just a default route, so I couldn't do BGP. So the solution I looked at was a Raspberry Pi, the $35 low cost unit, 512 MB of RAM, a 16 GB SD card. I bought a Raspberry Pi and when it arrived I sent it to my friend in Belfast. I put NSD 3 on there and did a little bit of performance testing; it was doing 200 queries per second. With NSD 3 it needed to bounce the daemon every time you added a zone, so I swapped to PowerDNS: I updated Puppet, changed the templates, and it was magic. I forgot about this Raspberry Pi that I sent out. Then, sat in a hotel room, I flipped it again and thought I'd go for BIND; I put it in Puppet and a few minutes later all the boxes were on BIND.

Then my friend Matt put in a box and connected it up to SFMIX. I couldn't turn down that opportunity, but his criterion was that it had to be zero U, so you can see here the Anycast server shoved in the side of the rack on its side, and I needed something that had two network cards, so I found this fit‑PC2, which is normally like a PC you put on the back of a monitor. I started peering with people on there, so I had sessions with HE, ISC and Wired. This location was one of the first times I brought up a peering request via foursquare comments on check‑ins: because I could see Aaron Hughes had checked in at 200 Paul, I'd comment on it and say can I have a BGP peering session, and finally, after this happened a few times, he gave in.

I had problems with the Raspberry Pi. It was ARM and I couldn't easily emulate that in the test environment where I was building stuff, and loading zones was slow; Ruby on the Raspberry Pi ran like a dog. So I had more offers to host nodes; friends said put a box here, and I needed to think about how I could do something cheap. I started using Gigabyte Brix boxes, a small x86 unit, and put a 30‑gig mSATA in there. I gave one of these to Andy and he swapped out the box I had in Belfast; it's sat there on top of some ME series routers. These are quite good because they are quad core and x86, so it was easier to test.

Today, I have got 12 nodes live and 6 in build ‑‑ actually, 8 in build now. You can see the green ones are locations that are live and the blue ones are in various stages of deployment. So, I thought I'd share some of the fun I have had along the way. I was often the first BGP customer for some of these VM hosts, and they weren't getting it right: in a lot of cases they still had their routing policies set to prefer transit routes over peering. On a couple of occasions people would give me access to their routers and I'd say, here is the route map you need to add and this is how you configure it. One of the things I found scary is that a lot of these cheap VPS providers don't have any checking on the routes. They had no filters or prefix lists; I could add a prefix and it would be on the Internet instantly. People didn't have communities, which I find useful for managing how I do traffic engineering, and I have helped a lot of people get those implemented. I have been using two RIPE projects extensively. The first one is RIPE Atlas: I have some scheduled measurements to query the servers so I can look at where the traffic is going. The other one I added on Friday before I went to bed: I started using RIPE Stat, querying with the REST API, which is easy. I query the /48 to look at the BGP state of it from all my nodes' announcements ‑‑ I can check the v6 routes are in place. These RIPE projects are awesome.
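
(A minimal sketch of that kind of RIPE Stat check in Python, using the RIPEstat Data API's routing‑status call; the prefix is a placeholder, and the exact response fields are worth checking against the RIPEstat documentation.)

    import requests

    PREFIX = "2001:db8::/48"   # placeholder, not the real Anycast prefix

    resp = requests.get(
        "https://stat.ripe.net/data/routing-status/data.json",
        params={"resource": PREFIX},
        timeout=10,
    )
    data = resp.json().get("data", {})

    # Print whatever the call reports rather than assuming exact nested keys.
    print("announced:", data.get("announced"))
    print("visibility:", data.get("visibility"))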

There appears to be a market for VPSs that have BGP sessions as well. These are a couple that came up on Google: Portlane, who sell a BGP session for an extra 10 euros a month, and the same at ZettaGrid, an Australian company, where you can configure the BGP session on a web form. I'm hoping they do some LOA checks on them first before setting them up.

There's a bunch of discoveries I made along the way. It was easy to find budget Anycasters; I'd spot them on the Hurricane Electric site. There is a whole set of interesting models I have seen. One of them, for Anycast people, is that they have a shared /24, and they will offer dedicated IPs within that /24 bound to the same box, with custom reverse DNS, and the zones are slaved on the host. That's normal; lots of people are doing that. The next one was a kind of hosted /24, where a single DNS Anycaster offers to host a /24 out of another provider's space; this is kind of typical, and I have seen it used by a lot of the domain name registries. The next two I thought were slightly murkier: there are some people that have a shared /24, but they sell /32s GRE‑tunnelled back to a customer's location. It looks like Anycast, but the /32 goes down to a single authoritative server. It's like the illusion of Anycast, but without the performance benefits. The last one is, I have seen this a few times, people might only have a single /24, and they will announce the /24 and do Anycast on the edge, but then they put the rest of the /24 in a tunnel back to a single location, where they run their web interface and their mail and everything. I guess, as long as they are hosting the zones at the edge, there is a performance benefit, but the rest of the traffic is kind of dog‑legging all over the place in tunnels.

The discovery was that not every DNS hosting company hosts the zones at the edge. When I was doing the DNS queries, something obviously was going sideways, because it was going down tunnels. This is a comment about the GRE stuff: there are lots of people who like to sell GRE tunnels for Anycasted /32s.

One of the things I wanted to play with was Anycast HTTP, like CloudFlare, so what I did was deploy lighttpd on all my nodes, and I have a JSON file which has the details about the host, so you can curl this URL and it will return the JSON based on where you queried it from.

So, what I did ‑‑ I'm going to keep honest and host all the zones at the edge. I need to finish a web interface, because the barrier to entry for people using the service at the moment is that they have to use my REST API. That doesn't seem to stop people; I had crazy e‑mails from people in Iceland saying this is really cool, can you host 14 thousand domains for us? You can have an API key, but start adding 1,000 a week to see how this goes. I need to make my hub more consumable, make it an open source project, and put in install instructions and a getting started guide. I want to start supporting multi‑master as well, so I could pull zones off a customer's box; DNSSEC would be nice too. I want to mix it up a bit. At the moment I am running Quagga and BIND on my boxes. I'd like to have a field in the database so I can specify one box running BIRD with Knot, or another running ExaBGP and PowerDNS, because variety is the spice of life and I don't want to be bound to two projects.

I'm also looking for friendly hosts around the place, maybe you could host one of these, I have a suitcase full of them.

So, as of today these are the boxes that are live at the moment. You could see this on my website. It's updated live from ‑‑ these are the boxes that are in build, some new ones in Australia, some boxes going out to Singapore, some stuff in Austria as well and various boxes in Iceland. I have got a mix. I am trying to push people to give me v6 sessions as well but that's a bit of a struggle.

What are the takeaways from this? I never thought at the start of this project that I'd find a VPS for $40 a year with a full‑table BGP session. When I started tinkering with the Internet in the late nineties I had a 64 kbps leased line to my flat, and they wanted a $300 turn‑up fee and another £100 a month for the BGP session. Now you can get a VM with a full table. I want to make sure that people ‑‑ if you are offering these BGP sessions to your customers, use prefix lists, look at route hygiene and use the RIPE database.

Anycast can aid service delivery. It doesn't have to be DNS; you could use it to deploy HTTP services as well. And also, automate all the things: I am just doing this in between my day job, sat on the sofa at night, because I built it with Puppet and used other things to make it simple to spin up a new node. I encourage anybody that's working on projects like this to put together some slides and come up here and present about it.

And so, have I got any questions?

(Applause)

CHAIR: That was a great talk. Thank you.

AUDIENCE SPEAKER: Dave Freedman. Thanks for the great talk. I have two questions. The first is about your liveness detection and how that's related to the announcement of your Anycast prefix. How do you determine that you should continue announcing the Anycast prefix, and on what criteria do you withdraw, and how?

NAT MORRIS: I have a worker program that listens to the message queue and updates BIND. I have another process that runs ‑‑ the same service, a similar service ‑‑ and I check locally that the name server is replying to certain queries and answering as I would expect. I query the local name server every 30 seconds and if it doesn't answer then I withdraw the prefix. Was there a question about the anchor prefixes?
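
(A hedged sketch of that health check: query the local name server every 30 seconds and withdraw if it stops answering. The test zone, ASN, neighbour address and the withdrawal mechanism ‑‑ shutting the Quagga session via vtysh ‑‑ are all illustrative assumptions, not the actual implementation.)

    import subprocess
    import time
    import dns.resolver   # dnspython (query() here; newer versions use resolve())

    resolver = dns.resolver.Resolver(configure=False)
    resolver.nameservers = ["127.0.0.1"]
    resolver.lifetime = 5

    def healthy():
        try:
            resolver.query("example.org", "SOA")   # a zone the node should be serving
            return True
        except Exception:
            return False

    def withdraw():
        # one possible mechanism: tell bgpd to shut the upstream session
        subprocess.call(["vtysh", "-c", "conf t",
                         "-c", "router bgp 64511",
                         "-c", "neighbor 198.51.100.1 shutdown"])

    while True:
        if not healthy():
            withdraw()
        time.sleep(30)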

AUDIENCE SPEAKER: The second question I have is about denial of service attacks. Have you received any?

NAT MORRIS: Not that I'm aware of, not at the moment. But I'm nearly at sort of 20 nodes with 20,000 zones, so I'm guessing ‑‑

AUDIENCE SPEAKER: Sometime soon. I hope not.

NAT MORRIS: I'm pushing back the data, so I'm hoping that I can graph all those interesting things and show them as well.

AUDIENCE SPEAKER: Peter Losher, ISC. Very interesting presentation. As someone who works for an organisation that does Anycast pretty widely, a couple of questions. One is: do you do any sort of ‑‑ like we have for local nodes, where we set no‑export. Do you do anything like that, or are all the nodes basically globally Anycast and everyone sees all the nodes?

NAT MORRIS: At the moment, I initially didn't really play about with communities and I had lots of strange problems like I had a box in Belfast with David, and it had level 3 transit, so my customers in London were preferring the box in Belfast. So I ended up having to speak nicely to him, can you give me a community that I can set that doesn't announce it to level 3. I have lots of odd quirks where I have got this box, the one that's down the side of the rack that peers with HE and I was taking their kind of free v6 transit and I find that a lot of customers will shoot over to that node. I have had some interesting e‑mails from people after doing this presentation twice now, and one thing I found is that I don't have like a consistent service provider around the place, that's problematic because traffic will dog leg and go to not necessarily to the closest location. But I had an interesting offer from Digital Ocean, and they said this is super cool and why don't we give you four instances in our regions. I was like, can I take on 40 extra nodes? I'm hoping that having common providers would make life easier.

AUDIENCE SPEAKER: And so, have you done any sort of benchmarking? Because, between all these various hardware platforms and so forth, do you have any sort of common baseline, that every node can handle so many queries per second, for...

NAT MORRIS: Not at the moment. I benchmarked the Raspberry Pi before I put it in the post, but not at the moment. That's something I'd like to add. On my site you can see the live state of all the BGP sessions, and I'm hoping to add the queries per second to those and share those graphs live. I'm interested in talking to people that want to collaborate on this.

AUDIENCE SPEAKER: Martin Seebe. First of all, thanks very much for sharing this very exciting experience. My question is: who would be the typical user of this? Maybe there are many kinds, for example people looking for something better than Unicast without an SLA. And a second question: are you looking, based on this experience, to improve the service towards some sort of SLA, or are you showing the way for people to build, at low cost, something which is better than Unicast, with resilience? Too many questions maybe?

NAT MORRIS: The typical customers seem to be people that have a personal domain and a VPS as the primary, and they want to do secondary. I have hosting companies that are like, hey, we want to do Anycast, can you host 3,000 zones for us? I'm like, we'll take it on slowly and see how it goes. I'm not sure where it's going related to SLAs; it's just a fun folly at the moment. What was the other question?

AUDIENCE SPEAKER: Actually, amazingly, I'm thinking of some TLDs out there which are still on Unicast, where maybe this is better than simple Unicast in one single country. So is it possible just to offer them a way to do it, or...

NAT MORRIS: I had an e‑mail when I was on the train yesterday morning from a ccTLD that wanted to try it out. I'm not sure how to take it at the moment. Maybe swap out some of these boxes for Brix boxes and try to get a lot more nodes out there. I'm interested in people who have got ideas about what it could be used for. It would be great to talk to people afterwards.

AUDIENCE SPEAKER: Anyway, you are really showing the way that Anycast is not rocket science. Thank you.

NAT MORRIS: And automation isn't either. You can script these things together and automate all things.

CHAIR: Automate all things. I agree. Are there any questions maybe from the chat room? None? In that case I will hand the mic to Will.

(Applause)

CHAIR: Hi, our next presentation is from Brett Carr, who many of you will know. He is going to talk about name space collisions in the root.

BRETT CARR: I work for ICANN within the global domains division.

So, today, I want to talk to you about DNS name collision and the risk mitigation around it, and I don't know how to go to the next slide.

So, name collisions. Name collisions in the DNS are not a new thing. They are something that has existed for quite some time, and every time a new top level domain gets added to the root ‑‑ which has happened ever since the DNS has been in existence ‑‑ there's been a risk of a name collision related to that new top level domain. The difference at the moment is that obviously new top level domains are getting added much more frequently than in the past.

So, the other day I was preparing for this talk and thinking about name collisions. Obviously name collisions are something that has existed in DNS, but name collisions are also a general problem in life sometimes, and I was trying to think of some examples of name collisions that are not on the Internet that might give some people in the audience an idea of what a name collision is, if they didn't already know.

So, I was walking my daughter to school at the time, trying to think of examples, and she called out across the playground to say good morning to one of her friends, and another little girl who had the same name answered, which caused quite a bit of confusion, as you can imagine. If only she had used the fully qualified child's name, everything would have been fine. So, I gave her those instructions for the next day.

So, we know from research conducted on things like the Day in the Life of the Internet that the root DNS servers receive lots and lots of queries every day for top level domains that don't exist. This is mainly because some enterprises use what they consider to be private name spaces within their internal networks, but they may be configured in such a way that those DNS requests leak out of the network. And even for those organisations that are using what they consider to be private name spaces and have configured them so that the DNS queries don't leak out of the network, it is often quite difficult to stop the users leaking out of the network. We can't stop users coming to conferences like this or going to Internet cafes, and if their laptops or tablets are configured to use those private name spaces, sometimes, again, those queries will leak and hit the roots.

So, these queries happen because these enterprises use these name spaces for various reasons. Quite often we find that this is the case because they have used names that were listed in documentation many years ago. Obviously companies like Microsoft have listed particular names within their documentation, some people may have adopted those names. Also, most operating systems have got the concept of search lists, and this is a process where if the user uses a short name, then the search list is appended to the name and those search lists are processed in different ways in different operating systems so they can lead to behaviour that isn't expected particularly by some administrators.

So the crunch comes when we start adding new top level domains into the root servers, and these queries that we were getting previously might suddenly start succeeding: they might get an A record or an MX record, depending on what's been asked for, or they might get a referral to another set of name servers. In other words, the behaviour changes. At best, it could be a nuisance to the users, because a site or application they were using previously, which was resolving to one address, might now resolve to something else or fail in a different way, and that might mean their application stops working. At worst it can also potentially lead to data leakage: we can have an application that now resolves to a different address but sends some login credentials or other data, because it doesn't realise that the end point has changed.

So, ICANN have been aware of the issues around name collisions for quite some time as part of the new gTLD programme. We have looked at various mitigation measures; there is no red pill to solve the name collision problem, but we can put various measures in place that may help things along a bit and draw people's attention to it a bit more. There are three strings that appeared in the research that was done on queries hitting the roots that were much more prevalent than anything else, and those three top level domains have been deferred from being delegated indefinitely: they are .mail, .corp and .home. We're not saying they'll never be delegated, but what we have said is they won't be delegated until we see the amount of queries at the roots drop to a much lower level than they are currently.

The second measure we put in place is that we insist on 120 days of no activation of names in a TLD from the date of the contract signed with ICANN. This is to allow anybody using an internal certificate authority to revoke any names within it.

Thirdly, we have put in place an emergency response system related around name collision, so there's a process you can follow on the ICANN website to report a name collision, and then a member of our staff looks at that, investigates it, and then we'll work with the end user and the registry in question to try and solve that issue with them. In extreme circumstances where there's a danger to human life related to a name collision, there is a process we can follow to pull the delegation from the root.

Lastly, we have a process known as the 90 day controlled interruption. This is something we put in place that every new gTLD has to follow: add these extra records into their zone for 90 days. These are intended to give a certain set of responses to the majority of queries to that TLD, to prompt system administrators to investigate the name collision issue. There are two different types of controlled interruption. There is what's called wildcard controlled interruption: this is where a set of wildcard records is inserted at the apex of the zone, so all queries to that zone will receive an answer based on those wildcard records. And then, for TLDs that have been delegated for a longer period, we have what's called SLD controlled interruption. This is where we give the TLD a long list of certain SLDs, and that list is made up from research we have done, again using the Day in the Life of the Internet data, on commonly queried SLDs for that TLD at the root. And for every one of those SLDs, the TLD has to again insert these wildcard records, which I'll talk about in a second.

So, the aim of this is to leave some breadcrumbs behind for sysadmins to notice issues and see something in their logs or in their intrusion detection systems.

So, the first of these records that we ask them to put in returns a strange loopback address of 127.0.53.53. This is meant to make connections fail and not go outside the host that's asking, but it's also meant, because it's such a curious, strange‑looking address, to make people notice it and go, that's funny, I'll look that up and see what it is. Other records that we insert are MX, SRV and TXT records, which again point to a DNS name collision occurring, and hopefully people will look those up and find these slides or something else, for instance.
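
(A small sketch of what a sysadmin could do with that breadcrumb: check whether a name used internally now resolves to the controlled interruption address 127.0.53.53. The internal name below is a placeholder.)

    import socket

    def hits_controlled_interruption(name):
        try:
            addrs = {info[4][0] for info in socket.getaddrinfo(name, None)}
        except socket.gaierror:
            return False
        return "127.0.53.53" in addrs

    if hits_controlled_interruption("intranet.corp"):   # placeholder internal name
        print("This name now collides with a delegated TLD: investigate!")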

One of the questions we quite commonly get asked is why there is no IPv6 address for controlled interruption. Obviously, you'll all know that in IPv4 a /8 is dedicated to loopback, which is a large waste, but it does mean that, in this context, we had something to play with. IPv6 is a lot more efficient and only has a /128 for loopback, so we didn't really have anything in IPv6 to use. A few people suggested using IPv4‑to‑IPv6 mapped addresses, and we did some research on this, but we found that there was a lack of standardised implementation across different OSs; some did it one way and some did it another way, and really the variations meant that the behaviour was so unpredictable that we thought it was better to do nothing rather than do something that might cause bigger issues.

So, I talked about SLD mode CI previously, and within SLD CI there are actually two different types as well. When we originally sent the spec out to the registries we asked them to insert these four records into their zone for each second level domain. Some feedback we got from the registries suggested that we might want to put wildcards in for these four records instead of flat records, because that would catch more queries, and we did change the spec; we now allow either, but strongly encourage the wildcard method.

So, ICANN have a monitoring system in place which monitors all the new gTLDs to check their compliance with CI every day, and we send out compliance notices if people aren't doing things correctly; we count the number of days and when it gets to 90 days we let them carry on with normal operations. Currently there are 344 TLDs in SLD mode, 78 in wildcard mode and four not yet started. The 344 and 78 will actually swap around, because the rules state that new TLDs are now only allowed to do wildcard mode. SLD mode is a legacy thing, so the 344 will drop down to nothing eventually and the 78 should go up.

ICANN also commissioned some web ads with a famous search engine, so there are some figures there around the impressions we got based on the searches around controlled interruption.

So, as I mentioned earlier, we have a reporting process where anybody can report a name collision. To date, we have received 11 reports. Four of those relate to the use of search lists. Four are related to internal name spaces in use. Two were very simple configuration typos which we were able to solve quickly, and one is still currently under investigation. None of these reports involved harm to human life, so none of them required the pulling of the delegation. We were able to get the end user and the registry to work together to solve these in the majority of cases.

So, the two tips that ICANN want to give everybody more than anything else is that, you should only use public DNS fully qualified domain names even in your internal network. Don't expect any TLD that you are currently using will stay as a private TLD because anything could happen in the new gTLD programme going forward.

And where possible, avoid the use of search lists. I'm sure many of you will find it difficult to tell users to stop using short names, but we just want to point out that there are dangers there, and to try and limit the use of them where you can.

So, there's three URLs up there, where more information is available around this subject matter. The first of those is a white paper written by us which goes into all of the points I have gone into today but in a little bit more depth. The second URL is where you can actually report a name collision should you have one. And then the last is just a general portal on the ICANN website relating to name collision.

So, lastly, another question that we regularly get asked is: how do I know which new gTLDs are coming into the root shortly? If you take a look at the URL there, there is a CSV file that has lots of information in it. Two important things related to this are that there is a contract signing date and a delegation date in there. If a TLD is contracted ‑‑ so we have signed the contract with the registry but it's not currently delegated ‑‑ there will be a date in for contract but no date in for delegation. If that's the case, you know that particular TLD will be coming to the root fairly shortly.
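
(A hedged sketch of reading that CSV in Python: list strings with a contract signing date but no delegation date, i.e. TLDs likely to reach the root soon. The file name and column headings are assumptions; check them against the actual CSV.)

    import csv

    with open("new-gtlds.csv", newline="") as f:
        for row in csv.DictReader(f):
            contracted = (row.get("Contract Signed") or "").strip()
            delegated = (row.get("Delegation Date") or "").strip()
            if contracted and not delegated:
                print(row.get("TLD", "?"), "contracted", contracted, "but not yet delegated")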

So that's the end of my name collision presentation. So if anybody has any questions, I'd be happy to try and answer them.

(Applause)

AUDIENCE SPEAKER: I think there's a question on IRC that was asked before, if you want to ask that.

AUDIENCE SPEAKER: I have a question from Chris Hills, from National Grid. If it were possible to reserve a TLD for private use, similar to RFC 1918 ‑‑ such as .private ‑‑ then there wouldn't be a possibility for name collision.

BRETT CARR: That's not possible currently but I think there's been some talk on that in the IETF, but I don't know the full details of that. But if you were to look on the IETF DNS mailing list I think you'd see some discussion around that subject.

AUDIENCE SPEAKER: Hi. This is Martin. Thanks, Brett, for this presentation, which goes towards more outreach on this name collision problem. I can identify three communities concerned with this: the gTLD community, which is really well aware of it, and some of them are undergoing this and it's not very pleasant for them. The two other communities are, for example, IT professionals, and operators of resolvers. I can't really see how ICANN is reaching out to them. Who is reading the TXT record that says be careful, your configuration is bad? And who is really reached by your message, which is: you have been running a private TLD so far, you really need to think about migrating to something else, because tomorrow there will be some production TLD? I can't really see the effort. The first community bearing it is the gTLD operators, which is the one carrying the most load, and at the end of the journey, maybe, the result will not be as expected ‑‑ let's say, after 90 days, what will happen? I'm not sure that it will be really effective. So, I'm a little bit sceptical about pushing the outreach for one community and only publishing documents for IT professionals and people who are running software.

AUDIENCE SPEAKER: Dave Freedman from Claranet. I got a question: Does this problem of collisions affect IDNs?

BRETT CARR: I'm not sure. There hasn't ‑‑ the reports we have had, none of them have been IDNs. But ‑‑

AUDIENCE SPEAKER: But, knowing that an international community could potentially be affected by it, are the interruption and the messaging for that relevant? What happens if the message goes back in English to a community that is not able to understand it?

BRETT CARR: Yeah, point taken. I'm not really sure what the answer is, though, apart from putting it out in every language in the world, which I'm not sure is feasible. But, yeah, I think the problem does exist in IDNs, but probably much less than in English text.

AUDIENCE SPEAKER: Carsten Schiefner, no affiliation, just speaking as an interested Internet citizen. To answer the first question or to give a comment to the first question, I guess there is an effort going on in the IETF to have an RFC 1918 like space in the name space as well, and to my knowledge, .mail and .corp and also .home are being deferred indefinitely for the time being, they might likely be pushed under this newly created umbrella. My question is, we have talked about the new gTLDs for almost a decade right now, so I just wonder, and possibly hopefully you have an explanation for that, why this issue, this problem has been put up only essentially in the very last minute.

BRETT CARR: I don't have an answer for that, I'm afraid. Personally, I have only been at ICANN for three months, and I wasn't involved in ICANN policy very much before that, so...

AUDIENCE SPEAKER: Okay. So you haven't heard any explanations or rumours within the company either?

BRETT CARR: Not really. I think ‑‑ let's talk about it one to one.

AUDIENCE SPEAKER: Okay. Thank you.

AUDIENCE SPEAKER: Lars Liman here. I'm not really all that positive about this thing, because I think that misconfiguration hurts. When you misconfigure your systems, you should take the blame yourself for doing that. If that leads to information leakage, etc., that's your problem ‑‑ you made the mistake. That's how it works in real life and that's how it should work on the Internet as well. And you could actually compare that to using net 10 on your own internal network. To be able to reach your hosts on your internal network you must look them up, and even if you use public domain names that lead to an address record that says net 10 something, when you move your laptop outside and try to connect to net 10 here in the hotel, you will reach something else. So you have a similar problem, but you don't seem to address that. Why not? I'm not going to ask you ‑‑ I just want to point out that there are already numerous problems like this where misconfiguration leads to these problems, so trying to mitigate one with somewhat questionable methods is not really helpful. It's not going to fix the problem.

BRETT CARR: I don't disagree, but as in the same way that RIPE, ICANN is a community‑driven organisation and the community asked us to go in this direction.

AUDIENCE SPEAKER: Okay. Another thing is that if I was the computer manufacturer Dell, I would issue a slow DDoS attack on the roots asking for .hp. That would prevent that from being delegated to my competitor. Is that what's going on now? Do we know that?

BRETT CARR: I think they would have had to have already done that, rather than do it now.

AUDIENCE SPEAKER: Also, the address 127.0.53.53: that's a perfectly routable prefix. It just so happens that most computers don't route it, but try to configure it on your Cisco ‑‑ it works like a charm. So, that's also a bit questionable.

So, there are lots of arguments, I think, for saying that this is somewhat questionable. But I'm not going to try to prevent you from doing it.

AUDIENCE SPEAKER: Randy Bush, IIJ. Carsten, there is one more draft in the IETF concerning the domain .onion. This is a very serious problem because it seems to work today because you will attempt to resolve .onion down the Tor tunnel and you will reach a reasonable resolver. But the attacker who was trying to mount for instance a nation state attack against dissidents in their country, if they can block it from entering the Tor tunnel and resolve it in the DNS, they own you.

BRETT CARR: Do you have a question?

AUDIENCE SPEAKER: I was just waiting for you to answer and for the Chair ‑‑

BRETT CARR: It sounded like a comment rather than a question. I didn't hear a question.

DANIEL KARRENBERG: I was waiting for you to acknowledge me. I have ‑‑ first of all, I'd like to protect Brett here a little, because in my opinion and observation of this, first of all, this hasn't come up in the last minute; this has come up already like a year ago, I think more than a year ago, and the proposal of the requirement then by ICANN was that people log their queries and then get back through the RIR databases to the owners of the address space that the queries were coming from and we managed to prevent that bad idea. So, I have to say this is a slightly better idea.

The thing here is that this is coming from the bunch of lawyers that ICANN is. It doesn't come from the ICANN community, my personal opinion. So... and the comment I have, and what I would like you to take back to ICANN, and I say it again, publicly, is, and I think the first statement was in the same vein, if you try to ‑‑ [Mosan], I think it was ‑‑ if you are trying to reach the operational communities, it might be helpful to actually come here before you implement stuff like that ‑‑

BRETT CARR: Fair enough. Thank you.

CHAIR: We only have time for one more question I'm afraid. Sorry.

AUDIENCE SPEAKER: Patrik Falstrom, Chair of the Security and Stability Advisory Committee of ICANN. I'd like to comment on a couple of things that you pointed out. First of all, that we would get these kinds of problems is something that we in SSAC reported many years ago, but it was not picked up by the policy development process in ICANN. It was picked up in our second or third report or something, and specifically when the whole issue with the internal name certificates came up and people understood what was going on.

So it is not a new thing but some people were surprised for some weird reason.

The second thing is that I would like to point out that we in SSAC commented on the JAS report that this sort of proposal is built upon, and our proposal was, instead of using 127.0.53.53, to look at alternative methods, for example honeypots. So I just want to point out that we gave different advice to ICANN, but ICANN drew the conclusion to move forward in this manner. So the question of better for whom, given the different alternatives, is something that is still open and being discussed. Thank you.

CHAIR: I'm afraid we have to move onto the next presentation now, otherwise we will ‑‑

AUDIENCE SPEAKER: Just a brief comment. Carsten Schiefner. Just a direct answer to what Patrik has just said. This wasn't really a question or my comment wasn't really targeting at the issue itself but just at the means and the ways how ICANN has dealt with it, and that was from my point of view the very last minute, if not even five minutes after the launch of the new gTLDs, so, just to be clear on that. Thanks.

BRETT CARR: Thank you. I'll take that feedback back.

CHAIR: Thank you very much.

(Applause)

CHAIR: Agustin Formoso. Please come to the stage. Our next speaker will be elaborating on latency in the LACNIC region and discovering Internet topologies that way. It's called the Simon Project.

AGUSTIN FORMOSO: Hello everyone. I am part of the software development team at LACNIC, and I'm going to present a side project we have that tries to determine regional interconnection through latency measurements. I'm going to present where the project comes from, what state it is in today and where it is going.

So, the project started in about 2007, at the regional connectivity forums, where people from different perspectives tried to tackle a very big problem the region had: there was scarce knowledge about the state of regional connectivity. The main reason that motivated this was that the perceived performance of networks was quite low, mainly because traffic was routed in a suboptimal way. Traffic was sometimes even routed through Internet exchange points outside the region, so the costs and performance seen from within Latin America were very poor.

So, given those problems, the main objective was to provide open, accessible and recent information, so that all of those interested in knowing the state of the Internet in Latin America could access it in a very simple way.

After the group formed, in 2009 a Java applet was written which ran in the client browser and did several ping tests to many different sites. In 2012, the first IPv6 points were added to the platform and the database was opened to everybody. Later, in 2014, given some problems the Java applet had, we decided to put into production a web tool based on JavaScript. The results gathered in the first years of the tester are shown in a country latency matrix, which shows all the countries where tests were directed to or originated from. And, just scanning the matrix quickly, we can see that the inner latency of the countries is, logically, less than the mean latency of the whole matrix, and you can identify some countries whose performance is clearly worse or better than other countries in the region.

When we tried to rank the countries in the region, we had to pick a criterion for the ranking. We didn't want to be unfair with any of the measurements; for example, if you're ranking two countries that are not adjacent, or the traffic doesn't flow directly between them, let's say the measurement we're doing doesn't depend only on those two countries, you start to be unfair. So what we could do is just rank the inner latency, let's say the latency that originates from and is destined to the same country, as sketched below.
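
(A minimal Python sketch of that criterion: only samples whose origin and destination country match count towards the ranking, so no measurement depends on a third country's path. The sample records are made up for illustration.)

    from collections import defaultdict
    from statistics import median

    samples = [
        {"src": "AR", "dst": "AR", "rtt_ms": 42.0},
        {"src": "AR", "dst": "BR", "rtt_ms": 95.0},   # cross-border: ignored for the ranking
        {"src": "BR", "dst": "BR", "rtt_ms": 60.0},
        {"src": "BR", "dst": "BR", "rtt_ms": 55.0},
    ]

    inner = defaultdict(list)
    for s in samples:
        if s["src"] == s["dst"]:
            inner[s["src"]].append(s["rtt_ms"])

    for rtt, country in sorted((median(v), c) for c, v in inner.items()):
        print(country, round(rtt, 1), "ms")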

So, from the matrix, what we could chart was a big global mean from the whole data set, but when we start to explore how latency behaves, charting a few histograms of the profile between any two countries from the matrix, we can see that traffic actually behaves not in only one way, but in several different ways. This one, for example, is between Argentina and Brazil: we can see that the measurements actually had three clearly different behaviours, and the bad thing about it is that the worst performing peak in this histogram is the one that holds the most samples. We could say that the traffic between these two countries could and should be improved.

This is for 2009. And this one is for all of the years that came after. We can see that the histogram shifted to the left considerably, but the three peaks are still there. One of the worst things about this is probably not the peaks but ‑‑ going back to the slide before ‑‑ the sample counts: in 2009, 1,200 samples were taken, and in the five years that followed, just 1,900. Let's say the number of samples per year dropped a lot, which meant that the tool was not being used. So, we started to run the risk of not getting enough samples to be representative of the region.

Given that, we started to think about how we could get a real perspective of the regional Internet state with a very scalable tool that would automate measurements and provide much more data, constantly, over the years. We started thinking about deploying some tools on regional servers, but then the nature of the measurement would change: what the applet did was measure from the end user, and this would measure other things. So, we asked the question: what about using JavaScript for measuring RTTs? Actually, JavaScript is not a language meant to do network measurements or precise measurements, but for the purpose it had to fulfil, we thought it would be worth taking the risk and looking at how JavaScript performed and whether we could draw any conclusions out of it.

We saw also that there were many resources out there, we could deploy this in a very easy way and it would scale instantly. The motivation behind this was getting more measurements and getting more test points, so that we could have relevant information to draw conclusions.

So, looking at what had already been done with JavaScript for measuring latency in networks, we saw that [Jaco] had already walked this path, but the only problem that tester had was that it measured only the client‑server link, let's say how a user perceives the performance of a specific site, I mean the site he is visiting. We thought, well, it would be much better if one single user, let's say one single browser, could ping many, many test points.

So, we decided to write our own tester with a very simple algorithm: get the test point, check that the test point is online and that we can do measurements against it. Then the latency measurement itself is doing an HTTP GET to a resource, so we hang on the error hooks that JavaScript provides, and from those we make the latency measurements. Afterwards, we do some filtering in order to strip out outliers and get sensible data, and in the end we do a POST to the central database.

For those who want to implement this script on their site, it is very easy to configure, and there are several functions, several hooks, where they can post to their own database and/or draw some nice charts, and basically configure it to their own will.

To see what goes on underlying the JavaScript HTTP GET, we started to sniff TCP packets and check exactly what was going on under the hood. JavaScript only has access to the user side, and we cannot see anything between T0 and T1; the closest events we can hook on are T0 and T1, so we have to start making some assumptions about the server performance, where D1 and D2 are delays any server can introduce, and we have to make sure that they are quite low. For example, D1 may depend on the load the server has at a given time, and D2 depends specifically on the size of the web page requested, let's say the variable size of the 404 page.

So, pushing this tool a little bit further, we thought that we could actually get more than the HTTP GET response time: what about inferring some TCP behaviour out of the HTTP measurement? When we sniffed the network and charted HTTP versus TCP, we saw that there was a consistent shift in the latency histograms and that the nature of the curves changed, where TCP is more like a centred normal distribution and the HTTP looks a lot more like a log‑normal distribution with a right tail. So, we thought there might be some parameters that can take us from the measured red curve to the inferred green one, and one of those things might be the browser. Actually, when running the same tests from the same origin to the same destination at the same time from different browsers, you start to see that the behaviour is very different; for example, you can see the three browsers where the distribution parameters change a lot. So we thought, in order to normalise all the browsers and read the information coherently, we could build a table having, for example, the blue browser as a reference; we could say the red one has shifted 15 milliseconds, or the green one 18 milliseconds, in order to take those distributions and make them look a lot more like the blue one.
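
(A small sketch of that normalisation table: per‑browser offsets relative to a reference browser, applied before the distributions are compared. The 15 ms and 18 ms offsets are the figures mentioned in the talk; the browser names are placeholders.)

    BROWSER_OFFSET_MS = {
        "reference-browser": 0.0,
        "browser-red": 15.0,
        "browser-green": 18.0,
    }

    def normalise(rtt_ms, browser):
        """Shift a measured RTT so samples from different browsers become comparable."""
        return rtt_ms - BROWSER_OFFSET_MS.get(browser, 0.0)

    print(normalise(110.0, "browser-red"))   # 95.0, as if measured by the reference browser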

Some problems we faced were that lots of samples start to time out when people visit other tabs, change desktops or put the application in the background, but that was basically solved by filtering and stripping outliers, as in the sketch below. Something we really couldn't solve, and it's quite important, is the fact that people do the tests from congested networks, wireless or wired links, and we really don't have access from JavaScript to which interface the test is going out on. So, anybody who has already done stuff with this, I would be pleased to talk with them.
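
(A hedged sketch of that filtering step: drop timeouts and strip extreme outliers before a batch of samples is posted to the central database. The thresholds are arbitrary choices for illustration, not the ones the tester actually uses.)

    from statistics import median

    TIMEOUT_MS = 10000.0   # samples at or above this are treated as timeouts

    def clean(samples_ms):
        kept = [s for s in samples_ms if s < TIMEOUT_MS]
        if not kept:
            return []
        m = median(kept)
        # strip extreme outliers, e.g. backgrounded tabs that fired late
        return [s for s in kept if s <= 3 * m]

    print(clean([48.0, 52.0, 51.0, 9800.0, 10000.0]))   # [48.0, 52.0, 51.0]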

Out of the JavaScript measurements, we built a very similar matrix to the applet tester's. The first thing we noticed was that new countries started to appear in the matrix. This was basically one of the main purposes we had at the beginning: making the script very scalable. These are tests for about a month, and we can see that most of the region is already represented in these measurements. There are still some gaps in places with no tests, but as time goes by, they should automatically start to appear in the matrix. We can see, stretching our imagination a little, that the same main diagonal that appeared in the applet latency matrix starts to appear, depicting the inner latency ‑‑ the in‑country latency.

Similarly to the applet matrix, we did a ranking which basically reflects the main diagonal I was talking about but where you can see it better.

So, what about the histograms? Are the JavaScript histograms behaving similarly to the applet ones? Well, these are some tests run last year and these are the ones run this year. As with the Java applet, with JavaScript we see that the histogram shifts clearly to lower latency values, which we think is great and shows that the network latency is getting better.

Also, what we see is that last year only 220 samples were taken, whereas this month alone 1,900 samples were taken, so we are getting more representative data from the whole region.

Naturally, when you start doing these kinds of measurements, you start to ask yourself: well, what about IPv6? What about its deployment? Well, we saw some things; for example, the IPv4 histogram looks a lot cleaner than the IPv6 one, as it has one clear mean and the IPv6 one has, let's say, three different sub‑means. When we first started to ask more questions about it, we thought this number would be quite interesting. We know that this tester is not actually a very precise IPv6 deployment measurement tool, but the fact that four and a half percent of the measurements we see are over IPv6 makes us think that we are more or less in the right order of magnitude for IPv6 deployment in the region. That is not bad for us. What we have to be very careful about is choosing the sites where the script is hosted, as the audience visiting those sites starts to bias our measurements; we have to choose very precisely sites that are representative not only geographically of the whole region, let's say sites that are accessed from many countries, but also of the network technologies. I mean, we should choose those that are accessed both over IPv4 and IPv6.

So, to end: we thought, well, JavaScript is not a standard language for making measurements. Lots of parameters may vary the data set, but in the end we're getting more tests to more test points in a very easy way. It's just adding one line of JavaScript to some sites and you start to harvest lots of data, where actually today there's no other measurement platform that is widely deployed in the region. If you look at the test points from other major projects, you see there are not many, and you end up having few points in few sites and not measuring the whole region. And then there are some numbers we wanted to look at to make sure they were consistent with each other and with other measurement tools ‑‑ latency within the autonomous system, within the country, and regional latency overall ‑‑ which we think are coherent.

And some vision for the future: trying to get to know the nature of the underlying distributions a lot better by doing better statistical analysis; publishing better per‑autonomous‑system statistics so that operators know how their networks relate, latency‑wise, to the rest of the region; and trying to build the TCP table I was talking about, in order to infer TCP performance from HTTP measurements.

And we think getting data from other major projects will be very healthy, and we should establish criteria for ingesting these massive databases in order not to overwhelm our own representative database.

Well, to end the presentation, I'd like to thank the LACNIC software development team, whose work is very important for this project to go on, and the CAIDA project, which we have been in touch with and which provided us with so much information. And especially you: thank you for your attention. I hope you have enjoyed it.

(Applause)

AGUSTIN FORMOSO: I'd like to receive some questions if anyone has any.

AUDIENCE SPEAKER: Bartosz Musznicki from INEA. Thank you for the interesting talk. Could you please go back to the graph that compares the IPv4 and IPv6 results.

It looks like the results for IPv6 are shifted more to the right. Why do you think that is?

CHAIR: It's because the addresses are longer and it takes more time.

AGUSTIN FORMOSO: Well, our main objective is to have the platform and the data open to everybody. I'm no network researcher, so I really wouldn't ‑‑ I don't know what the answer for that would be.

AUDIENCE SPEAKER: Maybe anyone else knows here? No. Thank you.

AUDIENCE SPEAKER: Hi. My name is Andreas. I am currently contributing to Boomerang JS, the library that you have been talking about. Have you, in your research, looked at the Resource Timing API and the Navigation Timing API that are currently implemented in most modern browsers?

AGUSTIN FORMOSO: It is on our road map, getting more precise services from the browser, but not at the moment. I mean, we have looked at it, but it's not implemented right now; we think it's a very good thing to have in the tester and it would add a lot of accuracy to the measurements. But it is in our future plans.

AUDIENCE SPEAKER: Plus, given that you have the Resource Timing API, you could actually insert specific files through your JavaScript application, or script, that would fetch different files from the network, varying in size and in where they come from, and you could match that against what the Resource Timing API can tell you about how they came back through the browser, and make a better correlation between the latency towards your server and the response time of that server.

AGUSTIN FORMOSO: Okay. Thanks.

CHAIR: Any more questions maybe from the chat room?

AGUSTIN FORMOSO: I think people are too hungry.

CHAIR: Thank you for your presentation. Much appreciated.

(Applause)

CHAIR: Now, before you all leave for lunch, I have two service announcements I'd like to make.

The first one is for the Friday session, you can still submit lightning talks today. So if you have a crazy idea that is controversial and you want to aggravate people, please make a submission for the Friday session and the Programme Committee will then review your submission and possibly include you.

The second thing I'd like to say is more of a reminder. Please rate the talks on the website. We had three fantastic talks in this block and your feedback and comments are much appreciated. And you can even win a prize with it, an Amazon voucher of £100. So rate the talks, and I'll see you guys back here in 1.5 hours.

(Lunch break)

LIVE CAPTIONING BY MARY McKEON RMR, CRR, CBC

DOYLE COURT REPORTERS LTD, DUBLIN, IRELAND.

WWW.DCR.IE