The attached diff is not finished. I've just rediffed and adjusted whatever I could, but I have no more time today to keep working on it. The get_inet_addr function in system.c needs more work to get the families right (basically a few switch statements in there). The patched Pike compiles on Linux/Debian; it will compile on any Linux with glibc 2.2+ and should compile fine on FreeBSD/NetBSD/Solaris. I'm attaching the latest Pike 7.3 diff, which I used as the basis for the 7.5 one. You can use it for reference wherever I goofed up (which I undoubtedly did). If you do any tweaks on the patch, please send me a diff against the latest 7.5 CVS,
thanks,
marek
Marek Habersack grendel@caudium.net writes:
The attached diff is not finished. I've just rediffed and adjusted whatever I could, but I have no more time today to keep working on it. The get_inet_addr function in system.c needs more work to get the families right (basically a few switch statements in there).
I've only had a first look at the code, and my first question is: Is there a good reason why the open_socket function needs to take a family argument? To me, it seems a lot cleaner to just give it an address, and have the code examine the address to figure out which address family it is. (For numerical addresses, that's trivial; for DNS names it's harder to get it right, given that DNS in general can return several addresses, and of different families, so one would need an "address list" API).
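To illustrate the "trivial for numerical addresses" case, here is a minimal Python sketch (not Pike's actual code). The function name family_of is hypothetical; it relies only on the standard ipaddress and socket modules.

```python
import ipaddress
import socket


def family_of(numeric_address: str) -> socket.AddressFamily:
    """Return AF_INET or AF_INET6 for a numerical address string.

    Hypothetical helper for illustration; raises ValueError when the
    string is not a numerical address (for example a DNS name).
    """
    parsed = ipaddress.ip_address(numeric_address)
    return socket.AF_INET if parsed.version == 4 else socket.AF_INET6


print(family_of("127.0.0.1").name)
print(family_of("::1").name)
```

A DNS name makes ip_address() raise ValueError, which is exactly the case where an address list would be needed instead.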
Regards, /Niels
On Wed, Apr 23, 2003 at 10:43:28PM +0200, Niels Möller scribbled:
Marek Habersack grendel@caudium.net writes:
The attached diff is not finished. I've just rediffed and adjusted whatever I could, but I have no more time today to keep working on it. The get_inet_addr function in system.c needs more work to get the families right (basically a few switch statements in there).
I've only had a first look at the code, and my first question is: Is there a good reason why the open_socket function needs to take a family argument? To me, it seems a lot cleaner to just give it an address,
The original thought was to let the calling code decide, and the reason is that you can use special IPv6 prefixes to encapsulate the IPv4 address in the IPv6 address (I'm not talking about an IPv6-on-IPv4 tunnel, but a format like 2002::XXX.YYY.ZZZ.VVV, IIRC), and I thought optional code to do that might be added at a later date to open_socket. What's more, some code might want to force a specific family for some reason - remember that AF_INET, AF_UNIX and AF_INET6 are not the only families available (SCTP comes to mind, it has its own family, I think).
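For reference, two different IPv6 prefixes embed an IPv4 address: the IPv4-mapped prefix ::ffff:0:0/96 (usable on a dual-stack socket) and the 6to4 prefix 2002::/16 (a transition mechanism). The Python sketch below shows both using the standard ipaddress module; the helper name to_ipv4_mapped and the sample address 192.0.2.1 are illustrative assumptions, not anything from the patch.

```python
import ipaddress


def to_ipv4_mapped(ipv4_text: str) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the IPv4-mapped range ::ffff:0:0/96.

    Hypothetical helper for illustration only.
    """
    v4 = ipaddress.IPv4Address(ipv4_text)
    # The low 32 bits carry the IPv4 address, preceded by 16 one-bits.
    return ipaddress.IPv6Address((0xFFFF << 32) | int(v4))


mapped = to_ipv4_mapped("192.0.2.1")
print(mapped, "embeds", mapped.ipv4_mapped)

# A 6to4 address carries the IPv4 address right after the 2002: prefix.
six_to_four = ipaddress.IPv6Address("2002:c000:201::")
print(six_to_four, "embeds", six_to_four.sixtofour)
```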
and have the code examine the address to figure out which address family it is. (For numerical addresses, that's trivial, for dns names
I thought about it, but I usually stick to the rule that too much automation is bad - in the case of fairly low-level code like the open call, I think this (optional) flexibility is quite justified.
it's harder to get it right given that dns in general can return several addresses, and of different families, so one would need an "address list" API).
Yep, that's another issue that would have to be dealt with. Actually, IPv6 is distinguished from IPv4 in DNS by a different type of address record: AAAA instead of A. But then, if a DNS resolver returns addresses from all families, you are unable to guess which one the caller really wants - specifying the family is very helpful here.
regards,
marek
Marek Habersack grendel@caudium.net writes:
What's more, some code might want to force a specific family for some reason - remember that AF_INET, AF_UNIX and AF_INET6 are not the only families available (SCTP comes to mind, it has its own family, I think).
Does open_socket handle AF_UNIX? I didn't know that. As for SCTP, I haven't read up on that, but I'd be surprised if that's a new address family, rather than just a new protocol (third argument to socket, with name-to-number mapping in /etc/protocols).
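A small Python sketch of the family/protocol distinction: the family is the first argument to socket(), the protocol the third. SCTP is indeed a protocol (number 132) used with the ordinary AF_INET/AF_INET6 families. The helper name open_stream is hypothetical, and whether an SCTP socket can actually be created depends on kernel support.

```python
import socket

# SCTP's IP protocol number is 132; fall back to the literal where this
# Python build does not export the constant.
IPPROTO_SCTP = getattr(socket, "IPPROTO_SCTP", 132)


def open_stream(family, protocol=socket.IPPROTO_TCP):
    """Hypothetical helper: family is argument 1, protocol is argument 3."""
    return socket.socket(family, socket.SOCK_STREAM, protocol)


tcp = open_stream(socket.AF_INET)
print(tcp.family.name, tcp.proto)
tcp.close()

try:
    # Same address family, different protocol number.
    sctp = open_stream(socket.AF_INET, IPPROTO_SCTP)
    print(sctp.family.name, sctp.proto)
    sctp.close()
except OSError as exc:
    print("SCTP not available on this system:", exc)
```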
and have the code examine the address to figure out which address family it is. (For numerical addresses, that's trivial, for dns names
I thought about it, but I usually stick to the rule that too much automation is bad - in the case of fairly low-level code like the open call, I think this (optional) flexibility is quite justified.
What I didn't like about it, I think, was the default setting. I don't think there should be a configurable (global or per socket) default. Either the family argument should be mandatory, or there should be a well defined, constant, default behaviour. Getting the family from looking at the address string (or resolving it using DNS) seems like the most reasonable default behaviour to me.
Yep, that's another issue that would have to be dealt with. Actually, IPv6 is distinguished from IPv4 in DNS by a different type of address record: AAAA instead of A. But then, if a DNS resolver returns addresses from all families, you are unable to guess which one the caller really wants - specifying the family is very helpful here.
The resolver should return both IPv4 and IPv6 addresses. The user of a DNS name should not need to know in advance if that name refers to a machine with an IPv4 or IPv6 address. Actually, I think that a well-behaved Pike program that uses DNS names to refer to hosts should not even need to know that IPv4 and IPv6 are different; network programming should be easier in Pike than in C.
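As a rough illustration of a resolver call that does not commit to a family in advance, the Python sketch below asks getaddrinfo for AF_UNSPEC and keeps the family alongside each address. The name resolve_all is hypothetical; a numeric address is used in the demo so that it does not depend on a reachable DNS server.

```python
import socket


def resolve_all(host, port):
    """Return [(family, sockaddr), ...] for every address of host.

    Hypothetical helper; AF_UNSPEC lets the system resolver return
    IPv4 and IPv6 results together.
    """
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_UNSPEC, type=socket.SOCK_STREAM
    )
    return [(family, sockaddr) for family, _type, _proto, _canon, sockaddr in infos]


# A real host name may yield entries of both families.
for family, sockaddr in resolve_all("127.0.0.1", 80):
    print(family.name, sockaddr)
```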
/Niels
Yep, that's another issue that would have to be dealt with. Actually, IPv6 is distinguished from IPv4 in DNS by a different type of address record: AAAA instead of A. But then, if a DNS resolver returns addresses from all families, you are unable to guess which one the caller really wants - specifying the family is very helpful here.
The resolver should return both IPv4 and IPv6 addresses. The user of a DNS name should not need to know in advance if that name refers to a machine with an IPv4 or IPv6 address. Actually, I think that a well-behaved Pike program that uses DNS names to refer to hosts should not even need to know that IPv4 and IPv6 are different; network programming should be easier in Pike than in C.
I am OK with the idea that programming should be easier in Pike than in C. But if we look in DNS for an IPv4/IPv6 address and there is no DNS server responding, how should Pike react?
Also, I think that we should explicitly state whether we want IPv6-only, IPv4-only, or both v4/v6 addresses when opening a socket. This may be "optional", but there should be an option for the developer to state explicitly which protocol is wanted for his own application.
/Xavier
I am OK with the idea that programming should be easier in Pike than in C. But if we look in DNS for an IPv4/IPv6 address and there is no DNS server responding, how should Pike react?
Like it always has: by throwing an "invalid address" exception. If you can't resolve the address, you can't bind/connect to it even if you know what family it is supposed to belong to.
/ Marcus Comstedt (ACROSS) (Hail Ilpalazzo!)
Previous text:
2003-04-24 16:49: Subject: Re: IPv6 diff for Pike 7.5.7 (today's snapshot)
/ Brevbäraren
Xavier Beaudouin kiwi@oav.net writes:
I am OK with the idea that programming should be easier in Pike than in C. But if we look in DNS for an IPv4/IPv6 address and there is no DNS server responding, how should Pike react?
I don't quite understand this remark. If a Pike program wants to look up a DNS name (for instance by Stdio.File()->connect("pike.ida.liu.se", 22)), and the name can't be resolved, then connect has to fail in one way or the other. But that hasn't got much to do with IPv4/IPv6.
After reading some of the Pike code (in particular get_inet_addr, in system.c), it seems Pike will only use one address. Then we're in trouble if there are two addresses, one IPv4 and one IPv6, and only one of them works. Here, an application preference could help a little. Or one could just hope or assume that the system resolver is smart enough, and for example returns the IPv4 address(es) first if we have no IPv6 connectivity. (I think there's some internet draft on address selection that the resolver is supposed to follow, and the sorting rules are quite hairy).
I think the right way to solve this is to have some function that, given a name, returns a list of addresses (somewhat like getaddrinfo, but perhaps with a more pikish interface). The low level connect function should accept such an address list, and try all addresses. This can be completely transparent to Pike programs using the connect method (and it can get pretty fancy with asynchronous connects, as Pike can try to connect to a few addresses in parallel).
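A minimal Python sketch of the "try all addresses" idea, assuming the list holds (family, sockaddr) pairs such as those derived from getaddrinfo. The name connect_any is hypothetical, and this version tries the addresses one after another rather than in parallel.

```python
import socket


def connect_any(addresses, timeout=5.0):
    """Try each (family, sockaddr) in order; return the first connected socket.

    Hypothetical helper. Raises the last OSError seen if every attempt
    fails, or a plain OSError for an empty list.
    """
    last_error = None
    for family, sockaddr in addresses:
        try:
            sock = socket.socket(family, socket.SOCK_STREAM)
        except OSError as exc:
            # The family may be unsupported on this host; try the next one.
            last_error = exc
            continue
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
            return sock
        except OSError as exc:
            last_error = exc
            sock.close()
    raise last_error or OSError("empty address list")
```

The caller never sees which family succeeded unless it asks the returned socket, which is the transparency argued for above.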
And there should be a listen function that takes a list of addresses and binds and listens on all of them. This seems a little harder to do transparently, though.
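The listening side could be sketched the same way in Python: one socket per address, all bound and listening. The name listen_all is hypothetical. One known wrinkle on some systems is that an IPv6 wildcard socket also claims the IPv4 port unless IPV6_V6ONLY is set, so the sketch sets it.

```python
import socket


def listen_all(addresses, backlog=5):
    """Bind and listen on every (family, sockaddr); return the sockets.

    Hypothetical helper. If any bind fails, all sockets opened so far
    are closed and the error is re-raised.
    """
    sockets = []
    try:
        for family, sockaddr in addresses:
            sock = socket.socket(family, socket.SOCK_STREAM)
            sockets.append(sock)
            if family == socket.AF_INET6:
                # Keep the IPv6 socket from also taking the IPv4 port.
                sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
            sock.bind(sockaddr)
            sock.listen(backlog)
    except OSError:
        for sock in sockets:
            sock.close()
        raise
    return sockets
```

The caller then has to wait on several sockets at once (for example with select), which is part of what makes this harder to hide than the connect case.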
Also, I think that we should explicitly state whether we want IPv6-only, IPv4-only, or both v4/v6 addresses when opening a socket.
One can have some option that says "I would prefer IPv6 for this socket" or "I *must* have IPv6 on this socket, if that doesn't work, the operation should fail rather than fall back to IPv4". But most programs should not need to care about that.
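The "prefer" versus "must" distinction can be expressed as a reordering or a filter over the address list. This is a Python sketch under assumed names (select_addresses, strict); the sample addresses are the ones used elsewhere in this thread.

```python
import socket


def select_addresses(addresses, family=None, strict=False):
    """Order or filter [(family, sockaddr), ...] by a family preference.

    Hypothetical helper:
      family=None   -> keep the resolver's order untouched
      strict=False  -> preferred family first, others kept as fallback
      strict=True   -> only the given family; fail if there is none
    """
    if family is None:
        return list(addresses)
    preferred = [a for a in addresses if a[0] == family]
    if strict:
        if not preferred:
            raise OSError("no address of the required family")
        return preferred
    others = [a for a in addresses if a[0] != family]
    return preferred + others


sample = [
    (socket.AF_INET, ("10.12.1.4", 22)),
    (socket.AF_INET6, ("3ffe::dead", 22, 0, 0)),
]
print(select_addresses(sample, socket.AF_INET6))
print(select_addresses(sample, socket.AF_INET6, strict=True))
```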
/Niels
On Thursday, 24 Apr 2003, at 17:13 Europe/Paris, Niels Möller wrote:
Xavier Beaudouin kiwi@oav.net writes:
I am OK with the idea that programming should be easier in Pike than in C. But if we look in DNS for an IPv4/IPv6 address and there is no DNS server responding, how should Pike react?
I don't quite understand this remark. If a Pike program wants to look up a DNS name (for instance by Stdio.File()->connect("pike.ida.liu.se", 22)), and the name can't be resolved, then connect has to fail in one way or the other. But that hasn't got much to do with IPv4/IPv6.
Hmm, yeah, you are right... but Stdio.File()->connect("127.0.0.1", 22); can go through DNS...
Imagine such an entry:
127.0.0.1.mydomain.com. IN A    10.12.1.4
                        IN AAAA 3ffe::dead
How can we be sure that 127.0.0.1 is IPv4 and not IPv6?
After reading some of the Pike code (in particular get_inet_addr, in system.c), it seems Pike will only use one address. Then we're in trouble if there are two addresses, one IPv4 and one IPv6, and only one of them works. Here, an application preference could help a little. Or one could just hope or assume that the system resolver is smart enough, and for example returns the IPv4 address(es) first if we have no IPv6 connectivity. (I think there's some internet draft on address selection that the resolver is supposed to follow, and the sorting rules are quite hairy).
I agree.
I think the right way to solve this is to have some function that, given a name, returns a list of addresses (somewhat like getaddrinfo, but perhaps with a more pikish interface). The low level connect function should accept such an address list, and try all addresses. This can be completely transparent to Pike programs using the connect method (and it can get pretty fancy with asynchronous connects, as Pike can try to connect to a few addresses in parallel).
And there should be a listen function that takes a list of addresses and binds and listens on all of them. This seems a little harder to do transparently, though.
Yeah
Also, I think that we should explicitly state whether we want IPv6-only, IPv4-only, or both v4/v6 addresses when opening a socket.
One can have some option that says "I would prefer IPv6 for this socket" or "I *must* have IPv6 on this socket, if that doesn't work, the operation should fail rather than fall back to IPv4". But most programs should not need to care about that.
/Niels
Xavier Beaudouin kiwi@oav.net writes:
Imagine such an entry:
127.0.0.1.mydomain.com. IN A    10.12.1.4
                        IN AAAA 3ffe::dead
And again, this doesn't look like an IPv6 issue; you have exactly the same problem with "search mydomain.com" in /etc/resolv.conf, and the single A record
127.0.0.1.mydomain.com. IN A 10.12.1.4
Most applications will map the string "127.0.0.1" to the IP address 127.0.0.1, but an application that tries DNS first will map it to 10.12.1.4.
My answer to that is: Don't do it; that's a configuration recommended only to people who like to shoot themselves in the feet.
I think the following simple rules can be used for interpreting the string address argument to functions like Stdio.File->connect():
1. If it contains a :, then it must be a numerical IPv6 address. Convert it to an IPv6 address (or return an error if the syntax isn't right).
2. If it contains a dot, but doesn't end with one, check if it's a valid IPv4 address (and there's no need to support the old-fashioned variants of numerical IPv4 addresses with fewer than three dots). Convert it to an IPv4 address (or return an error if the syntax isn't right).
2a. (Alternative second rule) If it contains a dot, doesn't end with a dot, and the character after the last dot is a digit, then it must be a numerical IPv4 address. Convert it just as in (2).
3. Otherwise, it must be a symbolic name. Resolve it using facilities like the system's /etc/hosts file and the DNS system. The result from the resolution process determines if it's an IPv4 or IPv6 address (in general, the result is a list of addresses, which need not all be of the same type).
Note that with these rules, you can try connect("127.0.0.1.", 80) to connect to the machine with the DNS name 127.0.0.1. Which will of course fail, because there's no .1 top domain ;-)
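The rules above can be sketched in a few lines of Python. This sketch uses rule 1, the alternative rule 2a, and rule 3; the function name classify_address and its return shape are assumptions made for illustration, and the actual name resolution of rule 3 is left to the caller.

```python
import ipaddress


def classify_address(text):
    """Classify an address string as numerical IPv6, numerical IPv4, or a name.

    Hypothetical helper. Returns ("ipv6", IPv6Address), ("ipv4", IPv4Address)
    or ("name", text). Raises ValueError when a string that must be
    numerical has invalid syntax.
    """
    # Rule 1: a colon means a numerical IPv6 address.
    if ":" in text:
        return ("ipv6", ipaddress.IPv6Address(text))
    # Rule 2a: contains a dot, no trailing dot, and a digit after the last dot.
    last_label = text.rsplit(".", 1)[-1]
    if "." in text and not text.endswith(".") and last_label[:1].isdigit():
        # IPv4Address accepts only the four-part dotted form.
        return ("ipv4", ipaddress.IPv4Address(text))
    # Rule 3: anything else is a symbolic name to be resolved.
    return ("name", text)


for candidate in ("3ffe::dead", "127.0.0.1", "127.0.0.1.", "pike.ida.liu.se"):
    kind, value = classify_address(candidate)
    print(candidate, "->", kind)
```

With rule 2a, a string such as "1.2.3" is rejected with an error instead of being handed to DNS.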
Ah, and one more thing: The second argument to connect should really be int|string, where a string is interpreted as a service name to be looked up using SRV records or the system's /etc/services file.
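A Python sketch of an int|string port argument, with the hypothetical name resolve_port. It only consults the system services database through getservbyname; SRV lookups are not covered here.

```python
import socket


def resolve_port(port, protocol="tcp"):
    """Accept a port number or a service name and return the port number.

    Hypothetical helper. Service names are looked up in the system
    services database (typically /etc/services); an unknown name
    raises OSError.
    """
    if isinstance(port, int):
        return port
    return socket.getservbyname(port, protocol)


print(resolve_port(22))
```

On a system with a populated services file, resolve_port("ssh") would be expected to give the same number as resolve_port(22).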
/Niels
Ah, and one more thing: The second argument to connect should really be int|string,
It already is. I implemented that days ago. :-)
/ Marcus Comstedt (ACROSS) (Hail Ilpalazzo!)
Previous text:
2003-04-24 21:02: Subject: Re: IPv6 diff for Pike 7.5.7 (today's snapshot)
/ Brevbäraren
pike-devel@lists.lysator.liu.se