Experimental VoIP Setup 1.1 2009-11-21


Changelog 1.0 -> 1.1:
Phone: Firmware version 1.2.2.14 -> Firmware version 1.2.2.19
Phone: Account 1: NAT Traversal (STUN): No, but send keep-alive -> No
Phone: Account 2: NAT Traversal (STUN): No, but send keep-alive -> No
Phone: Account 4: NAT Traversal (STUN): No, but send keep-alive -> No
Observation 7: Text changed
Observation A: Text changed
TODO 5: Added

Equipment:
ADSL Modem: D-Link DSL-320B, Annex A/L/M, Hardware version D1, Firmware version EU_1.00
Router: Linksys WRT54GL v1.1, Firmware version DD-WRT v24-presp2 (07/21/09) voip - build 12533
Phone: Grandstream GXP2010, Hardware version HW0.2B, Firmware version 1.2.2.19, Bootloader version 1.1.6.6
Time Switch: Conrad DCF Time Switch Everflourish EMT707RCC

Configuration + Some Info:
ADSL: Mode: Bridge mode with 1483 bridged IP LLC as the connection type
ADSL: ATM Parameter: Virtual Path Identifier (VPI): 8
ADSL: ATM Parameter: Virtual Channel Identifier (VCI): 32
ADSL: ADSL Modem IP address: Fixed address 192.168.1.1 (a modem firmware bug prevents changing it)
ADSL: ADSL Modem Subnet Mask: 255.255.255.0
ADSL: Advanced ADSL Settings: Modulation Type: Autosense
ADSL: Advanced ADSL Settings: Capability Bitswap Enable: Yes
ADSL: Advanced ADSL Settings: Capability SRA Enable: No
ADSL: Status Info: Downstream rate: 3004 Kbps
ADSL: Status Info: Upstream rate: 317 Kbps
Router: WAN connection type: PPPoE
Router: WAN IP address type: Variable Public Class A Address
Router: WAN MTU: Manual 1492
Router: LAN IP address: Fixed address 10.xxx.yyy.15
Router: Running: DHCP, NTP, WLAN, DNSMasq, Syslogd, Milkfish, SPI Firewall, QoS, HTTP, Cron
Router: Setup: Advanced Routing: Operating Mode: Gateway
Router: Wireless: Basic Settings: Wireless Mode: AP
Router: Wireless: Basic Settings: Wireless Network Mode: G-Only
Router: Wireless: Basic Settings: Network Configuration: Bridged
Router: Wireless: Wireless Security: Security Mode: WPA2 Personal
Router: Wireless: Wireless Security: WPA Algorithms: AES
Router: Services: Services: DNSMasq: Enabled
Router: Services: Services: Local DNS: Disabled
Router: Milkfish: Main Switch: Enabled
Router: Milkfish: From-Substitution: Yes
Router: Milkfish: From-Domain: <dynsipUserName>.dynsip.org
Router: Milkfish: Milkfish Username: <dynsipUserName>
Router: Milkfish: Milkfish Password: <dynsipPassWord>
Router: Milkfish: SIP Trace: Disabled
Router: Milkfish: Dynamic SIP: Enabled
Router: Milkfish: SIP Database: Local Subscribers: User 1: sip:<numericTerrasipIdent>@terrasip.net
Router: Milkfish: SIP Database: Local Subscribers: User 2: sip:<nonnumericBluesipIdent>@bluesip.net
Router: Milkfish: SIP Database: Local Subscribers: User 3: <internalPhoneNumber>
Router: Milkfish: SIP Database: Local Subscribers: Password 1: <terrasipPassWord>
Router: Milkfish: SIP Database: Local Subscribers: Password 2: <bluesipPassWord>
Router: Milkfish: SIP Database: Local Subscribers: Password 3: <internalPassWord>
Router: Milkfish: Advanced DynSIP Settings: DynSIP Domain: <dynsipUserName>.dynsip.org
Router: Milkfish: Advanced DynSIP Settings: DynSIP Update URL: dynsip.org/nic/update
Router: Milkfish: Advanced DynSIP Settings: DynSIP Username: <dynsipUserName>
Router: Milkfish: Advanced DynSIP Settings: DynSIP Password: <dynsipPassWord>
Router: Security: Firewall: SPI Firewall: Enabled
Router: NAT/QoS: Static Port Forwarding: None
Router: NAT/QoS: UPnP: Off
Router: NAT/QoS: QoS: Start QoS: Enabled
Router: NAT/QoS: QoS: Port: WAN
Router: NAT/QoS: QoS: Packet Scheduler: HTB
Router: NAT/QoS: QoS: Uplink: 150 kbps (well-adjusted, critical influence on the audio quality)
Router: NAT/QoS: QoS: Downlink: 2500 kbps (in fact, this value does not reduce the download rate of 3000 kbps)
Router: NAT/QoS: QoS: Optimize for Gaming: No
Router: NAT/QoS: QoS: Service Priority ntp: Premium (currently important)
Router: NAT/QoS: QoS: Service Priority rtp: Premium (talks are possible, even under load)
Router: NAT/QoS: QoS: Service Priority sip: Premium (phone remains registered, even under load)
Router: NAT/QoS: QoS: MAC Priority Grandstream GXP2010 Phone: Premium
Router: NAT/QoS: QoS: Ethernet Port Priority Port1 ... Port4: Exempt, 100M
Router: Admin: Management: Router Management: Overclocking: 216 MHz
Router: Admin: Keep Alive: Schedule Reboot: At a set Time: <routerRebootTime> Everyday
Router: Admin: Commands: Firewall: Line 1: ifconfig vlan1:0 192.168.1.15 netmask 255.255.255.0 (modem access)
Router: Admin: Commands: Firewall: Line 2: iptables -t nat -I POSTROUTING -o vlan1 -d 192.168.1.15/24 -j MASQUERADE (modem access)
Phone: Settings: IP address: Fixed address 10.xxx.yyy.71 (but via DHCP)
Phone: Settings: Time Zone: GMT+1:00
Phone: Settings: Daylight Savings Time Rule: 3,-1,7,2,0;10,-1,7,2,0;60 (verified)
Phone: Settings: Local RTP port: 5004
Phone: Settings: Keep-alive interval: 20 seconds
Phone: Settings: Use NAT IP: <empty>
Phone: Settings: STUN server: <empty>
Phone: Account 1: Account Active: Yes
Phone: Account 1: SIP Server: terrasip.net
Phone: Account 1: Outbound Proxy: <hostname.to.10.xxx.yyy.15>
Phone: Account 1: SIP User ID: <numericTerrasipIdent>
Phone: Account 1: Authenticate ID: <numericTerrasipIdent>
Phone: Account 1: Authenticate Password: <terrasipPassWord>
Phone: Account 1: Name: <givenNameSurName>
Phone: Account 1: Use DNS SRV: Yes
Phone: Account 1: SIP Registration: Yes
Phone: Account 1: Unregister On Reboot: Yes
Phone: Account 1: Register Expiration: 60 minutes
Phone: Account 1: Local SIP port: 5060
Phone: Account 1: SIP Registration Failure Retry Wait Time: 120 seconds (router reboot needs 70 seconds)
Phone: Account 1: NAT Traversal (STUN): No
Phone: Account 1: Send DTMF: via RTP (RFC2833)
Phone: Account 1: SRTP Mode: Disabled
Phone: Account 2: Account Active: Yes
Phone: Account 2: SIP Server: bluesip.net
Phone: Account 2: Outbound Proxy: <hostname.to.10.xxx.yyy.15>
Phone: Account 2: SIP User ID: <nonnumericBluesipIdent>
Phone: Account 2: Authenticate ID: bluesip/<nonnumericBluesipIdent>
Phone: Account 2: Authenticate Password: <bluesipPassWord>
Phone: Account 2: Name: <givenNameSurName>
Phone: Account 2: Use DNS SRV: Yes
Phone: Account 2: SIP Registration: Yes
Phone: Account 2: Unregister On Reboot: Yes
Phone: Account 2: Register Expiration: 60 minutes
Phone: Account 2: Local SIP port: 5062
Phone: Account 2: SIP Registration Failure Retry Wait Time: 120 seconds (router reboot needs 70 seconds)
Phone: Account 2: NAT Traversal (STUN): No
Phone: Account 2: Send DTMF: via RTP (RFC2833)
Phone: Account 2: SRTP Mode: Disabled
Phone: Account 3: Account Active: No
Phone: Account 4: Account Active: Yes
Phone: Account 4: SIP Server: <hostname.to.10.xxx.yyy.15>
Phone: Account 4: Outbound Proxy: <empty>
Phone: Account 4: SIP User ID: <internalPhoneNumber>
Phone: Account 4: Authenticate ID: <internalPhoneNumber>
Phone: Account 4: Authenticate Password: <internalPassWord>
Phone: Account 4: Name: <givenNameSurName>
Phone: Account 4: Use DNS SRV: Yes
Phone: Account 4: SIP Registration: Yes
Phone: Account 4: Unregister On Reboot: Yes
Phone: Account 4: Register Expiration: 60 minutes
Phone: Account 4: Local SIP port: 5066
Phone: Account 4: SIP Registration Failure Retry Wait Time: 120 seconds (router reboot needs 70 seconds)
Phone: Account 4: NAT Traversal (STUN): No
Phone: Account 4: Send DTMF: via RTP (RFC2833)
Phone: Account 4: SRTP Mode: Disabled
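
Note on the QoS uplink value: the load of a single call can be estimated with some rough
arithmetic. The following sketch assumes G.711 (PCMU/PCMA) with 20 ms packets, i.e. 160 payload
bytes at 50 packets per second, IPv4, PPPoE, and 1483 bridged LLC encapsulation without a
preserved Ethernet FCS. None of these values were measured on this line, and the function
names are only for illustration.

```shell
#!/bin/sh
# Rough per-direction bandwidth of one RTP stream (illustrative only).
# Assumed, not measured: G.711 with 20 ms packets (160 bytes, 50 packets/s),
# IPv4, PPPoE, RFC 1483 LLC bridged mode without Ethernet FCS.

# ip_bps PAYLOAD_BYTES PACKETS_PER_SECOND -> bit/s at the IP level
ip_bps() {
    # RTP (12) + UDP (8) + IPv4 (20) headers
    echo $(( ($1 + 12 + 8 + 20) * 8 * $2 ))
}

# atm_bps PAYLOAD_BYTES PACKETS_PER_SECOND -> bit/s at the ATM cell level
atm_bps() {
    ip_len=$(( $1 + 12 + 8 + 20 ))
    # PPP/PPPoE (8) + Ethernet header (14) + LLC/SNAP with padding (10)
    # + AAL5 trailer (8)
    pdu_len=$(( ip_len + 8 + 14 + 10 + 8 ))
    # AAL5 pads the PDU to whole 48-byte cell payloads; each cell is 53 bytes
    cells=$(( (pdu_len + 47) / 48 ))
    echo $(( cells * 53 * 8 * $2 ))
}

echo "IP level:  $(ip_bps 160 50) bit/s"    # 80000
echo "ATM level: $(atm_bps 160 50) bit/s"   # 106000
```

Under these assumptions, one call needs about 80 kbps at the IP level and about 106 kbps on
the ATM line per direction. The router's HTB scheduler most likely counts packets before ATM
encapsulation, so the figures are only indicative when compared with the 150 kbps uplink limit.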

Timetable:
At 0202 hours: Phone is switched off
At 0317 hours: Router reboots
At 0432 hours: Phone is switched on

Observations:
1. Starting from some "cold start point", the system as a whole occasionally behaves as expected.
2. The WAN interface traffic shaping seems to work fine.
3. With respect to the audio quality, the Grandstream GXP2010 is very good.
4. The audio quality suffers severely when the router's WLAN interface comes under load.
   It seems that the router CPU is too slow to deal with both VoIP and WPA2/AES
   key calculations at the same time.
5. If the communication partner runs Linphone on a notebook connected via UMTS 3G,
   the overall communication quality varies between good and unusable. One gains the
   impression that, when the packet loss rate is very high, the codecs PCMU and PCMA give
   better results than all the other available codecs.
6. From the compatibility point of view, Terrasip seems to be an excellent SIP provider.
   The Grandstream phone in debug mode yields strings such as "TerraSIP Advanced Router 1.0.8"
   and "X-Asterisk-HangupCauseCode" (2009-11-12).
7. Switching off the troublesome Grandstream phone keep-alive activities completely changed
   the overall behavior with respect to the Bluesip RTP streams. The system has been running
   untouched for several days. During that period, on every even day, all incoming calls to
   the Bluesip account worked. In contrast, on every odd day, all incoming calls to the
   Bluesip account malfunctioned in the sense that the caller could speak and hear while the
   callee could speak but not hear. I.e., the RTP stream from Bluesip did not arrive at the
   Milkfish proxy on our router.
   In the good cases, syslog entries such as
     rtpproxy[7999]: DBUG:handle_command: received command "UAEI ...@bluesip.net 217.74.179.28 28004 ...;1"
     rtpproxy[7999]: INFO:handle_command: new session ...@bluesip.net, tag ...;1 requested, type strong
     rtpproxy[7999]: INFO:handle_command: new session on a port 48394 created, tag ...;1
     rtpproxy[7999]: INFO:handle_command: pre-filling caller's address with 217.74.179.28:28004
     rtpproxy[7999]: DBUG:doreply: sending reply "48394 10.xxx.yyy.15#012"
     /usr/sbin/openser[9999]: ERROR: extract_body: message body has length zero
     /usr/sbin/openser[9999]: ERROR: force_rtp_proxy2: can't extract body from the message
   appeared. In the bad cases, entries such as
     /usr/sbin/openser[9999]: ERROR: force_rtp_proxy2: no available proxies
     /usr/sbin/openser[9998]: ERROR: extract_body: message body has length zero
     /usr/sbin/openser[9998]: ERROR: force_rtp_proxy2: can't extract body from the message
   appeared. On the other hand, Terrasip-related phone calls worked all the time. The temporal
   behavior of the system suggests a DNS problem. Unfortunately, it is not clear here what
   exactly the software around Dynsip does. Because of the From-Substitution in combination
   with an existing DNS A record, the server dynsip.org attracts all the SIP traffic from the
   SIP provider, playing the role of a SIP relay. So the provider has a constant partner to
   communicate with, which is fine. I can only guess that, if Dynamic SIP is enabled in Milkfish
   on our router, our router also uses the server dynsip.org as a SIP relay, but this is not
   certain.
   But what about the RTP streams? The Milkfish RTP proxy sends directly to the provider, and,
   for performance reasons, the RTP stream from the provider will not pass through dynsip.org
   either. The provider will also send its RTP stream directly to the Milkfish RTP proxy. Now
   the question: how can dynsip.org help us to communicate a change of our router's WAN IP
   address? I have no idea. Why?
   Experience shows that, when dealing with dynamic IP addresses, one has to solve two problems.
   First, the current IP address has to be published, and second, one has to take care of
   caching effects. The second point is harder to tackle, because caching typically happens
   outside of our control. Caching is a DNS behavior, and it therefore has to be controlled by
   means of DNS entries.
   We should try Dyndns; there is no doubt that Dyndns provides first-class services.
   The concept with Dyndns is quite clear. Let us take a look.
     $ nslookup
     >
     >
     > server 204.152.184.76
     Default server: 204.152.184.76
     Address: 204.152.184.76#53
     >
     >
     > set querytype=a
     >
     >
     > <dyndnsHostName>.dyndns.org
     Server:         204.152.184.76
     Address:        204.152.184.76#53
     Non-authoritative answer:
     Name:   <dyndnsHostName>.dyndns.org
     Address: {routrWanAdr}
     >
     >
     > set querytype=soa
     >
     >
     > <dyndnsHostName>.dyndns.org
     Server:         204.152.184.76
     Address:        204.152.184.76#53
     Non-authoritative answer:
     *** Can't find <dyndnsHostName>.dyndns.org: No answer
     Authoritative answers can be found from:
     dyndns.org
             origin = ns1.dyndns.org
             mail addr = hostmaster.dyndns.org
             serial = 973902183
             refresh = 600
             retry = 300
             expire = 604800
             minimum = 600
     >
     >
     > ^D
     $
   As I understand it, answers to queries for <dyndnsHostName>.dyndns.org carry a TTL of
   10 minutes here (judging from the refresh and minimum values of 600 seconds in the SOA
   record; the output above does not show the TTL of the A record itself). If we furthermore
   read http://en.wikipedia.org/wiki/Time_to_live, under the headline "DNS records", it is
   clear what is going on. With a TTL of 10 minutes, Bluesip will not take the liberty of
   caching anything for one day, as is the customary default for static addresses, because it
   is explicitly stated that the answer to the address query for <dyndnsHostName>.dyndns.org
   cannot be assumed to be valid after a short while.
8. Some providers have problems resolving ENUM entries, others do not. Deutsche Telekom, for
   example, resolves Bluesip ENUM entries of the form +4989... without any problem. In Germany,
   one has to dial 089..., as usual. Vodafone, Arcor, and Bluesip behave similarly. A call to
   such an ENUM entry from Spain, operated by Telefónica de España, also works well, where one
   will, of course, always dial 004989... In this case the question is who handles the call for
   Telefónica in Germany, i.e., who does this job so well.
   1&1, O2, et cetera also resolve the Bluesip ENUM entries +4989... without difficulty,
   although some providers need 004989... while others are content with 089...
   On the other hand, Sipgate often fails to resolve the above ENUM entries.
   In many cases, the first try is unsuccessful while a second or third one works. Sipgate seems
   to accept both the 089... and the 004989... dial-in variants (2009-11-12). Furthermore, the
   operator in Germany working for the operator behind www.espantel.com does not perform
   any queries at all to resolve a Bluesip ENUM entry +4989... in Germany. So ENUM is clearly
   still an issue.
9. The following are five selected and anonymized syslog entries, generated by the
   Grandstream phone to log the SIP traffic belonging to an incoming call.
     a) INVITE sip:{userIdent}@10.xxx.yyy.71:5062;transport=udp SIP/2.0
          Record-Route: <sip:10.xxx.yyy.15;r2=on;ftag={tag1};lr=on>
          Record-Route: <sip:{routrWanAdr};r2=on;ftag={tag1};lr=on>
          Record-Route: <sip:{sipProviderAddressA};ftag={tag1};lr=on>
          Via: SIP/2.0/UDP 10.xxx.yyy.15;branch={branch1}
          Via: SIP/2.0/UDP {sipProviderAddressA};branch={branch2}
          Via: SIP/2.0/UDP {sipProviderAddressB}:5060;branch={branch3};rport=5060
          User-Agent: SIP provider PSTN GW
          ...
     b) SIP/2.0 200 OK
          Via: SIP/2.0/UDP 10.xxx.yyy.15;branch={branch1}
          Via: SIP/2.0/UDP {sipProviderAddressA};branch={branch2}
          Via: SIP/2.0/UDP {sipProviderAddressB}:5060;branch={branch3};rport=5060
          Record-Route: <sip:10.xxx.yyy.15;r2=on;ftag={tag1};lr=on>
          Record-Route: <sip:{routrWanAdr};r2=on;ftag={tag1};lr=on>
          Record-Route: <sip:{sipProviderAddressA};ftag={tag1};lr=on>
          User-Agent: Grandstream GXP2010 1.2.2.14
          ...
     c) ACK sip:{userIdent}@10.xxx.yyy.71:5062;transport=udp SIP/2.0
          Record-Route: <sip:10.xxx.yyy.15;r2=on;ftag={tag1};lr=on>
          Record-Route: <sip:{routrWanAdr};r2=on;ftag={tag1};lr=on>
          Via: SIP/2.0/UDP 10.xxx.yyy.15;branch=0
          Via: SIP/2.0/UDP {sipProviderAddressA};branch=0
          Via: SIP/2.0/UDP {sipProviderAddressB}:5060;branch={branch4};rport=5060
          Route: <sip:{routrWanAdr};r2=on;ftag={tag1};lr=on>,
                 <sip:10.xxx.yyy.15;r2=on;ftag={tag1};lr=on>
          User-Agent: SIP provider PSTN GW
          ...
     d) BYE sip:{phoneNumber}@{sipProviderAddressB} SIP/2.0
          Via: SIP/2.0/UDP 10.xxx.yyy.71:5062;branch={branch5}
          Route: <sip:10.xxx.yyy.15;r2=on;ftag={tag1};lr=on>
          Route: <sip:{routrWanAdr};r2=on;ftag={tag1};lr=on>
          Route: <sip:{sipProviderAddressA};ftag={tag1};lr=on>
          User-Agent: Grandstream GXP2010 1.2.2.14
          ...
     e) SIP/2.0 200 OK
          Via: SIP/2.0/UDP 10.xxx.yyy.71:5062;branch={branch5}
          Record-Route: <sip:{routrWanAdr};r2=on;ftag={tag2};lr=on>
          Record-Route: <sip:10.xxx.yyy.15;r2=on;ftag={tag2};lr=on>
          User-Agent: SIP provider PSTN GW
          ...
   Remember that not all the messages of the SIP dialogue are recorded here, and that the
   records are truncated. In a), b), c), and d), all the lines containing the substring
   {routrWanAdr} had better disappear. The Grandstream phone does not have to know
   {routrWanAdr}. In e), it seems that something went wrong. The line with the substring
   10.xxx.yyy.15 should appear before the line with {routrWanAdr}, and one may ask where
   the "200 OK" message has come from. In any case, it seems that SIP messages use
   UUCP-style bang paths, which is a pity.
   The following fragment shows, just as food for thought, how a brute-force address
   substitution could be performed, based on regexps; not too nice, of course.
     cat fileNameIn                                                                                          \
       | sed -e 's![:=@ ]0*AAA\.0*BBB\.0*CCC\.0*DDD[; :]!{{{{{&}}}}}!g'                                      \
             -e 's,{{{{{.,&{{{{{,g'                                                                          \
             -e 's,.}}}}},}}}}}&,g'                                                                          \
             -e 's,{{{{{[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*}}}}},{{{{{aaa.bbb.ccc.ddd}}}}},g' \
             -e 's,{{{{{,,g'                                                                                 \
             -e 's,}}}}},,g'                                                                                 \
             -e 's![:=@ ]0*AAA\.0*BBB\.0*CCC\.0*DDD$!{{{{{&}}}}}!'                                           \
             -e 's,{{{{{.,&{{{{{,'                                                                           \
             -e 's,{{{{{[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*}}}}},{{{{{aaa.bbb.ccc.ddd}}}}},'  \
             -e 's,{{{{{,,g'                                                                                 \
             -e 's,}}}}},,' > fileNameOut
   AAA.BBB.CCC.DDD and aaa.bbb.ccc.ddd are the two essential IP addresses of the router.
   The procedure is quite simple, and there is a small risk of missing something that needs rewriting.
   Another record, the "200 OK" from the REGISTER dialogue, shows that there was no rewriting either.
     SIP/2.0 200 OK
       Via: SIP/2.0/UDP 10.xxx.yyy.71:5062;branch=...
       CSeq: 20011 REGISTER
       Contact: <sip:{userIdent}@{routrWanAdr}:5060;transport=udp>;q=0.5;expires=3600
       Warning: 392 {sipProviderAddressB}:5060 "Noisy feedback tells: ... req_src_ip={routrWanAdr} ..."
       ...
   It also shows that the provider stores the router's WAN address, which is not the best sign either.
A. 2009-11-13, 2009-11-15: The following message sequences appeared in the syslog:
     /usr/sbin/openser[9999]: ERROR: sip_msg_cloner: cannot allocate memory
     /usr/sbin/openser[9999]: ERROR: new_t: out of mem:
     /usr/sbin/openser[9999]: ERROR: t_newtran: new_t failed
     /usr/sbin/openser[9999]: ERROR: sl_reply_error used: I'm terribly sorry, server error occurred (1/SL)
   As a consequence, the subsequent Grandstream phone SIP registration attempts failed with 500.
   So far, the problem has not appeared again.
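
Note on observation 8: an ENUM lookup maps an E.164 number to a domain name under e164.arpa
by reversing the digits (RFC 6116) and then queries that name for NAPTR records. The following
sketch only builds the domain name. The number used is made up, and enum_domain is a
hypothetical helper, not part of any provider's tooling.

```shell
#!/bin/sh
# enum_domain NUMBER -> ENUM domain name for an E.164 number (RFC 6116).
# NUMBER must be in international form, e.g. +4989..., not 089...
enum_domain() {
    digits=$(printf '%s' "$1" | tr -cd '0-9')   # drop "+", spaces, dashes
    name="e164.arpa"
    while [ -n "$digits" ]; do
        first=${digits%"${digits#?}"}   # first remaining digit
        digits=${digits#?}              # remaining digits
        name="$first.$name"             # prepending reverses the order
    done
    printf '%s\n' "$name"
}

# Made-up number, for illustration only:
enum_domain "+49 89 1234567"   # 7.6.5.4.3.2.1.9.8.9.4.e164.arpa
```

Assuming dig is available, a query such as "dig NAPTR <name>" for the resulting name would
then show whether an ENUM entry is published at all, independently of the calling provider.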

TODO:
1. Set up a second WRT54GL v1.1 as a WLAN AP bridge, so that WLAN can be disabled on the main router.
   This will probably solve the performance problem described under observation 4 and
   probably the memory problem described under observation A, too.
2. Ask the people who set up the server dynsip.org whether the above configuration is compatible
   with their ideas of how to use dynsip.org.
3. Ask the people who run the proxy dynsip.org for some useful debugging on dynsip.org, to find
   out why the RTP streams do not arrive at the Grandstream phone, as described under
   observation 7.
4. Ask in the forum whether there might be a memory leak, or whether the Linksys
   WRT54GL v1.1 with its 16 MB of RAM is somewhat too small for all of the above.
5. Check out Dyndns instead of Dynsip.
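
Note on TODO 5: when checking Dyndns, the TTL that a resolver actually hands out for the
A record can be read directly from the answer. According to RFC 2308, the SOA minimum field
is the TTL for negative answers, so it need not match the TTL of the A record. A minimal
sketch, assuming dig is installed; answer_ttl is a hypothetical helper, and the sample line
below is invented.

```shell
#!/bin/sh
# answer_ttl: read "dig +noall +answer" lines on stdin and print the TTL
# (second column) of every A record.
answer_ttl() {
    awk '$3 == "IN" && $4 == "A" { print $2 }'
}

# Intended use (needs network access, so it is left commented out):
#   dig +noall +answer A "<dyndnsHostName>.dyndns.org" | answer_ttl
# Repeating the query against the same caching resolver should show the
# TTL counting down and then jumping back to its full value.

# Offline demonstration with a made-up answer line (192.0.2.1 is a
# documentation address, 60 is an invented TTL):
printf '%s\n' 'host.example.org. 60 IN A 192.0.2.1' | answer_ttl   # 60
```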

Sat, 21 Nov 2009 22:10:19 +0100
Stephan K.H. Seidl