DNS Troubleshooting

Problem 

During automated provisioning by vRealize Automation, DNS changes may not be picked up by the workflow executing in vRO.

Cause

Due to replication delays in your DNS infrastructure or due to caching at various points in the DNS lookup process, it may be necessary to tweak replication intervals, cache TTLs or wait times in order to successfully provision your VMs.

Affected Versions

  • All Versions


Solution


Depending on the complexity of your DNS topology you may need to make changes in different places.

These changes fall into three groups: caching TTLs, replication timers and wait timers.


Caching

Step 1. Disable caching in Java VM

There is a DNS cache built into the Java virtual machine and it does not respect DNS TTLs sent from your DNS infrastructure. It is important to disable this cache to ensure that updates or new entries are seen by workflows running in Orchestrator.

On each vRO appliance, connect over SSH as user root (e.g. using PuTTY).

Update the file
# sed -i.bak -e 's/networkaddress.cache.negative.ttl=10/networkaddress.cache.negative.ttl=0/' -e 's/#networkaddress.cache.ttl=-1/networkaddress.cache.ttl=0/' /usr/java/jre-vmware/lib/security/java.security

Check settings
# grep networkaddress /usr/java/jre-vmware/lib/security/java.security
networkaddress.cache.ttl=0
networkaddress.cache.negative.ttl=0

Restart Orchestrator
# service vco-server restart
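
The sed command above only changes lines that match the expected defaults exactly; if the file on your appliance differs, sed silently leaves it untouched. As a rough sketch (the function name check_jvm_dns_cache is made up for this example, and it assumes that the last uncommented assignment of each property is the effective one), the result can be checked like this:

```shell
#!/bin/sh
# Hypothetical helper: report whether both JVM DNS cache TTLs are set to 0
# in the given java.security file. Commented-out lines are ignored.
check_jvm_dns_cache() {
    file="$1"
    # Keep only the last uncommented assignment of each property.
    ttl=$(sed -n 's/^networkaddress\.cache\.ttl=//p' "$file" | tail -n 1)
    neg=$(sed -n 's/^networkaddress\.cache\.negative\.ttl=//p' "$file" | tail -n 1)
    if [ "$ttl" = "0" ] && [ "$neg" = "0" ]; then
        echo "ok"
        return 0
    fi
    echo "not disabled: cache.ttl=${ttl:-unset} cache.negative.ttl=${neg:-unset}"
    return 1
}
```

For example, run check_jvm_dns_cache /usr/java/jre-vmware/lib/security/java.security before restarting Orchestrator. If it does not print "ok", edit the two properties by hand.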


Step 2. Verify TTL for negative cache response (Optional)

If a workflow running on the vRO appliance, or another system, attempts to resolve the A record or PTR record before the record has propagated, the DNS resolver may cache the negative response, i.e. "the record does not exist". The length of time that this negative response is held in downstream caches is determined as follows:

The TTL of the negative response is the minimum of the MINIMUM field of the SOA record and the TTL of the SOA record itself.

You may want to consider decreasing the SOA record minimum TTL for the DNS zone where the record is being added. For example for Windows DNS, the default for a zone is 15 minutes.
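
To see what negative TTL a zone currently hands out, the SOA record can be inspected with dig. The sketch below is illustrative only: soa_negative_ttl is a made-up helper name, and it assumes a single-line SOA answer as printed by dig +noall +answer, where field 2 is the record TTL and field 11 is MINIMUM.

```shell
#!/bin/sh
# Hypothetical helper: read one SOA answer line on stdin and print the
# negative-cache TTL, i.e. the smaller of the SOA TTL and the MINIMUM field.
soa_negative_ttl() {
    awk '$4 == "SOA" {
        ttl = $2 + 0      # TTL of the SOA record itself
        min = $11 + 0     # MINIMUM field (last field of the SOA data)
        print (ttl < min ? ttl : min)
        exit
    }'
}
```

Usage, with placeholder zone and server names: dig +noall +answer SOA example.com @ns1.example.com | soa_negative_ttl. Query an authoritative server for the zone rather than a caching resolver, since a resolver reports a TTL that has already been counting down.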

Replication

Considerations for Active Directory integrated DNS

  1. AD replication intervals, intra-site and inter-site:
    1. Consider enabling change notifications for remote sites
    2. or consider adding an additional SovLabs endpoint for each site so that DNS changes can be made local to each site
  2. Windows DNS server DsPollingInterval, which determines how frequently changes are read from AD-integrated DNS zones.

Separate Resolver and Authoritative Servers

e.g. BIND resolvers and Windows DNS for authoritative zones

  1. Tuning of negative TTL behaviour is very important:
    1. settings in the zone's SOA record
    2. minimum negative cache TTLs on intermediate caches

Wait before DNS validation

To avoid records being negatively cached at all, it is possible to inject a sleep time into the provisioning process to allow the records to propagate. Whilst this is not fail-safe against anomalous propagation times (disabling JVM caching is essential to improve reliability), it can be a useful first step that does not require a change to the DNS infrastructure.
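
When choosing a sleep value, it may help to measure how long a new record actually takes to appear on a given DNS server. The following is a sketch of one way to do that, not part of the SovLabs product; wait_for_record is a hypothetical helper that retries any lookup command until it prints an answer.

```shell
#!/bin/sh
# Hypothetical helper: retry a lookup command until it prints a non-empty
# answer or the attempts run out.
# Usage: wait_for_record MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_for_record() {
    attempts="$1"; delay="$2"; shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        answer=$("$@" 2>/dev/null)
        if [ -n "$answer" ]; then
            echo "$answer"
            return 0
        fi
        # Sleep between attempts, but not after the final one.
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

For example, with placeholder names: time wait_for_record 24 5 dig +short A newvm.example.com @dc2.example.com. Pointing dig at a specific authoritative server (here a second domain controller) measures replication to that server and, because the query bypasses the caching resolvers, it should not leave a negative entry in their caches.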

Set Pre DNS Validation sleep seconds

Set the number of seconds to pause before validating DNS entries. 

  1. Log in to the vRA tenant
  2. Click on the Design tab > Blueprints
  3. Hover over the desired blueprint name and click Edit
    1. Click on the vSphere machine component on the Blueprint Design Canvas
    2. Click on the Properties tab
    3. In the Custom Properties section:
      1. Click on the New Property button
      2. Name field: Type in SovLabs_preDnsValidationSleepSeconds
      3. Value field: Type the number of seconds to sleep
      4. Click on the  button
    4. Click OK