
12/6/2016

Vantio CacheServe 7.1


Nominum Caching Name Server

20161205-07:20

Course Overview

1. Introduction to DNS and the Caching Name Server


2. Basic CacheServe configuration using nom-tell
3. Operations Topics
– Statistics and Understanding Resource use
– Rate-Limiting, Events and the SNMP Agent
– Cache Poisoning: Vulnerabilities and Defenses
4. RTV and RTA (Real-Time Visibility/Alerts)
5. Precision Policies
– resolution customization
– rate-limiting against amplification attacks
6. Special Topics
– ECS and Equivalence Classes
– IPv6 and DNS64 support
– Perl API for Command Channel


1. Introduction to CacheServe

• DNS refresher
– Model and Implementation
• Key/Value Database
– Key: Domain Name and Type
– Value: Resource Record Data
• Distributed and Hierarchical
– Using dig
• Caching and its place in the DNS
• Performance and Security

Types of DNS Resource Records

Type   Descriptive Name    Example of RDATA
A      IPv4 address        192.168.0.1
AAAA   IPv6 address        2001:4860:4001:801::1012
PTR    Pointer (Reverse)   c-98-234-218-128.hsd1.ca.comcast.net.
CNAME  Canonical Name      www-cctld.l.google.com.
MX     Mail Exchanger      20 mx3.correodeempresas.telefonica.es.
NS     Name Server         nsjc8hos01.telefonica-data.com.
SOA    Start of Authority  nsjc8hos01.telefonica-data.com. dnsadmin.tsai.es. 2012031502 86400 7200 2592000 300


DNS Name Space

[Figure: DNS name space tree. Root: "". Top level: gov, edu, us, fr, arpa, com. Second level: nominum, att, stanford, princeton, ca, nj, in-addr, e164. Lower levels: bug, www, ddns; udp, mypc.]

Domain Name is a sequence of labels: www.nominum.com.
Domain is a node and all its descendants: nominum.com.
Zone is domain information controlled by a name server: ddns.nominum.com.
Zones are distinguished by SOA records.

Resolving a domain name


Source: http://www.ripe.net/training/dnssec/material/slides/page7.htm


Functions of Name Servers

• Two fundamentally different activities


– Authoritative Service is data publishing
– Caching Service is data fetching

Overview of CacheServe

• High performance caching-only name server


– low latency
– high throughput
• Built for mission-critical operations with
Nominum proprietary code
– efficient
– secure
• Supports custom policy actions which leverage
DNS as a control point (precision policies)


Typical Deployment Model

[Figure: Vantio CacheServe Name Server and AuthServe Name Server]

Caching (Vantio CacheServe)
– Handles lookups from (“stub resolver”) clients
– Gets data from Authoritative servers
– “Updates” handled by TTL

Authoritative (AuthServe)
– Performs no recursion
– Data maintained locally
– “Updates” handled by local admin, AXFR/IXFR or DDNS

Conceptual Foundations
• Configuration
– Server
– View & View-Selector
– Resolver
• Cache
– Contents
– Size
• Diagnostics
– Statistics
– Logging

[Figure: Vantio CacheServe. Inbound query processing; in-memory cache; outbound lookup processing.]


Server Communication: Command Channel

• When a Nominum server process starts up, it begins
listening on a TCP port for commands.
• This is the process’s Command Channel (CC).
• Nominum programs that listen on a CC include CacheServe,
AuthServe, the Statistics Monitor (statmon), the SNMPAgent,
and the Nanny.

[Figure: AuthServe process listening on TCP/9253; CacheServe process listening on TCP/9434]

Command Channel Usage

• Programs communicate with each other through CCs.


• For example, an SNMP Agent sending to AuthServe or
CacheServe.
• The CC is also used by the main Nominum administration
tool, nom-tell, to configure servers.
• Additionally, Nominum provides a CC SDK which supports
scripting with Perl, Python, and Java.
• The SDK is downloaded separately.


Command Channel Usage

If the server (e.g. AuthServe or CacheServe) is not running, its CC cannot be used
to access or modify the configuration.

The tools ans_dumpconf, ans_editconf, cacheserve-dumpconf and
cacheserve-editconf provide access to a stopped server’s configuration.

CacheServe:
# nom-tell cacheserve
cacheserve> version
CacheServe 7.1.1.0

AuthServe:
# nom-tell ans
ans> version
ANS 5.4.3

[Figure labels: SNMP Agent; import com.nominum.cc]

Command Channel Security

• For security, by default, CCs only listen on 127.0.0.1.


• They can be configured to listen on any address.
• For both local and remote communication, a server requires
a matching shared secret before accepting commands.
• A shared secret can be provided as a
command line argument, but is most
commonly read from the Nominum
file: /etc/channel.conf

# ls -lg /etc/channel.conf
-rw------- 1 root 324 Jun 7 11:06 /etc/channel.conf


/etc/channel.conf

• When a Nominum server is installed, it appends an entry to:


/etc/channel.conf
# grep 'cacheserve ' /etc/channel.conf
cacheserve 9434 fQQwjICextoJOSOh/ekj8JxWvylNf8wG2THIVExP6+KsVofE

Fields of the entry, left to right:
– The service name. The service name can be changed.
– The TCP port.
– The shared secret is a text string. It can be set to anything.

Both servers and clients read this file when starting. They
discover the port (and for remote access, the IP address) as well as the shared
secret.

If nom-tell on one host is used to connect to several remote CacheServes,
the service names in /etc/channel.conf need to be unique. For
example, CS7-1, CS7-2, etc.
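
A minimal Python sketch of reading such an entry follows. It assumes the three whitespace-separated fields shown in the grep output (service name, TCP port, shared secret). The names ChannelEntry and parse_channel_entry are hypothetical, and entries for remote access, which also carry an IP address, are not handled because their layout is not shown here.

```python
from typing import NamedTuple


class ChannelEntry(NamedTuple):
    service: str  # CC service name, e.g. "cacheserve"
    port: int     # TCP port the server listens on
    secret: str   # shared secret (opaque text string)


def parse_channel_entry(line: str) -> ChannelEntry:
    """Parse one 'service port secret' line (assumed three-field layout)."""
    fields = line.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}")
    service, port, secret = fields
    return ChannelEntry(service, int(port), secret)


# The sample entry from the slide.
entry = parse_channel_entry(
    "cacheserve 9434 fQQwjICextoJOSOh/ekj8JxWvylNf8wG2THIVExP6+KsVofE"
)
print(entry.service, entry.port)
```

Because the file holds the shared secret, any script that reads it needs the same restrictive permissions as the file itself.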

BIND Configuration Overview

BIND configuration
– 1 or more text files
• Global statements
• View-specific statements
– Options for view
– Zone definitions
– Zone configuration
• Hints in zone file

[Figure: named.conf containing listen-on 1.2.3.4 and view "world" { match-clients; zone1: hints, root.db; zone2: forward, forwarders; zone3: stub, stub-config, zone file }]


CacheServe Configuration Overview

CacheServe configuration:
– 1 logical “database”
– Several disk files
• Server object
– unique
• Resolver object
– Cache and resolution instructions
• Use nom-tell to inspect and configure

[Figure labels: cacheserve; vdb2]

CacheServe Features
• Multi-core support
• Cache
– Read/Control the cache (inspect, dump / flush)
– Shared cache with “resolver” object
• Customized resolution with Precision Policies
– Rate-Limit or black-hole clients
– Drop specific queries (mitigate DNS amplification attack)
– Manipulate answers with preferred address sorting
• Layered Resistance to attack
– Glue Segregation
– Conservative caching
– Spoofing defense
• window contraction
• Attack avoidance
• Statistics
– Server/Resolver levels
– Real Time Visibility (aka RTV or Querystore)
• Events, Real Time Alerts (aka Querythreshold) and SNMPAgent
• DNSSEC support


Summary

• Described motivation for caching-only server


• Introduced Vantio CacheServe
– Internals
– Associated Systems
• Outlined features

2. Basic CacheServe Operation

• CacheServe configuration basics


• Using nom-tell
• Out-of-the-box behavior
• Global and more specific statements
• How to start and stop CacheServe


Unpacking

• CacheServe is distributed in the local package format
of the operating system (e.g. RPMs for Red Hat).
– A tar file contains the package files, READMEs, etc.
• Packages:
Install first.
– Nominum utilities
– Nominum TimeZone Data
– CacheServe
– Optional to install: Nanny, SNMP Agent, Statmon
• Read instructions in the INSTALL file.
• Complete documentation shipped in PDF format.

Installing on Red Hat

• As root, use rpm to add the Nominum Utilities,


TimeZone, CacheServe and optional packages:
# rpm -ivh nomutils-X.Y-nn.rpm
# rpm -ivh nom-timezone-data-X.Y-nn.rpm
# rpm -ivh cacheserve-X.Y-nn.rpm

X.Y are current version numbers


nn is Nominum’s build number


Key Files & Directories

• Everything for CacheServe is found under the directories:
/etc/ /usr/local/nom/ /var/nom/cacheserve/

/etc/
– /etc/channel.conf
– /etc/init.d/cacheserve

/usr/local/nom/
– /usr/local/nom/etc/cacheserve.license: the license file.
– /usr/local/nom/etc/sysconfig/cacheserve: the file contains arguments for starting cacheserve.
– sbin/: the cacheserve executable is here.
– man/: the man pages are installed here.
– Further directories found here are more important for other Nominum servers.

/var/nom/cacheserve/
– The default location of the CacheServe database files.

License File

• Product and features encoded in key


• Lifetime determined by expiration date
• Create /usr/local/nom/etc/cacheserve.license
# cat cacheserve.license
product = cacheserve
customerid = 306
reqid = 11
created = "2016-03-25 15:43:26"
customer_name = "Nominum Training"
expires = "2016-07-31 23:59:59"
limits = ((concurrency 2))
uuid = "daefcbf2-a424-4e2e-84e2-e1081ddbde41"
--
CCgbsBYoCsGkWWqAB+8dftDhKW1pWB5ZzPrPbdsbaMKbyVqF1y9T1gc=


AuthServe & CacheServe Activities: Syslog

• Activities in AuthServe & CacheServe:
– are logged in syslog.
– can appear over a CC (covered elsewhere).
– can be sent as SNMP traps (covered elsewhere).
• Default: syslog messages land in /var/log/messages

AuthServe example:
# tail -f /var/log/messages
Aug 23 16:52:16 CentOS6 ANS[1351]: info: default/p2.nominum.com (master): added
Aug 23 16:52:16 CentOS6 ANS[1351]: info: default/p2.nominum.com (master): modified (content)
Aug 23 16:52:16 CentOS6 SNMPAgent[1368]: warning: nom_splaytree_insert: exists
Aug 23 16:52:16 CentOS6 SNMPAgent[1368]: warning: nom_splaytree_insert: exists
Aug 23 16:54:05 CentOS6 ANS[1351]: info: default/non.existant.example.org (<none>): added
Aug 23 16:54:13 CentOS6 ANS[1351]: error: maintenance: default/non.existant.example.org (192.0.2.9#53): too many SOA query retransmits
<OUTPUT SUPPRESSED>

High Availability: nanny

• Nominum servers are designed for high availability


and should not crash.
• As a backup, a watchdog system monitors and
restarts a process should it crash.
• The nanny is an optional independent watchdog
process for all Nominum servers.

[Figure: the nanny watches cacheserve (which contains the Auto-Nanny), statmon, snmpagent, and other Nominum processes.]


High Availability: Auto-Nanny

• A newer watchdog, the auto-nanny, is built into the


CacheServe process.
• Currently, it is standard to use the classic nanny to
start CacheServe, which runs the auto-nanny, so
there is a double watchdog.


Nanny Operations

• Both nanny systems work as a parent process for the


process(es) they are watchdogging.
• If a child exits with a non-zero status, the nanny
restarts the process.
# ps -ef | egrep "nanny|cacheserve " | grep -v egrep
root 9000 11810 0 03:42 ? 00:00:00 /usr/local/nom/sbin/cacheserve -F
root 9003 9000 0 03:42 ? 00:00:00 /usr/local/nom/sbin/cacheserve -F
root 11810 1 0 Jun07 ? 00:00:00 nom-nanny: nanny (running)

Process 11810 is the classic nom-nanny.
11810 started cacheserve, process 9000.
9000 took on the role of the auto-nanny, and started 9003,
the actual cacheserve process working as a DNS server.

Should 9000 exit, 11810 will be notified.
Should 9003 exit, 9000 will be notified.
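
The parent chain described above can be traced from the ps output itself. The following Python sketch is illustrative only: it assumes the standard `ps -ef` column order (UID, PID, PPID, C, STIME, TTY, TIME, CMD) and uses the three lines from the slide as sample input. The function names are hypothetical.

```python
PS_OUTPUT = """\
root 9000 11810 0 03:42 ? 00:00:00 /usr/local/nom/sbin/cacheserve -F
root 9003 9000 0 03:42 ? 00:00:00 /usr/local/nom/sbin/cacheserve -F
root 11810 1 0 Jun07 ? 00:00:00 nom-nanny: nanny (running)
"""


def parent_map(ps_text: str) -> dict[int, int]:
    """Map each PID to its parent PID (columns 2 and 3 of `ps -ef`)."""
    parents = {}
    for line in ps_text.splitlines():
        fields = line.split(None, 7)  # UID PID PPID C STIME TTY TIME CMD
        if len(fields) == 8:
            parents[int(fields[1])] = int(fields[2])
    return parents


def ancestry(pid: int, parents: dict[int, int]) -> list[int]:
    """Follow parent links from pid until reaching a PID not in the listing."""
    chain = [pid]
    while chain[-1] in parents:
        chain.append(parents[chain[-1]])
    return chain


chain = ancestry(9003, parent_map(PS_OUTPUT))
print(" <- ".join(str(p) for p in chain))
```

Reading the chain from left to right gives the working cacheserve process, the auto-nanny, the classic nanny, and finally the nanny's own parent.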


Starting The Nanny

• With a standard installation, both the nanny and the
server (CacheServe, AuthServe) start automatically
on boot.
• If they are not running, such as just after installation,
they can be started manually.
# service nanny start
Starting nanny: /usr/local/nom/sbin/nom-nanny: info: listening for commands on 127.0.0.1#9449
<OUTPUT SUPPRESSED>

# ps -ef | egrep "nanny|cacheserve " | grep -v egrep


root 9523 1 0 06:11 ? 00:00:00 nom-nanny: nanny (running)

Starting CacheServe

• The recommended procedure is to run CacheServe
under the nanny.
• The CacheServe startup script detects whether the nanny is
running.
• CacheServe starts properly whether or not the nanny is
running.
# service cacheserve start
Starting cacheserve: [ OK ]

# ps -ef | egrep "nanny|cacheserve " | grep -v egrep


root 9523 1 0 06:11 ? 00:00:00 nom-nanny: nanny (running)
root 9555 9523 0 06:16 ? 00:00:00 /usr/local/nom/sbin/cacheserve -F
root 9558 9555 2 06:16 ? 00:00:00 /usr/local/nom/sbin/cacheserve -F


Stopping CacheServe

• Running servers under the nanny has the advantage
that shutting the nanny down stops all the servers.
# ps -ef | egrep '9673|cacheserve' | grep -v egrep
root 9673 1 0 06:21 ? 00:00:00 nom-nanny: nanny (running)
root 9687 9673 0 06:21 ? 00:00:00 snmpagent: subagent (running)
root 9703 9673 0 06:21 ? 00:00:00 cacheserve-statmon: running
root 9727 9673 0 06:21 ? 00:00:00 /usr/local/nom/sbin/cacheserve -F
root 9730 9727 0 06:21 ? 00:00:00 /usr/local/nom/sbin/cacheserve -F

# service nanny stop


Stopping nanny: [ OK ]

# ps -ef | egrep '9673|cacheserve' | grep -v egrep
#

Process 9673 is the nom-nanny, and parent of the cacheserve,
statmon and snmpagent processes.

nom-tell

• nom-tell is the main tool for administering Nominum servers.
• It has an interactive and a non-interactive mode.
• It is similar to BIND’s rndc, but offers more features and has
a far more capable interactive mode.
• It takes a CC service name to find the process to connect with.

Three examples of running nom-tell. The first two are
interactive, the last non-interactive.

# nom-tell cacheserve
nom-tell 16.1.0.0, interactive mode
cacheserve>

# nom-tell statmon
nom-tell 16.1.0.0, interactive mode
statmon>

# nom-tell snmpagent process-information
<OUTPUT SUPPRESSED>

Note that the interactive prompt matches the CC service name from
the command.

The command was originally known as nom_tell, but has
been changed to nom-tell. Currently, both command
names are supported.


Using nom-tell

• Simple instructions over Command Channel
– version
– process-information
– stop
• Modify all aspects of configuration
<object>.<method> field=value
– Fields can be listed in any order
– Incremental syntax (+=) appends list items

Running nom-tell Non-Interactively

nom-tell can be used from the command line non-interactively
by providing full commands. (If the “n” was left off “version”, the
command would fail.)

It is useful for scripting, and for output that can be piped into
command line filters (e.g. grep), but it is challenging to use by
hand.

# nom-tell cacheserve version
request:
{
type => 'version'
}

response:
{
type => 'version'
vendor => 'Nominum'
product => 'Vantio CacheServe'
platform => 'rhel-6-x86_64'
version => '7.1.0.1'
build => '0'
expiration => 'Sun Jul 31 16:59:59 2016'
}

To reduce the output, specific fields can be selected.

# nom-tell -F vendor cacheserve version
Nominum

The examples are shown with CacheServe.
nom-tell works identically with AuthServe and other Nominum
products.
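
When a script needs several fields at once, the response block can be parsed instead of calling nom-tell once per field. The following Python sketch is a rough illustration, assuming the flat `key => 'value'` layout shown in the version response. Nested blocks and list values (such as the license block in other commands) are deliberately not handled, and parse_flat_block is a hypothetical helper, not part of any Nominum SDK.

```python
import re

RESPONSE = """\
{
type => 'version'
vendor => 'Nominum'
product => 'Vantio CacheServe'
platform => 'rhel-6-x86_64'
version => '7.1.0.1'
build => '0'
expiration => 'Sun Jul 31 16:59:59 2016'
}
"""

# One "key => 'value'" pair per line; nested or list values are skipped.
FIELD = re.compile(r"^\s*([\w-]+)\s*=>\s*'(.*)'\s*$")


def parse_flat_block(text: str) -> dict[str, str]:
    """Collect the simple quoted fields of one response block."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


info = parse_flat_block(RESPONSE)
print(info["product"], info["version"])
```

For a single field, the -F option shown above remains the simpler choice.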


Running nom-tell Interactively

nom-tell cacheserve without providing a command
starts interactive mode.

# nom-tell cacheserve
nom-tell 3.1.1.1, interactive mode
cacheserve>
cacheserve>
cacheserve> version
{
type => 'version'
vendor => 'Nominum'
product => 'Vantio CacheServe'
platform => 'rhel-6-x86_64'
version => '7.1.0.1'
build => '0'
expiration => 'Sun Jul 31 16:59:59 2016'
}

By default, the prompt is cacheserve>

A command is always repeated as “type” as part of the output.

Command line use is most commonly interactive.

CC Polling Command: version

The commands available with nom-tell are what the specific
server accepts.

version is common to all servers.

# nom-tell cacheserve version
request:
{
type => 'version'
}

response:
{
type => 'version'
vendor => 'Nominum'
product => 'Vantio CacheServe'
platform => 'rhel-6-x86_64'
version => '7.1.0.1'
build => '0'
expiration => 'Sun Jul 31 16:59:59 2016'
}


CC Polling Command: process-information

process-information is common to all servers, but the output is
product specific.

cacheserve> process-information
{
type => 'process-information'
arguments => ('/usr/local/nom/sbin/cacheserve' '-F')
pid => '8715'
current-time => '1465554112.603118'
start-time => '1465551870.080224'
host-name => 'training1.nominum.com'
working-directory => '/var/nom/cacheserve'
node-id => 'dafff0c3-054b-5d19-b994-4d23fe5d70f2'
license => {
product => 'cacheserve'
customerid => '306'
<OUTPUT SUPPRESSED>

Callouts:
– arguments, pid: The command line arguments that started the server, and the process identifier.
– current-time, start-time: Time values are shown in UNIX Time. Later slides show how to convert these values (date -d %).
– working-directory: The location of the database files.
– license: License information from the license file.
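
As an alternative to the date command, the two UNIX Time values can be converted with the Python standard library. This sketch uses the current-time and start-time values from the process-information output above. Displaying them in UTC is a choice made here for illustration; the server reports plain seconds since the epoch.

```python
from datetime import datetime, timezone

# Values copied from the process-information output on the slide.
current_time = float("1465554112.603118")
start_time = float("1465551870.080224")

# Convert seconds-since-epoch to timezone-aware datetimes (UTC chosen here).
started = datetime.fromtimestamp(start_time, tz=timezone.utc)
now = datetime.fromtimestamp(current_time, tz=timezone.utc)

# The difference between the two values is the server's uptime in seconds.
uptime_seconds = current_time - start_time

print("started:", started.isoformat(timespec="seconds"))
print("now    :", now.isoformat(timespec="seconds"))
print("uptime :", round(uptime_seconds), "seconds")
```

Subtracting start-time from current-time is a quick way to confirm whether a restart actually took place.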

CC Process Control: stop and restart

# nom-tell -F pid cacheserve process-information
8715

# nom-tell cacheserve restart > /dev/null

# nom-tell -F pid cacheserve process-information
8880

After a restart, there is a new process.

# nom-tell cacheserve stop > /dev/null

# nom-tell cacheserve start
<OUTPUT SUPPRESSED>
nom-tell: critical: could not send to 'cacheserve': Connection refused

It is not possible to start a process with the CC.

# nom-tell cacheserve
nom-tell 16.1.0.0, interactive mode

cacheserve> version
error: could not send to 'cacheserve': Connection refused
cacheserve>

nom-tell will start even when a server isn’t running.
The error appears only when a command is sent.


Interactive nom-tell: <TAB>


<TAB> provides context sensitive
help and command completion.
cacheserve> <TAB>
address-list. layer. server.
address-node. monitoring. stop
auth-monitoring. name-list. telemetry.
auth-server-list. name-node. uuid
auth-server-node. policy. version
binding. process-information view.
connection. ratelimiter. view-selector.
dns64. resolver.
instance-information restart
cacheserve> au<TAB>
auth-monitoring. auth-server-list. auth-server-node.
cacheserve> auth-s<TAB>
auth-server-list. auth-server-node.
cacheserve> auth-server-

The bold text on the original slide (the completed portion of each command)
was added automatically after <TAB> was pressed.

Interactive nom-tell: Working Comfortably

• In addition to the standard command line controls
like cursor-left, cursor-right and delete, nom-tell
supports the default key bindings from the BASH
shell (i.e. Emacs key bindings).
cacheserve> view-selector.update source-address=192.0.2.9

<control-a> and the cursor jumps to the beginning of the line.
<control-e> and the cursor jumps to the end of the line.

cacheserve> server.query qname=ftp.nominum.com qtype=A view=Int


cacheserve> server.query view=Int qname=ftp.nominum.com qtype=A

Arguments can be in any order.


Interactive nom-tell: quit and exit

cacheserve> <TAB>
address-list. layer. server.
address-node. monitoring. stop
auth-monitoring. name-list. telemetry.
auth-server-list. name-node. uuid
auth-server-node. policy. version
binding. process-information view.
connection. ratelimiter. view-selector.
dns64. resolver.
instance-information restart
cacheserve> exit
#

Either exit or quit will end an interactive nom-tell session.

exit and quit are nom-tell specific commands, not
commands provided by the server over the CC, and
are therefore not listed by <TAB>.

CC Configuration

• Object Examples:
– Server
– View
– Resolver (CacheServe Only)
– Zone (AuthServe Only)
• Methods:
– get, mget, list
– update, replace
– delete
• Fields:
– Selecting configuration element of interest
– Use “tab” in nom-tell to display options


Objects

Objects constitute a server’s configuration.

Objects are recognized by trailing dots, which indicate
that a method (e.g. get, update, add, etc.) accesses or
manipulates the object.
cacheserve> <TAB>
address-list. layer. server.
address-node. monitoring. stop
auth-monitoring. name-list. telemetry.
auth-server-list. name-node. uuid
auth-server-node. policy. version
binding. process-information view.
connection. ratelimiter. view-selector.
dns64. resolver.
instance-information restart
cacheserve>

ans> <TAB>
block-checkpoints monitoring. stop
checkpoint node. unblock-checkpoints
ddns-monitoring. process-information uuid
federation. request-events version
instance-information restart view.
list-drivers server. zone.
list-events show-events
ans>

Note: Layer Object (CacheServe Only)

cacheserve> <TAB>
address-list. layer. server.
address-node. monitoring. stop
auth-monitoring. name-list. telemetry.
auth-server-list. name-node. uuid
auth-server-node. policy. version
binding. process-information view.
connection. ratelimiter. view-selector.
dns64. resolver.
instance-information restart
cacheserve>

The CacheServe layer object is also an option under other
objects, where, using <TAB>, it appears frequently. Ignore it.

Layers are added by provisioning systems (e.g. N2 products) and
are not designed for direct use.

A pure CacheServe installation has one layer, operator.
Additional layers cannot be added (an N2 license is required).


Methods

cacheserve> view.<TAB>
view.add view.get view.mget view.update
view.delete view.list view.replace

cacheserve> server.<TAB>
server.add server.query
server.all-errors server.replace
server.block-checkpoints server.statistics
server.checkpoint server.unblock-checkpoints
server.delete server.update
server.get server.usage

Methods access or modify an object.

The methods .list, .get and .mget provide
information about an object (or objects). They are
fundamental methods used very frequently and are found
on most objects.
Objects that have only one instance have only .get.

Access Method: list

cacheserve> view.list
{
type => 'view.list'
name => 'world'
}

Everything between the { } is related to one view.
type is an exception. It appears at the top of the output of
every command, listed in the first object (here the first view).

world is the only view in a newly installed CacheServe.

ans> view.list
{
type => 'view.list'
name => 'default'
}

default is the only view in a newly installed AuthServe.

cacheserve> view.list
{
type => 'view.list'
name => 'world'
}
{
name => 'yyy'
}

Note how type is only shown in the first view listed.


Access Method: get

The get method shows the configuration details of the object.
It requires identifying which object, view in this case, is to be displayed.

cacheserve> view.get
{
type => 'view.get'
err => 'missing required field "name": syntax error'
}
cacheserve>

Instead of the desired output, there is an error
with an explanation of the problem.

The err tag means the command has failed.

Access Method: get

The view name is required.
cacheserve> view.get <TAB>
exclude-fields fields layer name
cacheserve> view.get name=<TAB>
<string>
cacheserve> view.get name=wor<TAB>
<string>
cacheserve> view.get name=world
{
type => 'view.get'
name => 'world'
resolver => 'world'
}

<TAB> only works for listing and completing commands,
not for object names.

The entire world view configuration is displayed.
This is the default configuration, which has
only the name and resolver fields.


Access Method: mget

The mget method combines .list and .get by
showing all configuration for all objects.
Depending on the object, it can produce a lot of output.

cacheserve> view.mget
{
type => 'view.mget'
resolver => 'world'
name => 'world'
}
{
resolver => 'world'
name => 'yyy'
}
{
resolver => 'world'
comment => 'Important info.'
time-zone => 'UTC'
name => 'zzz'
}

Note again that the command gets repeated
as type in the first view listed.

Additional Arguments: list, get, mget

cacheserve> view.list<TAB>
descending key max-results start
end layer skip-first
cacheserve>
cacheserve> view.get<TAB>
exclude-fields fields layer name
cacheserve>
cacheserve> view.mget<TAB>
descending fields max-results
end key skip-first
exclude-fields layer start

The access methods all accept additional arguments to limit
or modify output.

The arguments are mostly useful when there is a lot of output.
(For example, the CC has an upper limit on how much
information can be sent.)


Modification Methods

• The exact methods available are specific to an object.
• Most objects are modified through one of four
common methods.
cacheserve> view-selector.<TAB>
view-selector.add view-selector.list view-selector.replace
view-selector.delete view-selector.mget view-selector.update
view-selector.get view-selector.query

cacheserve> monitoring.<TAB>
monitoring.get monitoring.statistics
monitoring.replace monitoring.update

cacheserve> view.<TAB>
view.add view.get view.mget
view.update view.delete view.list

The methods .add, .delete, .update and .replace are commonly used for
modifying objects.

There aren’t slides explicitly showing these methods, because they are
shown again and again throughout the course.

Adding Elements to a List: Incremental Syntax: += -=

Let’s add the IPv6 loopback to client addresses served.

# nom-tell cacheserve server.get | grep patterns
patterns => ('127.0.0.1/32' '172.16.0.0/16')

# nom-tell cacheserve server.update 'listen-on-matching=({patterns=(::1 )})' > /dev/null

# nom-tell cacheserve server.get | grep patterns
patterns => ('::1/128')

WHOOPS! The previous patterns are gone.

# nom-tell cacheserve server.update 'listen-on-matching+=({patterns=(127.0.0.1 172.16/16)})' > /dev/null

# nom-tell cacheserve server.get | grep patterns
patterns => ('::1/128')
patterns => ('127.0.0.1/32' '172.16.0.0/16')

Warning: with .update, new information overwrites the old.
To append to a list instead, use += rather than =.
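
The difference between = and += can be pictured with a toy model. The Python sketch below is not how the server stores its configuration; it only mimics the behavior seen in the transcript, where = replaces the whole list and += appends an entry. The -= branch (remove matching entries) is an assumption based on the slide title, and apply_update is a hypothetical name.

```python
Entry = tuple[str, ...]  # one listen-on-matching entry, represented by its patterns


def apply_update(current: list[Entry], op: str, new: list[Entry]) -> list[Entry]:
    """Toy model of a list-valued field under '=', '+=' and '-='."""
    if op == "=":
        return list(new)                     # overwrite: old entries are lost
    if op == "+=":
        return current + list(new)           # append: old entries are kept
    if op == "-=":
        return [e for e in current if e not in new]  # assumed removal semantics
    raise ValueError(f"unsupported operator: {op}")


listen_on = [("127.0.0.1/32", "172.16.0.0/16")]

# '=' as in the first server.update: the IPv4 patterns disappear.
listen_on = apply_update(listen_on, "=", [("::1/128",)])
after_assign = list(listen_on)

# '+=' as in the second server.update: the IPv4 entry is added back.
listen_on = apply_update(listen_on, "+=", [("127.0.0.1/32", "172.16.0.0/16")])

print(after_assign)
print(listen_on)
```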


nom-tell: History

• nom-tell can be configured through an
environment variable to keep a history of commands
between sessions.
# export NOM_TELL_HISTFILE=~/.nom_tell_histfile

First Look at the resolver Object

CacheServe is a resolver.
Internally, it supports multiple resolver objects.

When communicating with CacheServe, the resolver name
must be provided.

cacheserve> resolver.get
{
type => 'resolver.get'
err => 'missing required field "name": syntax error'
}

cacheserve> resolver.get name=world
{
type => 'resolver.get'
name => 'world'
}

world is the default resolver object.

The resolver object is covered later in the course.


Manipulating Query Responses with the resolver Object (preload)

• A company policy or government edict may require
blocking certain output.
• The resolver object is one of several ways to manipulate
query responses.

Normal resolution:
# dig @::1 facebook.com
<SELECTED OUTPUT SHOWN>
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 19535
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
facebook.com. 249 IN A 173.252.91.4

cacheserve> resolver.update name=world preload-nxdomain=(facebook.com)

Now the response is NXDOMAIN:
# dig @::1 facebook.com
<SELECTED OUTPUT SHOWN>
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 11982
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

Laboratory Environment

• CacheServe 7
• dig
• The Internet
• Traffic Generator

[Figure: INSTRUCTOR host (Traffic Generator, DNS Client) and STUDENT HOST (dig DNS Client, Vantio CacheServe), with the Internet.]


Simple Configuration
• NXDOMAIN is not the same as REFUSED
– Verifiably non-existent domains get NXDOMAIN
– CacheServe has limited support for CHAOS-class so
most queries result in REFUSED responses
• Use nom-tell to send CC commands

– Modify Server object (IP address to listen on)

listen-on-matching=({patterns=(<IP1> <IP2>) port=0})

port=0 means “use the CacheServe default port” (53)

– Modify “world” resolver (create a preload for localhost)

preload=((localhost A 127.0.0.1))

Exercise 1

• Activities
– Install CacheServe
– Start the server
– Send queries
– Use nom-tell to send CC commands
• Version
• Modify Server object (IP address to listen on)

listen-on-matching=({patterns=(<IP1> <IP2>) port=0})

• Modify “world” resolver (create a preload for localhost)


• Discussion
– Command line options (see Exercise 18) include
• syslog facility control
• License file


Split-DNS (Views)

• Split-DNS allows a DNS server to provide different
answers for identical queries.
• Most commonly, the decision of what data to provide
is based on the querier’s IP address.
– Other alternatives include the server address where the
query arrived.
• Views implement split-DNS.
• In CacheServe, views are implemented through
three objects: resolver, view, and view-selector

view-selector

• An arriving query is best-matched to a
view-selector, and then processed
through it.
• Most objects are named, but view-selectors
are identified based on their
selection criteria (most commonly
source-address).
• A new system has only an unidentified,
default, view-selector.

The default:
cacheserve> view-selector.list
{
type => 'view-selector.list'
}

[Figure: three view-selectors: (default); source-address => '192.0.2.0/24'; source-address => '2001:db8:a1d::/48']


view-selector -> view

• A view-selector points to one view.
• The view must exist before the view-selector
can point to it.

cacheserve> view-selector.mget
{
type => 'view-selector.mget'
view => 'world'
}
{
view => 'customer-X'
source-address => '192.0.2.0/24'
}
{
view => 'customer-X'
source-address => 'fe80::/10'
}

A view-selector can have only one source-address.
It is therefore normal to have many selectors
pointing to the same view.

[Figure: view-selector (default) with view => 'world'; view-selector with source-address => '192.0.2.0/24' and view => 'customer-X'; view-selector with source-address => '2001:db8:a1d::/48' and view => 'customer-X']

view-selector -> view

A newly installed system has one view, “world,” and the
default view-selector points to it.

Enterprises with simple systems often never need
more than one view-selector or additional view.
However, ISPs often need many view-selectors and
views.

Some enterprises use two views, one for internal
hosts, one for external.

[Figure: the default view-selector points to view name=world; the view-selectors for '192.0.2.0/24' and '2001:db8:a1d::/48' point to view name=customer-X]


view

• A view points to one resolver.
• The resolver must exist before the
view can point to it.

cacheserve> view.mget
{
type => 'view.mget'
resolver => 'res-4-X'
name => 'customer-X'
}
{
resolver => 'world'
name => 'world'
}

[Figure: view name=world with resolver => 'world'; view name=customer-X with resolver => 'res-4-X']

view-selector -> view -> resolver

A newly installed system has
one resolver, “world,” and the
“world” view points to it.

[Figure: the default view-selector points to view name=world, which points to resolver name=world; the view-selectors for '192.0.2.0/24' and '2001:db8:a1d::/48' point to view name=customer-X, which points to resolver name=res-4-X]

A resolver is a cache and
instructions for resolving
queries not in the cache.
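
The chain from client address to resolver can be sketched in a few lines of Python. This is an illustration, not the server's algorithm: it assumes that “best-matched” means the longest matching source-address prefix, with the default view-selector as the fallback. The dictionaries simply mirror the objects in the figure, and select is a hypothetical function name.

```python
import ipaddress

# Hypothetical in-memory copies of the objects shown in the figure.
VIEW_SELECTORS = {                 # source-address -> view
    "192.0.2.0/24": "customer-X",
    "2001:db8:a1d::/48": "customer-X",
}
DEFAULT_VIEW = "world"             # target of the default view-selector
VIEWS = {                          # view -> resolver
    "world": "world",
    "customer-X": "res-4-X",
}


def select(source: str) -> tuple[str, str]:
    """Return (view, resolver) for a client address, longest prefix first."""
    addr = ipaddress.ip_address(source)
    best_len, view = -1, DEFAULT_VIEW
    for prefix, target in VIEW_SELECTORS.items():
        net = ipaddress.ip_network(prefix)
        # Only compare addresses and networks of the same IP version.
        if addr.version == net.version and addr in net and net.prefixlen > best_len:
            best_len, view = net.prefixlen, target
    return view, VIEWS[view]


print(select("192.0.2.77"))         # matches the IPv4 selector
print(select("2001:db8:a1d::53"))   # matches the IPv6 selector
print(select("198.51.100.1"))       # falls through to the default selector
```

Keeping the selector criteria separate from the view names reflects the point made above: selectors are identified by what they match, while views and resolvers are identified by name.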


Are view Objects Necessary?

• view-selectors decide where each query is sent.
• resolvers provide answers.
• So what about views?
• View objects, like resolver objects, can modify
results (policies).
• Modification of results is covered later in the
course.
• For now it is only important to understand that
there is an administrative choice: modify results in a
view, or in a resolver.

Shared Resolver

Multiple views can share a
resolver, taking advantage of a
single cache.

[Figure: the default view-selector points to view name=world; the view-selectors for '192.0.2.0/24' and '2001:db8:a1d::/48' point to view name=customer-X; both views have resolver => 'world' and point to resolver name=world]


Unused Objects

If a resolver has no view
pointing to it, or if a view
has no view-selector
pointing to it, it is unused.

[Figure: the default view-selector points to view name=world; the view-selectors for '192.0.2.0/24' and '2001:db8:a1d::/48' point to view name=customer-X; both views have resolver => 'world'. No view-selector points to the second view box (labeled name=customer-X, resolver => 'res-4-X'), which points to resolver name=res-4-X.]

Command Guidelines for Resolvers, Views, and View-Selectors

• A resolver must be added before a view can reference it.
• A view must be added before a view-selector can reference it.

cacheserve> view.add name=NewView resolver=Whoops


{
type => 'view.add'
err => 'unknown resolver "Whoops"'
}
• A view cannot be deleted if a view-selector references it.
• A resolver cannot be deleted if a view references it.
• Resolvers and views have names.
• View-selectors are identified by their criteria (most commonly
“source-address”).
cacheserve> view-selector.add view=world source-address=::1


Controlling view-selector, view, and resolver Objects
• The control methods for view-selector, view and
resolver objects are straightforward.
Here are a few examples.
Normal resolution.
cacheserve> resolver.add name=res-4-X

cacheserve> view.add name=customer-X resolver=world


cacheserve> view.update name=customer-X resolver=res-4-X

cacheserve> view-selector.add view=customer-X source-address=198.51.100.128/25


cacheserve> view-selector.add view=customer-X source-address=2001:db8:cafe::/48

A resolver is removed with resolver.delete. All of its properties are removed with it, and the only way to restore the resolver is to recreate it.

Other resolver Object Uses

• For those familiar with resolver forwarding and with stub zones, note that they are configured through a resolver object.
• Forwarding and stub zones are not part of the standard one-day CacheServe course.

cacheserve> resolver.update name=world stub=…

cacheserve> resolver.update name=world forward=…


Exercise 2

• Activities
– Create "internal" resolver (natural DNS)
– Create additional view / view-selector
– Interpret statistics from multiple resolvers
• Discussion
– Most specific view-selector wins
– Vantio 5's "first-match algorithm" with
traditional (indexed) views not supported
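
The "most specific view-selector wins" rule can be pictured as a longest-prefix match on the client's source address. The sketch below is only an illustration in Python, not CacheServe's implementation; the selector table, the customer-Y view, and the function name select_view are hypothetical.

```python
import ipaddress

# Hypothetical selector table: (source network, view name). The first entry
# stands in for a default selector that matches every IPv4 client.
SELECTORS = [
    ("0.0.0.0/0", "world"),
    ("192.0.2.0/24", "customer-X"),
    ("192.0.2.128/25", "customer-Y"),
]

def select_view(client, selectors=SELECTORS):
    """Return the view of the longest-prefix selector that contains client."""
    addr = ipaddress.ip_address(client)
    best = None  # (prefix length, view name) of the best match so far
    for cidr, view in selectors:
        net = ipaddress.ip_network(cidr)
        if net.version == addr.version and addr in net:
            if best is None or net.prefixlen > best[0]:
                best = (net.prefixlen, view)
    return best[1] if best else None
```

With this table, a query from 192.0.2.200 matches all three selectors and lands in customer-Y because /25 is the most specific match, regardless of the order in which the selectors were added.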

3. Operations

• Cache Operation: in-depth


• Diagnostics
• Nominum Nanny
• snmpagent
• Spoofing Defenses


Resource Use and Control

• Expect optimal CacheServe performance when the process is CPU-bound
– No disk access interruptions
– No network capacity limitations
• Memory
– Cache size and Recursion Contexts
• Interpretation of Statistics
– Cache-hit fraction

Resource: Cache Memory


• Cache—Info learned from authoritative sources
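[Figure: the cache drawn as a bar running from “Recently used” to “Oldest”; new results enter at the recently used end; entries are labeled “Nearly Expired: prefetch” and “Expired RRSet (TTL)”]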

max-cache-size
(default 1 GB per resolver.)
resolver.update name=world max-cache-size=XX

The cache isn’t actually sorted. However, CacheServe knows how recently each RRSet was used.

For efficiency, expired RRSets aren’t deleted, only marked.

If the cache is full, space from expired RRSets is used.

If the cache is full and there are no expired RRSets, then the least recently used RRSets are deleted to make space for new results.
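
The replacement order described above (reuse expired entries first, then fall back to the least recently used) can be sketched as a toy cache. This is a simplified model rather than CacheServe's data structure: it limits the number of entries instead of bytes, and the class name TinyCache and its methods are hypothetical.

```python
from collections import OrderedDict

class TinyCache:
    """Toy cache: reuse expired entries first, then evict least recently used."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = OrderedDict()  # key -> (value, expires_at); order = recency

    def get(self, key, now):
        item = self.entries.get(key)
        if item is None or item[1] <= now:
            return None                # missing, or expired but not yet reclaimed
        self.entries.move_to_end(key)  # mark as most recently used
        return item[0]

    def put(self, key, value, ttl, now):
        if key not in self.entries and len(self.entries) >= self.max_entries:
            expired = [k for k, (_, exp) in self.entries.items() if exp <= now]
            # Prefer an expired entry; otherwise take the least recently used one.
            victim = expired[0] if expired else next(iter(self.entries))
            del self.entries[victim]
        self.entries[key] = (value, now + ttl)
        self.entries.move_to_end(key)
```

Note that an expired entry stays in the dictionary until its slot is needed, which mirrors the "only marked, not deleted" behavior described on the slide.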


Resource: Recursive Memory


• Recursion Contexts—Ongoing queries

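[Figure: recursion contexts drawn as a bar running from “Newly started” to “Longest running”; a new query enters at the newly started end]

max-recursive-clients
(Default: 25,000. Maximum: 250,000, which is 9 GB.)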
server.update max-recursive-clients=XX

Each outstanding lookup uses about 32KB of memory.


An attack can generate 20,000 unique recursions or more.
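
As a rough check of the memory figures above, the estimate is simply contexts multiplied by bytes per context. The sketch below assumes exactly 32 KB per context (the slide says "about"), so it comes out somewhat below the 9 GB quoted for the maximum, which suggests the real per-context cost is a little higher. The function name is hypothetical.

```python
KB = 1024

def recursion_memory_bytes(contexts, bytes_per_context=32 * KB):
    """Estimated memory for a number of outstanding recursion contexts."""
    return contexts * bytes_per_context

# Default setting: 25,000 contexts.
default_mb = recursion_memory_bytes(25_000) / (KB * KB)
# Maximum setting: 250,000 contexts.
maximum_gb = recursion_memory_bytes(250_000) / (KB ** 3)
```

Under this assumption the default of 25,000 contexts costs roughly 780 MB when fully used, so an attack generating 20,000 unique recursions already consumes most of the default budget.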

Overview: server.statistics
Server statistics are global to the process and aggregated.

cacheserve> server.statistics
{
    type => 'server.statistics'
    current-time => '1465170511.119244'
    server-start-time => '1465155494.709822'
    node-id => 'bd0ea83e-da86-5c0f-bd43-5c6905b96b0a'
    user-time => '6.131067'
    system-time => '9.568545'
    memory-in-use => '36951344'
    reset-time => '1465155494.813856'
    statistics => {
        requests-received => '5'
        responses-received => '2'
        requests-sent => '2'
        responses-sent => '5'
        lookups => '5'
        recursive-lookups => '2'
    }
}

Time values are shown in UNIX time. On Linux systems, the time can be made human readable in the local timezone with:

# date -d @1465170511.119244
Sun Jun 5 16:48:31 PDT 2016

reset-time is when the statistics were last set back to zero.
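
When statistics are collected by a script rather than read interactively, the same conversion can be done in code. The following Python sketch converts the current-time and server-start-time values shown above to UTC and derives the server uptime; the helper name to_utc is hypothetical.

```python
from datetime import datetime, timezone

def to_utc(unix_time):
    """Convert a UNIX timestamp string from the statistics output to UTC."""
    return datetime.fromtimestamp(float(unix_time), tz=timezone.utc)

current = to_utc("1465170511.119244")   # current-time
started = to_utc("1465155494.709822")   # server-start-time
uptime = current - started              # a datetime.timedelta

print(current.strftime("%Y-%m-%d %H:%M:%S UTC"))
print("uptime seconds:", int(uptime.total_seconds()))
```

Working in UTC avoids depending on the local timezone of the machine that runs the script; the 23:48:31 UTC result corresponds to the 16:48:31 PDT shown by date.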


server.statistics reset=true
cacheserve> server.statistics<TAB>
all reset
cacheserve> server.statistics reset=true
{
    type => 'server.statistics'
    current-time => '1465182308.233563'
    server-start-time => '1465155494.709822'
    node-id => 'bd0ea83e-da86-5c0f-bd43-5c6905b96b0a'
    user-time => '11.459257'
    system-time => '17.772298'
    memory-in-use => '36957400'
    reset-time => '1465155494.813856'
    statistics => {
        requests-received => '5'
        responses-received => '2'
        requests-sent => '2'
        responses-sent => '5'
        lookups => '5'
        recursive-lookups => '2'
    }
}

The statistics can be set to zero by setting the boolean argument reset to 1, t, or true.

The final statistics before the reset are displayed.

server.statistics all=true
Zero value statistics are suppressed.

cacheserve> server.statistics
{
    <OUTPUT SUPPRESSED>
    statistics => {
    }
}

All statistics can be seen by setting the boolean argument all to 1, t, or true.

cacheserve> server.statistics all=1
{
    <OUTPUT SUPPRESSED>
    statistics => {
        requests-received => '0'
        responses-received => '0'
        requests-sent => '0'
        responses-sent => '0'
        rate-limited-requests => '0'
        requests-no-view => '0'
        tcp-requests-sent => '0'
        lookups => '0'
        <OUTPUT SUPPRESSED>


The Server Statistics


cacheserve> server.statistics
{
    type => 'server.statistics'
    current-time => '1465170511.119244'
    server-start-time => '1465155494.709822'
    node-id => 'bd0ea83e-da86-5c0f-bd43-5c6905b96b0a'
    user-time => '6.131067'
    system-time => '9.568545'
    memory-in-use => '36951344'
    reset-time => '1465155494.813856'
    statistics => {
        requests-received => '5'
        responses-received => '2'
        requests-sent => '2'
        responses-sent => '5'
        lookups => '5'
        recursive-lookups => '2'
    }
}

memory-in-use: The memory requested from the memory allocator and memory used by the cache. (It does not include overhead for allocator bookkeeping, rounding, fragmentation or free lists.)

requests-received: from clients.
requests-sent: to other DNS servers.
responses-received: from servers.
responses-sent: to clients.
lookups: by this resolver. Different from queries because a query can involve multiple lookups due to following CNAME RRs, looking up NS addresses and DNSSEC keys, root server priming, etc.
recursive-lookups: queries that could not be answered from the cache.

More Server Statistics


cacheserve> server.statistics all=t
{
    <OUTPUT SUPPRESSED>
    statistics => {
        requests-received => '83'
        responses-received => '162'
        requests-sent => '162'
        responses-sent => '83'
        rate-limited-requests => '0'
        requests-no-view => '0'
        tcp-requests-sent => '0'
        lookups => '131'
        recursive-lookups => '113'
        formerr-loop-dropped => '0'
        recursion-contexts-in-use => '0'
        tcp-clients => '0'
    }
}

tcp-clients: The current number of outstanding queries to other servers with TCP.

tcp-requests-sent: The total number of queries that were sent with TCP.

recursion-contexts-in-use: How many queries are currently outstanding to other servers. On lightly loaded systems, seeing a value other than zero is rare.


Server Statistics usage: Cache Hit Rate
• Cache hit rate formula:
1 – (recursive lookups/lookups)
• Recursive lookups are queries sent to auth servers.
• Lookups includes those CacheServe answered from
its cache and those sent to other nameservers.
• Lookups originate from both internally generated
and external client queries.
# cacheserve-stats
clnt clnt auth auth user sys total q/ recur hit
req/s resp/s req/s resp/s %cpu %cpu %cpu cpusec cntxs rate%
------- ------- ------ ------ ----- ----- ----- ------- ------ -----
2 2 4 4 0.2 0.3 0.5 - 0 20.0
1 1 0 0 0.2 0.0 0.2 - 0 100.0
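
The cache hit rate formula above is easy to apply to the counters returned by server.statistics. The short Python sketch below uses the lookups and recursive-lookups values from the earlier example outputs; the function name is hypothetical.

```python
def cache_hit_rate(lookups, recursive_lookups):
    """Fraction of lookups answered from cache: 1 - recursive/lookups."""
    if lookups == 0:
        return None  # no traffic yet, so the rate is undefined
    return 1 - recursive_lookups / lookups

# Counters taken from the server.statistics examples in this section.
small_sample = cache_hit_rate(lookups=5, recursive_lookups=2)
cold_cache = cache_hit_rate(lookups=131, recursive_lookups=113)
```

For the first sample this gives 60%, and for the second about 14%, which is typical of a freshly started server whose cache has not yet warmed up.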

resolver.statistics
• A CacheServe server can have multiple resolvers.
• Each resolver has its own statistics.
cacheserve> resolver.statistics name=world
{
    type => 'resolver.statistics'
    current-time => '1465183712.188338'
    <OUTPUT SUPPRESSED>
    memory-in-use => '36958224'
    name => 'world'
    reset-time => '1465155494.815641'
    cache-memory-in-use => '0'
    statistics => {
        lookups => '4'
        queries => '4'
        responses-by-rcode => {
            noerror => '4'
        }
    }
}

resolver.statistics requires a resolver name to display. Freshly installed, CacheServe has one resolver: 'world'.

cache-memory-in-use is specific to this resolver’s cache.

memory-in-use is for the server. It is the same value found in server.statistics.

The server.statistics arguments, all and reset, apply to resolver.statistics.

Resetting server.statistics does not affect resolver statistics.


resolver.statistics all=true
More statistics are available for resolvers than for the server.

cacheserve> resolver.statistics name=world all=1
{
<OUTPUT SUPPRESSED>
statistics => {
lookups => '4'
<OUTPUT SUPRESSED>
requests-sent => '0'
tcp-requests-sent => '0'
rate-limited-requests => '0'
queries => '4'
dropped-recursions => '0'
interrupted-recursions => '0'
responses-by-rcode => {
noerror => '4'
formerr => '0'
servfail => '0'
nxdomain => '0'
notimp => '0'
refused => '0'
yxdomain => '0'
yxrrset => '0'
nxrrset => '0'
notauth => '0'
notzone => '0'
<OUTPUT SUPPRESSED>

Exercise 3

• Activities
– run “cacheserve-stats”
– Modify the max-cache-size (resolver) setting
– Implement a shared cache with resolver
• Discussion
– License file determines concurrency


resolver.recursing

• A recursive server works on resolving an RRset by sending iterative queries (flag RD=0).
• A busy server can be recursing on thousands of RRsets simultaneously.

cacheserve> resolver.recursing name=world
{
    type => 'resolver.recursing'
    resolutions => (
        {
            name => 'somename.example.com'
            type => 'AAAA'
        }
    )
}

The example shows one outstanding RRSet being recursed on.

On a resolver under light load, such as in the lab, resolver.recursing will generally show no output.

resolver.inspect

• resolver.inspect shows a resolver’s cache content for a domain name. All RR types in the cache are shown.

cacheserve> resolver.inspect name=world domain=yahoo.com
{
    type => 'resolver.inspect'
    err => 'domain not found'
}

The domain name isn’t in the cache.

cacheserve> resolver.inspect name=world domain=a.yahoo.com
{
    type => 'resolver.inspect'
    name => 'world'
    domain => 'a.yahoo.com'
    exists => 'false'
    ttl => '595'
    nonexistence-proof => (
        (
            'yahoo.com'
            {
                SOA => {
                    ttl => '595'
                    data => ('ns1.yahoo.com. hostmaster.yahoo-inc.com. 2016060601 3600 300 1814400 600')
<OUTPUT SUPPRESSED>

The domain name doesn’t exist (NXDOMAIN). The non-existence was cached for 600 seconds. In 595s the NXDOMAIN entry will expire.


resolver.inspect
cacheserve> resolver.inspect name=world domain=nominum.com
{
    <OUTPUT SUPPRESSED>
    domain => 'nominum.com'
    exists => 'true'
    types => {
        TXT => {
            exists => 'true'
            ttl => '3580'
            data => ('"v=spf1 include:_spf.nomin<OUTPUT SUPPRESSED>"')
            origin => '64.89.228.10'
        }
        A => {
            exists => 'true'
            ttl => '27'
            data => ('162.209.114.115')
            origin => '64.89.234.2'
        }
        SPF => {
            exists => 'false'
            ttl => '46'
            nonexistence-proof => (
                (
<OUTPUT SUPPRESSED>

Three RRsets for the domain name are cached: TXT, A, and SPF.

No data exists for the RR SPF; SPF has been negatively cached.

origin is the authoritative server that provided the RRset.

Glue Segregation (Preamble to: resolver.inspect-delegation)

• Queries are answered from the “Name Cache”
• Lookups use the “Delegation Cache”

[Figure: the cache divided into a Name Cache and a Delegation Cache]


resolver.inspect-delegation

• resolver.inspect-delegation shows the cached NS RRSet for a domain.

cacheserve> resolver.inspect-delegation name=world domain=google.com
{
    type => 'resolver.inspect-delegation'
    err => 'domain not found'
}

Currently, NS RRs for google.com aren’t cached.

resolver.inspect-delegation
cacheserve> resolver.inspect-delegation name=world domain=google.com
{
    type => 'resolver.inspect-delegation'
    name => 'world'
    domain => 'google.com'
    ttl => '168746'
    servers => (
        {
            server => 'ns1.google.com'
            addresses => (
                {
                    type => 'A'
                    origin => '192.54.112.30'
                    ttl => '168746'
                    glue => 'true'
                    addresses => (
                        {
                            address => '216.239.32.10'
                            rtt => '46140'
                        }
                    )
                }
            )
        }
<OUTPUT SUPPRESSED>

After the resolver queried for the NS RRs of google.com, they were added to the cache.

RTT (round trip time) measures the response time from the server in microseconds. It does not appear until the resolver first uses this NS.


resolver.flush

• resolver.flush deletes RRsets from a resolver’s cache.
• A single domain name can be removed (name).
• A name can be an apex, and all subdomains are also removed (domain).
• Individual RRsets for a specific type cannot be removed.

cacheserve> resolver.flush name=world target=(name google.com)
{
    type => 'resolver.flush'
}

cacheserve> resolver.flush name=world target=(domain nominum.com)
<OUTPUT SUPPRESSED>

Flush the entire cache:

cacheserve> resolver.flush name=world target=(domain .)
<OUTPUT SUPPRESSED>

This also flushes the entire cache:

cacheserve> resolver.flush name=world
<OUTPUT SUPPRESSED>

server.query

• server.query is a DNS querying tool similar to dig, but with powerful features specific to Nominum.
• It was added in an early version of CacheServe 7.0.
• Like dig, server.query defaults to querying for an A RR.

cacheserve> server.query qname=www.ripe.net
{
    type => 'server.query'
    qname => 'www.ripe.net'
    qtype => 'A'
    rcode => 'NOERROR'
    result => 'success'
    flags => ('qr' 'rd' 'ra')
    answer => (('www.ripe.net' 'A' '21600' '193.0.6.139'))
    response-size => '46'
    response-time => '0.092282'
    resolver => 'world'
    view => 'world'
    view-selector => {
        source-address => '0.0.0.0/0'
    }
    resolution => 'true'
}

The fields from resolver onward are CacheServe-specific content.


server.query Compared With dig


cacheserve> server.query qname=buffalo.edu
{
    type => 'server.query'
    qname => 'buffalo.edu'
    qtype => 'A'
    rcode => 'NOERROR'
    result => 'success'
    flags => ('qr' 'rd' 'ra')
    answer => (('buffalo.edu' 'A' '28799' '128.205.201.57'))
    response-size => '45'
    response-time => '0.000027'
    resolver => 'world'
    view => 'world'
    view-selector => {
        source-address => '0.0.0.0/0'
    }
}

# dig +nocmd +noque +noauth +noadd @127.1 buffalo.edu
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7542
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; ANSWER SECTION:
buffalo.edu. 28800 IN A 128.205.201.57

;; Query time: 109 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Jun 6 07:26:35 2016
;; MSG SIZE rcvd: 45

Compare the matching fields of the server.query and dig output.

From the response-time (query time) we see that the dig ran first; there was no cached entry. server.query ran after the response was cached. (This can also be gleaned from the TTLs.)

server.query: Querying Options

• A sampling of some of the querying options available.


• qtype
• qclass
• tcp: Processes the query as if it were received over TCP.
• flags: Query flags (aa ad cd qr ra rd tc); the default is rd.
cacheserve> server.query qname=version.bind qtype=txt qclass=ch
{
type => 'server.query'
qname => 'version.bind'
qtype => 'TXT'
qclass => 'CH'
rcode => 'NOERROR'
result => 'success'
flags => ('qr' 'rd' 'ra')
answer => (('version.bind' 'TXT' '0' '"Nominum Vantio CacheServe 7.1.0.1"'))
response-size => '76'
response-time => '0.000013'
}


server.query: Advanced Options

• A sample of options with functionality specific to CacheServe or otherwise not available through common querying tools.
• client-address: The source address of the query.
• resolver: Give a result from the specified resolver. (Ignores view-selectors.)
• view: Give a result from the specified view. (Ignores view-selectors.)
• force-resolution: Ignore the cache.
• tracing: Show the steps taken to process the query.
cacheserve> server.query qname=ripe.net tracing=1 force-resolution=1
<OUTPUT SUPPRESSED>
trace-messages => (
'1465226514.829597: query ripe.net. type A class IN'
'1465226514.829610: iterating prequery policies'
'1465226514.829614: starting lookup'
'1465226514.829622: resolving ripe.net./A'
'1465226514.829633: closest known zone cut is ripe.net.'
'1465226514.829648: 6 known server addresses, 0 missing server addresses'
'1465226514.829656: sending to 162.159.25.153 (c2.authdns.ripe.net.)'
'1465226514.829665: send udp q=0x400bae8 id=61652 socket=0.0.0.0#37451'
'1465226514.829699: waiting for response, timeout=500000,
<OUTPUT SUPPRESSED>

Delegation vs. Authoritative Answer

name server of “net”?

• Root server provides non-authoritative answer (glue)
• gTLD server provides authoritative answer


Exercise 4
• Activities
– Read cache
• Names using resolver.inspect
• Server using resolver.inspect-delegation
– Flush specific domain with flush command
– server.query
• Emulate client properties
• CacheServe does all but send response
–force-resolution=true
–tracing=true
• Discussion
– To analyze truly empty-cache behavior,
consider creating a temporary resolver
– Clone a resolver?

Manipulating Query Processing with Policies
• Earlier we examined manipulating query responses
by preloading the resolver.
cacheserve> resolver.update name=world preload-nxdomain=(facebook.com)
cacheserve> resolver.update name=world preload=((www.ourCompany.local A 10.1.1.1))

• policy is a CacheServe feature which controls processing.
• Policies are bound to views or the entire server.
• One method for identifying which requests are candidates for policy treatment: address-list
• Optionally, addresses can be hardcoded into a policy. An address-list is then not required.


address-list and address-node Objects
• An address-node contains an address or a network.
cacheserve> address-node.<TAB>
address-node.add address-node.list address-node.update
address-node.delete address-node.mget
address-node.get address-node.replace

• An address-list contains address-nodes.


cacheserve> address-list.<TAB>
address-list.add address-list.get address-list.mget
address-list.delete address-list.list address-list.replace
address-list.dump address-list.load address-list.update

• An address-node resides in an address-list; the list must be added before the node.
• There are no address-lists or address-nodes on a
newly installed system.
• After an address-list has been created, it is useless
until applied to some purpose.

Adding an address-list
cacheserve> address-list.list
{
type => 'address-list.list'
}

cacheserve> address-list.add name=DoS_badGuys


{
type => 'address-list.add'
}

cacheserve> address-list.list
{
type => 'address-list.list'
name => 'DoS_badGuys'
}


Adding address-nodes to an
address-list
cacheserve> address-node.add list=DoS_badGuys address=192.0.2.44
{
type => 'address-node.add'
}
cacheserve> address-node.add list=DoS_badGuys address=192.0.2.128/25
<OUTPUT SUPPRESSED>

cacheserve> address-node.mget
{
type => 'address-node.mget'
list => 'DoS_badGuys'
address => '192.0.2.44/32'
}
{
list => 'DoS_badGuys'
address => '192.0.2.128/25'
}
cacheserve> address-list.mget
{
type => 'address-list.mget'
name => 'DoS_badGuys'
count => '2'
lowest-address-v4 => '192.0.2.44'
}
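
To see what "client-address in address-list" means for the two nodes added above, the membership test can be sketched in Python with the standard ipaddress module. This is an illustration of the matching logic only, not how CacheServe stores address-lists; the names DOS_BAD_GUYS and in_address_list are hypothetical.

```python
import ipaddress

# The two nodes added to DoS_badGuys above. A bare address is stored as a /32.
DOS_BAD_GUYS = [
    ipaddress.ip_network("192.0.2.44/32"),
    ipaddress.ip_network("192.0.2.128/25"),
]

def in_address_list(client, nodes=DOS_BAD_GUYS):
    """True if the client address falls inside any node of the list."""
    addr = ipaddress.ip_address(client)
    return any(addr.version == net.version and addr in net for net in nodes)
```

For example, 192.0.2.200 matches because it lies inside 192.0.2.128/25, while 192.0.2.45 matches neither node.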

The policy Object

• Policy object:
selector identifies the traffic to match.
action is what to do.
• After being created, a policy has no effect until it is bound to a view or server object.
cacheserve> policy.add name=StopBadGuysPolicy selector=(client-address DoS_badGuys)
action=truncate
{
type => 'policy.add'
}

• Selection is possible on almost every part of a query or response (e.g. qtype, qname, response-size).
• Actions include: refuse, fail, drop, answer-nxdomain, answer-noerror, truncate


The binding Object

• A binding object connects a policy to a view or the server.
• After a binding is created, the policy is enforced.
cacheserve> binding.add policy=StopBadGuysPolicy view=world priority=100
{
type => 'binding.add'
}

• A binding executes a policy prequery, postquery or presend (when field).
• Prequery is the default and runs when the query
arrives, before checking the cache.
• Postquery bindings run when a reply arrives; for a
reference (e.g. CNAME), it will run multiple times.
• Presend is run just before the response is sent.

binding Object Priorities

• Multiple prequery bindings, multiple postquery bindings, and multiple presend bindings can match for the same query.
• Only one of each will be implemented.
• The priority indicates which policy will be executed, with lower priority values having higher preference.
• If policies have equal priority, only one will be executed, but which one is not defined.
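
The priority rule can be modeled as "per phase, keep the matching binding with the lowest priority value". The Python sketch below illustrates that rule only; the Binding tuple and winning_bindings function are hypothetical, and on a tie the sketch keeps the first binding it sees, whereas CacheServe leaves the choice undefined.

```python
from collections import namedtuple

# A matching binding: policy name, phase (the "when" field), and priority.
Binding = namedtuple("Binding", "policy when priority")

def winning_bindings(matching):
    """Pick, per phase, the matching binding with the lowest priority value."""
    winners = {}
    for b in matching:
        current = winners.get(b.when)
        if current is None or b.priority < current.priority:
            winners[b.when] = b
    return winners

matches = [
    Binding("StopBadGuysPolicy", "prequery", 100),
    Binding("client_ratelimit", "prequery", 10),
    Binding("size_cap", "presend", 50),   # hypothetical presend policy
]
winners = winning_bindings(matches)
```

Here client_ratelimit wins the prequery phase because 10 is lower than 100, and the presend binding is chosen independently of the prequery result.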


Exercise 5
cacheserve> policy.add name=blackhole
selector=(client-address blocked_clients )
action=drop

• Activities
– Blackhole a client by implementing a
• address-list with address-node
• policy
• binding to server object
– Use server.query to verify policy behavior (formerly policy.simulate, lvp-query)
• Discussion
– No server restart required
• Initial configuration
• Modification of “blocked_clients” IP list

Events

• Events are CC messages produced by CacheServe.


• They inform administrators of CacheServe activities.
• An interactive CC can subscribe to Events.
• There are approximately 35 Events in total.
• Most activities that Events represent:
• are written to syslog.
• can be converted to an SNMP trap.


Event Generation

• Events are generated for changes of state.


• graceful shutdown: server.stop
• configuration changed: resolver.changed

• Events are generated when thresholds are exceeded.


• maximum clients reached:
• server.udp-recursion-limit
• maximum TCP clients reached:
• server.tcp-client-limit

• Events are generated when an action is triggered or cleared:


• ratelimiter.onset
• resolver.id-spoofing-suspected

Connection Object

• A CC session is represented by the connection object.

cacheserve> connection.get
{
    type => 'connection.get'
    events => ()
    all-events => ('address-list.changed' 'address-node.changed' 'auth-monitoring.changed' 'auth-server-list.changed' 'auth-server-node.changed' 'binding.changed' 'dns64.changed' 'layer.changed' 'layer.provisioning-connected' 'layer.provisioning-connection-failure' 'layer.provisioning-disconnected' 'layer.provisioning-reimaging' 'layer.provisioning-update-failure' 'layer.provisioning-update-success' 'monitoring.changed' 'name-list.changed' 'name-node.changed' 'policy.changed' 'policy.hit' 'ratelimiter.abate' 'ratelimiter.changed' 'ratelimiter.onset' 'resolver.changed' 'resolver.flush' 'resolver.id-spoofing-suspected' 'server.changed' 'server.configuration-error' 'server.formerr-loop' 'server.restart' 'server.stop' 'server.tcp-client-limit' 'server.udp-recursion-limit' 'telemetry.changed' 'view-selector.changed' 'view.changed')
}

A connection is initially not subscribed to any events.

connection.get conveniently lists all events available for subscription.


Connection: idle-timeout
• A connection has only two configurable characteristics.
• subscribed events
• connection timeout
• Connection configuration is applicable to the current CC only.

cacheserve> connection.update idle-timeout=5
{
    type => 'connection.update'
}

The connection will time out and disconnect in 5 seconds. (The default is 5 minutes.)

cacheserve> connection.get
{
    type => 'connection.get'
    events => ()
    all-events => ('address-list.changed' 'address-node.changed'
    <OUTPUT SUPPRESSED>
    selector.changed' 'view.changed')
    idle-timeout => '5'
}

idle-timeout appears in the connection object.

cacheserve>
error: 'cacheserve' closed the connection
cacheserve>

When the idle-timeout triggers, the connection is closed. nom-tell is still running.

Executing a command establishes a new connection (with a new TCP port). The timeout in the new connection is the default of 5 minutes.

Connection: Event Subscription


Event subscriptions can be individually selected.

cacheserve> connection.update events=(<TAB>
address-list.changed policy.hit
address-node.changed ratelimiter.abate
<OUTPUT SUPPRESSED>
cacheserve> connection.update events=(server.stop server.restart)
{
    type => 'connection.update'
}

Easy subscription to all events:

cacheserve> connection.subscribe-all
{
    type => 'connection.subscribe-all'
}

Rerunning connection.update overrides previous subscriptions.

cacheserve> connection.update events=(ratelimiter.onset ratelimiter.onset resolver.flush )
{
    type => 'connection.update'
}
cacheserve>

Subscriptions and Timeout:


To prevent missing an event, a connection will not time out when it has a subscription, unless the idle-timeout has been explicitly set.


Connection: Event Notification


• Event notification appears immediately and asynchronously.

cacheserve> view.lis
event:
{
    type => 'resolver.flush'
    name => 'world'
    target => ('domain' '.')
}
cacheserve> view.lis

nom-tell makes the interruption painless by maintaining what was being typed.

• A non-interactive CC accepts connection commands, but doing so is useless.

# nom-tell cacheserve connection.subscribe-all
request:
{
    type => 'connection.subscribe-all'
}

response:
{
    type => 'connection.subscribe-all'

Connection: Unsubscribing from Events
• Several ways exist to unsubscribe from events.
• Additionally, if the server restarts, event subscriptions
are lost.
cacheserve> connection.update unset=(events)
cacheserve> connection.update events=()
cacheserve> connection.replace events=()
cacheserve> connection.replace
cacheserve> exit
cacheserve> quit

connection.replace is like stopping and starting nom-tell, except history is maintained and the TCP port remains open (i.e. it doesn’t change).

connection.replace resets the idle-timeout.


CacheServe Events and SNMP Traps

• Support for SNMP traps and GETs
• Trap destination defined in /var/nom/snmpagent/
• Run snmpagent

[Figure: (1) Event from CacheServe to the SNMP Agent; (2) Trap from the SNMP Agent to the SNMP tool]

SNMP GETs

• Support for SNMP GET

[Figure: (1) GET from the SNMP tool to the SNMP Agent; (2) Instruction to CacheServe; (3) Response to the Agent; (4) Result to the SNMP tool]


rate-limiting
• DNS amplification attack
– Flood of requests with victim’s IP address as source
– Saturate victim’s network link
• “Perfect” rate-limiting: unbounded memory / time
• CacheServe defense
– LRU (Least Recently Used) maintains clients’ state
• Drop some queries if client exceeds limit
• Allow limited “bursts”
– Log and send event with client details

rate-limiting with policy

• Simple configuration
– ratelimiter.add name=first qps=2
fields=((client-network (32 128)))
• /32 and /128 implies per-client “buckets”
– policy.add name=client_ratelimit
selector=(ratelimiter first)
action=truncate
– binding.add policy=client_ratelimit
view=world priority=10
• Monitor mode: change ratelimiter to unenforced
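
The general idea behind per-client rate limiting with bounded memory can be sketched as a token bucket per client, with the least recently seen clients forgotten when the table is full. This Python model is an assumption for illustration and not CacheServe's actual algorithm; the burst and max_clients parameters are hypothetical and are not ratelimiter.add options.

```python
from collections import OrderedDict

class ClientRateLimiter:
    """Toy per-client token bucket with LRU eviction of client state."""

    def __init__(self, qps, burst, max_clients):
        self.qps, self.burst, self.max_clients = qps, burst, max_clients
        self.state = OrderedDict()  # client -> (tokens, last_seen)

    def allow(self, client, now):
        tokens, last = self.state.pop(client, (self.burst, now))
        tokens = min(self.burst, tokens + (now - last) * self.qps)  # refill
        allowed = tokens >= 1
        if allowed:
            tokens -= 1
        self.state[client] = (tokens, now)   # re-insert as most recently seen
        while len(self.state) > self.max_clients:
            self.state.popitem(last=False)   # forget least recently seen client
        return allowed
```

With qps=2 and burst=3, a client may send three queries back to back, is then refused, and regains two tokens per second. Bounding the table with LRU eviction is what keeps memory finite, at the cost of forgetting clients that have been quiet the longest.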


Rate Limiter "fields"

Exercise 6
• Activities
– Request Events with interactive nom-tell with new connection object (use old style request-events for statmon)
– Configure SNMP traps with nom_snmpagent
– Enable policy-based ratelimiter
– Eliminated: the server's rate-limiting, rate-limiting-max-qps, rate-limiting-unenforced, and truncate-factor settings
– Also gone: …by-response-size; use the response-size selector and execute at “presend”


DNS: what needs protection

[Figure: DNS data flow (zone administrator, zone file, dynamic updates, master, slaves, caching server, stub resolver) annotated with threats: corrupting data, unauthorized updates, altered zone data, impersonating master, cache impersonation, and cache pollution by data spoofing. The threats are grouped under server protection and data protection.]

Review: recursion

• Lookup from Caching to Authoritative Servers


– DNS query (domain-name, class, type)
– Random XID (16 bits, about 65,000 values)
• Wait for first answer that arrives
– On correct socket (IP address & source port)
– with correct domain-name, class, type, XID
• Select useful information
– Answer section
– Authority and Additional sections


Spoofed responses to lookups

• Easy
– Create datagram
– Find source port(s)
– Send (one or more)
• Less Easy
– Guess XID
• Hard
– When to send them
• At TTL expiration
• Triggered by query (not hard if the attacker knows or controls when the query was made)

Cache Poisoning Overview

• Understanding the Response-Spoofing problem


– Attackers motivated to seize control of domains
– Minimal tools required to exploit vulnerability
• Strategies and success probability
– Historical perspective
– Kaminsky family of attacks
• Prevention strategies in Vantio CacheServe
– Compacting the success window
– Automatic spoofing detection
– Selective record caching


ID Spoofing Attacks

• These attacks get a resolver to accept an incorrect RRset.
• Resolver clients are then given the incorrect data.
• An attacker sends unsolicited answers to a resolver.
• To succeed, the attacker must match the XID, the source socket, and the query (domain-name, type, and class).
• Names include: ID spoofing attacks, ID guessing attacks and brute-force spoofing attacks.
• The incorrect RRset is commonly returned with a large TTL, so it is also known as a cache poisoning attack.
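
The matching conditions listed above can be written down as a single predicate over the outstanding query and an incoming response. The Python sketch below is a simplified model under assumed field names (the Msg tuple is hypothetical), not the resolver's real data structures; a forged response is accepted only if every field lines up.

```python
from collections import namedtuple

# Fields a resolver can compare between an outstanding query and a response.
Msg = namedtuple("Msg", "xid server_ip server_port local_port qname qtype qclass")

def response_matches(query, response):
    """True only if every identifying field of the response matches the query."""
    return (
        response.xid == query.xid
        and (response.server_ip, response.server_port)
        == (query.server_ip, query.server_port)
        and response.local_port == query.local_port
        and response.qname.lower() == query.qname.lower()  # names compare case-insensitively
        and (response.qtype, response.qclass) == (query.qtype, query.qclass)
    )

query = Msg(34932, "192.0.2.53", 53, 29313, "www.example.com.", "A", "IN")
forged = query._replace(xid=6367)  # attacker guessed the wrong XID
```

Every field the attacker has to guess multiplies the search space, which is why the defenses later in this section add unpredictability to the source port as well as the XID.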

A Window of Opportunity

• Brute-force attack
– High rate of responses theoretically needed to match XID
– Query Source Port Randomization effectively shrinks window
– Lower latency reduces spoofing efficiency

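[Figure: timeline from “Lookup query sent” (XID=34932) to “Lookup response received” (XID=34932), 50 ms to 5 seconds apart; within that window the attacker sends forged responses with guessed XIDs 6367, 6368, 6369, 6370, 6371]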
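
The size of the window can be turned into a rough success probability. The sketch below assumes the attacker sends N distinct guesses while one query is outstanding and that XID and source port are uniformly random; these are simplifying assumptions, and the function name is hypothetical.

```python
def spoof_success_probability(forged_responses, ports=1, xid_bits=16):
    """Chance that one of N distinct forged guesses hits the right
    (XID, source port) pair for a single outstanding query."""
    search_space = (2 ** xid_bits) * ports
    return min(1.0, forged_responses / search_space)

# 1,000 forged responses against a fixed source port versus 2,048 random ports.
fixed_port = spoof_success_probability(1000)
random_ports = spoof_success_probability(1000, ports=2048)
```

Under these assumptions 1,000 forged responses succeed about 1.5% of the time against a fixed source port, but fewer than once in 100,000 attempts when the query could have left from any of 2,048 ports.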


Tiny Window becomes HUGE

Kaminsky-style attacks
– Risk of poisoning considered tolerable by
most DNS operators until March 2008
– Dan Kaminsky devises new strategy
• Trigger lookups on demand
– Query names which are not cached
– Spoofed response flood begins at once
– Exposes any name to brute-force poisoning
• Judiciously constructed “Additional” records
– inject NS RRs

Attacker opens the Window

• Kaminsky-style attack
– Initiate exploit anytime
– Tune the spoof attempts and repeat at will
– Info in “additional” section will hijack domain
Example Query: fo5emde.wellsfargo.com

[Figure: timeline. Resolver side: lookup query sent, response received, query 1 sent, response 1 received, query 2 sent, response 2 received. Attacker side: attacker probe query arrives, attacker response sent, trigger 1 arrives, response 1 sent, trigger 2 arrives, response 2 sent.]


CacheServe Protection Settings for ID Spoofing Attacks
• The settings here are covered in the following slides.
• query-source-pool, query-source-pool-v6:
control the pool of ports from which CacheServe
sends outgoing queries.
• log-id-spoofing: controls if CacheServe logs
warnings for suspected ID spoofing attacks.
• qname-case-randomization: controls how
CacheServe randomizes the case of requests.
• qname-case-randomization-exclusions:
excludes certain queries from case randomization.

CacheServe features

• Compact the window of opportunity


– QSPR (query-source-pool)
–QSPR=Query Source Port Randomization
– Low latency
• Restrict use of additional records
– Ignore additional info in answers (not referral)
– Ignore authority info in answers (not referral)


CacheServe Feature: QSPR

• The industry solution for Kaminsky’s findings was to have recursors randomize source ports for queries.
• This increases the difficulty of successfully executing the attack.
• QSPR (Query Source Port Randomization) is enabled by default in CacheServe.
# lsof -i UDP | grep cacheser | grep '*' | wc -l
512
# ss -lup | grep cacheserv | grep -Ev '127.0.0.1|fe80:' | wc -l
513

Linux commands to approximate the number of UDP ports opened by an unaltered CacheServe.

(What to grep from the commands changes with changes to CacheServe’s configuration.)

CacheServe Feature: QSPR

• For efficiency, CacheServe opens all its random outgoing UDP ports when it starts, or when the number of ports is changed.
• Changing the number of ports is through the resolver object.

cacheserve> resolver.update name=world query-source-pool=(1024 192.0.2.1#0)

1024 outgoing querying ports will be used in the world resolver for the IP address 192.0.2.1.

Note that the number of querying ports is set for each IP address used for outgoing queries (generally one IPv4 and one IPv6 address).

The port after # should be zero, or the #0 can be left out altogether. Any other value is taken as the start of a sequential list of ports, which is useful if firewalls must be traversed.


CacheServe Feature: QSPR

• Increasing the QSPRs:


# lsof -i UDP | grep cacheser | grep '*' | wc -l
512
# ss -lup | grep cacheserv | grep -Ev '127.0.0.1|fe80:' | wc -l
513

# nom-tell cacheserve \
'resolver.update name=world query-source-pool=(4096 192.0.2.1#0)'

# lsof -i UDP | grep cacheser | grep 192.0.2.1 | wc -l
2048
# ss -lup | grep cacheserv | grep 192.0.2.1 | wc -l
2048

Although it can be configured higher, the number of random UDP ports maximizes at 2048 per querying address.

CacheServe Feature: QSPR

• Viewing the open ports:


# lsof -i UDP | grep cacheser | grep 192.168.88.213 | head -n 3
cacheserv 1364 root 543u IPv4 23842 0t0 UDP 192.168.88.213:29313
cacheserv 1364 root 544u IPv4 23843 0t0 UDP 192.168.88.213:20264
cacheserv 1364 root 545u IPv4 23844 0t0 UDP 192.168.88.213:20025

# ss -lup | grep cacheserv | grep '192.168.88.213' | head -n 3
UNCONN 0 0 192.168.88.213:28593 *:* users:(("cacheserve",1364,2450))
UNCONN 0 0 192.168.88.213:21937 *:* users:(("cacheserve",1364,2438))
UNCONN 0 0 192.168.88.213:49073 *:* users:(("cacheserve",1364,2317))


log-id-spoofing

• The resolver setting log-id-spoofing configures CacheServe to log a message when it suspects an ID spoofing attack.
• Logging is done only when there is a relatively strong suspicion that an attack is taking place.
• The resolver.id-spoofing-suspected event is raised when an ID spoofing attack is suspected.
• It is issued at the same time as the log entry is made.
• The id-spoofing-defense-queries statistic tracks the number of times the defense mechanism has been triggered (TCP used instead of UDP).
cacheserve> resolver.update name=world log-id-spoofing=true
Default: false

Query Case Randomization

• By mixing the case of outgoing queries, recursors can lower the risk of ID spoofing attacks.
• By default, CacheServe sends queries with the case matching the arriving query (randomization=off).
cacheserve> resolver.update name=world qname-case-randomization=off
cacheserve> resolver.update name=world qname-case-randomization=unenforced
cacheserve> resolver.update name=world qname-case-randomization=enforced
cacheserve> resolver.update name=world qname-case-randomization=silent-enforced

unenforced: Log only.
enforced: Trigger spoofing defense mechanism (queries over TCP, raise event, etc.) and log.
silent-enforced: Trigger spoofing defense mechanism but don’t log.

If a zone is found with authoritative servers that do not properly respond with mixed case, it can be white-listed with:
qname-case-randomization-exclusions
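The idea behind case randomization can be sketched in a few lines: the recursor sends the query name with random letter case and accepts a response only if the name comes back with exactly the same case. This is an illustrative model, not CacheServe's implementation; both function names are made up for the example.

```python
import random


def randomize_case(qname: str, rng: random.Random) -> str:
    """Return qname with the case of each letter chosen at random."""
    return "".join(
        ch.upper() if rng.random() < 0.5 else ch.lower() for ch in qname
    )


def case_matches(sent_qname: str, echoed_qname: str) -> bool:
    """A response passes only if it echoes the exact case that was sent."""
    return sent_qname == echoed_qname


if __name__ == "__main__":
    rng = random.Random()
    sent = randomize_case("www.nominum.com", rng)
    print(sent)
    print(case_matches(sent, sent))          # a server that echoes the case
    print(case_matches(sent, sent.lower()))  # a server that lower-cases names
```

A spoofer who cannot see the outgoing query has to guess the case pattern in addition to the message ID and source port. An authoritative server that rewrites the case fails the same check, which is why exclusions exist.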


Other CacheServe Behavior

• A CNAME could be saved when other data existed, and then used once the other data expired from cache
Q: www.google.com TYPE1000
A: www.google.com CNAME www.evil.org
– CacheServe does not cache the CNAME in that case
• Additional Section data in answers is ignored (but not in referrals).
Q: 0001.google.com A
A: 0001.google.com A 1.1.1.1
AD: www.google.com A 6.6.6.6 (ignored)
• Glue (separate delegation cache neutralizes attack)
Q: 0001.google.com A
AU: 0001.google.com NS www.google.com
AD: www.google.com A 6.6.6.6

Exercise 7

• Check default QSPR
lsof -p <pid> | wc -l
shows ports used for outgoing requests
• Improve resistance by increasing ports
• Resolver qname-case-randomization (default off):
enforced, unenforced, silent-enforced
• Find cases of “qname” case mismatch
• Enforce case matching:
id-spoofing-defense-queries shows count
auth-monitoring also shows TCP requests
• Exclude domains from case-randomization


DNS: what needs protection

[Figure: DNS components (zone administrator, zone file, dynamic updates, master, slaves, caching server, stub resolver) with five numbered points of attack.]
Threats shown: corrupting data, unauthorized updates, altered zone data, impersonating master, cache impersonation, cache pollution by data spoofing.
The threats are grouped under server protection and data protection.

DNSSEC Summary

• Data authenticity and integrity by signing the Resource Record Sets with a private key
• Public DNSKEYs used to verify the RRSIGs
• Children sign their zones with their private key
– Authenticity of that key established by signature (hash) published in parent zone
– Data is not encrypted


DNSSEC: What is it?

• Four new resource records
– RRSIG: the signature for a resource record
– DNSKEY: a public key
– NSEC: an indication of ‘holes’
– DS: hash of public key published to parent or added to Trust Anchor Repository

• Types of keys:
– ZSK: zone signing key
• This is used to sign the RRs in a zone.
– KSK: key signing key
• This is used to sign the DNSKEYs in a zone.
– Done to avoid more communication with the parent or external resolvers

DNSSEC: What is it?

• Trust anchor – the public key or hash of the public key used for a particular zone
– This must be communicated to the resolver in order to correctly validate a signature.
• Signing the ‘root’ – indicates that a trust anchor exists for the ‘root’ zone that can be used as the start for validation.
• DLV (DNSSEC Lookaside Validation) service
– Not supported


Exercise 8

• Activities
– Configure CacheServe "DNSSEC-aware"
– Configure CacheServe built-in managed-key: specify “.” only, omit key

• Discussion
– CacheServe uses EDNS0 by default
– Evolution of root key via RFC 5011: CacheServe will “follow” rollover
– Enable “log-dnssec” resolver configuration element for additional detail

6. Real-Time Visibility (RTV)


Real-Time Visibility (RTV)

• RTV collects and stores queries in a database.
• RTV additionally provides a powerful system to access and analyze the collected data.
• The collection feature is similar to DNSTAP, found in other DNS servers.
• RTV is available in both AuthServe and CacheServe.
• RTV is also known as the querystore.
• More accurately: RTV is made up of the querystore and statmon.
• The querystore is also the database of stored queries.

RTV: statmon and the Querystore

• RTV is disabled by default.
• When enabled, a server does not store the queries.
• Instead, it sends them to another process, the Statistics Monitor (statmon).
• (Obviously, statmon must be running.)
• statmon has a database for storing queries, known as the querystore.

[Figure: Vantio Name Server (CacheServe or AuthServe), Statistics Monitor (statmon), Querystore, and nom-tell.]


Server Querystore and Statmon

• Query collection is enabled on a server using the monitoring object.
• Access and analysis of queries is through the Statistics Monitor (statmon).

[Figure: Vantio Name Server (CacheServe or AuthServe), Statistics Monitor (statmon), Querystore, and nom-tell.]

Querystore (RTV): What is Collected?

When configured, the monitoring object collects arriving queries.
monitoring can be additionally configured to collect responses.
For CacheServe only, the object auth-monitoring collects queries to authoritative servers in a separate querystore (separate database). Both queries and answers are collected.

[Figure: Clients of Name Server, Name Server, and Other DNS Servers (e.g.: Authoritative Servers, Forwarders).]


Accessing The Querystore

• As with CacheServe and AuthServe, communication with the Statistics Monitor is over a CC.
# nom-tell statmon

statmon> <TAB>
auth-report. process-information show-events
data-streaming. querystore. stop
instance-information report. uuid
list-events request-events version
statmon>

Enabling CacheServe or AuthServe to collect queries is covered later (the monitoring object).

Querystore: count and Time Limitation

• count is the number of queries recorded.
• It will continually increase and decrease.
• statmon not only adds queries to the database, but removes older ones as well.
• By default one day of data is stored.
• As older queries are removed, count decreases.
statmon> querystore.count
{
type => 'querystore.count'
count => '113422'
}

count is the total number of queries currently in the querystore.


Querystore: Queries Per Second

• qps is the number of queries per second that have been received.
• Like count it continually increases and decreases.
statmon> querystore.qps
{
type => 'querystore.qps'
qps => '42.495'
}

The queries per second of all queries currently in the querystore.
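When these responses are consumed by a script rather than read on screen, the key => 'value' lines are easy to pick apart. The sketch below is one possible approach and handles only single-quoted scalar values as shown on these slides; list values such as flags => ('RD') are skipped. The function name and the sample text are illustrative.

```python
import re

# Matches lines such as:  qps => '42.495'
FIELD_RE = re.compile(r"^\s*([\w-]+)\s*=>\s*'([^']*)'\s*$")


def parse_record(text: str) -> dict:
    """Collect key => 'value' pairs from one response block."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


SAMPLE = """{
type => 'querystore.qps'
qps => '42.495'
}"""

if __name__ == "__main__":
    record = parse_record(SAMPLE)
    print(record["type"], float(record["qps"]))
```

The same helper works on the querystore.count output, where the value of count can be converted with int().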

Querystore: Top Domains

• top-domains are the individual domain names, not apexes, that have been most queried.
statmon> querystore.top-domains max-results=3
{
type => 'querystore.top-domains'
domain => 'google.com'
percentage => '21.6'
qps => '13.040'
count => '3925'
}
{
domain => 'www.google.com'
percentage => '17.0'
qps => '10.272'
count => '3092'
}
{
domain => 'xyz.google.com'

Without max-results, the top twenty most queried names are displayed.
The domain names aren’t listed alone. Total count, qps, and percentage of all queries are included as well.


Querystore: Top Clients

• top-clients are the IP addresses from which the most queries have come.
statmon> querystore.top-clients
{
type => 'querystore.top-clients'
address => 'fd0c:a43a:811f:ac:10bb::'
percentage => '93.8'
qps => '0.050'
count => '15'
}
{
address => '127.0.0.1'
percentage => '6.2'
qps => '0.003'
count => '1'
}

A production server could have hundreds of thousands of clients.
This test server has only had two.

Querystore: Replay

• replay provides the full details of individual queries.
statmon> querystore.replay
{
type => 'querystore.replay'
timestamp => '1465408166'
start-time => '1465408165.668365'
end-time => '1465408165.668365'
serial => '356630'
ip-version => '4'
client-address => '172.16.187.1#56806'
local-address => '172.16.187.10#53'
name => 'abc.nominum.com'
query-class => 'IN'
query-type => 'A'
view => 'world'
zone => 'nOminUm.CoM'
resolver => 'world'
query-id => '62828'
flags => ('RD')
response-flags => ('RA' 'RD')
request-size => '33'
response-size => '84'
result-code => 'nxdomain'
engine-name => 'cacheserve'
engine-version => '7.1.1.0'
node-id => 'dafff0c3-054b-5d19-b994-4d23fe5d70f2'
}
<OUTPUT SUPPRESSED>

All the output here is for one query only.
This use of querystore.replay, without limiting the output through further options, is strongly discouraged. Even a very lightly loaded resolver typically handles several queries per second. The amount of output generated will be massive.
engine-name, engine-version: The engine that forwarded the query to the statmon: CacheServe or ans (AuthServe).


Querystore: Replay: Output

statmon> querystore.replay
{
type => 'querystore.replay'
timestamp => '1465408166'
start-time => '1465408165.668365'
end-time => '1465408165.668365'
serial => '356630'
<OUTPUT SUPPRESSED>
}

timestamp, start-time, end-time: Time values are shown in UNIX Time. On Linux systems, the time can be made human readable in the local timezone:
# date -d @1465408166
Wed Jun 8 10:49:26 PDT 2016

serial: This is the 356,630th query that statmon has processed.
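Where the date program is not at hand, the same conversion can be scripted. A minimal sketch, using only the Python standard library and printing in UTC rather than the local timezone; the function name is illustrative.

```python
from datetime import datetime, timezone


def unix_to_utc(timestamp: str) -> str:
    """Format a querystore time value (seconds since the epoch) in UTC."""
    moment = datetime.fromtimestamp(float(timestamp), tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


if __name__ == "__main__":
    print(unix_to_utc("1465408166"))         # 2016-06-08 17:49:26 UTC
    print(unix_to_utc("1465408165.668365"))  # fractional values are accepted too
```

17:49:26 UTC is the same instant as the 10:49:26 PDT shown by date on the slide.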

Querystore: Replay: Output


statmon> querystore.replay
{
<OUTPUT SUPPRESSED>
client-address => '172.16.187.1#56806'
local-address => '172.16.187.10#53'
name => 'abc.nominum.com'
query-class => 'IN'
query-type => 'A'
<OUTPUT SUPPRESSED>
}

client-address: The client socket.
local-address: The server socket where the query arrived.
name, query-class, query-type: The three fields of any query: domain name, class, and type.


Querystore: Replay: Output


statmon> querystore.replay
{
<OUTPUT SUPPRESSED>
view => 'world'
zone => 'nominum.com'
resolver => 'world'
query-id => '62828'
flags => ('RD')
response-flags => ('RA' 'RD')
<OUTPUT SUPPRESSED>
}

view: The view that received the query.
zone: For AuthServe, the name of the zone. For CacheServe, the zone from the AUTHORITY section when applicable (e.g. NXDOMAIN).
resolver: The resolver that received the query (CacheServe only).
query-id: The query-id (message-id) of the incoming query.
flags, response-flags: The flags in the incoming query (flags), and in the outgoing response (response-flags).

Querystore: Replay: Output

statmon> querystore.replay
{
<OUTPUT SUPPRESSED>
request-size => '33'
response-size => '84'
result-code => 'nxdomain'
engine-name => 'cacheserve'
engine-version => '7.1.1.0'
node-id => 'dafff0c3-054b-5d19-b994-4d23fe5d70f2'
}
<OUTPUT SUPPRESSED>

request-size, response-size: The byte counts of the query and response.
result-code: The Response-Code (RCODE) in the answer sent (e.g. NOERROR, NXDOMAIN, REFUSED, etc.).
engine-name, engine-version: Various engines (CacheServe, AuthServe) can send information to the statmon. This indicates the source engine and its version.
node-id: The node-id is a Nominum internal uuid for the engine.


Querystore: Replay: Output


By default, answers are not sent by a server to the statmon, and therefore not logged. However, the result-code is logged by default.
{
timestamp => '1465465165'
<OUTPUT SUPPRESSED>
name => 'cmu.edu'
query-class => 'IN'
query-type => 'A'
view => 'world'
<OUTPUT SUPPRESSED>
response-size => '41'
result-code => 'noerror'
<OUTPUT SUPPRESSED>
}

Answers are logged when enabled through an option in CacheServe or AuthServe.
{
timestamp => '1465466288'
<OUTPUT SUPPRESSED>
name => 'ibm.com'
query-class => 'IN'
query-type => 'A'
view => 'world'
<OUTPUT SUPPRESSED>
response-size => '41'
result-code => 'noerror'
answer => (('ibm.com' '21600' 'A' 'IN' '129.42.38.1'))
<OUTPUT SUPPRESSED>
}

Querystore: Restricting Output

• The querystore commands just shown are amongst the most useful.
• To get the most use out of any querystore command, restrictions are placed on the output.
# nom-tell statmon querystore.qps | grep 'qps ='
qps => '495.026'

QPS over what time period?
Without specifying it, all queries in the querystore are included, and the duration those are kept is dependent on the configuration set in CacheServe or AuthServe.


Querystore: Limiting Duration

• duration limits the calculation to a time window.
# nom-tell statmon querystore.count duration=300 | grep 'count ='
count => '1214'
# nom-tell statmon querystore.count duration=5m | grep 'count ='
count => '1213'
# nom-tell statmon querystore.qps duration=300s | grep 'qps ='
qps => '4.337'
statmon> querystore.replay duration=1
{
type => 'querystore.replay'
timestamp => '1465468359'
<OUTPUT SUPPRESSED>

In the first three examples, the data set is limited to the most recent 300 seconds.
For replay, 1s of data is retrieved.
Unit suffixes may be used for the values (e.g. m=minutes, h=hours).

[Figure: qps plotted against time t from -24 hours to now, with the duration window marked.]

Querystore: End

• end is the Unix Time, in seconds, after which queries should not be included.
• end can be used together with duration, but they are not the same kind of value: end is a point in time, duration is a length of time.
• With the help of the date program, duration and end work well together.
# nom-tell statmon querystore.count duration=60 end=\
$(($(date +%s) -240)) | grep 'count ='
count => '1150'
How many queries were there in a one-minute period, starting five minutes ago (ending four minutes ago)?

# nom-tell statmon querystore.replay duration=3600 \
end=$(($(date +%s) -7200))
<OUTPUT SUPPRESSED>
Show all queries over one hour ending two hours ago.
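The shell arithmetic for end can also be done in a small helper when the commands are generated by a script. This is a sketch only: it builds the duration and end arguments as text and does not call nom-tell; the function name and its parameters are illustrative.

```python
import time
from typing import Optional


def window_args(duration_s: int, ended_ago_s: int,
                now: Optional[float] = None) -> str:
    """Build 'duration=... end=...' for a window that ended ended_ago_s ago.

    end is an absolute Unix time in seconds; duration is a length in seconds.
    """
    if now is None:
        now = time.time()
    end = int(now) - ended_ago_s
    return f"duration={duration_s} end={end}"


if __name__ == "__main__":
    # One minute of data, ending four minutes ago.
    print("querystore.count " + window_args(60, 240))
    # One hour of data, ending two hours ago.
    print("querystore.replay " + window_args(3600, 7200))
```

Passing now explicitly makes the result reproducible, which is convenient when comparing several commands over exactly the same window.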


Querystore: Interval

• interval limits the calculation to queries from a time range.
statmon> querystore.top-domains interval=(2016-06-09:03:40:00 2016-06-09:03:45:00)
{
type => 'querystore.top-domains'
domain => 'wormhole.movie.edu'
<OUTPUT SUPPRESSED>

interval takes a start and stop time for the queries to include.
Format: YYYY-MM-DD:hh:mm:ss
The start time here is 3:40 AM on June 9th, 2016.

statmon> querystore.count interval=(2016-06-09:03:40:00 2016-06-09:03:45:00)
{
type => 'querystore.count'
count => '451'
}
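Typing the two timestamps by hand is error-prone, so a script can format them. The sketch below produces the interval argument in the YYYY-MM-DD:hh:mm:ss form shown above from a start time and a length. Which timezone statmon applies to these values is not stated on the slides, so the example passes naive datetimes through unchanged; the function name is illustrative.

```python
from datetime import datetime, timedelta

INTERVAL_FORMAT = "%Y-%m-%d:%H:%M:%S"  # YYYY-MM-DD:hh:mm:ss


def interval_arg(start: datetime, length: timedelta) -> str:
    """Build 'interval=(start stop)' for a window of the given length."""
    stop = start + length
    return (
        f"interval=({start.strftime(INTERVAL_FORMAT)} "
        f"{stop.strftime(INTERVAL_FORMAT)})"
    )


if __name__ == "__main__":
    start = datetime(2016, 6, 9, 3, 40, 0)
    print("querystore.count " + interval_arg(start, timedelta(minutes=5)))
```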

Interval, duration, end

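[Figure: qps plotted against time t, with three windows marked: the interval from T1 to T2, the duration T3 ending at T (now), and the duration T4 ending at T5.]

querystore.count interval=(T1,T2)
querystore.count duration=T3
querystore.count end=T5 duration=T4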


Querystore: max-results

• max-results limits the output of lists.
statmon> querystore.top-domains max-results=2
{
type => 'querystore.top-domains'
domain => 'wormhole.movie.edu'
percentage => '17.7'
qps => '0.009'
count => '62'
}
{
domain => 'wh.movie.edu'
percentage => '9.1'
qps => '0.004'
count => '32'
}

Commands that don’t output lists, such as querystore.qps and querystore.count, cannot be limited by max-results.
statmon> querystore.qps <TAB>
anonymize duration end filter interval source

Querystore: filters

• Filters limit the output of lists.
statmon> querystore.replay filter=( (client-address (t (::1))) )
<OUTPUT SUPPRESSED>

Each filter is a field, a Boolean, and the value of the field to match.
Here only a client with the address ::1 (IPv6) is included in the output.
Formatting hint: Each individual filter ends in three closing parentheses.

Multiple filters can be combined.
Here the client must be ::1, the response must come from the world view, the RCODE must be NOERROR, and the response size must be less than 100 bytes (response-size-ge is false).
Additionally, only results from the last ten minutes are included.

statmon> querystore.replay filter=( (client-address (t (::1))) (view (true (world))) (result-code (1 (NOERROR))) (response-size-ge (f (100))) ) duration=10m
<OUTPUT SUPPRESSED>
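Because the parentheses are easy to miscount, a script can assemble the filter text. The sketch below builds the string from (field, Boolean, values) triples in the layout used on this slide. It only builds text; it does not validate field names or talk to statmon, and the function name is illustrative. The spellings true and f are the ones that appear in the slide examples.

```python
def build_filter(conditions) -> str:
    """Build a filter=( ... ) argument.

    conditions: iterable of (field, wanted, values) triples, where wanted
    is a bool and values is a list of strings to match.
    """
    parts = []
    for field, wanted, values in conditions:
        boolean = "true" if wanted else "f"
        joined = " ".join(values)
        # Each individual filter ends in three closing parentheses.
        parts.append(f"({field} ({boolean} ({joined})))")
    return "filter=( " + " ".join(parts) + " )"


if __name__ == "__main__":
    print(build_filter([
        ("client-address", True, ["::1"]),
        ("view", True, ["world"]),
        ("response-size-ge", False, ["100"]),
    ]))
```

Several values in one triple are written inside the same innermost parentheses, which is how the slides express a logical OR on one field.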


Querystore: Filter Usage


statmon> querystore.replay filter=( (client-address (t (::1))) (client-address (true (127.0.0.1))) )
{
type => 'querystore.replay'
}
Although multiple filters can be combined, using the same filtering criterion twice (e.g. client-address) never produces any results!

statmon> querystore.replay filter=( (client-address (true (127.0.0.1 ::1))) )
{
type => 'querystore.replay'
timestamp => '1465480356'
<OUTPUT SUPPRESSED>
Logical OR is achieved by adding additional values to one filtering property.

Querystore: Domain Filtering


statmon> querystore.top-domains filter=((name (true (diehard.movie.edu robocop.movie.edu))))
{
type => 'querystore.top-domains'
domain => 'robocop.movie.edu'
percentage => '50.2'
qps => '0.129'
count => '926'
}
{
domain => 'diehard.movie.edu'
percentage => '49.8'
qps => '0.127'
count => '918'
}
When filtering on domains, the percentage is of the total output.

statmon> querystore.top-domains filter=((name (true (robocop.movie.edu))))
{
type => 'querystore.top-domains'
domain => 'robocop.movie.edu'
percentage => '100.0'
qps => '0.160'
count => '1152'
}


Querystore: Practical Example


statmon> querystore.count duration=5m
{
type => 'querystore.count'
count => '12023'
}
statmon> querystore.count filter=((client-address
(t(172.16.187.1)))) duration=5m
{
type => 'querystore.count'
count => '11499'
}

Here we see almost all queries in the last five minutes were from one host.
On a production name server normally serving many hosts, this may be an indication of an attack.
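The comparison can be made explicit by turning the two counts into a percentage. A minimal sketch using the numbers from this slide; the function name is illustrative, and what share counts as suspicious is a local judgment, not something the slides define.

```python
def client_share(client_count: int, total_count: int) -> float:
    """Percentage of all queries in the window that came from one client."""
    if total_count == 0:
        return 0.0
    return 100.0 * client_count / total_count


if __name__ == "__main__":
    share = client_share(11499, 12023)
    print(f"{share:.1f}% of queries came from 172.16.187.1")
```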

Querystore: Domain Filtering: name vs. domain

statmon> querystore.top-domains filter=((name (true (movie.edu))))
{
type => 'querystore.top-domains'
domain => 'movie.edu'
percentage => '100.0'
qps => '0.181'
count => '1302'
}
statmon> querystore.top-domains filter=((domain (true (movie.edu))))
{
type => 'querystore.top-domains'
domain => 'wormhole.movie.edu'
percentage => '15.2'
qps => '0.353'
count => '2540'
}
{
domain => 'wh.movie.edu'
percentage => '8.1'
qps => '0.188'
count => '1357'
}
{
domain => 'terminator.movie.edu'
percentage => '7.8'
qps => '0.182'
count => '1311'
}
{
domain => 'movie.edu'
<OUTPUT SUPPRESSED>

name filters for the given domain-name.
domain treats the given domain-name as an apex and filters for everything within that domain.
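The difference between the two filters can be modeled as exact match versus apex (suffix) match on whole labels. The sketch below is an illustration of that distinction only, not statmon's matching code; it assumes case-insensitive comparison and ignores a trailing dot, and the function names are made up for the example.

```python
def normalize(name: str) -> str:
    """Lower-case a domain name and drop a trailing dot."""
    return name.rstrip(".").lower()


def name_matches(qname: str, wanted: str) -> bool:
    """'name' style: only the given domain name itself matches."""
    return normalize(qname) == normalize(wanted)


def domain_matches(qname: str, apex: str) -> bool:
    """'domain' style: the apex and everything within it match."""
    q, a = normalize(qname), normalize(apex)
    return q == a or q.endswith("." + a)


if __name__ == "__main__":
    for qname in ("movie.edu", "wh.movie.edu", "badmovie.edu"):
        print(qname, name_matches(qname, "movie.edu"),
              domain_matches(qname, "movie.edu"))
```

Matching on ".movie.edu" rather than the bare suffix keeps an unrelated name such as badmovie.edu out of the apex match.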


Enabling the Querystore

• Viewing query statistics through statmon is possible only after collection has been enabled.
• Query collection is enabled in a server (CacheServe or AuthServe) using the monitoring object.
cacheserve> monitoring.get
{
type => 'monitoring.get'
}
In a newly installed server, monitoring is disabled.

cacheserve> monitoring.<TAB>
monitoring.get monitoring.statistics
monitoring.replace monitoring.update
monitoring.statistics is available only in CacheServe.

Enabling the Querystore

• Enabling query collection with defaults:


cacheserve> monitoring.update querystore={}
ans> monitoring.update querystore={}

• Selected defaults:
• duration: 24 hours: When a query's age reaches
the duration, it is removed from the querystore.
• max-size: unlimited: If the querystore exceeds
max-size, the oldest queries are deleted.
• anonymize-search-results: false: Whether to
anonymize client addresses in querystore results.
• include-answers: false: Whether to store
queries’ answers returned to the clients.


Enabling the Querystore

When configured, the monitoring object collects arriving queries.

[Figure: Clients of Name Server and CacheServe, labeled: monitoring include-answers=true.]

• To avoid filling available disk space, it is highly recommended to set max-size.
cacheserve> monitoring.replace querystore={max-size=300M duration=7d include-answers=true}

Disabling the Querystore

• Disabling the logging of queries:


cacheserve> monitoring.update unset=(querystore)
ans> monitoring.update unset=(querystore)


Querystore: Auth-Monitoring

The object auth-monitoring collects queries to authoritative servers.

[Figure: Clients of CacheServe, CacheServe, and Other DNS Servers (e.g.: Authoritative Servers, Forwarders).]

• auth-monitoring is analogous to monitoring. It collects outgoing queries to other servers, not arriving queries from clients (defaults are the same).
cacheserve> auth-monitoring.replace auth-querystore={max-size=200M duration=3d include-answers=false}
To access the auth-querystore in statmon:
statmon> auth-querystore.<TAB>

Exercise 9

• Activities
– Enable RTV with CC instruction
– Define querystore
• short lifetime to acquire new data; see it “age-out”
• long lifetime to accommodate forensic activity
– Experiment with
• Searches (now including core domains)
• Filters
– “in-line” search (report)
– Dump querystore as text file
• statmon_export utility


Real-Time Alerts

• RTV (covered earlier)
– Permits aggregation and reporting
– Facilitates audit of infrequent requests
• Real-Time Alerts (aka querythresholding)
– Asynchronous indication of qps change
– Works well only for streams of at least 5 qps
– Duration / Onset (Threshold Value) / Abate
• Modifiers:
– filter (on individual threshold definition)
– querythreshold-filter (applies to all thresholds)

querythreshold configuration

• Declare statistic to track (e.g. name, result-code)
• Define qualifiers
– window and triggers
• Duration (seconds)
• Onset (qps)
• Abate (qps)
– what action to take
• log
• log-and-event
– label output with id
• User-defined text

[Figure: qps plotted against time t from -24 hours to now, with an inset of qps against t (sec) showing the duration window.]


Sample querythreshold
• Track total number of queries
• Define querythreshold in CacheServe
– Statistics info
• Duration (15 seconds)
• Onset (10 qps)
• Abate (7 qps)
– Identifier and Action
• Inspect statistics in statmon
• Log or Event when average value
– Exceeds onset on way up, and
– Falls below abate on the way down

[Figure: average total qps over the 15-second duration window, plotted against t (sec), with the onset line at 10 and the abate line at 7.]
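The onset/abate pair behaves like a hysteresis: an average between the two values changes nothing. The sketch below simulates that logic on per-second qps samples. It is a model under stated assumptions, not statmon's algorithm: the average is a simple mean over the last duration samples (fewer while the window is still filling), and the function name is illustrative.

```python
from collections import deque


def threshold_events(samples, duration=15, onset=10.0, abate=7.0):
    """Return (second, 'onset' | 'abate') pairs for a series of qps samples."""
    window = deque(maxlen=duration)  # keeps only the last `duration` samples
    active = False
    events = []
    for second, qps in enumerate(samples):
        window.append(qps)
        average = sum(window) / len(window)
        if not active and average > onset:
            active = True
            events.append((second, "onset"))
        elif active and average < abate:
            active = False
            events.append((second, "abate"))
    return events


if __name__ == "__main__":
    samples = [5, 5, 5, 20, 20, 20, 8, 8, 8, 2, 2, 2]
    # A short 3-second window keeps the example easy to follow by hand.
    print(threshold_events(samples, duration=3))
```

In the sample run the average sits at 8 qps for a moment, between abate and onset, and no event is produced until it drops below 7.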

Percentage-based querythreshold

• Absolute threshold values can be problematic
– Periodic variations
– Normal growth
• Monitor ratios with threshold-percentage
– Compute fraction of servfails: servfail ÷ total
– This value is insensitive to volume

[Figure: total (qps) and servfail (qps) plotted against t (sec) over the duration window, with onset 10,000 and abate 7,000 marked.]
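A short calculation shows why the ratio is insensitive to volume: when total traffic and SERVFAIL traffic grow by the same factor, the percentage does not move. The function name is illustrative.

```python
def servfail_percentage(servfail_qps: float, total_qps: float) -> float:
    """SERVFAIL responses as a percentage of all queries."""
    if total_qps <= 0:
        return 0.0
    return 100.0 * servfail_qps / total_qps


if __name__ == "__main__":
    print(servfail_percentage(50, 1000))    # quiet period
    print(servfail_percentage(500, 10000))  # ten times the traffic, same ratio
    print(servfail_percentage(400, 1000))   # same traffic, more failures
```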


Configuring threshold-percentage

querythreshold => (
(
'total'
{
id => 'server_qps'
action => 'none'
threshold => ('15' '10' '5')
}
)
(
'result-code'
{
id => 'servfail_fraction'
action => 'log'
target => 'servfail'
threshold-percentage => ('20' '30' '10' 'server_qps')
}
)
)

Exercise 10

• Activities
– Update monitoring object with threshold declarations for statistics tracking
– Listen for events from statmon
• Discussion
– Events are triggered when the average value of a tracked attribute (for example, total QPS) over a duration
• exceeds an onset value (this is the “active” state)
• falls below an abate value


CacheServe Utilities

• CacheServe stores its configuration in a database.
• Normal access is through a command channel (CC) communicating with CacheServe.
• The CacheServe utilities are an advanced feature that allows the databases to be read and manipulated even when CacheServe is not running.
• The utilities work differently when communicating with a running server or with databases directly.

cacheserve-dumpconf
# nom-tell cacheserve resolver.get name=world
<OUTPUT SUPPRESSED>
response:
{
type => 'resolver.get'
name => 'world'
preload => (('localhost' 'A' '127.0.0.1') ('localhost' 'AAAA' '::1'))
log-id-spoofing => 'true'
qname-case-randomization => 'enforced'
query-source-pool => ('2048' '192.168.88.213#0')
}
# cacheserve-dumpconf --object-type resolver --name world
{
name => "world"
preload => (("localhost" "A" "127.0.0.1") ("localhost" "AAAA" "::1"))
log-id-spoofing => "true"
qname-case-randomization => "enforced"
query-source-pool => ("2048" "192.168.88.213#0")
}

The nom-tell command and the cacheserve-dumpconf command above provide the same information, retrieved from a running CacheServe over the CC.
Used this way, cacheserve-dumpconf does NOT directly access the database file.


cacheserve-dumpconf
Here, CacheServe has been stopped.
# cacheserve-dumpconf --object-type resolver --name world
cacheserve-dumpconf: critical: Connection refused

# nom-tell cacheserve resolver.get name=world
<OUTPUT SUPPRESSED>
nom-tell: critical: could not send to 'cacheserve': Connection refused

# cacheserve-dumpconf --configuration /var/nom/cacheserve/cacheserve.vdb2 \
--object-type resolver --name world
{
name => "world"
preload => (("localhost" "A" "127.0.0.1") ("localhost" "AAAA" "::1"))
log-id-spoofing => "true"
qname-case-randomization => "enforced"
query-source-pool => ("2048" "192.168.88.213#0")
}

With CacheServe stopped, CacheServe utilities can directly access the database with --configuration and the database name (-c can be used as well).
The “.vdb2” in the database name can be excluded.

cacheserve-editconf

• cacheserve-editconf can communicate with a running CacheServe or directly with a database.
• It opens in a text editor, configurable with the EDITOR and VISUAL shell environment variables.
• If they are not set, it opens in vi.
# cacheserve-editconf --configuration /var/nom/cacheserve/cacheserve \
--object-type resolver --name world

{
name => "world"
preload => (("localhost" "A" "127.0.0.1") ("localhost" "AAAA" "::1"))
log-id-spoofing => "true"
qname-case-randomization => "enforced"
query-source-pool => ("2048" "192.168.88.213#0")
}
~
~
~
"~/.nom/tmp/cacheserve_editconf.4033" 7L, 224C

The object is shown here in the vi editor.
For non-vi users, the editor can be changed from BASH. For example:
# export EDITOR=nano


Other CacheServe Utilities

• cacheserve-deleteconf allows the removal of an object, for example a view or a policy.
• cacheserve-loadconf loads objects.

Using cacheserve-*
• Database (shell) utilities:
cacheserve-dumpconf cacheserve-listconf
cacheserve-editconf cacheserve-loadconf
• CacheServe running
cacheserve-dumpconf --list-all --object-type view
cacheserve-dumpconf --object-type view --name world
cacheserve-editconf --object-type server
• CacheServe stopped
cacheserve-editconf -c /var/nom/cacheserve/cacheserve --view foo
cacheserve-dumpconf -c /var/nom/cacheserve/cacheserve --all
cacheserve-loadconf -c /var/nom/cacheserve/cacheserve --all file

NEW: loads ALL objects


Exercise 11

• Activities
– Get/set configuration elements with utilities
• Discussion
– Configuration argument (-c)
# cacheserve-loadconf -c /tmp/cacheserve
# cacheserve-loadconf -c /tmp/cacheserve.vdb2
Both refer to the database in the directory /tmp/cacheserve.vdb2
– Most useful for special tasks
• Recovery
• Migration (Cloning)

Exercise 12

• Activities
– Use cacheserve-convertconf to create new database from Vantio output
• Make Vantio 5 DB using vantio-loadconf
• Dump the DB into a file /tmp/vantio_5.txt
• Run cacheserve-convertconf on the file

cacheserve-convertconf -c /tmp/vantio7/_cacheserve /tmp/vantio_5.txt


Having CacheServe Directly Answer Queries
• CacheServe can be configured to answer queries
directly.
• These features do not make CacheServe into an
authoritative DNS server.
• Policies allow a variety of manipulations, e.g.
answering with NXDOMAIN.
• (Policies appeared earlier in the course.)
• Preload and synthesize statements are another
option for answer manipulation.
• There is some overlap with policies, e.g.
NXDOMAIN as an answer can be done with either.

resolver.update preload

• preload configures CacheServe with an RRSet to respond to a query.
• It appeared earlier in the course.
• Preload functionality can also be achieved with a policy.
cacheserve> resolver.update name=world preload=((facebook.com. AAAA 2001:db8::a))

# dig @127.1 aaaa facebook.com
<OUTPUT SUPPRESSED>
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
<OUTPUT SUPPRESSED>
facebook.com. 0 IN AAAA 2001:db8::a

CacheServe does not provide an authoritative answer for preloads (the aa flag is not set).


resolver.update preload-nxdomain
cacheserve> resolver.update name=world preload-nxdomain=(facebook.com)
<OUTPUT SUPPRESSED>
err => 'preload-nxdomain for "facebook.com." conflicts with preloaded records'

cacheserve> resolver.update name=world preload-=((facebook.com. AAAA 2001:db8::a))
The “-=” removes the existing preload.

cacheserve> resolver.update name=world preload-nxdomain=(facebook.com)

# dig +noall +comment @127.1 facebook.com aaaa
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 39358
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

# dig +noall +comment @127.1 facebook.com a
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 26288
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

preload-nxdomain also appeared earlier in the course.

resolver.update synthesize-nxdomain
# dig +noall +comment @127.1 facebook.com
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 20171
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

# dig +noall +comment @127.1 www.facebook.com
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27975
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
preload-nxdomain is only for the specific domain name. It is not an apex.

cacheserve> resolver.update name=world synthesize-nxdomain=(facebook.com)

# dig +noall +comment @127.1 www.facebook.com
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 41203
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
With synthesize-nxdomain, the name is an apex.


resolver.update preload-nxrrset

cacheserve> resolver.update name=world preload-nxdomain-=(facebook.com)

cacheserve> resolver.update name=world preload-nxrrset=((facebook.com AAAA))

# dig +noall +comment +answer @127.1 facebook.com
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2530
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; ANSWER SECTION:
facebook.com. 285 IN A 69.171.230.68

# dig +noall +comment +answer @127.1 facebook.com aaaa
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10590
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

There is no synthesize-nxrrset command.

CacheServe Stub Statement

• Stub used to bypass the normal resolution process
• Typical use: reverse lookups for RFC 1918 space

[Diagram: client, CacheServe (CS), and name servers A, B, C, X, Y]

Pseudo-configuration:
stub 10.in-addr.arpa X {1.1.1.1}


resolver.update stub

• CacheServe supports stub zones, where it is configured with the addresses of authoritative servers and queries them directly.
• Stubs are commonly used for zones on authoritative servers that can't be accessed through normal resolution, such as RFC 1918 reverse lookups.
cacheserve> resolver.update name=world \
stub=((16.172.in-addr.arpa ((ns1.mycompany.example (172.16.1.1)))))

cacheserve> resolver.update name=world \
stub=((silly.example ((ns1.silly.example (192.0.2.1)))))

• The domain name is an apex. Everything under it will be accessed as a stub.
• The name of the auth server is not used for anything except logging.
• The IP address is that of the auth server.

CacheServe Forward Statement

• Forward used to transfer the recursive resolution process to another entity
• Example: Name server in DMZ

[Diagram: client, CacheServe (CS), forwarder Z, and name servers A, B, C]

Pseudo-configuration:
forward . Z only {2.2.2.2}


resolver.update forward

• CacheServe supports forwarding of specific domains.
• Forwarding skips the normal recursive process.
• CacheServe sends a query with the RD flag set to an IP address (which should belong to another recursive server).
• BIND calls this a forward zone.
• To forward everything (like BIND's forwarders stanza), set the domain name to "."
cacheserve> resolver.update name=world \
forward=((example.com first (1.1.1.1 2.2.2.2) ))

• The domain name is an apex. Everything below it will be forwarded.
• first: Try one forwarder after the other. If they do not respond, resolve the query normally.
• only: Fail if none of the forwarders respond.
• off: Disable forwarding for a subdomain of a forwarded domain. (The IP address list must be empty.)
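
The three forwarding modes can be read as a small decision procedure. The sketch below illustrates that logic under stated assumptions: forwarders and the normal recursive path are modeled as plain callables that return None when there is no response, and all names are hypothetical rather than part of CacheServe.

```python
from typing import Callable, Optional, Sequence

# A lookup takes a query name and returns an answer, or None if nothing responds.
Lookup = Callable[[str], Optional[str]]


def resolve_with_forwarding(qname: str, mode: str,
                            forwarders: Sequence[Lookup],
                            recurse: Lookup) -> Optional[str]:
    """Apply the first/only/off semantics described on the slide."""
    if mode not in ("first", "only", "off"):
        raise ValueError(f"unknown forward mode: {mode}")
    if mode == "off":
        # Forwarding is disabled for this subdomain: resolve normally.
        return recurse(qname)
    # Try one forwarder after the other.
    for ask in forwarders:
        answer = ask(qname)
        if answer is not None:
            return answer
    # No forwarder responded.
    if mode == "first":
        return recurse(qname)  # fall back to normal resolution
    return None  # "only": the query fails
```

For example, with two unresponsive forwarders, mode "first" returns whatever recurse produces, while mode "only" returns None.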

Exercise 13

• Activities
– Compare the effects of the preload-nxdomain statement for a single name and synthesize-nxdomain
– Create a CacheServe stub statement that points to an authoritative name server
– Set forward statements using CC instructions


Exercise 14

• Record authoritative traffic

cacheserve> auth-monitoring.update auth-querystore={}

Searches might include RDATA, among others:

filter=((answer (true ({rdata=1.2.3.4 type=A}))))

Custom Resolution with Policies

• Lists of names or IPs (domain1, domain2, domain3, domain4)
• Behaviors (policy1, policy2)
• Bindings (binding1, binding2): link the client population to be influenced with the lists to which behaviors apply


Ignoring amplification (ANY) queries

• List of names: isc.org, ripe.net
• Behavior: drop ANY
• Binding: world view refuses to process type-ANY queries for domain(s) on the list

Exercise 15
• Implement "drop type-ANY query" amplification defense
– Add list of domains and binding to "world" view
action => drop
selector => (and ((qtype (ANY)) (qname (amplification-domains exact-or-www))))
• Implement Preferred Address Sorting
– Many services provide multiple A records:
apple.com. 3600 IN A 17.172.224.47
apple.com. 3600 IN A 17.149.160.49
– Normal processing is to rotate the sequence
– Policy permits creation of in-network values to prefer
action => (sort-addresses ((in-net) false))
– No selector
– Binding executed postquery
– "remove-unmatched" flag
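
The idea behind preferred address sorting can be sketched in a few lines. This is an illustration only, not the CacheServe implementation. It assumes the trailing false in the action is the "remove-unmatched" flag, and the network 17.149.0.0/16 is a hypothetical stand-in for the contents of the in-net list.

```python
import ipaddress


def sort_addresses(addresses, preferred_nets, remove_unmatched=False):
    """Place addresses inside the preferred networks first.

    Relative order within each group is preserved. When remove_unmatched
    is True, addresses outside the preferred networks are dropped.
    """
    nets = [ipaddress.ip_network(n) for n in preferred_nets]

    def in_net(addr):
        ip = ipaddress.ip_address(addr)
        return any(ip in net for net in nets)

    matched = [a for a in addresses if in_net(a)]
    unmatched = [a for a in addresses if not in_net(a)]
    return matched if remove_unmatched else matched + unmatched
```

Given the two apple.com records above and the hypothetical list, 17.149.160.49 would be returned ahead of 17.172.224.47.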


Advanced Rate Limiting

• Selector initial-qname means "limit if the query name being processed is the same as received in the request" (not a CNAME target)
• Policy action "truncate"
cacheserve> ratelimiter.add name=foo qps=1000
fields=((client-network (24 64))
(query-name (3)))

cacheserve> policy.add name=bar action=truncate selector=
(and (initial-qname (rate-limiter foo) ))

cacheserve> binding.add policy=bar server=1 priority=1

Policy-based rate limiting

• All the normal selectors can be used


• The Policy’s action dictates whether CacheServe drops or
truncates queries which exceed the QPS rate.
• A given query should only touch the same rate limiter once.
• Bad use: server policy and view policy use the same
rate limiter
• Bad use: any policy and response-size limiting sharing a
limiter.


Selecting which queries are limited

• The policy decides what is limited, but not how.


• Chain selectors with and() to filter a query, such as:
– list membership
– network address
– qname or qtype
• The first selector should be initial-qname (omits CNAME)
• The last selector should be (rate-limiter <name>)

selector=(and ( \
initial-qname \
… other selectors … \
(rate-limiter foo) \
))

Defining how queries are limited

• The ratelimiter object defines which fields are used to bucket similar queries together.
• Currently these are:
– query-type
– (client-network (ipv4-bits ipv6-bits))
– (query-name (labels-to-keep))
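
To make the bucketing idea concrete, the sketch below derives a bucket key from the client-network and query-name fields used in the earlier ratelimiter.add example (24 and 64 bits, 3 labels). It is an illustration only: the function name is hypothetical, and it assumes that labels-to-keep counts labels from the right-hand end of the name.

```python
import ipaddress


def bucket_key(client_ip, qname, v4_bits=24, v6_bits=64, labels_to_keep=3):
    """Return the (client network, shortened query name) pair for a query."""
    ip = ipaddress.ip_address(client_ip)
    bits = v4_bits if ip.version == 4 else v6_bits
    # Mask the client address down to its covering network.
    network = ipaddress.ip_network(f"{ip}/{bits}", strict=False)
    # Assumption: keep the rightmost labels of the query name.
    labels = qname.lower().rstrip(".").split(".")
    kept = ".".join(labels[-labels_to_keep:])
    return (str(network), kept)
```

Two clients in the same /24 asking for different names under www.example.com would land in the same bucket and share one QPS budget.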


Rate Limiter Fields

Combining Rate Limiter Fields


Setting max-entries

•Defaults to 10,000
•Only uses what is required
•General sizing guidelines:
– The more specific you are, the more entries you need.
– More specific query-names or client-networks
– Combinations of various fields in the same limiter

Setting max-entries

• Detecting "too small" situations via statistics:

cacheserve> ratelimiter.statistics name=foo all=true
{
statistics => {
current-entry-count => '10000'
expiring-entry-age => '129951'
}
}

• The table is full (10,000 entries, the default max-entries).
• 129951 / 1,000,000 == 0.129951 seconds (expiring-entry-age is reported in microseconds)
• This is too short a time to apply rate limiting effectively.
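
The same check can be scripted against the statistics output. The sketch below is a minimal illustration that treats the statistics as a plain dictionary of strings, as printed above. The one-second threshold is an assumption chosen for illustration (limits are expressed per second), not a documented value.

```python
def expiring_age_seconds(stats):
    """Convert expiring-entry-age (microseconds) to seconds."""
    return int(stats["expiring-entry-age"]) / 1_000_000


def table_looks_too_small(stats, max_entries=10_000, min_age_seconds=1.0):
    """Flag a full table whose entries are expiring very quickly.

    min_age_seconds is a hypothetical threshold for this sketch.
    """
    full = int(stats["current-entry-count"]) >= max_entries
    return full and expiring_age_seconds(stats) < min_age_seconds
```

With the values shown on the slide, the table is full and entries are being expired after about 0.13 seconds, so the check reports that max-entries is too small.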


Rate Limiting Statistics

• Each rate limiter has statistics.
• Policies do not have statistics except via statmon.
• If the rule of "one limiter, one policy" is followed, the rate limiter statistics are identical to the policy actions taken.
cacheserve> ratelimiter.statistics name=foo all=true
{
statistics => {
uses => '1001284'
indications-by-qps => '124885'
indications-by-bps => '0'
}
}
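
A quick way to read these counters is as a ratio. The sketch below assumes that uses counts every query checked against the limiter and that indications-by-qps counts the checks that exceeded the QPS limit; that interpretation is an assumption, and the function name is hypothetical.

```python
def qps_limited_fraction(stats):
    """Fraction of limiter uses that were flagged for exceeding the QPS rate."""
    uses = int(stats["uses"])
    if uses == 0:
        return 0.0
    return int(stats["indications-by-qps"]) / uses
```

For the output above this is 124885 / 1001284, or a little over 12% of the queries that touched the limiter.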

Response-Size Rate Limiting

•Uses a named ratelimiter object.


•Policy selector is response-size 1024
•This selector will match if the size of the response
packet is greater than or equal to value specified
•Must be bound at “presend” time


EXAMPLE: List Membership

EXAMPLE: Multiple Views


BAD EXAMPLE 1

Ratelimiter is called twice: at server and view scopes

Exercise 16

• Implement "truncate" amplification defense based on list membership


ECS Review
• PROBLEM: Traditionally, authoritative servers do not know the IP address of the originating DNS client
• SOLUTION: Use an EDNS option in the OPT RR to "forward" client IP data

[Diagram: for a query such as www.google.com, CacheServe learns the client IP from the packet and checks whether the qname is whitelisted. YES: the query is sent with the OPT RR, and the authority returns a response appropriate for the client and scope. NO: the authority returns a response appropriate for the resolver (CacheServe).]

• ECS adopters include CDN operators
• GOTCHA: Multiplicity of answers must be cached

Cache for www.google.com:

SEND PREFIX      RECEIVE SCOPE   NET
25.24.8.0/24     12              25.16.0.0
25.185.8.0/24    13              25.184.0.0
25.197.8.0/24    12              25.192.0.0
25.234.8.0/24    14              25.232.0.0

[Diagram: address space divided into 25.0.0.0/9, 25.128.0.0/9, 26.0.0.0/9; a /13 covers ½ million IPs]
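
The NET column in the table follows from masking the sent prefix with the returned scope. The sketch below reproduces that arithmetic with Python's standard ipaddress module; the function name is hypothetical.

```python
import ipaddress


def scope_network(sent_prefix, scope):
    """Network that a cached ECS answer covers, given the returned scope length."""
    sent = ipaddress.ip_network(sent_prefix)
    # Re-mask the sent network address with the (shorter) scope prefix length.
    return ipaddress.ip_network(f"{sent.network_address}/{scope}", strict=False)
```

For example, scope_network("25.185.8.0/24", 13) yields 25.184.0.0/13, which spans 524,288 addresses, the "½ million IPs" noted above.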


CacheServe 7 ECS configuration


• ECS has been supported in Vantio/CacheServe for years
• Enable domains for which ECS should be used:
cacheserve> resolver.update client-subnet={whitelist=google.com
valid-addresses=0.0.0.0/0} name=world

[Diagram: client 25.144.78.9 queries www.google.com. CacheServe learns the client IP from the packet and checks whether the qname is whitelisted. YES: the query is sent with the OPT RR, and the authority returns a response appropriate for the client and scope. NO: the authority returns a response appropriate for the resolver.]

• The client can provide a PREFIX; the valid-addresses ACL determines whether CacheServe forwards it or not
• To use server.query for testing, set valid-addresses=0.0.0.0/0

CDN e.g. Akamai

SCOPE   ANSWER
24      A
25      B
24      C
24      D

[Diagram: address space divided into 25.0.0.0/9, 25.128.0.0/9, 26.0.0.0/9, and further into 25.2.0.0/17, 25.2.128.0/17, 25.3.0.0/17; a /21 covers 2048 IPs]


Equivalence Class Configuration

ADDRESS LIST
SCOPE   NAME   CONTENTS
24      A      25.2.8.0/24, 25.2.40.0/24, …
25*     B      25.2.23.128/25, 25.2.87.128/25,
24      C
24      D      25.2.52.0/24, 25.2.76.0/24, …

cacheserve> address-list.add name=A
cacheserve> address-node.add address=25.2.8.0/24 list=A
cacheserve> resolver.update name=world
client-subnet={whitelist=akamai.com equivalence-classes=(A)}

• Default is to pass /24 to auth servers.
* Configure max-source-prefix-v4 to extend to /25

Equivalence Class Example

ADDRESS LIST
NAME   CONTENTS
A      25.2.8.0/24, 25.2.40.0/24, …
B      25.2.23.128/25, 25.2.87.128/25,

[Diagram: client 25.2.40.92 queries CacheServe. CacheServe learns the client IP from the packet and attempts to match any lists in the Equivalence Class. It sends 25.2.8.0/24 in the OPT RR, and the authority returns a response appropriate for the client and scope.]

• The Representative Address for an Equivalence Class is by default the lowest value; override it with each list's representative-address-v4
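
The lookup described above can be sketched as follows. This is an illustration of the idea rather than CacheServe's implementation: address lists are modeled as a plain dictionary, the function name is hypothetical, and only the default "lowest value" rule is shown (the representative-address-v4 override is not modeled).

```python
import ipaddress


def representative_prefix(client_ip, equivalence_classes):
    """Return (list name, representative prefix) for a client, or None.

    equivalence_classes maps an address-list name to its prefixes.
    The representative is the lowest prefix in the matching list.
    """
    ip = ipaddress.ip_address(client_ip)
    for name, prefixes in equivalence_classes.items():
        nets = [ipaddress.ip_network(p) for p in prefixes]
        if any(ip in net for net in nets):
            return name, str(min(nets))
    return None
```

With list A holding 25.2.8.0/24 and 25.2.40.0/24, the client 25.2.40.92 matches A and the prefix sent upstream is 25.2.8.0/24, as in the example.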


IPv6 Transition with NAT64

[Diagram: host1 and host2 sit in an IPv6-only net. IPv6 services (acme.com, example.org) are reached directly. IPv4 services (yahoo.com, google.com, 1.2.3.4) are reached through NAT64, e.g. dst 64:ff9b::102:304.]

• Site accessed with pure v6 transport: acme.com
• Legacy (v4) sites referenced inside the IPv6-only net as </96-prefix>:<IPv4>

DNS64 synthesizes AAAA

[Diagram: host2, NAT64, CS 7 (resolver on a dual-stack node), and authoritative v6 and v4 name servers]

• Fetch AAAA record(s) from authoritative servers
• IF type AAAA records do not exist, concatenate the prefix and the A record(s) of the same name
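
The concatenation step can be shown with Python's standard ipaddress module. The sketch below handles only /96 prefixes such as 64:ff9b::/96; the function names are hypothetical and the record lists are plain strings for illustration.

```python
import ipaddress


def synthesize_aaaa(prefix, ipv4):
    """Embed an IPv4 address in the low 32 bits of a /96 IPv6 prefix."""
    net = ipaddress.ip_network(prefix)
    if net.version != 6 or net.prefixlen != 96:
        raise ValueError("this sketch only handles /96 IPv6 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(net.network_address) | int(v4))


def dns64_answer(aaaa_records, a_records, prefix="64:ff9b::/96"):
    """Return real AAAA records if any exist, else synthesize them from A records."""
    if aaaa_records:
        return list(aaaa_records)
    return [str(synthesize_aaaa(prefix, a)) for a in a_records]
```

For the address 1.2.3.4 this produces 64:ff9b::102:304, the destination shown in the NAT64 diagram.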


CacheServe DNS64 support

• Prefixes stored in dns64 objects (e.g. a and b)
• DNS64 enabled by policy at any scope desired:
cacheserve> dns64.mget
prefix => '64:ff9b::/96'
name => '64:ff9b::/96'
cacheserve> policy.add name=a action=(dns64 64:ff9b::/96)
cacheserve> binding.add server=1 when=postquery priority=5 policy=a
– Reverse record for PTR requires second policy:
4.0.3.0.2.0.1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.b.9.f.f.4.6.0.0.ip6.arpa. IN PTR

• Other customizations in dns64 object include
– Ignore specific AAAA responses (exclude known bad ones)
– ACLs on which A records to process (mapped v4 addresses)
– Define a suffix (if prefix is less than 96 bits)
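
The reverse name for a synthesized address is just the address written nibble by nibble in reverse order. The sketch below derives it with the standard ipaddress module and also recovers the embedded IPv4 address; the function names are hypothetical, and only /96 prefixes are considered.

```python
import ipaddress


def dns64_ptr_name(synthesized):
    """ip6.arpa owner name (with trailing dot) for a synthesized AAAA address."""
    return ipaddress.IPv6Address(synthesized).reverse_pointer + "."


def embedded_ipv4(synthesized, prefix="64:ff9b::/96"):
    """Recover the IPv4 address from the low 32 bits, or None if outside the prefix."""
    addr = ipaddress.IPv6Address(synthesized)
    if addr not in ipaddress.ip_network(prefix):
        return None
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)
```

For 64:ff9b::102:304 the first function returns the ip6.arpa name shown above, and the second returns 1.2.3.4, whose own PTR data could then be used to build the reverse answer.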

Exercise 18

• Use IPv6 tools:
– ping6 <IPv6_addr>
– dig @<IPv6_addr>
• Configure dns64 object(s)
• Create policy and binding
– Confirm CacheServe synthesizes answers
– AAAA queries over IPv4 get what? (Best practice is to add a selector for v6 only.)
• Discussion
– Reverse records can be synthesized with another policy


Command-Line Options

• CacheServe and AuthServe accept command-line arguments modifying how they will run.
• In many cases these are not necessary.
• To pass arguments, create a configuration file:
/usr/local/nom/etc/sysconfig/{cacheserve,ans}
• A shell variable contains the arguments:
CACHESERVE_OPTIONS ANS_OPTIONS
• A startup script reads the file:
/etc/init.d/{cacheserve,ans}
• Do not modify these startup scripts directly.
# cat /usr/local/nom/etc/sysconfig/cacheserve
CACHESERVE_OPTIONS="--license /root/cacheserve.license"

# nom-tell cacheserve process-information | grep arguments
arguments => ('/usr/local/nom/sbin/cacheserve' '--license' '/root/cacheserve.license' '-F')

A Selection of Command-Line Options

• --license <file>
– Read given file as license
• -c <filename>
– Use filename as configuration file/database
• --channel <service>
– Open the command channel defined by service
• -s <syslog-facility>
– Use syslog facility syslog-facility for logging
• --usage
– Brief listing of all options.
• -h --help
– Information about options.


Examining the Startup Arguments

• The CC process-information command shows the arguments that started the server.
• A change to the arguments requires restarting the server:
– /etc/init.d/{cacheserve,ans} restart
– nom-tell {cacheserve,ans} restart will not pick up the new arguments!
# cat /usr/local/nom/etc/sysconfig/cacheserve
CACHESERVE_OPTIONS="--license /root/cacheserve.license"

# nom-tell -F arguments cacheserve process-information
('/usr/local/nom/sbin/cacheserve' '--license' '/root/cacheserve.license' '-F')

# nom-tell -F arguments ans process-information
('/usr/local/nom/sbin/ans' '--foreground-with-syslog')

Revisiting: /etc/channel.conf

• Servers (AuthServe, CacheServe, Nanny, etc.) read channel.conf to know which sockets to listen on.
• They further learn the secret to demand from clients on each socket (on each CC).

• Servers are passed command-line arguments to know which CCs in channel.conf to listen on.
• A listing in channel.conf alone is not sufficient.
• A CC is assigned with a --channel <service> argument.

• The strongly recommended way to pass the --channel argument is in the configuration file:
/usr/local/nom/etc/sysconfig/{cacheserve,ans}


No --channel Argument

• Without --channel, AuthServe assumes a CC called ans. CacheServe assumes a CC called cacheserve.
• If a --channel argument is provided, there are no assumed arguments.
• CacheServe & AuthServe can listen on multiple CCs.
# cat /usr/local/nom/etc/sysconfig/ans
cat: /usr/local/nom/etc/sysconfig/ans: No such file or directory

A --channel argument is not being passed. AuthServe uses the CC ans in /etc/channel.conf.

# service ans start
Starting Nominum Authoritative DNS server (ANS): [ OK ]
# grep '^ans ' /etc/channel.conf
ans 9253 88utSKQ6Iz1gkE6BR4VdJhMI6l/Qotf8UsDiaS4jPb9oL+VO

AuthServe is listening on: 127.0.0.1:9253

No --channel Argument
# tail -100 /var/log/messages | grep listening.for.commands
Sep 12 22:33:54 CentOS6 ANS[21844]: info: listening for commands on 127.0.0.1#9253

Confirming the socket.

# ss -an | grep 9253
LISTEN 0 128 127.0.0.1:9253 *:*
# nom-tell -F vendor ans version
Nominum

Again, confirming the socket.


The --channel Argument

# cat /etc/channel.conf
ans 9253 88utSKQ6Iz1gkE6BR4VdJhMI6l/Qotf8UsDiaS4jPb9oL+VO
ansv6 ::1#9253 HiMom
ans-2 10.0.2.15#9253 HiMom
blah 10000 Hello2
ans-statmon 9993 1P5/Q9TQGsOzzH2kmD47g27qtdh3RWalinSLStrN1tRx8kJh
snmpagent 9912 ViETZRan9GrmmFHkJLEsn8EvrV8IUOOtMIhjVV+VffLu97n4
statmon 9994 `//1TuWFboY/XbZ/Me+1ZBi553q+lkJ8VYpCHoUo72fflrnm0
nanny 9449 ekkkjy9vXnms2n9eN6sob2YGRAWxTQF6DRmW6HqdcxSxFVFX

The ansv6, ans-2, and blah lines (bold on the original slide) were added manually. The remainder were added as software was installed.

# cat /usr/local/nom/etc/sysconfig/ans
ANS_OPTIONS="--channel ansv6 --channel blah --channel ans-2"

CacheServe & AuthServe can listen on multiple CCs.

When a --channel argument is provided, there are no assumed arguments.
(AuthServe will not listen on the CC labeled ans.)
# service ans restart
Stopping Nominum Authoritative DNS server (ANS): [ OK ]
Starting Nominum Authoritative DNS server (ANS): [ OK ]

The --channel Argument

# tail -100 /var/log/messages | grep listening.for.commands
Sep 12 23:06:12 CentOS6 ANS[22211]: info: listening for commands on ::1#9253
Sep 12 23:06:12 CentOS6 ANS[22211]: info: listening for commands on 127.0.0.1#10000
Sep 12 23:06:12 CentOS6 ANS[22211]: info: listening for commands on 10.0.2.15#9253

Confirming the three sockets.

# ss -an | egrep '9253|10000'
LISTEN 0 128 10.0.2.15:9253 *:*
LISTEN 0 128 ::1:9253 :::*
LISTEN 0 128 127.0.0.1:10000 *:*

Again, confirming the sockets.

# nom-tell -F vendor ansv6 version
Nominum
# nom-tell -F platform ans-2 version
rhel-6-x86_64
# nom-tell -F product blah version
ANS

Communication is possible over all three configured CCs.

# nom-tell -F product ans version
nom-tell: critical: could not send to 'ans': Connection refused

Communication over the standard ans CC is not possible.


/etc/channel.conf: Clients
The CC service name does not need to match between the server and client.

# grep ansv6 /etc/channel.conf
Zansv6 ::1#9253 HiMom

When AuthServe started, the service was labeled ansv6, but it has since been modified.

# nom-tell -F vendor ansv6 version
nom-tell: critical: 'ansv6' is not a known service name or network address

Communication is not possible with the ansv6 name.

# nom-tell -F vendor Zansv6 version
Nominum

It works with the new name, Zansv6.

/etc/channel.conf: The CC Service Name
# grep -C1 '#' /etc/channel.conf
ans 9253 88utSKQ6Iz1gkE6BR4VdJhMI6l/Qotf8UsDiaS4jPb9oL+VO
Zansv6 ::1#9253 HiMom
ans-2 10.0.2.15#9253 HiMom
blah 10000 Hello2

The socket and secret can be provided on the command line (/etc/channel.conf is ignored).

# nom-tell -F platform 10.0.2.15#9253 --secret HiMom version
rhel-6-x86_64

# nom-tell 10.0.2.15#9253 -s HiMom
nom-tell 3.0.46.3, interactive mode

10.0.2.15#9253> exit

# nom-tell blah
nom-tell 3.0.46.3, interactive mode

blah>

The nom-tell prompt matches the CC argument provided.


/etc/channel.conf: IP Addresses

• For a server, an IP address means: listen on
• For a client, an IP address means: destination
• If an address is not provided, it defaults to: 127.0.0.1
# grep -C1 '#' /etc/channel.conf
ans 9253 88utSKQ6Iz1gkE6BR4VdJhMI6l/Qotf8UsDiaS4jPb9oL+VO
Zansv6 ::1#9253 HiMom
ans-2 10.0.2.15#9253 HiMom
blah 10000 Hello2

NOTE:
/etc/channel.conf is a service definition file. It is the default file, but it can be overridden through the NOM_CHANNEL_CONF environment variable or by ~/.nom/channel.conf
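
The entries shown above follow a simple shape: a service name, a port optionally preceded by address and "#", and a secret. The sketch below parses one such line and applies the 127.0.0.1 default. It is an illustration based only on the examples in these slides, not the parser used by Nominum tools, and the function name is hypothetical.

```python
def parse_channel_line(line):
    """Parse 'name [address#]port secret' into a dictionary."""
    name, endpoint, *rest = line.split()
    secret = rest[0] if rest else ""
    if "#" in endpoint:
        # Explicit address: split on the last '#' so IPv6 colons are untouched.
        address, port = endpoint.rsplit("#", 1)
    else:
        # No address given: default to the IPv4 loopback.
        address, port = "127.0.0.1", endpoint
    return {"name": name, "address": address, "port": int(port), "secret": secret}
```

For instance, the line "blah 10000 Hello2" parses to address 127.0.0.1 and port 10000, which matches the socket AuthServe was seen listening on.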

Exercise 19

• Configure /usr/local/nom/etc/sysconfig/cacheserve with command-line options


6. Perl CC API

• Introduction
– Nominum’s SDK packaged separately
(available at no cost)
• CC Perl API Examples
• Creating simple programs

Review of Nominum CC

• Benefits:
– Allows direct access
• Control of every aspect of server configuration
• Scripted inspection of querystore
– Listens on loopback address at port 9434/9994
– Provides authentication and encryption
• Uses:
– nom-tell is an example of a program that uses CC
– Accessible through programmatic interface (API)
• Perl
• Python
• Java


Create and Access the CC

[Diagram: Vantio CacheServe, reached on port 9434 on the IPv4 loopback]

# nom-tell cacheser
cacheserve version
CacheServe 7.0.0.0

The Perl CC API

• Connections and message parsing are handled through Nom::CC modules
• The Command Channel message is a hash table of various fields
• The API handles most of the message construction; you just need to fill in the _data section, which is itself a hash table.
• Refer to the Command Channel API documentation for complete details


Trivial Example

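• Here is a program that gets the CacheServe version number, like nom-tell cacheserve version:
#!/usr/local/bin/perl
use strict;
use warnings;
use Nom::CC::Channel;
use Nom::CC::Message;

# Open the CC named "cacheserve" (looked up in /etc/channel.conf)
my $chan = new Nom::CC::Channel("cacheserve");
# Build a request whose payload asks for the server version
my $request = new Nom::CC::Message({type => "version"});
# Send the request and collect the response
my $response = $chan->send($request);

print $response->{version}, "\n";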

Notes on Trivial Example

• The Nominum module(s) must be declared
• $chan
– reference to channel object
– destination defined by argument (e.g. "cacheserve" from /etc/channel.conf)
• $request
– reference to the payload of a CC message


More Trivial Example Notes

• What is in $response?
– Result of the "send" method on the channel
– Select required data by its "tag"
– If there is an error, the tag "err" exists and contains a value indicating the problem
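
The "check for err before reading a tag" pattern is the same in any language. The sketch below shows it in Python with the response modeled as a plain dictionary. It does not use the Nominum Python API; the class and function names are hypothetical and exist only to illustrate the pattern.

```python
class CommandChannelError(Exception):
    """Raised when a response carries the "err" tag."""


def get_tag(response, tag):
    """Return the value stored under a tag, or raise if the response is an error."""
    if "err" in response:
        # The value of "err" describes what went wrong.
        raise CommandChannelError(response["err"])
    return response[tag]
```

A successful version response would yield its version string, while a response such as the preload-nxdomain conflict shown earlier in the course would raise CommandChannelError with that message.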

Examples and Tips

• Sample scripts used in training


– Data fetching and formatting:
• cacheserve_getserver.pl
– Processing a sequence:
• cacheserve_listresolvers.pl
– equivalent of cacheserve-stats.py:
• cacheserve-stats.pl
• Enable command channel logging in CacheServe:
– monitoring.update log+=(command/info)
– server.update log-command-channel=1


References

Customers have access to these resources:


• E-mail support
– support@nominum.com
• Support Online site
– https://support.nominum.com
