GGF Network Measurements Working Group
Internal document
Eric Boyd, Dan Gunter, Martin Swany

     Requirements for the Network Measurements Report Schema

Abstract

This document describes the requirements for reporting network
measurements to Grid middleware. Current projects that could use this
functionality include Internet2 [I2] and the DANTE project [DANTE].

Table of Contents
-----------------
1.     Introduction
2.     Terminology
3.     Report contents
3.1.   General
3.2    Identifier
3.3.   Characteristic
3.4.   Subject
3.5.   Methodology
3.6.   Results
4.     Metadata/data separation
5.     Data Location and Protocol
6.     Unresolved issues
7.     References


1. Introduction

A prime requirement for reports of network measurements is a common
data model and format. Network measurements are available from a large
variety of tools and infrastructures. Different Grid sites will
naturally choose different sets of tools from among those
available.  Although the actual information reported by these tools is
fundamentally similar, to compare the results from different sites
Grid middleware needs a common data model and format.

The data model and format should be expressible in an XML schema
language. The Grid middleware tools are all moving towards, or already
using, Web Services technology. The basic data format used in Web
Services is XML; thus, the measurement reports should be serializable
as XML. To easily guarantee this, the formal schema for the
measurement reports should be an XML schema language, such as W3C XML
Schema or OASIS RELAX NG. Note that this does not require that all
measurement reports actually be nested angle-bracket tags, only that
the model would allow that form of serialization.

Now we have established the need for an XML schema. The rest of this
document describes the requirements for the contents of that schema.

2. Terminology

The issuer of a report is called a "reporter", and the receiver of
that report is called a "reportee".

A "characteristic" is defined as in [HIER]: the intrinsic properties
of a portion of the network that are related to the reliability and
performance of the network.

The "subject" is the portion of the network being measured. This
corresponds to the term "network entity" in [HIER].

The "methodology" is the means of making the measurement.

3. Report contents

3.1. General requirements

Where possible and appropriate, use should be made of equivalent
schema element names and structures to those contained in the NMWG
request schema.

To allow implementations maximum flexibility in "batching" data in
reports, each network measurement report may contain, subject to
restrictions imposed by the request, any number of data samples and
statistics for any number of network "characteristics" (see 2.2)
measured on any number of network portions.  Note that this means
either that the reportee is able to request multiple characteristics
in one query or that previous queries have been aggregated.  This, of
course, includes the simple case of one sample for one characteristic
on a single network path.

The reporter and reportee are assumed to have synchronized clocks,
e.g. by using an NTP implementation, or by direct synchronization with
a GPS time source. This assumption makes it possible to report
timestamps directly, not as relative times.

In several places below, the requirement for "extensibility" is
expressed. To allow reportees to easily recognize elements that are
extensions of the existing declared structures, the serialization
format should clearly indicate in a general way which elements are
extensions. In XML, this should be done through XML namespaces: any
element from a foreign namespace is an extensibility element, and a
processor can quickly decide how it should process the element
(e.g. ignore it) from the namespace.  Another issue where namespaces
could be useful is in protocol versioning.  This should has the same
essential mechanism as extensibility.

3.2. Identifier

To provide correlation between request and report over
connectionless transports, and to add robustness over
connection-oriented transports, each report should carry an (opaque)
identifier. This identifier will be identical to the identifier
provided in the corresponding request.  NOTE: this is not specified in
the request document.

3.3. Characteristic

One or more network characteristics must be present in a report. The
network characteristics in [HIER] should be used, and extended or
modified as necessary. All characteristics should be named according
to the DAMED-WG hierarchical naming convention [DAMED].  By virtue of
the data / metadata separation, data referencing multiple
characteristics can be included in the same message, and even
potentially interleaved.

3.4. Subject

One or more subjects must be present in a report. Where possible, the
abstract model presented in [HIER] should be followed to classify and
represent the subject. The representation of the subject should be
extensible so that new network topologies or addressing schemes can be
added without rewriting the schema. A reportee should be able to ask
for multiple subjects in one request to reduce overhead.  Obviously,
the mapping between subjects and various data elements in the same
message must be unambiguous.

The identifiers used for the subject, such as IP addresses or MAC
addresses, should, where possible and practical, be globally
unique. However, it is understood that for security or other
implementation reasons this may not always be possible. At a minimum,
the identifier for a given subject must be unique within a given
network report, and should be consistent across reports between the
same reporter and reportee.

3.5. Methodology

The methodology may be present in the report. If there are multiple
characteristics or subjects, or both, there may be multiple
methodologies present.  In any case, the methodology must be
unambiguously defined.  Likewise, each report element must be
described by some methodology.  Additionally a multi-part report may
contain only methodology data to be referenced by subsequent reports.

If present, the methodology should specify what software or hardware
tool(s) were used to make the measurement, and what those tools'
parameters were during the measurement. When different portions of the
subject used different tools (or tool versions), the tool(s) should be
reported separately for each portion. For example, if the client and
server in an `iperf' test had different software versions, the version
information should be reported separately for each endpoint.

A pre-defined set of parameters should be formulated, based on
[PROF]. This set should be extensible with arbitrary new parameters.

3.6. Results

One or more results must be present in the report. The results are all
the measured, or "observed", (time-varying) values and associated
statistics. Each result value is associated with exactly one
combination of characteristic, subject, and methodology. These
combinations may each be associated with multiple results.

A single result must specify the time interval(s) for the observed
values, and then zero or more observed values, and zero or or more
"statistics" derived from all the observed values (not necessarily
just the subset of the observed values that were reported).

A time interval consists of one or two timestamps. If it has one
timestamp, t1, it means "ending at time t1". If it has two timestamps,
t1 and t2, the semantics are "from t1 to t2, inclusive". The time
intervals could be disjoint, as in occasional pings, or could be
continuous, as in the periodic reports from an iperf. For continuous
periodic reports, only the first time interval would need two
timestamps; successive time intervals would simply indicate the time
of the next report. Also, if practical, the smallest time interval
including all the individual result time intervals should also be
reported (assuming that it's easy for the reporter to calculate this).

Results should have a type, i.e. be a named schema structure. The
result types should follow the NM-WG CIM profile and related IPPM
documents. The schema must be extensible in this area, i.e it must
allow arbitrary result types to be included without invalidating the
report. Result types may have arbitrary complexity, although
implementation considerations discourage complex hierarchical
structures.

Statistics must also be extensible. They may or may not be
typed. Standard statistics that must be supported are min, max, mean,
median, and confidence interval.

For improved responsiveness, requesters may indicate that the data
from reports be delivered in "batches" of some size. To allow for
this, the results should always carry a sequential identifier and the
total number of expected identifiers.  Thus each element in the batch
can be identified as element "X of N" with the degenerate case "only"
equal to "1/1".  Optionally, additional information about batch size,
number of results so far, and number of results remaining may be
reported.

4. Data, Metadata and Mapping

A report may include metadata sections for each characteristic,
subject, and methodology that is used in the report.  Metadata is
defined as the information that is not specific to a single result
element, but general to the set of elements and potentially multiple
data sets. That is, a methodology may be used for multiple
measurements in multiple non-contiguous time periods, or even for
multiple subjects.  Each metadata section must have a unique, opaque
ID.  This ID can be copied from the request, or generated
automatically by the reporter.  The ID should be human readable.

A data section will be linked to the characteristic, subject, and
methodology by the metadata IDs.  Thus, each element must by
unambiguously related to an identifier for a characteristic, a subject
and a methodology, whether as a tuple containing those elements or by
group membership.

The scope of the metadata ID is at least, but not limited to, a single
report. For example, if multiple reports are being "streamed" over a
persistent connection, a metadata ID from the first report may be
referred to (i.e. used) by subsequent reports. To go one step further,
imagine that the reporter and reportee have certain measurement types
for which some of the metadata is known through an out-of-band
mechanism (e.g., "well-known"). In this case, the metadata ID could
also be determined out-of-band, and therefore report results could be
associated with a metadata ID that was never physically part of any
report.

This separation of data and metadata reduces protocol overhead and
redundancy. Yet it retains association between data and metadata such
that a mechanical translation could join the fundamental elements into
a fully specified, human readable form.

5. Data Location and Protocol

A report may include information about where the result (data)
elements can be found, and what protocol(s) to use in retrieving and
decoding them. The default location for data items is "inline",
i.e. in the report itself, and the default protocol/encoding is the
same as the protocol/encoding being used for the report. NOTE: This is
not specified in the request document. NOTE(2): This would need to be
incorporated in the "capability discovery" part of the protocol as well.

The primary motivation for allowing communication of the result set to
proceed outside of the report itself is efficiency. The report will,
in general, be an XML document. A disadvantage of XML as a wire format
is that it is much more verbose and expensive to parse than binary and
simpler textual formats. By providing "hooks" for out-of-band
transports and encodings, we make the infrastructure usable even where
XML as a data format is hopelessly inefficient.

6. Unresolved issues

This is a list, in no particular order, of issues to which the group
has zero or more than one answer.

- Should all unrecognized extensions to the known schema
(e.g. unrecognized XML namespaces) be ignored, or should the decision
to ignore or raise an error be left up to the (reportee) implementation?

- Capability discovery.  One way to discover the capabilities of a
reporter is to send a request with a certain feature, and check if it
gets rejected. If the requests are human generated, this is
frustrating and error prone. It generates a lot of overhead, and even
automated discovery gets increasingly impracticable with the number of
possible extensions.

- Comma-separated values (CSV). [In XML messages] should "CSV" be a
standard/recognized encoding for inline results (see Section 5)?

Therefore, a reportee should be able to request the capabilities of
the reporter directly. To be able to report, and process capabilities
in an automated way, a capabilities reporting mechanism must be
specified. Hence, the possible extensions to the protocol have to be
specified, formalized, and as a consequence, limited.


6. References

[DAMED] GGF DAMED "An analysis of "Top N" Event Descriptions":
        http://www-didc.lbl.gov/damed/documents/TopN_Events_final_draft.pdf

[DANTE] http://www.dante.net/

[HIER]  GGF NM-WG "A Hierarchy of Network Performance Characteristics
        for Grid Applications and Services":
        http://www-didc.lbl.gov/NMWG/docs/draft-ggf-nmwg-hierarchy-01.pdf

[GHPN]  GGF Grid High Performance Networking Research Group (GHPN-RG):
        http://forge.gridforum.org/projects/ghpn-rg/

[I2]    <internet-2 http>

[IPERF] Iperf: http://dast.nlanr.net/Projects/Iperf/

[NTP]   IETF "RFC 958: Network Time Protocol (NTP)":
        http://www.ietf.org/rfc/rfc0958.txt?number=958