This is a static archive of the previous Open Grid Forum Redmine content management system, saved from host redmine.ogf.org file /boards/15/topics/40 at Thu, 03 Nov 2022 15:41:58 GMT.

Public Comments Archive - Open Grid Forum

Are there special circumstances where WSP* should be allowed even when ES is not allowed?

Added by Steve Hanson about 9 years ago

Does it ever make sense to allow WSP* as a terminator in non-delimited scenarios? IBM DFDL has some test cases that exploit this, and at least one IBM end user has used WSP* in a list of values to model an optional terminator. WTX allows its equivalent <OWSP> entity to be used on its own, but they say that their "stopping syntax" does not look for "nothing", so actual white space would need to be present; in other words, it's the same as if <WSP> had been used. We need to consider whether disallowing WSP* makes a conceptually simple format hard to model, because regular expressions etc. would then have to be resorted to.
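To illustrate the "looking for nothing" concern, here is a minimal Python sketch. It is not how IBM DFDL or WTX actually scan; it simply models WSP* and WSP+ as regular expressions. It assumes WSP is only space and tab (the real DFDL WSP class is wider), and `scan_for_terminator` is a hypothetical helper name.

```python
import re

# Assumption: WSP is simplified to space and tab for illustration only.
WSP_STAR = re.compile(r"[ \t]*")  # zero or more white space characters
WSP_PLUS = re.compile(r"[ \t]+")  # one or more white space characters


def scan_for_terminator(data: str, terminator: re.Pattern) -> int:
    """Return the content length before the first terminator match, or -1."""
    m = terminator.search(data)
    return m.start() if m else -1


data = "abc  def"
# WSP* matches the empty string at position 0, so the scan stops immediately.
star_len = scan_for_terminator(data, WSP_STAR)
# WSP+ needs real white space, so the scan stops after "abc".
plus_len = scan_for_terminator(data, WSP_PLUS)
print(star_len, plus_len)
```

In this model a scan for WSP* always "finds" its terminator before any content has been read, which is why a scanner has to treat it either as an error or as WSP+.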


Replies (8)

RE: Are there special circumstances where WSP* should be allowed even when ES is not allowed? - Added by Steve Hanson about 9 years ago

If we allow WSP* on its own in DFDL 1.0, I think it must be on the grounds that it keeps the modeling of a simple problem simple, for example, to consume optional leading and trailing white space in an intuitive way without having to resort to regexes. I feel that there are indeed such scenarios, and that IBM has customers who are likely exploiting this already. I am consequently wary about disallowing this.

If we do allow it, then some rules must apply to give sensible behaviour.

1) dfdl:initiator: WSP* on its own disallowed when dfdl:initiatedContent is 'yes'

2) dfdl:terminator: WSP* on its own is treated like WSP+ when scanning for delimiters, but like WSP* when consuming

3) dfdl:separator: WSP* on its own is treated like WSP+ when scanning for delimiters, but like WSP* when consuming

4) dfdl:textStandardZeroRep: WSP* on its own disallowed (have defaulting for empty case)

RE: Are there special circumstances where WSP* should be allowed even when ES is not allowed? - Added by Michael Beckerle about 9 years ago

Per WG call on 2013-10-01

drop 2 and 3 above. Keep 1 and 4.

Clarify that WSP* is not allowed as a delimiter when determining the length of an element by scanning for delimiters.

Clarify that WSP* is allowed as a delimiter for dfdl:lengthKind values other than 'delimited'.
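As a rough sketch of the non-delimited case, the Python below parses fixed-length fields and then consumes a WSP* terminator after each one. Because the length is already known, a zero-length match for the terminator does no harm. The field width of 3, the sample data, the space-and-tab-only WSP class and the helper name `parse_fixed_with_optional_ws` are all assumptions for illustration, not anything taken from the spec.

```python
import re

# Assumption: WSP is simplified to space and tab for illustration only.
WSP_STAR = re.compile(r"[ \t]*")


def parse_fixed_with_optional_ws(data: str, pos: int, length: int):
    """Parse one fixed-length field, then consume a WSP* terminator.

    Returns (value, next_pos).
    """
    end = pos + length
    if end > len(data):
        raise ValueError("not enough data for fixed-length field")
    value = data[pos:end]
    # The length is already known, so an empty terminator match is harmless.
    m = WSP_STAR.match(data, end)
    return value, m.end()


data = "AAA   BBBCCC"
pos = 0
fields = []
while pos < len(data):
    value, pos = parse_fixed_with_optional_ws(data, pos, 3)
    fields.append(value)
print(fields)
```

Here the terminator is present after the first field and absent after the other two, and all three fields are still recovered without any scanning for delimiters.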

RE: Are there special circumstances where WSP* should be allowed even when ES is not allowed? - Added by Steve Hanson about 9 years ago

Clarification to last reply - should say WSP* on its own.

I think that the updated 2) and 3) can be expressed as follows:

2) dfdl:terminator: WSP* on its own is disallowed when an element has dfdl:lengthKind 'delimited' or has a child element (recursively) that has dfdl:lengthKind 'delimited'.

3) dfdl:separator: as per dfdl:terminator

RE: Are there special circumstances where WSP* should be allowed even when ES is not allowed? - Added by Michael Beckerle about 9 years ago

Point 2 is not quite right. This recursion is interrupted if an element of complex type with specified length appears between the parent having WSP* as a delimiter, and the contained child having lengthKind delimited. Furthermore, if the child has its own terminator, then that IS the delimiter, and WSP* wouldn't even be considered in scope for it. If the child is contained inside a group with a terminator, then depending on the location of the child in that group, that terminator may be required to appear before the WSP* and so the WSP* wouldn't be in scope for it.

In other words, it's complicated. We don't want to have to reiterate everything about delimiters and which are in scope for a lengthKind delimited element here.

Possible wording "..... has a child element having delimited length nested within such that this WSP* delimiter may be needed to determine its length". But I am not sure it is worth trying to keep 2. It sounds more precise than the more general: WSP* is not allowed as a delimiter when determining the length of an element by scanning for delimiters.

Resolved: Are there special circumstances where WSP* should be allowed even when ES is not allowed? - Added by Steve Hanson about 9 years ago

Agreed on the following on 8/10 call:

1) dfdl:initiator: WSP* on its own disallowed when dfdl:initiatedContent is 'yes'.

2) dfdl:terminator: WSP* on its own is disallowed when determining the length of a component by scanning for delimiters.

3) dfdl:separator: WSP* on its own is disallowed when determining the length of a component by scanning for delimiters.

4) dfdl:textStandardZeroRep: WSP* on its own is always disallowed.
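The four agreed rules could be sketched as a schema check along the following lines. This is only an illustration: the `Context` class, its two flags and the `check_wsp_star` function are hypothetical names, and it assumes that "on its own" means one item in the white-space-separated property value is exactly the entity `%WSP*;`. How a real processor decides that a length is being determined by scanning for delimiters is outside the sketch.

```python
from dataclasses import dataclass
from typing import List

WSP_STAR = "%WSP*;"  # DFDL character class entity for optional white space


@dataclass
class Context:
    """Hypothetical summary of where the property is being used."""
    initiated_content: bool = False        # dfdl:initiatedContent is 'yes'
    scanning_for_delimiters: bool = False  # length found by scanning for delimiters


def check_wsp_star(prop: str, value: str, ctx: Context) -> List[str]:
    """Return error messages for uses of WSP* on its own that the rules disallow."""
    # Assumption: "on its own" means one list item is exactly %WSP*;
    if WSP_STAR not in value.split():
        return []
    errors = []
    if prop == "dfdl:initiator" and ctx.initiated_content:
        errors.append("WSP* on its own disallowed when dfdl:initiatedContent is 'yes'")
    if prop in ("dfdl:terminator", "dfdl:separator") and ctx.scanning_for_delimiters:
        errors.append("WSP* on its own disallowed when scanning for delimiters")
    if prop == "dfdl:textStandardZeroRep":
        errors.append("WSP* on its own is always disallowed")
    return errors


print(check_wsp_star("dfdl:terminator", "%WSP*;", Context(scanning_for_delimiters=True)))
print(check_wsp_star("dfdl:terminator", ";%WSP*;", Context(scanning_for_delimiters=True)))
```

The second call returns no errors because `;%WSP*;` contains other characters, so WSP* is not being used on its own.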

Resolved: Are there special circumstances where WSP* should be allowed even when ES is not allowed? - Added by Steve Hanson over 8 years ago

Updated errata 2.148 in experience document 1 to cover 1), 2) and 3).
The spec is already clear about 4) from errata 2.42.

