Forums » #117 - DFDL v1.0 Revision »
Are there special circumstances where WSP* should be allowed even when ES is not allowed?
Added by Steve Hanson about 9 years ago
Does it ever make sense to allow WSP* as a terminator in non-delimited scenarios? IBM DFDL has some test cases that exploit this, and at least one IBM end user has used WSP* in a list of values to model an optional terminator. WTX allows its equivalent <OWSP> entity to be used on its own, but says that its "stopping syntax" does not look for "nothing", so actual white space would need to be present; in other words, it's the same as if <WSP> had been used. We need to consider whether disallowing WSP* makes a conceptually simple format hard to model, because regular expressions or similar would have to be resorted to.
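To illustrate the concern about looking for "nothing", here is a minimal Python sketch, not taken from any DFDL implementation. It assumes WSP can be approximated by space, tab, CR and LF (the DFDL definition covers more characters), and `scan_for_terminator` is a hypothetical helper. A zero-or-more pattern matches the empty string at the very first position, so a scan for it never reaches the real white space.

```python
import re

# Assumption: simplified white space set; DFDL's WSP covers more characters.
WSP_STAR = re.compile(r"[ \t\r\n]*")  # zero or more, like WSP*
WSP_PLUS = re.compile(r"[ \t\r\n]+")  # one or more, like WSP+


def scan_for_terminator(data, pattern):
    """Hypothetical helper: split data at the first place the pattern matches.

    Returns (content_before_terminator, matched_terminator).
    """
    m = pattern.search(data)
    if m is None:
        return data, ""
    return data[:m.start()], m.group()


# WSP* matches the empty string at offset 0, so the content comes back empty.
print(scan_for_terminator("abc  def", WSP_STAR))  # ('', '')
# WSP+ needs real white space, so the scan stops after "abc".
print(scan_for_terminator("abc  def", WSP_PLUS))  # ('abc', '  ')
```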
Replies (8)
RE: Are there special circumstances where WSP* should be allowed even when ES is not allowed? - Added by Steve Hanson about 9 years ago
If we allow WSP* on its own in DFDL 1.0, I think it must be on the grounds that it keeps the modeling of a simple problem simple, for example, to consume optional leading and trailing white space in an intuitive way without having to resort to regexes. I feel that there are indeed such scenarios, and that IBM has customers who are likely exploiting this already. I am consequently wary about disallowing this.
If we do allow it, then some rules must apply to give sensible behaviour.
1) dfdl:initiator: WSP* on its own disallowed when dfdl:initiatedContent is 'yes'
2) dfdl:terminator: WSP* on its own is treated like WSP+ when scanning for delimiters, but like WSP* when consuming
3) dfdl:separator: WSP* on its own is treated like WSP+ when scanning for delimiters, but like WSP* when consuming
4) dfdl:textStandardZeroRep: WSP* on its own disallowed (have defaulting for empty case)
RE: Are there special circumstances where WSP* should be allowed even when ES is not allowed? - Added by Michael Beckerle about 9 years ago
Per WG call on 2013-10-01
Drop 2 and 3 above; keep 1 and 4.
Clarify that WSP* is not allowed as a delimiter when determining the length of an element by scanning for delimiters.
Clarify that WSP* is allowed as a delimiter for dfdl:lengthKind values other than 'delimited'.
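As a rough illustration of the second clarification, the Python sketch below parses a field whose length is already known and then consumes an optional white space terminator. It is only a sketch under assumptions: `parse_fixed_length` is a hypothetical helper, and WSP is simplified to space, tab, CR and LF. Because the length does not depend on finding the terminator, a zero-length match is harmless here.

```python
import re

# Assumption: simplified white space set; DFDL's WSP covers more characters.
WSP_STAR = re.compile(r"[ \t\r\n]*")


def parse_fixed_length(data, pos, length):
    """Hypothetical helper: read `length` characters, then consume WSP*.

    Returns (value, position_after_terminator).
    """
    end = pos + length
    if end > len(data):
        raise ValueError("not enough data for fixed-length field")
    value = data[pos:end]
    # Anchored match at `end`; it always succeeds, possibly with zero length.
    m = WSP_STAR.match(data, end)
    return value, m.end()


print(parse_fixed_length("ABC   DEF", 0, 3))  # ('ABC', 6)
print(parse_fixed_length("ABCDEF", 0, 3))     # ('ABC', 3)
```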
RE: Are there special circumstances where WSP* should be allowed even when ES is not allowed? - Added by Steve Hanson about 9 years ago
Clarification to last reply - should say WSP* on its own.
I think that the updated 2) and 3) can be expressed as follows:
2) dfdl:terminator: WSP* on its own is disallowed when an element has dfdl:lengthKind 'delimited' or has a child element (recursively) that has dfdl:lengthKind 'delimited'.
3) dfdl:separator: as per dfdl:terminator
RE: Are there special circumstances where WSP* should be allowed even when ES is not allowed? - Added by Michael Beckerle about 9 years ago
Point 2 is not quite right. This recursion is interrupted if an element of complex type with specified length appears between the parent having WSP* as a delimiter, and the contained child having lengthKind delimited. Furthermore, if the child has its own terminator, then that IS the delimiter, and WSP* wouldn't even be considered in scope for it. If the child is contained inside a group with a terminator, then depending on the location of the child in that group, that terminator may be required to appear before the WSP* and so the WSP* wouldn't be in scope for it.
In other words, it's complicated. We don't want to have to reiterate everything about delimiters and which are in scope for a lengthKind delimited element here.
Possible wording: "..... has a child element having delimited length nested within such that this WSP* delimiter may be needed to determine its length". But I am not sure it is worth trying to keep 2. It sounds more precise than the more general: WSP* is not allowed as a delimiter when determining the length of an element by scanning for delimiters.
Resolved: Are there special circumstances where WSP* should be allowed even when ES is not allowed? - Added by Steve Hanson about 9 years ago
Agreed on the following on 8/10 call:
1) dfdl:initiator: WSP* on its own disallowed when dfdl:initiatedContent is 'yes'.
2) dfdl:terminator: WSP* on its own is disallowed when determining the length of a component by scanning for delimiters.
3) dfdl:separator: WSP* on its own is disallowed when determining the length of a component by scanning for delimiters.
4) dfdl:textStandardZeroRep: WSP* on its own is always disallowed.
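The four agreed rules can be sketched as a small check in Python. This is only an illustration, not how any DFDL processor implements them. The function name and parameters are hypothetical, and it assumes "on its own" means that one entry in the property's whitespace-separated list of literals is exactly `%WSP*;`.

```python
WSP_STAR = "%WSP*;"  # DFDL character class entity for optional white space


def wsp_star_alone_allowed(prop, literals, initiated_content="no", scanning=False):
    """Hypothetical check of the four rules for WSP* used on its own.

    prop: 'initiator', 'terminator', 'separator' or 'textStandardZeroRep'
    literals: the property's whitespace-separated list of string literals
    initiated_content: value of dfdl:initiatedContent ('yes' or 'no')
    scanning: True when a component's length is found by scanning for delimiters
    """
    # Assumption: "on its own" means a list entry that is exactly %WSP*;
    if WSP_STAR not in literals.split():
        return True
    if prop == "initiator":
        return initiated_content != "yes"   # rule 1
    if prop in ("terminator", "separator"):
        return not scanning                 # rules 2 and 3
    if prop == "textStandardZeroRep":
        return False                        # rule 4
    return True


print(wsp_star_alone_allowed("terminator", "%WSP*;", scanning=True))   # False
print(wsp_star_alone_allowed("terminator", "%WSP*;", scanning=False))  # True
```

A literal such as `,%WSP*;` combines WSP* with other characters, so under the assumption above it is not "on its own" and the check does not reject it.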
Resolved: Are there special circumstances where WSP* should be allowed even when ES is not allowed? - Added by Steve Hanson over 8 years ago
Updated errata 2.148 in experience document 1 to cover 1), 2) and 3).
The spec is already clear about 4) from errata 2.42.
DONE - RE: Are there special circumstances where WSP* should be allowed even when ES is not allowed? - Added by Michael Beckerle over 8 years ago
Changes are in draft draft-gwdrp-dfdl-v1.0.4-r05.docx.
DONE - RE: Are there special circumstances where WSP* should be allowed even when ES is not allowed? - Added by Steve Hanson about 8 years ago