document #342
Remove BOM processing from DFDL 1.0
| Status: | closed | Start date: | 06/27/2019 |
|---|---|---|---|
| Priority: | Normal | Due date: | |
| Assignee: | - | % Done: | 100% |
| Category: | - | | |
| Target version: | DFDL v1.0 | | |
| Document Type: | Proposed Recommendation | | |
Description
DFDL 1.0 includes some support for Unicode Byte Order Marks (BOMs). When parsing, it recognizes a BOM at the start of a document and preserves it in the infoset. When unparsing, it uses the infoset to output a BOM at the start of the document if required. It does NOT support BOMs that occur at other points in the document.
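The round-trip behaviour described above can be illustrated with a small Python sketch. This is not DFDL and not the algorithm from the specification; it is a toy model under stated assumptions. The names `Infoset`, `parse`, and `unparse`, the table of recognized BOMs, and the UTF-8 default encoding are all hypothetical choices made for illustration.

```python
import codecs
from typing import NamedTuple, Optional

# Hypothetical table of recognized BOMs (an assumption, not taken from the spec).
# Longer marks come first: the UTF-32 LE BOM begins with the UTF-16 LE BOM bytes.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


class Infoset(NamedTuple):
    """Toy stand-in for an infoset: the leading BOM (if any) plus the text."""
    bom: Optional[bytes]
    text: str


def parse(data: bytes, default_encoding: str = "utf-8") -> Infoset:
    """Recognize a BOM only at the very start of the data and keep it."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return Infoset(bom, data[len(bom):].decode(encoding))
    # No leading BOM: decode with the assumed default encoding.
    return Infoset(None, data.decode(default_encoding))


def unparse(info: Infoset, default_encoding: str = "utf-8") -> bytes:
    """Write the preserved BOM back at the start of the output, if there was one."""
    if info.bom is None:
        return info.text.encode(default_encoding)
    encoding = dict(_BOMS)[info.bom]
    return info.bom + info.text.encode(encoding)
```

In this sketch a BOM is only looked for at offset zero, which mirrors the limitation noted above: a mark appearing anywhere else is simply decoded as part of the text.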
Experience has shown that few of the formats encountered by DFDL implementations use Unicode as the encoding, and there have been no use cases in which a BOM was used. Given that neither IBM DFDL nor Daffodil has yet implemented BOM support, the WG has agreed to drop BOM support from the DFDL 1.0 spec.
History
Updated by Steve Hanson over 3 years ago
Note that it is possible and practical to create a DFDL schema to model BOMs explicitly, so dropping support does not mean that documents with BOMs become unparsable.
Updated by Steve Hanson over 3 years ago
Work required to remove BOM support from spec:
- 4.1.1. Remove [unicodeByteOrderMark] enum from the infoset.
- 9.2. Remove unicodeByteOrderMark from the grammar.
- 11. Remove forward reference to 11.1 from the 'Encoding' property description.
- 11.1. Remove section and two footnotes.
Updated by Michael Beckerle over 3 years ago
- Target version set to DFDL v1.0
Updated by Michael Beckerle about 3 years ago
- Status changed from submitted to accepted
Erratum 5.50
Updated by Michael Beckerle about 2 years ago
- Status changed from accepted to closed
- % Done changed from 0 to 100