"Content is not allowed in prolog" when parsing perfectly valid XML on GAE
The "Content is not allowed in prolog" error typically occurs when the parser finds text before the first markup in the document, most often stray characters ahead of the XML declaration (<?xml ...?>). The prolog is everything that precedes the root element: the optional XML declaration, which states the XML version and encoding and must be the very first thing in the document when present, plus any DOCTYPE, comments, or processing instructions.
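To see the failure in isolation, here is a minimal sketch that feeds the standard JAXP DocumentBuilder the same document twice, once clean and once with stray text in front of the declaration. The class and method names (PrologErrorDemo, parseError) are made up for illustration, and the exact wording of the message depends on the parser implementation in use.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class; the names are illustrative only.
class PrologErrorDemo {

    // Returns null when the text parses, otherwise the position and message
    // reported by the parser.
    static String parseError(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            db.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                    + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        String good = "<?xml version=\"1.0\"?><root/>";
        System.out.println(parseError(good));          // clean document
        System.out.println(parseError("junk" + good)); // stray text ahead of the declaration
    }
}
```

Logging the line and column from the SAXParseException is often the quickest way to confirm that the problem really is at the very start of the input.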
A common cause is a Byte Order Mark (BOM) at the beginning of the document. The BOM is the Unicode character U+FEFF, which some text editors and other software write at the start of a file to signal its encoding and byte order. It is invisible in most editors, so the XML can look perfectly valid, but if it reaches the parser as an ordinary character (for example, because the bytes were decoded into a String before parsing), the parser treats it as content before the prolog.
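Because the BOM is invisible, it helps to check for it in code rather than by eye. The sketch below assumes UTF-8 input, where U+FEFF is encoded as the three bytes EF BB BF. BomCheck and hasUtf8Bom are hypothetical names chosen for this example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the names are illustrative only.
class BomCheck {

    // True when the data starts with the UTF-8 encoding of U+FEFF (EF BB BF).
    static boolean hasUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    public static void main(String[] args) {
        byte[] data = "\uFEFF<root/>".getBytes(StandardCharsets.UTF_8);
        System.out.println(hasUtf8Bom(data));

        // Java's UTF-8 decoder keeps the BOM as a leading U+FEFF character,
        // which is how it ends up in front of the XML declaration in a String.
        String text = new String(data, StandardCharsets.UTF_8);
        System.out.println(text.charAt(0) == '\uFEFF');
    }
}
```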
To fix the "Content is not allowed in prolog" error, you can try the following:
- Check the XML document for any characters before the XML prolog. If you find any characters, remove them and save the document.
- Check the XML document for a BOM at the beginning of the document. If you find a BOM, remove it or strip it programmatically before parsing.
- Make sure that the XML document is well-formed and follows the correct syntax.
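- Make sure that the bytes of the XML document match the encoding the parser assumes. Standard JAXP parsers used in GAE Java follow the standard XML rules: a document with no encoding declaration and no BOM is read as UTF-8, and any other encoding must be declared in the XML declaration. If you convert the bytes to a String yourself, decode them with the document's actual charset rather than the platform default.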
- If you are using the DocumentBuilder class to parse the XML document, ensure the input source is clean. The DocumentBuilderFactory does not have an ignoreProlog flag; instead, handle a BOM or leading whitespace by preprocessing the input string or stream.
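When the document arrives as a byte stream rather than a String, the preprocessing can be done on the stream itself. The following is a sketch under the assumption that the input is UTF-8; BomStripper and skipUtf8Bom are hypothetical names, not part of JAXP. Many parsers already cope with a UTF-8 BOM when handed raw bytes, so this mainly helps when something else decodes or inspects the stream first.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper; the names are illustrative only.
class BomStripper {

    // Wraps the stream and consumes a leading UTF-8 BOM (EF BB BF) if present.
    // Any other leading bytes are pushed back so the caller sees them unchanged.
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pin = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        while (n < 3) {
            int r = pin.read(head, n, 3 - n);
            if (r < 0) {
                break; // stream is shorter than three bytes
            }
            n += r;
        }
        boolean bom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!bom && n > 0) {
            pin.unread(head, 0, n);
        }
        return pin;
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "\uFEFF<?xml version=\"1.0\" encoding=\"UTF-8\"?><root/>"
                .getBytes(StandardCharsets.UTF_8);
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(skipUtf8Bom(new ByteArrayInputStream(data)));
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

Passing the stream to the parser, instead of decoding it to a String first, also lets the parser honor the encoding named in the XML declaration.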
For example, with the document text held in a String variable xml:
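import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
// Configure a non-validating parser that skips comments and does not expand entity references
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setValidating(false);
dbf.setIgnoringElementContentWhitespace(true);
dbf.setIgnoringComments(true);
dbf.setExpandEntityReferences(false);
// Strip a leading BOM (U+FEFF) if present
if (xml.startsWith("\uFEFF")) {
    xml = xml.substring(1);
}
DocumentBuilder db = dbf.newDocumentBuilder();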
Document doc = db.parse(new InputSource(new StringReader(xml)));
I hope this helps! Let me know if you have any other questions.