com.ctc.wstx.dtd
Class FullDTDReader

java.lang.Object
  extended by com.ctc.wstx.io.WstxInputData
      extended by com.ctc.wstx.sr.StreamScanner
          extended by com.ctc.wstx.dtd.MinimalDTDReader
              extended by com.ctc.wstx.dtd.FullDTDReader
All Implemented Interfaces:
InputConfigFlags, ParsingErrorMsgs, InputProblemReporter

public class FullDTDReader
extends MinimalDTDReader

Reader that reads in DTD information from internal or external subset.

DTDReader has two main modes, depending on whether it is parsing the internal or the external subset. Parsing the internal subset is somewhat simpler, since no dependency checking is needed. For the external subset, handling of parameter entities (PEs) is a bit more complicated: care has to be taken to distinguish between PEs defined in the internal subset and ones defined in the external subset itself. This determines the cacheability of external subsets.
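The distinction between the two kinds of PE can be illustrated with a small stand-alone sketch. It is not the Woodstox implementation: the class `PeLookupSketch`, its methods and the caching rule in `isCachable()` are hypothetical names and assumptions made for illustration. The only behaviour taken as given is the XML rule that the internal subset is read first and that the first declaration of an entity is the binding one.

```java
import java.util.HashMap;
import java.util.Map;

public class PeLookupSketch {
    // PEs declared in the internal subset (read first, so they bind first)
    private final Map<String, String> internalPEs;
    // PEs declared while reading the external subset itself
    private final Map<String, String> externalPEs = new HashMap<>();
    // Set once the external subset has referenced an internal-subset PE
    private boolean usedInternalPE = false;

    public PeLookupSketch(Map<String, String> internalPEs) {
        this.internalPEs = new HashMap<>(internalPEs);
    }

    /** Records a PE declared in the external subset; the first declaration wins. */
    public void declareExternal(String name, String value) {
        externalPEs.putIfAbsent(name, value);
    }

    /** Resolves a PE reference found in the external subset; null if undeclared. */
    public String resolve(String name) {
        String value = internalPEs.get(name);
        if (value != null) {
            // The parse result now depends on this particular document
            usedInternalPE = true;
            return value;
        }
        return externalPEs.get(name);
    }

    /** Assumed rule: the result is reusable only if no internal PE was used. */
    public boolean isCachable() {
        return !usedInternalPE;
    }
}
```

In this sketch, an external subset that only ever resolves its own PEs stays reusable across documents, while a single reference to an internal-subset PE ties the result to the document that declared it.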

The reader also implements simple stand-alone functionality for flattening DTD files (expanding all references to their eventual textual form); this is sometimes useful when converting modularized DTDs (which are more maintainable) into single monolithic DTDs (which can in general be processed faster).
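What "flattening" means can be shown with a toy example. The sketch below is not how FullDTDReader does it. It is a regex-based illustration that handles only parameter entities with double-quoted literal values, and it ignores external PEs, comments, conditional sections and the spaces the XML specification adds around PE replacement text outside literals. The class name `PeFlattenSketch` is hypothetical. PE declarations are dropped from the output, loosely comparable to passing `inclPEs = false`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PeFlattenSketch {
    // Matches <!ENTITY % name "value"> (literal value, double quotes only)
    private static final Pattern PE_DECL =
        Pattern.compile("<!ENTITY\\s+%\\s+([A-Za-z_][\\w.-]*)\\s+\"([^\"]*)\"\\s*>");
    // Matches a parameter entity reference: %name;
    private static final Pattern PE_REF = Pattern.compile("%([A-Za-z_][\\w.-]*);");

    /** Returns the DTD text with PE declarations removed and references expanded. */
    public static String flatten(String dtd) {
        Map<String, String> entities = new LinkedHashMap<>();
        StringBuilder out = new StringBuilder();
        Matcher decl = PE_DECL.matcher(dtd);
        int pos = 0;
        while (decl.find()) {
            // Text before this declaration may use PEs declared earlier
            out.append(expand(dtd.substring(pos, decl.start()), entities));
            // References inside the value are expanded at declaration time;
            // the first declaration of a name is the binding one
            entities.putIfAbsent(decl.group(1), expand(decl.group(2), entities));
            pos = decl.end();
        }
        out.append(expand(dtd.substring(pos), entities));
        return out.toString();
    }

    private static String expand(String text, Map<String, String> entities) {
        Matcher ref = PE_REF.matcher(text);
        StringBuilder sb = new StringBuilder();
        int pos = 0;
        while (ref.find()) {
            sb.append(text, pos, ref.start());
            String value = entities.get(ref.group(1));
            if (value == null) {
                throw new IllegalArgumentException(
                    "Undeclared parameter entity: " + ref.group(1));
            }
            sb.append(value);
            pos = ref.end();
        }
        sb.append(text, pos, text.length());
        return sb.toString();
    }
}
```

Given the input `<!ENTITY % inline "b|i">` followed by `<!ELEMENT p (#PCDATA|%inline;)*>`, the sketch emits only the element declaration, with `%inline;` replaced by `b|i`.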

Author:
Tatu Saloranta

Field Summary
 
Fields inherited from class com.ctc.wstx.sr.StreamScanner
CHAR_CR_LF_OR_NULL, CHAR_FIRST_PURE_TEXT, CHAR_LOWEST_LEGAL_LOCALNAME_CHAR, INT_CR_LF_OR_NULL, mCachedEntities, mCfgNsEnabled, mCfgReplaceEntities, mCfgTreatCharRefsAsEntities, mConfig, mCurrDepth, mCurrEntity, mCurrName, mDocXmlVersion, mInput, mInputTopDepth, mNameBuffer, mNormalizeLFs, mRootInput, mTokenInputCol, mTokenInputRow, mTokenInputTotal
 
Fields inherited from class com.ctc.wstx.io.WstxInputData
CHAR_NULL, CHAR_SPACE, INT_NULL, INT_SPACE, MAX_UNICODE_CHAR, mCurrInputProcessed, mCurrInputRow, mCurrInputRowStart, mInputBuffer, mInputEnd, mInputPtr, mXml11
 
Fields inherited from interface com.ctc.wstx.cfg.InputConfigFlags
CFG_AUTO_CLOSE_INPUT, CFG_CACHE_DTDS, CFG_CACHE_DTDS_BY_PUBLIC_ID, CFG_COALESCE_TEXT, CFG_INTERN_NAMES, CFG_INTERN_NS_URIS, CFG_LAZY_PARSING, CFG_NAMESPACE_AWARE, CFG_NORMALIZE_LFS, CFG_PRESERVE_LOCATION, CFG_REPLACE_ENTITY_REFS, CFG_REPORT_CDATA, CFG_REPORT_PROLOG_WS, CFG_SUPPORT_DTD, CFG_SUPPORT_DTDPP, CFG_SUPPORT_EXTERNAL_ENTITIES, CFG_TREAT_CHAR_REFS_AS_ENTS, CFG_VALIDATE_AGAINST_DTD, CFG_XMLID_TYPING, CFG_XMLID_UNIQ_CHECKS
 
Fields inherited from interface com.ctc.wstx.cfg.ParsingErrorMsgs
SUFFIX_EOF_EXP_NAME, SUFFIX_IN_ATTR_VALUE, SUFFIX_IN_CDATA, SUFFIX_IN_CLOSE_ELEMENT, SUFFIX_IN_COMMENT, SUFFIX_IN_DEF_ATTR_VALUE, SUFFIX_IN_DOC, SUFFIX_IN_DTD, SUFFIX_IN_DTD_EXTERNAL, SUFFIX_IN_DTD_INTERNAL, SUFFIX_IN_ELEMENT, SUFFIX_IN_ENTITY_REF, SUFFIX_IN_EPILOG, SUFFIX_IN_NAME, SUFFIX_IN_PROC_INSTR, SUFFIX_IN_PROLOG, SUFFIX_IN_TEXT, SUFFIX_IN_XML_DECL
 
Method Summary
protected  String checkDTDKeyword(String exp)
          Method called to verify whether the input contains the specified keyword. If it does, returns null and leaves the input pointer at the character after the keyword; if not, returns the keyword-like token that was actually found, for error reporting purposes.
protected  void checkXmlIdAttr(int type)
           
protected  void checkXmlSpaceAttr(int type, WordResolver enumValues)
           
protected  boolean ensureInput(int minAmount)
          Method called to make sure the current main-level input buffer has at least the specified number of characters available consecutively, without having to call StreamScanner.loadMore().
 EntityDecl findEntity(String entName)
          Method that may need to be called by attribute default value validation code, during parsing....
protected  EntityDecl findEntity(String id, Object arg)
          Abstract method for sub-classes to implement, for finding a declared general or parsed entity.
static DTDSubset flattenExternalSubset(WstxInputSource src, Writer flattenWriter, boolean inclComments, boolean inclConditionals, boolean inclPEs)
          Method that will parse, process and output contents of an external DTD subset.
protected  void handleGreedyEntityProblem(WstxInputSource input)
           
protected  void handleIncompleteEntityProblem(WstxInputSource closing)
          Handling of PE matching problems is intricate: one type is a well-formedness constraint (WFC) violation ("PE Between Declarations", which concerns PEs that start outside declarations), and the other is only a validity constraint (VC) violation ("Proper Declaration/PE Nesting", when the PE is contained within a declaration).
protected  void handleUndeclaredEntity(String id)
          Undeclared parameter entity is a VC, not WFC...
protected  void initInputSource(WstxInputSource newInput, boolean isExt, String entityId)
          Method called when an entity has been expanded (new input source has been created).
protected  boolean loadMore()
          This method is overridden to check a couple of things: first, that nested input sources are balanced when expanding parameter entities inside entity value definitions (as per the XML specification), and second, to handle (optional) flattening output.
protected  boolean loadMoreFromCurrent()
           
protected  void parseDirective()
           
protected  void parseDirectiveFlattened()
          Method similar to parseDirective(), but one that takes care to properly output DTD contents via DTDWriter as necessary.
protected  DTDSubset parseDTD()
           
protected  void readComment(DTDEventListener l)
          Method similar to MinimalDTDReader.skipComment(), but one that has to collect the contents so they can be reported to a SAX handler.
protected  String readDTDKeyword(String prefix)
          Method usually called when an error condition has been detected; reads the rest of the specified keyword (including characters that can be part of XML identifiers), appends it to the passed prefix (which is optional), and returns the resulting String.
static DTDSubset readExternalSubset(WstxInputSource src, ReaderConfig cfg, DTDSubset intSubset, boolean constructFully, int xmlVersion)
          Method called to read in the external subset definition.
static DTDSubset readInternalSubset(WstxInputData srcData, WstxInputSource input, ReaderConfig cfg, boolean constructFully, int xmlVersion)
          Method called to read in the internal subset definition.
protected  void readPI()
          Method similar to MinimalDTDReader.skipPI(), but one that does basic well-formedness checks.
 void setFlattenWriter(Writer w, boolean inclComments, boolean inclConditionals, boolean inclPEs)
          Method that sets the specified Writer as the 'flattening writer', that is, the writer used to output a flattened version of the DTD being read in.
 
Methods inherited from class com.ctc.wstx.dtd.MinimalDTDReader
dtdNextChar, dtdNextFromCurr, getErrorMsg, getLocation, getNextSkippingPEs, handleExpandedSurrogate, skipComment, skipCommentContent, skipInternalSubset, skipInternalSubset, skipPI, throwIllegalCall
 
Methods inherited from class com.ctc.wstx.sr.StreamScanner
_reportProblem, _reportProblem, closeAllInput, constructFromIOE, constructNullCharException, constructWfcException, expandBy50Pct, expandEntity, fullyResolveEntity, getCurrentInput, getCurrentLocation, getIntEntity, getLastCharLocation, getNameBuffer, getNext, getNextAfterWS, getNextChar, getNextCharAfterWS, getNextCharFromCurrent, getNextInCurrAfterWS, getNextInCurrAfterWS, getSource, getStartLocation, getSystemId, inputInBuffer, loadMore, loadMoreFromCurrent, markLF, markLF, parseEntityName, parseFNameForError, parseFullName, parseFullName, parseFullName2, parseLocalName, parseLocalName2, parsePublicId, parseSystemId, parseUntil, peekNext, pushback, reportProblem, reportProblem, reportValidationProblem, reportValidationProblem, reportValidationProblem, reportValidationProblem, reportValidationProblem, resolveCharOnlyEntity, resolveNonCharEntity, resolveSimpleEntity, skipCRLF, skipFullName, throwFromIOE, throwFromStrE, throwInvalidSpace, throwInvalidSpace, throwLazyError, throwNullChar, throwNullParent, throwParseError, throwParseError, throwUnexpectedChar, throwUnexpectedEOB, throwUnexpectedEOF, throwWfcException, tokenTypeDesc
 
Methods inherited from class com.ctc.wstx.io.WstxInputData
copyBufferStateFrom, findIllegalNameChar, findIllegalNmtokenChar, getCharDesc, isNameChar, isNameChar, isNameStartChar, isNameStartChar, isSpaceChar
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

readInternalSubset

public static DTDSubset readInternalSubset(WstxInputData srcData,
                                           WstxInputSource input,
                                           ReaderConfig cfg,
                                           boolean constructFully,
                                           int xmlVersion)
                                    throws XMLStreamException
Method called to read in the internal subset definition.

Throws:
XMLStreamException

readExternalSubset

public static DTDSubset readExternalSubset(WstxInputSource src,
                                           ReaderConfig cfg,
                                           DTDSubset intSubset,
                                           boolean constructFully,
                                           int xmlVersion)
                                    throws XMLStreamException
Method called to read in the external subset definition.

Throws:
XMLStreamException

flattenExternalSubset

public static DTDSubset flattenExternalSubset(WstxInputSource src,
                                              Writer flattenWriter,
                                              boolean inclComments,
                                              boolean inclConditionals,
                                              boolean inclPEs)
                                       throws IOException,
                                              XMLStreamException
Method that will parse, process and output contents of an external DTD subset. It will do processing similar to readExternalSubset(com.ctc.wstx.io.WstxInputSource, com.ctc.wstx.api.ReaderConfig, com.ctc.wstx.dtd.DTDSubset, boolean, int), but additionally will copy its processed ("flattened") input to specified writer.

Parameters:
src - Input source used to read the main external subset
flattenWriter - Writer to output processed DTD content to
inclComments - If true, will pass comments to the writer; if false, will strip comments out
inclConditionals - If true, will include conditional block markers, as well as intervening content; if false, will strip out both markers and ignorable sections.
inclPEs - If true, will output parameter entity declarations; if false will parse and use them, but not output.
Throws:
IOException
XMLStreamException

setFlattenWriter

public void setFlattenWriter(Writer w,
                             boolean inclComments,
                             boolean inclConditionals,
                             boolean inclPEs)
Method that sets the specified Writer as the 'flattening writer', that is, the writer used to output a flattened version of the DTD being read in. This is similar to running a C preprocessor on C sources, except that defining the writer does not prevent normal parsing of the DTD itself.
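The point that setting a writer does not prevent normal parsing can be pictured as a "tee": the consumer still sees every character, and a copy goes to the writer as a side effect. The sketch below shows only that idea with standard java.io classes. It copies raw input, whereas the flattening writer receives processed content, and the class name `TeeReaderSketch` and its `demo` method are hypothetical.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

public class TeeReaderSketch extends FilterReader {
    private final Writer copy;

    public TeeReaderSketch(Reader in, Writer copy) {
        super(in);
        this.copy = copy;
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        if (c >= 0) {
            copy.write(c); // side effect: echo the character to the writer
        }
        return c;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        int n = super.read(cbuf, off, len);
        if (n > 0) {
            copy.write(cbuf, off, n); // echo exactly what the consumer received
        }
        return n;
    }

    /** Returns { text seen by the consumer, text copied to the writer }. */
    public static String[] demo(String input) {
        StringWriter copy = new StringWriter();
        StringBuilder seen = new StringBuilder();
        try (Reader r = new TeeReaderSketch(new StringReader(input), copy)) {
            char[] buf = new char[8];
            int n;
            // This loop stands in for the parser consuming its input
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                seen.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String[] { seen.toString(), copy.toString() };
    }
}
```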


findEntity

public EntityDecl findEntity(String entName)
Method that may need to be called by attribute default value validation code, during parsing....

Note: see base class for some additional remarks about this method.

Overrides:
findEntity in class MinimalDTDReader

parseDTD

protected DTDSubset parseDTD()
                      throws XMLStreamException
Throws:
XMLStreamException

parseDirective

protected void parseDirective()
                       throws XMLStreamException
Throws:
XMLStreamException

parseDirectiveFlattened

protected void parseDirectiveFlattened()
                                throws XMLStreamException
Method similar to parseDirective(), but one that takes care to properly output DTD contents via DTDWriter as necessary. It is kept separate to simplify both methods; otherwise the code would end up as 'if (... flatten...) ... else ...' spaghetti.

Throws:
XMLStreamException

initInputSource

protected void initInputSource(WstxInputSource newInput,
                               boolean isExt,
                               String entityId)
                        throws XMLStreamException
Description copied from class: StreamScanner
Method called when an entity has been expanded (new input source has been created). Needs to initialize location information and change active input source.

Overrides:
initInputSource in class StreamScanner
Parameters:
entityId - Name of the entity being expanded
Throws:
XMLStreamException

loadMore

protected boolean loadMore()
                    throws XMLStreamException
This method is overridden to check a couple of things: first, that nested input sources are balanced when expanding parameter entities inside entity value definitions (as per the XML specification), and second, to handle (optional) flattening output.

Overrides:
loadMore in class StreamScanner
Returns:
true if reading succeeded (or may succeed), false if we reached EOF.
Throws:
XMLStreamException

loadMoreFromCurrent

protected boolean loadMoreFromCurrent()
                               throws XMLStreamException
Overrides:
loadMoreFromCurrent in class StreamScanner
Throws:
XMLStreamException

ensureInput

protected boolean ensureInput(int minAmount)
                       throws XMLStreamException
Description copied from class: StreamScanner
Method called to make sure the current main-level input buffer has at least the specified number of characters available consecutively, without having to call StreamScanner.loadMore(). It can only be called when input comes from the main-level buffer; further, the call can shift content in the input buffer, so the caller has to flush any data still pending. In short, the caller has to know exactly what it's doing. :-)

Note: method does not check for any other input sources than the current one -- if current source can not fulfill the request, a failure is indicated.

Overrides:
ensureInput in class StreamScanner
Returns:
true if there's now enough data; false if not (EOF)
Throws:
XMLStreamException
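The warning about shifted buffer content is easier to see in a minimal stand-alone buffer. The following sketch is an assumption-laden illustration, not the StreamScanner code: the class `EnsureInputSketch`, its fields and the `take` helper are hypothetical, and it reads from a plain java.io.Reader. It shows why any index into the buffer held by the caller becomes invalid after the call, and why a request that the current source cannot satisfy is reported as failure.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

public class EnsureInputSketch {
    private final Reader in;
    private final char[] buf;
    private int ptr; // index of the next unread character
    private int end; // index one past the last valid character

    public EnsureInputSketch(Reader in, int bufSize) {
        this.in = in;
        this.buf = new char[bufSize];
    }

    /** Returns true if at least minAmount characters are now contiguous in buf. */
    public boolean ensureInput(int minAmount) {
        if (minAmount > buf.length) {
            throw new IllegalArgumentException("request exceeds buffer size");
        }
        int avail = end - ptr;
        if (avail >= minAmount) {
            return true;
        }
        // Shift unread characters to the front; offsets the caller remembered
        // from before this call no longer point at the same characters
        System.arraycopy(buf, ptr, buf, 0, avail);
        ptr = 0;
        end = avail;
        try {
            while (end < minAmount) {
                int n = in.read(buf, end, buf.length - end);
                if (n < 0) {
                    return false; // EOF: this source cannot fulfil the request
                }
                end += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }

    /** Consumes and returns the next n characters. */
    public String take(int n) {
        if (!ensureInput(n)) {
            throw new IllegalStateException("unexpected end of input");
        }
        String s = new String(buf, ptr, n);
        ptr += n;
        return s;
    }
}
```

With a 4-character buffer over the input `<!ELEMENT`, taking 2 and then 4 characters forces one shift, and a later request for 4 more characters fails because only `ENT` remains.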

checkDTDKeyword

protected String checkDTDKeyword(String exp)
                          throws XMLStreamException
Method called to verify whether the input contains the specified keyword. If it does, returns null and leaves the input pointer at the character after the keyword; if not, returns the keyword-like token that was actually found, for error reporting purposes.

Throws:
XMLStreamException
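The return convention (null on success, the offending token on failure) is unusual enough that a small sketch may help. This is a stand-alone approximation over an in-memory buffer, not the actual method: `KeywordCheckSketch`, `checkKeyword`, `position` and the simplified `isNameChar` test are hypothetical, and the assumption that a keyword followed by further name characters counts as a mismatch is mine.

```java
public class KeywordCheckSketch {
    private final char[] buf;
    private int ptr; // index of the next unread character

    public KeywordCheckSketch(String input) {
        this.buf = input.toCharArray();
    }

    public int position() {
        return ptr;
    }

    /**
     * Returns null if the input continues with exp (pointer left just after it);
     * otherwise returns the name-like token actually found, for error messages.
     */
    public String checkKeyword(String exp) {
        int start = ptr;
        int i = 0;
        while (i < exp.length() && ptr < buf.length && buf[ptr] == exp.charAt(i)) {
            ++ptr;
            ++i;
        }
        boolean fullMatch = (i == exp.length());
        // A keyword followed by more name characters (ELEMENTS) is a mismatch
        if (fullMatch && (ptr >= buf.length || !isNameChar(buf[ptr]))) {
            return null;
        }
        // Consume the rest of the token so the caller can report it whole
        while (ptr < buf.length && isNameChar(buf[ptr])) {
            ++ptr;
        }
        return new String(buf, start, ptr - start);
    }

    // Simplified approximation of XML name characters (assumption)
    private static boolean isNameChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_' || c == '-' || c == '.' || c == ':';
    }
}
```

A caller would typically treat a non-null result as the text to put into an "unrecognized keyword" error message.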

readDTDKeyword

protected String readDTDKeyword(String prefix)
                         throws XMLStreamException
Method usually called when an error condition has been detected; reads the rest of the specified keyword (including characters that can be part of XML identifiers), appends it to the passed prefix (which is optional), and returns the resulting String.

Parameters:
prefix - Part of keyword already read in.
Throws:
XMLStreamException

readPI

protected void readPI()
               throws XMLStreamException
Method similar to MinimalDTDReader.skipPI(), but one that does basic well-formedness checks.

Throws:
XMLStreamException

readComment

protected void readComment(DTDEventListener l)
                    throws XMLStreamException
Method similar to MinimalDTDReader.skipComment(), but one that has to collect the contents so they can be reported to a SAX handler.

Throws:
XMLStreamException

findEntity

protected EntityDecl findEntity(String id,
                                Object arg)
Description copied from class: StreamScanner
Abstract method for sub-classes to implement, for finding a declared general or parsed entity.

Overrides:
findEntity in class MinimalDTDReader
Parameters:
arg - If Boolean.TRUE, we are expanding a general entity
id - Identifier of the entity to find

handleUndeclaredEntity

protected void handleUndeclaredEntity(String id)
                               throws XMLStreamException
Undeclared parameter entity is a VC, not WFC...

Overrides:
handleUndeclaredEntity in class MinimalDTDReader
Throws:
XMLStreamException

handleIncompleteEntityProblem

protected void handleIncompleteEntityProblem(WstxInputSource closing)
                                      throws XMLStreamException
Handling of PE matching problems is intricate: one type is a well-formedness constraint (WFC) violation ("PE Between Declarations", which concerns PEs that start outside declarations), and the other is only a validity constraint (VC) violation ("Proper Declaration/PE Nesting", when the PE is contained within a declaration).

Overrides:
handleIncompleteEntityProblem in class MinimalDTDReader
Throws:
XMLStreamException

handleGreedyEntityProblem

protected void handleGreedyEntityProblem(WstxInputSource input)
                                  throws XMLStreamException
Throws:
XMLStreamException

checkXmlSpaceAttr

protected void checkXmlSpaceAttr(int type,
                                 WordResolver enumValues)
                          throws XMLStreamException
Throws:
XMLStreamException

checkXmlIdAttr

protected void checkXmlIdAttr(int type)
                       throws XMLStreamException
Throws:
XMLStreamException