javolution.io
Class UTF8StreamReader

Object
  extended by Reader
      extended by UTF8StreamReader
All Implemented Interfaces:
Closeable, Readable, Reusable

public final class UTF8StreamReader
extends Reader
implements Reusable

This class represents a UTF-8 stream reader.

This reader supports surrogate char pairs (representing characters in the range [U+10000 .. U+10FFFF]). It can also read Unicode code points (up to 31 bits) directly; see read().
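
The relationship between a code point and its surrogate pair can be sketched with the standard java.lang.Character API. The class and method names below (SurrogateSketch, toUtf16, fromSurrogates) are hypothetical and are not part of Javolution; the sketch only illustrates what "two char per character" means for values in [U+10000 .. U+10FFFF].

```java
final class SurrogateSketch {

    /** Splits a Unicode code point into the UTF-16 chars a char-based read would deliver. */
    static char[] toUtf16(int codePoint) {
        // One char for the Basic Multilingual Plane, a high/low
        // surrogate pair for U+10000 .. U+10FFFF.
        return Character.toChars(codePoint);
    }

    /** Recombines a surrogate pair into the code point it represents. */
    static int fromSurrogates(char high, char low) {
        return Character.toCodePoint(high, low);
    }

    public static void main(String[] args) {
        int clef = 0x1D11E; // MUSICAL SYMBOL G CLEF, outside the BMP
        char[] pair = toUtf16(clef);
        System.out.printf("U+%X needs %d char(s)%n", clef, pair.length);
    }
}
```

Note that Character.toChars throws IllegalArgumentException for values above U+10FFFF, so a 31-bit value obtained from read() can only be turned into char values when it falls inside the valid Unicode range.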

Each invocation of one of the read() methods may cause one or more bytes to be read from the underlying byte-input stream. To enable the efficient conversion of bytes to characters, more bytes may be read ahead from the underlying stream than are necessary to satisfy the current read operation.

Instances of this class can be reused for different input streams and can be part of a higher-level component (e.g. a parser) in order to avoid dynamic buffer allocation when the input source changes. Wrapping an instance in a java.io.BufferedReader is unnecessary, because instances of this class embed their own data buffers.

Note: This reader is unsynchronized and does not check that the UTF-8 input is well-formed (e.g. overlong sequences, which use more bytes than necessary to encode a character, are not detected).
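
To make "longer than necessary" concrete, the sketch below decodes a two-byte sequence without any validity check and compares the result with the JDK's strict UTF-8 decoder. It is an illustration only: OverlongSketch, lenientTwoByte and strictAccepts are hypothetical names, and the lenient arithmetic is the textbook UTF-8 bit layout, not code taken from this class.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

final class OverlongSketch {

    /** Decodes a two-byte UTF-8 sequence without any validity check. */
    static int lenientTwoByte(int b1, int b2) {
        // 110xxxxx 10yyyyyy -> xxxxxyyyyyy; nothing verifies that the
        // resulting value actually needed two bytes.
        return ((b1 & 0x1F) << 6) | (b2 & 0x3F);
    }

    /** Returns true if the JDK's strict UTF-8 decoder accepts the bytes. */
    static boolean strictAccepts(byte[] bytes) {
        try {
            // A fresh CharsetDecoder reports malformed input by throwing.
            StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

For instance, the bytes 0xC1 0x81 are an overlong form of 'A' (U+0041): the lenient arithmetic yields 0x41, while the strict decoder refuses the sequence. Input from untrusted sources may therefore need separate validation if overlong forms matter to the application.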

Version:
2.0, December 9, 2004
Author:
Jean-Marie Dautelle
See Also:
UTF8StreamWriter

Field Summary
 
Fields inherited from class Reader
lock
 
Constructor Summary
UTF8StreamReader()
          Creates a UTF-8 reader having a byte buffer of moderate capacity (2048).
UTF8StreamReader(int capacity)
          Creates a UTF-8 reader having a byte buffer of specified capacity.
 
Method Summary
 void close()
          Closes and resets this reader for reuse.
 int read()
          Reads a single character.
 void read(Appendable dest)
          Reads characters into the specified appendable.
 int read(char[] cbuf, int off, int len)
          Reads characters into a portion of an array.
 boolean ready()
          Indicates if this stream is ready to be read.
 void reset()
          Resets the internal state of this object to its default values.
 UTF8StreamReader setInput(InputStream inStream)
          Sets the input stream to use for reading until this reader is closed.
 UTF8StreamReader setInputStream(InputStream inStream)
          Deprecated. Replaced by setInput(InputStream)
 
Methods inherited from class Reader
mark, markSupported, read, read, skip
 
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

UTF8StreamReader

public UTF8StreamReader()
Creates a UTF-8 reader having a byte buffer of moderate capacity (2048).


UTF8StreamReader

public UTF8StreamReader(int capacity)
Creates a UTF-8 reader having a byte buffer of specified capacity.

Parameters:
capacity - the capacity of the byte buffer.
Method Detail

setInput

public UTF8StreamReader setInput(InputStream inStream)
Sets the input stream to use for reading until this reader is closed. For example:
     Reader reader = new UTF8StreamReader().setInput(inStream);
 
is equivalent to, but reads twice as fast as:
     Reader reader = new java.io.InputStreamReader(inStream, "UTF-8");
 

Parameters:
inStream - the input stream.
Returns:
this UTF-8 reader.
Throws:
IllegalStateException - if this reader is being reused and it has not been closed or reset.

ready

public boolean ready()
              throws IOException
Indicates if this stream is ready to be read.

Overrides:
ready in class Reader
Returns:
true if the next read() is guaranteed not to block for input; false otherwise.
Throws:
IOException - if an I/O error occurs.

close

public void close()
           throws IOException
Closes and resets this reader for reuse.

Specified by:
close in interface Closeable
Specified by:
close in class Reader
Throws:
IOException - if an I/O error occurs.

read

public int read()
         throws IOException
Reads a single character. This method will block until a character is available, an I/O error occurs or the end of the stream is reached.

Overrides:
read in class Reader
Returns:
the Unicode code point (up to 31 bits) of the character read, or -1 if the end of the stream has been reached.
Throws:
IOException - if an I/O error occurs.
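
As a rough illustration of how a single read can yield a value of up to 31 bits, the following sketch decodes one UTF-8 sequence from a plain java.io.InputStream. It is not Javolution's implementation: CodePointReadSketch and readCodePoint are hypothetical names, there is no internal buffering, and, like the reader described here, continuation bytes and overlong forms are not validated.

```java
import java.io.IOException;
import java.io.InputStream;

final class CodePointReadSketch {

    /**
     * Reads one UTF-8 sequence and returns its code point,
     * or -1 if the end of the stream has been reached.
     */
    static int readCodePoint(InputStream in) throws IOException {
        int b = in.read();
        if (b < 0) return -1;   // end of stream
        if (b < 0x80) return b; // single byte (ASCII)

        int extra; // number of continuation bytes that follow
        int code;  // payload bits taken from the lead byte
        if ((b & 0xE0) == 0xC0)      { extra = 1; code = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; code = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; code = b & 0x07; }
        else if ((b & 0xFC) == 0xF8) { extra = 4; code = b & 0x03; }
        else if ((b & 0xFE) == 0xFC) { extra = 5; code = b & 0x01; } // 1 + 5 * 6 = 31 bits
        else throw new IOException("Unexpected UTF-8 lead byte: 0x" + Integer.toHexString(b));

        for (int i = 0; i < extra; i++) {
            int next = in.read();
            if (next < 0) throw new IOException("Truncated UTF-8 sequence");
            code = (code << 6) | (next & 0x3F); // append 6 payload bits
        }
        return code;
    }
}
```

A caller that needs char values can pass a returned code point in the valid Unicode range to StringBuilder.appendCodePoint(int).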

read

public int read(char[] cbuf,
                int off,
                int len)
         throws IOException
Reads characters into a portion of an array. This method will block until some input is available, an I/O error occurs or the end of the stream is reached.

Note: Characters between U+10000 and U+10FFFF are represented by surrogate pairs (two char values).

Specified by:
read in class Reader
Parameters:
cbuf - the destination buffer.
off - the offset at which to start storing characters.
len - the maximum number of characters to read.
Returns:
the number of characters read, or -1 if the end of the stream has been reached.
Throws:
IOException - if an I/O error occurs.
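
Because this method follows the java.io.Reader contract, the usual read loop applies. The helper below is a minimal sketch written against the Reader base class, so it should work with any reader, including one obtained from setInput(InputStream); ReadLoopSketch and readAll are hypothetical names and the 1024-char buffer size is an arbitrary choice.

```java
import java.io.IOException;
import java.io.Reader;

final class ReadLoopSketch {

    /** Drains the reader into a String using read(char[], int, int). */
    static String readAll(Reader reader) throws IOException {
        char[] cbuf = new char[1024];
        StringBuilder sb = new StringBuilder();
        int n;
        // read returns the number of chars stored, or -1 at end of stream.
        while ((n = reader.read(cbuf, 0, cbuf.length)) != -1) {
            sb.append(cbuf, 0, n);
        }
        return sb.toString();
    }
}
```

The loop appends only the first n chars of the buffer on each pass, since a read may return fewer characters than requested.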

read

public void read(Appendable dest)
          throws IOException
Reads characters into the specified appendable. This method will block until the end of the stream is reached.

Parameters:
dest - the destination appendable.
Throws:
IOException - if an I/O error occurs.

reset

public void reset()
Description copied from interface: Reusable
Resets the internal state of this object to its default values.

Specified by:
reset in interface Reusable
Overrides:
reset in class Reader

setInputStream

public UTF8StreamReader setInputStream(InputStream inStream)
Deprecated. Replaced by setInput(InputStream)



Copyright © 2005-2012 Javolution. All Rights Reserved.