public abstract class ByteToCharConverter extends Object
This class defines an interface to allow conversion of bytes
to characters for a particular encoding scheme. Encoding converters
should reside in the com.integpg.io
package.
Many encoding schemes need to take state into account during conversion. That is, the conversion to a char might depend on the byte sequence converted before it. To accommodate this, a ByteToCharConverter can remember state between conversions (between calls to convert()). The caller should therefore call the flush() method to finalize the conversion and reset the converter's internal state.
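As an illustration of why state matters, the following sketch feeds the two bytes of a single UTF8 character to a converter in separate convert() calls. StatefulUtf8Sketch is a hypothetical stand-in written for this example: it mirrors the documented convert() and flush() signatures but is not the com.integpg.io implementation. It assumes that both methods return the number of chars written and that flush() rejects a truncated sequence; neither behavior is confirmed by this page.

```java
import java.io.CharConversionException;

// Hypothetical stand-in, NOT the com.integpg.io implementation.
// Handles only 1- to 3-byte UTF8 sequences (one char each).
final class StatefulUtf8Sketch {
    private int pending; // code point bits accumulated so far
    private int needed;  // continuation bytes still expected

    // Assumption: returns the number of chars written. dstEnd is ignored here.
    int convert(byte[] src, int srcStart, int srcEnd,
                char[] dst, int dstStart, int dstEnd) throws CharConversionException {
        int out = dstStart;
        for (int i = srcStart; i < srcEnd; i++) {
            int b = src[i] & 0xFF;
            if (needed > 0) {
                if ((b & 0xC0) != 0x80) {
                    throw new CharConversionException("bad continuation byte");
                }
                pending = (pending << 6) | (b & 0x3F);
                if (--needed == 0) {
                    dst[out++] = (char) pending;
                }
            } else if (b < 0x80) {
                dst[out++] = (char) b;
            } else if ((b & 0xE0) == 0xC0) {
                pending = b & 0x1F;
                needed = 1;
            } else if ((b & 0xF0) == 0xE0) {
                pending = b & 0x0F;
                needed = 2;
            } else {
                throw new CharConversionException("unsupported lead byte");
            }
        }
        return out - dstStart;
    }

    // Assumption: resets the state and reports a truncated sequence as an error.
    int flush(char[] buff, int start, int end) throws CharConversionException {
        boolean truncated = needed > 0;
        pending = 0;
        needed = 0;
        if (truncated) {
            throw new CharConversionException("truncated sequence");
        }
        return 0;
    }

    // Splits the UTF8 encoding of U+00E9 (0xC3 0xA9) across two convert() calls.
    static String demo() {
        StatefulUtf8Sketch decoder = new StatefulUtf8Sketch();
        char[] out = new char[4];
        try {
            int n = decoder.convert(new byte[] { (byte) 0xC3 }, 0, 1, out, 0, out.length);
            n += decoder.convert(new byte[] { (byte) 0xA9 }, 0, 1, out, n, out.length);
            n += decoder.flush(out, n, out.length);
            return new String(out, 0, n);
        } catch (CharConversionException e) {
            throw new IllegalStateException(e.getMessage());
        }
    }
}
```

The first call produces no output because the character is incomplete; the second call completes it using the state kept from the first.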
Subclasses of this abstract class need to implement getMaxCharCount(), convert(), flush(), and getName().
Programs should not call into a converter directly. A better way to convert a byte array is to use the java.lang.String(byte[],String) constructor.
...
//byte[] preConvertedBytes is previously declared and
//has a sequence of UTF8 encoded bytes
String str = new String(preConvertedBytes,"UTF8");
...
This will convert the bytes stored in preConvertedBytes into a String according to the UTF8 encoding scheme.
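A self-contained version of the same idea is sketched below. The class and method names (StringDecodeExample, decodeUtf8) are invented for the example, and the handling of UnsupportedEncodingException is one reasonable choice rather than a documented requirement. The sketch assumes a runtime on which the "UTF8" encoding name is available.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical helper showing the String(byte[],String) constructor in context.
final class StringDecodeExample {
    static String decodeUtf8(byte[] preConvertedBytes) {
        try {
            // The constructor selects the converter for the named encoding itself.
            return new String(preConvertedBytes, "UTF8");
        } catch (UnsupportedEncodingException e) {
            // Thrown when no converter exists for the requested encoding name.
            throw new IllegalStateException("UTF8 converter not available");
        }
    }

    public static void main(String[] args) {
        // "Hi " followed by the two-byte UTF8 encoding of U+00E9.
        byte[] preConvertedBytes = { 'H', 'i', ' ', (byte) 0xC3, (byte) 0xA9 };
        System.out.println(decodeUtf8(preConvertedBytes));
    }
}
```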
See Also:
CharToByteConverter
Constructor and Description |
---|
ByteToCharConverter() |
Modifier and Type | Method | Description |
---|---|---|
abstract int | convert(byte[] src, int srcStart, int srcEnd, char[] dst, int dstStart, int dstEnd) | Converts the specified byte array into a char array based on this ByteToCharConverter's encoding scheme. |
abstract int | flush(char[] buff, int start, int end) | Tells the ByteToCharConverter to convert any unconverted data it has internally stored. |
static ByteToCharConverter | getConverter(String name) | Dynamically loads a ByteToCharConverter for the specified encoding scheme. |
static ByteToCharConverter | getDefaultConverter() | Returns the default ByteToCharConverter for the system. |
abstract int | getMaxCharCount(byte[] forThis, int start, int end) | Returns the number of characters that the specified byte sequence will require for encoding. |
abstract String | getName() | Returns the name of this encoding scheme. |
public static ByteToCharConverter getConverter(String name)
Dynamically loads a ByteToCharConverter for the specified encoding scheme. All converters should be placed in the com.integpg.io package and have the class name ByteToCharNAME, where NAME is the encoding scheme. For example, the UTF8 ByteToCharConverter is called com.integpg.io.ByteToCharUTF8.
Parameters:
name - the name of the encoding scheme
Returns:
the converter for the specified encoding scheme, or null if none could be found
public static ByteToCharConverter getDefaultConverter()
Returns the default ByteToCharConverter for the system. The name of the default encoding scheme is stored in the system property "file.encoding". This method finds the name of the default encoding scheme and calls getConverter() with that name as its argument.
Returns:
the default converter, or null if the converter could not be found
See Also:
getConverter(java.lang.String)
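The naming convention and the default lookup can be sketched as follows. This page does not say how getConverter() actually loads a class, so the reflective lookup here is an assumption, and ConverterNameSketch with its three methods is hypothetical. Whether the real method normalizes the encoding name (case, hyphens) is also not stated, so the sketch uses the name unchanged.

```java
// Hypothetical sketch of the ByteToCharNAME convention, NOT the real getConverter().
final class ConverterNameSketch {
    // Builds the class name that the documented convention implies.
    static String converterClassName(String name) {
        return "com.integpg.io.ByteToChar" + name;
    }

    // Assumption: the converter has a no-argument constructor and is loaded reflectively.
    // Returns null if the class cannot be found or instantiated.
    static Object loadOrNull(String name) {
        try {
            return Class.forName(converterClassName(name))
                        .getDeclaredConstructor()
                        .newInstance();
        } catch (ReflectiveOperationException | LinkageError e) {
            return null;
        }
    }

    // Mirrors the documented default lookup: read "file.encoding", then load by name.
    static Object loadDefaultOrNull() {
        String encoding = System.getProperty("file.encoding");
        return encoding == null ? null : loadOrNull(encoding);
    }

    public static void main(String[] args) {
        System.out.println(converterClassName("UTF8"));
        System.out.println(loadOrNull("UTF8") != null ? "converter found" : "converter not found");
    }
}
```

On a runtime that does not ship the com.integpg.io classes, both load methods return null, which matches the documented "null if none could be found" behavior.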
public abstract int getMaxCharCount(byte[] forThis, int start, int end)
Returns the number of characters that the specified byte sequence will require for encoding. For instance, in UTF8 encoding, a one-, two-, or three-byte sequence may encode to one char. This method should always be called before the convert() method. Because of conversion errors, the value returned may not be the actual number of characters that conversion produces, but it is the maximum that will be produced.
Parameters:
forThis - the byte sequence to determine the required encoding size
start - offset into the byte array to begin processing
end - the ending offset in the byte array to stop processing. The number of processed bytes will then be (end-start).
Returns:
the maximum number of characters required to convert the byte sequence
See Also:
convert(byte[],int,int,char[],int,int)
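To make the "maximum, not actual" distinction concrete, here is a minimal sketch of two ways a UTF8 converter might size its output. Both methods are hypothetical and are not taken from the com.integpg.io source. They assume one- to three-byte sequences only, each yielding one char, and the tighter count additionally assumes that no partial sequence is pending from an earlier convert() call.

```java
// Hypothetical sizing helpers for UTF8 input, NOT the real getMaxCharCount().
final class MaxCharCountSketch {
    // Safe upper bound: every char needs at least one byte from this range.
    static int maxCharCount(byte[] forThis, int start, int end) {
        return end - start;
    }

    // Tighter estimate: count bytes that are not continuation bytes (10xxxxxx).
    // Only valid when the range starts on a sequence boundary.
    static int leadByteCount(byte[] forThis, int start, int end) {
        int count = 0;
        for (int i = start; i < end; i++) {
            if ((forThis[i] & 0xC0) != 0x80) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // 'a' (1 byte), U+00E9 (2 bytes), U+20AC (3 bytes)
        byte[] utf8 = { 'a', (byte) 0xC3, (byte) 0xA9, (byte) 0xE2, (byte) 0x82, (byte) 0xAC };
        System.out.println(maxCharCount(utf8, 0, utf8.length));
        System.out.println(leadByteCount(utf8, 0, utf8.length));
    }
}
```

For the six sample bytes the upper bound is 6 while only 3 chars are produced, so a destination array sized from the bound is large enough but not necessarily full.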
public abstract int convert(byte[] src, int srcStart, int srcEnd, char[] dst, int dstStart, int dstEnd) throws CharConversionException
Converts the specified byte array into a char array based on this ByteToCharConverter's encoding scheme. getMaxCharCount() should always be called first to find out how much room is required in the destination char array.
Parameters:
src - the same byte array passed to getMaxCharCount()
srcStart - the same starting offset as passed to getMaxCharCount()
srcEnd - the same ending offset as passed to getMaxCharCount()
dst - the destination character array
dstStart - the offset to begin storing converted characters in the destination array
dstEnd - the ending location for storing converted characters in the destination array. An implementation may ignore this argument and continue converting bytes until finished.
Throws:
CharConversionException - if an illegal byte sequence is encountered that cannot be converted
See Also:
getMaxCharCount(byte[],int,int), flush(char[],int,int)
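The call order described above (size with getMaxCharCount(), then convert(), then flush()) can be sketched as follows. Because the com.integpg.io classes are not assumed to be on the classpath, ConverterSketch is a hypothetical stand-in that declares only the three documented abstract methods used here, and AsciiSketch is a toy 7-bit ASCII converter written for the example. The sketch assumes that convert() and flush() return the number of chars written, which this page does not confirm.

```java
import java.io.CharConversionException;

// Hypothetical stand-in for the abstract methods documented on this page.
abstract class ConverterSketch {
    abstract int getMaxCharCount(byte[] forThis, int start, int end);

    abstract int convert(byte[] src, int srcStart, int srcEnd,
                         char[] dst, int dstStart, int dstEnd) throws CharConversionException;

    abstract int flush(char[] buff, int start, int end) throws CharConversionException;
}

// Toy 7-bit ASCII converter used only to exercise the call order.
final class AsciiSketch extends ConverterSketch {
    int getMaxCharCount(byte[] forThis, int start, int end) {
        return end - start; // one char per byte
    }

    int convert(byte[] src, int srcStart, int srcEnd,
                char[] dst, int dstStart, int dstEnd) throws CharConversionException {
        int out = dstStart;
        for (int i = srcStart; i < srcEnd; i++) {
            if (src[i] < 0) { // high bit set: not 7-bit ASCII
                throw new CharConversionException("illegal byte at offset " + i);
            }
            dst[out++] = (char) src[i];
        }
        return out - dstStart; // assumption: number of chars written
    }

    int flush(char[] buff, int start, int end) {
        return 0; // stateless, so nothing is pending
    }
}

final class ConvertCallOrder {
    static String decode(ConverterSketch converter, byte[] src) throws CharConversionException {
        // 1. Ask how much room the destination needs.
        char[] dst = new char[converter.getMaxCharCount(src, 0, src.length)];
        // 2. Convert with the same array and offsets passed to getMaxCharCount().
        int n = converter.convert(src, 0, src.length, dst, 0, dst.length);
        // 3. Flush to finish any pending conversion and reset state.
        n += converter.flush(dst, n, dst.length);
        return new String(dst, 0, n);
    }

    // Convenience wrapper: returns null when the input is not valid ASCII.
    static String decodeAscii(byte[] src) {
        try {
            return decode(new AsciiSketch(), src);
        } catch (CharConversionException e) {
            return null;
        }
    }
}
```

A real converter that buffers data may emit chars from flush(), so the destination array must leave room for them; the toy converter above never does.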
public abstract int flush(char[] buff, int start, int end) throws CharConversionException
Tells the ByteToCharConverter to convert any unconverted data it has internally stored. Some ByteToCharConverters store state between calls to convert(). Since the converter may be left in an unknown state, it should be flushed to notify it that no more input will be received. The converter can then handle any unfinished conversions before its output is used.
Parameters:
buff - the destination character array
start - the next available offset into the destination array
end - offset in the destination array to stop placing data (may be ignored by some algorithms)
Returns:
... flush()
Throws:
CharConversionException - if an illegal character is encountered that cannot be converted
See Also:
convert(byte[],int,int,char[],int,int)
public abstract String getName()