com.devexperts.io
Class IOUtil

java.lang.Object
  extended by com.devexperts.io.IOUtil

public class IOUtil
extends Object

Utility class that provides algorithms for data serialization and deserialization. It defines several compact data formats and provides a clean and convenient API for using them.

CompactInt

CompactInt is a serialization format for integer numbers. It uses a variable-length, two's complement, big-endian encoding capable of representing 64-bit signed numbers.

The following table defines the serial format (the first byte is shown in bits, with 'x' representing a payload bit; the remaining bytes are shown as a bit count):

 0xxxxxxx     - for -64 <= N < 64
 10xxxxxx  8x - for -8192 <= N < 8192
 110xxxxx 16x - for -1048576 <= N < 1048576
 1110xxxx 24x - for -134217728 <= N < 134217728
 11110xxx 32x - for -17179869184 <= N < 17179869184 (includes whole range of signed int)
 111110xx 40x - for -2199023255552 <= N < 2199023255552
 1111110x 48x - for -281474976710656 <= N < 281474976710656
 11111110 56x - for -36028797018963968 <= N < 36028797018963968
 11111111 64x - for -9223372036854775808 <= N < 9223372036854775808 (the range of signed long)
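
As an illustration of the table above, the following sketch derives the encoded length, the prefix bits and the payload directly from the listed ranges. The class and method names (CompactSketch, compactLength, encode, decode) are hypothetical, and the code is one reading of the table under that assumption, not the IOUtil implementation.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration of the table above; not the IOUtil source code.
class CompactSketch {

    // Smallest number of bytes whose payload (7 bits per byte for 1..8 bytes,
    // 64 bits for 9 bytes) can hold n as a two's complement value.
    static int compactLength(long n) {
        for (int k = 1; k <= 8; k++) {
            long limit = 1L << (7 * k - 1); // 64, 8192, 1048576, ...
            if (n >= -limit && n < limit) {
                return k;
            }
        }
        return 9;
    }

    // Encodes n: prefix bits and high payload bits first, then the remaining bytes.
    static byte[] encode(long n) {
        int extra = compactLength(n) - 1; // bytes following the first one
        ByteArrayOutputStream out = new ByteArrayOutputStream(extra + 1);
        if (extra == 8) {
            out.write(0xFF); // 11111111 followed by all 64 bits
        } else {
            int prefix = (0xFF00 >> extra) & 0xFF; // 'extra' one bits, then a zero
            int mask = 0x7F >> extra;              // payload bits of the first byte
            out.write(prefix | ((int) (n >> (8 * extra)) & mask));
        }
        for (int shift = 8 * (extra - 1); shift >= 0; shift -= 8) {
            out.write((int) (n >>> shift)); // only the low 8 bits are written
        }
        return out.toByteArray();
    }

    // Decodes a value produced by encode(long).
    static long decode(byte[] b) {
        int first = b[0] & 0xFF;
        // Number of leading one bits in the first byte (0..8).
        int extra = Integer.numberOfLeadingZeros(~first & 0xFF) - 24;
        long v = first & (0x7F >> extra);
        for (int i = 1; i <= extra; i++) {
            v = (v << 8) | (b[i] & 0xFF);
        }
        if (extra == 8) {
            return v; // full 64-bit payload, no sign extension needed
        }
        int unused = 64 - 7 * (extra + 1); // bits above the payload
        return (v << unused) >> unused;    // sign-extend the payload
    }
}
```

For example, under this reading encode(64) yields the two bytes 0x80 0x40, because 64 no longer fits the one-byte range -64 <= N < 64.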
 

Compact Encapsulation

Compact Encapsulation is a method of wrapping serial data for representation on another layer. Little dedicated API exists for compact encapsulation; it is a technique used implicitly by other APIs.

This method first writes the length of the encapsulated data in a compact format and then writes the data itself. By convention, length values less than -1 are illegal (reserved for future use); a length of -1 indicates the special case of null data (if applicable); a length of 0 indicates empty data (as applicable; e.g. no data elements); and positive lengths indicate either the number of data elements or the number of bytes they occupy.

Note: the length of encapsulated data is formally declared as a long value; readers shall read the full 64-bit length value and report overflow if they cannot handle large values.
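
The length conventions above can be sketched from the reader's side as follows. This is a minimal illustration that assumes the length has already been decoded into a long and that it counts bytes; EncapsulationSketch and readBody are hypothetical names, not part of IOUtil.

```java
import java.io.DataInput;
import java.io.IOException;

// Hypothetical reader-side handling of an already decoded length value.
class EncapsulationSketch {

    // Reads the encapsulated bytes that follow a decoded length,
    // applying the conventions described above.
    static byte[] readBody(DataInput in, long length) throws IOException {
        if (length < -1) {
            throw new IOException("Illegal length: " + length); // reserved values
        }
        if (length == -1) {
            return null; // special case: null data
        }
        if (length > Integer.MAX_VALUE) {
            // The full 64-bit value was read, but a Java array cannot hold it.
            throw new IOException("Length overflow: " + length);
        }
        byte[] data = new byte[(int) length]; // length 0 gives an empty array
        in.readFully(data);
        return data;
    }
}
```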

UTF

The UTF API works with Unicode character data in several formats: Unicode code point (int value), UTF-16 encoding (String class) and UTF-8 encoding (serial format). The UTF API uses compact encapsulation with length defined as number of UTF-8 encoded bytes.

Note: the UTF API uses the official UTF-8 encoding format, while Java serialization uses a modified UTF-8 format. As a result, several APIs seemingly work with the same UTF-8 encoding, yet differ in encoding and encapsulation.
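
The difference between the two encodings can be observed with standard JDK classes alone. The sketch below compares String.getBytes with StandardCharsets.UTF_8 (official UTF-8) against DataOutputStream.writeUTF (modified UTF-8 with a 2-byte length prefix); the class name Utf8Comparison is hypothetical and the code does not use IOUtil.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that contrasts the two encodings using JDK classes only.
class Utf8Comparison {

    // Official UTF-8, without any length prefix.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Modified UTF-8 as written by DataOutputStream.writeUTF:
    // a 2-byte length prefix followed by the encoded characters.
    static byte[] modifiedUtf8(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            new DataOutputStream(buf).writeUTF(s);
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Two inputs make the difference visible. The NUL character takes one byte in official UTF-8 but two bytes (0xC0 0x80) in modified UTF-8. A supplementary character such as U+1F600 takes four bytes in official UTF-8 but six in modified UTF-8, where each surrogate is encoded separately.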

Object

The Object API helps to serialize and deserialize objects - either individual or declared groups.

The serial form of an individual object is defined as a byte array whose content is the result of independent serialization of that object by a new instance of ObjectOutputStream. When this byte array is written to the output, it uses compact encapsulation.

The serial form of a declared group of objects uses more efficient serialization of primitive data (individual values and arrays) and a single ObjectOutputStream for the remaining non-primitive data. It uses compact encapsulation of the resulting byte array when writing it to the output. This API requires knowledge of the declared data types on both sides (serialization and deserialization) in order to work. It is intended for cases where method arguments are serialized for RMI, because in these cases both sides know the declared types of those arguments.
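
The serial form of an individual object described above can be approximated with plain Java Serialization. The following sketch uses a fresh ObjectOutputStream per object; ObjectBytesSketch, toBytes and fromBytes are hypothetical names, and the handling of Marshalled objects and of declared groups is deliberately left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical sketch: one new ObjectOutputStream per serialized object.
class ObjectBytesSketch {

    // Serializes the object independently of any other object.
    static byte[] toBytes(Object object) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buf)) {
            oos.writeObject(object);
        }
        return buf.toByteArray();
    }

    // Restores the object from its independent serial form.
    static Object fromBytes(byte[] bytes) throws IOException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        } catch (ClassNotFoundException e) {
            throw new IOException("Cannot resolve class", e);
        }
    }
}
```

Because every call creates its own stream, each byte array starts with the serialization stream header and can be decoded without any context from other objects.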

Compression

The Compression API allows serial data to be compressed in order to reduce the space it occupies and save the resources needed to store or transmit it.

The Compression API uses the Deflate algorithm (see RFC 1951) and wraps compressed data blocks using the ZLIB format (see RFC 1950). It can be used on its own for arbitrary data and it is also intended to be used transparently by the serialization API.
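
The combination of Deflate and ZLIB wrapping described above is what java.util.zip.Deflater and java.util.zip.Inflater produce and consume by default. The sketch below shows a round trip with those JDK classes; ZlibSketch is a hypothetical name and the code is an illustration of the format, not the IOUtil implementation.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch of ZLIB-wrapped Deflate compression using JDK classes.
class ZlibSketch {

    // level: 0..9, or -1 for the default compression level.
    static byte[] deflate(byte[] bytes, int level) {
        Deflater deflater = new Deflater(level); // default mode writes a ZLIB header
        try {
            deflater.setInput(bytes);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native resources
        }
    }

    static byte[] inflate(byte[] bytes) throws DataFormatException {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(bytes);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    // Avoid looping forever on truncated or dictionary-based input.
                    throw new DataFormatException("Truncated or unsupported data");
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            inflater.end();
        }
    }
}
```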


Method Summary
static Object bytesToObject(byte[] bytes)
          Deserializes an array of bytes into object with Java Serialization.
static Object bytesToObject(byte[] bytes, ClassLoader cl)
          Deserializes an array of bytes into object with Java Serialization.
static Object[] bytesToObjects(Class[] types, byte[] bytes)
          Deserializes an array of bytes into declared group of objects according to their declared types.
static Object[] bytesToObjects(Class[] types, byte[] bytes, ClassLoader cl)
          Deserializes an array of bytes into declared group of objects according to their declared types.
static void checkRange(byte[] b, int off, int len)
          Throws IndexOutOfBoundsException if parameters are out of range.
static byte[] compress(byte[] bytes)
          Compresses an array of bytes using Deflate algorithm as appropriate.
static byte[] decompress(byte[] bytes)
          Decompresses an array of bytes using Inflate algorithm repeatedly as appropriate.
static byte[] deflate(byte[] bytes, int level)
          Compresses an array of bytes using Deflate algorithm with specified compression level.
static int getCompactLength(long n)
          Returns number of bytes that are needed to write specified number in a compact format.
static byte[] inflate(byte[] bytes)
          Decompresses an array of bytes using Inflate algorithm (reverse of Deflate algorithm).
static boolean isCompressionEnabled()
          Returns value of compression strategy.
static byte[] objectsToBytes(Class[] types, Object... objects)
          Serializes a declared group of objects to an array of bytes according to their declared types.
static byte[] objectToBytes(Object object)
          Serializes an object to an array of bytes with Java Serialization.
static byte[] readByteArray(DataInput in)
          Reads an array of bytes from the data input in a compact encapsulation format.
static char[] readCharArray(DataInput in)
          Reads an array of characters from the data input in a CESU-8 format with compact encapsulation.
static String readCharArrayString(DataInput in)
          Deprecated.  
static int readCompactInt(DataInput in)
          Reads an int value from the data input in a compact format.
static long readCompactLong(DataInput in)
          Reads a long value from the data input in a compact format.
static Object readObject(DataInput in)
          Reads an object from the data input as a Java-serialized byte array with compact encapsulation.
static Object readObject(DataInput in, ClassLoader cl)
          Reads an object from the data input as a Java-serialized byte array with compact encapsulation.
static int readUTFChar(DataInput in)
          Reads Unicode code point from the data input in a UTF-8 format.
static String readUTFString(DataInput in)
          Reads Unicode string from the data input in a UTF-8 format with compact encapsulation.
static void setCompressionEnabled(boolean compressionEnabled)
          Sets new value for compression strategy.
static void writeByteArray(DataOutput out, byte[] bytes)
          Writes an array of bytes to the data output in a compact encapsulation format.
static void writeCharArray(DataOutput out, char[] chars)
          Writes an array of characters to the data output in a CESU-8 format with compact encapsulation.
static void writeCharArray(DataOutput out, String str)
          Writes an array of characters to the data output in a CESU-8 format with compact encapsulation.
static void writeCompactInt(DataOutput out, int v)
          Writes an int value to the data output in a compact format.
static void writeCompactLong(DataOutput out, long v)
          Writes a long value to the data output in a compact format.
static void writeObject(DataOutput out, Object object)
          Writes an object to the data output as a Java-serialized byte array with compact encapsulation.
static void writeUTFChar(DataOutput out, int codePoint)
          Writes a Unicode code point to the data output in a UTF-8 format.
static void writeUTFString(DataOutput out, String str)
          Writes a string to the data output in a UTF-8 format with compact encapsulation.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

checkRange

public static void checkRange(byte[] b,
                              int off,
                              int len)
Throws IndexOutOfBoundsException if parameters are out of range.
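
The exact conditions checked are not spelled out here. The sketch below assumes the usual java.io convention for (array, offset, length) triples: off and len must be non-negative and off + len must not exceed b.length. RangeCheckSketch is a hypothetical name.

```java
// Hypothetical range check following the common java.io convention.
class RangeCheckSketch {

    static void checkRange(byte[] b, int off, int len) {
        // Comparing len with b.length - off avoids int overflow in off + len.
        if (off < 0 || len < 0 || len > b.length - off) {
            throw new IndexOutOfBoundsException(
                "off=" + off + ", len=" + len + ", array length=" + b.length);
        }
    }
}
```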


getCompactLength

public static int getCompactLength(long n)
Returns number of bytes that are needed to write specified number in a compact format.

Parameters:
n - the number whose compact length is returned
Returns:
number of bytes that are needed to write specified number in a compact format

writeCompactInt

public static void writeCompactInt(DataOutput out,
                                   int v)
                            throws IOException
Writes an int value to the data output in a compact format.

Parameters:
out - the destination to write to
v - the int value to be written
Throws:
IOException - if an I/O error occurs

writeCompactLong

public static void writeCompactLong(DataOutput out,
                                    long v)
                             throws IOException
Writes a long value to the data output in a compact format.

Parameters:
out - the destination to write to
v - the long value to be written
Throws:
IOException - if an I/O error occurs

readCompactInt

public static int readCompactInt(DataInput in)
                          throws IOException
Reads an int value from the data input in a compact format. If the actual encoded value does not fit into an int, it is truncated to an int value (only the lower 32 bits are returned); the number is still read in its entirety in this case.

Parameters:
in - the source to read from
Returns:
the int value read
Throws:
IOException - if an I/O error occurs
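
Keeping only the lower 32 bits matches what a narrowing cast from long to int does in Java. The small sketch below illustrates the effect on a few values; TruncationSketch is a hypothetical name, and the use of a cast is an assumption about how such truncation is typically done, not a statement about the IOUtil source.

```java
// Hypothetical illustration of "only lower 32 bits are returned".
class TruncationSketch {

    // A narrowing cast discards the upper 32 bits of the decoded value.
    static int truncate(long decoded) {
        return (int) decoded;
    }
}
```

With this behavior a decoded value of 4294967301 (2^32 + 5) is returned as 5, and 2147483648 (one more than the largest int) is returned as -2147483648, so callers that expect large values should prefer readCompactLong.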

readCompactLong

public static long readCompactLong(DataInput in)
                            throws IOException
Reads a long value from the data input in a compact format.

Parameters:
in - the source to read from
Returns:
the long value read
Throws:
IOException - if an I/O error occurs

writeByteArray

public static void writeByteArray(DataOutput out,
                                  byte[] bytes)
                           throws IOException
Writes an array of bytes to the data output in a compact encapsulation format. This method defines length as a number of bytes.

Parameters:
out - the destination to write to
bytes - the byte array to be written
Throws:
IOException - if an I/O error occurs

readByteArray

public static byte[] readByteArray(DataInput in)
                            throws IOException
Reads an array of bytes from the data input in a compact encapsulation format. This method defines length as a number of bytes.

Parameters:
in - the source to read from
Returns:
the byte array read
Throws:
IOException - if an I/O error occurs

writeCharArray

public static void writeCharArray(DataOutput out,
                                  char[] chars)
                           throws IOException
Writes an array of characters to the data output in a CESU-8 format with compact encapsulation. This method defines length as a number of characters.

Parameters:
out - the destination to write to
chars - the char array to be written
Throws:
IOException - if an I/O error occurs
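
To make the CESU-8 format concrete, the following sketch encodes each char on its own, so that a surrogate pair becomes two 3-byte sequences instead of the single 4-byte sequence of official UTF-8. Cesu8Sketch is a hypothetical name; the sketch covers only the character bytes and omits the compact length prefix.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical per-char CESU-8 encoder; the length prefix is not included.
class Cesu8Sketch {

    static byte[] encode(char[] chars) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (char c : chars) {
            if (c < 0x80) {
                out.write(c);                        // 0xxxxxxx
            } else if (c < 0x800) {
                out.write(0xC0 | (c >> 6));          // 110xxxxx
                out.write(0x80 | (c & 0x3F));        // 10xxxxxx
            } else {
                // Includes surrogates, which are encoded like any other char.
                out.write(0xE0 | (c >> 12));         // 1110xxxx
                out.write(0x80 | ((c >> 6) & 0x3F)); // 10xxxxxx
                out.write(0x80 | (c & 0x3F));        // 10xxxxxx
            }
        }
        return out.toByteArray();
    }
}
```

For text within the Basic Multilingual Plane the result is identical to official UTF-8. For U+1F600, stored in Java as the two chars 0xD83D and 0xDE00, the sketch produces six bytes, while the encapsulation length defined by this method would be 2 (the number of characters).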

writeCharArray

public static void writeCharArray(DataOutput out,
                                  String str)
                           throws IOException
Writes an array of characters to the data output in a CESU-8 format with compact encapsulation. This method defines length as a number of characters. This is a bridge method that accepts String and treats it as char array.

Parameters:
out - the destination to write to
str - the string to be written
Throws:
IOException - if an I/O error occurs

readCharArray

public static char[] readCharArray(DataInput in)
                            throws IOException
Reads an array of characters from the data input in a CESU-8 format with compact encapsulation. Overlong UTF-8 and CESU-8-encoded surrogates are accepted and read without errors. This method defines length as a number of characters.

Parameters:
in - the source to read from
Returns:
the char array read
Throws:
UTFDataFormatException - if the bytes do not represent a valid CESU-8 encoding of a character or if resulting code point is beyond Basic Multilingual Plane (BMP)
IOException - if an I/O error occurs

readCharArrayString

public static String readCharArrayString(DataInput in)
                                  throws IOException
Deprecated. 

Reads an array of characters from the data input in a CESU-8 format with compact encapsulation. Overlong UTF-8 and CESU-8-encoded surrogates are accepted and read without errors. This method defines length as a number of characters.

Parameters:
in - the source to read from
Returns:
the characters read, as a String
Throws:
UTFDataFormatException - if the bytes do not represent a valid CESU-8 encoding of a character or if resulting code point is beyond Basic Multilingual Plane (BMP)
IOException - if an I/O error occurs

writeUTFChar

public static void writeUTFChar(DataOutput out,
                                int codePoint)
                         throws IOException
Writes a Unicode code point to the data output in a UTF-8 format. The surrogate code points are accepted and written in a CESU-8 format.

Parameters:
out - the destination to write to
codePoint - the code point to be written
Throws:
UTFDataFormatException - if codePoint is not a valid Unicode character
IOException - if an I/O error occurs

readUTFChar

public static int readUTFChar(DataInput in)
                       throws IOException
Reads Unicode code point from the data input in a UTF-8 format. Overlong UTF-8 and CESU-8-encoded surrogates are accepted and read without errors.

Parameters:
in - the source to read from
Returns:
the Unicode code point read
Throws:
UTFDataFormatException - if the bytes do not represent a valid UTF-8 encoding of a character
IOException - if an I/O error occurs

writeUTFString

public static void writeUTFString(DataOutput out,
                                  String str)
                           throws IOException
Writes a string to the data output in a UTF-8 format with compact encapsulation. Unpaired surrogate code points are accepted and written in a CESU-8 format. This method defines length as a number of bytes.

Parameters:
out - the destination to write to
str - the string to be written
Throws:
UTFDataFormatException - if str is too long
IOException - if an I/O error occurs

readUTFString

public static String readUTFString(DataInput in)
                            throws IOException
Reads Unicode string from the data input in a UTF-8 format with compact encapsulation. Overlong UTF-8 and CESU-8-encoded surrogates are accepted and read without errors. This method defines length as a number of bytes.

Parameters:
in - the source to read from
Returns:
the Unicode string read
Throws:
UTFDataFormatException - if the bytes do not represent a valid UTF-8 encoding of a string
IOException - if an I/O error occurs

objectToBytes

public static byte[] objectToBytes(Object object)
                            throws IOException
Serializes an object to an array of bytes with Java Serialization. This method understands non-serializable Marshalled objects as a special case and returns the result of Marshalled.getBytes() call.

Parameters:
object - the object to be serialized
Returns:
the byte array with serialized object
Throws:
IOException - if object cannot be serialized

bytesToObject

public static Object bytesToObject(byte[] bytes)
                            throws IOException
Deserializes an array of bytes into object with Java Serialization.

Parameters:
bytes - the byte array to be deserialized
Returns:
the deserialized object
Throws:
IOException - if object cannot be deserialized

bytesToObject

public static Object bytesToObject(byte[] bytes,
                                   ClassLoader cl)
                            throws IOException
Deserializes an array of bytes into object with Java Serialization.

Parameters:
bytes - the byte array to be deserialized
cl - the ClassLoader that will be used to load classes; null for default
Returns:
the deserialized object
Throws:
IOException - if object cannot be deserialized
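
A common way to deserialize with a caller-supplied ClassLoader is to subclass ObjectInputStream and override resolveClass. The sketch below shows that general technique with JDK classes only; LoaderAwareInput and fromBytes are hypothetical names, and whether IOUtil works this way internally is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;

// Hypothetical ObjectInputStream that resolves classes through a given loader.
class LoaderAwareInput extends ObjectInputStream {
    private final ClassLoader cl; // null means default resolution

    LoaderAwareInput(InputStream in, ClassLoader cl) throws IOException {
        super(in);
        this.cl = cl;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        if (cl != null) {
            try {
                return Class.forName(desc.getName(), false, cl);
            } catch (ClassNotFoundException e) {
                // Fall through to the default lookup (also covers primitive types).
            }
        }
        return super.resolveClass(desc);
    }

    static Object fromBytes(byte[] bytes, ClassLoader cl) throws IOException {
        try (ObjectInputStream in = new LoaderAwareInput(new ByteArrayInputStream(bytes), cl)) {
            return in.readObject();
        } catch (ClassNotFoundException e) {
            throw new IOException("Cannot resolve class", e);
        }
    }
}
```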

objectsToBytes

public static byte[] objectsToBytes(Class[] types,
                                    Object... objects)
                             throws IOException
Serializes a declared group of objects to an array of bytes according to their declared types. This method understands non-serializable Marshalled objects as a special case and implicitly converts them into original objects via Marshalled.getObject() call.

Parameters:
types - the declared types of serialized objects
objects - the actual objects to be serialized
Returns:
the byte array with serialized objects
Throws:
IllegalArgumentException - if types and objects have different lengths
ClassCastException - if actual object types do not match declared types
IOException - if objects cannot be serialized

bytesToObjects

public static Object[] bytesToObjects(Class[] types,
                                      byte[] bytes)
                               throws IOException
Deserializes an array of bytes into declared group of objects according to their declared types.

Parameters:
types - the declared types of serialized objects
bytes - the byte array to be deserialized
Returns:
the deserialized objects
Throws:
ClassCastException - if actual object types do not match declared types
IOException - if objects cannot be deserialized

bytesToObjects

public static Object[] bytesToObjects(Class[] types,
                                      byte[] bytes,
                                      ClassLoader cl)
                               throws IOException
Deserializes an array of bytes into declared group of objects according to their declared types.

Parameters:
types - the declared types of serialized objects
bytes - the byte array to be deserialized
cl - the ClassLoader that will be used to load classes; null for default
Returns:
the deserialized objects
Throws:
ClassCastException - if actual object types do not match declared types
IOException - if objects cannot be deserialized

writeObject

public static void writeObject(DataOutput out,
                               Object object)
                        throws IOException
Writes an object to the data output as a Java-serialized byte array with compact encapsulation. This method understands non-serializable Marshalled objects as a special case and writes the result of Marshalled.getBytes() call.

Parameters:
out - the destination to write to
object - the object to be written
Throws:
IOException - if an I/O error occurs or if object cannot be serialized

readObject

public static Object readObject(DataInput in)
                         throws IOException
Reads an object from the data input as a Java-serialized byte array with compact encapsulation.

Parameters:
in - the source to read from
Returns:
the object read
Throws:
IOException - if an I/O error occurs or if object cannot be deserialized

readObject

public static Object readObject(DataInput in,
                                ClassLoader cl)
                         throws IOException
Reads an object from the data input as a Java-serialized byte array with compact encapsulation.

Parameters:
in - the source to read from
cl - the ClassLoader that will be used to load classes; null for default
Returns:
the object read
Throws:
IOException - if an I/O error occurs or if object cannot be deserialized

deflate

public static byte[] deflate(byte[] bytes,
                             int level)
Compresses an array of bytes using Deflate algorithm with specified compression level.

Parameters:
bytes - the byte array to be compressed
level - the compression level from 0 to 9 inclusive; -1 for default
Returns:
the compressed byte array

inflate

public static byte[] inflate(byte[] bytes)
                      throws DataFormatException
Decompresses an array of bytes using Inflate algorithm (reverse of Deflate algorithm).

Parameters:
bytes - the byte array to be decompressed
Returns:
the decompressed byte array
Throws:
DataFormatException - if a data format error has occurred

isCompressionEnabled

public static boolean isCompressionEnabled()
Returns value of compression strategy.

Returns:
true if compression is enabled, false otherwise

setCompressionEnabled

public static void setCompressionEnabled(boolean compressionEnabled)
Sets new value for compression strategy.

Parameters:
compressionEnabled - the value of compression flag

compress

public static byte[] compress(byte[] bytes)
Compresses an array of bytes using the Deflate algorithm as appropriate. Unlike the deflate(byte[], int) method, this method decides whether compression is necessary based on the compression strategy (see isCompressionEnabled()) and heuristics related to the specified byte array. If it decides against compression, it returns the original byte array; otherwise it compresses using the fastest compression level.

This method is intended for transparent compression of serialized data and similar cases.

Parameters:
bytes - the byte array to be compressed as appropriate
Returns:
original byte array or compressed byte array depending on decision

decompress

public static byte[] decompress(byte[] bytes)
Decompresses an array of bytes using Inflate algorithm repeatedly as appropriate.

This method is intended for transparent decompression of serialized data and similar cases.

Parameters:
bytes - the byte array to be decompressed as appropriate
Returns:
the decompressed byte array


Copyright © 2013 Devexperts. All Rights Reserved.