Thursday, July 19, 2007

Two Cousins - String and StringBuffer

Introduction
If Strings exist at all, then why do we need StringBuffer?
Until now I pacified myself by saying: it's just a version of String with a built-in buffer and the ability to resize dynamically, which gives up immutability and loses its status as a datatype.
But things changed when I started to optimize my code and studied the relationship that String and StringBuffer share.
Let's first discuss the capabilities of each separately, to understand them and the relation they share.

String
String is one of the most powerful things Java provides to programmers. It is an immutable object that is treated almost like a built-in datatype. It is also the only class whose objects support the binary operator '+' and the assignment operator '+='.
But it is still the whipping boy of Java performance, and the reason is clear: internally it uses a char array to represent the text (which takes heap memory, whereas local variables of the primitive datatypes live on the stack), and there is no way to avoid Strings. The JVM does provide a String pool to share strings (handled internally by the JVM and largely invisible to the user), but '+' and '+=' still cost a lot, because each binary operation involves three String objects: the two operands and a newly created result. That means object creation cost as well as heap consumption.
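
To make the cost of '+=' a little more concrete, here is a small sketch. The class and method names (ConcatDemo, joinWithPlus) are made up for this illustration. It only shows that every '+=' leaves the old String untouched and hands back a new one.

```java
// Illustrative only: ConcatDemo and joinWithPlus are hypothetical names.
class ConcatDemo {
    // Builds "0,1,2,..." with +=; every += produces a fresh String object.
    static String joinWithPlus(int n) {
        String result = "";
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                result += ",";
            }
            result += i; // result now refers to a brand-new String
        }
        return result;
    }

    public static void main(String[] args) {
        String original = "Hello";
        String alias = original;   // both names refer to the same object
        original += " World";      // creates a new String; "Hello" itself is untouched

        System.out.println(alias);              // Hello
        System.out.println(original);           // Hello World
        System.out.println(alias == original);  // false
        System.out.println(joinWithPlus(4));    // 0,1,2,3
    }
}
```

In a loop like joinWithPlus, the intermediate Strings become garbage as soon as the next one is built, which is where the object creation and heap cost comes from.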

StringBuffer
It's just like a String, with an underlying char array to store the text, but with dynamic extensibility and utility methods like append, delete, insert and replace. It also gives you the option to define the initial array size, to reduce arraycopy calls. And on top of that, it is thread safe.
So, are you getting something? It's just like a String, but with methods to manipulate the text more efficiently. StringBuffer does not return a new object for each manipulation operation; any number of append calls requires only one StringBuffer object.
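
A quick sketch of those points, using only the standard StringBuffer methods mentioned above. The class name BufferDemo, its helper methods, and the capacity of 64 are arbitrary choices for this example.

```java
// Illustrative only: BufferDemo and its helper names are hypothetical.
class BufferDemo {
    // Same job as a += loop, but all work happens inside one StringBuffer.
    static String joinWithBuffer(int n) {
        StringBuffer sb = new StringBuffer(64); // initial capacity, to limit array copies
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(i);
        }
        return sb.toString();
    }

    // append hands back the very same StringBuffer, not a new object.
    static boolean appendReturnsSameObject() {
        StringBuffer sb = new StringBuffer("abc");
        StringBuffer returned = sb.append("def");
        return returned == sb;
    }

    // insert, replace and delete all modify the buffer in place.
    static String editInPlace() {
        StringBuffer sb = new StringBuffer("Hello World");
        sb.insert(0, ">> ");        // ">> Hello World"
        sb.replace(3, 8, "Howdy");  // ">> Howdy World"
        sb.delete(0, 3);            // "Howdy World"
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinWithBuffer(4));         // 0,1,2,3
        System.out.println(appendReturnsSameObject()); // true
        System.out.println(editInPlace());             // Howdy World
    }
}
```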

The Cousin Relationship
Use StringBuffer for manipulation and then use the toString method to get the final String.
Well, by now we are all smart enough to figure that out. But wait a second: that's just the normal behaviour of any object's toString method. Where is the cousin relationship? Do they really have a blood relation?
Yup, they do share blood. StringBuffer's toString method returns a String object which shares the same underlying array as the StringBuffer, with a shared flag set to true. Thus no copying cost is involved in converting a StringBuffer to a String object. (This is how the JDK 1.4-era classes work; from Java 5 on, toString copies the characters instead.)
They share the same char array until another manipulation operation is performed on the StringBuffer, which leads to the creation of a new underlying array for the StringBuffer object through a System.arraycopy call. The String object does not change and continues to point to the previous internal char array.
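
Whatever happens to the array internally, the behaviour you can observe from outside is the same: the String you got from toString never changes, even if you keep modifying the buffer. A minimal sketch (SnapshotDemo and snapshotThenAppend are names invented for this example):

```java
// Illustrative only: SnapshotDemo and snapshotThenAppend are hypothetical names.
class SnapshotDemo {
    // Returns { the String taken before the append, the buffer contents after it }.
    static String[] snapshotThenAppend() {
        StringBuffer sb = new StringBuffer("cousin");
        String snapshot = sb.toString(); // String taken from the buffer
        sb.append("s");                  // buffer is modified afterwards
        return new String[] { snapshot, sb.toString() };
    }

    public static void main(String[] args) {
        String[] result = snapshotThenAppend();
        System.out.println(result[0]); // cousin
        System.out.println(result[1]); // cousins
    }
}
```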

How does it happen
When the toString method is called on a StringBuffer, it returns a new String as follows
return new String(this);

This calls the String constructor that takes a StringBuffer as its argument, which looks like this
public String(StringBuffer buffer) {
    synchronized (buffer) {
        buffer.setShared();
        this.value = buffer.getValue();
        this.offset = 0;
        this.count = buffer.length();
    }
}

Here the setShared method sets the StringBuffer's shared flag to true, which is checked in every manipulation operation in StringBuffer as
if (shared) copy();

This creates a new copy of the char array and points the StringBuffer's storage at it (see the copy method's code below).
private final void copy() {
    char newValue[] = new char[value.length];
    System.arraycopy(value, 0, newValue, 0, count);
    value = newValue;
    shared = false;
}
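
To see the whole mechanism in one place, here is a toy model of it. This is not the JDK source: CowBuffer, CowSnapshot, snapshot and sharesArrayWith are names invented for this sketch, the starting capacity of 16 is an arbitrary choice, and the synchronization that the real classes use is left out. It only mimics the idea of handing out the array, setting the shared flag, and copying on the next write.

```java
// A toy model of the sharing scheme, NOT the JDK source. All names are hypothetical.
class CowSnapshot {
    final char[] value; // the array handed over by the buffer, not copied
    final int count;

    CowSnapshot(char[] value, int count) {
        this.value = value;
        this.count = count;
    }

    @Override
    public String toString() {
        return new String(value, 0, count);
    }
}

class CowBuffer {
    private char[] value = new char[16];
    private int count = 0;
    private boolean shared = false;

    // Every mutating operation un-shares first, then grows if needed, then writes.
    CowBuffer append(String s) {
        if (shared) {
            copy();
        }
        int needed = count + s.length();
        if (needed > value.length) {
            char[] bigger = new char[Math.max(needed, value.length * 2)];
            System.arraycopy(value, 0, bigger, 0, count);
            value = bigger;
        }
        s.getChars(0, s.length(), value, count);
        count = needed;
        return this;
    }

    // Plays the role of toString: hands out the array and marks it shared.
    CowSnapshot snapshot() {
        shared = true;
        return new CowSnapshot(value, count);
    }

    // Gives the buffer a private array again, leaving the old one to the snapshot.
    private void copy() {
        char[] newValue = new char[value.length];
        System.arraycopy(value, 0, newValue, 0, count);
        value = newValue;
        shared = false;
    }

    boolean sharesArrayWith(CowSnapshot s) {
        return s.value == value;
    }

    public static void main(String[] args) {
        CowBuffer buf = new CowBuffer().append("blood");
        CowSnapshot snap = buf.snapshot();
        System.out.println(buf.sharesArrayWith(snap)); // true
        buf.append(" relation");                       // triggers copy()
        System.out.println(buf.sharesArrayWith(snap)); // false
        System.out.println(snap);                      // blood
        System.out.println(buf.snapshot());            // blood relation
    }
}
```

The point of the flag is that the copy is paid for only if the buffer is actually modified after the snapshot was taken; if it never is, no copy happens at all.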
