August 2008

Javascript string compression tutorial, including search and replace

While I was creating my Writing Wisdom widget, I spent some time looking at how I could optimise the size of a Javascript file that consisted predominantly of text data.

The approach I used was simple, and a project that included much more text might justify a more advanced approach. But using these simple techniques, I was able to chop about 15% off my file size (and therefore off my hosting bill for that particular file).

Note that I'm not talking about compressing the Javascript code itself - I'm talking about compressing text data. There's plenty written elsewhere about optimising Javascript code.

This tutorial includes instructions on how to use split and join in Javascript.

Doing the splits with Javascript

One of the most useful commands in Javascript is split, which is used to divide a string into an array. So you could take a sentence, and then use split to put all the words into different elements of an array, like this:

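<script type="text/javascript">

var sentence="This is where music goes to die.";
// Split at each space, so every word becomes one array element
var words=sentence.split(" ");

// Write the element number followed by the word it holds
for (var i=0; i<words.length; i++)
{
document.write( i+ words[i] +"<BR>");
}

</script>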

Below is the output you get when that script runs. The number at the start is the element number of the array, so the array element words[0] contains "This", for example.

0This
1is
2where
3music
4goes
5to
6die.

Here we've separated words where the spaces are, but you can use any character or sequence of characters.
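As a small sketch of that last point, the separator can be a longer string or a single symbol. The variable names and sample strings below are made up for illustration, and console.log stands in for document.write so the snippet also runs outside a browser.

```javascript
// Split on a two-character separator (a comma followed by a space).
var fruitLine = "apples, pears, bananas";
var fruitNames = fruitLine.split(", ");
console.log(fruitNames.length); // 3
console.log(fruitNames[1]);     // pears

// Split on a single symbol.
var dateParts = "2008-08-15".split("-");
console.log(dateParts[0]);      // 2008
```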

Compressing array declarations in Javascript

I often see Javascript routines that define array pairs, like this:

question=new Array();
answer=new Array();

question[0]="What do you call a donkey with three legs?";
answer[0]="A wonkey";

question[1]="What do you get if you cross an elephant with a rhinoceros?";
answer[1]="An eleoceros";

There's a lot of repetition in the array definitions there. As I said, this article isn't really about compressing Javascript itself, but in array and text heavy scripts it can save a lot of space if you define the arrays like this instead:

qanda=new Array("What do you call a donkey with three legs?|A wonkey", "What do you get if you cross an elephant with a rhinoceros?|An eleoceros");

You've kept the question and answer pairs together, while also taking advantage of the shorter way to declare arrays. When you need to use the data, you can then separate the question and answer by splitting at the bar character ("|"). Note that the split command discards the separator, in this case the bar character.

var chosenjoke=1;
var spl=qanda[chosenjoke].split("|");

document.write("Question:"+spl[0]+"<BR>");
document.write("Answer:"+spl[1]);
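If you need every pair unpacked at once, the same split can be wrapped in a loop. This is only a sketch: unpackPairs and the question/answer property names are hypothetical, and it assumes the separator never appears inside a question or an answer.

```javascript
// Hypothetical helper: turn "question|answer" strings into objects.
function unpackPairs(packed, separator) {
  var pairs = [];
  for (var i = 0; i < packed.length; i++) {
    // parts[0] is the text before the separator, parts[1] the text after it.
    var parts = packed[i].split(separator);
    pairs.push({ question: parts[0], answer: parts[1] });
  }
  return pairs;
}

var qanda = [
  "What do you call a donkey with three legs?|A wonkey",
  "What do you get if you cross an elephant with a rhinoceros?|An eleoceros"
];
var jokes = unpackPairs(qanda, "|");
console.log(jokes[1].question);
console.log(jokes[1].answer); // An eleoceros
```

Unpacking everything up front costs a little memory, so for a large data set it may be better to split only the entry you are about to display, as in the example above.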

Spotting the Join with Javascript

There is an opposite to split: join will take the contents of an array and combine it into a string, using whatever separator you specify between the different data items.

fruits=new Array("apples","pears","bananas","oranges");
var fruitlist=fruits.join(", ");
document.write("I like "+fruitlist);

This produces the following output. Note the space after the comma in both the code above, and in the resulting output:

I like apples, pears, bananas, oranges
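A couple of further cases are worth a quick sketch: what join does with no separator or an empty one, and the fact that splitting and rejoining with the same separator gives you back the string you started with. The snippet uses console.log rather than document.write so it can be run outside a browser.

```javascript
var fruits = ["apples", "pears", "bananas", "oranges"];

// With no argument, join uses a comma with no space after it.
console.log(fruits.join());   // apples,pears,bananas,oranges

// An empty string joins the items with nothing between them.
console.log(fruits.join("")); // applespearsbananasoranges

// Splitting and rejoining with the same separator restores the original.
var sentence = "This is where music goes to die.";
var rebuilt = sentence.split(" ").join(" ");
console.log(rebuilt === sentence); // true
```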

Javascript search and replace

By using split and join together, it's possible to create a search and replace routine.

var oldstring="The Spectrum is the best computer ever. Spectrums rule!";

newstring=oldstring.split("Spectrum").join("Amstrad");
document.write(newstring);

The second line there creates a new string. It starts by separating the old string, using the word 'Spectrum' as the separator. The separator is removed automatically by the split command, which means the word 'Spectrum' is removed and the parts of the sentence on each side of that word are put into different array elements.

We then use join to combine all those array elements together into a single string again, but we use the separator 'Amstrad' between the different sentence parts. The end result is that every occurrence of the word Spectrum is replaced by Amstrad in the string. Here's the resulting output.

The Amstrad is the best computer ever. Amstrads rule!
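The split and join pair can be wrapped in a small reusable function. The name replaceAllText is hypothetical. The sketch also shows why the trick is handy: the built-in replace, when given a plain string to search for, changes only the first match. Newer Javascript engines also provide a replaceAll method that does the same job.

```javascript
// Hypothetical helper wrapping the split and join trick.
function replaceAllText(text, search, replacement) {
  return text.split(search).join(replacement);
}

var oldstring = "The Spectrum is the best computer ever. Spectrums rule!";

console.log(replaceAllText(oldstring, "Spectrum", "Amstrad"));
// The Amstrad is the best computer ever. Amstrads rule!

// replace with a string pattern stops after the first match.
console.log(oldstring.replace("Spectrum", "Amstrad"));
// The Amstrad is the best computer ever. Spectrums rule!

// The search text is matched literally, so symbols need no escaping.
console.log(replaceAllText("a.b.c", ".", "-")); // a-b-c
```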

Javascript text compression using search and replace

I compressed the text in my quotes widget by looking for frequently repeated sequences of words or characters and replacing them with symbols. It's a technique I first saw used in the Amstrad game Spellbound years back.

I used this text analyser to identify those words that were used most often. I also looked at sequences of letters that were frequently repeated and replaced those with symbols. The extent to which you'll be able to compress your text will depend on how many recurring patterns there are. In my case, the words writing, book and author came up very often.

For the purposes of an example, let's use this nursery rhyme, which has plenty of repetition in it:

Peter Piper picked a peck of pickled peppers
A peck of pickled peppers Peter Piper picked
If Peter Piper picked a peck of pickled peppers
Where's the peck of pickled peppers Peter Piper picked ?

This might be how you would normally handle it in Javascript:

var sourcestring="Peter Piper picked a peck of pickled peppers<BR>A peck of pickled peppers Peter Piper picked<BR>If Peter Piper picked a peck of pickled peppers<BR>Where's the peck of pickled peppers Peter Piper picked ?"

document.write(sourcestring);

Now, here it is using a search and replace routine:

replacestring=new Array("1|Peter Piper ","2| picked ","3| pickled peppers ","4| peck ");

var sourcestring="12a4of3<br>A4of312<br>If 12a4of3<br>Where's the4of312?";

for (i=0;i<replacestring.length;i++)
{
tempstring=replacestring[i].split("|");
sourcestring=sourcestring.split(tempstring[0]).join(tempstring[1]);
}
document.write(sourcestring);
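The same loop can be sketched as a function that returns the expanded text instead of writing it to the page, which makes it easier to inspect. The name expandText is hypothetical, and console.log is used so the snippet runs outside a browser.

```javascript
// Hypothetical helper: expand a compressed string using "token|phrase" entries.
function expandText(compressed, dictionary) {
  var result = compressed;
  for (var i = 0; i < dictionary.length; i++) {
    var entry = dictionary[i].split("|");
    // Swap every occurrence of the token for its phrase.
    result = result.split(entry[0]).join(entry[1]);
  }
  return result;
}

var replacestring = ["1|Peter Piper ", "2| picked ", "3| pickled peppers ", "4| peck "];
var firstLine = expandText("12a4of3", replacestring);

// JSON.stringify adds quotes so the spaces at the edges are visible.
console.log(JSON.stringify(firstLine));
// "Peter Piper  picked a peck of pickled peppers "
```

Note the double space after Piper and the space at the end. Because several phrases carry their own leading and trailing spaces, the expanded text is not character-for-character identical to the original. A browser collapses runs of spaces when it displays HTML, so the difference is invisible on the page, but it would matter if you used the text somewhere that preserves spacing. The tokens also must not occur in the literal text or inside any of the phrases, or they would be expanded by mistake.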

The text string has been compressed from 207 characters to 121: 59 characters for sourcestring plus 62 characters for the new replacestring. That represents a saving of about 42% on the text space, but the Javascript required to decompress the text carries a file size penalty of its own. In this example, the code has gone from 254 characters for the uncompressed version to 345 characters for the compressed version (it can be cut to 234 using shorter variable names).
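You can get a rough measure of the saving for your own text with the length property. This sketch counts only the characters inside the strings, leaving out the quotes, commas and the decompression loop, so its figures will not match the ones quoted above exactly. The variable names are made up for illustration.

```javascript
var original = "Peter Piper picked a peck of pickled peppers<BR>A peck of pickled peppers Peter Piper picked<BR>If Peter Piper picked a peck of pickled peppers<BR>Where's the peck of pickled peppers Peter Piper picked ?";
var compressed = "12a4of3<br>A4of312<br>If 12a4of3<br>Where's the4of312?";
var dictionary = ["1|Peter Piper ", "2| picked ", "3| pickled peppers ", "4| peck "];

// The compressed size is the compressed text plus every dictionary entry.
var compressedSize = compressed.length;
for (var i = 0; i < dictionary.length; i++) {
  compressedSize += dictionary[i].length;
}

// Percentage of the original text size that has been saved.
var saving = 100 * (original.length - compressedSize) / original.length;

console.log(original.length + " characters before, " + compressedSize + " after");
console.log("Saving: " + Math.round(saving) + "%");
```

Running a check like this each time you add a dictionary entry shows whether the entry earns its keep: a phrase that appears only once or twice can cost more in the dictionary than it saves in the text.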

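As more text is added, though, the compression routine starts to pay for itself. New words can easily be added to the search and replace routine, using punctuation symbols and short character combinations (e.g. !1, !2, !3) to replace recurring phrases in new text that's added. Whatever symbols you choose must not appear anywhere in the text itself, or they will be expanded by mistake when the text is decompressed.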
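Substituting the symbols by hand gets tedious as the text grows, so here is a minimal sketch of doing the compression step with the same split and join trick, run in reverse. The function name compressText, the sample sentence and the dictionary are all made up for illustration. It assumes the same "token|phrase" entry format as replacestring, and that no token occurs inside another token or inside a phrase.

```javascript
// Hypothetical helper: the reverse of the decompression loop.
// Each dictionary entry is "token|phrase", as in replacestring.
function compressText(text, dictionary) {
  var result = text;
  for (var i = 0; i < dictionary.length; i++) {
    var entry = dictionary[i].split("|");
    // A token that already occurs in the text would be expanded by mistake later.
    if (text.indexOf(entry[0]) !== -1) {
      throw new Error("Token " + entry[0] + " already appears in the text");
    }
    // Swap every occurrence of the phrase for its token.
    result = result.split(entry[1]).join(entry[0]);
  }
  return result;
}

var dictionary = ["!1|writing", "!2| book", "!3|author"];
var text = "A book about writing, by an author who loves writing a book.";

console.log(compressText(text, dictionary));
// A!2 about !1, by an !3 who loves !1 a!2.
```

The output can be pasted into your script as the compressed string, with the same dictionary used by the decompression loop to turn it back into readable text.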

You'll need to weigh up the circumstances in which this script is useful to you, but for applications that include a lot of natural language text, this script could create substantial savings.

As an additional benefit, it can be a handy way to make Javascript text strings difficult to understand. It falls way short of Javascript encryption, but if you were writing an adventure game, it could be enough to stop people working out where the treasure's buried.

Credits

© Sean McManus. All rights reserved.
