Saturday, 31 March 2012

Eating the Comma

A valid JSON fragment.

We all love JSON (JavaScript Object Notation) for serialization.

It just feels so natural: you’ve got numbers and strings (plus booleans and null), all wrapped up in arrays and objects.  And that’s it!

I've found myself writing tiny data exporters quite a bit lately, and wanted to share a technique I like to call “Eating the Comma”.

Take a look at the JSON on the right.  You can immediately see that it's well-formed.  But now take a closer look at the commas.  See how we can’t tell until the next line if we need a comma on this line?

If everything is in memory, it's not a problem: we know when we get to the end of the array, or the last field in an object.
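To make the in-memory case concrete, here is a minimal sketch. The function name writeArray is made up for illustration, and it only handles an array of integers. Because the vector's size is known up front, we can tell which element is the last one and skip its comma.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: serialize a vector that is already fully in memory.
// Since values.size() is known, we can tell when we are on the last element.
std::string writeArray(const std::vector<int>& values) {
    std::string out = "[";
    for (std::size_t i = 0; i < values.size(); ++i) {
        out += std::to_string(values[i]);
        if (i + 1 < values.size()) {
            out += ",";  // comma only between elements, never after the last
        }
    }
    out += "]";
    return out;
}
```

Calling writeArray({1, 2, 3}) yields [1,2,3]. The catch is the `i + 1 < values.size()` test: it needs the size before the first element is written.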

But what if we're generating the data on the fly?

Or if there are gaps in the data (like the "vendor" string) where we need to look ahead in the output stream to tell if we've reached the end?

Bad JSON - too many commas!

Wouldn’t it be easier to program if we could output this “Bad JSON” instead?

So how can we be lazy when we write the output, but still write out valid JSON? Something like this:

We’ll go ahead and write out the commas, but then delete them again when we close the array or object.

Here's how that might look in C++:
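A minimal sketch of the idea, under a few assumptions: output is collected in a std::string buffer (so a trailing comma can actually be removed), strings and keys need no escaping, and the class and method names (JsonWriter, eatComma, and so on) are hypothetical rather than taken from any library. Every value is written with a comma after it, and closing an array or object eats the last comma before writing the bracket.

```cpp
#include <string>

// Hypothetical writer illustrating "eating the comma".
// Assumes keys and string values contain nothing that needs escaping.
class JsonWriter {
public:
    void beginObject() { out_ += '{'; }
    void endObject()   { eatComma(); out_ += "},"; }
    void beginArray()  { out_ += '['; }
    void endArray()    { eatComma(); out_ += "],"; }

    // A key is followed by a colon, never a comma.
    void key(const std::string& name) { out_ += '"' + name + "\":"; }

    // Every value is always followed by a comma; we never look ahead.
    void number(int v)              { out_ += std::to_string(v); out_ += ','; }
    void str(const std::string& v)  { out_ += '"' + v + "\","; }

    // Eat the comma left after the top-level value and return the text.
    std::string finish() { eatComma(); return out_; }

private:
    // Drop a trailing comma, if there is one. After '[' or '{' there is
    // none, so empty arrays and objects come out right as well.
    void eatComma() {
        if (!out_.empty() && out_.back() == ',') out_.pop_back();
    }

    std::string out_;
};
```

For example, beginObject(), key("name"), str("widget"), key("sizes"), beginArray(), number(1), number(2), endArray(), endObject(), finish() produces {"name":"widget","sizes":[1,2]}. When writing straight to a file or socket there is nothing to pop, so the same trick would need a small buffer or a seek back by one byte.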


Well, this is my first blog post, so please use the comments below and let me know what you think.
