Encoding Data inside of a URL Query String

Garrett Bodley
Jun 7, 2021

Image: handing over a ticket. Photo by Ben White on Unsplash

In my last blog post, I introduced Noodlr, an online step sequencer that programmatically generates music based on user input. Due to popular demand, I decided to implement a feature that encodes the state of the sequencer as a URI-safe, Base64 string.

Design Goals

One of my least favorite things online is when websites force you to create an account to interact with their content. As such, it was important to me that vamps could be easily shared with anyone, not just other users with Noodlr accounts.

Constraints:

  • The ability to share with anyone, not just other Noodlr accounts
  • Users should be able to share their vamps with their friends via text. This means vamps must be shared via a custom-generated link.
  • When visiting a share-link, Noodlr should immediately import the shared vamp, along with the vamp’s name, tempo, and volume settings.
  • Anyone who receives a share link can edit the vamp and save it to their own account without affecting the version linked to the original creator.
  • The share link must be less than 2000 characters long, a commonly cited safe limit for URL length across browsers.

After some brainstorming, I decided to encode the vamp data inside the share link itself as a query string. This would allow me to avoid dealing with complicated permissions on the server, especially since I wanted users to be able to load the shared vamp without making an account on the site.

First Attempts

My first idea was to simply take the data structure I was already creating to save vamps to the server, convert it to JSON, and then convert that string to hexadecimal. As I read the JavaScript documentation, parseInt() could convert strings into hexadecimal, and toString() could readily convert any hexadecimal back into a string. Should be as easy as calling those functions and reading/writing a query string, right? Wrong.

A screenshot showing this method failing to encode the string “the rain in spain lands mainly in the plains”. parseInt() returns NaN, which is then converted to “NaN” by toString().

As it turns out, parseInt() only parses strings that already represent integers, so it’s useless when it comes to encoding a full sentence.
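The failure is easy to reproduce. This is a minimal sketch with variable names of my own choosing, not the code from the screenshot:

```javascript
// parseInt() reads digits from the start of a string; it does not encode text.
const sentence = "the rain in spain lands mainly in the plains";

// "t" is not a valid digit in base 16 (or base 10), so parsing fails right away.
const parsed = parseInt(sentence, 16);
console.log(parsed); // NaN

// Calling toString(16) on NaN just produces the string "NaN".
const asHex = parsed.toString(16);
console.log(asHex); // "NaN"

// By contrast, a string that already represents an integer converts fine.
const fromNumeric = parseInt("255", 10).toString(16);
console.log(fromNumeric); // "ff"
```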

What about ASCII codes? I could convert the string into the corresponding ASCII codes and then encode those as hexadecimal, right? Unfortunately, that doesn’t work either, for several reasons. Written in decimal, ASCII codes vary in length (a space is 32, while “t” is 116), so it would be very cumbersome to identify where each char code begins and ends. Even ignoring that, while you can convert each character to its ASCII code and then turn those codes into hex, the result ends up longer than the string you started with, which defeats the purpose of keeping the link short.

A screenshot showing code that converts a string to its corresponding ASCII codes and then transforms those codes into hexadecimal. In this instance, the result is two times longer than the original string.
this seems counterproductive…
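A rough sketch of the ASCII-to-hex idea, assuming the input only contains ASCII characters. Padding every code to two hex digits does solve the "where does each code end" problem, but the doubling in length remains:

```javascript
const text = "the rain in spain lands mainly in the plains";

// Decimal char codes vary in width: " " is 32 (two digits), "t" is 116 (three).
const decimalCodes = Array.from(text, (ch) => ch.charCodeAt(0));

// Padding each code to two hex digits gives fixed-width, decodable chunks...
const hex = decimalCodes
  .map((code) => code.toString(16).padStart(2, "0"))
  .join("");

// ...but every input character now costs two output characters.
console.log(text.length, hex.length); // 44 88

// Decoding reads the hex string back two digits at a time.
const decoded = hex
  .match(/.{2}/g)
  .map((pair) => String.fromCharCode(parseInt(pair, 16)))
  .join("");
```

The round trip works, so this is a valid encoding. It just makes the data bigger rather than smaller.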

Okay, I know which methods don’t work. How do I actually go about solving this?

lz-string and Lempel–Ziv–Welch Compression

Ultimately I realized that I was attempting, albeit naively, to construct a compression algorithm for a string. After doing a little research, I stumbled across lz-string on npm, which uses a modified LZW compression algorithm.

From Wikipedia:

Lempel–Ziv–Welch (LZW) is a universal lossless data compression algorithm created by Abraham Lempel, Jacob Ziv, and Terry Welch. It was published by Welch in 1984 as an improved implementation of the LZ78 algorithm published by Lempel and Ziv in 1978. The algorithm is simple to implement and has the potential for very high throughput in hardware implementations.[1] It is the algorithm of the widely used Unix file compression utility compress and is used in the GIF image format.
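To get a feel for the idea, here is a minimal sketch of textbook LZW. It is not lz-string's implementation (which uses a modified variant and packs its output into characters); it assumes every input character has a code below 256 and returns the codes as a plain array of numbers:

```javascript
// Textbook LZW: replace repeated sequences with codes from a growing dictionary.
function lzwCompress(input) {
  const dict = new Map();
  for (let i = 0; i < 256; i++) dict.set(String.fromCharCode(i), i);

  const codes = [];
  let current = "";
  for (const ch of input) {
    const candidate = current + ch;
    if (dict.has(candidate)) {
      current = candidate; // keep extending the longest known sequence
    } else {
      codes.push(dict.get(current));
      dict.set(candidate, dict.size); // learn the new sequence
      current = ch;
    }
  }
  if (current !== "") codes.push(dict.get(current));
  return codes;
}

function lzwDecompress(codes) {
  if (codes.length === 0) return "";
  const dict = new Map();
  for (let i = 0; i < 256; i++) dict.set(i, String.fromCharCode(i));

  let previous = dict.get(codes[0]);
  let output = previous;
  for (const code of codes.slice(1)) {
    // A code can refer to the entry that is about to be created.
    const entry = dict.has(code) ? dict.get(code) : previous + previous[0];
    output += entry;
    dict.set(dict.size, previous + entry[0]); // rebuild the same dictionary
    previous = entry;
  }
  return output;
}

// Repetitive JSON, like a grid of booleans, needs fewer codes than characters.
const sample = JSON.stringify([[true, false], [true, false]]);
const codes = lzwCompress(sample);
console.log(sample.length, codes.length);
```

The decompressor never receives the dictionary. It rebuilds the same one from the codes as it goes, which is why the output can be shared on its own.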

Amazingly, lz-string can not only compress your string but also encode the output to ensure that it is URI safe. Armed with this fantastic software, I easily compressed my JSON string down to around 1000 characters, well within the URL limit.

lz-string simplifies everything down to a single function: compressToEncodedURIComponent()
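A sketch of how such a link might be built. The function name, the "vamp" query parameter, the example URL, and the field names are all assumptions for illustration, not Noodlr's actual code. The `codec` argument stands in for the lz-string module, so the sketch itself has no hard dependency:

```javascript
// `codec` is expected to expose compressToEncodedURIComponent(), as the
// lz-string module does. In an app you would pass require("lz-string").
function buildShareLink(baseUrl, vamp, codec) {
  const json = JSON.stringify(vamp);
  const encoded = codec.compressToEncodedURIComponent(json);

  const url = new URL(baseUrl);
  // "vamp" is a made-up parameter name. searchParams.set() percent-encodes
  // any character that needs it, and searchParams.get() reverses that.
  url.searchParams.set("vamp", encoded);
  return url.toString();
}

// Hypothetical usage:
// const LZString = require("lz-string");
// const link = buildShareLink(
//   "https://example.com/share",
//   { name: "My Vamp", tempo: 120, volume: 80, grid: [[true, false]] },
//   LZString
// );
```

Going through the URL API on both ends avoids hand-rolled escaping, which matters because the encoded output can contain characters such as "+" that mean something else in a raw query string.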

As you can see above, I also simplified the grid notation (grid explained here) by reducing it down to a series of nested arrays filled with booleans to denote whether a note was active or not. This helped shorten the query string even further.
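To illustrate the kind of simplification described here: the "verbose" cell shape below is purely hypothetical (Noodlr's real grid notation is covered in the linked post, not this one), but it shows why nested booleans serialize to much shorter JSON:

```javascript
// Hypothetical verbose grid. Noodlr's real cell shape is not shown in this post.
const verboseGrid = [
  [{ note: "C4", active: true }, { note: "C4", active: false }],
  [{ note: "E4", active: false }, { note: "E4", active: true }],
];

// Keep only whether each cell is on, as nested arrays of booleans.
function toBooleanGrid(grid) {
  return grid.map((row) => row.map((cell) => Boolean(cell.active)));
}

const compactGrid = toBooleanGrid(verboseGrid);
const verboseJson = JSON.stringify(verboseGrid);
const compactJson = JSON.stringify(compactGrid);

console.log(compactJson); // [[true,false],[false,true]]
console.log(verboseJson.length, compactJson.length);
```

This only works if the receiving side can rebuild everything else (which row is which note, for example) from position alone.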

Thanks to lz-string, decoding the data requires only one line of code.

Quick and easy decompression thanks to lz-string

Once decoded, the data in the query string can be processed and used as needed within your program.

Some additional logic to map the notation to Noodlr’s grid, as well as a redirect in case the query string does not contain valid data
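A sketch of what the reading side could look like. The "vamp" parameter name, the field names, and the validation rules are assumptions for illustration. `codec` again stands in for the lz-string module, whose decompressFromEncodedURIComponent() is the counterpart of the compress call:

```javascript
// Returns the decoded vamp, or null when the link carries no usable data.
// `codec` is expected to expose decompressFromEncodedURIComponent(), as the
// lz-string module does. In an app you would pass require("lz-string").
function readSharedVamp(search, codec) {
  const encoded = new URLSearchParams(search).get("vamp"); // hypothetical name
  if (!encoded) return null;

  try {
    const json = codec.decompressFromEncodedURIComponent(encoded);
    const vamp = JSON.parse(json);
    return isValidVamp(vamp) ? vamp : null;
  } catch {
    return null; // malformed data: let the caller decide what to do
  }
}

// Assumed shape: a name, a tempo, and a grid of nested boolean arrays.
function isValidVamp(vamp) {
  return (
    vamp !== null &&
    typeof vamp === "object" &&
    typeof vamp.name === "string" &&
    typeof vamp.tempo === "number" &&
    Array.isArray(vamp.grid) &&
    vamp.grid.every(
      (row) => Array.isArray(row) && row.every((cell) => typeof cell === "boolean")
    )
  );
}
```

A caller would pass in something like window.location.search and redirect whenever the result is null, so that a broken or tampered link never reaches the sequencer.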

Shareable Vamps: Now at your Fingertips!

Having figured out the compression/decompression logic, all that was left was to create a component to display the link. Having worked through this problem, I definitely feel like I have a stronger understanding of the difference between encoding and compression, and I now have a reference should I ever want to encode and compress data in a URL or other string.

The new share feature is now live on the Noodlr website. Check it out, make some music, and share with your friends!
