Learn to Code via Tutorials on Repl.it!

← Back to all posts
JavaScript Memoizers - Raising the speed limit part 1
h
Baconman321 (1051)

I'm

sure you guys have used JavaScript at least once in your life.

As with all coding languages, though... you will run into performance issues... whether it be bad program implementations or the language itself, you will almost for sure run into some kind of problem relating to performance in your programming life!

This is my new series that focuses on increasing your JavaScript performance, called Raising the speed limit.

This is my very first post under this series, so don't pressure me (much), please.

Whether I continue or not depends on whether or not you guys want to hear more from me and Raising the speed limit.

In this particular tutorial, we will focus on a way that can speed up JavaScript significantly, sometimes by hundreds of milliseconds! This method is called a memoizer (I thought it was memorizer at first, lol).

You should know and be familiar with class and arrow function syntax as well as overall JavaScript syntax/functionality if you expect to understand this tutorial. If not, javascript.info has some good tutorials on them.

Ready to have a go at that fresh 60 FPS, baby?

Or if you aren't using this for a game that's cool, too!

Ok, so... what is a memoizer.

Well, you all know about web cache, right?

To speed up your web page loading time (or to put less load on the server serving the web page), static resources can be cached. Then, the computer will serve those assets with the cached file instead of retrieving one from the internet every time.

A memoizer is like that, but it stores (or "memorizes"/storing a memo, hence the name "memoizer") data and instead of re-calculating something it will return the already cached data.

Now, you might be thinking "how do I even use this?", and that's alright because that's what I made this tutorial for.

A memoizer is quite useful for functions that return output in relation to the input.

Note that I say in relation to. You shouldn't really use memoizers for arbitrary data returned from input, like AJAX calls. You don't cache dynamic files (most of the time). Likewise, you shouldn't store data that is likely to vary - even with the same input.

Let's say for instance, the factorial function.

I'm not that far in math yet to have a use for a factorial, and I don't know if you guys are but it doesn't matter.

All we need to know is that a factorial function can take up a lot of power, since it's a lot of times recursive (depending on how you implement it).

An factorial function would look like this:

function factorial(num) {
  return (num === 0) ? 1 : num * factorial(num - 1);
}

As you can already see here, it will constantly multiply num with a call factorial with num - 1.

With smaller numbers (say, 1-100), it may not be too bad of a performance issue, but with bigger numbers it might start to cause a stack overflow and/or really cause some hanging (program freezing).

What we can do instead is create a memoizer that will cache our results and be ready to give them at a later time.

Memoizers are't the best for single-time function calls with certain input, but are quite good for recurring data patterns.

In this case, the factorial will return the same results if we call it with the number 100.

So, let's create a memoizer, shall we?


We'll make this as a class so you can use the memoizer on many different functions/calculations.

class Memoizer {
  
}

Well, that isn't going to help us, now is it?

Let's add some more to it. God I hope these people know how to use classes.

Adding the constructor to the Memoizer class, it should look like this:

class Memoizer {
  constructor(){
    this.methods = [];
  }
}

Why do we have a methods array?

Well, remember how I said we'd make it a class so you can use it on many different functions?

Sure, you could use it on calculations, but it would be easiest to put those calculations in a function and call it multiple times (instead of just putting the math somewhere in your code).

You wouldn't just randomly copy-paste calculations in multiple places, you put it in a function (which is pretty much what they [functions] are for anyways).

So, therefore we are making this class capable of taking in multiple functions and memorizing input/output in relation to each of them.

Either way, our class still won't be of any help to us with just this.

Let's make sure we can add some methods later on!

Below the constructor, add a new class method:

registerMethod(method){
  this.methods.push({
    "method": method
  });
}

There, now we can refer to a certain function stored when extracting/feeding data to/from it.

Still though, I don't think you guys want to feed data in yourself. Let's make a method to do it for us!

As always, under the last class method:

feedData(method, input, results){
  this.methods.forEach(v => {
    if(v.method === method){
      if(v.memory){
        v.memory.forEach((v) => {
          if(v.input === input){
            v.results = results;
          }
        });
        v.memory.push({"input": input, "results": results});
      }
      else{
        v.memory = [{"input": input, "results": results}];
      }
    }
  });
}

This seems a bit complex for us, but bear with me here!

Basically, it looks for if the method contains a memory object. If it does, then simply add to that the data. If it doesn't (it's never been fed data before) then just create a memory object with the data as the first stored input/output.

The input could be anything, from a single value to an array/object of values. The same goes for the output. The only problem here is nested arrays/objects, but we'll get into that later (preferably after we actually finish our Memoizer class and discuss more about it).

It also checks if the data exists. You may be thinking why would I need that?

One thing I have found is that sometimes you receive more output with the same input.

You are probably scratching your head in confusion right now, so I'll give an example.

I made a least-common-multiples function. It will find a certain number of least common multiples/least common multiples up to a certain number (E.g: it will find 5 least common multiples and/or find least common multiples up until that least common multiple goes over the number 1000).

One thing I found was that the user could input the same input twice, but expect twice as many least common multiples (E.g: the first time they inputted least common multiples of 1 and 2 and only wanted to find 5 of them but the second time they wanted to find 10 of them). Instead of storing the exact quantity of each common multiples each time, I would extract the output and start from there, only calculating from the last common multiple they stored (E.g: 2, 4, 5, 6 the first time and the second time they wanted up until 20, I would calculate from 6). Then, I would store the final result in the cache. If they requested less than last time then it's as simple as giving them a sliced array of the total info stored.

So, we can store data! Now though, we have to find a way to retrieve it or check if it exists in the first place.

Under the last class method, add another:

inMemory(method, input){
  let hasEntry = false;
    this.methods.forEach(v => {
      if(v.method === method){
        if(v.memory){
          v.memory.forEach(memEntry => {
            if(memEntry.input.filter((val, index) => val === input[index]).length === memEntry.input.length){
              hasEntry = true;
            }
          });
        }
        else{
          hasEntry = false;
        }
      }
    });
  return hasEntry;
}

This one looks over the input we are passing in and compares it to the input stored. If each of them match then it will return true. If not, false.

Note that it filters the memory entry and checks if it matches up with the input and not vice versa.

Therefore, you can always add in more input, but make sure that the input corresponding to the memory entry's input at that index matches up (E.g: [1, 2, 3] will match [1, 2, 3, 4], but make sure that at least the 1, 2 and 3 match).

That's nice and all, but we still need to fetch the results, not just know that our memoizer has them! Fishing them out ourself defeats the purpose of having the inMemory method in the first place, since we would have to iterate over the memory bank ourself!

Now then, let's add another method to the class, shall we?

fetchEntry(method, input){
  let entry = [];
  this.methods.forEach(v => {
    if(v.method === method){
      v.memory.forEach(memEntry => {
        if(memEntry.input.filter((val, index) => val === input[index]).length === memEntry.input.length){
          entry = memEntry.results;
        }
      });
    }
  });
  return entry;
}

The concept is similar to the last method we made, but instead of changing a variable to true to indicate that a match was found, it reassigns an array to the results if it is found. If the results aren't found then it will return an empty array. An empty array will also be returned if the results is empty. You can change the method to distinguish the difference between a non-match and an empty-results scenario, but that isn't necessary for this tutorial.

Now, you probably are into the coding at this point, but we are done!

Well, at least for the class.

The finished class should look like this:

class Memoizer {
  constructor(){
    this.methods = [];
  }
  registerMethod(method){
    this.methods.push({
      "method": method
    });
  }
  feedData(method, input, results){
    this.methods.forEach(v => {
      if(v.method === method){
        if(v.memory){
          v.memory.forEach((v) => {
            if(v.input === input){
              v.results = results;
            }
          });
          v.memory.push({"input": input, "results": results});
        }
        else{
          v.memory = [{"input": input, "results": results}];
        }
      }
    });
  }
  inMemory(method,input){
    let hasEntry = false;
      this.methods.forEach(v => {
        if(v.method === method){
          if(v.memory){
            v.memory.forEach(memEntry => {
              if(memEntry.input.filter((val, index) => val === input[index]).length === memEntry.input.length){
                hasEntry = true;
              }
            });
          }
          else{
            hasEntry = false;
        } 
        }
      });
    return hasEntry;
  }
  fetchEntry(method,input){
    let entry = [];
    this.methods.forEach(v => {
      if(v.method === method){
        v.memory.forEach(memEntry => {
          if(memEntry.input.filter((val, index) => val === input[index]).length === memEntry.input.length){
            entry = memEntry.results;
          }
        });
      }
    });
    return entry;
  }
}

Well, now... that's a lot of code! However, a general trend I have seen is that the more code there is, the more functionality the program has.

That said it's just a trend I have seen. That doesn't mean that it's true for all scenarios.

Ok, this is great and all, but now you guys are probably wondering how do I use it? (or for those who paid attention to this tutorial: how do I apply this to a "real" scenario?. We'll get there soon, just bear with me for now).

The first thing we got to do is create an instance of this class. You shouldn't create multiple instances of these, unless you want to keep certain data separate for better program neatness.

Initialize it like this:

const memo = new Memoizer();

I would recommend putting the instance initialization before any of the functions you are going to use on this and use the registerMethod method after all of the functions are defined. So, basically it would look something like this:

const memo = new Memoizer();
function factorial(num) {
  return (num === 0) ? 1 : num * factorial(num - 1);
}
mem.registerMethod(factorial);

Now we are going to change things inside of the factorial function. Based off of what I have taught you guys, can you try to figure this out on your own? Try making a new HTML, CSS, JS project and copy-paste the Memoizer class code in as well as the factorial function. Then use what you have learned so far to make factorial properly cache and serve cached data.

If you can't find out how (or you don't want to try), then just continue reading!

Inside factorial, we need to change how the function works. You will see in a moment why, but the first thing we have to do is to put the code that will check for cached data at the top of the function:

if(memo.inMemory(factorial, num)){
  return memo.fetchEntry(factorial, num);
}

Now, we will have to change how the factorial is found. If we were to just simply feed the data with the factorial of the number, the factorial is going to execute itself again which in turn will trigger the feedData again (since feedData() is inside factorial) resulting in a stack overflow. Instead, we are going to create a function that will be doing the callbacks instead of our factorial function. Delete the return statement in the factorial function and instead add this:

const f = num => (num === 0) ? 1 : num * f(num - 1);

This seems a bit confusing if you haven't done arrow syntax before. But, basically it is a mini factorial function inside of our factorial function. It will do the calculating for us.

Now we can add the feed data function. Under the new function we made, add this:

memo.feedData(factorial, num, f(num));

There! Now, lastly we return the factorial calculation function:

return f(num);

The final function should look like this:

function factorial(num) {
  if(memo.inMemory(factorial, num)){
    return memo.fetchEntry(factorial, num);
  }
  const f = num => (num === 0) ? 1 : num * f(num - 1);
  memo.feedData(factorial, num, f(num));
  return f(num);
}

Now this whole thing should work!

You may be a bit skeptical on why we don't have an else statement (the JavaScript pros probably already know why, though).

An else statement isn't needed. If it is found in the memory it simply returns it and the function stops executing.
Any "new" number will be added to the memory and will be stored.

You might also be skeptical on why we are calling f twice. This seems like a lot of processing power, right? Well... yes. The caching should be efficient, but you should also make sure that storing the data doesn't take up too much processing power either, especially if you are working with a lot of arbitrary data. This actually brings us to the next section: the pros and cons, as well as things to improve.


Ok then, so we have our function and all, but is it really that efficient? As we see, we are calling the factorial calculation function twice making twice as much "startup" lag.

The truth is: it depends.

Whether or not it is good for you depends on what you are using the function for. If you are calculating the same factorial a lot, then sure. Then again, you can just store it as a variable and use it throughout your program. However, the one way it could be useful is if you are repetitively calculating the same data over and over. Again, you can store it as a variable (probably an array), but it would actually be a lot more readable to just simply do factorial(100) and factorial(300) versus

const index = factorialCache.indexOf(100);
factOf100 = factorialCache(index);

and

const index = factorialCache.indexOf(300);
factOf300 = factorialCache(index);

Of course, at that point it's all up to you. Remember that this argument is applying to a factorial function, and primarily just the way I did chose to feed the result. In fact, now that I think of it it would be a lot extremely better to do this:

const fact = f(num);
memo.feedData(factorial, num, fact);
return fact;

It goes to show you how a couple of lines of code can improve your performance.

Now, we kind of have the pros and cons scattered here and there from our little talk about whether or not this is useful, but the main pros are:

  • Better code readability
  • Data for function is stored in one place
  • Allows you to keep track of multiple function's data

But the cons are there, too:

  • Can lead to lower performance/no change
  • Not the best for arbitrary data
  • Can actually decrease performance if function is only used once
  • Storing the data initially can cause initial lag on using the function

Whether or not you should use it is dependent on your situation. Think it through and determine whether or not you should use it. There are going to be times where you shouldn't use this, like for simple "addition" or "subtraction" functions. A more usefule example would be using it for Perlin noise where you have the same input occur multiple times (Perlin is very sensitive to input though, one decimal off and it might give you a completely different output).

Also, there are some barriers with our Memoizer class.

It doesn't work on nested data. Therefore, passing in arrays as input won't work! A simple fix for shallow nested arrays is to use the spread operator to spread the nested values into a single array, but for heavier nested objects or objects that have arbitrary nesting deepness you might want to consider using a recursive iterator to loop through your array and create a new shallow array with all of the input in order. The problem with this is that it can become more and more process intensive until it actually slows down your function more than it speeds it up! A clever trick someone suggested is using JSON.stringify() to create a "key" of the input and compare it. However, there is a problem with that.

Do you remember when I told you about how I extended my input with the least-common-multiples function? Well, if I were to add even one number, the whole string would be inequal to the other. In other words, an array of [1, 2, 3] and [1, 2, 3, 4] wouldn't match because even though [1, 2, 3][0] equals [1, 2, 3, 4][0], [1, 2, 3][1] equals [1, 2, 3, 4][1] and [1, 2, 3][2] equals [1, 2, 3, 4][2], "[1, 2, 3]" does not equal "[1, 2, 3, 4]" (note that it the two "arrays" are wrapped in quotes, indicating that they are strings). So, it really depends on what you want to do with the memoizer function. In fact, you could have different methods of storing the data and choose 1 out of multiple retrieval/matching methods that suit your needs for each function. Hey, the program is yours not mine!


So as we come to the end of this tutorial and it is 11:47PM here at my house (my Mom is not happy. Seriously though, how am I not deranged from staying up this late), I hope that you guys learned a lot from this tutorial. I tried to be clear enough so that you don't need all of the code put together as a "finished code" block.

If you guys liked this tutorial then please upvote. If you didn't, then please (emphasis on the please) tell me what I can do better. Please read the whole tutorial before criticizing though, and refrain from destructive criticism. I hope you found this useful and plan on using this in your future projects.

Bacon boi out!

Comments
hotnewtop
Baconman321 (1051)

@tussiez do you think we could use this for perlin noise?

Do we have recurring inputs for our perlin noise implementation, anyways?

tussiez (1435)

@Baconman321 Busy, I’ll finish reading this later :/