Implementing Undo in YouTube Playback

Intro

Two months ago, I was listening to a podcast on YouTube. Everything was going fine until I accidentally pressed a numpad key, sending me to a random timestamp in the video. It wasn’t the first time this had happened: the shortcut is pretty easy to hit by mistake.

As usual, I got a bit annoyed and wished I could just undo the jump to get back to where I was. That night, though, I didn’t have much else to do, so instead of letting the thought fade away, I decided to get scripting (and thus never finished the podcast).

In this article, I’ll describe the process of making this script with Tampermonkey, as well as the problems I encountered along the way, and how the implementation slowly evolved.

Install steps

Install the Tampermonkey extension;
Go to the script’s gist and click the “Raw” button.

You should then be able to undo and redo when watching a video on YouTube using Ctrl+Z and Ctrl+Shift+Z.

Table of contents

Intro
Install steps
Tampermonkey
Initial attempt
Polishing the script
The final iteration
- What it means to undo
- Back to Stacks: simplifying the undo history
Conclusion

Tampermonkey

Tampermonkey is a web extension that allows you to write your own JavaScript scripts, and bind them to whichever site you want. Put another way, it basically lets you write your own small web extensions.

The development experience is barebones at best. As a developer who mostly works with Node backends and heavier JS frameworks, I occasionally find it refreshing to work with raw JavaScript, its DOM methods, and pretty much no external packages.

I learned about this extension a while back through Mmaker’s script to disable YouTube’s volume normalization (it’s not on GitHub anymore, but other people have made similar scripts).

I made my first script with it after another friend complained about NicoNico’s language always resetting to English. It was a very simple script that looked for the language dropdown on the page and switched it back to Japanese if needed.

Initial attempt

I first thought of implementing it as a very basic stack (Last In, First Out, or LIFO) that would store the previous timestamp each time I moved the playhead to a different position. However, as the end user of my own solution, I knew I wasn’t safe from an accidental undo either, so this approach wound up being a bit too simple to work.

I had to implement redo functionality.

Instead of a stack, I started treating the history as a plain array, which meant storing the timestamps along with a pointer tracking my current position in the history.

When a new element is added to the undo history, it is written right after the cursor, and any elements that followed are discarded. If the cursor is already at the last position, this amounts to a simple append. This is the usual redo-invalidation logic found in most software.

const seekHistory = [];
const videoElem = document.getElementsByTagName("video")[0];
let seekIndex = -1; // <- Pointer we use to know at which position we are

videoElem.addEventListener(
    "seeking",
    function (ev) {
        addToHistory(ev.target.currentTime);
    });

function addToHistory(timestamp) {

    seekIndex++;
    seekHistory[seekIndex] = timestamp;
    seekHistory.length = seekIndex + 1;
}

The undo/redo logic was all about navigating to the desired position in the history, and properly incrementing/decrementing the counter.

function undo() {
    if (seekIndex <= 0) {
        return;
    }
    seekIndex--;
    videoElem.currentTime = seekHistory[seekIndex];
}

function redo() {
    if (seekIndex >= seekHistory.length - 1) {
        return;
    }
    seekIndex++;
    videoElem.currentTime = seekHistory[seekIndex];
}
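
To make the redo-invalidation concrete, here is a minimal, DOM-free sketch of the same cursor logic. The name createSeekHistory and its methods are hypothetical, and the returned timestamps stand in for what would be assigned to videoElem.currentTime. It only illustrates the behavior described above and is not the actual script.

```javascript
// Hypothetical stand-alone model of the cursor-based history (no DOM needed).
function createSeekHistory() {
    const entries = [];
    let index = -1; // position of the current entry, -1 when empty

    return {
        // Record a new timestamp and drop every entry after the cursor.
        add(timestamp) {
            index++;
            entries[index] = timestamp;
            entries.length = index + 1;
        },
        // Step back; returns the timestamp to seek to, or null if impossible.
        undo() {
            if (index <= 0) return null;
            index--;
            return entries[index];
        },
        // Step forward; returns the timestamp to seek to, or null if impossible.
        redo() {
            if (index >= entries.length - 1) return null;
            index++;
            return entries[index];
        },
        // Copy of the internal state, for inspection only.
        snapshot() {
            return { entries: [...entries], index };
        },
    };
}

const demoHistory = createSeekHistory();
demoHistory.add(10);
demoHistory.add(45);
demoHistory.add(120);
demoHistory.undo();   // cursor moves back to 45
demoHistory.add(300); // a new seek after an undo discards 120
console.log(demoHistory.snapshot()); // entries: [10, 45, 300], index: 2
```

After the last call, 120 is gone for good: redo has nothing left to return, which is the trade-off of this invalidation rule.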

The very first issue I encountered was simple enough: changing currentTime programmatically still triggers a seeking event. All I needed to do was make an exception for it.

const seekHistory = [];
const videoElem = document.getElementsByTagName("video")[0];
let seekIndex = -1;
let isProgrammaticSeek = false; // <- This here

videoElem.addEventListener(
    "seeking",
    function (ev) {
        if(isProgrammaticSeek) {
            isProgrammaticSeek = false;
            return;
        }
        addToHistory(ev.target.currentTime);
    });


function addToHistory(timestamp) {

    seekIndex++;
    seekHistory[seekIndex] = timestamp;
    seekHistory.length = seekIndex + 1;
}

function undo() {
    if (seekIndex <= 0) {
        return;
    }
    isProgrammaticSeek = true;
    seekIndex--;
    videoElem.currentTime = seekHistory[seekIndex];
}

function redo() {
    if (seekIndex >= seekHistory.length - 1) {
        return;
    }
    isProgrammaticSeek = true;
    seekIndex++;
    videoElem.currentTime = seekHistory[seekIndex];
}

All this isProgrammaticSeek variable does is inform the event handler that we do not want to save to the undo history when we are seeking because of an undo/redo event.
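
The feedback loop and its fix can be reproduced outside the browser. The sketch below assumes a runtime that provides the standard EventTarget and Event globals (browsers and recent Node.js versions). FakeVideo is a hypothetical stand-in for the real video element: it fires seeking on every assignment to currentTime, synchronously for simplicity, whereas a real element fires the event asynchronously.

```javascript
// Hypothetical stand-in for a <video> element: every assignment to
// currentTime dispatches a "seeking" event (synchronously, for simplicity).
class FakeVideo extends EventTarget {
    #time = 0;
    get currentTime() { return this.#time; }
    set currentTime(value) {
        this.#time = value;
        this.dispatchEvent(new Event("seeking"));
    }
}

const fakeVideo = new FakeVideo();
const recorded = [];          // timestamps that would go into the history
let programmaticSeek = false; // same role as isProgrammaticSeek

fakeVideo.addEventListener("seeking", (ev) => {
    if (programmaticSeek) {
        programmaticSeek = false; // consume the flag, skip recording
        return;
    }
    recorded.push(ev.target.currentTime);
});

fakeVideo.currentTime = 30;  // treated as a user seek: recorded

programmaticSeek = true;     // what undo()/redo() do before seeking
fakeVideo.currentTime = 10;  // ignored by the handler

fakeVideo.currentTime = 50;  // recorded again, the flag was consumed
console.log(recorded);       // [ 30, 50 ]
```

Note that the flag is consumed by the next seeking event, whatever its origin, so this approach relies on each programmatic seek producing exactly one event.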

Polishing the script

Tampermonkey integration

I uploaded this script to a gist to benefit from Tampermonkey’s quick-install feature in case I wanted to share it with other people. After a bit of searching, I found that the way to make it work was the .user.js filename extension, which allows “quick-installing” a script by clicking the “Raw” button on GitHub or a gist.

Key event conflicts

Another thing I wanted to avoid was triggering undo when pressing Ctrl+Z while writing a comment. However, since the comment box element does not load immediately, querying it in the body of my script caused an error. Further research led me to use a MutationObserver to wait for the comment component to load.

let commentInputElem = null;

waitForElm('#contenteditable-root').then((elm) => {
    logDebug("Comments have been loaded");
    commentInputElem = elm;
});

function waitForElm(selector) {
    return new Promise(resolve => {
        if (document.querySelector(selector)) {
            return resolve(document.querySelector(selector));
        }

        const observer = new MutationObserver(mutations => {
            if (document.querySelector(selector)) {
                observer.disconnect();
                resolve(document.querySelector(selector));
            }
        });

        observer.observe(document.body, {
            childList: true,
            subtree: true
        });
    });
}


function handleKey(e) {
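    // Ignore the shortcut only when focus is inside the comment box.
    // (Checking activeElement.contains(commentInputElem) would also match
    // when <body> is focused, since <body> contains the comment box.)
    if (commentInputElem && commentInputElem.contains(document.activeElement)) {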
        logDebug("Comment element is focused, ignoring undo/redo inputs");
        return;
    }
    if (e.key === 'z' && e.ctrlKey && !e.shiftKey) {
        undo();
    }
    else if (e.key.toLowerCase() === 'z' && e.ctrlKey && e.shiftKey) {
        redo();
    }
}

document.addEventListener('keydown', handleKey);
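
The key matching in handleKey can also be isolated into a pure function, which makes it easy to check without a browser. This is only a sketch: classifyShortcut is a hypothetical name, and it accepts any object shaped like a KeyboardEvent (key, ctrlKey, shiftKey). Unlike the handler above, it lowercases the key for both shortcuts, so Ctrl+Z is still recognized as undo when Caps Lock is on.

```javascript
// Hypothetical helper: maps a KeyboardEvent-like object to
// "undo", "redo", or null when the keys are not one of our shortcuts.
function classifyShortcut(e) {
    if (!e.ctrlKey || e.key.toLowerCase() !== "z") {
        return null;
    }
    return e.shiftKey ? "redo" : "undo";
}

console.log(classifyShortcut({ key: "z", ctrlKey: true, shiftKey: false }));  // undo
console.log(classifyShortcut({ key: "Z", ctrlKey: true, shiftKey: true }));   // redo
console.log(classifyShortcut({ key: "z", ctrlKey: false, shiftKey: false })); // null
```

A handler built on it would call undo() or redo() depending on the returned value, after the focus check.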

Purging the undo history

My script was functional, but as I tested it some more, I realized that navigating to another video did not cause a page reload, and thus didn’t reset the state. The history was being preserved across videos, which didn’t make much sense. To work around this SPA behavior, I had to bind another observer to the video element itself and listen for changes to its src attribute.

const observer = new MutationObserver(function(mutations) {
    mutations.forEach(function(mutation) {
        logDebug(mutation);

        if (mutation.type === "attributes" && mutation.attributeName === "src") {
            // This is where the state is reset when changing videos
            seekHistory = [];
            seekIndex = -1;
        }
    });
});

waitForElm('#movie_player video').then((elm) => {
    videoElem = elm;

    logDebug('found video element', elm);

    videoElem.addEventListener(
        "seeking",
        function (ev) {
            if(isProgrammaticSeek) {
                isProgrammaticSeek = false;
                return;
            }
            addToHistory(ev.target.currentTime);
            logDebug(seekHistory, seekIndex);
        });

    document.addEventListener('keydown', handleKey);

    observer.observe(elm, {
        attributes: true //configure it to listen to attribute changes
    });

})

The first seek event

With further testing, I came across an edge case I failed to consider.

Because my script only appended the new timestamp to the undo history, and not the position before seeking, the very first seek could not be undone.

This was a very bad oversight. While I was having fun writing the script, I was also its intended user. The script exists for accidents, and I wouldn’t think about it the rest of the time, so the seek I wanted to undo would often be the first one.

To make a long story short: the only relevant scenario in which I would need my script also happened to be the very edge case I failed to consider.

Fortunately, the fix was simple enough. In order to handle it, all that was needed was a cache of my previous position, and an exception to insert said cached position when the history was empty.

let seekHistory = [];
let seekIndex = -1;
let cache = 0.0;
let cache_prev = null;

videoElem.addEventListener("timeupdate", () => {
    cache_prev = cache; // save last known value
    cache = videoElem.currentTime; // before updating to new currentTime
});

// ... //

videoElem.addEventListener(
    "seeking",
    function (ev) {
        if(isProgrammaticSeek) {
            isProgrammaticSeek = false;
            return;
        }
        if (seekHistory.length === 0 && cache_prev !== null) {
            addToHistory(cache_prev);
        }
        addToHistory(ev.target.currentTime);
        logDebug(seekHistory, seekIndex);
    });
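
The effect of the cache can be traced with a small DOM-free model. This is a sketch under assumptions: createTracker and its method names are hypothetical, the two callbacks are meant to be called from the timeupdate and seeking handlers, and the history is reduced to a plain array (no cursor) to keep the focus on the first-seek case.

```javascript
// Hypothetical DOM-free model of the cached-position fix.
function createTracker() {
    const history = [];
    let cache = 0.0;      // most recent position reported by "timeupdate"
    let cachePrev = null; // the position reported just before that

    return {
        history,
        // Call with the playback position on every "timeupdate".
        onTimeUpdate(currentTime) {
            cachePrev = cache;
            cache = currentTime;
        },
        // Call with the target position on every user-triggered "seeking".
        onSeeking(currentTime) {
            if (history.length === 0 && cachePrev !== null) {
                history.push(cachePrev); // remember where playback was
            }
            history.push(currentTime);
        },
    };
}

const tracker = createTracker();
tracker.onTimeUpdate(61.0);
tracker.onTimeUpdate(61.25);
tracker.onSeeking(600);       // accidental jump
console.log(tracker.history); // [ 61, 600 ]
```

The saved position is one timeupdate tick behind the actual playhead, which is close enough to get back to where the accident happened.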

I left my script at that, as it was working well enough for now.

The final iteration

A few weeks later, I was watching something else, and decided to mess around a bit to see if my script was still working.

That’s when I found another edge case: once my undo history was no longer empty, only a user-triggered seek event could add to it. The position I was at right before an undo was never saved, so I had no way to get back to it.


I wanted to solve this issue, but the more I thought about it, the less I was satisfied with my solutions.

What it means to undo

When you undo in a text editor or other editing software, the operation is fairly straightforward: each new action you take is appended to the undo history. While some programs can get pretty smart with it (like text editors undoing per word instead of per keystroke), one thing stays true: when you’re not touching anything, nothing is happening in your undo history.

But how should this be handled when it comes to video playback? Even if you don’t touch anything, the video will keep playing.

I would like to be able to redo something after undoing, but with video, this raises a few questions:

If undoing adds the pre-seek position to the history, should it clear all elements after it even though this is not a user-triggered seek operation? Or should it insert the undo position while preserving everything after it?
- If I do not clear the next elements when inserting in history, should I run a proximity check with the next redo element and avoid inserting if it is close enough with the inserted timestamp?
With the history [5s, 12s, 30s], if I undo and go to the timestamp 30s, and then wait a few seconds (five or ten), should the next undo operation take me back to 30s (the closest undo element)? Or should it take me back to 12s (the next undo element)?
- Assuming it is the closest undo element, how close to 30s should I be in order for my next undo to take me back to 12s?

The smarter I wanted the script to be, the more implicit behavior it would introduce. At this point, any decision I made would be a tradeoff between accuracy and user expectations. As a user, I don’t expect an undo script to perform this kind of internal logic and proximity checks. That said, I still needed to solve my issue with inserting redo elements into a non-empty history.

Since every addition seemed to bring more of these implicit checks and behaviors, I went back to the design itself: maybe there was something to change there.

Back to Stacks: simplifying the undo history

When I started, I made a very simple undo stack, then changed it to an undo history after considering the need for redo operations. But there was another path I could have taken, one which I was now going back to: have a separate undo and redo stack.

Instead of navigating a history with a cursor, I would use an undo and redo stack, both of which are only ever manipulated with LIFO operations.

[Figure: the undo and redo stacks after each step of an undo followed by a redo]

In the new design, undoing and redoing immediately pop from the respective stack and push the pre-seek position onto the opposite stack. This solves the previously mentioned issue of losing our previous position when undoing with a non-empty history.

However, undoing and then redoing while the video is still playing introduces a little drift in the timestamp (notice the last element of the undo stack being 1min30s in step 2, and then 1min35s in the final step). Doing this repeatedly increases the drift. While this behavior doesn’t strictly align with the idempotency of undo operations in editing software, I believe it is much preferable to all of the implicit and unintuitive behaviors that come from using a history instead.
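
The two-stack design and its drift can be sketched without a browser. Everything named here is hypothetical: createUndoRedo, recordSeek, and the player object, which is a plain stand-in for the video element. Clearing the redo stack on a new user seek is an assumption borrowed from the usual redo-invalidation rule, not something the design above prescribes.

```javascript
// Hypothetical model of the two-stack design. `player` is any object with a
// numeric currentTime property (a stand-in for the <video> element).
function createUndoRedo(player) {
    const undoStack = [];
    const redoStack = [];

    return {
        undoStack,
        redoStack,
        // Call on a user-triggered seek with the position playback was at
        // just before the jump. Clearing redoStack here is an assumption.
        recordSeek(preSeekTime) {
            undoStack.push(preSeekTime);
            redoStack.length = 0;
        },
        undo() {
            if (undoStack.length === 0) return false;
            redoStack.push(player.currentTime); // save where we are now
            player.currentTime = undoStack.pop();
            return true;
        },
        redo() {
            if (redoStack.length === 0) return false;
            undoStack.push(player.currentTime); // save where we are now
            player.currentTime = redoStack.pop();
            return true;
        },
    };
}

const player = { currentTime: 90 };    // playing at 1min30s
const stacks = createUndoRedo(player);

stacks.recordSeek(player.currentTime); // user jumps from 1min30s...
player.currentTime = 600;              // ...to 10min

stacks.undo();                         // back to 90, redoStack = [600]
player.currentTime += 5;               // playback continues for 5 seconds
stacks.redo();                         // forward to 600, undoStack = [95]
console.log(stacks.undoStack, player.currentTime); // [ 95 ] 600
```

The 90 that became 95 is the drift described above. In the real script, the assignments to currentTime inside undo and redo would still fire seeking events, so a guard such as isProgrammaticSeek remains necessary.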

Conclusion

What started as a small project slowly turned into a… small project with a mildly challenging element to it. All in all, it only took three work sessions to get done, totalling less than half a day.

Nevertheless, this was a fun experience, from which I took a few lessons:

When I started writing this script, I didn’t realize how much the fact that the video keeps playing would affect the undo logic. Maybe diagramming the flow from the start could have spared me many of these troubles;
YouTube’s initial load takes a while! Seriously, each time I updated the script, I had to refresh the page and wait between 10 and 15 seconds just for the video component to be fully operational;
Tampermonkey is cool. Now when I browse the internet, I’ll sometimes have moments where I think to myself “huh, I wonder if I could script something to make X or Y more pleasant to use” (not that I plan to act on such thoughts every time, though);
I’m much more receptive to edge cases and usability when I design something for my own use.

With that said, thanks for reading. See you in the next article!