kageru.moe

Introduction

Now, if you’re anything like me, you’ll probably react to that title with “wtf, no, we have way too many of them”. But hear me out. While it is true that most Vapoursynth “funcs” have dozens of dependencies that are sometimes poorly (or not at all) documented, we might be cutting corners where it matters most. “Most” as in “where it affects the most users”.

Fundamentally, there are two groups of developers in the Vapoursynth “community” (and I use that term very loosely):

People who write plugins
People who write Python functions

Myrsloik, the head developer of Vapoursynth, has set everything up in a way that facilitates a structured plugin ecosystem. Every plugin can reserve a namespace, and all of its functions have to live there. You can’t have a denoiser and a color shift in the same namespace without making things really confusing, both for you as a dev and for the users.
This is good. It should be like this. But here’s an issue:

Functions are a separate ecosystem

Funcs (and I’ll use that term to refer to collections of functions such as havsfunc or mvsfunc) are fundamentally different from plugins. Most importantly for the user, they need to be imported manually. The namespacing is handled by Python. But even Python can’t save you if you don’t let it.

Probably the most popular func out there is havsfunc. At the time of writing, it consists of 32 main functions and 18 utility functions. The other big funcs paint a similar picture. For some reason, the convention has become to dump everything you write into a single Python file and call it a day. When I started using Vapoursynth, this trend was already established, so I didn’t really think about it and created my own 500-line monstrosity with no internal cohesion whatsoever. We don’t care if our func depends on 20 plugins, but God forbid a single encoder release two Python modules that have to be imported separately. This is what I mean by “we’re afraid of dependencies”. We want all of our code in one place with a single import.
It is worth pointing out that not everyone is doing this. People like dubhater or IFeelBloated exist, but the general consensus in the community (if such a thing even exists) seems to be strongly in favor of monolithic, basically unnamed script collections. This creates multiple problems:

The Barrier of Entry

I don’t think anyone can deny that encoding is a very niche hobby with a high barrier of entry. You won’t find many people who know enough about it to teach you properly, and it’s easy to be overwhelmed by its complexity.
To make matters worse, if you’re just starting out, you won’t even know where to look for things. Let’s say you’re a new encoder who has a source with a few halos and some aliasing, so you’re looking for a script that can help with that. Looking at the documentation, your only options are Vine for the halos and vsTAAmbk for the aliasing. There is no easy way for you to know that there are dehalo and/or AA functions in havsfunc, fvsfunc, muvsfunc, … you get the point. We have amazing namespacing and order for plugins, but our scripts are a mess that is at best documented by a Doom9 post or the docstrings. This is how you lose new users, who might have otherwise become contributors themselves.
But I have a second issue with the current state of affairs:

Code Duplication

As mentioned previously, each of these gigantic funcs comes with its own collection of helpers and utilities. But is it really necessary that everyone writes their own one-liner for splitting planes, inserting clips, function iteration, and most simple mask operations? The current state, hyperbolically speaking, is a dozen “open source” repositories with one contributor each and next to no communication between them. The point of open source isn’t just that you can read other people’s code. It’s that you can actively contribute to it.
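
To make the “function iteration” case concrete, here is a minimal sketch of the kind of helper that gets rewritten in func after func. The name `iterate` and its signature are my own choice for illustration and not taken from any particular func. It is plain Python and doesn’t even need Vapoursynth, which is rather the point: there is nothing author-specific about it.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def iterate(base: T, function: Callable[[T], T], count: int) -> T:
    """Apply `function` to `base` `count` times, feeding each result back in.

    Hypothetical shared helper; name and signature are assumptions.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    result = base
    for _ in range(count):
        result = function(result)
    return result


# In a Vapoursynth script, this could be used to grow a mask three times,
# e.g. iterate(mask, core.std.Maximum, 3). Shown as a comment only, so the
# sketch stays runnable without Vapoursynth installed.
```

Written once and imported everywhere, a helper like this only has to be fixed in one place when something changes.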

The Proposal

So what do I want? Do I propose a new system wherein each function gets its own Python module? No. God, please no. I accept that we won’t be able to clean up the mess that has been created. That, at least in part, I have created. But maybe we can start fixing it at least a little bit to make it easier for future encoders. Actually utilizing open source seems like it would benefit everyone, the script authors as well as the users. Maybe we could start with a general vsutil that contains all the commonly-used helpers and utilities. That way, if something in Vapoursynth is changed, we only have to change one script instead of 20. This should particularly help then-unmaintained scripts, which won’t break quite as frequently. The next step would be an attempt to combine functions of a specific type into modules, although this might get messy as well if not done properly. Generally, splitting by content rather than splitting by author seems to be the way.

I realize that I am in no real position to do this, but I at least want to try this for my own kagefunc, and I know at least a few western encoders who would be willing to join. We’ve been using a GitHub organization for this for a while, and I think this is the way to go forward. It would also allow some sort of quality control and code review, something that has been missing for a long time now.

I’ll probably stream my future work on Vapoursynth-related scripts (and maybe also some development in general) on my Twitch channel. Feel free to follow if you’re interested in that or in getting to know the person behind these texts. I’ll also stream games there (probably more games than coding, if I’m being honest), so keep that in mind.

Edit: It has been pointed out to me that vsdb exists to compensate for some of the issues described here. I think that while this project is very helpful for newcomers, it doesn’t address the underlying issues and just alleviates the pain for now.

I just had to plug my Twitch there, didn’t I?