Weeknote: 2024-W16

22 April 2024

Last week in Manyfold development saw a bit more rabbithole spelunking again, and once again not quite the focus I was after.

Monday was a good start, with actual 3MF conversion support happening pretty quickly, as expected. Then, however it fell into the usual “oh but what about” type issues before pushing that out as a release.

OBJ debugging

First issue came up straight away when I tried to convert some OBJ files - they didn’t load properly, and no geometry came out in the 3MF. Not ideal.

OBJ is a somewhat strange format - it’s pretty flexible with what has to go where, meaning that real-world files can easily have a structure that wasn’t in your test examples, and this is exactly what was happening in this case. So, I went off to read the spec to see what was going on.

Turns out, OBJ has an o directive which is used to name an object in the file. It doesn’t actually have any effect on geometry, it’s just used for naming. Now, you’d think that would go at the start of the file, but no, it can go anywhere, and frequently does. Mittsu’s OBJ loader was written based on test data that had that directive at the start of the file, whereas a bunch of my OBJ models had it after the vertex list, but before the faces. Mittsu’s handling meant that in that situation, it was throwing away all the vertices and starting again. So, that’s where my geometry went.

Fortunately, it was a fairly quick fix to rename the existing object if an unnamed one was in progress of being loaded. Then on to the next thing!

Job feedback

So, the 3MF conversion was working, but there were a couple of problems. First, it’s slow, so you don’t know whether it’s still in progress, and secondly there are a bunch of things that can go wrong during the process. So we need to be able to get some feedback.

The conversion (as with a bunch of scanning and file analysis work) is done in a background job, so that tasks that don’t compelte quickly don’t block up the frontend interface. So far for that I’ve used Delayed Job, via Rails’ generic job interface, ActiveJob.

Now, ActiveJob is a brilliant system in that it makes it super easy to use any job backend through a standard interface. However, it lacks any sort of reporting function - you can’t query which jobs are running, what they’re doing, how big the queue is, that sort of thing. So far it’s not really mattered, as scans don’t tend to need to send much feedback, and don’t run for long. And if something broke, I set it up so that old jobs were cleared on restart, and added the Tasks panel to the advanced admin UI to list the current jobs in a very simple way so at least you could see what’s going on.

That wasn’t going to be enough here though; I wanted to be able to get feedback from the jobs on what they were doing, track failures, and so on. So I started wondering about patterns for building a proper activity log. This was honestly pretty confusing to decide on - just working out what to call it was hard enough, then knowing what to search for to see what existing things existed… that took a bunch of time all by itself.

Changing to Sidekiq

While doing this, I was obviously doing a lot of background job testing, and a dev problem that I’d ignored for ages reared its head. When running a scan locally, once the jobs completed, the app would lock up and stop responding to the browser. I suspected an SQLite lock or something, but it was completely unclear. Anyway, it finally got too much for me and I had to do something.

I wasn’t entirely happy with Delayed::Job; it didn’t give me any feedback, and my cobbled-together task monitor was pretty poor. But there’s a much nicer background system out there called Sidekiq, and I wondered if I could prove how good ActiveJob’s interface was by changing over to that.

Turns out, ActiveJob’s interface is brilliant. I changed to Sidekiq and had jobs running in it in about 15 minutes, with basically no code changes. AND, because of the way I’d packaged and designed Manyfold so far, I could even change over the runner without impacting any users, because it’s all wrapped up inside the docker container. And, I had already required a Redis server for the app, so even though Sidekiq needs that, it was already there ready to be used. (The dirty secret part is that Redis wasn’t actually being used yet - it was originally intended for ActionCable stuff, but I haven’t done that yet).

So because of a combination of luck and a little good design, I can swap out the entire backend system without anyone noticing. I love things like that. As a bonus, Sidekiq instantly resolved the lockup problem, so that was even better. And it’s significantly faster, at least in development.

Status updates

Now, it would be really good if Sidekiq could give me some better state feedback… ah. It doesn’t really, at least not what I need. Oh well. We’re still going to need something else.

After more searching around, I stumbled across activejob-status, a nice lightweight system for communicating status updates from background jobs. It can use Redis as a backend, same as Sidekiq, and it adds a status object for each background job which automatically gets updated with the job state, can include messages, errors, things like that. Very flexible. And because it’s in Redis, it automatically expires after a certain time, so we don’t end up with an infinitely-growing activity log.

There still wasn’t a standard way to list all the jobs, but I worked out a way to do it by searching for matching Redis keys; you wouldn’t want to do this with a long expiry period, but as I was only keeping jobs for 24 hours, this is fine.

Once we had all that in place, creating the activity log UI itself was pretty straightforward, helped by Kaminari’s ability to do pagination for plain Ruby arrays, not just ActiveRecord objects.

Slouching towards release

All that architectural change took quite a lot of mental capacity, and my productivity wasn’t helped by a few bugs that cropped up along the way. I broke a few things when I moved bulk file editing to a different URL (to make way for the conversion endpoints), and there was one potentially horrifying bug that had popped up early in the week and which I kept avoiding, but was making me paranoid about releasing.

I’d wiped my dev database to run the app from afresh with all the new background changes, and all the paths came out wrong. I’d set up the library as normal, but all the model paths somehow included a full absolute path, not the path within the library folder. I spent a couple of days with that in the back of my head shouting “OH GOD WHAT DID YOU BREAK?!”.

Fortunately, the back burner did its job, and one night just as I was dropping off to sleep, my subconscious let me know that it was because the case of the paths was wrong.

I develop Manyfold on macOS, and the default filesystem is case-insensitive, but case-preserving. That means that it doesn’t matter if I use the path /users/james/3d or /Users/James/3D, I’ll get the same result. However, one is correct, and the other is not. When I list all the files in that folder using Ruby’s Dir.glob, I’ll get a list of absolute paths using the correct case, not the case of the search string I passed in.

In this case, I’d got the library path case wrong, and so it hadn’t matched all the detected paths later on in the code, and everything broke. Once I’d realised that, it was a fairly simple matter to check the library path and fix it before storing it in the database. Now it’ll never happen again.

In summary: case-preserving-and-insensitive filesystems can get in the sea.

Anyway, with that heart-stopping error resolved, and a few other little bugs ironed out, I could finally push out the v0.63.0 release, featuring 3MF conversion. The 80-20 rule really held true for this one. That last 20% of the work really did take most of the time this week.

What’s next?

This week I’m going to focus on security, accessibility and performance. NLNet make some great services available as part of their funding offer; I’ve got an accessibility audit lined up for Friday, and a security audit happening sometime soon as well. So, I’ll be spending the week fixing everything I know about before those happen, and installing some automated tools to pick up more before the experts start finding more obscure things.

Bonus Feature

This week you get a treat from Coach Party, who along with Lauran Hibberd and Wet Leg are part of an explosion of fantastic music coming from the Isle of Wight recently, and which are all doing heavy rotation while I code.