From audiobook to audio novel with SubPlz

Over the past few years, visual novels have become a very popular way to immerse in Japanese. They are arguably an excellent medium: they combine audio and text at a much higher density than anime, and often with “richer” vocabulary (口語 vs 文語) by virtue of being a literary medium, which calls for more description than anime. All the while, they feature supporting artwork that helps the reader imagine the scenes (a big plus for those of us with aphantasia).

However, people can also feel alienated by the medium and refuse to use it, be it because of prejudice, because of the often sexual nature of its titles, because they can’t find a title that interests them, or for many other reasons.

The compromise I offer you today, in no small part thanks to the excellent work of KanjiEater with SubPlz, will bring you a lot of the benefits of visual novels without many of the drawbacks and qualms people might have with them. The big plus is that nowadays a lot of popular anime draw their source material from light novels, and you’ll often find that the narrator of the audiobook is one of the voice actors from the anime (the highest-budget ones sometimes even have an ensemble cast, with different actors voicing multiple characters, or make use of SFX)! And of course, it goes without saying, but this method also works for more traditional novels: your コンビニ人間s, your こころs or your 同志少女よ、敵を撃てs. There are even some translated English books that got Japanese audiobook counterparts. It works for anything the market decided should have an audiobook, which might sometimes surprise you one way or another: that one very niche anime you like may have audiobooks for its novels, while something like Re:Zero, one of the biggest sellers of a generation, does not.

Concretely, SubPlz automagically aligns the text of a novel to its audiobook and produces very accurate subtitles. It creates a .srt file that you can use with the audio file in a video player, much like subbed anime/movies. As it stands, most people use mpv with keyboard shortcuts (to go from one sub to the next) as well as extensions that feed a clipboard page for lookups. For all intents and purposes, it’s probably good enough to read as is.
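
If you’re curious what’s inside that .srt file, it’s plain text: numbered cues, each with a `start --> end` timing line followed by the subtitle text. Here’s a minimal Python sketch that reads one, assuming the standard SubRip layout. `parse_srt`, `Cue` and the sample timings are names and values I made up for illustration; none of this is SubPlz’s own code.

```python
import re
from dataclasses import dataclass

# Standard SubRip timing line: "HH:MM:SS,mmm --> HH:MM:SS,mmm"
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Cue:
    start_ms: int
    end_ms: int
    text: str


def _to_ms(h: str, m: str, s: str, ms: str) -> int:
    """Convert the four captured timestamp fields to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_srt(content: str) -> list[Cue]:
    """Parse SubRip text into cues; blocks without a timing line are skipped."""
    cues = []
    normalized = content.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                text = "\n".join(lines[i + 1:]).strip()
                cues.append(Cue(_to_ms(*g[:4]), _to_ms(*g[4:]), text))
                break
    return cues


# Made-up sample: the timings are invented for this example.
SAMPLE = """1
00:00:01,000 --> 00:00:04,500
吾輩は猫である。

2
00:00:04,500 --> 00:00:07,250
名前はまだ無い。
"""

if __name__ == "__main__":
    for cue in parse_srt(SAMPLE):
        print(cue.start_ms, cue.end_ms, cue.text)
```

Each cue’s start and end time is what lets a player (or any other tool) know which slice of the audio belongs to which line of text.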

Unfortunately, I’m an idiot, I’m a perfectionist, and my eureka moments bring out curses to this world. As you guessed from the title of the article and the foreshadowing in the first paragraph, we’re going to use the battle-tested Ren’Py engine instead. This easily gives us a few nice features:

A user interface that looks much more like a normal epub reader such as ッツ or Readwok
Furigana rendering
Personalization (font, colors, background, etc.)
Vertical text (TODO: test this more)
Save/Load features
Easy back and forth/skip features without tweaking config files (just use the mouse wheel)
Out of the box websocket/clipboard copying for texthooker pages

It’s not as thoroughly tested so far, but the whole setup should generally work with languages other than Japanese as well.

So, what does it look like in practice?

Cool huh? It’s become my preferred way to read books. The line pauses at the end, which leaves you time to look stuff up. It forces a slower pace on you (by following the speech) if you’re prone to whitenoising, saves you the time spent second-guessing yourself and yomitan-ing a word just for its reading, gives you non-verbal cues about what’s going on (emotion in the voice, knowing which character is speaking), and breaks the monotony of scrolling while reading a book or just listening to the audiobook on autoplay. It can be considered a crutch in some respects, but it’s not really detrimental to learning in the long run. That said, because there are no visuals here, I’ve come to call them Audio Novels (ANs) instead. Please help me coin the term if you like this setup!

From Audiobook to Subtitle

[Section needs redoing, as some things have changed.]

From Subtitle to Audio Novel

First, let’s download everything we’ll need:

For Windows
On Linux: TODO
On Mac: TODO

The software itself is intuitive to use, but there are a few preparatory steps (that you’ll only go through once) to take care of beforehand.

Customizing our Ren’Py

The template folder next to the executable will be copied every time you make a new AN, so let’s start by making it look good from the get-go. For that purpose, the template contains a very short excerpt from 吾輩は猫である. That way we have an example to test our customizations on.

Feel free to skip any step here; the template is based on my configuration, so it’s perfectly usable as is.

Let’s head to the game folder, as it is where everything will take place.

TODO Vertical writing ?

Copy to Clipboard / Websockets

A classic for VN readers. Both are enabled by default (the websocket server listens on port 6677), but of course you might have reasons to disable either.

To do that, simply delete texthooking.rpy and texthooking.rpyc to disable the clipboard, or websocket_server.rpy to disable the websockets.

If you want to change the port, open websocket_server.rpy and change the 6677 in `server = WebSocketServer('', 6677, SimpleEcho)` to the number of your choice.
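
If you’d rather script that edit than do it by hand, something like the following should work. It’s only a sketch: `set_websocket_port` and `patch_file` are helper names I made up, and it assumes the file contains the call in the form quoted above (a quoted host string followed by the port number).

```python
import re
from pathlib import Path

# Matches the port argument in a call shaped like:
#   server = WebSocketServer('', 6677, SimpleEcho)
PORT_ARG = re.compile(r"(WebSocketServer\(\s*'[^']*'\s*,\s*)\d+")


def set_websocket_port(source: str, port: int) -> str:
    """Return source with the first WebSocketServer port replaced by port."""
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid TCP port: {port}")
    # A function replacement avoids ambiguity between "\1" and the digits after it.
    updated, count = PORT_ARG.subn(
        lambda m: m.group(1) + str(port), source, count=1
    )
    if count == 0:
        raise ValueError("WebSocketServer(...) call not found")
    return updated


def patch_file(path: Path, port: int) -> None:
    """Rewrite the file at path in place with the new port."""
    text = path.read_text(encoding="utf-8")
    path.write_text(set_websocket_port(text, port), encoding="utf-8")
```

You would call `patch_file` with the path to websocket_server.rpy inside your template’s game folder and the port you want. Remember to point your texthooker page at the same port afterwards.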

Font

The default font is Noto Sans CJK. If you want to change it, find a .ttf file of your font and replace font.ttf with it. Make sure the new file is named exactly font.ttf, and beware of uppercase letters.

Background

The background is gui/nvl.png. Feel free to decorate it with any color, pattern, or anything else you please.

`gui.rpy` changes

Everything has a description, but here are a few recommendations of things to change:

`define gui.text_size = 33`: Self-explanatory. I like my font pretty big, but you might like it smaller.
`define gui.nvl_list_length = 2`: This controls the number of lines displayed at once. Be wary of setting it too high, or the text will overflow and some of it might become hidden (this is my biggest gripe, and one I might look into improving in the future).

`top.txt` changes

This one isn’t in game but back in the executable folder. I recommend trying your changes out on script.rpy first, though.

Replace #1d1f20 with the text color (the first occurrence is for normal text, the second for ruby/furigana; you can use different colors if you’d like).
window_background is the color of the rectangles behind the text. I recommend setting it close or equal to the background color so the rectangles blend in, unless you have a more complex background.

Don’t forget to mirror your changes in top.txt once you’re done experimenting!
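
The color swap described above (first #1d1f20 for normal text, second for ruby) can be sketched in Python as follows. Treat it as an illustration only: `recolor` is a made-up helper, and `SAMPLE` is a hypothetical stand-in because the real layout of top.txt will differ. The only thing it relies on is that the placeholder color appears at least twice, in that order.

```python
import re

DEFAULT_COLOR = "#1d1f20"  # placeholder color named in this guide
# Ren'Py accepts #rgb, #rgba, #rrggbb and #rrggbbaa color strings.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{3,4}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8})")


def recolor(template: str, text_color: str, ruby_color: str) -> str:
    """Swap the 1st DEFAULT_COLOR for text_color and the 2nd for ruby_color."""
    for color in (text_color, ruby_color):
        if not HEX_COLOR.fullmatch(color):
            raise ValueError(f"not a hex color: {color!r}")
    # Split on the first two occurrences only; any later ones stay untouched.
    parts = template.split(DEFAULT_COLOR, 2)
    if len(parts) < 3:
        raise ValueError(f"expected two occurrences of {DEFAULT_COLOR}")
    return parts[0] + text_color + parts[1] + ruby_color + parts[2]


# Hypothetical stand-in for top.txt; the real file's layout may differ.
SAMPLE = 'text color "#1d1f20"\nruby color "#1d1f20"\n'

if __name__ == "__main__":
    print(recolor(SAMPLE, "#ffffff", "#cccccc"))
```

Splitting instead of calling `replace` twice keeps the result correct even if the new text color happens to be #1d1f20 itself.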

TODO list good color combinations from ttsu

It’s audiobookin’ time

Phew, that was a doozy, but that’s all the hard stuff out of the way now. You can finally run audiobooktorenpy and you’ll be greeted by a pretty clear UI.

Name: This will be the name of the folder. Just use a new one every time, nothing crazy here.
Path to the epub: This is used to compute the furigana and put them back in the text. It’s not mandatory and a bit shaky, but pretty neat.
Offsets: This offsets the audio splitting a little, because sometimes the subtitle starts a bit too late, which causes a weird half-syllable at the start of the line. Typically anywhere from -120 to -200 sounds good to capture a bit of silence beforehand. There’s no magic value here, as it varies from audiobook to audiobook, but the process only takes a couple of minutes, so trial and error is the key.
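
To make the offset idea concrete, here’s a tiny Python sketch of what shifting a clip’s start could look like. I’m assuming here that the values are in milliseconds and that only the start of each line moves; `shifted_clip` is a made-up name, not the tool’s actual code.

```python
def shifted_clip(start_ms: int, end_ms: int, offset_ms: int) -> tuple[int, int]:
    """Shift a clip's start by offset_ms (negative means earlier).

    The start is clamped so it never goes before 0 or past the clip's end.
    Assumption: units are milliseconds and the end time is left alone.
    """
    new_start = max(0, start_ms + offset_ms)
    new_start = min(new_start, end_ms)
    return new_start, end_ms


if __name__ == "__main__":
    # Try a few values from the range suggested above.
    for offset in (-120, -160, -200):
        print(offset, shifted_clip(1000, 4500, offset))
```

With an offset of -150, a line that starts at 1000 would be cut from 850 instead, which is what captures that bit of silence before the first syllable.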

After filling in the fields, we’re done: just press the button and wait a few minutes. All that’s left is to enjoy and read!

From Subtitle to SRS

We can also transform our subtitle into an Anki deck. We could already have done that with subs2srs, but that has a couple of drawbacks. Firstly, as with Ren’Py, I modify the subtitles slightly so that they are generally more accurate. Secondly, this way is several times (approximately 10 to 50 times) faster than subs2srs, since audiobooks obviously have no video and there is no screenshot to take for each card.

Running it is very simple, download it here:

Windows
TODO Linux
TODO Mac

Then extract and run audiobook2srs. Give it a unique name (otherwise your media collection might end up with duplicates and trouble importing; I didn’t check this directly, but better safe than sorry), select the .srt and .m4b, and set the offset. See the Ren’Py part above for a clearer explanation of the offset.

After a couple of minutes, you’ll get an .apkg file, which you can import into Anki. Tada!

TODO Addendum: Sentence Banking and automatically bringing the audio to your mined cards

To be expanded in a future update!