Japanese clipboard translation
This article shows a Linux (Wayland) workflow for learning Japanese: you select Japanese text (as an image or as plain text) on the screen with the mouse and get a popup with the translations.
There are already quite a few tools for this, each with its own instructions. The focus here is on a modular, hackable solution for power users.
Clipboard setup
This solution works by getting the Japanese text into the system clipboard and then reading it from there automatically. The first gotcha is that you may need to install programs like wl-clipboard (for Wayland) or xclip (for X) for programmatic clipboard access.
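To see what programmatic clipboard access looks like, here is a minimal Python sketch that reads the Wayland clipboard through `wl-paste` and checks whether it holds Japanese text. The function names and the character-range check are my own, not part of any tool mentioned here:

```python
import os
import re
import shutil
import subprocess

# Rough "is there Japanese here?" check: hiragana, katakana, CJK ideographs.
JAPANESE = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff]")

def contains_japanese(text):
    """True if the string contains at least one Japanese character."""
    return bool(JAPANESE.search(text))

def read_clipboard():
    """Read the Wayland clipboard as plain text via wl-paste."""
    result = subprocess.run(["wl-paste", "--no-newline"],
                            capture_output=True, text=True)
    return result.stdout

# Only touch the real clipboard when running under Wayland
# with wl-clipboard installed.
if shutil.which("wl-paste") and os.environ.get("WAYLAND_DISPLAY"):
    text = read_clipboard()
    if contains_japanese(text):
        print("Japanese text in clipboard:", text)
```

The same pattern works on X by swapping `wl-paste` for `xclip -selection clipboard -o`.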
To copy normal text to the clipboard you simply use Ctrl-C. To read Japanese displayed on your screen as an image (eg when playing games or reading digital manga), you need to capture the screen and parse the image into Japanese characters:
Partial screen capture
To select a rectangular area of the screen with the mouse on Wayland, I suggest installing grim and slurp and binding the following to a global hotkey (eg Meta+s):
```
grim -g "$(slurp)" - | wl-copy --type image/png
```
- `grim` is a basic program to capture the screen; the `-g` argument specifies the region in `x,y <width>x<height>` format.
- We use `slurp` to select that region with the mouse. It dims the screen slightly and shows a crosshair to select a rectangular area.
- Finally, `wl-copy` (from the `wl-clipboard` package) puts the image in the clipboard in PNG format.

Test that this step works by using the Paste (as new image) feature of your favorite image editor.
On X you can find similar command line tools or use a program such as flameshot.
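If you prefer to bind the hotkey to a script rather than a shell one-liner, the pipeline above can be sketched in Python. This is an assumption-laden sketch (the helper names are mine), but the three commands and their arguments are exactly those from the one-liner:

```python
import os
import shutil
import subprocess

def grim_region_argv(region):
    """Build the grim command that writes a PNG of `region` to stdout."""
    return ["grim", "-g", region, "-"]

def capture_region_to_clipboard():
    """slurp -> grim -> wl-copy, the same pipeline as the shell one-liner."""
    region = subprocess.run(["slurp"], capture_output=True,
                            text=True).stdout.strip()
    if not region:  # selection was cancelled
        return
    png = subprocess.run(grim_region_argv(region), capture_output=True).stdout
    subprocess.run(["wl-copy", "--type", "image/png"], input=png)

# Only run the capture when grim is available and we are on Wayland.
if shutil.which("grim") and os.environ.get("WAYLAND_DISPLAY"):
    capture_region_to_clipboard()
```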
OCR
While more traditional approaches to optical character recognition (OCR) also exist, this is a great use case for modern AI. I suggest using kha-white/manga-ocr. Install it with:
```
python -m venv manga_ocr
source manga_ocr/bin/activate
pip install manga-ocr
```
While the virtual environment is still active, run the program on the command line with `manga_ocr --force_cpu=True` and leave the terminal window open. I suggest the `--force_cpu=True` option when you are playing games or doing other visual work, because in the default GPU mode the program may consume a lot of VRAM and crash your other programs if the VRAM runs out. It is quite fast on the CPU as well.
Test that this step works using the image captured in the previous step. You should see log output, including the detected text, appear in the terminal.
The manga_ocr program is great because it also copies the detected text as plain text to the clipboard. Test that this works with Ctrl-V. Now we are at the same stage as with plain-text Ctrl-C, and need to process the Japanese text in the clipboard.
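Whichever path the text took into the clipboard, the downstream tool needs to notice it arriving. One simple approach, sketched below under my own naming (polling interval and dedup logic are assumptions, not how any particular tool does it), is to poll the clipboard and hand each new Japanese snippet to a callback:

```python
import re
import subprocess
import time

# Hiragana, katakana, CJK ideograph ranges.
JAPANESE = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff]")

def should_process(previous, current):
    """Handle only new, non-empty clipboard text that contains Japanese."""
    return bool(current) and current != previous and bool(JAPANESE.search(current))

def watch_clipboard(handle, poll_seconds=0.5):
    """Poll the clipboard and pass each new Japanese snippet to `handle`."""
    previous = ""
    while True:
        current = subprocess.run(["wl-paste", "--no-newline"],
                                 capture_output=True, text=True).stdout
        if should_process(previous, current):
            handle(current)
            previous = current
        time.sleep(poll_seconds)

# Example (loops forever, so not executed here):
# watch_clipboard(print)
```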
Translating the text
Here you could use a dictionary program that supports reading from the clipboard, such as Tagaini Jisho.
I instead made my own tool so I could better control the presentation: anonee-od/japanese-clipboard-translation. The instructions are on the GitHub page, but in essence it uses the widely used JMdict data via a very simple Python script you can easily tweak.
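To give a feel for what "a simple Python script over JMdict" can look like (this is a toy sketch of mine, not the tool's actual code), here is a lookup built from a tiny inline excerpt in the JMdict XML shape. The real file is far larger and ships with a DTD, but the `entry`/`keb`/`reb`/`gloss` structure is the same:

```python
import xml.etree.ElementTree as ET

# Tiny excerpt imitating the JMdict XML structure.
SAMPLE = """<JMdict>
  <entry>
    <ent_seq>1467640</ent_seq>
    <k_ele><keb>猫</keb></k_ele>
    <r_ele><reb>ねこ</reb></r_ele>
    <sense><gloss>cat</gloss></sense>
  </entry>
</JMdict>"""

def build_index(xml_text):
    """Map each kanji form (keb) and reading (reb) to its English glosses."""
    index = {}
    root = ET.fromstring(xml_text)
    for entry in root.iter("entry"):
        glosses = [g.text for g in entry.iter("gloss")]
        forms = [e.text for e in entry.iter("keb")]
        forms += [e.text for e in entry.iter("reb")]
        for form in forms:
            index.setdefault(form, []).extend(glosses)
    return index

index = build_index(SAMPLE)
print(index.get("猫"))  # ['cat']
```

A real script would stream-parse the full file (it is tens of megabytes) rather than load it with `fromstring`, but the indexing idea is the same.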
One feature I implemented is the ability to post the results with notify-send, which means they can be shown by any notification implementation.
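Posting a result via notify-send is a one-line subprocess call. The formatting below is my own invention for illustration (the actual tool's output format differs); `notify-send` itself takes a title and a body:

```python
import shutil
import subprocess

def format_entry(word, reading, glosses):
    """One-line summary in a made-up style: word 【reading】 gloss1; gloss2"""
    return f"{word} 【{reading}】 " + "; ".join(glosses)

def notify(title, body):
    """Show the translation as a desktop notification via notify-send."""
    subprocess.run(["notify-send", title, body])

line = format_entry("猫", "ねこ", ["cat"])
# Only fire a real notification when notify-send is available.
if shutil.which("notify-send"):
    notify("Translation", line)
```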
In case you have trouble downloading the JMdict dictionary XML file from the official sources (a common issue), here is one copy: https://files.catbox.moe/svua09.7z