User blog:JoshuaJSlone/Updates on working in multiple languages at once

Though I hadn't done much with it in the last few years, finding ways to fill in the non-English wikis with as little repeated work as possible has been a project of mine for a long time. I think this blog entry from late 2014 is the first time I talked about it in a big way. Anyway, I've lately been helping out with the [http://helloproject.fandom.com/da Danish Hello! Project Wiki], and it's got me back to improving the tools and methods I'm using, so I thought I'd share in case anyone is interested.

 * 1. Further improvements to Genericode and processing of it

The biggest part of working with multiple languages has been to make pages full of code phrases like #^Single^# and #^Tracklist^# which get replaced with appropriate words in each language, resulting in a finished but simplistic page. It doesn't have all the detail of one of our English pages, but covers the facts and would make a good starting point for future editors who want to go fancier. Anyway, there are always certain things the code phrases didn't cover or that were different by language, so I've kept a file of little instructions for when actually putting them to use. Like, how to construct an introduction sentence or how dates/numbers/whatever are displayed in that language.

Czech:
Days in dates have an ordinal dot: 30. Duben 2014
Numbers don't use a comma for separating thousands, but a space.
Intro sentence: SINGLENAME je X. singl skupiny GROUP.

German:
Days in dates have an ordinal dot.
Numbers use a dot for thousands separation.
Intro sentence: SINGLENAME ist die 1. Indie-Single von GROUP. ALBUMNAME ist das 1. Album von GROUP.

Anyway, I finally decided to automate some of these things. Added in rules for certain languages so it tries to detect where dates are written to add in those dots. Created some new genericodes like #^introIsTheAlbum^# and #^introOf^# that can be used in intro sentences so those mostly don't need to be touched, either. Numbers written on the pages use #^ThousandsSeparator^# to get the appropriate dot or space or comma or whatever, and the Oricon templates take care of numbers presented in those tables. Of course nothing is perfect or covers every situation, but things are more hands-off than ever before.
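To make the idea concrete, here is a minimal Python sketch of the kind of search/replace described above: code phrases are swapped for per-language words, #^ThousandsSeparator^# becomes the right separator, and languages that want an ordinal dot get one added after the day in "day month year" dates. The table contents, the language codes, and the names `TABLES`, `ORDINAL_DOT_LANGUAGES`, and `expand` are all assumptions for illustration, not the actual Genericode tool or its word lists.

```python
import re

# Illustrative tables only: the real Genericode word lists are not shown in
# this post, so these keys and translations are assumptions.
TABLES = {
    "en": {"Single": "Single", "Tracklist": "Tracklist", "ThousandsSeparator": ","},
    "de": {"Single": "Single", "Tracklist": "Titelliste", "ThousandsSeparator": "."},
    "cs": {"Single": "Singl", "Tracklist": "Seznam skladeb", "ThousandsSeparator": " "},
}

# Languages whose dates take a dot after the day (per the notes above).
ORDINAL_DOT_LANGUAGES = {"cs", "de"}

# Matches code phrases such as #^Single^# or #^introOf^#.
CODE_PHRASE = re.compile(r"#\^(\w+)\^#")

# Matches "30 April 2014": a 1-2 digit day, a word made of letters, a 4-digit year.
DAY_MONTH_YEAR = re.compile(r"\b(\d{1,2}) ([^\W\d]+ \d{4})\b")


def expand(text, lang):
    """Replace code phrases for one language and apply its date rule."""
    table = TABLES[lang]

    def lookup(match):
        # Unknown code phrases are left in place so they are easy to spot later.
        return table.get(match.group(1), match.group(0))

    text = CODE_PHRASE.sub(lookup, text)
    if lang in ORDINAL_DOT_LANGUAGES:
        text = DAY_MONTH_YEAR.sub(r"\1. \2", text)
    return text


sample = "#^Single^#, 30 April 2014, 12#^ThousandsSeparator^#345 copies"
print(expand(sample, "en"))  # Single, 30 April 2014, 12,345 copies
print(expand(sample, "de"))  # Single, 30. April 2014, 12.345 copies
```

Leaving unknown phrases untouched, rather than deleting them, is a design choice in this sketch: anything the tables don't cover stays visible for a human to fix by hand.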

 * 2. Copying of necessary templates, etc.

Years back I made a list of templates that needed to be copied to new wikis that wanted to have the same kind of functionality we've got here. However, since then ours have just kept being added to, so it's now a pretty hefty list. It's too late to take much advantage of it for the Danish wiki, but if it's needed again I think I've found a better way.

I use Pywikibot to help make interwiki links between the different languages of H!P Wiki. These are the links you see at the bottom of a page ("Languages: Česky Deutsch Français Italiano 日本語") that point you to the equivalent pages in the other languages. Since it's been a few years, the old version of Pywikibot I was familiar with was no longer working. While having trouble getting newer versions to work, I learned about more functions that software allows, including copying pages from one wiki to another. It can do a given page, a category, or a list of pages supplied some other way. Next time a place needs 50 templates from here, it will be much easier to do.
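As a rough sketch of what copying a list of pages between wikis could look like in Python: the function below only relies on page objects having a `.text` attribute and a `.save(summary=...)` method, which is how `pywikibot.Page` works. The function names, the edit summary, and the family name "helloproject" are hypothetical; a Fandom wiki would need its own Pywikibot family file set up first, and that setup is not shown here.

```python
def copy_pages(titles, source_site, target_site, make_page,
               summary="Copied from the English wiki"):
    """Copy each titled page from source_site to target_site.

    make_page(site, title) must return an object with a .text attribute and
    a .save(summary=...) method. With Pywikibot that is pywikibot.Page.
    """
    copied = []
    for title in titles:
        source = make_page(source_site, title)
        target = make_page(target_site, title)
        target.text = source.text       # overwrite with the source wikitext
        target.save(summary=summary)    # write the page on the target wiki
        copied.append(title)
    return copied


def copy_with_pywikibot(titles):
    """Wire copy_pages up to real wikis (needs a configured Pywikibot)."""
    # Imported here so copy_pages can be tried out without Pywikibot installed.
    import pywikibot

    # "helloproject" is a hypothetical family name, used only for illustration.
    source = pywikibot.Site("en", "helloproject")
    target = pywikibot.Site("da", "helloproject")
    return copy_pages(titles, source, target, pywikibot.Page)
```

Calling something like `copy_with_pywikibot(["Template:Example"])` would then copy that one page; the title here is just a placeholder.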

It will also help with keeping some things up to date. For instance, the MB templates we use to make the member boxes. It's easy to copy that over to another wiki, but they're being changed over here all the time. New people added. New profile photos replacing old ones. And so unless you keep keep keep copying it over, the external versions fall behind. I haven't done this yet, but it should be a fairly simple thing to set up some commands to automatically copy pages like this. Run it every few weeks or whatever, everything stays nice and up to date.
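One way such a periodic refresh might be structured is to write a page only when its text has actually changed, so repeated runs don't create pointless edits. In this sketch the two wikis are stood in for by plain dictionaries mapping titles to wikitext; in practice the reads and writes would go through a bot framework such as Pywikibot. The function name and the page titles are made up for the example.

```python
def sync_pages(titles, source, target):
    """Bring target up to date with source for the given page titles.

    source and target are dict-like objects mapping page title -> wikitext.
    Returns the titles that were actually updated.
    """
    updated = []
    for title in titles:
        if title not in source:
            # Nothing to copy; skip rather than blank the target page.
            continue
        if target.get(title) != source[title]:
            target[title] = source[title]
            updated.append(title)
    return updated


# Stand-in data: one page has changed on the English wiki, one has not.
english = {"Template:MB": "new photo", "Template:Oricon": "unchanged"}
danish = {"Template:MB": "old photo", "Template:Oricon": "unchanged"}
print(sync_pages(["Template:MB", "Template:Oricon"], english, danish))
# ['Template:MB']
```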

 * 3. 1 + 2 = 3

This is just in the idea phase right now, but if

1) there are Genericode pages that can be processed to a point where they need almost no fixing up, and
2) there are ways to add pages by way of bot that are within my understanding, then...
3) might I be able to tell it to automatically take a bunch of Genericode pages and add them to a new wiki, rather than doing every single one by hand?

I don't see why not. I'm not very familiar with working with Python directly, but the core of the Genericode processing is just a bunch of search/replace, so it shouldn't take a lot of deep understanding to move the same functions over.

Alternate 3) Instead of a bunch of pages added to one wiki at once, it would also mean I could make a new Genericode page and tell it to add itself to Czech, Danish, German, French, Spanish, and Italian without having to do each one separately.
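Putting the two halves together could look roughly like the sketch below: one Genericode page is expanded once per language and each result is handed to a save function, which in practice would be a bot call that writes to that language's wiki. Everything named here (`expand`, `publish_everywhere`, the tables, the sample page) is hypothetical. Skipping any language that still has unresolved code phrases is an assumption of this sketch, meant to keep half-finished pages from being posted automatically.

```python
import re

# Matches code phrases such as #^Single^#.
CODE_PHRASE = re.compile(r"#\^(\w+)\^#")


def expand(text, table):
    """Replace each code phrase using table; unknown phrases stay as they are."""
    return CODE_PHRASE.sub(lambda m: table.get(m.group(1), m.group(0)), text)


def publish_everywhere(title, genericode, tables, save_page):
    """Expand one Genericode page for every language and save each result.

    tables maps a language code to its word table.
    save_page(lang, title, text) does the actual writing, e.g. via a bot.
    Returns a dict of language code -> text for the pages that were saved.
    """
    saved = {}
    for lang, table in tables.items():
        text = expand(genericode, table)
        if CODE_PHRASE.search(text):
            # Something was not translated; leave this one for a human.
            continue
        save_page(lang, title, text)
        saved[lang] = text
    return saved


# Illustrative tables; the Danish one is deliberately missing a word.
tables = {
    "de": {"Single": "Single", "Tracklist": "Titelliste"},
    "da": {"Single": "Single"},
}
page = "#^Single^#\n#^Tracklist^#"
result = publish_everywhere("Example Page", page, tables,
                            lambda lang, title, text: None)
print(sorted(result))  # ['de']
```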