Re: (Finally Complete) Telefang English Translation

kmeisthax · Post by **kmeisthax** » Mon Aug 22, 2016 6:03 am

So I've been doing some work on integrating a text extractor and inserter into the disassembly! It -mostly- works; however, there are two big problems, both of which have to be solved before the Japanese branch of the disassembly can be byte-for-byte correct:

1. Text tables with multiple pointers aliased to the same string. I can represent this by adding an «ALIAS ROW 0xwhatever» code in place of a string to instruct the inserter to alias the pointer again. In fact, the extractor portion of the same script automatically does this; but the wikitext doesn't have these references. And they aren't easy to insert.
2. Text tables that are missing strings. For some reason, a few strings sit between the terminated end of one string and the next pointer's target; so I have to manually insert these into the wikitext that the extractor generates for now. There's a lot more unindexed text than I expected though, so I -may- adjust the extractor to detect this particular case and generate the appropriate wikitext. The inserter recognizes this case by seeing a row with (No pointer) in the Pointer column, which appears to be what the wiki does.
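
A minimal sketch of how an extractor could flag both cases, not the actual script in the disassembly. It assumes the pointers have already been converted to offsets into the string data, and that strings end with a single terminator byte; the `TERMINATOR` value and the `classify_rows` name are placeholders for illustration.

```python
from typing import Dict, List, Tuple

TERMINATOR = 0xE0  # placeholder value, NOT confirmed to be the game's terminator


def classify_rows(pointers: List[int], data: bytes,
                  terminator: int = TERMINATOR
                  ) -> Tuple[Dict[int, int], List[Tuple[int, int]]]:
    """Return (aliases, gaps) for one text table.

    aliases maps a row index to the earlier row that shares its pointer.
    gaps lists (start, end) byte ranges that no pointer accounts for,
    i.e. candidate unindexed strings.
    """
    first_seen: Dict[int, int] = {}
    aliases: Dict[int, int] = {}
    for row, ptr in enumerate(pointers):
        if ptr in first_seen:
            aliases[row] = first_seen[ptr]  # same target as an earlier row
        else:
            first_seen[ptr] = row

    gaps: List[Tuple[int, int]] = []
    cursor = 0  # first byte not yet covered by a pointed-to string
    for ptr in sorted(first_seen):
        if ptr > cursor:
            gaps.append((cursor, ptr))  # bytes between strings, unindexed
        end = ptr
        while end < len(data) and data[end] != terminator:
            end += 1
        cursor = max(cursor, end + 1)  # skip past the terminator
    return aliases, gaps
```

Each entry in `aliases` would become an «ALIAS ROW» cell, and each range in `gaps` would become a (No pointer) row.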

So some more headaches before we have a working Japanese text inserter, and then I have to do it again in the patch branch. No, I can't just pull the dumps on the wiki; I specifically want the assembled patch to, again, byte-for-byte match what our old IPSes have in them. Rationale is that my goal is to have a complete disassembly of the patch's changes. Once we have a complete disassembly of such, then we can start importing the new translations from the wiki, add new features to the patch, etc.

Also, someone on IRC has been expressing interest in translating to German using our patch; so I will need to eventually add German columns for all the text dumps on the Wiki. However, this particular translator wants to use Excel to manage the text dumps. This sounds stupid, but after trying to get the SMS text block to have the proper aliasing instructions and realizing it would be an hour and a half of copy-paste, I'm beginning to see the advantage of, say, converting all our textblocks to CSV files. Wikitext isn't exactly copypaste friendly.

This would mean abandoning the Wiki entirely (unless you want to stick CSV source right onto Mediawiki) and sticking the CSV files straight into the Git repository. I don't know if that's any more translator friendly - Git is powerful software for developers, but the UI is absolutely atrocious. I don't know if I necessarily want to have to teach every translator how to resolve merge conflicts in CSV files when doing so is only practically possible from within a text editor. Translators would have to submit their CSVs to developers to get compiled and injected. But I can't see any better options.

DaVince · Post by **DaVince** » Mon Aug 22, 2016 6:18 pm

kmeisthax wrote:This sounds stupid, but after trying to get the SMS text block to have the proper aliasing instructions and realizing it would be an hour and a half of copy-paste, I'm beginning to see the advantage of, say, converting all our textblocks to CSV files. Wikitext isn't exactly copypaste friendly.

CSV specifically seems kinda messy to use for this, as the translations will certainly be using special characters like quotes, commas, semicolons and newlines a lot. Not just that, but it becomes a mess to look at in text form. I'd prefer something that looks clean so it more easily serves the people translating it.

Maybe this?

Code:

#pointer
Original Japanese (can be multiple lines)
 <empty line to separate the Japanese from the translated text>
Translated text (can be multiple lines)

That would end up looking like this:

Code:

#0x1200d2
この おおきな きから
デンジュウカイヘ いけるという うわさは ほんとなんでしょうか デンジュウカイ... ... たのしいことが いっぱい あるんでしょうね いってみたいなぁ....

I've heard that you can get to the Denjuu World through this giant tree...
I wonder if it's true? The Denjuu World... I bet there's a lot of fun things to do there... I wish I could go see it...

#0x1200d4
あ...あぶなかったぁ...
ここまで ボ-ルを とばせる ようなひとは ただひとリ..

Th-That was close!
There's only one guy I know who could hit a ball this far...

#0x1200d6
...

It's readable in any text editor by anyone wanting to translate, and it seems easy enough to parse, right? It's even Markdown compatible. And untranslated lines could just have a little dash inserted or something.
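
To back up the "easy enough to parse" part, here is a rough sketch of a parser for the format above. The function name `parse_blocks` is made up, and it assumes every entry starts with a `#` line and that the first empty line inside an entry separates the Japanese from the translation.

```python
from typing import Dict, List, Optional, Tuple


def parse_blocks(text: str) -> Dict[int, Tuple[str, str]]:
    """Map each pointer to an (original, translated) pair."""
    entries: Dict[int, Tuple[str, str]] = {}
    pointer: Optional[int] = None
    body: List[str] = []

    def flush() -> None:
        # Store the entry collected so far, if there is one.
        if pointer is None:
            return
        joined = "\n".join(body).strip("\n")
        # The first empty line splits Japanese from translation.
        original, _, translated = joined.partition("\n\n")
        entries[pointer] = (original, translated.strip("\n"))

    for line in text.splitlines():
        if line.startswith("#"):
            flush()
            pointer = int(line[1:], 16)  # "#0x1200d2" -> 0x1200d2
            body = []
        else:
            body.append(line.rstrip())
    flush()
    return entries
```

Two limits of this sketch: a text line that itself begins with `#` would be read as a new pointer, and a string that contains an empty line can't be represented without adding some kind of escape.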

kmeisthax wrote:This would mean abandoning the Wiki entirely (unless you want to stick CSV source right onto Mediawiki) and sticking the CSV files straight into the Git repository. I don't know if that's any more translator friendly - Git is powerful software for developers, but the UI is absolutely atrocious. I don't know if I necessarily want to have to teach every translator how to resolve merge conflicts in CSV files when doing so is only practically possible from within a text editor. Translators would have to submit their CSVs to developers to get compiled and injected. But I can't see any better options.

I had to think about it for a while, but I think it's a good idea to keep using the wiki. It's easy to edit, which is quite a useful strength, especially if that's all you want to focus on. I know I don't want to unnecessarily mess with Git; I just want to put the new translations on there at some point!

How about the following?
- The wiki pages now use the new format.
- The Git repo already has the text files with the Japanese and English lines in them.
- We also have a script that fetches the wiki pages and puts them into those text files. It's completely optional to run this file, and it would basically 'update' whatever is in the text files based on whatever's been changed/added on the wiki. If any local text files have been changed, it would warn you that it's going to overwrite them so you don't lose whatever you replaced the text with.
- The ROM can be reassembled regardless of whether you decided to run this script. The script is just there to fetch the latest and greatest English text.

Now, these are all just thoughts and ideas. I've taken way too long writing this post now. Lemme know what you think, be it in here or on IRC.

kmeisthax · Post by **kmeisthax** » Mon Aug 22, 2016 11:49 pm

The point of using CSV is so that you can import the translation tables into a spreadsheet program, which is easier to work with than wikitext is. For example, if I want to add a new language column with placeholders, I'd have to manually add a column marker for every row in a 255-string text bank, across about 20 blocks in the game. Whereas in Excel/LibreOffice it's a matter of adding the new heading, adding one row with an address, then click-dragging to extend that pattern through every row in the table. The spreadsheet has the advantage of being more WYSIWYG, whereas wikitext isn't. In fact, the rendering of wikitext after it's been saved doesn't even match what the string is supposed to be, given that manual newlines don't always show up when the wikitext is rendered on the site.

We would not want people to be hand-editing CSVs; that would be worse than the problems I have with the wikitext. But the underlying format supports special characters, UTF-8, and punctuation just fine; and in fact you wouldn't believe how many large businesses use CSVs internally to transfer data. And there are plenty of well-tested libraries for parsing them, escaping special characters, and so on. Not to mention that special characters are actually a worse problem in wikitext, given that there are more of them and the escape solution is the not-so-elegant <nowiki> tag. (Which my parser/injector does not handle at all at the moment.)
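
As a quick illustration of the escaping point, here is a small round trip through Python's standard `csv` module. The column names and the English cell are made-up sample data, and `round_trip` is just a helper name for this example.

```python
import csv
import io
from typing import List


def round_trip(rows: List[List[str]]) -> List[List[str]]:
    """Write rows as CSV text, then read them straight back."""
    buffer = io.StringIO(newline="")  # let the csv module control newlines
    csv.writer(buffer).writerows(rows)
    buffer.seek(0)
    return list(csv.reader(buffer))


rows = [
    ["Pointer", "Japanese", "English"],
    # Quotes, a comma, a semicolon and an embedded newline in one cell.
    ["0x1200d4", "あ...あぶなかったぁ...",
     'Th-That was "close", right?\nSecond line; with a semicolon'],
]

assert round_trip(rows) == rows
```

The writer quotes any cell that needs it and doubles embedded quote characters, so nobody has to escape anything by hand as long as the files are edited through a spreadsheet or a library.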

Someone on IRC mentioned Google Drive's spreadsheets as another possible solution; I don't know how I feel about that yet. We'd have a central hub for our translations still, but one of my plans was to be able to pull data from the wiki into the Git project. Back about 5 years ago I wrote a data visualizer that could pull from a publicly shared Google Doc; but I need to investigate if this can still be done (a lot of companies have gotten on the "every fscking API call needs to be authenticated" craze...) and if it can be done with a spreadsheet. If that's the case, then we could stick our translations on Google Drive, the translators would stick their strings there, and the CSVs would just be an intermediate format between that and the compiled string tables.

Sticking newlines in between rows is going to have issues, because script editors are free to (and, in fact, need to) inject manual newlines as appropriate. If we decide that two newlines are the row divider, and one newline is just an E2, then we can't properly represent, say, a string with an empty line in it. IDK if this exists in either script yet, but I don't like file formats that lack the ability to escape special characters. Also, we still have the same problem I mentioned above of sweeping changes to the text table being difficult to enact.

kmeisthax · Post by **kmeisthax** » Wed Aug 24, 2016 5:56 am

Short status update: I am still attempting to get a perfect disassembly of the Japanese script text that will compile back into an identical ROM. I have learned that unethical pointer torture was standard practice at SmileSoft, and the result has been gutwrenching. As a result, the ALIAS ROW command gained an extra operand today, an optional INTO parameter which lets you add an arbitrary byte count to the aliased pointer. This is because of two strings that point to the middle of other strings in the table. Additionally, I also had to add more robust unindexed text detection, as some text was dummied out by moving the pointer forward. The translators may find the removed line "きをつけていくんだぞ", originally placed before the string pointed to by 0x12401e, to be particularly interesting.
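
A rough sketch of how an inserter might resolve such a cell. The exact cell syntax here («ALIAS ROW 0x1F INTO 0x4») is an assumption made for illustration, as is treating the first operand as a row index; `parse_alias` and `resolve_pointer` are hypothetical names, not the real script's.

```python
import re
from typing import Dict, Optional, Tuple

# Assumed syntax: «ALIAS ROW <hex row>» with an optional « INTO <hex bytes>».
ALIAS_RE = re.compile(
    r"^«ALIAS ROW (0x[0-9A-Fa-f]+)(?: INTO (0x[0-9A-Fa-f]+))?»$"
)


def parse_alias(cell: str) -> Optional[Tuple[int, int]]:
    """Return (row, byte_offset) for an alias cell, or None for plain text."""
    match = ALIAS_RE.match(cell.strip())
    if match is None:
        return None
    row = int(match.group(1), 16)
    offset = int(match.group(2), 16) if match.group(2) else 0
    return row, offset


def resolve_pointer(cell: str, row_pointers: Dict[int, int]) -> Optional[int]:
    """Pointer to emit for an alias cell: the target row's pointer plus INTO."""
    alias = parse_alias(cell)
    if alias is None:
        return None
    row, offset = alias
    return row_pointers[row] + offset
```

With no INTO operand the offset is zero, so the plain alias case falls out of the same code path.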

Despite this, a clean extraction still eludes me. I hesitate to wonder what other kinds of horrors lurk below, in the true laboratory of Sanaeba Research Center.

EDIT: Or maybe I'll fix one bug and the rest of the disassembly will work perfectly. Onto patchlands...

tymime(imported) · Post by **tymime(imported)** » Mon Sep 12, 2016 10:26 pm

kmeisthax wrote:Someone on IRC asked about replacing the voice sample, so I wrote some machinery in our disassembly to handle PCM samples. From what we discussed, I can guess that the voice sample is roughly 4-bit 16kHz mono; it's stored in gfx/title/voice_sample.wav and you can replace this with anything. I highly recommend saving your file as 8-bit 16kHz WAV for now.

Note - I'm not 100% on 16kHz, but it generates something that sounds similar to the actual game. The sample rate the game uses is based on the CPU timings of this particular function (Sound_PlaySampleFragment), so I would have to actually count samples and then divide against the base clock to figure it out. (Also, holy crap how does this even work correctly on emulators?)

You can either inject the file into the unpatched ROM or the patched version, depending on if you use the master or patch branch. Replace the WAV file and run make to generate the new PCM data. Note that for really, really stupid reasons (I'm too lazy to move metatable generation into a script and make the Makefile call it) we can't determine the actual sample count of your file yet, so you also have to adjust gfx/samples.asm. Look for the "dsample" macro's second parameter ($789F) and replace that with the number of samples in your WAV file, in hex. Note that isn't the number of bytes in the PCM file, as it's packed 4-bit (which is also why we thought it was 8kHz at one point).

My copy of the disassembly which has the voice sample exists here: https://github.com/kmeisthax/telefang It should show up in Sanqui's branch eventually when I can be arsed to write a pull request.

P.S. The way the sample system works is that it basically sets up channels 1 and 2 to output square waves, then changes their volume to match the 4-bit sample data, wastes CPU cycles for timing purposes, and repeatedly does this until it runs out of samples. Naturally this is a blocking operation, which is why the whole game pauses when that sample plays. But you'd be surprised to note that this game actually has support for more samples as well as multi-bank samples... none of which can be taken advantage of, as there's no space for a larger sample table. For more info check the source at components/sound/samples.asm
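
For anyone preparing a replacement sample, here is a rough sketch of the sample-count versus byte-count distinction from the quoted post. It assumes unsigned 8-bit input, that the top four bits of each sample are kept, and that the first sample of each pair goes in the high nibble; the nibble order and the zero padding are guesses, and `pack_4bit` is not a function from the disassembly.

```python
from typing import Tuple


def pack_4bit(samples_8bit: bytes) -> Tuple[bytes, str]:
    """Pack unsigned 8-bit samples two per byte.

    Returns the packed data and the sample count formatted as a
    $-prefixed hex literal, the style of the dsample parameter ($789F).
    """
    nibbles = [sample >> 4 for sample in samples_8bit]  # keep the top 4 bits
    if len(nibbles) % 2:
        nibbles.append(0)  # pad to a whole byte (assumed behaviour)
    packed = bytes(
        (nibbles[i] << 4) | nibbles[i + 1]  # high nibble first (assumed)
        for i in range(0, len(nibbles), 2)
    )
    return packed, f"${len(samples_8bit):04X}"
```

For a count of $789F (30879 samples) the packed data comes to 15440 bytes, which is why the number to put in gfx/samples.asm is the sample count and not the size of the PCM file.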

What does the voice in the intro translate to? I'm actually an amateur voice actor, and I could easily record something and export it to whatever format is needed.
Of course, I'm a guy, but I can make my voice pretty high if you guys want something like the original.

DaVince · Post by **DaVince** » Mon Sep 12, 2016 10:34 pm

It's the title of the game, so according to our translation that would be "Mobile Phone Beast Telefang!".

The discussion topic on the chat came up because of a German guy translating the game into German, and the voice sample kinda came up. Would definitely be neat to eventually have one for the English version too, but I dunno, the English translation project tries to not localize where it's not entirely necessary (for a more authentic Japanese feel, I suppose). So uhhh. I dunno? I wonder what the rest of the team thinks of this.

You can always stop by and discuss it on the chat, too. It tends to be a little more active!

tymime(imported) · Post by **tymime(imported)** » Tue Sep 13, 2016 1:07 am

Here's my recording- mono 8-bit 16kHz WAV. I figured it was best to post it here so it'd be more accessible.
http://www.mediafire.com/download/xltk63ab23z9qtd/voice_sample.wav

It's a little silly, but I was trying to recreate the excited Japanese feel of the original. It doesn't sound half bad considering how compressed it is.
Let me know what you guys think!

DaVince · Post by **DaVince** » Wed Sep 14, 2016 3:09 pm

Not gonna lie - I love it, but I love it because it sounds so incredibly much like Luigi is saying it. xD

tymime(imported) · Post by **tymime(imported)** » Wed Sep 14, 2016 9:29 pm

DaVince wrote:Not gonna lie - I love it, but I love it because it sounds so incredibly much like Luigi is saying it.

I thought it sounded a little like Mario. It's probably because Italian and Japanese accents share a few vowel sounds. Very few, but still. Plus the pitch was about the same.
I'm glad you like it though.

Imaynotbehere4long · Post by **Imaynotbehere4long** » Tue Sep 27, 2016 12:46 am

tymime wrote:Here's my recording- mono 8-bit 16kHz WAV. I figured it was best to post it here so it'd be more accessible.
http://www.mediafire.com/download/xltk63ab23z9qtd/voice_sample.wav

It's a little silly, but I was trying to recreate the excited Japanese feel of the original. It doesn't sound half bad considering how compressed it is.
Let me know what you guys think!

I don't really think it works. It's said much slower than the original, so it comes off as more of a sarcastic excitement than anything. The accent doesn't help matters much, either (though I know you can't do much about that, so sorry if that came across as rude).

EDIT: In retrospect, I realize you were going for a "game show host" vibe, but like I said, I don't think it really works. Sure, the original is "excited," but there's also a sense of, for lack of a better word, seriousness in the voice.
