Watercrown Productions DevBlog
Friday, 7 April 2006
The Do-It-Yourself Corner, Chapter 2: Are We Having Fun Yet?
Topic: The Do-It-Yourself Corner

So few people seem to be reading this...it's got me a little bit worried. ^_^;

DON'T PANIC!: The translation project is still proceeding. I haven't forgotten any of you. The final (to my knowledge) script block has been dumped and I'm formatting it as we speak. This particular one is full of all the little messages that you expect to see in RPGs when you pick up items or perform tasks.

I also discovered that my VWF has one last kink to work out: the VWF will be more or less perfected after I hone this final element. I think.

So without further ado, Chapter 2: Making the Table. Or whatever.

Like I said before: everything in a ROM is based around numbers. Those numbers mean different things depending on context: it's like having a language that only has 256 words, but their meaning and required grammar are completely different depending on whether our little 1337-speaker is, say, sitting in the bathroom, driving his car or ordering a cappuccino.

Graphics, thanks to hardware conveniences, are usually stored in a format that someone somewhere has fully documented and fully implemented in a tile editor. Text is another matter: there are as many possible ways to store text in a game as there are books in Borges' Library of Babel. Sometimes you get lucky, and a game uses an established encoding format: when you sit down at Notepad and type up a document, you are in most cases using the standard ASCII encoding system (interestingly, Sylvanian Families uses part of the ASCII set). Japanese also has many standards, such as EUC, JIS and its variants, and the variants and subsets of the Unicode system.

If you've ever cracked an alphanumeric 3-18-25-16-20-15-7-18-1-13, you're familiar with the concept behind a table. There are, however, three differences:

1. The encoded text will very likely not be signposted in any helpful way. You'll have to hunt for it in a sea of non-text gibberish data.

2. The cipher you're trying to crack isn't written in English: it's written in Japanese. The classic "Etaoin Shrdlu" will not save you.

3. You won't just be working out letters/symbols. Some numbers do not represent text, but are nonetheless significant. Line breaks, page breaks, text effects...all of these are represented with their own numbers, and are collectively referred to as "control codes." Ignore them only at your peril. (Sylvanian Families uses so many of these that its script can be considered a compiled programming language.)
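To make the comparison concrete: the alphanumeric cipher mentioned above is really just a tiny table where each number stands for one letter. Here's a rough sketch in Python, assuming the usual A=1 through Z=26 scheme (the names TABLE and decode_a1z26 are mine, purely for illustration):

```python
import string

# Build the "table": 1 -> A, 2 -> B, ... 26 -> Z.
TABLE = {i: letter for i, letter in enumerate(string.ascii_uppercase, start=1)}

def decode_a1z26(cipher: str) -> str:
    """Turn a dash-separated run of numbers into letters using TABLE."""
    return "".join(TABLE[int(n)] for n in cipher.split("-"))

print(decode_a1z26("3-18-25-16-20-15-7-18-1-13"))  # CRYPTOGRAM
```

A game's table works the same way in principle, except that nobody hands you the mapping and some of the numbers aren't letters at all.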

Right. I won't insult your intelligence any longer with vague references to "numbers." If you know anything about computers, you know they don't count "one, two, three". These numbers I've made so much of come down to rows upon rows of tiny on/off switches. Natively, a computer "thinks" in binary, or "base 2": it uses the digits 0 and 1 and a place value system to represent every number from 0 to whatever. (An old joke says that there are 10 kinds of people in the world: those who understand binary and those who don't.)

Of course, most of your work will probably never delve directly into the magical world of binary. Binary digits, or "bits", are grouped in sets of eight, called "bytes", which can individually represent every value from 0 to 255, and this arrangement paves the way for a go-between linking the base 2 world of the computer's brain and the base 10 world of ours: the hexadecimal (base 16) number system. It uses the digits 0-9 as well as the letters A-F to represent the values 0 through 15, and place value to accomplish the rest. The "tens" column represents "sixteens": 10h is 16, 20h is 32, 30h is 48, all the way up to FFh, which is (15 * 16) + 15, or 255. So FF, the highest two-digit hexadecimal number, also conveniently represents the maximum number a byte can represent. (Also conveniently, 100h comes out to a clean 256.)
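If you'd like to check the arithmetic yourself, Python will happily do the conversions. This is only a quick sketch; the two helper names are my own and not part of any romhacking tool:

```python
def hex_to_int(text: str) -> int:
    """Read a string of hex digits as a number, e.g. 'FF' -> 255."""
    return int(text, 16)

def to_hex_byte(value: int) -> str:
    """Format a number from 0 to 255 as two uppercase hex digits."""
    return f"{value:02X}"

for text in ("10", "20", "30", "FF", "100"):
    print(f"{text}h = {hex_to_int(text)}")

# The place-value sum, spelled out: fifteen sixteens plus fifteen ones.
print(15 * 16 + 15 == hex_to_int("FF"))

# Eight bits, all switched on, is the largest value one byte can hold.
print(0b11111111 == 0xFF)

print(to_hex_byte(255))  # FF
```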

This is the "hex" in "hex editor": you will be looking at the ROM as byte data, represented in the form of hexadecimal numbers. Exciting, isn't it?
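For a feel of what the hex editor is showing you, here's a bare-bones sketch of a hex view in Python. The hex_dump function and its layout are my own invention (real editors like WindHex show a lot more), and the sample bytes are made up rather than taken from any ROM:

```python
def hex_dump(data: bytes, width: int = 16) -> list[str]:
    """Format bytes as rows of 'offset: XX XX XX ...', like a hex editor pane."""
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        rows.append(f"{offset:06X}: " + " ".join(f"{b:02X}" for b in chunk))
    return rows

# Stand-in data; for a real ROM you'd read the file with open(path, "rb").read().
sample = bytes(range(0x20))
for row in hex_dump(sample):
    print(row)
```

Note that the offsets down the left side are in hex too, which is why 100h being a clean 256 turns out to be so handy.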

Now, down to business. Open up JWPce (or whatever) and load your ROM in Tile Molester (again, or whatever). Search the ROM for the font. Tile Molester should choose the appropriate format on its own, but on occasion you'll need to tweak it. SNES/SFC ROMs store graphics in a 4bpp format (you don't need to understand what that means right now), but the font is usually stored in a 2bpp format (the exact same one used by Game Boy, in fact). It might be stored in an unusual format, so if you don't find it using the default, try a different one.
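You can skip this bit, but in case you're curious what "2bpp" means: each pixel gets two bits, so four possible colors. Below is a rough Python sketch of decoding one 8x8 tile, assuming the Game Boy layout (two bytes per row, the first holding the low bit of every pixel and the second the high bit). The function name and the tile data are made up for the demonstration:

```python
def decode_2bpp_tile(tile: bytes) -> list[list[int]]:
    """Decode one 8x8 tile (16 bytes) into rows of color indices 0-3.

    Assumes the Game Boy layout: two bytes per row, the first holding the
    low bit of each pixel and the second the high bit, leftmost pixel in bit 7.
    """
    if len(tile) != 16:
        raise ValueError("a 2bpp 8x8 tile is exactly 16 bytes")
    rows = []
    for y in range(8):
        low, high = tile[2 * y], tile[2 * y + 1]
        row = []
        for x in range(8):
            bit = 7 - x  # bit 7 is the leftmost pixel
            row.append((((high >> bit) & 1) << 1) | ((low >> bit) & 1))
        rows.append(row)
    return rows

# Made-up tile data, not from any real font.
tile = bytes([0xFF, 0x00,   # row 0: low plane only  -> color 1 all the way across
              0x00, 0xFF,   # row 1: high plane only -> color 2 all the way across
              0x81, 0x81])  # row 2: both planes at the edges -> color 3 at each end
tile += bytes(10)           # rows 3-7: blank

for row in decode_2bpp_tile(tile):
    print("".join(" .:#"[color] for color in row))
```

This is what a tile editor is doing for you behind the scenes; pick the wrong format and the same bytes come out as static.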

If you can't find the font and you've tried all the different formats, then choose a different project. Its time will come. You are but a learner now, but when next you meet, you'll be the master...and you'll have a neat helmet, a red lightsaber and James Earl Jones doing your voice. Or something like that.

Games often have multiple fonts: if you find one that's not the main font, note its location anyhow. Sylvanian Families actually had me stymied for a while because even though I found the actual font the game printed, it was not in the same order as the game's table: another font elsewhere, however, was. Once we've found the font, which we will assume is in the same order as the table, we go on to the next step, which actually involves playing the game. ^_^

Even if you can't actually understand a word of it, watch for any patterns of characters in the game's text that are close together in the font. We'll need to know such chinks in the game's armor for our final step, which involves our hex editor's special power tool for table making: the Relative Search function. Relative Search takes a pattern of characters and searches for it in the ROM by comparing the gaps between values rather than the values themselves: if you searched for "king", it would find every set of four bytes that sit the same distance apart from one another as the letters k, i, n and g do in the alphabet (which, in theory, will help us determine what values represent what letters). For example, if all our relative search matches for "king" are identical, then it's safe to assume we've found the values for "k", "i", "n" and "g", and we can extrapolate the rest with the aid of the font.
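Here's roughly what a relative search boils down to, sketched in Python. This is my own minimal version, not how WindHex or any other editor actually implements it, and the "ROM" is a handful of invented bytes in which the letter "a" happens to be stored as 80h:

```python
def relative_search(data: bytes, pattern: str) -> list[int]:
    """Return every offset where the bytes rise and fall like `pattern` does."""
    if len(pattern) < 2:
        raise ValueError("need at least two characters to compare")
    # Differences between neighbouring characters, e.g. "king" -> [-2, 5, -7].
    deltas = [ord(b) - ord(a) for a, b in zip(pattern, pattern[1:])]
    hits = []
    for start in range(len(data) - len(pattern) + 1):
        window = data[start:start + len(pattern)]
        if all(window[i + 1] - window[i] == d for i, d in enumerate(deltas)):
            hits.append(start)
    return hits

# Invented data: "king" stored with a = 80h, so k = 8Ah, i = 88h, n = 8Dh, g = 86h.
rom = bytes([0x00, 0x12, 0x8A, 0x88, 0x8D, 0x86, 0xFF])
hits = relative_search(rom, "king")
print(hits)  # [2]

# From one hit we can work backwards to the value used for "a".
value_of_a = rom[hits[0]] - (ord("k") - ord("a"))
print(f"{value_of_a:02X}")  # 80
```

Short patterns turn up false matches all over a real ROM, which is why you want a longer word, or several hits that all agree.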

The same thing applies here, only we'll be doing it with Japanese characters. We can't search for "king", but we can search for patterns. Let's assume our font has all the "plain" characters in order and leaves special characters such as dakuten and small kana outside the main run. For the sake of argument, let's say this is a "Ghost in the Shell: Stand Alone Complex" game and we've decided to use the name of those lovable, inescapable Tachikoma as our key. In Japanese, Tachikoma is written in katakana as "TA-CHI-KO-MA", which are close enough in the font for us to Relative Search. So let's take a look at the syllabary, and assign English letters to each one:

A-ko
B-sa
C-shi
D-su
E-se
F-so
G-ta
H-chi
I-tsu
J-te
K-to
L-na
M-ni
N-nu
O-ne
P-no
Q-ha
R-hi
S-fu
T-he
U-ho
V-ma

(Trust me, that "A-ko" isn't in there on purpose. I've never even watched it. >_>)

So the pattern we're looking for is going to be "GHAV". If we get results from the Relative Search that are all the same, we've found "TA", "CHI", "KO" and "MA" and can begin work on extrapolating the rest of the table. If not, we'll have to refine our search a bit, maybe choose a different pattern to look for...
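The whole Tachikoma exercise can be sketched in Python like so. Everything here is hypothetical: the byte values in the fake ROM are invented, the function names are mine, and it leans on the same assumption as above, namely that the game's table follows the font order.

```python
# The run of katakana from the list above, in font order (index 0 = "A").
KANA = ["ko", "sa", "shi", "su", "se", "so", "ta", "chi", "tsu", "te", "to",
        "na", "ni", "nu", "ne", "no", "ha", "hi", "fu", "he", "ho", "ma"]

def find_word(data: bytes, word: list[str]) -> list[int]:
    """Relative search: offsets where the byte gaps match the word's gaps in KANA."""
    idx = [KANA.index(k) for k in word]
    hits = []
    for start in range(len(data) - len(idx) + 1):
        window = data[start:start + len(idx)]
        if all(window[i] - window[0] == idx[i] - idx[0] for i in range(len(idx))):
            hits.append(start)
    return hits

def extrapolate(data: bytes, offset: int, word: list[str]) -> dict[int, str]:
    """Given one match, guess the byte value of every kana in the run."""
    base = data[offset] - KANA.index(word[0])  # byte value of KANA[0], i.e. "ko"
    return {base + i: kana for i, kana in enumerate(KANA)}

# Hypothetical ROM fragment in which "ko" happens to be stored as 2Ah.
rom = bytes([0x00, 0x30, 0x31, 0x2A, 0x3F, 0xFE])
word = ["ta", "chi", "ko", "ma"]

# The letter pattern you would type into the editor's Relative Search box.
pattern = "".join(chr(ord("A") + KANA.index(k)) for k in word)
print(pattern)  # GHAV

hits = find_word(rom, word)
table = extrapolate(rom, hits[0], word)
for value, kana in table.items():
    print(f"{value:02X}={kana}")
```

If the font order and the table order don't line up (as happened to me with Sylvanian Families), the extrapolated values will be wrong even when the search itself hits.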

But once we're sure we've found the correct pattern, we can start work on the table!

Table files are usually saved with the extension ".tbl", but they can really be any plain text format. WindHex seems to support Shift-JIS exclusively; Atlas and romjuice are fine with UTF-8. So we'll probably have to save it in two formats; no biggie for JWPce.
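If you'd rather not juggle encodings by hand in the editor, a few lines of Python can write both copies from one source. A sketch only: the function name and the "_sjis"/"_utf8" file naming are my own convention, and the two table entries are made up.

```python
import tempfile
from pathlib import Path

def save_table(lines: list[str], stem: str, folder: Path) -> list[Path]:
    """Write the same table once as Shift-JIS and once as UTF-8."""
    text = "\n".join(lines) + "\n"
    written = []
    for suffix, encoding in (("_sjis", "shift_jis"), ("_utf8", "utf-8")):
        path = folder / f"{stem}{suffix}.tbl"
        path.write_text(text, encoding=encoding)
        written.append(path)
    return written

# Demo in a throwaway folder so nothing is left lying around.
with tempfile.TemporaryDirectory() as tmp:
    for path in save_table(["30=タ", "31=チ"], "demo", Path(tmp)):
        print(path.name, len(path.read_bytes()), "bytes")
```

The two files hold the same table but not the same bytes, which is exactly why a tool expecting one encoding shows garbage when fed the other.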

Let's first start with a blank table. Assuming there are 256 characters or fewer in the font, go into JWPce, switch to regular ASCII mode, then start with this:

00=
01=
02=
03=
04=
05=
06=
07=
08=
09=
0A=
0B=
0C=
0D=
0E=
0F=
10=
11=
...

And work your way through to...

...
EE=
EF=
F0=
F1=
F2=
F3=
F4=
F5=
F6=
F7=
F8=
F9=
FA=
FB=
FC=
FD=
FE=
FF=

And make sure there's a blank line at the end. Copy-and-paste helps a lot; just make sure you don't have any duplicates.
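Or, if copy-and-paste sounds tedious, you can let a script type it for you. A small Python sketch (the function name is mine) that produces all 256 lines with no chance of duplicates:

```python
def blank_table(size: int = 256) -> str:
    """Return 'XX=' lines for byte values 0 to size-1.

    The text ends with a newline, which an editor shows as a blank last line.
    """
    lines = [f"{value:02X}=" for value in range(size)]
    return "\n".join(lines) + "\n"

table_text = blank_table()
print(table_text[:20])                 # the first few entries
print(len(table_text.splitlines()))    # 256
```

Paste the result into JWPce (or write it to a file) and carry on from there.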

Switch back to Japanese mode and put down the symbol each byte value represents after its corresponding equals sign. Once you're done with that, work out the control codes: in the simplest case, you'll just have line break and page break codes to puzzle through. If every line ends in "FE" and a page of text ends in "FF", mark those values with "<line>" and "<end>" or whatever suits your fancy. If you're not so lucky, there will be more, but you can work them out by playing the game and taking note of what happens when text with those control codes appears on screen. Mark any unknown non-text values with the number in pointy brackets, like "<1F>", "<CE>", "<D5>", "<42>", etc. There's a good reason for this, but it'll wait till another time.
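To show how a finished table gets used, here's a simplified Python sketch of what a script dumper does with it. This is not how romjuice actually works internally; the function names, the table entries and the sample bytes are all invented for illustration.

```python
def load_table(text: str) -> dict[int, str]:
    """Parse 'XX=symbol' lines into {byte value: symbol}; other lines are skipped."""
    table = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, symbol = line.split("=", 1)
        table[int(key, 16)] = symbol
    return table

def dump(data: bytes, table: dict[int, str]) -> str:
    """Translate bytes to text; unknown or still-blank values become <XX>."""
    out = []
    for value in data:
        # A missing entry or an empty one ("1F=") falls back to the raw number.
        out.append(table.get(value) or f"<{value:02X}>")
    return "".join(out)

# Made-up table: four kana plus the two control codes from the example above.
TABLE_TEXT = "1F=\n2A=コ\n30=タ\n31=チ\n3F=マ\nFE=<line>\nFF=<end>\n"
table = load_table(TABLE_TEXT)

script = dump(bytes([0x30, 0x31, 0x2A, 0x3F, 0xFE, 0x1F, 0xFF]), table)
print(script)  # タチコマ<line><1F><end>
```

Because every unknown value shows up as its own number, nothing gets silently dropped from the dump, and you can fill in the table as you figure each code out.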

There are probably people out there much better suited to writing a readme for romhacking beginners, and those same people are probably watching my silly, inefficient style and laughing. It's entirely possible I'm doing more harm than good by writing these, but I thought I'd fill in the space between major project updates by giving back to the online community. These are the techniques I've used, for the most part, anyhow, and they've stood me in good stead.

Until next time, ladies, gentlemen, Demi-Fiends and you fuzzy things sitting on a shelf.


Powered by Qumana



Posted by Ryusui at 10:03 PM PDT
Thursday, 6 April 2006
The Do-It-Yourself Corner, Chapter 1: Getting Started
Topic: The Do-It-Yourself Corner

Before I begin, I'd like to say that working on Sylvanian Families for GBC has given me more experience with translation hacking than any project I've ever attempted. I dare say it's given me more experience with translation hacking than any one game could provide. It's like the game equivalent of the Chateau d'If, only warm and fuzzy: I came in Edmond Dantes, hopelessly naive sailor, and I came out as the Count of Monte Cristo, implacable agent of divine justice with a buried fortune at his disposal, deus-ex-machina powers that most shonen manga heroes would be envious of and characters like Zorro and Batman claiming their heritage from me. Well, not really, but you get the idea.

On that note, I'd really, really like to claim I'm an expert on the topic of translation hacking, but I can't. Sylvanian Families really is only one game, no matter how hard it is to get a translation inserted, and there are probably monsters out there lurking in some game ready and willing to tear my head off if I even think about translating them.

You know, if I keep beating around the bush like this, the DIY Corner is going to become the Lecture Circuit. So here it goes, as best I can: the real, genuine Chapter 1.

If you've ever had any serious interest in games, you've probably had a title you really, really wish you could play in your native tongue, but for some reason, the powers that be denied you your gaming goodness. You've found the ROM, downloaded it, played it, got frustrated at having absolutely no idea what was going on or having to constantly consult a dictionary, got more frustrated that no one but you seems to know your game exists and hence nobody's tried to translate it. Or maybe you just want to prove your translation skills and have a likely target in mind.

That said, translation hacking is often a multi-person job, but it can be done by one person working alone, assuming he/she is skilled at all the tasks required: the bare essentials are script dumping/insertion, graphics editing, and of course, the actual translation.

For the sake of argument, I will assume that you, the reader, have either 1. enough of a grasp of Japanese to translate with the aid of a dictionary or 2. a ready, willing and able translator on hand. If you have neither, I strongly suggest you visit the following link:

http://www.animelyrics.com/forum/topic_show.pl?tid=6109

Do not ask the users there to teach you. Use the resources provided in this link. The Japanese forum on AnimeLyrics.com is for language help and advice, not for dumping your translation woes upon.

At the absolute least, you should learn the kana, the syllabic writing system which represents the basic atoms of the Japanese language. There are around 100 total, split between two systems: the hiragana, cursive letters used to write native words, and the katakana, which more resemble print and are used like italics are in English: for emphasis and for writing words of foreign origin. Some games, Sylvanian Families included, also throw in a smattering of simple kanji; you should prepare yourself with the appropriate resources.

Trust me. Even if you have a translator handy, it pays to know these.

Right. On to the more tangible requirements. Apart from the game and a good emulator (one with tile and map viewing is preferable; for the trickier parts, make sure you have memory viewing and some debugging functions), you will require:

1. A graphics editor. Your ordinary Paintbrush won't help you with this: in fact, not even Photoshop will help you. Directly, anyway. Games store their graphics in a myriad of strange (but usually efficient) formats that can't be read by ordinary graphics programs. I used to swear by Tile Layer Pro, but the Java-based Tile Molester (pardon the name; it's not mine) beats it in practically every conceivable way. Most graphics editors, Tile Molester included, also allow you to export graphics in BMP format for editing in your favorite paint program: this is handy when it finally comes time to do the title screen. If you run into something tricky, FEIDIAN is a tool written in PHP for extracting graphics from a ROM directly into BMP format, but if the game you've chosen has anything Tile Molester can't view on its own, you should probably see if you can find an easier project. ^_^;

2. A hex editor. This has absolutely nothing to do with spells or curses: a processor, be it in a game console or sitting behind a prominent "Pentium 4 HT" label, is ultimately a sophisticated abacus for shuttling numbers, and as such a game, from the emulator's viewpoint, is merely a long list of numbers to shuttle around. Even the game's resources. The graphics editor, as with any paint program, takes those numbers and assembles them into the graphics they're meant to represent: the hex editor takes those numbers and presents them to you at face value. You see, even the all-important text is stored as numbers, and by using the two aforementioned editors in tandem, you will assemble a table that allows you to see what numbers represent the game's text and where and how it's stored. I strongly recommend Bongo's WindHex for this.

3. A script dumper. Once you've taken care of the above, you can use this program to extract the game's text, or "script", according to the table you compile. I've been using romjuice, but it comes compiled for Linux with C source code: you could get a C compiler and compile it for your own machine, or I could get off my butt and provide my Windows-compiled version if you need it.

4. Klarth's Atlas. Generically speaking, you want a script inserter, which will take your translated script and transform it into the digits that the game understands using your table as its guide, but Atlas is the best I've found, and if you have some skill with C, you can tailor Atlas to your precise needs (although it should have everything you need already if you've chosen a simple project).

5. A word processor with Japanese support. As noted above, actual Japanese knowledge is purely optional, but knowledge of the basic kana and possibly a few simple kanji is a must. JWPce is a great freeware program which supports multiple formats: WindHex, in my experience, requires its tables be in Shift-JIS format, and if you wish to reinsert Japanese text for any reason, Atlas supports UTF-8 (although I'm uncertain what other formats it supports). For some strange reason, I can't remember what format, or formats, romjuice supports. Of course, you could always construct a table using the romaji (literally "Roman letters", i.e. the English alphabet) equivalents of the Japanese symbols, but I can't recommend this.

Some games will require tools other than the ones outlined above, but for a simple, tutorial-worthy project, this is all we require.

This chapter is called "Getting Started", and that's precisely what we've done: gotten started. I seem to have a nasty habit of either having too little to say or embarking on page-long rants where they're not needed. Tune in next time for our next step: making the table!





Posted by Ryusui at 9:27 PM PDT
Updated: Thursday, 6 April 2006 10:40 PM PDT
Wednesday, 5 April 2006
Welcome to the Do-It-Yourself Corner!
Topic: The Do-It-Yourself Corner

Translation is not an easy business.

I've seen people on the Internet who seem convinced that there is a magic bullet, an almighty program capable of taking a game and rendering it into perfectly legible, natural English. I've seen people who seem to think that there is an infallible, one-for-one conversion between every word in Japanese and every word in English, and think that the humble "kuso" (which literally means "fecal matter" but can be considered a catch-all for any four-letter word from "darn" on up) must always, always be considered to mean the s-word. I've seen people who swear by translations that are full of untranslated words that are pure gibberish to English speakers and insist that any work that leaves out these precious honorifics and distinctly Japanese quirks of speech is inferior. And perhaps most maddening of all, I've encountered people who seem to regard translation from Japanese to English as a simple cryptogram game, as if the entirety of the language could be rendered into perfectly comprehensible English by simple letter substitution. (All the previous ones can be attributed to pure ignorance, but that last one is bred of an annoying, arrogant assumption that all languages are ultimately related to or derived from English, or that English is the only "real" language and that all others are merely games devised by foreign people to frustrate Westerners.)

Thank God none of the people following my work are like that. ^_^

So. Since you all seem to be sane, sensible people, or perhaps I'm just viewing my blog through rose-colored glasses (is that why the background looks purple to me? Just kidding there), I thought I'd share some pointers in case one of my readers has some obscure Japanese or otherwise foreign-language title that nobody thought to release in their native tongue. I have only one request of my readers:

If that game is ever released in English or whatever language you speak, make sure you buy the legit version. The retrogaming features on the Nintendo Revolution have me excited: with games from every Nintendo platform as well as Sega and TurboGrafx-16/PC Engine titles slated to be represented, a whole lot of great, overlooked titles may very well be released on our shores. Like Mother 1 (a.k.a. Earthbound Zero), Castlevania: Rondo of Blood or Star Fox 2, which was never officially released anywhere (find the full beta ROM and AGTP's translation patch; you'll forget Adventures and Assault ever happened!).

...Crimony. I've spent so much time on the setup that if I keep going, I won't have anything left to post about. Consider this a "to be continued" on the topic of Do-It-Yourself fan translations: I'll start off with the basics tomorrow!





Posted by Ryusui at 10:48 PM PDT
Updated: Wednesday, 5 April 2006 11:15 PM PDT
