Step by Step Instructions for
BUILDING A TM FROM AN ORIGINAL DOCUMENT AND ITS TRANSLATION
by Koral Özgül
No liability accepted. Use at your own risk.
CONTENTS
A. PREPARATION OF THE SOURCE AND TARGET STRINGS
B. PREPARING THE INTERMEDIATE EXCEL TEMPLATE (one time task)
C. TRANSFERRING THE STRINGS TO EXCEL TEMPLATE
E. IMPORTING THE FILE TO TRADOS
A. PREPARATION OF THE SOURCE AND TARGET STRINGS
As a starting point, we have two Word documents: an English file (Figure 1) and its Turkish translation (Figure 2). Their formatting varies (header, tabs, tables), so we first need to bring both into a uniform format.
Figure 1 - Original English Word document
Figure 2 - Translated Turkish Word document
A.1. The original document has a header. It would otherwise be lost during this process, so we copy it and paste it at the top of the body text.
A.2. Some parts of the text are separated with TAB characters. We remove these by replacing them with paragraph marks (Figure 3):
Find what: ^t
Replace with: ^p
Replace All
Figure 3 - Replace TABs with Paragraph Marks
A.3. Now we have to convert the table(s) in the document to normal text. Place your cursor in the table. Use the menu command Table > Select > Table to select the whole table.
A.4. With the table selected, use the menu command Table > Convert > Table to Text. Under Separate text with in the Convert Table To Text dialog box, select Paragraph marks and click OK (Figure 4). Repeat this for every table in the file.
Figure 4 - Convert table to text separated with Paragraph marks
A.5. The first stage is almost done, but the resulting text has a variable number of paragraph marks between the strings (Figure 5). This is partly due to the initial formatting. Another cause may be formatting changes in the Turkish document that deviate from the original. For example, the English version had only one tab between the entries at the beginning, while the second line in the TR file was aligned with two tabs because the Turkish phrase was shorter.
Figure 5 - The result, with variable number of paragraph marks
We have to straighten out these irregularities and remove the gaps by repeatedly replacing every double paragraph mark with a single paragraph mark, until no double paragraph marks remain in the document (Figure 6):
Find what: ^p^p
Replace with: ^p
Replace All
Figure 6 - Replace successive paragraph marks with a single paragraph mark
Observe the message that tells you how many instances have been replaced. In this case, 19 instances were found and replaced. Click OK. Click Replace All again and watch the number of replacements (Figure 7).
Figure 7 - The number of replacements decreased from 19 to 5
Repeat the replacement until the number no longer decreases. Ideally it would reach zero, but for reasons unknown to me, many documents stop at 1. This one doesn't fall below 2 (Figure 8).
Figure 8 - The number of replacements doesn't decrease any more
Once the number of replacements no longer decreases, continue with the next step. We now have exactly one paragraph mark after every string (Figure 9), except at the very beginning and end, but that's not important.
Figure 9 - A single paragraph mark after each string
A.6. Our next task is to convert the whole text to a table with a single column. To do this, place the pointer at the very beginning of the text (Figure 10).
Figure 10 - Marking text (only), starting point
Leave the pointer there and scroll to the very end of the file. Press and hold the Shift key and click just behind the last character of the text in the document (Figure 11). Release the Shift key.
Figure 11 - Marking text (only), ending point
Alternatively, after placing the pointer at the beginning of the text, press Ctrl+Shift+End. Then release Ctrl but keep holding Shift, navigate to just behind the last character with PageUp/PageDown and/or the arrow keys, and release Shift when all the text is selected. Don't include empty lines at the end.
A.7. Use the menu command Table > Convert > Text to Table to convert the selected text into a table. The Number of columns value should be 1 (Figure 12).
Figure 12 - Convert text to table with one column
Now we have an orderly table with one English source string in each cell (Figure 13).
Figure 13 - Table containing the English strings
A.8. Apply all of the above steps to the Turkish file. At the end, we'll have two tables (in separate files), one with the English strings and the other with their Turkish translations (Figure 14).
Figure 14 - Table containing the translated Turkish strings
A.9. Save both files under different names in a temporary location.
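The clean-up in steps A.2 to A.5 can also be pictured outside Word. The following is a minimal Python sketch, not part of the original procedure. It assumes the document text is already available as a plain string in which every paragraph mark is a newline, and the function name `normalize_strings` is hypothetical. The loop mirrors clicking Replace All repeatedly until no double paragraph mark is left.

```python
def normalize_strings(text: str) -> list[str]:
    """Return one string per entry, mimicking steps A.2 to A.5."""
    # A.2: every TAB becomes a paragraph mark (here: a newline).
    text = text.replace("\t", "\n")
    # A.5: replace double paragraph marks with single ones until
    # none is left, like repeated "Replace All" in Word.
    while "\n\n" in text:
        text = text.replace("\n\n", "\n")
    # Ignore the stray marks at the very beginning and end, then
    # split into strings and skip anything that is still empty.
    return [s for s in text.strip("\n").split("\n") if s]


sample = "Name:\tJohn\n\n\nCity:\t\tIstanbul\n"
print(normalize_strings(sample))
```

Running the same function on the source text and on the translation gives two lists that play the role of the two single-column tables.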
B. PREPARING THE INTERMEDIATE EXCEL TEMPLATE
(This needs to be done only once for each language pair)
B.1. Prepare an Excel sheet with the following column entries in a single (first) row. Enter only the texts marked in yellow; the blue texts are for clarification only.
The Excel sheet should look like this:
Figure 15 - TM import template example
B.2. Save the Excel file with a descriptive name (for example "TMImportTemplate_EN-TR.xls") in a suitable folder. You can reuse it later for other conversions of the same language pair.
C. TRANSFERRING THE STRINGS TO EXCEL TEMPLATE
C.1. Go to the Word document that contains the English string table. Place your mouse pointer somewhere inside the table. Select the whole table using the menu command Table > Select > Table. Press Ctrl+Insert to copy the whole table to the Windows clipboard.
C.2. Go back to the Excel template. Select the cell E1 (the cell between <Seg L=EN_US> and <Seg L=TR_01>). Press Shift+Insert keys to paste the Source strings into that column (column E), starting from the topmost cell (Figure 16).
Figure 16 - Source strings pasted into the import template
C.3. Go to the Word document that contains the Turkish string table. Place your mouse pointer somewhere inside the table. Select the whole table using the menu command Table > Select > Table. Press Ctrl+Insert (Copy command) to copy the whole table to the Windows clipboard.
C.4. Go back to the Excel template. Select the cell G1 (the cell between <Seg L=TR_01> and </TrU>). Press Shift+Insert (Paste command) to paste the target strings into that column (column G), starting from the topmost cell (Figure 17).
Figure 17 - Target strings pasted into the import template
C.5. Select the cells A1 through D1 (from <TrU> to <Seg L=EN_US>) (Figure 18).
Figure 18 - Select the cells A1-D1
Press Ctrl+Insert to copy the contents to the Windows clipboard.
C.6. Select the cell A2 (just below <TrU>). Scroll down to the last entry in the source and target columns. Press and hold the Shift key and click the cell in column D in the row of the last entry to select all cells from A2 to Dn (where n is the last row containing string entries) (Figure 19). It's the 33rd row in our example:
Figure 19 - Select the cells A2-D33
Press Shift+Insert to fill the selected cells with the copied contents (Figure 20).
Figure 20 - Paste the copied contents into the empty cells A2-D33
C.7. Select the cell F1 (<Seg L=TR_01>). Press Ctrl+Insert to copy the contents to the Windows clipboard.
C.8. Select the cell F2 (just below <Seg L=TR_01>). Scroll down to the last entry in the source and target columns. Press and hold the Shift key and click the cell in column F in the row of the last entry to select all cells from F2 to Fn (where n is the last row containing string entries). Press Shift+Insert to fill the selected cells with the copied contents.
C.9. Finally, select the cell H1 (</TrU>). Press Ctrl+Insert to copy the contents to the Windows clipboard.
C.10. Select the cell H2 (just below </TrU>). Scroll down to the last entry in the source and target columns. Press and hold the Shift key and click the cell in column H in the row of the last entry to select all cells from H2 to Hn (where n is the last row containing string entries). Press Shift+Insert to fill the selected cells with the copied contents. You're done with this stage (Figures 21 and 22).
Figure 21 - The top...
Figure 22 - ... and bottom of the filled in template
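The filled-in template of section C amounts to repeating the same tags around every source/target pair. A minimal Python sketch of that idea follows; it is an illustration, not the procedure itself. It uses only the tags named in the text (`<TrU>`, `<Seg L=EN_US>`, `<Seg L=TR_01>`, `</TrU>`). The contents of columns B and C are not stated in the text, so they are passed in through a hypothetical `extra_header` argument that is empty by default. The function name `build_rows` and the length check are likewise assumptions added for illustration.

```python
def build_rows(source, target, extra_header=()):
    """Pair source and target strings in the column layout of the template."""
    # Added safeguard (not a step in the text): both tables must have
    # the same number of rows, or the pairs would be misaligned.
    if len(source) != len(target):
        raise ValueError(
            f"{len(source)} source strings but {len(target)} target strings"
        )
    rows = []
    for src, tgt in zip(source, target):
        rows.append([
            "<TrU>", *extra_header,  # column A, then B and C (assumed)
            "<Seg L=EN_US>", src,    # columns D and E
            "<Seg L=TR_01>", tgt,    # columns F and G
            "</TrU>",                # column H
        ])
    return rows


rows = build_rows(["Hello", "Thank you"], ["Merhaba", "Teşekkür ederim"])
print(rows[0])
```

If the two lists differ in length, one of the documents probably still contains an extra or missing line, and it is worth comparing the two Word tables again before going on.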
D.1. Select the top left cell (A1). Scroll down to the bottom right cell (Hn). Press and hold the Shift key and click the bottom right cell (Hn) to select all the populated cells (Figure 23).
Figure 23 - Select all the non-empty cells
Press Ctrl+Insert to copy the selection to the clipboard.
D.2. Open a new, empty document in MS Word (Figure 24).
Figure 24 - New Word document for TM import file preparation
D.3. Press Shift+Insert to paste the whole table into the Word document (Figure 25).
Figure 25 - Excel data pasted into the new Word document
It's not a problem if the table is not fully visible at the right side of the page; we won't do any manual editing in this file.
D.4. Place the mouse pointer anywhere within the table. Use the menu command Table > Select > Table to select the whole table (Figure 26).
Figure 26 - Select the table
D.5. Use the menu command Table > Convert > Table to Text to convert the table to text. Under Separate text with in the Convert Table To Text dialog box, select Paragraph marks and click OK. You'll get something like this:
Figure 27 - Table converted to text (paragraph marks shown)
D.6. Now open the Replace dialog box (Edit > Replace). Apply the following replacement:
Find what: <Seg L=EN_US>^p
Replace with: <Seg L=EN_US>
Replace All
This removes the paragraph marks after the "<Seg L=EN_US>" codes, so that each English string follows its code immediately. That's the proper format for Trados export/import files.
D.7. Make the same change for the target strings:
Find what: <Seg L=TR_01>^p
Replace with: <Seg L=TR_01>
Replace All
The paragraph marks after the "<Seg L=TR_01>" codes are removed, and each Turkish string follows its code immediately. The result will look like this:
Figure 28 - The final result in Word format
D.8. The only remaining step is to convert the file to plain text format:
The File Conversion dialog box will appear. Check whether the special characters (if any) of the target language appear correctly in the Preview pane (Figure 29). If not, try changing the encoding from the default to Unicode-UTF8 or another suitable codepage.
Figure 29 - File Conversion dialog box, showing correct TR special characters
Click the OK button. You're done. The resulting file is a plain text file in the format that Trados accepts for imports (Figure 30):
Figure 30 - Final import file ready
Now proceed with importing.
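Steps D.5 to D.8 can be sketched in Python as well. This is an illustrative sketch under assumptions, not a replacement for the Word procedure. Each row is taken to be a list of cell texts in template order, with columns B and C left out because the text does not state their contents. The names `rows_to_import_text` and `write_import_file` are hypothetical, and UTF-8 is only one of the encodings the text suggests trying, so whether a given Trados version accepts it has to be checked.

```python
from pathlib import Path


def rows_to_import_text(rows):
    """Flatten table rows to text and join each Seg code with its string."""
    # D.5: table to text, with a paragraph mark (newline) after every cell.
    text = "\n".join(cell for row in rows for cell in row) + "\n"
    # D.6 and D.7: remove the paragraph mark right after each Seg code.
    for code in ("<Seg L=EN_US>", "<Seg L=TR_01>"):
        text = text.replace(code + "\n", code)
    return text


def write_import_file(rows, path, encoding="utf-8"):
    """D.8: save as plain text; the encoding is an assumption to verify."""
    Path(path).write_text(rows_to_import_text(rows), encoding=encoding)


rows = [["<TrU>", "<Seg L=EN_US>", "Hello",
         "<Seg L=TR_01>", "Merhaba", "</TrU>"]]
print(rows_to_import_text(rows))
```

Printing the text before writing it plays the same role as the Preview pane: it is a chance to check that the special characters of the target language survived.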
E. IMPORTING THE FILE TO TRADOS
You can either add the entries to an existing TM (jump to Step E.2) or create a new TM with these entries. In the latter case:
E.1. Open Trados Translator's Workbench and create a new translation memory with exactly the same source and target languages as in the compiled import file. (For detailed information about how to create a new TM, refer to Trados Workbench Help.)
E.2. Use the menu command File > Import to open the Import dialog box (Figure 31).
Figure 31 - Trados Translator's Workbench Import dialog
Click the OK button. The Open Import File dialog box opens. Browse to the text file you have prepared and select it. Click the Open button. Trados will import the entries into the current TM. Mission accomplished.
Koral Özgül, Istanbul, September 2008