How to convert TMX to tab-delimited? Thread poster: Hans Lenting
| Hans Lenting Netherlands Member (2006) German to Dutch TOPIC STARTER
Samuel Murray wrote: Yes, so you have to first replace all whitespace characters (except spaces, duh) with replacement characters. ... = horizontal tab

Thank you for reminding me of that one! I'll add a rule to the TextFactory.

Or just replace \n with ① and replace \t with ② throughout the file -- no need to restrict it to segments, since you're not going to use the TMX file after this
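For anyone who prefers a script to a Text Factory rule, the placeholder idea quoted above could be sketched in Python roughly like this. The function names are made up for illustration, and ① and ② are simply the characters suggested in the quote; any characters that never occur in the TM should work just as well.

```python
# Placeholder characters taken from the quoted post. Assumption: neither
# character occurs anywhere in the translation memory itself.
NEWLINE_MARK = "\u2460"  # ①
TAB_MARK = "\u2461"      # ②


def protect_whitespace(text: str) -> str:
    """Replace line breaks and tabs so they cannot break a tab-delimited row."""
    # Normalise Windows (\r\n) and old Mac (\r) line endings first, so that
    # every line break becomes exactly one placeholder.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.replace("\n", NEWLINE_MARK).replace("\t", TAB_MARK)


def restore_whitespace(text: str) -> str:
    """Undo protect_whitespace; line breaks come back as plain \\n."""
    return text.replace(NEWLINE_MARK, "\n").replace(TAB_MARK, "\t")
```

Keeping a matching restore function around is handy if the tab-delimited file is later turned back into a TMX.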
I'll use the tab-delimited file for several purposes. One of them is ... creating a cleaned and smaller TMX. That TextFactory will be pretty straightforward.

BTW: BBEdit introduces the Text Factory, which allows you to assemble a list of text transformations that will be applied in order to either the current document or selection (when invoked as a filter), or to a specified list of files and folders (when invoked via the Scripts menu).

| | | Hans Lenting Netherlands Member (2006) German to Dutch TOPIC STARTER | Why not use CafeTran Espresso? | Oct 27, 2022 |
Could you simply use CafeTran Espresso for that conversion? 1. Create or open a project with the required language pair. 2. Open or Import the TMX file (or an SDLTB/TBX, which will be automatically converted to TMX), possibly not as read-only and with fragments enabled 3. Select the tab of the glossary you wish to import into (an empty Project Terms page will do, or you create a new glossary and select its tab) 4. Memory menu > Export > Export segments to glossary. A dialog will ask you to select which memory to import segments from. And if the currently selected/opened tab is not a glossary, it will first ask you to select one. That's it. CafeTran also includes some TM Filter options, including one called "Clean and replace foreign codes": Some TMX files from third-party tools have unusual codes in the segments such as codes inside the curly brackets or emdash, endash, tab code. CafeTran clears or replaces them with equivalent unicode characters. https://github.com/idimitriadis0/TheCafeTranFiles/wiki/3-TM-options#tm-filter-options If needed, prior TMX editing (including search and replace, with or without regular expressions) can also be done from within CafeTran.
[Edited at 2022-10-27 05:50 GMT] | | | Hans Lenting Netherlands Member (2006) German to Dutch TOPIC STARTER
Jean Dimitriadis wrote: Could you simply use CafeTran Espresso for that conversion? 1. Create or open a project with the required language pair. 2. Open or Import the TMX file (or an SDLTB/TBX, which will be automatically converted to TMX), possibly not as read-only and with fragments enabled 3. Select the tab of the glossary you wish to import into (an empty Project Terms page will do, or you create a new glossary and select its tab) 4. Memory menu > Export > Export segments to glossary. A dialog will ask you to select which memory to import segments from. And if the currently selected/opened tab is not a glossary, it will first ask you to select one. That's it. CafeTran also includes some TM Filter options, including one called "Clean and replace foreign codes": Some TMX files from third-party tools have unusual codes in the segments such as codes inside the curly brackets or emdash, endash, tab code. CafeTran clears or replaces them with equivalent unicode characters. https://github.com/idimitriadis0/TheCafeTranFiles/wiki/3-TM-options#tm-filter-options If needed, prior TMX editing (including search and replace, with or without regular expressions) can also be done from within CafeTran. [Edited at 2022-10-27 05:50 GMT]

I am familiar with this procedure, but it is extremely slow. Besides that, I'd like to have an alternative solution that I can use as a framework and possibly integrate into my workflows.
[Edited at 2022-10-27 07:18 GMT]
Dan Lucas United Kingdom Local time: 17:50 Member (2014) Japanese to English
Stepan Konev wrote: If that MacOS text editor can mark the match, you can use the following regex: to mark and then copy all segments to clipboard Although I've only tried it on one file, this typically clever solution from Stepan seems to work well in Notepad++ here - much appreciated. Given that we already have the regex, it looks like an obvious choice for a tiny script in the programming language of one's choice (probably just from the command line in Perl!). I have never actually needed to convert TMX to tab-delimited, but it's nice to know that it's possible. Thanks to Hans and other contributors for the topic. Dan
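The "tiny script" idea could look roughly like the Python sketch below. It does not use the regex mentioned above, which is not reproduced here; it swaps in the standard-library XML parser instead. The function names, the ①/② placeholders for line breaks and tabs, and the exact-match handling of language codes are all assumptions made for illustration, and the sketch assumes a plain TMX without XML namespaces on its elements.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def seg_text(tuv):
    """Return the segment text of one <tuv>, made safe for a tab-delimited row."""
    seg = tuv.find("seg")
    if seg is None:
        return ""
    # itertext() removes the inline tags themselves but keeps any text
    # inside them (for example native codes stored within <bpt>/<ept>).
    text = "".join(seg.itertext())
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Assumed placeholders: ① for line breaks, ② for tabs.
    return text.replace("\n", "\u2460").replace("\t", "\u2461")


def tmx_rows(source, src_lang, tgt_lang):
    """Yield (source, target) pairs from a TMX file path or binary file object."""
    src_lang, tgt_lang = src_lang.lower(), tgt_lang.lower()
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "tu":
            continue
        by_lang = {}
        for tuv in elem.findall("tuv"):
            # TMX 1.4 uses xml:lang; older files may use a plain lang attribute.
            lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
            by_lang[lang.lower()] = seg_text(tuv)
        # Only units that contain both languages are exported.
        if src_lang in by_lang and tgt_lang in by_lang:
            yield by_lang[src_lang], by_lang[tgt_lang]
        elem.clear()  # drop the finished unit's children to limit memory use


def write_tsv(rows, out):
    """Write one 'source<TAB>target' line per pair to a text file object."""
    for src, tgt in rows:
        out.write(f"{src}\t{tgt}\n")
```

To try it, open the output with `open("memory.txt", "w", encoding="utf-8")` and call `write_tsv(tmx_rows("memory.tmx", "de-DE", "nl-NL"), out)`; the file names and language codes here are only examples and must match what the TMX actually contains.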