Tuesday, June 30, 2015

Studio 2015: AutoSuggest Gets Even Better

Studio 2015 comes with significant updates to the AutoSuggest feature, which will be a welcome productivity enhancement for users.

Translation Memory and Automated Translation Suggestions in Studio 2015

In addition to Termbases, AutoSuggest Dictionaries and AutoText, Studio 2015 includes a new AutoSuggest provider: Translation Memory and Automated Translation.*

* For Automated Translation suggestions to be provided, an automated translation provider, such as Google Translate, must be added to the list of Translation Memories.

Depending on the settings we choose, we can get suggestions that are 100% matches, fuzzy matches and concordance matches that come from our very own TMs.

The new settings are found in a new node under Options - AutoSuggest, where we can set our preferences.

Above, I have chosen a value of 1 under "Minimum number of characters typed before displaying suggestions" to get the most matches. A setting of 1 under "Minimum suggestion length in words" will result in many more matches than a setting of 3, for example, as it's a less restrictive criterion.
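To make the effect of these two thresholds concrete, here is a minimal Python sketch. It is only an illustrative model, not Studio's actual implementation: the function name, the candidate list and the simple prefix matching are all assumptions made for the example.

```python
def filter_suggestions(typed, candidates, min_chars_typed=1, min_words=1):
    """Toy model of the two AutoSuggest thresholds (hypothetical, not Studio code).

    typed           -- the characters typed so far in the target segment
    candidates      -- possible suggestions (e.g. fragments found in a TM)
    min_chars_typed -- "Minimum number of characters typed before displaying suggestions"
    min_words       -- "Minimum suggestion length in words"
    """
    # Nothing is offered until enough characters have been typed.
    if len(typed) < min_chars_typed:
        return []
    prefix = typed.lower()
    # Keep candidates that are long enough and start with the typed text.
    return [
        c for c in candidates
        if len(c.split()) >= min_words and c.lower().startswith(prefix)
    ]


candidates = ["valve", "valve cover", "valve cover gasket", "engine"]

# Least restrictive: every candidate starting with "v" is offered.
print(filter_suggestions("v", candidates, min_chars_typed=1, min_words=1))

# More restrictive: only suggestions of three or more words survive.
print(filter_suggestions("v", candidates, min_chars_typed=1, min_words=3))
```

With a minimum length of 1 word the sketch offers three suggestions; with a minimum of 3 words it offers only one, which mirrors why the lower setting produces more matches.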

The best settings for these thresholds will depend on personal preference and on how extensive our resources are. With a well-populated termbase and TM, low thresholds (1 character typed, 1-word suggestions) may produce many duplicates in the suggestion list, because one-word terms that already exist in the termbase and in the AutoSuggest dictionary will also be offered as Translation Memory suggestions, as shown in the (extreme) example below. Please note that the AutoText entry would only appear after the first four letters have been typed, so after typing the first letter only the first three suggestions would actually be offered.

With this new AutoSuggest provider, even a user who has no termbases, no AutoSuggest dictionaries and no AutoText entries will benefit from AutoSuggest, as suggestions are offered from existing and newly added translation units and are updated on the fly as translation units are confirmed.

For the examples below, I have disabled Termbases, AutoSuggest Dictionaries and AutoText from the list of AutoSuggest providers to simulate a scenario where the only resource available to the user is the TM.

For segment 1 there are fuzzy matches in the TM, so AutoSuggest offers a fuzzy match in the list of suggestions when I type the letter "v".

Since segment 5 has no exact or fuzzy matches in the TM, AutoSuggest offers concordance matches* and automated translation suggestions as soon as I type the first letter for the segment.

*The setting that determines this behavior can be found under Options - Editor - Concordance Search Window. If this option is not checked, auto concordance is disabled and therefore AutoSuggest will not offer any concordance suggestions.

AutoSuggest in Action

The 'magic' of AutoSuggest is best appreciated in action, so below you'll find a video showing how it works.

This is definitely one of my favorite features in Studio 2015!

The New Bilingual Excel File Type in Studio 2015

With the new Bilingual Excel file type in Studio 2015, we can extract text from a specific column for translation and have Studio insert the translated text into a different column in the target file. No more cutting and pasting!

Note: File type and filter are sometimes used interchangeably, so Bilingual Excel file type and Bilingual Excel filter refer to the same thing in this post.

What It Does

Here's a typical use scenario.

The regular Excel file type produces the following Studio file, with all the content extracted for translation.

The resulting target file of the above would be an exact copy of the source file, but with all the text replaced by the translations.

With the new Bilingual Excel file type, however, only the text I need will be extracted for translation.

And the resulting target file will have the translations inserted in Column F, while the rest of the text remains intact, as shown below.

How to Use It

First things first: For any projects created with a pre-2015 version of Studio, the Bilingual Excel file type will need to be added manually by clicking on the "Additional installed File Types exist" link, and checking the new file type checkbox, as shown here.

Are the file types in the correct order?
A key thing to keep in mind is that Studio will process your Excel file with whichever matching filter it finds first in the list. If the Bilingual Excel file type comes before the regular Excel file type, the Bilingual Excel file type will be used to process your file; if it comes after, the regular Excel filter will be used. So, to make sure Studio uses the Bilingual Excel filter, you can either move it above the regular Excel filter in the list or disable the regular Excel file type.
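The "first matching filter wins" behavior can be sketched in a few lines of Python. This is a simplified model for illustration only: the data structure, the function name and the extension-based matching are assumptions, not how Studio is actually implemented.

```python
def pick_file_type(filename, file_types):
    """Return the name of the first enabled file type that accepts the file.

    file_types is an ordered list, standing in for the list shown under
    Options - File Types (hypothetical structure for this example).
    """
    for file_type in file_types:
        if file_type["enabled"] and filename.lower().endswith(file_type["extensions"]):
            return file_type["name"]
    return None  # no enabled file type matches


bilingual = {"name": "Bilingual Excel", "extensions": (".xlsx",), "enabled": True}
regular = {"name": "Excel", "extensions": (".xlsx",), "enabled": True}

# Bilingual Excel listed first: it is the one that processes the file.
print(pick_file_type("glossary.xlsx", [bilingual, regular]))

# Regular Excel listed first: the regular filter takes over.
print(pick_file_type("glossary.xlsx", [regular, bilingual]))

# Regular Excel listed first but disabled: Bilingual Excel is used again.
regular_disabled = dict(regular, enabled=False)
print(pick_file_type("glossary.xlsx", [regular_disabled, bilingual]))
```

The three calls correspond to the options described above: reorder the list, or leave the order alone and disable the regular Excel file type.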

Go to File - Options - File Types

Important note: There are two places where you can make this change.

1. If you make it by going to Options - File Types (as shown in the screenshots above), this change will affect future projects and single-document flows, but not existing projects.

2. If you make it by going to Project Settings - File Types, the change will only affect your active project.

This gives you the ability to enable the Bilingual Excel file type only for a specific project without changing how Studio handles Excel files in general.

Once the file type is enabled, it's time to look at the settings. For my example above, I set the source column to E and the translation column to F and chose to confirm any existing translations that may already be in column F.
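As a rough mental model of these settings, the following Python sketch treats each spreadsheet row as a dictionary keyed by column letter. It is an assumption-laden illustration, not Studio's code: the function names, the row structure and the "confirmed" flag are all hypothetical.

```python
def extract_segments(rows, source_col="E", target_col="F", confirm_existing=True):
    """Collect one segment per row that has text in the source column."""
    segments = []
    for row_number, row in enumerate(rows, start=1):
        source = row.get(source_col, "")
        if not source:
            continue  # nothing to translate in this row
        target = row.get(target_col, "")
        segments.append({
            "row": row_number,
            "source": source,
            "target": target,
            # An existing translation is marked confirmed only if requested.
            "confirmed": bool(target) and confirm_existing,
        })
    return segments


def write_translations(rows, segments, target_col="F"):
    """Write each segment's translation back into the target column only."""
    for segment in segments:
        rows[segment["row"] - 1][target_col] = segment["target"]


rows = [
    {"A": "ID-1", "E": "Hello", "F": ""},
    {"A": "ID-2", "E": "Goodbye", "F": "Adios"},
]
segments = extract_segments(rows)
segments[0]["target"] = "Hola"  # translate the untranslated segment
write_translations(rows, segments)
print(rows)
```

Only columns E and F are ever read or written in this sketch; column A is left untouched, which is the point of the bilingual approach.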

After clicking OK, you process your file as usual. Keep in mind that any changes made to Project Settings or Options are applied only to files processed after the change, not to existing files. If you make this change after a file has already been added to a project and prepared, you will need to remove the file, add it again and prepare it again for the new settings to take effect.

Note: This section has been edited to clarify how cell and text-level formatting are handled.
A limitation of this first version of the Bilingual Excel filter is that text-level formatting (e.g. bold, italics, font, color applied to individual words or phrases) is not recognized. Cell formatting, however, can be preserved by unchecking the "Preserve Target Style" checkbox in the file type settings, as shown below. Note that in this example, column C has been marked as containing comments.

This is the source file to be used in this example. Rows 2 and 5 in Column A have cell formatting applied to them. For the rest of the cells, formatting was applied at text level.

Here's the above file, processed with the bilingual file type and with the settings shown earlier. Note that there are no formatting tags in the source column.

Now let's have a look at the resulting target file.

So, as we can see, formatting that applies to the entire cell is transferred over to the target column, but since we have no tags to help us indicate the formatting of individual words or word groups, text-level formatting cannot be preserved.

Let's compare now how the same text looks when processed through the regular Excel filter.

Here we get full control over text formatting tags, so it's easy to duplicate source formatting.

Another limitation of the bilingual file type is that there are no embedded content settings for this file type. For most regular jobs, however, this may not even be an issue.

In conclusion, choosing the right Excel file type will depend on the specific use case at hand. While heavily formatted files may still call for the regular Excel filter, there will be other files with little or no formatting that will be perfect for the new Bilingual Excel filter.

Wednesday, June 17, 2015

How to easily spot unintentional spaces in Studio segments

Studio's QA Checker offers a section under Punctuation to check for unintentional spaces before punctuation marks, a very useful feature to spot those extra spaces that might be difficult to see, especially in the vicinity of tags.

I find that a simple enhancement that makes this feature more useful for me is adding a closing parenthesis and a comma to the selection of marks, which already includes a colon, an exclamation point, a question mark and a semicolon.

To apply this change to all projects created in the future and to future documents processed with the single document translation workflow, go to File - Options - Verification - QA Checker 3.0 - Punctuation and add the necessary punctuation marks as shown below:

Please note that making this change at the global level will not affect any existing projects or open documents.

To apply the change to current projects in your system created before the change or to packages received from a client, go to Project Settings - Verification - QA Checker 3.0 - Punctuation and do the same.

Once this has been set, Studio will show a warning* when unintentional spaces are found in a target segment, as shown here:

*Provided that the QA Checker 3.0 checkbox (under Verification) and the "Check for unintentional spaces before:" checkbox (under Punctuation) are checked.
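For readers who want to see the logic of this check outside Studio, here is a minimal Python sketch using a regular expression. It only approximates the idea and is not the QA Checker's implementation; the function name and the list of marks are assumptions based on the marks mentioned in this post.

```python
import re

# The four marks mentioned as already included, plus the two added in this post.
MARKS = ":!?;),"


def find_spaces_before_punctuation(segment, marks=MARKS):
    """Return the positions where whitespace directly precedes one of the marks."""
    # \s+ matches the run of whitespace; the lookahead requires a mark right after it.
    pattern = re.compile(r"\s+(?=[" + re.escape(marks) + r"])")
    return [match.start() for match in pattern.finditer(segment)]


print(find_spaces_before_punctuation("Hello , world !"))
print(find_spaces_before_punctuation("Hello, world!"))

# Without the comma in the list, only the space before "!" is reported.
print(find_spaces_before_punctuation("Hello , world !", marks=":!?;"))
```

The last call shows why extending the list of marks matters: a stray space before a comma goes unnoticed unless the comma is one of the marks being checked.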

And that's it, a simple customization that can make Studio a little more helpful!