Wednesday, January 15, 2014

Adding a Soft Return Segmentation Rule to SDL Trados Studio 2014

In a previous post, I discussed how to add a Tab segmentation rule to Studio.

That is pretty straightforward, as the Tab is one of the segmentation options offered in the list of Break characters.

But what if we want to add a new segmentation rule for soft returns? There is no such option in the dropdown menu, so we need to use a regular expression (regex). The steps are detailed below.

In order to get to the Segmentation Rules window, we first need to do the following:

Go into Project Settings, All Language Pairs, then Translation Memory Settings:


In the window that opens, select Language Resources on the left, and then Segmentation Rules on the right, then click on Edit:


This brings up one more box. For this example, since I want Studio to create a new segment every time it finds a soft return, I need to choose Add:


To create a segmentation rule for soft returns, add a name in the description field, choose "Anything" in the "Before break" dropdown menu and "Anything" in the "After break" dropdown menu. Since a soft return is not one of the options in the "Break characters" menu, we need to go to the Advanced View by clicking the button to the right of the Description.


After clicking on Advanced View, we see this:


This is where we add the regular expression for a soft return, which should look exactly like this (feel free to copy it from below and paste it into Studio):

.[\n]+


Disclaimer: My knowledge of Regex is extremely limited; I got this expression from one of Paul Filkin's posts in a forum and simply typed it in. Thank you, Paul!
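For readers curious about what the expression actually matches, here is a minimal Python sketch. It is only an illustration under a few assumptions: the soft return is represented as a line feed (\n), Python's regex engine treats this simple pattern the same way Studio's engine does, and split_after is a hypothetical helper that merely imitates "start a new segment after each match". It is not how Studio itself is implemented. In the pattern, the dot matches any single character other than a line feed, and [\n]+ matches one or more line feeds in a row.

```python
import re


def split_after(pattern, text):
    """Hypothetical helper: cut `text` right after each match of `pattern`.

    The matched break characters stay at the end of the preceding piece.
    """
    segments = []
    start = 0
    for match in re.finditer(pattern, text):
        end = match.end()
        if end > start:
            segments.append(text[start:end])
            start = end
    # Keep whatever follows the last break as the final piece.
    if start < len(text):
        segments.append(text[start:])
    return segments


WITH_DOT = r".[\n]+"      # the expression from the post
WITHOUT_DOT = r"[\n]+"    # the same expression with the leading dot removed

sample = "First line\nSecond line\n\nThird line"
leading_break = "\nA\nB"  # text that starts with a line feed

if __name__ == "__main__":
    print(split_after(WITH_DOT, sample))
    print(split_after(WITH_DOT, leading_break))
    print(split_after(WITHOUT_DOT, leading_break))
```

In this sketch, the two patterns give the same result whenever every line feed has an ordinary character in front of it. They differ only when a line feed has nothing matchable before it, for example at the very start of the text, where the version without the dot still produces a break.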

After this, click OK several times to close all the open dialog boxes, and that's it. From now on, in files processed with this TM, a new segment will be created whenever Studio encounters a soft return.






3 comments:

  2. Replies
    1. Hi, Luca,

      Try to remove the dot from
      ".[\n]+"
      to get
      "[\n]+"

      Works for me.
      Cheers
