What is 'Learning Terms' all about?
Glossary management has always been a crucial part of localization services, especially for services that depend on agility, scalability, and the integrity of their translations.
In a perfect world, every localization project would go through a step where terminology is extracted, translated, and validated before translation even begins.
In reality, however, some projects start already behind schedule, and soon different people in different parts of the world are working on the same files, making all kinds of terminological decisions across different languages.
Additionally, manually adding terms to glossaries is a tedious, time-consuming task and requires extra discipline from the translation team.
Glossaries also frequently suffer from asymmetrical data: some languages are overloaded with irrelevant terms or distinct meanings for the same word, while other languages barely have any entries in the database. This adds noise to the translation, and key concepts may be left behind.
We at Bureau Works took all of this into account when designing our Learning Terms feature, which "magically" learns from your source content and seamlessly adds terms to your glossary.
When your file is parsed, Learning Terms identifies key terms to be cataloged in the glossary, and during translation, Learning Terms adds data to the glossary based on context-sensitive translations.
How to Use Learning Terms
Learning Terms can help you in two different steps of a project: during the parsing of the file, when it will extract key terms, and during the translation process.
To use Learning Terms during the parsing of the file, all you need to do is enable the feature in the parameters screen.
This is highly customizable, and you can set your preferences accordingly.
Extract Terms from Source Content: Enables the Learning Terms feature, which extracts key terms from the text while it's being parsed;
Keep Case: Preserves case sensitivity when differentiating terms;
Max Words Per Term: The maximum number of words a sequence may contain and still be retained as a term. We recommend keeping this value to a reasonable maximum (6 or 7 at most);
Minimum Occurrences: The minimum number of times a term must occur in the text in order to be extracted;
Minimum Words Per Term: The minimum number of words a term must contain in order to be retained;
Remove Sub Terms: When a term is formed by sub-terms, enabling this option ignores the sub-terms and registers only the whole term;
Sort By Occurrence: Sorts the list of extracted terms by frequency, with the most frequent terms at the top of the list;
Top Terms Limit: The maximum number of key terms extracted during the parsing process.
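To make these parameters concrete, here is a minimal, illustrative sketch of how an n-gram term extractor driven by settings like the ones above could work. This is not Bureau Works' actual implementation; the function name and signature are hypothetical and exist only to show how each setting shapes the result.

```python
from collections import Counter
import re

def extract_terms(text, keep_case=True, max_words=3, min_words=1,
                  min_occurrences=2, remove_sub_terms=True,
                  sort_by_occurrence=True, top_terms_limit=10):
    """Hypothetical term extractor mirroring the settings above."""
    words = re.findall(r"\w+", text if keep_case else text.lower())
    counts = Counter()
    # Count every word sequence between min_words and max_words long
    for n in range(min_words, max_words + 1):
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    # Minimum Occurrences: keep only sufficiently frequent candidates
    terms = {t: c for t, c in counts.items() if c >= min_occurrences}
    if remove_sub_terms:
        # Remove Sub Terms: drop a term contained in a longer retained term
        terms = {t: c for t, c in terms.items()
                 if not any(t != other and t in other for other in terms)}
    # Sort By Occurrence, then apply the Top Terms Limit
    ordered = sorted(terms.items(),
                     key=lambda tc: (-tc[1] if sort_by_occurrence else tc[0]))
    return ordered[:top_terms_limit]
```

For example, in a text where "machine translation" appears twice, the sub-terms "machine" and "translation" would be discarded in favor of the whole term when Remove Sub Terms is enabled.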
Once the document is parsed, Learning Terms estimates which terms are the best candidates to be automatically learned, based on their occurrence and context.
You can access this panel by selecting the stacked ellipsis on the right side of the desired work unit and then selecting "Extracted Terms":
Now, to use Learning Terms during the translation process, make sure LLM is active in your account and in your Organizational Unit. If you need help verifying the LLM status in your Account/Org. Unit, please check out this article.
Once you're all set to start translating, Learning Terms will do its magic! During the translation process, the terms that were extracted and had their relevance estimated will be automatically added to your glossary! To make the origin of each entry clear, learned terms are marked with a unique icon:
If you already have a manually registered term in your glossary, Learning Terms will not overwrite anything that you previously added. However, it will update other learned terms. This way you can leverage smart technologies to improve your work while keeping the data integrity of your glossaries safe!
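This merge rule can be sketched in a few lines. The sketch below is illustrative only (the function and data shapes are assumptions, not Bureau Works' API): manually registered entries are never overwritten, while learned entries can be refreshed by newer learned data.

```python
def merge_learned_terms(glossary, learned):
    """Merge newly learned terms into a glossary without touching manual entries.

    glossary: dict mapping term -> {"translation": str, "origin": "manual" | "learned"}
    learned:  dict mapping term -> translation learned during this session
    """
    for term, translation in learned.items():
        existing = glossary.get(term)
        if existing and existing["origin"] == "manual":
            continue  # manual entries are never overwritten
        # New terms and previously learned terms are (re)written as learned
        glossary[term] = {"translation": translation, "origin": "learned"}
    return glossary
```

So a manual entry such as "cloud" keeps its human-chosen translation even if the feature later learns a different one, while a previously learned entry is simply updated in place.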