Cloud Translation - Advanced API lets you customize the Google Neural Machine Translation (NMT) model without writing code. This means you can tailor a custom model to your domain-specific content and produce more accurate translations than the default Google NMT model would.
The NMT model covers a large number of language pairs and does well with general-purpose text. Where a custom model excels is in handling specific, niche vocabularies. Customizing the NMT model lets you get the right translation of domain-specific terminology that matters to you.
Suppose you run a specialized reporting service that has the opportunity to expand into new countries. Those markets require that your time-sensitive content be translated correctly in real time, including specialized terminology. Instead of having to hire bilingual staff or contract with specialist translators, both of which come at a high price, you can create and refine a custom model to do the job in real time at a much lower cost.
In order to train a custom model, you supply matching pairs of segments in the source and target languages. These are pairs of words or phrases that mean the same thing in the language you want to translate from and the language you want to translate to. The closer in meaning your segment pairs are, the better your model can work.
While putting together the dataset of matching segment pairs, start with the use case:
What outcome are you trying to achieve?
What kinds of segments are important to translate correctly in order to achieve that outcome? Can the NMT model achieve that out of the box?
Is it possible for humans to translate these key segments in a way that satisfies you? If the translation task is ambiguous enough that a person fluent in both languages would have a hard time doing a satisfactory job, the NMT model might perform as well as a custom model.
What kinds of examples best reflect the type and range of segment pairs that your system will need to translate?
You're training a custom translation model because you need a model that fits a particular linguistic domain. Make sure your segment pairs do the best possible job of covering the vocabulary, usage, and grammatical quirks of your industry or area of focus. Find documents that contain typical usages you'd find in the translation tasks you want accomplished, and make sure your parallel phrases match as closely in meaning as you can arrange. Of course, sometimes languages don't map perfectly in vocabulary or syntax, but try to capture the full diversity of semantics you expect to encounter in use if that's possible. You're building on top of a model that already does a pretty good job with general-purpose translation - your examples are the special last step that makes custom models work for your use case in particular, so make sure they're relevant and representative of usage you expect to see.
It's tempting to assume that the way people write about a specific domain is uniform enough that a small number of text samples translated by a small number of translators should be sufficient to train a model that works well for anyone else writing about that domain. But we're all individuals, and we each bring our own personality to the words we write. A training dataset with segment pairs from a broad selection of authors and translators is more likely to give you a model that's useful for translating writing from a diverse organization. In addition, consider the variety of segment lengths and structures; a dataset where all the segments are the same size or share a similar grammatical structure won't produce a good custom model that captures all the possibilities.
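One quick way to check the variety of segment lengths is to bucket your source segments by word count. The following is a minimal sketch, not part of Cloud Translation - Advanced API: the function name `length_histogram` and the bucket size are hypothetical, and splitting on whitespace is an assumption that does not hold for languages written without spaces.

```python
from collections import Counter

def length_histogram(segments, bucket_size=5):
    """Count segments per word-count bucket (0-4 words, 5-9 words, ...)."""
    buckets = Counter()
    for segment in segments:
        # Assumption: whitespace-separated tokens approximate words.
        n_words = len(segment.split())
        start = (n_words // bucket_size) * bucket_size
        buckets[(start, start + bucket_size - 1)] += 1
    return dict(sorted(buckets.items()))

# Illustrative sample segments, not real training data.
sources = [
    "Net income rose.",
    "Revenue fell by three percent in the second quarter.",
    "The board approved the dividend.",
]
histogram = length_histogram(sources)
print(histogram)
```

If almost every segment lands in a single bucket, that is a hint that the dataset might not cover the range of lengths you expect in production.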
After you've established what data you need, you need to find a way to source it. You can begin by taking into account all the data your organization collects. You might find that you're already collecting the data you would need to train a translation model. In case you don't have the data you need, you can obtain it manually or outsource it to a third-party provider.
If it's at all feasible, make sure a person who understands both languages well has validated that the segment pairs match up correctly and represent understandable, accurate translations. A common mistake like misaligning the rows of your training data spreadsheet can yield translations that sound like nonsense. High-quality data is the most important thing you can provide to Cloud Translation - Advanced API to get a model that's usable for your business.
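A simple automated pass can catch some misaligned rows before a bilingual reviewer looks at the data. The sketch below flags empty segments and pairs whose character lengths differ wildly. The name `flag_suspicious_pairs` and the `max_ratio` threshold are assumptions chosen for illustration, and a heuristic like this does not replace human validation.

```python
def flag_suspicious_pairs(pairs, max_ratio=3.0):
    """Return (row_index, reason) for pairs that look misaligned.

    max_ratio is an arbitrary illustrative threshold, not a documented limit.
    """
    flagged = []
    for index, (source, target) in enumerate(pairs):
        source, target = source.strip(), target.strip()
        if not source or not target:
            flagged.append((index, "empty segment"))
            continue
        # Compare the longer segment to the shorter one, in characters.
        ratio = max(len(source), len(target)) / min(len(source), len(target))
        if ratio > max_ratio:
            flagged.append((index, "length mismatch"))
    return flagged

# Illustrative rows: the third looks like it was shifted in a spreadsheet.
sample_pairs = [
    ("Quarterly earnings", "Résultats trimestriels"),
    ("Dividend", ""),
    ("Yes", "Le conseil d'administration a approuvé le dividende"),
]
suspicious = flag_suspicious_pairs(sample_pairs)
print(suspicious)
```

Some language pairs legitimately differ a lot in length, so treat flagged rows as candidates for review and not as confirmed errors.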
A core principle underpinning Google's ML products is human-centered machine learning, an approach that promotes responsible AI practices, including fairness. The goal of fairness in ML is to understand and prevent unjust or prejudicial treatment of people related to race, income, sexual orientation, religion, gender, and other characteristics historically associated with discrimination and marginalization, when and where they manifest in algorithmic systems or algorithmically aided decision-making. You can read more in our guide and in these fairness-aware notes:
Review regulations in both your region and the locations your application will serve, as well as existing research or product information in your domain to learn about legal guidelines and possible cultural issues.
Consider whether your use case or product might negatively impact individuals' economic or other important life opportunities.
If you have scarce data about a particular subgroup, make sure that data is spread representatively between your training and test sets by performing the train-test split yourself.
When testing, think carefully about your problem domain and its potential for unfairness and bias. Come up with cases that would adversely impact your users if they were found in production, and test those first.
If you have a use case that warrants fairness considerations, read more about how to use your model in a manner that mitigates biases or adverse outcomes.
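For the note above about scarce subgroups, one way to perform the train-test split yourself is to split each subgroup separately so that none ends up only in one set. This is a minimal sketch under stated assumptions: the subgroup label on each pair, the `group_of` callback, and the 10% test fraction are all hypothetical choices for illustration.

```python
import random
from collections import defaultdict

def stratified_split(pairs, group_of, test_fraction=0.1, seed=0):
    """Split pairs so each subgroup contributes to both train and test sets."""
    by_group = defaultdict(list)
    for pair in pairs:
        by_group[group_of(pair)].append(pair)

    rng = random.Random(seed)
    train, test = [], []
    for group in sorted(by_group):
        members = by_group[group]
        rng.shuffle(members)
        # Hold out at least one example per group that has two or more members.
        n_test = max(1, round(len(members) * test_fraction)) if len(members) > 1 else 0
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test

# Illustrative data: each pair carries a hypothetical subgroup label.
labeled_pairs = [(f"s{i}", f"t{i}", "common") for i in range(20)]
labeled_pairs += [(f"r{i}", f"q{i}", "rare") for i in range(3)]
train_set, test_set = stratified_split(labeled_pairs, lambda pair: pair[2])
print(len(train_set), len(test_set))
```

With a purely random split, the three "rare" pairs here could easily all land in the training set, leaving nothing to test that subgroup with.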
You may make mistakes when preprocessing data, and some mistakes can really confuse a custom model. In particular, look for the following data issues that you can fix:
Cloud Translation - Advanced API stops parsing your data input file when:
Cloud Translation - Advanced API ignores errors for problems it cannot detect, such as:
A <tu> element in a TMX file that doesn't specify the source or target language.
For automatic data splitting, Cloud Translation - Advanced API performs additional processing (see Dataset division):
After the dataset is uploaded, it removes segment pairs with identical source segments.
It randomly splits your data into three sets with a ratio of 8:1:1 (train:validation:test) before training.
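To get a feel for what the two steps above do to your data, you can approximate them locally. The sketch below is an illustration only, not the service's implementation: keeping the first occurrence of a duplicated source segment, the fixed random seed, and the rounding of set sizes are all assumptions.

```python
import random

def dedupe_and_split(pairs, seed=0):
    """Drop pairs whose source segment was already seen, then split 8:1:1."""
    seen, unique = set(), []
    for source, target in pairs:
        # Assumption: the first pair with a given source segment is kept.
        if source not in seen:
            seen.add(source)
            unique.append((source, target))

    random.Random(seed).shuffle(unique)
    n_train = len(unique) * 8 // 10
    n_validation = len(unique) // 10
    train = unique[:n_train]
    validation = unique[n_train:n_train + n_validation]
    test = unique[n_train + n_validation:]
    return train, validation, test

# 100 distinct pairs plus 5 pairs that repeat an existing source segment.
raw_pairs = [(f"source {i}", f"target {i}") for i in range(100)]
raw_pairs += [(f"source {i}", f"alternative {i}") for i in range(5)]
train, validation, test = dedupe_and_split(raw_pairs)
print(len(train), len(validation), len(test))
```

Running a check like this before upload shows how many pairs survive deduplication, which matters if you rely on several target variants for the same source segment.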
Your dataset of segment pairs is divided into three subsets, for training, validation and testing:
Training Set
Most of your data should be in the training set. This is the data
your model "sees" during training: it's used to learn the parameters of the
model and calculate the weights of the connections between nodes of the
neural network.
Validation Set
The validation set, sometimes called the dev set, is also used during the
training process. During model learning, the framework uses the training set to
train a suite of candidate models, and then uses the model's performance on the
validation set to choose the best model generated. It uses the model's
performance on the validation set to tune the model's hyperparameters, which are
variables that specify the model's structure. If you used the training set to
tune the hyperparameters, the model would end up overly focused on your training
data. Using a somewhat novel dataset to fine-tune model structure means your
model will generalize better.
Test Set
The test set is not involved in the training process at all. After the
model has completed its training entirely, the test set is used as an
entirely new challenge for your model. The performance of your model on
the test set is intended to give you a pretty good idea of how your model
will perform on real-world data.
If you don't manually specify how your dataset is split between these functions as described in Preparing your training data, and if your dataset contains fewer than 100,000 segment pairs, then Cloud Translation - Advanced API automatically uses 80% of your segment pairs for training, 10% for validation, and 10% for testing. If your data is larger than that, you must explicitly specify how it is split. Manual splitting gives you more control over the process, not only letting you determine the split percentages, but also letting you specify particular sets in which to include particular segment pairs.
After you've decided whether a manual or automatic split of your data is right for you, there are two ways to add data:
You can import data as a tab-separated values (TSV) file containing source and target segments, one segment pair per line.
You can import data as a TMX file. If the <tmx> element is missing, Cloud Translation - Advanced API ends processing and returns an error. It also ends processing and returns an error if it skips more than 1024 invalid <tu> elements.
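The TSV layout described above (source segment, a tab, target segment, one pair per line) can be produced with a few lines of Python. This is a minimal sketch: the helper names, the file name, and the choice to reject segments containing tabs or newlines are assumptions for illustration, and UTF-8 is assumed as the file encoding.

```python
import os
import tempfile

def write_segment_pairs(path, pairs):
    """Write (source, target) pairs as a TSV file, one pair per line."""
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for source, target in pairs:
            for segment in (source, target):
                # A tab or newline inside a segment would corrupt the layout.
                if "\t" in segment or "\n" in segment:
                    raise ValueError(f"segment contains a tab or newline: {segment!r}")
            handle.write(f"{source}\t{target}\n")

def read_segment_pairs(path):
    """Read the TSV file back into a list of (source, target) tuples."""
    with open(path, encoding="utf-8", newline="") as handle:
        return [tuple(line.rstrip("\n").split("\t")) for line in handle]

# Illustrative pairs, written to a temporary folder and read back.
tsv_pairs = [
    ("Quarterly earnings", "Résultats trimestriels"),
    ("Net loss", "Perte nette"),
]
with tempfile.TemporaryDirectory() as folder:
    tsv_path = os.path.join(folder, "segments.tsv")
    write_segment_pairs(tsv_path, tsv_pairs)
    round_trip = read_segment_pairs(tsv_path)
print(round_trip == tsv_pairs)
```

Reading the file back and comparing it with the input is a cheap way to confirm that no row was shifted or split during export.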
After your model is trained, you receive a summary of your model performance. Click the Train tab to view a detailed analysis. The BLEU score of your custom model and of the standard Google NMT model are shown in the Train tab, along with the BLEU score performance gain from using the custom model.
The higher the BLEU score, the better translations your model can give you for segments that are similar to your training data. Scores in the range 30-40 are considered good. For a detailed explanation of BLEU scores, see The BLEU translation quality metric.
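To build intuition for what a BLEU score measures, here is a simplified, single-sentence version: clipped n-gram precision for n = 1 to 4, combined by a geometric mean and multiplied by a brevity penalty, on a 0-100 scale. This is a sketch under stated assumptions (whitespace tokenization, one reference, no smoothing) and is not the exact computation the service reports, which is typically made over a whole test set.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU on a 0-100 scale (no smoothing)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precision_sum += math.log(overlap / total)
    # Penalize candidates that are shorter than the reference.
    brevity_penalty = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return 100 * brevity_penalty * math.exp(log_precision_sum / max_n)

reference = "the fund reported a net loss this quarter"
candidate = "the fund reported a net loss for the quarter"
score = bleu(candidate, reference)
print(round(score, 1))
```

An exact match scores 100 and a translation that shares no 4-word sequence with the reference scores 0 in this unsmoothed version, which is one reason sentence-level scores are noisy and corpus-level scores are preferred.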
There are other evaluation metrics that are often more reliable than the BLEU score. For information about those evaluation options, see Evaluate translation models.
Debugging a custom model is more about debugging the data than the model itself. If your model is not translating the way you intend, check your data to see where it can be improved.
Even if your BLEU score looks okay, it's a good practice to check the model yourself to make sure its performance matches your expectations. If your training and test data are drawn from the same incorrect set of samples, the scores might be excellent even if the translation is nonsense. Add some examples as input on the Predict tab and compare the results from the custom model with the Google NMT base model. You might notice that your model comes up with the same predictions as the base model, especially on short segments or if you have a smaller training set, since the base model is already pretty good for a wide variety of use cases. In that case, try longer or more complex segments. However, if all of your segments come back identical to the predictions from the base model, it might indicate a data problem.
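If you collect the custom model's and the base model's outputs for the same input segments, a small helper can tell you how often they are identical. This is a minimal sketch: `identical_rate` is a hypothetical name, the sample strings are made up, and how you obtain the outputs (for example, by copying them from the Predict tab) is left open.

```python
def identical_rate(custom_outputs, base_outputs):
    """Fraction of segments where the custom and base models agree exactly."""
    if len(custom_outputs) != len(base_outputs):
        raise ValueError("both lists must cover the same input segments")
    if not custom_outputs:
        return 0.0
    same = sum(c.strip() == b.strip() for c, b in zip(custom_outputs, base_outputs))
    return same / len(custom_outputs)

# Illustrative outputs for three input segments.
custom_outputs = ["Perte nette", "Bénéfice par action", "Flux de trésorerie"]
base_outputs = ["Perte nette", "Gain par action", "Flux de trésorerie"]
rate = identical_rate(custom_outputs, base_outputs)
print(f"{rate:.0%} of outputs are identical")
```

A rate near 100% across long, domain-specific segments is the situation described above that might indicate a data problem.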
If there's a mistake that you're particularly worried about your model making (like a translation mistake that might be costly in money or reputation), make sure your test set or procedure covers that case adequately for you to feel safe using your model in everyday tasks.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-06-09 UTC.