diff --git a/content/blog/when-and-how-to-train-a-language-model/index.md b/content/blog/when-and-how-to-train-a-language-model/index.md index 763375c1..43595430 100644 --- a/content/blog/when-and-how-to-train-a-language-model/index.md +++ b/content/blog/when-and-how-to-train-a-language-model/index.md @@ -84,7 +84,7 @@ Many people underestimate the role data labeling can play in machine learning. I What _really_ makes for good models is annotated data, specifically “difficult” annotations, as they can teach your model to deal with cases that even humans find hard to handle. -While we’ll admit that annotation might not be the most fun work, there are tools to make the process easier for everyone. For example, the [Haystack annotation tool](https://www.deepset.ai/annotation-tool-for-labeling-datasets) provides the framework for a more streamlined process. Clear guidelines go a long way toward a well annotated and consistent dataset. It’s also valuable to engage with your own data intimately, as it will increase your understanding of the use case and why certain predictions may be hard for your model. +While we’ll admit that annotation might not be the most fun work, there are tools to make the process easier for everyone. For example, the [Haystack annotation tool](https://docs.haystack.deepset.ai/docs/annotation) provides the framework for a more streamlined process. Clear guidelines go a long way toward a well annotated and consistent dataset. It’s also valuable to engage with your own data intimately, as it will increase your understanding of the use case and why certain predictions may be hard for your model. So to really drive the point home: we recommend investing in _data annotation_ rather than model creation. Machine learning researchers have worked hard to come up with model architectures that emulate linguistic intuition faithfully, and new techniques are constantly emerging to make existing models smaller and faster. But you and your team’s expertise lies in your own data — and that is precisely the area where you can have the biggest impact on your models’ performance.