Q16 – How about data cleaning? What is your approach?

Companies should not underestimate the effort that data cleaning requires, an effort they often only begin to understand when they start exporting bilingual (parallel) data for machine learning. Because of the limitations and features of CAT tools, noise can enter a sentence in the form of unwanted code, but the concept of data cleaning goes beyond removing in-lines, as …
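
As a rough illustration of the narrowest step mentioned above, removing in-line code from a parallel segment pair, here is a minimal Python sketch. The tag pattern, the function names `strip_inlines` and `clean_pair`, and the sample segments are all assumptions made for illustration; real CAT exports use a variety of tag formats, and this does not cover the broader cleaning the text alludes to.

```python
import re

# Assumed pattern (hypothetical): XML-style in-line tags such as <b>, </b>,
# <ph id="1"/>, plus numbered placeholders such as {1}. Adjust to the export format.
INLINE_TAG = re.compile(r"</?[A-Za-z][^<>]*>|\{\d+\}")


def strip_inlines(segment: str) -> str:
    """Delete in-line tags, then collapse any repeated whitespace."""
    without_tags = INLINE_TAG.sub("", segment)
    return re.sub(r"\s+", " ", without_tags).strip()


def clean_pair(source: str, target: str):
    """Return the cleaned (source, target) pair, or None if either side ends up empty."""
    src, tgt = strip_inlines(source), strip_inlines(target)
    if not src or not tgt:
        return None  # a segment that was only tags carries no training signal
    return src, tgt


if __name__ == "__main__":
    print(clean_pair("Click <b>Save</b> to continue.",
                     "Haga clic en {1}Guardar{2} para continuar."))
```

Tags are deleted rather than replaced with a space so that punctuation stays attached to the preceding word; the trade-off is that a tag standing in for a line break would merge the words on either side of it.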