
There are going to be diminishing returns from splitting the languages: you get less information related to the region / concept just because you're avoiding mixing languages. Language was not the only aspect: "cultural background, and in-depth regional knowledge". There's going to be lots of information shared across the south/north languages just because they're geographically close (relatively, anyway).

I mean you wouldn't want to split a model into 3 separate ones, where one contains Austrian, another Slovakian, and another Hungarian, since there's going to be lots of cultural overlap.



I agree that it makes sense to group the Indic languages together due to cultural proximity, but why would you group the Indic languages with Middle Eastern ones? Might as well group them with European or African or Sinitic languages at that point.



