Unfortunately that popularity directly translates to the AI’s ability to digest and paraphrase a book. LLMs have been trained on what is available in computer text format, which means mostly internet sources. English has an outsized presence on the internet compared to the actual number of native speakers, so there’s orders of magnitude more training data for it than for any other language. The models of other languages will be severely limited, if AI companies have spent the resources to train them at all.
There are many AI companies, including those that are based in countries where people communicate in other languages. What you are saying is not an insurmountable problem.
Yes, it is insurmountable. There is not enough non-English text in the world to be able to train an LLM.