I think many people, including me, want a Japanese translation option for websites. Right now I have to use Google Translate for Japanese websites, and I don't like having to go to Google Translate every time.
A question for @nordzilla, you mentioned training a model... Is the translation done by machine learning?
Usually these large language models are served as an online service because they require a lot of processing power. Will this translation model somehow run locally so that privacy and security are maintained, or does the translation tool need to send the source text over the network to a server running the LLM at Mozilla?
Hey @UnseenLurker, I appreciate you taking the time to ask!
Usually these large language models are served as an online service because they require a lot of processing power. Will this translation model somehow run locally so that privacy and security are maintained, or does the translation tool need to send the source text over the network to a server running the LLM at Mozilla?
Each of our models is trained for a specific language pair (e.g. Japanese to English). They are distilled to a size of roughly 20 to 25 megabytes.
You can view the model sizes and release statuses on this dashboard.
When a translation is requested, the models for the relevant language pair are downloaded to the user's device. Our inference engine has been optimized to run the models locally on the device's CPU.
You can view the open-source models themselves in this repository.
The only information transferred over the wire is the download of the language-specific models to your device. The content of the source text and the translated output are not sent anywhere during translation. Once the models have been downloaded, an Internet connection is not required to translate.
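To make that flow concrete, here is a minimal sketch of the idea, not the actual Firefox implementation. Every name in it (`LocalModelCache`, `translate_locally`, `fake_fetch`) is hypothetical, the "model" is a placeholder byte string, and the "translation" is a stand-in for real on-device inference. It only illustrates that the download step receives the language pair and never the text, and that repeat translations need no further downloads.

```python
from typing import Callable, Dict, List, Tuple

LanguagePair = Tuple[str, str]  # e.g. ("ja", "en")


class LocalModelCache:
    """Hypothetical on-device cache holding one model per language pair."""

    def __init__(self, fetch_model: Callable[[LanguagePair], bytes]) -> None:
        # fetch_model stands in for the only step that would use the network.
        self._fetch_model = fetch_model
        self._models: Dict[LanguagePair, bytes] = {}
        self.download_count = 0

    def ensure_model(self, pair: LanguagePair) -> bytes:
        # Download only when the pair is not already stored locally.
        if pair not in self._models:
            self._models[pair] = self._fetch_model(pair)
            self.download_count += 1
        return self._models[pair]


def translate_locally(cache: LocalModelCache, pair: LanguagePair, text: str) -> str:
    """Placeholder for local inference: the text never leaves this function."""
    model = cache.ensure_model(pair)
    # A real engine would run the model on the CPU here; this just tags the text.
    return f"[{pair[0]}->{pair[1]}, {len(model)}-byte model] {text}"


# Record what the (fake) download step is asked for: language pairs only.
fetch_requests: List[LanguagePair] = []


def fake_fetch(pair: LanguagePair) -> bytes:
    fetch_requests.append(pair)
    return b"\x00" * 16  # assumption: stand-in for a roughly 20 to 25 MB model file


cache = LocalModelCache(fake_fetch)
first = translate_locally(cache, ("ja", "en"), "こんにちは")
second = translate_locally(cache, ("ja", "en"), "ありがとう")
```

After both calls, `cache.download_count` is 1 and `fetch_requests` contains only `("ja", "en")`: the second translation reuses the stored model, which is the sense in which no Internet connection is needed once the models are on the device.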
You mention Japanese to English translation as an example. Is that because this thread is about Japanese? The official page and repo make no mention of support for Japanese, Chinese, or Korean (so-called CJK).
Note: I'm also participating in the Chrome Built-in AI test, but I've not seen such a drastic change in direction there. We are very excited about this endeavor and would like to give you keen feedback💪
Note 2: Since posting this, I should add that a similar effort has started in early Chrome testing. The initial language pairs are 「en ←→ es」 and 「en ←→ ja」.
The documentation is private, so I can't post it, but if you are participating, please check it out.
Glad to see this feature is finally taking shape. I am curious, though: how long does it typically take to train these models on a single language pair, Japanese/English for instance? I would like to know whether training takes long enough that this feature could still be released by the end of the year or would get pushed to 2025, as I am eagerly awaiting the ability to translate Japanese natively. Thanks!
My comments were not shown, and my post disappeared after I edited it and waited for it to be published. I'm sorry if my opinion came across as unsympathetic, but if there is a way to restore it, please do so.
@daifyi there was a firefox-translations repo, which was shut down, and discussion was directed to this forum.
Just adding my 30 yen: there are so many great Japanese blogs and portfolio sites begging to be read. I need some context for all the beautiful photos and craftsmanship! Also... navigating amazon.jp can be quite the adventure.
Edit: wow, it actually looks like Amazon went ahead and added a toggle for both English and Chinese. Very nice. It seems to cover everything except reviews (though you can translate them one by one, just as on amazon.com).
I agree. Japanese translation is a very important feature, and many people are looking forward to it. Also, the current built-in translation feature is easier to use and more convenient than the extension. Please add it.