Make translation feature work with NPC dialogues

kristofbolyai opened this issue a year ago · 13 comments

kristofbolyai commented a year ago

Note to future self: Make sure to test all behavior with Chat Tabs Feature disabled!

The Issue

This issue is used to track the progress on making the mod handle translating NPC dialogues, and most importantly, displaying the translated version, either on the overlay, or in chat.

Most of the work needs to be done in ChatHandler itself, not TranslationService, most likely in the following order:

1. Fix some minor bugs in dialogue parsing (in ChatHandler).
2. Create some kind of NpcDialogueModel, and move the responsibility of tracking which dialogue is currently displayed there, from NpcOverlay.
   - The model needs to know about "NPC dialogue extraction dependents" and signal their presence to ChatHandler.
   - The model has the responsibility of distributing the dialogues between consumers (if the overlay is disabled, the dialogue should fall back to the chat).
   - Add the ability to use NPC dialogue extraction with a chat-based display mode.
3. Make TranslationFeature work with NpcDialogueModel to easily distribute translated data.
4. Optionally, explore integrating Wynnic/Gavellian transcription into the model.
5. Fix the translation cache getting corrupted quite frequently.
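
To make the model's responsibilities from the list above more concrete, here is a minimal sketch. It is not the actual Wynntils code: `DialogueConsumer`, `RecordingConsumer`, `NpcDialogueModelSketch` and all of their methods are hypothetical names chosen for illustration. The assumption is that ChatHandler would call `onDialogueChanged` with the parsed lines and could query `hasExtractionDependents` to decide whether extraction is needed at all.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical consumer of extracted NPC dialogue (e.g. an overlay or the chat).
interface DialogueConsumer {
    boolean isEnabled();

    void display(List<String> dialogueLines);
}

// Trivial consumer used only for illustration; it remembers the last lines it was given.
final class RecordingConsumer implements DialogueConsumer {
    boolean enabled;
    List<String> lastDisplayed = List.of();

    RecordingConsumer(boolean enabled) {
        this.enabled = enabled;
    }

    @Override
    public boolean isEnabled() {
        return enabled;
    }

    @Override
    public void display(List<String> dialogueLines) {
        lastDisplayed = dialogueLines;
    }
}

// Hypothetical model that owns the "currently displayed dialogue" state.
final class NpcDialogueModelSketch {
    private final List<DialogueConsumer> consumers = new ArrayList<>();
    private final DialogueConsumer chatFallback;
    private List<String> currentDialogue = List.of();

    NpcDialogueModelSketch(DialogueConsumer chatFallback) {
        this.chatFallback = chatFallback;
    }

    void registerConsumer(DialogueConsumer consumer) {
        consumers.add(consumer);
    }

    // True if at least one registered consumer currently wants extracted dialogue.
    boolean hasExtractionDependents() {
        return consumers.stream().anyMatch(DialogueConsumer::isEnabled);
    }

    List<String> getCurrentDialogue() {
        return currentDialogue;
    }

    // Stores the new dialogue and hands it to every enabled consumer;
    // if none is enabled, the dialogue falls back to the chat.
    void onDialogueChanged(List<String> dialogueLines) {
        currentDialogue = List.copyOf(dialogueLines);
        boolean delivered = false;
        for (DialogueConsumer consumer : consumers) {
            if (consumer.isEnabled()) {
                consumer.display(currentDialogue);
                delivered = true;
            }
        }
        if (!delivered) {
            chatFallback.display(currentDialogue);
        }
    }
}
```

With this shape, a translation feature would only need to transform the lines before they reach `onDialogueChanged` (or wrap the consumers), without caring whether they end up on the overlay or in chat.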

magicus · Answer 1 · 2024-03-04T11:18:44.000Z

A semi-related thought -- I wonder whether there are any flashy new LLM-based translation services that could be used as an alternative backend.

kristofbolyai · Answer 2 · 2024-03-04T11:20:53.000Z

> A semi-related thought -- I wonder whether there are any flashy new LLM-based translation services that could be used as an alternative backend.

DeepL has been suggested before; I am thinking of implementing it.

magicus · Answer 3 · 2024-03-04T11:19:31.000Z

And a second semi-related thought -- this would pave the way for allowing an easy (or at least easier) integration of Voices of Wynn into Wynntils.

kristofbolyai · Answer 4 · 2024-03-04T11:21:47.000Z

> And a second semi-related thought -- this would pave the way for allowing an easy (or at least easier) integration of Voices of Wynn into Wynntils.

Yes, we will now have knowledge of the current dialogues, so it's definitely a step.

magicus · Answer 5 · 2024-03-11T13:36:37.000Z

An additional bullet or two that I've thought about for a long time:

- Have a crowd-sourcing service that stores translations. Basically, you'd make a request like "lang = sv, tracked quest = Find the Gnuff, npc = Gnuffdorf Citizen, text = Beware, adventurer!", and if there is a previous Swedish translation of that text from that NPC, you'd get it back. Otherwise, the mod asks the translation service and sends the result back to the server, which stores it for future use.
- Allow users to edit the stored translations to correct machine translation errors. This will need some kind of moderation system etc. Maybe CrowdIn would suit this better than it suits the literal strings of the mod.

In time, given enough players, this will build up a base of all dialogues, translated into all languages actually played by Wynntils users, with the convenience of machine translation, combined with the possibility of human curation.
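
A rough sketch of the lookup-then-fallback flow described above, purely as an illustration. Everything here is an assumption: `DialogueKey` and `CrowdSourcedTranslationStore` are made-up names, the "server" is just an in-memory map, the machine translation backend is passed in as a plain function, and moderation is left out entirely. It uses Java records, so it assumes Java 16 or newer.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical request key: language, tracked quest, speaking NPC, and original text.
record DialogueKey(String lang, String trackedQuest, String npc, String text) {}

// In-memory stand-in for the crowd-sourcing server; no networking or moderation.
final class CrowdSourcedTranslationStore {
    private final Map<DialogueKey, String> stored = new HashMap<>();

    // Stand-in for a machine translation backend: (text, lang) -> translated text.
    private final BiFunction<String, String, String> machineTranslator;

    CrowdSourcedTranslationStore(BiFunction<String, String, String> machineTranslator) {
        this.machineTranslator = machineTranslator;
    }

    // Returns the stored translation if one exists; otherwise asks the
    // machine translator once and stores the result for future requests.
    String translate(DialogueKey key) {
        return stored.computeIfAbsent(key, k -> machineTranslator.apply(k.text(), k.lang()));
    }

    // A human correction simply replaces whatever was stored before.
    void submitCorrection(DialogueKey key, String correctedTranslation) {
        stored.put(key, correctedTranslation);
    }
}
```

Keying on the quest and NPC as well as the text means the same English line can carry different translations in different contexts, at the cost of more cache misses early on.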

kristofbolyai · Answer 6 · 2024-03-11T13:39:03.000Z

@magicus Even better solution, use ChatGPT :)

(I am actually thinking of implementing it in some way, maybe by pre-translating the crowd-sourced dialogues)

magicus · Answer 7 · 2024-03-12T12:38:05.000Z

I'm not sure ChatGPT does this better. Google Translate has actually been doing a great job, and they have spent years honing it on things like translating text with formatting. ChatGPT is cool and all, but it is a generic tool, and due to the nature of LLMs it might very well decide that it does not need to keep the formatting, or that it can shift it around, etc.

But for both services, I think more context is better. So if we could collect all dialogue lines from every NPC and send them in for translation in a single request, the result would definitely be more consistent. For instance, names or terms might otherwise be translated differently from line to line. Some NPCs have a characteristic manner of speaking (professional and urban, sloppy and lowly, etc.) that might not show clearly in a single line but will show when the lines are seen in context.

kristofbolyai · Answer 8 · 2024-03-12T12:39:51.000Z

I've been testing GPT, and while 3.5-turbo is pretty bad, 4.0-Turbo seems to do a very good job. I've sent some of my findings in Discord if you have time, and are interested.

magicus · Answer 9 · 2024-03-12T12:41:02.000Z

But if we just start by adding crowd-sourcing functionality in Athena, and send the lines as we encounter them, we'd have the basis for a lot of future fun. :-)

I think it is important to send not just the actual text content, but also the active quest. Not all dialogues are associated with a quest, but most are, and we'd have to find a way to tell the two kinds apart. (Probably rather easy: if the same speaker says the same content with no quest active, or with different quests active, consider the line general.)
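
The heuristic in the parenthesis could look something like the sketch below. This is only an illustration under assumptions: `DialogueQuestClassifier`, `observe` and `isGeneral` are hypothetical names, a `null` quest stands for "no quest active", and a real implementation would presumably want more than two sightings before drawing a conclusion. It uses a nested record, so it assumes Java 16 or newer.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical classifier that decides whether a dialogue line is quest-specific or general.
final class DialogueQuestClassifier {
    private record Line(String npc, String text) {}

    // For each (npc, text) pair, the quest contexts it was observed in.
    // Optional.empty() means the line was seen with no quest active.
    private final Map<Line, Set<Optional<String>>> observations = new HashMap<>();

    // Records one sighting of a line; activeQuest may be null if no quest was active.
    void observe(String npc, String text, String activeQuest) {
        observations
                .computeIfAbsent(new Line(npc, text), k -> new HashSet<>())
                .add(Optional.ofNullable(activeQuest));
    }

    // A line counts as general if it was seen with no quest active,
    // or under more than one distinct quest.
    boolean isGeneral(String npc, String text) {
        Set<Optional<String>> seen = observations.getOrDefault(new Line(npc, text), Set.of());
        return seen.contains(Optional.empty()) || seen.size() > 1;
    }
}
```

Note that a line seen under only one quest is treated as quest-specific until contrary evidence arrives, so early classifications can flip as more players report the same line.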

magicus · Answer 10 · 2024-03-12T12:41:53.000Z

I've noticed the discussion in Discord but have not had time yet to read it in detail. It might end up working better, sure, and alternatives are likely good. But I think the first step is collecting the data.

kristofbolyai · Answer 11 · 2024-03-12T12:42:40.000Z

Only if we can get some server infrastructure going, which is never easy...

magicus · Answer 12 · 2024-03-12T12:56:44.000Z

I started a thread about this in Discord.

kristofbolyai · Answer 13 · 2024-03-12T17:52:42.000Z

This issue is done. Any other improvements for the translation feature itself should be collected in a separate issue. As for Crowd Sourced NPC Dialogues, we are tracking that internally.
