Wynntils


Context-Based Translation Solution with Cloudflare Caching for NPC Dialogue

layue13 opened this issue · 7 comments

commented

Description

This issue proposes a context-aware NPC dialogue translation solution that uses Cloudflare’s CDN and Worker services to provide high-quality translations at minimal cost. By leveraging Cloudflare’s edge caching, the system can store translated dialogues at CDN nodes. Over time, this reduces the need to invoke Workers and GPT for repeat translations, creating a highly cost-effective, potentially near-free, solution.


Background

In our NPC dialogue system, players interact with NPCs through sequential messages that often require translation. Understanding the full context of each message improves translation quality. By building a context chain for each dialogue (using the previous message's id), we can accurately translate each message in sequence. Over time, Cloudflare’s edge caching can store these translations, making them available instantly without invoking Workers.


Proposed Workflow

  1. Mod Layer: Capture and Send Individual Dialogues

    • Each time a player interacts with an NPC:
      • Capture the dialogue message and generate a unique id (text_hash) based on the content.
      • Include parent_id, which is the id of the previous dialogue, to maintain the context chain.
      • Send the dialogue data (including id, parent_id, and text) to Cloudflare Worker.
  2. Cloudflare Worker: Context Assembly and Translation

    • Step 1: Check Cache
      • Cloudflare CDN acts as the first layer of caching. If the translation is already cached at the CDN edge, it is served instantly.
    • Step 2: Worker Cache Check
      • If no CDN cache is available, the Worker checks its internal key-value cache for the current id. If found, it returns the cached translation.
    • Step 3: Build Context and Translate
      • If no cache is available, the Worker constructs the dialogue context chain from parent_id and sends it to GPT for translation.
    • Step 4: Cache Results at All Levels
      • The translated message is cached in the key-value store and returned through Cloudflare, where it is cached at the CDN edge for future requests.
  3. Data Storage Format

    • Each entry in the key-value database uses the id (text_hash) as the key, with the following JSON format:

      {
        "id": "text_hash",
        "parent_id": "previous_text_hash",
        "text": "Hello, adventurer! How can I assist you today?",
        "translations": {
          "zh_cn": {
            "text": "你好,冒险者!今天我能为你做些什么?"
          },
          "es_es": {
            "text": "¡Hola, aventurero! ¿Cómo puedo ayudarte hoy?"
          }
        }
      }
    • This format allows us to retrieve context and translations for multiple languages, with each language stored under translations by locale (e.g., zh_cn, es_es).

  4. Long-Term Efficiency via CDN Caching

    • As translations are repeatedly requested, they will be stored at Cloudflare’s CDN edge, allowing future requests to bypass Workers entirely.
    • This effectively creates a free translation system over time, as dialogues are increasingly served directly from CDN cache.
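The Worker's lookup path described above can be sketched as pure logic, with the key-value store modeled as a `Map` keyed by `text_hash`. All names here (`DialogueEntry`, `buildContextChain`, `lookupTranslation`) are illustrative, not from any existing codebase, and the real Worker would use Cloudflare's KV and Cache APIs instead of an in-memory map:

```typescript
interface DialogueEntry {
  id: string;
  parent_id: string | null;
  text: string;
  translations: Record<string, { text: string }>;
}

// Walk parent_id links from the current message back to the conversation
// root, returning messages in chronological order for the translation prompt.
function buildContextChain(
  store: Map<string, DialogueEntry>,
  id: string,
): DialogueEntry[] {
  const chain: DialogueEntry[] = [];
  let current = store.get(id);
  while (current) {
    chain.unshift(current);
    current = current.parent_id ? store.get(current.parent_id) : undefined;
  }
  return chain;
}

// Return a cached translation if one exists for the requested locale;
// otherwise return the assembled context chain to send to the translation
// backend (GPT in this proposal).
function lookupTranslation(
  store: Map<string, DialogueEntry>,
  id: string,
  locale: string,
): { cached: string | null; context: DialogueEntry[] } {
  const entry = store.get(id);
  const cached = entry?.translations[locale]?.text ?? null;
  return { cached, context: cached ? [] : buildContextChain(store, id) };
}
```

On a cache miss, the returned `context` array is what would be serialized into the GPT prompt, and the resulting translation would be written back under the message's `id` before responding.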

Example Workflow

  1. Initial Interaction

    • Player starts a dialogue with an NPC.
    • The first message, "Hello, adventurer!", is captured, assigned an id, and sent to Cloudflare Worker.
    • Worker finds no parent_id (indicating a new conversation), sends the message to GPT for translation, and caches the result at both the Worker level and the CDN edge.
  2. Subsequent Messages

    • Each new message in the conversation includes the previous message’s id as parent_id.
    • The Worker retrieves context by following parent_id values, sends the full context to GPT for translation, and caches the result.
    • Over time, frequently requested translations will be served directly from the CDN edge cache.
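One possible way to derive the content-based id (`text_hash`) used above is a SHA-256 digest of the raw dialogue text. The mod itself is Java, so this is only a sketch of the scheme, not real client code; `textHash` and `buildPayload` are hypothetical names:

```typescript
import { createHash } from "node:crypto";

// Content-based id: identical dialogue lines always map to the same key,
// which is what makes CDN/KV caching effective across players.
function textHash(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Payload shape sent to the Worker for each captured line; parentId is
// null for the first message of a conversation.
function buildPayload(text: string, parentId: string | null) {
  return { id: textHash(text), parent_id: parentId, text };
}
```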

Benefits

  • Contextual Accuracy: By building the dialogue chain dynamically, we ensure contextually accurate translations.
  • Cost-Efficient: Caching at multiple levels (CDN, Worker) minimizes GPT calls, reducing operational costs significantly.
  • Near-Free Solution Over Time: As CDN caching fills up, Workers and GPT usage decrease, potentially creating a nearly free translation system.

Requirements

  • Mod Layer: Capture each NPC dialogue, generate id and parent_id, and send the data to Cloudflare Worker.
  • Cloudflare Worker: Check CDN and Worker cache, assemble context, translate via GPT, and cache the result.
  • Key-Value Database: Store dialogues as JSON, with id as the key, and fields for text, parent_id, and translations.

Additional Notes

This approach combines context-aware translation with multi-level caching, allowing us to provide accurate NPC dialogue translations with minimal ongoing costs. Edge caching at Cloudflare CDN will further optimize system efficiency over time, creating a sustainable and scalable solution.


Tasks

  1. Implement dialogue capture and transmission in the Mod layer.
  2. Set up Cloudflare Worker for handling translation requests, context building, and caching.
  3. Configure the key-value database for storing translations in JSON format.
  4. Enable CDN caching for Cloudflare Worker responses to maximize efficiency.
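For task 4, the essential piece is that the Worker marks its JSON responses as publicly cacheable so the edge can serve repeats without invoking the Worker at all. A minimal sketch, where the TTL value is an assumption rather than a decided policy:

```typescript
// Headers for a cacheable Worker response. s-maxage governs shared caches
// (the CDN edge); max-age governs the client.
function cacheHeaders(ttlSeconds: number): Record<string, string> {
  return {
    "Content-Type": "application/json; charset=utf-8",
    "Cache-Control": `public, max-age=${ttlSeconds}, s-maxage=${ttlSeconds}`,
  };
}
```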

@magicus @kristofbolyai i guess we can talk here

commented

[screenshot omitted]

i guess it will help us to build context, but i can't find it in code.

commented

Even though I have a feeling the issue writeup is ChatGPT generated, let's ignore that, I understand the use case.

My problem here is that this took a 90 degree turn from the original "proposal" by magicus. We want to avoid on-the-fly translation by Cloudflare. It creates many problems.

  1. People can pass garbage data and use up our credit (and/or prompt inject)
  2. The translations become un-reviewable as they happen on the fly
  3. The translation database becomes tightly tied to Cloudflare, without us having a direct way to modify it (e.g. in the original idea, translations are stored on GitHub and only cached by some service from there)
  4. Even though you "build context", the approach will likely result in the first x dialogues always having slightly worse translations than the last ones, as the AI hasn't had enough context. Usually you can pass a whole dialogue chain and it'll happily fit in the context window, improving the translations.

I feel that my expression is not as good as GPT's, so I relayed it to GPT and let it generate an explanation.

I have an idea: we should gradually collect the data that needs translation and use Google Translate to address immediate user needs. On the backend, we can continuously check for new submissions and implement filters to mitigate spam and malicious content. Once we've collected enough data, we can perform a unified, thorough translation. When users request translations, they will initially receive results from Google Translate unless we have better, reviewed translations available. This approach ensures accuracy and security while maintaining high-quality translations.
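The fallback rule described here boils down to a simple preference order: serve a reviewed translation when one exists, otherwise fall back to the machine result. A sketch, with the record shape and names (`TranslationRecord`, `pickTranslation`) being illustrative assumptions:

```typescript
interface TranslationRecord {
  machine?: string;   // quick machine-translation result (e.g. Google Translate)
  reviewed?: string;  // reviewed translation, preferred whenever present
}

// Prefer the reviewed translation, then the machine one, then nothing.
function pickTranslation(rec: TranslationRecord | undefined): string | null {
  if (!rec) return null;
  return rec.reviewed ?? rec.machine ?? null;
}
```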

commented

And we have arrived at why we don't do this. We basically have the code to crowd-source dialogue lines, but have no backend to hold onto it, filter it, sanitize bad requests, etc. It's actually quite a lot of work if you think about it. The whole Google API translation part can be dropped; we really should just care about AI/ML translations here (as clients can do Google API translations already).

commented

@kristofbolyai I know we have discussed this before, but is there some way we can use Crowdin for this? At least they have some kind of system for reviewing and moderating changes; maybe we should study them a bit more.

I mostly worry about the correctness/manual verification part of this. Maybe it's a sign that I'm old, cynical and hardened, that I believe any system which allows unknown and anonymous persons to modify texts that are pushed to other users can and will be mis-used. In any case, without a believable story on how to handle that, this effort is dead in the water.

The rest is just (or "just") technical problems.

commented

You can use Crowdin, or review on Github, it'll probably work well enough. We've had some sync issues with Crowdin if we skipped their UI and pushed to GH, but I've not seen their dashboard, and maybe it was just a misconfiguration.