RatingBuster Classic

Autogenerate StatLogic locales

raethkcj opened this issue · 1 comment

commented

Motivation

StatLogic's locales (note: not RatingBuster's locales) are a major pain point in keeping stat summaries accurate, even in English. They're not typical locale files: rather than plain enUS -> xxYY translations, they're a reverse mapping from the strings Blizzard uses to the stats defined in the addon. For this reason, it's very difficult for localizers to know what to translate. They can't just translate the English strings; they have to know (or guess) which items have the relevant stat string, as well as which garbled form will actually be passed to StatIDLookup after it's been broken down by DeepScan. /sldebug helps somewhat, but it still requires localizers to have technical knowledge of both the game and the internals of StatLogic. That's not a sustainable expectation.

Goal

I'd like to remove most of StatLogic's "scanners" (SinglePlus, SingleEquip, PreScan (non-exclude part), DeepScan and all its children), and leave only WholeText and Substitution. PrefixExclude, ColorExclude, and the exclude part of PreScan might be removed eventually, but at the very least they are helpful for debugging Substitution while it's still a work-in-progress.

Removing these scanners would allow all of their associated tables in the locale files to be deleted as well: PrefixExclude, PreScanPatterns, DeepScanSeparators, DeepScanWordSeparators, DualStatPatterns, and DeepScanPatterns, as well as the miscellaneous translations of tonumber, SinglePlusStatCheck, SingleEquipStatCheck, etc.

Then, I'd like to use a combination of GlobalStrings and lists of strings generated directly from CSV dumps of DB2s to populate the remaining tables, WholeTextLookup and StatIDLookup. There would be no exclusions; anything not matching WholeText or Substitution could immediately be discarded. The handful of exclusions remaining in GlobalPatterns could also be removed. There would be no more need for hand-translating these files, and new stats could instead be added as inputs to the (as yet unwritten) locale-independent script that generates the strings for all locales.
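To make the "no exclusions" flow concrete, here is a minimal sketch in Python (the addon itself is Lua; Python is used only for illustration). The table contents, the stat identifiers such as `STA`, and the function name `parse_line` are all hypothetical and not taken from StatLogic. It only shows the intended order: try WholeText, then Substitution, and otherwise discard the line.

```python
import re

# Hypothetical tables; in the proposal these would be generated, not hand-written.
WHOLE_TEXT_LOOKUP = {
    "example whole-line effect": {"EXAMPLE_STAT": 10},
}
STAT_ID_LOOKUP = {
    "+%s stamina": ["STA"],
}

NUMBER = re.compile(r"\d+(?:\.\d+)?")


def parse_line(line):
    """Return {stat: value} for a tooltip line, or None to discard it."""
    text = line.strip().lower()
    # 1. WholeText: exact match on the whole line; values come from the table.
    if text in WHOLE_TEXT_LOOKUP:
        return dict(WHOLE_TEXT_LOOKUP[text])
    # 2. Substitution: replace each number with %s and look up the pattern.
    values = [float(n) for n in NUMBER.findall(text)]
    pattern = NUMBER.sub("%s", text)
    stats = STAT_ID_LOOKUP.get(pattern)
    if stats is not None and len(stats) == len(values):
        return dict(zip(stats, values))
    # 3. No exclusion lists: anything unmatched is simply dropped.
    return None
```

With this shape, a line like "Soulbound" needs no exclusion entry at all; it falls through both lookups and is ignored.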

Implementation

  • Write the Substitution scanner, which strips a tooltip line of several prefixes and suffixes, lowercases it, and replaces numbers with %s, such that the processed string closely matches the format in which stat strings appear in GlobalStrings, spell descriptions, and enchant descriptions.
  • Add as many GlobalStrings as possible to GlobalPatterns, and process them into the same format as the Substitution scanner, so they can be directly looked up in StatIDLookup.
  • For existing entries in enUS StatIDLookup, identify which are now redundant due to GlobalPatterns, and which are real unique stat strings. Delete redundancies, convert remaining to Substitution format, and identify which DB2, ID, and WoW flavor their text & Stat are relevant to.
  • Write a script that takes as input the identified table of DB2s, IDs, WoW flavors, and Stats, parses dumped CSVs to obtain all the variants of each string in each flavor, and writes them to a single file for each locale (including locales we don't yet support!)
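
The first two steps above could look roughly like the following Python sketch (illustration only; the real scanner would live in the addon's Lua code). The prefix and suffix lists, the function names, and the example strings are assumptions, since the issue does not list them. Only `%d`, `%f`, and `%s` style specifiers are handled, and sign handling is left out.

```python
import re

# Assumed affixes; the real scanner's list is not given in this issue.
PREFIXES = ("equip: ", "use: ", "chance on hit: ")
SUFFIXES = (".",)
NUMBER = re.compile(r"\d+(?:\.\d+)?")
# A literal "%%", or a printf-style specifier such as %d, %s, %.2f, %1$d.
FORMAT_SPEC = re.compile(r"%%|%(?:\d+\$)?[-+ 0#]*\d*(?:\.\d+)?[dfs]")


def strip_affixes(text):
    """Remove at most one known prefix and one known suffix."""
    for prefix in PREFIXES:
        if text.startswith(prefix):
            text = text[len(prefix):]
            break
    for suffix in SUFFIXES:
        if text.endswith(suffix):
            text = text[:-len(suffix)]
            break
    return text.strip()


def normalize_tooltip_line(line):
    """Return (pattern, values) for a tooltip line."""
    text = strip_affixes(line.strip().lower())
    values = [float(n) for n in NUMBER.findall(text)]
    return NUMBER.sub("%s", text), values


def _replace_spec(match):
    # "%%" is an escaped percent sign; every other match is a placeholder.
    return "%" if match.group(0) == "%%" else "%s"


def normalize_format_string(fmt):
    """Bring a printf-style format string into the same pattern form."""
    text = FORMAT_SPEC.sub(_replace_spec, fmt)
    return strip_affixes(text.strip().lower())


# Illustrative strings, not actual GlobalStrings values.
pattern, values = normalize_tooltip_line("Equip: Increases your hit rating by 10.")
assert pattern == normalize_format_string("Increases your hit rating by %d.")
```

Because both sides are reduced to the same form, the processed tooltip line can be used directly as a key into StatIDLookup.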

commented

All pre-existing enUS stats have been identified by their Enchant or Spell ID and expansion!

Some minor notes:

  • Strings used on gems are too irregular to just generate them from _SHORT GlobalStrings or similar. Therefore I'll just parse all SpellItemEnchantment entries that are gems. In Wrath this is directly in the table; for TBC I'll additionally need GemProperties.
  • A handful of those gem strings might make entries in the hardcoded list redundant. That won't cause any problems; those hardcoded entries could just be deleted.
  • Patterns unique to Atiesh variants should probably just be moved to WholeTextLookup, since they often contain "30 yards", which would be tricky to parse around. This turned out to be wrong: Atiesh aura ranges use $a1, which is easily parseable!
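
As a rough sketch of the generation step, the Python below turns dumped text into Substitution-style patterns, treating simple description variables such as `$s1` or `$a1` the same way as literal numbers. The column names (`ID`, `Name_lang`), the function names, the stat identifier `STA`, and all sample rows are assumptions for illustration; real descriptions can also contain expressions and conditionals that this sketch ignores.

```python
import csv
import io
import re

# Simple description variables such as $s1 or $a1.
TOKEN = re.compile(r"\$[a-z]\d*", re.IGNORECASE)
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def to_pattern(text):
    """Convert dumped text into a lowercase pattern with %s placeholders."""
    text = TOKEN.sub("%s", text)
    text = NUMBER.sub("%s", text)
    return text.lower().strip().rstrip(".")


def generate_locale(csv_text, wanted, id_column="ID", text_column="Name_lang"):
    """Map pattern -> stats for the rows whose ID appears in `wanted`."""
    lookup = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        stats = wanted.get(int(row[id_column]))
        if stats is not None:
            lookup[to_pattern(row[text_column])] = stats
    return lookup


# Made-up dumps for two locales; the same ID list drives both.
dumps = {
    "enUS": "ID,Name_lang\n1,+$s1 Stamina\n2,Unrelated text\n",
    "deDE": "ID,Name_lang\n1,+$s1 Ausdauer\n2,Anderer Text\n",
}
wanted = {1: ["STA"]}
locales = {name: generate_locale(text, wanted) for name, text in dumps.items()}
```

A range variable and a rendered range collapse to the same pattern here, so `to_pattern("... within $a1 yards.")` and `to_pattern("... within 30 yards.")` give identical keys, which is why the Atiesh strings need no special casing.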