Extended string table format discussion
May 15 2019, 8:19 pm
By: Suicidal Insanity  

May 15 2019, 8:19 pm Suicidal Insanity Post #1

I see you !

This is a suggestion for an extended string table format that gets around the 65k limit. Blizzard may implement a suggestion that we agree on in the future.

Goals are simplicity of changes and backwards compatibility.

To that end I propose the following format:

- Table chunk name: STRx, instead of STR. If the STRx chunk is present it has priority over the STR chunk.
- Requires the 1.22 map format IDs.
- All strings are encoded using UTF-8 (For consistency with SC:R).
- String offsets changed to uint32 from uint16.
- String count changed to uint32 (necessitates 32bit string IDs within SC - this I am not sure about. If 32bit string IDs are not possible, we limit the valid IDs to 16bit)
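
As a rough illustration of the list above, here is a minimal Python sketch of how such a table could be packed and read. It assumes a little-endian layout of a u32 string count, one u32 offset per string measured from the start of the section, and NUL-terminated UTF-8 string data. The function names `build_strx` and `read_strx` are made up for this example, the index used here is a plain 0-based position in the offset table, and the chunk header (name and size) is left out.

```python
import struct

def build_strx(strings):
    """Pack strings into a hypothetical STRx payload (chunk header not included)."""
    count = len(strings)
    table_size = 4 + 4 * count                   # u32 count + one u32 offset per string
    data = bytearray()
    offsets = []
    for text in strings:
        offsets.append(table_size + len(data))   # offset from the start of the section
        data += text.encode("utf-8") + b"\x00"   # NUL-terminated UTF-8
    return struct.pack(f"<I{count}I", count, *offsets) + bytes(data)

def read_strx(payload, index):
    """Return the string at a 0-based position in the offset table."""
    (count,) = struct.unpack_from("<I", payload, 0)
    if not 0 <= index < count:
        raise IndexError(index)
    (offset,) = struct.unpack_from("<I", payload, 4 + 4 * index)
    end = payload.index(b"\x00", offset)         # the string runs up to its NUL byte
    return payload[offset:end].decode("utf-8")

payload = build_strx(["Hello", "Zürich", "x" * 70000])
print(read_strx(payload, 1))
```

The third string pushes the section past 65535 bytes, which is exactly the case a uint16 offset cannot address.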

Other side effects of 32bit string IDs: Strings that are referenced within sections which store 16bit string indices must be stored in the first 65k strings. (Pointed out by jjf28). We can deal with this by reserving a certain string range for these strings in case we are in danger of overflowing into 32bit indices. (E.g. the last 2K indices are skipped when adding strings unless it is a unit name string). String recycling also needs to keep this in mind.

EUD triggers that are added via EUD compilers should be added to the end of the STRx section instead of the STR section - the rest should function as normal.

Before support is added to SC, editors can write both STR and STRx, and keep editor-only strings in STRx while leaving the corresponding indices empty in STR to save space.
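
A minimal Python sketch of the STR side of that idea, assuming the legacy layout is a u16 count followed by u16 offsets from the start of the section, and that "empty" means the offset points at a single shared NUL byte. The name `build_legacy_str`, the `editor_only` set of 0-based table positions, and the use of UTF-8 for the legacy text are assumptions for illustration only.

```python
import struct

def build_legacy_str(strings, editor_only):
    """Pack a legacy-style table; editor-only entries are written as empty strings."""
    count = len(strings)
    table_size = 2 + 2 * count             # u16 count + one u16 offset per string
    data = bytearray(b"\x00")              # one shared NUL that acts as the empty string
    offsets = []
    for index, text in enumerate(strings):
        if index in editor_only or not text:
            offsets.append(table_size)     # point at the shared NUL, no extra bytes used
            continue
        offsets.append(table_size + len(data))
        data += text.encode("utf-8") + b"\x00"
    if table_size + len(data) > 0x10000:
        raise ValueError("section does not fit 16-bit offsets")
    return struct.pack(f"<H{count}H", count, *offsets) + bytes(data)

strings = ["Force 1", "editor comment " * 100, "Hi"]
legacy = build_legacy_str(strings, editor_only={1})
print(len(legacy))
```

The long editor comment costs nothing in the legacy section; it would only be stored in full in STRx.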

I want to keep this simpler than the KSTR section to minimize the changes required when updating existing tools and especially the game, in the hope of getting this implemented sooner. If string parameters are required they can be stored in a second section, or between the end-of-string NUL byte and the next string. (Although the latter may lead to encoding/decoding errors)

Thoughts?

Post has been edited 1 time(s), last time on May 22 2019, 9:54 am by Suicidal Insanity.




May 15 2019, 9:47 pm jjf28 Post #2

Cartography Artisan

1.) This should have a data spec:
"STR " (current official STR section)


"STRx" (32bit stringIds, 32bit stringOffsets)


"STRx" (16bit stringIds, 32bit stringOffsets)


2.) For quick reference, here's every CHK section and their usage of strings...
CHK STR Usage


Of particular interest, while many of these would easily support 32-bit stringIds, MRGN, SPRP, FORC, UNIS, and UNIx all use 16-bit stringIds.


3.) The main design gap I feel we need to address is whether we use 16-bit or 32-bit stringIds...

Quote
We can deal with this by reserving a certain string range for these strings in case we are in danger of overflowing into 32bit ids. (E.g. the last 2K ids are skipped when adding strings unless it is a unit name string). String recycling also needs to keep this in mind
MRGN: Up to 255 strings
SPRP: Up to 2 strings
FORC: Up to 4 strings
UNIS: Up to 228 strings
UNIx: Up to 228 strings
Total: 717 strings

I would prefer we make 32-bit stringIds usable, and I like your suggestion, but I want to make it very specific and say stringIds 63488-65535 (inclusive, 2048 total) are always reserved for strings from the 16-bit stringId sections, and disallow any string that can use a 32-bit id from being used here.
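
A small Python sketch of how an editor might hand out IDs under that rule. The reserved range is the one from the post; everything else is an assumption for illustration: the name `next_free_id`, IDs starting at 1, and strings from the 16-bit sections being allowed to take any free ID up to 65535 (low IDs first, then the reserved range).

```python
from itertools import chain

RESERVED_FIRST = 63488   # start of the reserved range proposed in the post
RESERVED_LAST = 65535    # inclusive end of the reserved range

def next_free_id(used, needs_16bit):
    """Return the lowest free string ID that this kind of string may take."""
    if needs_16bit:
        # Strings referenced by 16-bit sections must stay at or below 65535.
        candidates = range(1, RESERVED_LAST + 1)
    else:
        # Everything else skips the reserved range and continues above it.
        candidates = chain(range(1, RESERVED_FIRST), range(RESERVED_LAST + 1, 2**32))
    for string_id in candidates:
        if string_id not in used:
            return string_id
    raise ValueError("no free string ID")

used = set(range(1, RESERVED_FIRST))   # pretend every ID below the reserved range is taken
print(next_free_id(used, needs_16bit=False), next_free_id(used, needs_16bit=True))
```

The per-section maximums listed above add up to 255 + 2 + 4 + 228 + 228 = 717, so a 2048-entry reservation leaves plenty of headroom.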

4.) The KSTR Section was designed on the assumption that no changes to StarCraft were being made and that we just wanted to enhance developers' ability to document things in the map, not change the amount of loadable strings in the game; if something like this STRx section gets built into StarCraft I would deprecate the KSTR section and put out a one-time migration tool (maybe separate from Chkdraft).

5.)
Quote
- String offsets changed to uint32 from uint16.
For clarity you should write this "- String offsets changed from uint16 to uint32."

Post has been edited 2 time(s), last time on May 15 2019, 10:13 pm by jjf28.




May 15 2019, 10:25 pm Suicidal Insanity Post #3

I see you !

Thanks for the cleaner text.

Quote from jjf28
3.) The main design gap I feel we need to address is whether we use 16-bit or 32-bit stringIds...
I agree. If Blizzard uses 16bit IDs internally in the function calls and doesn't want to risk changing them, we will be limited to 16 bit. Preferably I would go with 32bit just for future proofing.

Quote from jjf28
Quote
We can deal with this by reserving a certain string range for these strings in case we are in danger of overflowing into 32bit ids. (E.g. the last 2K ids are skipped when adding strings unless it is a unit name string). String recycling also needs to keep this in mind
MRGN: Up to 255 strings
SPRP: Up to 2 strings
FORC: Up to 4 strings
UNIS: Up to 228 strings
UNIx: Up to 228 strings
Total: 717 strings

I would prefer we make 32-bit stringIds usable, and I like your suggestion, but I want to make it very specific and say stringIds 63488-65535 (inclusive, 2048 total) are always reserved for strings from the 16-bit stringId sections, and disallow any string that can use a 32-bit id from being used here.
I meant it as a hard reservation - but I feel we can limit it to 1024 strings, and don't need the full 2k. That's an implementation detail on our editor side though.

Quote from jjf28
4.) The KSTR Section was designed on the assumption that no changes to StarCraft were being made and that we just wanted to enhance developers' ability to document things in the map, not change the amount of loadable strings in the game; if something like this STRx section gets built into StarCraft I would deprecate the KSTR section and put out a one-time migration tool (maybe separate from Chkdraft).

What I was trying to get at is that we could store string metadata in a table before or after the string data, and Blizzard doesn't need to know about the format. That is actually what KSTR does, if the offsets are also relative to the start of the chunk.

EG:

Code
u32 numEntries; // Number of strings in the section (Default: 1024)
u32[numEntries] stringOffsets; // 1 integer for each string specifying the offset (the spot where the string starts in the section from the start of it).
u8[arbitrary data] // Store metadata here
void * stringData; // All strings in the map, one after another, each NUL terminated, by default starts with one NUL character which all unused stringOffsets point to


OR:

Code
u32 numEntries; // Number of strings in the section (Default: 1024)
u32[numEntries] stringOffsets; // 1 integer for each string specifying the offset (the spot where the string starts in the section from the start of it).
void * stringData; // All strings in the map, one after another, each NUL terminated, by default starts with one NUL character which all unused stringOffsets point to
u8[arbitrary data] // Store metadata here


OR:

Code
u32 numEntries; // Number of strings in the section (Default: 1024)
u32[numEntries] stringOffsets; // 1 integer for each string specifying the offset (the spot where the string starts in the section from the start of it).
void *stringData[0]
u8[arbitrary data] // Store metadata here
void *stringData[1]
u8[arbitrary data] // Store metadata here
etc
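
To illustrate why the game would not need to know about the metadata, here is a minimal Python sketch of the first layout (metadata between the offset table and the string data). It assumes little-endian u32 fields and offsets measured from the start of the section; the names `build_with_metadata`, `read_string` and `read_metadata` are hypothetical. A reader that only follows offsets never touches the metadata, while an editor can recover it as the bytes between the end of the table and the lowest string offset.

```python
import struct

def build_with_metadata(strings, metadata):
    """Pack count, offsets, an opaque metadata blob, then NUL-terminated strings."""
    count = len(strings)
    data_start = 4 + 4 * count + len(metadata)   # strings begin after the metadata
    data = bytearray()
    offsets = []
    for text in strings:
        offsets.append(data_start + len(data))
        data += text.encode("utf-8") + b"\x00"
    return struct.pack(f"<I{count}I", count, *offsets) + metadata + bytes(data)

def read_string(payload, index):
    """What a metadata-unaware reader does: follow the offset, stop at NUL."""
    (offset,) = struct.unpack_from("<I", payload, 4 + 4 * index)
    end = payload.index(b"\x00", offset)
    return payload[offset:end].decode("utf-8")

def read_metadata(payload):
    """What an editor could do: take the gap between the table and the first string."""
    (count,) = struct.unpack_from("<I", payload, 0)
    offsets = struct.unpack_from(f"<{count}I", payload, 4)
    table_end = 4 + 4 * count
    return payload[table_end:min(offsets, default=len(payload))]

section = build_with_metadata(["a", "", "bc"], b"META\x00\x01")
print(read_string(section, 2), read_metadata(section))
```

NUL bytes inside the metadata are harmless here because string reads always start at an offset that lies past the metadata.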





May 22 2019, 8:55 am Suicidal Insanity Post #4

I see you !

I think we should go with 32bit string count, but limit it to 16bit indices until we are sure that larger indices work - that way we have locked in the binary format but can change the meaning later on.
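
One way to picture that, as a Python sketch under assumptions (little-endian u32 count at the start of the section; `read_count` and `MAX_STRINGS` are names made up here): the field is already 32 bits wide on disk, and only the accepted range is restricted, so the limit could later be raised without touching the binary format.

```python
import struct

MAX_STRINGS = 0xFFFF   # current limit: IDs must still fit in 16 bits

def read_count(payload, max_strings=MAX_STRINGS):
    """Read the u32 string count and reject values above the currently allowed limit."""
    (count,) = struct.unpack_from("<I", payload, 0)
    if count > max_strings:
        raise ValueError(f"string count {count} exceeds limit {max_strings}")
    return count

print(read_count(struct.pack("<I", 1024)))
```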




Jul 1 2019, 10:23 am T-warp Post #5



We could pass 65k limit by extending offsets to u32 and leaving indexes u16. Having "only" 65k strings is not the issue (who would use more than 65k strings anyway). Addressing them is the issue.

"STRx" (16bit stringIds, 32bit stringOffsets)


This way it wouldn't require any changes outside the STR(x) section as the string IDs would remain u16, which I think is more reasonable.




Jul 1 2019, 10:38 am Suicidal Insanity Post #6

I see you !

So my second option - that is the one I think will be most likely to be acceptable to blizzard in terms of changes required.




Jul 1 2019, 11:19 am T-warp Post #7



Quote from Suicidal Insanity
So my second option - that is the one I think will be most likely to be acceptable to blizzard in terms of changes required.
You still need to keep the original STR section in memory for some EUD maps to work. If you want metadata, you should use an unreferenced string at the end of the STR section (which would most likely be removed by map compression/protection tools). Although if StarCraft does use the STR section as the source for all outputs, there will be a tool for real-time translation of that section by modifying offsets, and it would have more reason to use that unreferenced space than the editor. Make yourself another STR section for your metadata (it should be ignored by StarCraft).




Jul 1 2019, 1:05 pm Suicidal Insanity Post #8

I see you !

I don't think staredit still needs to load the STR section. Either the external EUD compiler does not know about STRx, in which case it wouldn't work with maps that store data in STRx anyway, or it does know about STRx and uses that to carry its hidden triggers.

So - STRx is an optional section in SC:R format maps, but if it is present it overwrites the STR section in game memory.
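
In loader terms that rule is a simple preference order. A tiny Python sketch, assuming the map's chunks have already been split into a dict keyed by their 4-byte names (the dict shape and the name `pick_string_section` are assumptions for illustration):

```python
def pick_string_section(sections):
    """Return (name, payload) of the string table to load, preferring STRx over STR."""
    for name in (b"STRx", b"STR "):     # the legacy chunk name is padded with a space
        if name in sections:
            return name, sections[name]
    raise KeyError("map has no string section")

name, payload = pick_string_section({b"STR ": b"old", b"STRx": b"new"})
print(name)
```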




Jul 2 2019, 5:47 pm T-warp Post #9



If it's in your power to propose new features, propose integer arithmetic. Like sets of 2 addresses in memory accessible through EUD for each operation. Writing to those addresses would perform those operations. That could speed up useful things in maps of all kinds (workarounds exist but are way too complicated).




Jul 2 2019, 6:52 pm Suicidal Insanity Post #10

I see you !

That's third on my list of suggestions - but obviously no promises. Usefulness-wise it should be second, but it's got the lowest chance of implementation.

Post has been edited 1 time(s), last time on Jul 2 2019, 7:29 pm by Suicidal Insanity.




Jul 2 2019, 7:32 pm T-warp Post #11



Quote from Suicidal Insanity
That's third on my list of suggestions - but obviously no promises. Usefulness-wise it should be second, but it's got the lowest chance of implementation.
Lowest chances and yet the easiest to implement. Just out of curiosity, what's the first?



