Staredit Network > Forums > SC1 Mapping Tools > Topic: Extended string table format discussion
Extended string table format discussion
May 15 2019, 8:19 pm
By: Suicidal Insanity  

May 15 2019, 8:19 pm Suicidal Insanity Post #1

I see you !

This is a suggestion for an extended string table format to get around the 65k limit. If we can agree on a common suggestion, Blizzard may implement it in the future.

Goals are simplicity of changes and backwards compatibility.

To that end, I propose the following format:

- Table chunk name: STRx, instead of STR. If the STRx chunk is present it has priority over the STR chunk.
- Requires the 1.22 map format IDs.
- All strings are encoded using UTF-8 (For consistency with SC:R).
- String offsets changed to uint32 from uint16.
- String count changed to uint32 (necessitates 32bit string IDs within SC - this I am not sure about. If 32bit string IDs are not possible, we limit the valid IDs to 16bit)
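To make the bullet points above concrete, here is a minimal reader sketch in Python. It assumes little-endian fields, offsets measured from the start of the section payload (as in the existing STR section), and NUL-terminated strings. The function name read_strx and the sample bytes are hypothetical, not part of any existing tool.

```python
import struct

def read_strx(payload):
    """Decode a hypothetical STRx payload (chunk name and size already stripped)."""
    # Assumption: u32 string count, followed by one u32 offset per string.
    (count,) = struct.unpack_from("<I", payload, 0)
    offsets = struct.unpack_from(f"<{count}I", payload, 4)
    strings = []
    for off in offsets:
        end = payload.index(b"\x00", off)  # each string is NUL terminated
        strings.append(payload[off:end].decode("utf-8"))
    return strings

# Hand-built sample: 2 entries, so the header is 4 + 2 * 4 = 12 bytes.
# "Marine" plus its NUL takes 7 bytes, so the second string starts at 19.
sample = (
    struct.pack("<3I", 2, 12, 19)
    + b"Marine\x00"
    + "Zergling \u2605".encode("utf-8")
    + b"\x00"
)
```

Calling read_strx(sample) returns the two strings in table order, including the non-ASCII character that UTF-8 makes possible.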

Other side effects of 32bit string IDs: Strings that are referenced within sections which store 16bit string indices must be stored in the first 65k strings. (Pointed out by jjf28). We can deal with this by reserving a certain string range for these strings in case we are in danger of overflowing into 32bit indices. (E.g. the last 2K indices are skipped when adding strings unless it is a unit name string). String recycling also needs to keep this in mind.
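A rough sketch of how an editor might hand out string IDs under this reservation scheme. The names allocate_id, RESERVED_FIRST and RESERVED_LAST are made up for illustration, the 2K block is taken as the top of the 16bit range, and IDs are assumed to start at 1. Picking the lowest free ID is one simple way to get recycling for free.

```python
RESERVED_COUNT = 2048                    # "the last 2K indices"
RESERVED_LAST = 0xFFFF                   # highest ID a 16bit field can hold
RESERVED_FIRST = RESERVED_LAST - RESERVED_COUNT + 1

def allocate_id(used, needs_16bit):
    """Return the lowest free string ID and mark it used.

    used: set of IDs already taken (mutated in place).
    needs_16bit: True for strings referenced from 16bit sections.
    """
    candidate = 1                        # assumption: ID 0 is not handed out
    while True:
        if needs_16bit and candidate > RESERVED_LAST:
            raise ValueError("no 16bit string ID left")
        in_reserved = RESERVED_FIRST <= candidate <= RESERVED_LAST
        # Ordinary strings skip the reserved block; 16bit strings may use it.
        if candidate not in used and (needs_16bit or not in_reserved):
            used.add(candidate)
            return candidate
        candidate += 1
```

With every ID below the reserved block taken, an ordinary string jumps past 65535 while a unit name string still lands inside the reserved block.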

EUD triggers that are added via EUD compilers should be added to the end of the STRx section instead of the STR section - the rest should function as normal.

Before support is added to SC, editors can write both STR and STRx, keeping editor-only strings in STRx and leaving the corresponding indices empty in STR to save space.
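As an illustration of the dual-write idea, the sketch below builds both tables from one string list and blanks the editor-only entries in the legacy table. build_table and build_both are hypothetical helpers. For simplicity both tables are encoded as UTF-8 here, which is an assumption and not necessarily how a given editor writes the legacy STR section.

```python
import struct

def build_table(strings, fmt):
    """fmt is "H" for the legacy u16 layout or "I" for the proposed u32 layout."""
    count = len(strings)
    width = struct.calcsize("<" + fmt)
    header_size = width * (1 + count)    # count field plus one offset per string
    data = bytearray(b"\x00")            # shared NUL that empty entries point to
    offsets = []
    for s in strings:
        if not s:
            offsets.append(header_size)
        else:
            offsets.append(header_size + len(data))
            data += s.encode("utf-8") + b"\x00"
    # struct.pack raises struct.error if a u16 table overflows 65535.
    return struct.pack(f"<{1 + count}{fmt}", count, *offsets) + bytes(data)

def build_both(strings, editor_only):
    """editor_only: set of list positions whose text should only live in STRx."""
    legacy = ["" if i in editor_only else s for i, s in enumerate(strings)]
    return build_table(legacy, "H"), build_table(strings, "I")
```

The legacy table keeps the same string count, so indices line up between the two sections; only the text of the editor-only entries is dropped.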

I want to keep this simpler than the KSTR section to minimize the changes required when updating existing tools, and especially the game, in the hope of getting this adopted sooner. If string parameters are required, they can be stored in a second section, or between a string's terminating NUL byte and the start of the next string (although the latter may lead to encoding/decoding errors).

Thoughts?

Post has been edited 1 time(s), last time on May 22 2019, 9:54 am by Suicidal Insanity.




May 15 2019, 9:47 pm jjf28 Post #2

Cartography Artisan

1.) This should have a data spec:
"STR " (current official STR section)


"STRx" (32bit stringIds, 32bit stringOffsets)


"STRx" (16bit stringIds, 32bit stringOffsets)


2.) For quick reference, here's every CHK section and their usage of strings...
CHK STR Usage


Of particular interest, while many of these would easily support 32-bit stringIds, MRGN, SPRP, FORC, UNIS, and UNIx are all 16-bit stringIds.


3.) The main design gap I feel we need to address is whether we use 16-bit or 32-bit stringIds...

Quote
We can deal with this by reserving a certain string range for these strings in case we are in danger of overflowing into 32bit ids. (E.g. the last 2K ids are skipped when adding strings unless it is a unit name string). String recycling also needs to keep this in mind
MRGN: Up to 255 strings
SPRP: Up to 2 strings
FORC: Up to 4 strings
UNIS: Up to 228 strings
UNIx: Up to 228 strings
Total: 717 strings

I would prefer we make 32-bit stringIds usable, and I like your suggestion, but I want to make it very specific and say stringIds 63488-65535 (inclusive, 2048 total) are always reserved for strings from the 16-bit stringId sections, and disallow any string that can use a 32-bit id from being used here.
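A small validation sketch for the rule above, assuming a map's string references are available as (section name, stringId) pairs. check_references is a hypothetical helper, and TRIG is used in the usage note only as an example of a section assumed to accept 32-bit ids.

```python
# Worst-case string counts for the 16-bit sections, as listed above.
SIXTEEN_BIT_SECTIONS = {"MRGN": 255, "SPRP": 2, "FORC": 4, "UNIS": 228, "UNIx": 228}
RESERVED = range(63488, 65536)           # 63488-65535 inclusive

def check_references(refs):
    """refs: iterable of (section_name, string_id). Returns a list of problems."""
    problems = []
    for section, string_id in refs:
        if section in SIXTEEN_BIT_SECTIONS:
            # These sections store the id in a 16-bit field.
            if string_id > 0xFFFF:
                problems.append(f"{section}: id {string_id} does not fit in 16 bits")
        elif string_id in RESERVED:
            # Everything else must stay out of the reserved block.
            problems.append(f"{section}: id {string_id} is inside the reserved range")
    return problems
```

For example, check_references([("TRIG", 64000), ("UNIx", 70000)]) reports both entries, while a MRGN string at 63500 and a TRIG string at 70000 pass. The 717 worst-case strings fit comfortably in the 2048 reserved ids.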

4.) The KSTR Section was designed on the assumption that no changes to StarCraft were being made and that we just wanted to enhance developers' ability to document things in the map, not change the amount of loadable strings in the game; if something like this STRx section gets built into StarCraft I would deprecate the KSTR section and put out a one-time migration tool (maybe separate from Chkdraft).

5.)
Quote
- String offsets changed to uint32 from uint16.
For clarity you should write this "- String offsets changed from uint16 to uint32."

Post has been edited 2 time(s), last time on May 15 2019, 10:13 pm by jjf28.




May 15 2019, 10:25 pm Suicidal Insanity Post #3

I see you !

Thanks for the cleaner text.

Quote from jjf28
3.) The main design gap I feel we need to address is whether we use 16-bit or 32-bit stringIds...
I agree. If Blizzard uses 16bit IDs internally in its function calls and doesn't want to risk changing them, we will be limited to 16bit. I would prefer to go with 32bit, just for future-proofing.

Quote from jjf28
Quote
We can deal with this by reserving a certain string range for these strings in case we are in danger of overflowing into 32bit ids. (E.g. the last 2K ids are skipped when adding strings unless it is a unit name string). String recycling also needs to keep this in mind
MRGN: Up to 255 strings
SPRP: Up to 2 strings
FORC: Up to 4 strings
UNIS: Up to 228 strings
UNIx: Up to 228 strings
Total: 717 strings

I would prefer we make 32-bit stringIds usable, and I like your suggestion, but I want to make it very specific and say stringIds 63488-65535 (inclusive, 2048 total) are always reserved for strings from the 16-bit stringId sections, and disallow any string that can use a 32-bit id from being used here.
I meant it as a hard reservation, but I feel we can limit it to 1024 strings and don't need the full 2k. That's an implementation detail on our editor side though.

Quote from jjf28
4.) The KSTR Section was designed on the assumption that no changes to StarCraft were being made and that we just wanted to enhance developers' ability to document things in the map, not change the amount of loadable strings in the game; if something like this STRx section gets built into StarCraft I would deprecate the KSTR section and put out a one-time migration tool (maybe separate from Chkdraft).

What I was trying to get at is that we could store string metadata in a table before or after the string data, and Blizzard wouldn't need to know about the format. That is actually what KSTR does, if the offsets are also relative to the start of the chunk.

E.g.:

Code
u32 numEntries; // Number of strings in the section (Default: 1024)
u32[numEntries] stringOffsets; // 1 integer for each string specifying the offset (the spot where the string starts in the section from the start of it).
u8[arbitrary data] // Store metadata here
void * stringData; // All strings in the map, one after another, each NUL terminated, by default starts with one NUL character which all unused stringOffsets point to


OR:

Code
u32 numEntries; // Number of strings in the section (Default: 1024)
u32[numEntries] stringOffsets; // 1 integer for each string specifying the offset (the spot where the string starts in the section from the start of it).
void * stringData; // All strings in the map, one after another, each NUL terminated, by default starts with one NUL character which all unused stringOffsets point to
u8[arbitrary data] // Store metadata here


OR:

Code
u32 numEntries; // Number of strings in the section (Default: 1024)
u32[numEntries] stringOffsets; // 1 integer for each string specifying the offset (the spot where the string starts in the section from the start of it).
void *stringData[0]
u8[arbitrary data] // Store metadata here
void *stringData[1]
u8[arbitrary data] // Store metadata here
etc
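To check the claim that the game would not need to know about the metadata, here is a sketch of a reader that only follows the offset table, plus a tiny builder used to lay out the three variants above. build and string_at are hypothetical names, the metadata bytes are arbitrary, and fields are assumed little-endian with offsets relative to the start of the section.

```python
import struct

def string_at(payload, index):
    """Fetch one string by following its offset; any other bytes are ignored."""
    (count,) = struct.unpack_from("<I", payload, 0)
    if not 0 <= index < count:
        raise IndexError(index)
    (off,) = struct.unpack_from("<I", payload, 4 + 4 * index)
    return payload[off:payload.index(b"\x00", off)].decode("utf-8")

def build(blobs):
    """blobs: list of (bytes, is_string) laid out in order after the offset table."""
    count = sum(1 for _, is_string in blobs if is_string)
    pos = 4 + 4 * count                  # first byte after the offset table
    offsets = []
    body = b""
    for blob, is_string in blobs:
        if is_string:
            offsets.append(pos)
            blob += b"\x00"              # NUL terminate string entries only
        body += blob
        pos += len(blob)
    return struct.pack(f"<{1 + count}I", count, *offsets) + body

meta = b"\xde\xad\xbe\xef"               # stand-in for editor metadata
before = build([(meta, False), (b"A", True), (b"B", True)])
after = build([(b"A", True), (b"B", True), (meta, False)])
between = build([(b"A", True), (meta, False), (b"B", True), (meta, False)])
```

All three payloads return "A" and "B" from string_at, because the reader never scans the data area sequentially; it only jumps to the stored offsets.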





May 22 2019, 8:55 am Suicidal Insanity Post #4

I see you !

I think we should go with 32bit string count, but limit it to 16bit indices until we are sure that larger indices work - that way we have locked in the binary format but can change the meaning later on.
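In code, this could be as small as a cap checked at load time. The sketch below is only an assumption about how a tool might enforce it; read_count and max_index are hypothetical names, and the count field is assumed little-endian.

```python
import struct

def read_count(payload, max_index=0xFFFF):
    """Read the u32 string count, rejecting tables beyond the agreed cap."""
    (count,) = struct.unpack_from("<I", payload, 0)
    if count > max_index:
        raise ValueError(f"string count {count} exceeds the current cap of {max_index}")
    return count
```

Raising max_index later changes what is accepted without touching the binary layout, since the field is already 32 bits wide.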



