Since we're talking about string recycling, I want to remind everyone about an optimization opportunity that has probably been raised several times over the years.
Some maps have text-based inventory/menu systems that display items in a list like this (preview map here):
== Menu ==
> Item 1
Item 2
Item 3
This is done by issuing a separate display text message for each line. Two strings are stored for each item: "> Item 1" and "Item 1", where the first one is used to display the current selection. The second string is always a suffix of the first one, so there is no need to store any separate data for it.
The simplified STR section for such a map could look something like:
[6][7][9][16][18][25][27]> Item 1[0]> Item 2[0]> Item 3[0]
Numbers are given in brackets and characters are given without brackets; each number and each character is a 16-bit unsigned integer. The first number N (N = 6 in the example) is the total number of stored strings. It is followed by N numbers that give each string's zero-based offset from the start of the section, counted in double bytes (words, as programmers call them). We see that strings 1 and 2, 3 and 4, and 5 and 6 reuse common data chunks.
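To make the layout concrete, here is a minimal Python sketch that builds the simplified section from the example and reads the strings back. It only models the simplified layout described in this post (not the real STR section), and the helper names build_section and read_string are made up for illustration.

```python
def build_section(offsets, data):
    # Simplified layout from the post: [N][offset 1]...[offset N][characters...]
    # Every entry is one 16-bit word; a 0 word terminates each string.
    return [len(offsets)] + list(offsets) + [ord(c) for c in data]

def read_string(words, index):
    # index is 1-based; words[0] holds N and words[1..N] hold the offsets.
    pos = words[index]
    chars = []
    while words[pos] != 0:
        chars.append(chr(words[pos]))
        pos += 1
    return "".join(chars)

section = build_section([7, 9, 16, 18, 25, 27],
                        "> Item 1\0> Item 2\0> Item 3\0")
strings = [read_string(section, i) for i in range(1, section[0] + 1)]
for s in strings:
    print(repr(s))
```

Strings 2, 4 and 6 come out as "Item 1", "Item 2" and "Item 3" even though no separate data is stored for them: their offsets simply point two words into the data of strings 1, 3 and 5.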
I really believe such menu systems take up a large portion of the available string memory. Getting this optimization to work would almost halve that requirement. I assure you, this is enough to make at least RPG makers happy: they could fit twice as much content into their inventories!
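A quick back-of-the-envelope check of the "almost halve" claim, as a sketch: it counts only the character data plus one terminator word per stored chunk, ignores the offset table, and assumes the "> " prefix from the example. The function name words_needed is made up.

```python
def words_needed(item, prefix="> ", shared=False):
    # Selected line: prefix + item text + terminator word.
    selected = len(prefix) + len(item) + 1
    # Unselected line: needs its own data only when suffixes are not shared.
    plain = 0 if shared else len(item) + 1
    return selected + plain

separate = words_needed("Item 1")             # two independent chunks
shared = words_needed("Item 1", shared=True)  # one chunk, suffix reused
saving = 1 - shared / separate
print(separate, shared, saving)
```

The longer the item text is compared to the prefix, the closer the saving gets to one half.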
There's more. For instance, a creative mind can display the quantity of a specific inventory item as the number of '>' characters placed before the item text, which is tab-padded. The current selection is color-coded, with the color code as the first character:
== Menu ==
>>> Item 1
>> Item 2
>>>> Item 3
There are major savings here, since all that is needed is to store a single string data chunk per item: "[color code]>>>> Item 1". Using the grey color is another opportunity (unfortunately it is broken as of the most recent patch, but I hope this is going to be fixed).
The following discusses the simplest implementation:
Nothing fancy is required to make this work. From my perspective, 3 updates are needed:
- When a new string is added for an entity, the recycling mechanism should check not only whether it equals an existing string, but also whether it is a suffix of an existing string.
- When a string reference is deleted: if the deleted string pointed to the start of a data chunk and no other string points there, truncate the data chunk from the beginning to the longest suffix still in use, or delete the data chunk if that was its only usage. Otherwise, if the deleted string was just a suffix of a reused string data chunk, do nothing except reduce the chunk's usage count.
- When a string is edited in the string editor and the result becomes so short that one of its suffix substrings would point into the string data chunk that comes next. This one is something we can't ignore. Probably the sanest solution is to unrecycle those suffix substrings that "got out of the data chunk", keeping their previous contents.
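The first of these updates can be sketched in a few lines of Python. This is only an illustration under assumptions: chunks are modelled as plain Python strings in a list, a reference is a (chunk index, start offset) pair, and add_string is a made-up name, not anything from an existing editor.

```python
def add_string(chunks, text):
    # Reuse an existing chunk when text equals it or is a suffix of it.
    for i, chunk in enumerate(chunks):
        if chunk.endswith(text):
            return i, len(chunk) - len(text)
    # Otherwise store the text as a new chunk and point to its start.
    chunks.append(text)
    return len(chunks) - 1, 0

chunks = []
refs = [add_string(chunks, s)
        for s in ["> Item 1", "Item 1", "> Item 2", "Item 2", "m 1"]]
print(chunks)
print(refs)
```

Only two chunks end up stored for the five strings. Note that this sketch depends on insertion order: adding "Item 1" before "> Item 1" would create two chunks, so a fuller version would also need to handle a new string that has an existing chunk as its suffix.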
The following is an example of a common workflow:
Let's consider my example above. Say I decide to change some default unit name to "m 1". The new string is there, but no new data chunk is added: the existing suffix is reused for the unit's name. The STR section will look as follows:
[7][8][10][17][19][26][28][13]> Item 1[0]> Item 2[0]> Item 3[0]
Then I change the first string chunk in the string editor to "> Itexm 1". This doesn't result in any unrecycling. The text in my triggers changes to "> Itexm 1" and "Itexm 1" respectively; the unit's name string still points to the 6th character of the string chunk and now reads "xm 1". STR after this operation:
[7][8][10][18][20][27][29][13]> Itexm 1[0]> Item 2[0]> Item 3[0]
Let's say I do something weird to the first string chunk in the string editor, like changing it to something totally different: "hello world". As long as the resulting string is at least 6 characters long, we don't care. The text in my triggers now uses the strings "hello world" and "llo world", and the unit's name is " world" (with a leading space). STR after this operation:
[7][8][10][20][22][29][31][13]hello world[0]> Item 2[0]> Item 3[0]
The only case we do care about is a change in the string editor that makes the first string chunk shorter than 6 characters. Let's say we change it to just "hell". It's clear that the unit's name string can't point to the 6th character of the string now, because the string is just 4 characters long. We then simply unrecycle the unit name string, with its previous contents, into a new string chunk:
[7][8][10][13][15][22][24][31]hell[0]> Item 2[0]> Item 3[0] world[0]
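The string editor rule can be sketched like this in Python. It is an illustration only: offsets here are zero-based positions inside the chunk (so the "6th character" is offset 5), and edit_chunk is a made-up name.

```python
def edit_chunk(old_text, new_text, starts):
    # starts: offsets (inside the chunk) of every string pointing into it.
    kept, unrecycled = {}, {}
    for start in starts:
        if start < len(new_text):
            # Still inside the chunk: the string silently follows the edit.
            kept[start] = new_text[start:]
        else:
            # "Got out of the data chunk": move the previous contents
            # into a new chunk of their own.
            unrecycled[start] = old_text[start:]
    return kept, unrecycled

harmless = edit_chunk("> Item 1", "> Itexm 1", [0, 2, 5])
shortened = edit_chunk("hello world", "hell", [0, 2, 5])
print(harmless)
print(shortened)
```

The first call unrecycles nothing and the unit name becomes "xm 1"; the second keeps "hell" and "ll" in place and moves " world" out into a new chunk, matching the steps above.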
After the last operation we have 2 strings in our trigger text: "hell" and "ll". If I change the first one to "enough" in the trigger text editor, it simply gets unrecycled into a new string data chunk. Since no string points to the start of the first string chunk any more, that chunk is truncated to the longest suffix still in use, which is "ll":
[7][36][8][11][13][20][22][29]ll[0]> Item 2[0]> Item 3[0] world[0]enough[0]
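The deletion rule (truncate to the longest suffix still in use) can be sketched the same way. Again this is just an illustration with a made-up function name, release; the list of start offsets doubles as the usage count here.

```python
def release(text, starts, removed):
    # Drop one reference (identified by its start offset) from a chunk.
    remaining = list(starts)
    remaining.remove(removed)
    if not remaining:
        return None, []          # nobody uses the chunk: delete it
    cut = min(remaining)         # start of the longest suffix still in use
    return text[cut:], [s - cut for s in remaining]

truncated = release("hell", [0, 2], 0)   # the "hell" reference goes away
untouched = release("hell", [0, 2], 2)   # only the "ll" suffix goes away
deleted = release("ll", [0], 0)          # the last reference goes away
print(truncated, untouched, deleted)
```

After a truncation, every offset that points behind the removed characters has to be shifted down by the same amount when the section is written out.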
EDIT: The proposed solution isn't user-friendly, but SI-friendly instead: it requires the fewest changes for this thing to work. For the user it usually means: don't mess with suffix-reused strings in the string editor. Adding a single color code or character near the start of such a string chunk in the string editor will shift the data of all the substrings that follow.
We could try to be smarter and detect the text insertion point or the deleted text to fix up the substring pointers accordingly, but sometimes it's hard to tell whether several text blocks were inserted and several deleted, or the string was completely rewritten. Accounting for such logic isn't an easy thing to do at all.
Post has been edited 5 time(s), last time on Nov 4 2017, 6:06 pm by Wormer.