Difference between revisions of "Talk:CreateBoolArray - Utility"

(Created page with "== Bool array fill bug == A coding error in SKSE means that the contents of Bool arrays are effectively random. SKSE and SKSE64 both have the same defect, which arises from ho...")
 
SKSE uses its own data structure, <code>VMResultArray&lt;T&gt;</code> ([https://github.com/NightQuest/SKSE/blob/f22a633d632dc254edfe745b9348088eb8efa962/src/skse/skse/PapyrusArgs.h#L40 SKSE] / [https://github.com/ianpatt/skse64/blob/c024c40d6f04b7ff0c5610f2b3d3f4ca15fbb75b/skse64/PapyrusArgs.h#L40 SKSE64]), where <code>T</code> can be any data type, to make arrays easier to build. This class is then translated into a Skyrim-specific <code>VMArray</code>.


<code>VMResultArray&lt;T&gt;</code> is a customized version of <code>std::vector&lt;T&gt;</code>, a standard C++ data type for resizable lists. To "pack" the contents into a <code>VMArray</code>, SKSE loops over the contents using an [https://docs.microsoft.com/en-us/cpp/standard-library/iterators?view=msvc-170 iterator]: a standard C++ wrapper object (think of a candy wrapper) that offers a unified way to loop over the contents of a variety of different lists, no matter how those lists arrange their contents in memory. To access the list item that an iterator wraps around, you need to "dereference" (unwrap) it by writing <code>*myIterator</code>. For each item in the array, then, SKSE dereferences an iterator to get the list item, takes the memory address (pointer) of the result, and casts that to a pointer of the appropriate type (just for good measure, I guess) before passing that pointer into a function that "packs" a single value into a <code>VMArray</code>. For most array types, this works fine: dereferencing the iterator gives you a value of the appropriate type; taking its address gives you a pointer of the appropriate type; the cast is probably redundant; and the code to "pack" a single value receives exactly what it expects to receive.
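The dereference, address-of, cast sequence described above can be sketched roughly as follows. This is only an illustration and not SKSE's actual code: <code>PackValue</code> and <code>PackAll</code> are hypothetical names, and a plain <code>std::vector&lt;T&gt;</code> stands in for both <code>VMResultArray&lt;T&gt;</code> and <code>VMArray</code>.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the function that "packs" one value into a VMArray.
// It reads the value through the pointer it is given and appends it to dst.
template <typename T>
void PackValue(std::vector<T>& dst, const T* src) {
    dst.push_back(*src);
}

// Hypothetical stand-in for the packing loop: walk the source list with an
// iterator and hand each element to PackValue by pointer.
template <typename T>
std::vector<T> PackAll(std::vector<T>& src) {
    std::vector<T> dst;
    for (auto it = src.begin(); it != src.end(); ++it) {
        // Dereference the iterator, take the address of the result, and cast
        // it to T*. For an ordinary T, *it is already a T&, so &(*it) is
        // already a T* and the cast changes nothing.
        PackValue(dst, (T*)&(*it));
    }
    return dst;
}
```

For an element type such as <code>int</code>, calling <code>PackAll</code> on a vector holding 1, 2, 3 yields a copy with the same contents, which is the "works fine" case.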


Unfortunately, the C++ standard contains a massive footgun. Usually, <code>vector</code>s store their elements sequentially in memory, one after the other, with each element laid out exactly as a standalone value of that type would be. However, [https://en.cppreference.com/w/cpp/container/vector_bool std::vector&lt;bool&gt;] is not stored the same way. It is a "possibly space-efficient specialization," and the way it stores its values is up to whoever implements the standard library (so, Microsoft, in this case). Typically, a bool value is stored as an entire byte, even though only one bit is actually meaningful, because it's just easier for a CPU to work with it that way. A <code>std::vector&lt;bool&gt;</code> may instead pack up to eight bools into each byte, and do a little extra work to actually pull individual bools out and work with them. This means that <code>std::vector&lt;bool&gt;</code> has to use a special iterator. Dereferencing this iterator doesn't give you a reference to a single bool; instead, it constructs a special [https://en.cppreference.com/w/cpp/container/vector_bool/reference "proxy object"] that wraps around the bool. The contents of the proxy object are also up to the implementer, but I would expect it to consist of a pointer to a <code>vector</code> and the index of a bool in that <code>vector</code>.
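The difference is visible at compile time. The sketch below uses only the standard library. The helper names <code>DerefIsPlainReference</code> and <code>ProxyWritesThrough</code> are made up for illustration. It checks that dereferencing a <code>std::vector&lt;int&gt;</code> iterator yields a real <code>int&amp;</code>, while dereferencing a <code>std::vector&lt;bool&gt;</code> iterator yields something other than <code>bool&amp;</code>.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// True when dereferencing std::vector<T>::iterator yields a genuine T&.
template <typename T>
constexpr bool DerefIsPlainReference() {
    return std::is_same<
        decltype(*std::declval<typename std::vector<T>::iterator&>()),
        T&>::value;
}

static_assert(DerefIsPlainReference<int>(),
              "vector<int> iterators dereference to int&");
static_assert(!DerefIsPlainReference<bool>(),
              "vector<bool> iterators dereference to a proxy, not bool&");

// The proxy still behaves like a bool when used normally: assigning to it
// writes through to the vector's packed storage.
bool ProxyWritesThrough() {
    std::vector<bool> flags(3, false);
    auto proxy = flags[1];  // std::vector<bool>::reference, not a bool
    proxy = true;           // updates element 1 inside the vector
    return !flags[0] && flags[1] && !flags[2];
}
```

The proxy is harmless as long as code only reads it as a <code>bool</code> or assigns to it. Trouble starts only when code takes its address and treats that address as a <code>bool*</code>.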


Here's what happens when SKSE tries to return a Bool array. It iterates over the <code>VMResultArray&lt;bool&gt;</code>. It dereferences the iterator and gets what it ''assumes'' is a bool; however, it's actually a proxy object that wraps around a bool. SKSE then takes a pointer to the proxy object, incorrectly casts that to a bool pointer, and passes that to the function which "packs" a single value into a <code>VMArray</code>. That code therefore ''misreads'' the data in the proxy object and treats that data like a bool. It basically ends up stuffing literal nonsense into the Bool array that SKSE wants to return.
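For comparison, here is a minimal sketch of a packing loop that avoids the problem by copying each element into a genuine <code>bool</code> before taking its address. It is not SKSE's code and not a proposed patch: <code>PackBool</code> and <code>PackBools</code> are hypothetical names, and a <code>std::vector&lt;unsigned char&gt;</code> (one byte per value) stands in for the <code>VMArray</code>.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the function that "packs" one bool into a VMArray.
// It expects a pointer to a real bool and stores it as one byte (0 or 1).
void PackBool(std::vector<unsigned char>& dst, const bool* src) {
    dst.push_back(*src ? 1 : 0);
}

// Hypothetical packing loop for Bool arrays.
std::vector<unsigned char> PackBools(const std::vector<bool>& src) {
    std::vector<unsigned char> dst;
    dst.reserve(src.size());
    for (auto it = src.begin(); it != src.end(); ++it) {
        // Whatever *it yields (a proxy object or a plain bool), copying it
        // into a local gives a genuine bool whose address is a valid bool*.
        const bool value = *it;
        PackBool(dst, &value);
    }
    return dst;
}
```

The only change from the pattern described above is the local copy, so the single-value packing function receives a pointer to an actual <code>bool</code> instead of a pointer to a proxy object.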


This bug prevents the ''fill'' parameter in <code>Utility.CreateBoolArray</code> from working properly. However, it also has a much further-reaching effect: SKSE DLLs that use SKSE's codebase cannot create any Papyrus APIs that return <code>Bool[]</code>. Any such APIs will have their return values corrupted en route to the game, resulting in them returning arrays that have the correct length but are filled with nonsensical (usually <code>True</code>) values. [[User:DavidJCobb|DavidJCobb]] ([[User talk:DavidJCobb|talk]]) 20:06, 31 August 2022 (EDT)