(to be determined)
Comparing the definitions with my retribution.h file, it seems that the HEK programs only write the bytes that actually store data. That means the definition type sizes alone do not obviously determine how data is stored. For example, in my retribution.h file I specify byte flags instead of word flags wherever the data is written as byte flags, even though the definition says word flags.
Definitions line 2785 and 2795
word_flags scale flags#these flags determine which fields are scaled by the contrail density
Retribution.h lines 3062-3064
// 64, 1
uint8_t flags; //toff 64 8-bit 0x7F
uint16_t scale_flags; //toff 66 16-bit 0x03FF
The definition calls for two bytes, and two bytes are allocated in the file, yet only the second byte is written (the file is big endian). So either the program or the compiler "allocated" 16 bits of space, but the program wrote only one byte because it used only one byte.
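To make the byte-level picture concrete, here is a minimal sketch in C of how a 16-bit big-endian flags field looks on disk when only its low 8 bits are used. The buffer contents and the helper names read_be16 and high_byte_unused are made up for illustration; they are not taken from retribution.h or from the HEK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit big-endian value: the first byte on disk is the high byte. */
static uint16_t read_be16(const uint8_t *p)
{
    assert(p != NULL);
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/* Return 1 if the 16-bit field at `offset` has a zero high (first) byte,
 * i.e. everything it stores fits in the second byte. */
static int high_byte_unused(const uint8_t *file, size_t offset)
{
    assert(file != NULL);
    return file[offset] == 0x00;
}

/* Hypothetical 4-byte slice of a tag: two 16-bit flag fields back to back. */
static const uint8_t example_bytes[4] = {
    0x00, 0x7F,   /* field A: only the second byte carries flags */
    0x03, 0xFF    /* field B: both bytes carry flags              */
};
```

Read as a word, field A is 0x007F; read as a pad byte followed by a byte of flags, it is 0x00 then 0x7F. Both readings describe the same two bytes, which is why a header can declare byte flags where the definition says word flags.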
What we learn from this initially is that the HEK does not show in Guerilla every value it uses; it might use 32 bits of flags but show only 2 flags in the interface...
Edited by sparky on Apr 19, 2019 at 07:43 AM
The most obvious sign of predetermined allocation is that the block definitions contain their own sizes. Combine that with the fact that they wanted to store everything in memory and were using static, calculated virtual memory addresses, and it follows that this is not the result of a compiler flag specifying memory address alignment. It is allocated space: the struct sizes are based in part on the definition type sizes, and only the used bytes are overwritten.
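One way to picture "allocated space with only used bytes overwritten" is a writer that zero-fills a fixed-size block and then stores just the fields that hold data. The sketch below assumes a hypothetical 8-byte block with made-up offsets; BLOCK_SIZE, write_be16 and build_block are illustrative names, not anything from the HEK.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical size, standing in for the size a block definition declares. */
#define BLOCK_SIZE ((size_t)8)

/* Store a 16-bit value big-endian at `offset` inside the block. */
static void write_be16(uint8_t *block, size_t offset, uint16_t value)
{
    assert(offset + 2 <= BLOCK_SIZE);
    block[offset]     = (uint8_t)(value >> 8);
    block[offset + 1] = (uint8_t)(value & 0xFF);
}

/* Zero the whole block first, then overwrite only the fields that are used. */
static void build_block(uint8_t *block, uint16_t flags, uint16_t scale_flags)
{
    memset(block, 0, BLOCK_SIZE);
    write_be16(block, 0, flags);
    write_be16(block, 2, scale_flags);
    /* Bytes 4..7 are never written and stay zero; in the file they look
     * like padding even though no alignment rule put them there. */
}
```

Under this model the block's size is fixed before any data is written, and a field whose value fits in one byte leaves a zero byte next to it.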
So how did they determine the struct sizes from the definition type sizes, if the compiler wasn't used to allocate space? There is sometimes a lot of padding, which led me to consider that some non-data Guerilla interface text, which is specified in the interface definitions, might have been allocated space. But all the larger padding values are actually defined in the interface definitions, and at first glance at least the explanation and custom types are not assigned any bytes in the file.
Bitmap definition lines 1045-1069
Field Class: bshw
Dialog Data Offset: 0x4da1d0
Type controls bitmap 'geometry'. All dimensions must be a power of two except for SPRITES and INTERFACE BITMAPS:
* 2D TEXTURES: Ordinary, 2D textures will be generated.
* 3D TEXTURES: Volume textures will be generated from each sequence of 2D texture 'slices'.
* CUBE MAPS: Cube maps will be generated from each consecutive set of six 2D textures in each sequence, all faces of a cube map must be square and the same size.
* SPRITES: Sprite texture pages will be generated.
* INTERFACE BITMAPS: Similar to 2D TEXTURES, but without mipmaps and without the power of two restriction.
Retribution.h line 2476
So it's not worth investigating this with the other definition types, as listed in extracthalotagdefs:
datatype_string = 0x00,
datatype_char_integer = 0x01,
datatype_short_integer = 0x02,
datatype_long_integer = 0x03,
datatype_angle = 0x04,
datatype_tag = 0x05,
datatype_enum = 0x06,
datatype_long_flags = 0x07,
datatype_word_flags = 0x08,
datatype_byte_flags = 0x09,
datatype_point_2d = 0x0A,
datatype_rectangle_2d = 0x0B,
datatype_rgb_color = 0x0C,
datatype_argb_color = 0x0D,
datatype_real = 0x0E,
datatype_real_fraction = 0x0F,
datatype_real_point_2d = 0x10,
datatype_real_point_3d = 0x11,
datatype_real_vector_2d = 0x12,
datatype_real_vector_3d = 0x13,
datatype_real_quaternion = 0x14,
datatype_real_euler_angles_2d = 0x15,
datatype_real_euler_angles_3d = 0x16,
datatype_real_plane_2d = 0x17,
datatype_real_plane_3d = 0x18,
datatype_real_rgb_color = 0x19,
datatype_real_argb_color = 0x1A,
datatype_real_hsv_color = 0x1B,
datatype_real_ahsv_color = 0x1C,
datatype_short_integer_bounds = 0x1D,
datatype_angle_bounds = 0x1E,
datatype_real_bounds = 0x1F,
datatype_fraction_bounds = 0x20,
datatype_tag_reference = 0x21,
datatype_block = 0x22,
datatype_short_block_index = 0x23,
datatype_long_block_index = 0x24,
datatype_data = 0x25,
datatype_array_start = 0x26,
datatype_array_end = 0x27,
datatype_pad = 0x28,
datatype_skip = 0x29,
datatype_explanation = 0x2A,
datatype_custom = 0x2B,
datatype_terminator_X = 0x2C
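As a rough illustration of how a block size could follow from definition type sizes, the sketch below sums per-field byte counts using a few of the type codes listed above. The sizes for the integer and flags types follow from their names, real is assumed to be a 4-byte float, and explanation and custom are treated as zero bytes per the observation above. The field_def struct, the explicit pad length and the example field list are made up for this sketch; they are not the HEK's actual tables.

```c
#include <assert.h>
#include <stddef.h>

/* A subset of the type codes from the list above. */
enum {
    datatype_char_integer  = 0x01,
    datatype_short_integer = 0x02,
    datatype_long_integer  = 0x03,
    datatype_long_flags    = 0x07,
    datatype_word_flags    = 0x08,
    datatype_byte_flags    = 0x09,
    datatype_real          = 0x0E,
    datatype_pad           = 0x28,
    datatype_explanation   = 0x2A,
    datatype_custom        = 0x2B
};

/* Hypothetical field record: a type code plus a length used only by pad. */
struct field_def {
    int    type;
    size_t pad_length;
};

/* Bytes one field would occupy in the file, under the assumptions above. */
static size_t field_size(const struct field_def *f)
{
    switch (f->type) {
    case datatype_char_integer:
    case datatype_byte_flags:    return 1;
    case datatype_short_integer:
    case datatype_word_flags:    return 2;
    case datatype_long_integer:
    case datatype_long_flags:
    case datatype_real:          return 4;
    case datatype_pad:           return f->pad_length;
    case datatype_explanation:
    case datatype_custom:        return 0;  /* interface-only, no file bytes */
    default:
        assert(0 && "type not covered by this sketch");
        return 0;
    }
}

/* Sum the field sizes to get the size the block would be allocated with. */
static size_t block_size(const struct field_def *fields, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += field_size(&fields[i]);
    return total;
}

/* Made-up example block: two flag words, a note, 4 pad bytes, one real. */
static const struct field_def example_fields[] = {
    { datatype_word_flags,  0 },
    { datatype_word_flags,  0 },
    { datatype_explanation, 0 },
    { datatype_pad,         4 },
    { datatype_real,        0 }
};
```

For the made-up example the total is 2 + 2 + 0 + 4 + 4 = 12 bytes: the explanation field adds nothing, while the pad adds its declared length whether or not anything is ever written there.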
Conclusion? The HEK wastes disk and memory space by specifying definition sizes larger than it needs. It's a programming flaw, not an artifact of the compilation process.
That's another point in favor of using a database to store its values. But I knew that already, and you probably did too. Yet using a database to retrieve file metadata, in part or in whole, would be slower than reading a single file, because you would have the overhead of the database software instead of just the operating system kernel (or whatever accesses the data on the filesystem for your program to store in memory). And I don't need constant analytics for file metadata. With the sparse files feature on recent filesystems, you don't have to worry so much about padded bytes. Tag duplicates can all be checked by storing a checksum of each tag in the collection. As for overall data size, you can use quick compression types and distribute archives in the read-only squashfs filesystem format.
Edited by sparky on Apr 19, 2019 at 08:20 AM
Edited by sparky on Apr 19, 2019 at 08:25 AM
Edited by sparky on Apr 19, 2019 at 08:28 AM