TAnyValue, an attempt to make the best “variable” data container
And the saga continues
. If you don’t know what the hell I am talking about read the first three articles in the series on this subject:
- TValue and other “Variants” like implementation tests
- TValue and other “Variants” like implementation tests – revised
- TValue and other “Variants” like implementation tests – finale
The latest resulting TAnyValue implementation presented in the last article was good, fast and had resonable memory size (12 bytes). But two things bothered me to no end:
- I could not finalize the record when it goes out of scope, so I can’t properly clean up, resulting in non-optimal solutions for data fields (dynamic array, IInterface…)
- I had to use three fields for data. One dynamic record, one IInterface and one dynamic array. This was so because I needed to store strings and interfaces along with extended type and have all that clean on its own when record falls out of scope. Yes I could only use IInterface for all non trivial data types but the speed goes down the drain. Having three fields was fine on 32 bit, resulting in 12 bytes size of the record, but on 64x is goes up to 24 bytes. I was not happy at all.
I don’t give up so easily so I pondered what are my options. I really do not understand why in all these years Delphi development team did not make records with destructors and constructors available. It would just take them one simple call to a procedure for each. Maybe I am not seeing something, but it would be trivial for them to make it. Doing that would open vast possibilities for developers. So I had to find a way to fix that shortcoming. The only solution I could see working, was hooking some of the procedures in System.pas. The main targets would be:
- _FinalizeRecord
- _CopyRecord
- _AddRefRecord
The most important here is _FinalizeRecord which is called by the compiler when record goes out of scope and such a record contains an open array or IInterface or a string variable. Basically each variable that needs finalization triggers the call to _FinalizeRecord. The problem is that this is compiler magic. Compiler knows at compile time what the record holds and calls the procedure if needed, when the record goes out of scope and is freed. So I had to hook it and make sure that it gets called for TAnyValue. I don’t like hooks, I see them as last resort to solve a particular problem. Here this was a last resort and it was not a system wide hook (which should be used only in very, very special cases). Before I would use the hook, I needed to see if certain conditions were meet.
- The hook needs to be stable under 32 and 64 bit.
- There should be minimal to no overhead from the hook
- The hook should not interfere with other hooks and with original procedure
- You should avoid hooking procedures or functions with very high call frequency
To answer that:
- I used KOLDetours.pas which will become part of the Cromis library. The unit is very stable and work equally well under 32 and 64 bit. Also the licence is very liberal and I can easily include it in my library. Naturally I will retain the licence and made perfectly clear who is the author and who the credit goes to.
- The overhead is basically non existent. All I do is one simple pointer comparison and then I either call my finalize or the original finalize.
- Because this is not a simple patch but a detour I just call the original _FinalizeRecord if the record is not TAnyValue. Also this technique allows for multiple hooks to coexist.
- _FinalizeRecord and _CopyRecord have not such a high frequency as GetMem or FreeMem and similar. They are frequently called but because of no overhead that is not an issue. I did not hook _AddRefRecord because I believe that it is not even ever called for the solution I made.
Ok to the implementation then. How does it all work? First we hook the procedures and get the pointer to the TAnyValue TypeInfo.
initialization vTypeInfo := TypeInfo(TAnyValue); OldCopyRecord := InterceptCreate(GetCopyRecordAddress, @CustomCopyRecord); OldFinalizeRecord := InterceptCreate(GetFinalizeRecordAddress, @CustomFinalizeRecord); |
Here we need the addresses of functions to hook. We can’t get them from pascal, because they are protected, compiler won’t see them that way. But we can get them with assembler.
initialization function GetFinalizeRecordAddress: Pointer; asm {$IFDEF CPUX64} mov rcx, offset System.@FinalizeRecord; mov @Result, rcx; {$ELSE} mov @Result, offset System.@FinalizeRecord; {$ENDIF} end; |
You can get any address this way. Now that we have the addresses, we can write our improved and specialized _FinalizeRecord and _CopyRecord for TAnyValue
procedure FinalizeAnyValue(p : PAnyValue); begin if p.ValueType <> avtNone then begin case p.ValueType of avtString, avtAnsiString, avtWideString: string(PValueData(@p.ValueData).VPointer) := ''; avtInterface: IInterface(PValueData(@p.ValueData).VPointer) := nil; {$IFNDEF CPUX64} avtFloat: FreeMem(PValueData(@p.ValueData).VPointer); {$ENDIF} end; // set type to none and erase all data PValueData(@p.ValueData).VPointer := nil; p.ValueType := avtNone; end; end; procedure CopyAnyValue(dest, source : PAnyValue); var dstData: PValueData; srcData: PValueData; begin dstData := PValueData(@dest.ValueData); srcData := PValueData(@source.ValueData); if dest.ValueType <> source.ValueType then begin FinalizeAnyValue(dest); dest.ValueType := source.ValueType; {$IFNDEF CPUX64} case dest.ValueType of avtFloat: GetMem(dstData.VPointer, SizeOf(Extended)); end; {$ENDIF} end; case source.ValueType of {$IFNDEF CPUX64} avtFloat: PExtended(dstData.VPointer)^ := PExtended(srcData.VPointer)^; {$ENDIF} avtInterface: IInterface(dstData.VPointer) := IInterface(srcData.VPointer); avtString, avtAnsiString, avtWideString: string(dstData.VPointer) := string(srcData.VPointer); else dstData^ := srcData^; end; end; |
All that is left are the detour functions.
procedure CustomCopyRecord(Dest, Source, TypeInfo: Pointer); begin if vTypeInfo = typeInfo then CopyAnyValue(PAnyValue(Dest), PAnyValue(Source)) else OldCopyRecord(Dest, Source, typeInfo); end; procedure CustomFinalizeRecord(p: Pointer; typeInfo: Pointer); begin if vTypeInfo = typeInfo then FinalizeAnyValue(PAnyValue(p)) else OldFinalizeRecord(p, typeInfo); end; |
See how little overhead there is. Basically none, I just compare two pointers and that is all. Also for other records then TAnyValue I just call the original functions. Here I have to say that Eric Grange who jumped in to the discussion at the last article helped me a lot with hooks and with record structure. So a public thanks to him for all the ideas and solutions he helped me provide with.
Ok the dirty hooking details are behind us. Now lets see how we did with the record structure. Because we can now cleanup properly, we can be much more creative with how we structure our record. I made the following structure.
PAnyValue = ^TAnyValue; TAnyValue = packed record private ValueData: IInterface; {$HINTS OFF} {$IFNDEF CPUX64} Padding : array [0..3] of Byte; {$ENDIF} {$HINTS ON} ValueType: TValueType; function GetAsInt64: Int64; inline; ... |
You may now wonder why the IInterface variable in the record. Well it is there for two purposes. First we need to trigger the _FinalizeRecord by the compiler and IInterface does just that. On the other hand it provides us with 4 bytes of space we can use. Now you probably know, why the additional array [0..3] of Byte. I need additional 4 bytes on 32 bit to store Int64, Double etc… directly without calling GetMem. On 64 bit IInterface itself is 8 bytes so no padding is needed. Quite a neat solution. It only takes 8 bytes on 32 bit and 64 bit per single record (plus one byte for type enumeration). Lets just look how data is stored and read. For most types its just like that
procedure TAnyValue.SetAsInteger(const Value: Integer); begin if ValueType <> avtInteger then begin Self.Clear; ValueType := avtInteger; end; // assign the actual value PValueData(@ValueData).VInteger := Value; end; function TAnyValue.GetAsInteger: Integer; begin if ValueType = avtInteger then Result := PValueData(@ValueData).VInteger else Result := GetAsIntegerWithCast; end; function TAnyValue.GetAsIntegerWithCast: Integer; begin case ValueType of avtBoolean: Result := Integer(GetAsBoolean); avtString: Result := StrToInt(GetAsString); avtAnsiString: Result := StrToInt(string(GetAsAnsiString)); avtWideString: Result := StrToInt(string(GetAsWideString)); else raise Exception.Create('Value cannot be converted to Integer'); end; end; |
Separate getters are because of the inlining and exceptions. Strings on the other hand are store like this
procedure TAnyValue.SetAsString(const Value: string); begin Self.Clear; ValueType := avtString; string(PValueData(@ValueData).VPointer) := Value; end; function TAnyValue.GetAsString: string; begin if ValueType = avtString then Result := string(PValueData(@ValueData).VPointer) else Result := GetAsStringWithCast; end; function TAnyValue.GetAsStringWithCast: string; begin case ValueType of avtNone: Result := ''; avtBoolean: Result := BoolToStr(AsBoolean, True); avtCardinal: Result := IntToStr(AsCardinal); avtInteger: Result := IntToStr(AsInteger); avtInt64: Result := IntToStr(AsInt64); avtFloat: Result := FloatToStr(AsFloat); avtDateTime: Result := DateTimeToStr(AsDateTime); {$IFDEF UNICODE} avtAnsiString: Result := string(AsAnsiString); {$ENDIF} avtWideString: Result := AsWideString; else raise Exception.Create('Value cannot be converted to string'); end; end; |
This way we have reference counting which is very important. Same goes for interfaces. The only special one is Extended. It is like this.
procedure TAnyValue.SetAsFloat(const Value: Extended); begin if ValueType <> avtFloat then begin Self.Clear; ValueType := avtFloat; {$IFNDEF CPUX64} GetMem(PValueData(@ValueData).VPointer, SizeOf(Extended)); {$ENDIF} end; {$IFNDEF CPUX64} PExtended(PValueData(@ValueData).VPointer)^ := Value; {$ELSE} PValueData(@ValueData).VDouble := Value; {$ENDIF} end; function TAnyValue.GetAsFloat: Extended; begin if ValueType = avtFloat then begin {$IFNDEF CPUX64} Result := PExtended(PValueData(@ValueData).VPointer)^ {$ELSE} Result := PValueData(@ValueData).VDouble {$ENDIF} end else Result := GetAsFloatWithCast; end; |
Because extended is 10 bytes on 32 bit systems, I refused to use that 2 additional bytes. It messes with aligning and its just not worth the trouble. The speed difference is negligible anyway. You might wonder why not just use Double? As you know Extended is mapped to Double on 64 bit anyway. I do use double. I have special getter and setter for Double type. The reason for using extended is twofold:
- Some people may need extended sometimes. The reasons may wary but still, they may need it.
- The compiler resolves I / 5 to extended type. So direct assignment fails.
This won’t compile if you have only Double on the implicit operators
AnyValue := I / 5; |
You have to write it like this
AnyValue.AsDouble := I / 5; |
The final tests, taken again, only on 32 bit XE3 are
Delphi XE3 test (times are in ms for 10000000 operations):
| Type | Variants | TValue | TAnyValue | TOmniValue | TVariableRec |
| j := I | 149 | 266 | 87 | 270 | 62 |
| j := I/5 | 236 | 316 | 101 | 4151 | 222 |
| j := IntToStr(I) | 3291 | 4856 | 2185 | 5776 | 1648 |
| ALL | 3933 | 5681 | 2462 | 12689 | 2205 |
The results show how close to the best speed, that a full blown record has, we came. Very close to the limit. I thinks this is really a very good variable container for data holding and transfer, way better then variants. And it can be further developed almost without limitations now that we can finalize. I also think that hooking is so stable and has no overhead that this is a perfectly viable solution. I will test this further and when the code is clean and well tested it will become the official TAnyValue.
You can download the current code along with the test here. It has to be added that the test only tests assigning variables and reading them. It does not address memory allocations / releases. I have a new test for that, but that is already way beyond the scope of this article.
Cesar Romero wrote,
Iztok,
Just to satisfy my curiosity
Did you have done testes with Free Pascal?
Are there any code of TAnyValue that is specific from Dephi?
Link | February 14th, 2013 at 12:05 pm
Iztok Kacin wrote,
No I have not done it with Free Pascal. I would like to support Free Pascal (Lazarus) where I can, but the main and immediate showstopper for me is the lack of name spaces support for file names in Lazarus. As I know the compiler supports them but the IDE does not. I will not rename all my units just to go around that. Furthermore I love dots in file names to separate files by namespaces. The other problem would probably be the inclusion of detours unit. AnyValue is platform agnostic, but detours could give some problems if it uses to many windows APIs.
Link | February 14th, 2013 at 12:21 pm
Eric wrote,
Handling _AddRef is required, it happens when you pass a managed record by value, f.i.
function TriggerAddRef(av : TAnyValue) : TAnyValue;
begin
Result:=av;
end;
when not inlined will make a call to both _AddRef & _CopyRecord.
Link | February 14th, 2013 at 2:13 pm
Iztok Kacin wrote,
Will do some test to see when and how often it is called, thanks. But I think this has little to no impact on the performance. Maybe in some very special cases.
Link | February 14th, 2013 at 2:37 pm
TURBU Tech » Blog Archive » The next RTTI bottleneck wrote,
[...] and started working on building or optimizing their own meta-value types. Some people are even still working on that today. But one of the most interesting things was Robert Love’s response. He looked at the [...]
Link | March 1st, 2013 at 7:01 am
Patrick van Logchem wrote,
{$HINTS OFF} on Padding can be removed if you reference this array from RemoveWarnings.
Link | March 8th, 2013 at 2:25 pm
Iztok Kacin wrote,
Thanks for the hint
Link | March 8th, 2013 at 2:32 pm
Patrick van Logchem wrote,
LOL!
Also, TAnyValue.SetAsBoolean and TAnyValue.SetAsCardinal are missing the Self.Clear construct, which could lead to a memory leak.
Link | March 8th, 2013 at 3:44 pm
Iztok Kacin wrote,
Thanks, fixed it. I will upload new code on the weekend with the improved sliced array. That is why going public is good, others can fix my code
Joking aside more people see the code the greater the chance they will see something I missed.
Link | March 8th, 2013 at 3:53 pm