mirror of
https://github.com/python/cpython.git
synced 2024-12-01 03:01:36 +01:00
dea48ec581
The idea is Marc-Andre Lemburg's, the implementation is Tim's. Add a new ma_smalltable member to dictobjects, an embedded vector of MINSIZE (8) dictentry structs. Short course is that this lets us avoid additional malloc(s) for dicts with no more than 5 entries. The changes are widespread but mostly small. Long course: WRT speed, all scalar operations (getitem, setitem, delitem) on non-empty dicts benefit from no longer needing NULL-pointer checks (ma_table is never NULL anymore). Bulk operations (copy, update, resize, clearing slots during dealloc) benefit in some cases from now looping on the ma_fill count rather than on ma_size, but that was an unexpected benefit: the original reason to loop on ma_fill was to let bulk operations on empty dicts end quickly (since the NULL-pointer checks went away, empty dicts aren't special-cased any more). Special considerations: For dicts that remain empty, this change is a lose on two counts: the dict object contains 8 new dictentry slots now that weren't needed before, and dict object creation also spends time memset'ing these doomed-to-be-unsused slots to NULLs. For dicts with one or two entries that never get larger than 2, it's a mix: a malloc()/free() pair is no longer needed, and the 2-entry case gets to use 8 slots (instead of 4) thus decreasing the chance of collision. Against that, dict object creation spends time memset'ing 4 slots that aren't strictly needed in this case. For dicts with 3 through 5 entries that never get larger than 5, it's a pure win: the dict is created with all the space they need, and they never need to resize. Before they suffered two malloc()/free() calls, plus 1 dict resize, to get enough space. In addition, the 8-slot table they ended with consumed more memory overall, because of the hidden overhead due to the additional malloc. For dicts with 6 or more entries, the ma_smalltable member is wasted space, but then these are large(r) dicts so 8 slots more or less doesn't make much difference. They still benefit all the time from removing ubiquitous dynamic null-pointer checks, and get a small benefit (but relatively smaller the larger the dict) from not having to do two mallocs, two frees, and a resize on the way *to* getting their sixth entry. All in all it appears a small but definite general win, with larger benefits in specific cases. It's especially nice that it allowed to get rid of several branches, gotos and labels, and overall made the code smaller. |
||
---|---|---|
.. | ||
.cvsignore | ||
abstract.c | ||
bufferobject.c | ||
cellobject.c | ||
classobject.c | ||
cobject.c | ||
complexobject.c | ||
dictobject.c | ||
fileobject.c | ||
floatobject.c | ||
frameobject.c | ||
funcobject.c | ||
intobject.c | ||
iterobject.c | ||
listobject.c | ||
longobject.c | ||
methodobject.c | ||
moduleobject.c | ||
object.c | ||
obmalloc.c | ||
rangeobject.c | ||
sliceobject.c | ||
stringobject.c | ||
tupleobject.c | ||
typeobject.c | ||
unicodectype.c | ||
unicodeobject.c | ||
unicodetype_db.h | ||
xxobject.c |