0
0
mirror of https://github.com/python/cpython.git synced 2024-11-22 05:26:10 +01:00
Commit Graph

298 Commits

Author SHA1 Message Date
Serhiy Storchaka
061e50f196
gh-122943: Add the varpos parameter in _PyArg_UnpackKeywords (GH-126564)
Remove _PyArg_UnpackKeywordsWithVararg.
Add comments for integer arguments of _PyArg_UnpackKeywords.
2024-11-08 14:23:50 +02:00
Victor Stinner
b9a8ca0a6a
gh-115754: Use Py_GetConstant(Py_CONSTANT_EMPTY_STR) (#125194)
Replace PyUnicode_New(0, 0), PyUnicode_FromString("")
and PyUnicode_FromStringAndSize("", 0)
with Py_GetConstant(Py_CONSTANT_EMPTY_STR).
2024-10-09 17:15:23 +02:00
Victor Stinner
c203955f3b
gh-124502: Optimize unicode_eq() (#125105) 2024-10-08 16:25:24 +02:00
Victor Stinner
12af8ec864
gh-121040: Use __attribute__((fallthrough)) (#121044)
Fix warnings when using -Wimplicit-fallthrough compiler flag.

Annotate explicitly "fall through" switch cases with a new
_Py_FALLTHROUGH macro which uses __attribute__((fallthrough)) if
available. Replace "fall through" comments with _Py_FALLTHROUGH.

Add _Py__has_attribute() macro. No longer define __has_attribute()
macro if it's not defined. Move also _Py__has_builtin() at the top
of pyport.h.

Co-Authored-By: Nikita Sobolev <mail@sobolevn.me>
2024-06-27 09:58:44 +00:00
Ruben Vorderman
945a89b48f
gh-120196: Reuse find_max_char() for bytes objects (#120497) 2024-06-17 12:21:58 +02:00
Ruben Vorderman
2078eb45ca
gh-120397: Optimize str.count() for single characters (#120398) 2024-06-13 16:28:59 +02:00
d.grigonis
a8f1152b70
gh-119879: str.find(): Utilize last character gap for two-way periodic needles (#119880) 2024-06-04 03:44:49 -04:00
Victor Stinner
0518edc170
gh-119396: Optimize unicode_repr() (#119617)
Use stringlib to specialize unicode_repr() for each string kind
(UCS1, UCS2, UCS4).

Benchmark:

+-------------------------------------+---------+----------------------+
| Benchmark                           | ref     | change2              |
+=====================================+=========+======================+
| repr('abc')                         | 100 ns  | 103 ns: 1.02x slower |
+-------------------------------------+---------+----------------------+
| repr('a' * 100)                     | 369 ns  | 369 ns: 1.00x slower |
+-------------------------------------+---------+----------------------+
| repr(('a' + squote) * 100)          | 1.21 us | 946 ns: 1.27x faster |
+-------------------------------------+---------+----------------------+
| repr(('a' + nl) * 100)              | 1.23 us | 907 ns: 1.36x faster |
+-------------------------------------+---------+----------------------+
| repr(dquote + ('a' + squote) * 100) | 1.08 us | 858 ns: 1.25x faster |
+-------------------------------------+---------+----------------------+
| Geometric mean                      | (ref)   | 1.16x faster         |
+-------------------------------------+---------+----------------------+
2024-05-28 18:05:20 +02:00
Serhiy Storchaka
b313cc68d5
gh-117557: Improve error messages when a string, bytes or bytearray of length 1 are expected (GH-117631) 2024-05-28 12:01:37 +03:00
Erlend E. Aasland
deb921f851
gh-117431: Adapt bytes and bytearray .find() and friends to Argument Clinic (#117502)
This change gives a significant speedup, as the METH_FASTCALL calling
convention is now used. The following bytes and bytearray methods are adapted:

- count()
- find()
- index()
- rfind()
- rindex()

Co-authored-by: Inada Naoki <songofacandy@gmail.com>
2024-04-12 07:40:55 +00:00
Victor Stinner
be5e8a0103
gh-110964: Remove private _PyArg functions (#110966)
Move the following private functions and structures to
pycore_modsupport.h internal C API:

* _PyArg_BadArgument()
* _PyArg_CheckPositional()
* _PyArg_NoKeywords()
* _PyArg_NoPositional()
* _PyArg_ParseStack()
* _PyArg_ParseStackAndKeywords()
* _PyArg_Parser structure
* _PyArg_UnpackKeywords()
* _PyArg_UnpackKeywordsWithVararg()
* _PyArg_UnpackStack()
* _Py_ANY_VARARGS()

Changes:

* Python/getargs.h now includes pycore_modsupport.h to export
  functions.
* clinic.py now adds pycore_modsupport.h when one of these functions
  is used.
* Add pycore_modsupport.h includes when a C extension uses one of
  these functions.
* Define Py_BUILD_CORE_MODULE in C extensions which now include
  directly or indirectly (via code generated by Argument Clinic)
  pycore_modsupport.h:

  * _csv
  * _curses_panel
  * _dbm
  * _gdbm
  * _multiprocessing.posixshmem
  * _sqlite.row
  * _statistics
  * grp
  * resource
  * syslog

* _testcapi: bad_get() no longer uses METH_FASTCALL calling
  convention but METH_VARARGS. Replace _PyArg_UnpackStack() with
  PyArg_ParseTuple().
* _testcapi: add PYTESTCAPI_NEED_INTERNAL_API macro which is defined
  by _testcapi sub-modules which need the internal C API
  (pycore_modsupport.h): exceptions.c, float.c, vectorcall.c,
  watchers.c.
* Remove Include/cpython/modsupport.h header file.
  Include/modsupport.h no longer includes the removed header file.
* Fix mypy clinic.py
2023-10-17 14:30:31 +02:00
Victor Stinner
ad73674283
gh-107603: Argument Clinic: Only include pycore_gc.h if needed (#108726)
Argument Clinic now only includes pycore_gc.h if PyGC_Head is needed,
and only includes pycore_runtime.h if _Py_ID() is needed.

* Add 'condition' optional argument to Clinic.add_include().
* deprecate_keyword_use() includes pycore_runtime.h when using
  the _PyID() function.
* Fix rendering of includes: comments start at the column 35.
* Mark PC/clinic/_wmimodule.cpp.h and
  "Objects/stringlib/clinic/*.h.h" header files as generated in
  .gitattributes.

Effects:

* 42 header files generated by AC no longer include the internal C
  API, instead of 4 header files before. For example,
  Modules/clinic/_abc.c.h no longer includes the internal C API.
* Fix _testclinic_depr.c.h: it now always includes pycore_runtime.h
  to get _Py_ID().
2023-08-31 23:42:34 +02:00
Victor Stinner
8ba4714611
gh-106320: Remove private AC converter functions (#108505)
Move these private functions to the internal C API
(pycore_abstract.h):

* _Py_convert_optional_to_ssize_t()
* _PyNumber_Index()

Argument Clinic now emits #include "pycore_abstract.h" when these
functions are used.

The parser of the c-analyzer tool now uses a list of files which use
the limited C API, rather than a list of files using the internal C
API.
2023-08-26 04:05:17 +02:00
Victor Stinner
4e5a7284ee
gh-108444: Argument Clinic uses PyLong_AsInt() (#108458)
Argument Clinic now uses the new public PyLong_AsInt(), rather than
the old name _PyLong_AsInt().
2023-08-25 00:51:22 +02:00
Victor Stinner
c38c66687f
gh-106320: Add pycore_complexobject.h header file (#106339)
Add internal pycore_complexobject.h header file.

Move _Py_c_xxx() functions and _PyComplex_FormatAdvancedWriter()
function to this new header file.
2023-07-02 21:19:59 +00:00
Victor Stinner
ef300937c2
gh-92536: Remove PyUnicode_READY() calls (#105210)
Since Python 3.12, PyUnicode_READY() does nothing and always
returns 0.
2023-06-02 01:33:17 +02:00
Victor Stinner
7d07e5891d
gh-105156: Cleanup usage of old Py_UNICODE type (#105158)
* refcounts.dat:

  * Remove Py_UNICODE functions.
  * Replace Py_UNICODE argument type with wchar_t.

* _PyUnicode_ToLowercase(), _PyUnicode_ToUppercase(),
  _PyUnicode_ToTitlecase() are no longer deprecated in comments.
  It's no longer needed since they now use Py_UCS4 type, rather than
  the deprecated Py_UNICODE type.
* gdb: Remove unused char_width() method.
2023-06-01 07:18:09 +00:00
Victor Stinner
135ec7cefb
gh-99537: Use Py_SETREF() function in C code (#99657)
Fix potential race condition in code patterns:

* Replace "Py_DECREF(var); var = new;" with "Py_SETREF(var, new);"
* Replace "Py_XDECREF(var); var = new;" with "Py_XSETREF(var, new);"
* Replace "Py_CLEAR(var); var = new;" with "Py_XSETREF(var, new);"

Other changes:

* Replace "old = var; var = new; Py_DECREF(var)"
  with "Py_SETREF(var, new);"
* Replace "old = var; var = new; Py_XDECREF(var)"
  with "Py_XSETREF(var, new);"
* And remove the "old" variable.
2022-11-22 13:39:11 +01:00
Victor Stinner
3a1dde8f29
gh-99300: Use Py_NewRef() in Objects/ directory (#99354)
Replace Py_INCREF() and Py_XINCREF() with Py_NewRef() and
Py_XNewRef() in C files of the Objects/ directory.
2022-11-10 23:58:07 +01:00
Victor Stinner
df3a6d9beb
gh-97982: Remove asciilib_count() (#98164)
asciilib_count() is the same than ucs1lib_count(): the code is not
specialized for ASCII strings, so it's not worth it to have a
separated function. Remove asciilib_count() function.
2022-10-11 17:59:58 +02:00
Dennis Sweeney
69d9a08099
gh-94808: improve comments and coverage of fastsearch.h (GH-96760) 2022-09-13 14:25:10 -04:00
Erlend E. Aasland
f07adf82f3
gh-90928: Improve static initialization of keywords tuple in AC (#95907) 2022-08-13 12:09:40 +02:00
Eric Snow
6f6a4e6cc5
gh-90928: Statically Initialize the Keywords Tuple in Clinic-Generated Code (gh-95860)
We only statically initialize for core code and builtin modules.  Extension modules still create
the tuple at runtime.  We'll solve that part of interpreter isolation separately.

This change includes generated code. The non-generated changes are in:

* Tools/clinic/clinic.py
* Python/getargs.c
* Include/cpython/modsupport.h
* Makefile.pre.in (re-generate global strings after running clinic)
* very minor tweaks to Modules/_codecsmodule.c and Python/Python-tokenize.c

All other changes are generated code (clinic, global strings).
2022-08-11 15:25:49 -06:00
Christian Heimes
774ef28814
gh-84461: Silence some compiler warnings on WASM (GH-93978) 2022-06-20 13:34:40 +02:00
goldsteinn
7108bdf27c
gh-93033: Use wmemchr in stringlib (GH-93034)
Generally comparable perf for the "good" case where memchr doesn't
return any collisions (false matches on lower byte) but clearly faster
with collisions.
2022-05-24 10:45:31 +09:00
Victor Stinner
f62ad4f2c4
gh-89653: Use int type for Unicode kind (#92704)
Use the same type that PyUnicode_FromKindAndData() kind parameter
type (public C API): int.
2022-05-13 12:41:05 +02:00
Victor Stinner
db388df1d9
gh-89653: PEP 670: Convert PyUnicode_KIND() macro to function (#92705)
In the limited C API version 3.12, PyUnicode_KIND() is now
implemented as a static inline function. Keep the macro for the
regular C API and for the limited C API version 3.11 and older to
prevent introducing new compiler warnings.

Update _decimal.c and stringlib/eq.h for PyUnicode_KIND().
2022-05-13 11:49:56 +02:00
Inada Naoki
f9c9354a7a
gh-92536: PEP 623: Remove wstr and legacy APIs from Unicode (GH-92537) 2022-05-12 14:48:38 +09:00
Victor Stinner
b270b82f11
gh-91320: Argument Clinic uses _PyCFunction_CAST() (#32210)
Replace "(PyCFunction)(void(*)(void))func" cast with
_PyCFunction_CAST(func).
2022-05-03 20:25:41 +02:00
Serhiy Storchaka
18b07d773e
bpo-36819: Fix crashes in built-in encoders with weird error handlers (GH-28593)
If the error handler returns position less or equal than the starting
position of non-encodable characters, most of built-in encoders didn't
properly re-size the output buffer. This led to out-of-bounds writes,
and segfaults.
2022-05-02 12:37:48 +03:00
Victor Stinner
097f74a5a3
bpo-46670: Define all macros for stringlib (GH-31176)
bytesobject.c, bytearrayobject.c and unicodeobject.c now define all
macros used by stringlib, to avoid using undefined macros.
Fix "gcc -Wundef" warnings.
2022-02-07 01:26:58 +01:00
Victor Stinner
0a883a76cd
bpo-35134: Add Include/cpython/floatobject.h (GH-28957)
Split Include/floatobject.h into sub-files: add
Include/cpython/floatobject.h and
Include/internal/pycore_floatobject.h.
2021-10-14 23:41:06 +02:00
Dennis Sweeney
d01dceb88b
bpo-41972: Tweak fastsearch.h string search algorithms (GH-27091) 2021-07-19 12:58:32 +02:00
E-Paine
e9f66aedf4
Remove effbot urls (GH-26308) 2021-05-22 14:09:54 +02:00
Jessica Clarke
dec0757549
bpo-43179: Generalise alignment for optimised string routines (GH-24624)
* Remove m68k-specific hack from ascii_decode

On m68k, alignments of primitives is more relaxed, with 4-byte and
8-byte types only requiring 2-byte alignment, thus using sizeof(size_t)
does not work. Instead, use the portable alternative.

Note that this is a minimal fix that only relaxes the assertion and the
condition for when to use the optimised version remains overly strict.
Such issues will be fixed tree-wide in the next commit.

NB: In C11 we could use _Alignof(size_t) instead, but for compatibility
we use autoconf.

* Optimise string routines for architectures with non-natural alignment

C only requires that sizeof(x) is a multiple of alignof(x), not that the
two are equal. Thus anywhere where we optimise based on alignment we
should be using alignof(x) not sizeof(x).

This is more annoying than it would be in C11 where we could just use
_Alignof(x) (and alignof(x) in C++11), but since we still require only
C99 we must plumb the information all the way from autoconf through the
various typedefs and defines.
2021-03-31 12:12:39 +02:00
Dennis Sweeney
73a85c4e1d
bpo-41972: Use the two-way algorithm for string searching (GH-22904)
Implement an enhanced variant of Crochemore and Perrin's Two-Way string searching algorithm, which reduces worst-case time from quadratic (the product of the string and pattern lengths) to linear. This applies to forward searches (like``find``, ``index``, ``replace``); the algorithm for reverse searches (like ``rfind``) is not changed.

Co-authored-by: Tim Peters <tim.peters@gmail.com>
2021-02-28 12:20:50 -06:00
Victor Stinner
32bd68c839
bpo-42519: Replace PyObject_MALLOC() with PyObject_Malloc() (GH-23587)
No longer use deprecated aliases to functions:

* Replace PyObject_MALLOC() with PyObject_Malloc()
* Replace PyObject_REALLOC() with PyObject_Realloc()
* Replace PyObject_FREE() with PyObject_Free()
* Replace PyObject_Del() with PyObject_Free()
* Replace PyObject_DEL() with PyObject_Free()
2020-12-01 10:37:39 +01:00
Victor Stinner
00d7abd7ef
bpo-42519: Replace PyMem_MALLOC() with PyMem_Malloc() (GH-23586)
No longer use deprecated aliases to functions:

* Replace PyMem_MALLOC() with PyMem_Malloc()
* Replace PyMem_REALLOC() with PyMem_Realloc()
* Replace PyMem_FREE() with PyMem_Free()
* Replace PyMem_Del() with PyMem_Free()
* Replace PyMem_DEL() with PyMem_Free()

Modify also the PyMem_DEL() macro to use directly PyMem_Free().
2020-12-01 09:56:42 +01:00
Ma Lin
a0c603cb9d
bpo-38252: Use 8-byte step to detect ASCII sequence in 64bit Windows build (GH-16334) 2020-10-18 17:48:38 +03:00
Victor Stinner
f363d0a6e9
bpo-40521: Make empty Unicode string per interpreter (GH-21096)
Each interpreter now has its own empty Unicode string singleton.
2020-06-24 00:10:40 +02:00
Victor Stinner
c41eed1a87
bpo-40521: Make bytes singletons per interpreter (GH-21074)
Each interpreter now has its own empty bytes string and single byte
character singletons.

Replace STRINGLIB_EMPTY macro with STRINGLIB_GET_EMPTY() macro.
2020-06-23 15:54:35 +02:00
Victor Stinner
c6b292cdee
bpo-29882: Add _Py_popcount32() function (GH-20518)
* Rename pycore_byteswap.h to pycore_bitutils.h.
* Move popcount_digit() to pycore_bitutils.h as _Py_popcount32().
* _Py_popcount32() uses GCC and clang builtin function if available.
* Add unit tests to _Py_popcount32().
2020-06-08 16:30:33 +02:00
Serhiy Storchaka
5f4b229df7
bpo-40792: Make the result of PyNumber_Index() always having exact type int. (GH-20443)
Previously, the result could have been an instance of a subclass of int.

Also revert bpo-26202 and make attributes start, stop and step of the range
object having exact type int.

Add private function _PyNumber_Index() which preserves the old behavior
of PyNumber_Index() for performance to use it in the conversion functions
like PyLong_AsLong().
2020-05-28 10:33:45 +03:00
Serhiy Storchaka
578c3955e0
bpo-37999: No longer use __int__ in implicit integer conversions. (GH-15636)
Only __index__ should be used to make integer conversions lossless.
2020-05-26 18:43:38 +03:00
Victor Stinner
d7c657d4b1
bpo-40302: UTF-32 encoder SWAB4() macro use a|b rather than a+b (GH-19572) 2020-04-17 19:13:34 +02:00
Victor Stinner
1ae035b7e8
bpo-40302: Add pycore_byteswap.h header file (GH-19552)
Add a new internal pycore_byteswap.h header file with the following
functions:

* _Py_bswap16()
* _Py_bswap32()
* _Py_bswap64()

Use these functions in _ctypes, sha256 and sha512 modules,
and also use in the UTF-32 encoder.

sha256, sha512 and _ctypes modules are now built with the internal
C API.
2020-04-17 17:47:20 +02:00
Serhiy Storchaka
8f87eefe7f
bpo-39943: Add the const qualifier to pointers on non-mutable PyBytes data. (GH-19472) 2020-04-12 14:58:27 +03:00
Serhiy Storchaka
cd8295ff75
bpo-39943: Add the const qualifier to pointers on non-mutable PyUnicode data. (GH-19345) 2020-04-11 10:48:40 +03:00
Benjamin Peterson
51796e5d26
Update some www.unicode.org URLs to use HTTPS. (GH-18912) 2020-03-10 21:10:59 -07:00
Serhiy Storchaka
eebaa9bfc5
bpo-38249: Expand Py_UNREACHABLE() to __builtin_unreachable() in the release mode. (GH-16329)
Co-authored-by: Victor Stinner <vstinner@python.org>
2020-03-09 20:49:52 +02:00