cpython

mirror of https://github.com/python/cpython.git synced 2024-11-22 05:26:10 +01:00

Author	SHA1	Message	Date
Barney Gale	a6644d4464	GH-73991 : Rework `pathlib.Path.copytree()` into `copy()` (#122369) Rename `pathlib.Path.copy()` to `_copy_file()` (i.e. make it private.) Rename `pathlib.Path.copytree()` to `copy()`, and add support for copying non-directories. This simplifies the interface for users, and nicely complements the upcoming `move()` and `delete()` methods (which will also accept any type of file.) Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>	2024-08-11 22:43:18 +01:00
Barney Gale	98dba73010	GH-73991 : Rework `pathlib.Path.rmtree()` into `delete()` (#122368) Rename `pathlib.Path.rmtree()` to `delete()`, and add support for deleting non-directories. This simplifies the interface for users, and nicely complements the upcoming `move()` and `copy()` methods (which will also accept any type of file.)	2024-08-07 01:34:44 +01:00
Виталий Дмитриев	c4e8196940	Fix duplicated words 'begins with a' in pathlib docstring (#122732)	2024-08-06 18:38:33 +01:00
Barney Gale	c4c7097e64	GH-73991 : Support preserving metadata in `pathlib.Path.copytree()` (#121438) Add preserve_metadata keyword-only argument to `pathlib.Path.copytree()`, defaulting to false. When set to true, we copy timestamps, permissions, extended attributes and flags where available, like `shutil.copystat()`.	2024-07-20 23:32:52 +01:00
Barney Gale	094375b9b7	GH-73991 : Add `pathlib.Path.rmtree()` (#119060) Add a `Path.rmtree()` method that removes an entire directory tree, like `shutil.rmtree()`. The signature of the optional on_error argument matches the `Path.walk()` argument of the same name, but differs from the onexc and onerror arguments to `shutil.rmtree()`. Consistency within pathlib is probably more important. In the private pathlib ABCs, we add an implementation based on `walk()`. Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>	2024-07-20 20:14:13 +00:00
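The `walk()`-based removal described in the entry above can be sketched with the stable `os.walk()` API. This is a minimal illustration under stated assumptions, not the pathlib implementation: `remove_tree` is a hypothetical name, `on_error` is assumed to be a callable that receives an `OSError`, and symlink handling assumes POSIX semantics.

```python
import os


def remove_tree(top, on_error=None):
    """Remove the directory tree rooted at top, walking bottom-up.

    Hypothetical helper: errors are passed to on_error if given,
    otherwise they are silently ignored.
    """
    def handle(exc):
        if on_error is not None:
            on_error(exc)

    # topdown=False yields the deepest directories first, so each
    # directory is already empty by the time we try to remove it.
    for root, dirnames, filenames in os.walk(top, topdown=False, onerror=handle):
        for name in filenames:
            try:
                os.unlink(os.path.join(root, name))
            except OSError as exc:
                handle(exc)
        for name in dirnames:
            path = os.path.join(root, name)
            try:
                # os.walk() lists symlinks to directories in dirnames
                # without descending into them; unlink those (POSIX).
                if os.path.islink(path):
                    os.unlink(path)
                else:
                    os.rmdir(path)
            except OSError as exc:
                handle(exc)
    try:
        os.rmdir(top)
    except OSError as exc:
        handle(exc)
```

Passing `errors.append` as `on_error` collects failures in a list instead of raising, which mirrors the single-argument callback style the commit message describes.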
Barney Gale	88fc0655d4	GH-73991 : Support preserving metadata in `pathlib.Path.copy()` (#120806) Add preserve_metadata keyword-only argument to `pathlib.Path.copy()`, defaulting to false. When set to true, we copy timestamps, permissions, extended attributes and flags where available, like `shutil.copystat()`. The argument has no effect on Windows, where metadata is always copied. Internally (in the pathlib ABCs), path types gain `_readable_metadata` and `_writable_metadata` attributes. These sets of strings describe what kinds of metadata can be retrieved and stored. We take an intersection of `source._readable_metadata` and `target._writable_metadata` to minimise reads/writes. A new `_read_metadata()` method accepts a set of metadata keys and returns a dict with those keys, and a new `_write_metadata()` method accepts a dict of metadata. We might make these public in future, but it's hard to justify while the ABCs are still private.	2024-07-06 17:18:39 +01:00
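The effect of a `preserve_metadata` flag can be approximated with existing `shutil` functions. The sketch below is an assumption-laden illustration (the helper name `copy_file` is hypothetical and this is not how pathlib implements it); it only shows the "copy contents, then optionally copy metadata like `shutil.copystat()`" idea from the entry above.

```python
import shutil


def copy_file(source, target, *, preserve_metadata=False):
    """Copy file contents and, optionally, metadata (hypothetical helper)."""
    # copyfile() copies only the file's contents.
    shutil.copyfile(source, target)
    if preserve_metadata:
        # copystat() copies permission bits and timestamps and, where the
        # platform supports it, flags and extended attributes.
        shutil.copystat(source, target)
    return target
```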
Barney Gale	f09d184821	GH-73991 : Support copying directory symlinks on older Windows (#120807) Check for `ERROR_INVALID_PARAMETER` when calling `_winapi.CopyFile2()` and raise `UnsupportedOperation`. In `Path.copy()`, handle this exception and fall back to the `PathBase.copy()` implementation.	2024-07-03 04:30:29 +01:00
Barney Gale	35e998f560	GH-73991 : Add `pathlib.Path.copytree()` (#120718) Add `pathlib.Path.copytree()` method, which recursively copies one directory to another. This differs from `shutil.copytree()` in the following respects: 1. Our method has a follow_symlinks argument, whereas shutil's has a symlinks argument with an inverted meaning. 2. Our method lacks something like a copy_function argument. It always uses `Path.copy()` to copy files. 3. Our method lacks something like an ignore_dangling_symlinks argument. Instead, users can filter out dangling symlinks with ignore, or ignore exceptions with on_error. 4. Our ignore argument is a callable that accepts a single path object, whereas shutil's accepts a path and a list of child filenames. 5. We add an on_error argument, which is a callable that accepts an `OSError` instance. (`Path.walk()` also accepts such a callable). Co-authored-by: Nice Zombies <nineteendo19d0@gmail.com>	2024-06-23 22:01:12 +01:00
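Two of the differences listed in the entry above (the inverted symlink flag and the single-argument `ignore` callable) can be illustrated by adapting them onto `shutil.copytree()`. This is a sketch only: `as_shutil_ignore` and `copy_tree` are hypothetical names, and the predicate is assumed to receive a string path rather than a path object.

```python
import os
import shutil


def as_shutil_ignore(ignore_path):
    """Adapt a one-argument predicate to shutil.copytree()'s ignore protocol.

    shutil calls ignore(directory, names) and expects the names to skip.
    """
    def ignore(directory, names):
        return {
            name for name in names
            if ignore_path(os.path.join(directory, name))
        }
    return ignore


def copy_tree(source, target, *, follow_symlinks=True, ignore=None):
    """Hypothetical wrapper exposing the inverted-flag, one-argument style."""
    return shutil.copytree(
        source,
        target,
        # shutil's symlinks=True means "copy links as links", which is
        # the opposite of follow_symlinks=True.
        symlinks=not follow_symlinks,
        ignore=as_shutil_ignore(ignore) if ignore is not None else None,
    )
```

For example, `copy_tree(src, dst, ignore=lambda p: p.endswith(".tmp"))` skips every `.tmp` file without the caller having to handle lists of child names.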
Barney Gale	20d5b84f57	GH-73991 : Add follow_symlinks argument to `pathlib.Path.copy()` (#120519) Add support for not following symlinks in `pathlib.Path.copy()`. On Windows we add the `COPY_FILE_COPY_SYMLINK` flag if following symlinks is disabled. If the source is a symlink to a directory, this call will fail with `ERROR_ACCESS_DENIED`. In this case we add `COPY_FILE_DIRECTORY` to the flags and retry. This can fail on old Windowses, which we note in the docs. No news as `copy()` was only just added.	2024-06-19 00:59:54 +00:00
Barney Gale	7c38097add	GH-73991 : Add `pathlib.Path.copy()` (#119058) Add a `Path.copy()` method that copies the content of one file to another. This method is similar to `shutil.copyfile()` but differs in the following ways: - Uses `fcntl.FICLONE` where available (see GH-81338) - Uses `os.copy_file_range` where available (see GH-81340) - Uses `_winapi.CopyFile2` where available, even though this copies more metadata than the other implementations. This makes `WindowsPath.copy()` more similar to `shutil.copy2()`. The method is presently _less_ specified than the `shutil` functions to allow OS-specific optimizations that might copy more or less metadata. Incorporates code from GH-81338 and GH-93152. Co-authored-by: Eryk Sun <eryksun@gmail.com>	2024-06-14 17:15:49 +01:00
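The "use a fast OS primitive where available, otherwise fall back" pattern mentioned in the entry above can be sketched as follows. This is an illustration, not the pathlib code: `copy_contents` is a hypothetical helper, it only tries `os.copy_file_range()` (not `FICLONE` or `CopyFile2`), and the fallback on any `OSError` is a simplifying assumption.

```python
import os
import shutil


def copy_contents(src, dst, chunk=1 << 20):
    """Copy file contents, preferring os.copy_file_range() if present.

    Returns the name of the strategy that completed the copy.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        if hasattr(os, "copy_file_range"):
            try:
                while True:
                    # Copies in-kernel using the descriptors' own offsets;
                    # a return value of 0 signals end of file.
                    sent = os.copy_file_range(
                        fsrc.fileno(), fdst.fileno(), chunk)
                    if sent == 0:
                        return "copy_file_range"
            except OSError:
                # Unsupported here (for example, some filesystems):
                # discard any partial output and start over.
                fsrc.seek(0)
                fdst.seek(0)
                fdst.truncate()
        # Portable userspace fallback.
        shutil.copyfileobj(fsrc, fdst)
        return "copyfileobj"
```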
Barney Gale	242c7498e5	GH-116380 : Move pathlib-specific code from `glob` to `pathlib._abc`. (#120011) In `glob._Globber`, move pathlib-specific methods to `pathlib._abc.PathGlobber` and replace them with abstract methods. Rename `glob._Globber` to `glob._GlobberBase`. As a result, the `glob` module is no longer befouled by code that can only ever apply to pathlib. No change of behaviour.	2024-06-07 17:59:34 +01:00
Barney Gale	e83ce850f4	pathlib ABCs: remove duplicate `realpath()` implementation. (#119178) Add private `posixpath._realpath()` function, which is a generic version of `realpath()` that can be parameterised with string tokens (`sep`, `curdir`, `pardir`) and query functions (`getcwd`, `lstat`, `readlink`). Also add support for limiting the number of symlink traversals. In the private `pathlib._abc.PathBase` class, call `posixpath._realpath()` and remove our re-implementation of the same algorithm. No change to any public APIs, either in `posixpath` or `pathlib`. Co-authored-by: Nice Zombies <nineteendo19d0@gmail.com>	2024-06-05 18:54:50 +01:00
Barney Gale	7ff61f51b6	GH-119169 : Implement `pathlib.Path.walk()` using `os.walk()` (#119573) For silly reasons, pathlib's generic implementation of `walk()` currently resides in `glob._Globber`. This commit moves it into `pathlib._abc.PathBase.walk()` where it really belongs, and makes `pathlib.Path.walk()` call `os.walk()`.	2024-05-29 20:51:04 +00:00
Barney Gale	e418fc3a6e	GH-82805 : Fix handling of single-dot file extensions in pathlib (#118952) pathlib now treats "`.`" as a valid file extension (suffix). This brings it in line with `os.path.splitext()`. In the (private) pathlib ABCs, we add a new `ParserBase.splitext()` method that splits a path into a `(root, ext)` pair, like `os.path.splitext()`. This method is called by `PurePathBase.stem`, `suffix`, etc. In a future version of pathlib, we might make these base classes public, and so users will be able to define their own `splitext()` method to control file extension splitting. In `pathlib.PurePath` we add optimised `stem`, `suffix` and `suffixes` properties that don't use `splitext()`, which avoids computing the path base name twice.	2024-05-25 21:01:36 +01:00
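The `os.path.splitext()` behaviour that the entry above aligns pathlib with can be checked directly. The snippet below only exercises `os.path.splitext()` itself; the sample filenames are arbitrary.

```python
import os.path

# Map each sample name to its (root, ext) pair.
examples = {
    name: os.path.splitext(name)
    for name in ("archive.", "archive.tar.gz", ".bashrc")
}

# A trailing dot is split off as the extension.
print(examples["archive."])        # ('archive', '.')
# Only the last suffix is split off.
print(examples["archive.tar.gz"])  # ('archive.tar', '.gz')
# Leading dots belong to the root, not the extension.
print(examples[".bashrc"])         # ('.bashrc', '')
```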
Barney Gale	3c28510b98	GH-119113 : Raise `TypeError` from `pathlib.PurePath.with_suffix(None)` (#119124) Restore behaviour from 3.12 when `path.with_suffix(None)` is called.	2024-05-19 17:04:56 +01:00
Kirill Podoprigora	31a28cbae0	gh-119049 : Defer `import warnings` in `pathlib._local` (#119111)	2024-05-17 17:12:02 +01:00
Barney Gale	7d8725ac6f	GH-74033 : Drop deprecated `pathlib.Path` keyword arguments (#118793) Remove support for supplying keyword arguments to `pathlib.Path()`. This has been deprecated since Python 3.12.	2024-05-14 20:14:07 +00:00
Barney Gale	fbe6a0988f	GH-101357 : Suppress `OSError` from `pathlib.Path.exists()` and `is_*()` (#118243) Suppress all `OSError` exceptions from `pathlib.Path.exists()` and `is_*()` rather than a selection of more common errors as we do presently. Also adjust the implementations to call `os.path.exists()` etc, which are much faster on Windows thanks to GH-101196.	2024-05-14 17:53:15 +00:00
Barney Gale	f772d0d08a	GH-78707 : Drop deprecated `pathlib.PurePath.[is_]relative_to()` arguments (#118780) Remove support for supplying additional positional arguments to `PurePath.relative_to()` and `is_relative_to()`. This has been deprecated since Python 3.12.	2024-05-10 15:53:46 +00:00
Barney Gale	b4bdf83cc6	GH-116380 : Revert move of pathlib globbing code to `pathlib._glob` (#118678) The previous change made the `glob` module slower to import, because it imported `pathlib._glob` and hence the rest of `pathlib`. Reverts `a40f557d7b`.	2024-05-07 00:32:48 +00:00
Barney Gale	d8d94911e2	Move pathlib implementation out of `__init__.py` (#118582) Use the `__init__.py` file only for imports that define the API, following the example of asyncio.	2024-05-05 20:57:19 +01:00
Barney Gale	a40f557d7b	GH-116380 : Move pathlib globbing implementation into `pathlib._glob` (#118562) Moving this code under the `pathlib` package makes it quite a lot easier to backport in the `pathlib-abc` PyPI package. It was a bit foolish of me to add it to `glob` in the first place. Also add `translate()` to `__all__` in `glob`. This function is new in 3.13, so there's no NEWS needed.	2024-05-03 20:29:25 +00:00
Andrew Zipperer	a6b610a94b	docs: typo: tiny grammar change: "pointed by" -> "pointed to by" (#118411) * docs: tiny grammar change: "pointed by" -> "pointed to by" This commit uses "file pointed to by" to replace "file pointed by" in - doc for shutil.copytree - docstring for shutil.copytree - docstring _abc.PathBase.open - docstring for pathlib.Path.open - doc for os.copy_file_range - doc for os.splice The docs use "file pointed to by" more frequently than "file pointed by". So, this commit replaces the uses of "file pointed by" in order to make the uses consistent through the docs. `grep -ri 'pointed to by' cpython/` yields more results than `grep -ri 'pointed by' cpython/`. Separately: There are two occurrences of "tree pointed by": - cpython/Doc/library/xml.etree.elementtree.rst for `xml.etree.ElementInclude.include` - cpython/Lib/xml/etree/ElementInclude.py for `include` For those uses of "tree pointed by", I expect "tree pointed to by" instead. However, I found enough uses online of (a) "tree pointed by" rather than (b) "tree pointed to by" to convince me that (a) is in common use. So, this commit does not replace those occurrences of "tree pointed by" to "tree pointed to by". But I will replace them if a reviewer believes it is correct to replace them. * docs: typo: "exists and executable" -> "exists and is executable" --------- Co-authored-by: Andrew-Zipperer <atzipperer@gmail.com>	2024-05-02 05:37:12 +00:00
Barney Gale	15fbd53ba9	GH-112855 : Speed up `pathlib.PurePath` pickling (#112856) The second item in the tuple returned from `__reduce__()` is a tuple of arguments to supply to path constructor. Previously we returned the `parts` tuple here, which entailed joining, parsing and normalising the path object, and produced a compact pickle representation. With this patch, we instead return a tuple of paths that were originally given to the path constructor. This makes pickling much faster (at the expense of compactness). It's worth noting that, in the olden times, pathlib performed this parsing/normalization up-front in every case, and so using `parts` for pickling was almost free. Nowadays pathlib only parses/normalises paths when it's necessary or advantageous to do so (e.g. computing a path parent, or iterating over a directory, respectively).	2024-04-20 17:46:52 +01:00
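The trade-off described in the entry above (return the original constructor arguments from `__reduce__()` instead of parsed parts) can be shown with a toy class. `LazyPath` is a hypothetical stand-in written for this sketch; it is not pathlib's implementation and its parsing is deliberately simplistic.

```python
import pickle


class LazyPath:
    """Toy path type that stores its arguments and parses them on demand."""

    def __init__(self, *args):
        self._raw_paths = args

    @property
    def parts(self):
        # Joining and normalising happens only when someone asks for it.
        joined = "/".join(self._raw_paths)
        return tuple(p for p in joined.split("/") if p and p != ".")

    def __eq__(self, other):
        return isinstance(other, LazyPath) and self.parts == other.parts

    def __reduce__(self):
        # Cheap: hand back the original arguments without parsing them.
        # Returning self.parts instead would be more compact in the
        # pickle but would force a parse on every dump.
        return (self.__class__, self._raw_paths)


if __name__ == "__main__":
    original = LazyPath("usr/./local", "bin")
    restored = pickle.loads(pickle.dumps(original))
    print(restored.parts)
```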
Barney Gale	a74f117dab	GH-115060 : Speed up `pathlib.Path.glob()` by omitting initial `stat()` (#117831) Since `6258844c`, paths that might not exist can be fed into pathlib's globbing implementation, which will call `os.scandir()` / `os.lstat()` only when strictly necessary. This allows us to drop an initial `self.is_dir()` call, which saves a `stat()`. Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>	2024-04-14 00:08:03 +01:00
Barney Gale	30f0643e36	GH-117727 : Speed up `pathlib.Path.iterdir()` by using `os.scandir()` (#117728) Replace use of `os.listdir()` with `os.scandir()`. Forgo setting `_drv`, `_root` and `_tail_cached`, as these usually aren't needed. Use `os.DirEntry.path` to set `_str`.	2024-04-12 22:02:39 +00:00
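The switch from `os.listdir()` to `os.scandir()` described in the entry above can be sketched at the string level. `iter_children` is a hypothetical helper; the real method builds path objects, which this illustration does not attempt.

```python
import os


def iter_children(directory):
    """Yield the paths of a directory's children as strings."""
    with os.scandir(directory) as entries:
        for entry in entries:
            # DirEntry.path is already the directory joined with the
            # entry's name, so no separate join is needed.
            yield entry.path
```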
Barney Gale	0eb52f5f26	GH-115060 : Speed up `pathlib.Path.glob()` by not scanning literal parts (#117732) Don't bother calling `os.scandir()` to scan for literal pattern segments, like `foo` in `foo/*.py`. Instead, append the segment(s) as-is and call through to the next selector with `exists=False`, which signals that the path might not exist. Subsequent selectors will call `os.scandir()` or `os.lstat()` to filter out missing paths as needed.	2024-04-12 22:19:21 +01:00
Barney Gale	0cc71bde00	GH-117586 : Speed up `pathlib.Path.walk()` by working with strings (#117726) Move `pathlib.Path.walk()` implementation into `glob._Globber`. The new `glob._Globber.walk()` classmethod works with strings internally, which is a little faster than generating `Path` objects and keeping them normalized. The `pathlib.Path.walk()` method converts the strings back to path objects. In the private pathlib ABCs, our existing subclass of `_Globber` ensures that `PathBase` instances are used throughout. Follow-up to #117589.	2024-04-11 01:26:53 +01:00
Barney Gale	6258844c27	GH-117586 : Speed up `pathlib.Path.glob()` by working with strings (#117589) Move pathlib globbing implementation into a new private class: `glob._Globber`. This class implements fast string-based globbing. It's called by `pathlib.Path.glob()`, which then converts strings back to path objects. In the private pathlib ABCs, add a `pathlib._abc.Globber` subclass that works with `PathBase` objects rather than strings, and calls user-defined path methods like `PathBase.stat()` rather than `os.stat()`. This sets the stage for two more improvements: - GH-115060: Query non-wildcard segments with `lstat()` - GH-116380: Unify `pathlib` and `glob` implementations of globbing. No change to the implementations of `glob.glob()` and `glob.iglob()`.	2024-04-10 20:43:07 +01:00
Barney Gale	6150bb2412	GH-77609 : Add recurse_symlinks argument to `pathlib.Path.glob()` (#117311) Replace tri-state `follow_symlinks` with boolean `recurse_symlinks` argument. The new argument controls whether symlinks are followed when expanding recursive `**` wildcards. The possible argument values correspond as follows (follow_symlinks -> recurse_symlinks): False -> N/A; None -> False; True -> True. We therefore drop support for not following symlinks when expanding non-recursive pattern parts; it wasn't requested in the original issue, and it's a feature not found in any shells. This makes the API easier to grok by eliminating `None` as an option. No news blurb as `follow_symlinks` was new in 3.13.	2024-04-05 18:51:54 +00:00
Barney Gale	752e18389e	GH-114575 : Rename `PurePath.pathmod` to `PurePath.parser` (#116513) And rename the private base class from `PathModuleBase` to `ParserBase`.	2024-03-31 19:14:48 +01:00
Barney Gale	1dce0073da	pathlib ABCs: follow all symlinks in `PathBase.glob()` (#116293) Switch the default value of follow_symlinks from `None` to `True` in `pathlib._abc.PathBase.glob()` and `rglob()`. This speeds up recursive globbing. No change to the public pathlib classes.	2024-03-04 02:26:33 +00:00
Barney Gale	e3dedeae7a	GH-114610 : Fix `pathlib.PurePath.with_stem('')` handling of file extensions (#114612) Raise `ValueError` if `with_stem('')` is called on a path with a file extension. Paths may only have an empty stem if they also have an empty suffix.	2024-02-24 19:37:03 +00:00
Barney Gale	6f93b4df92	GH-115060 : Speed up `pathlib.Path.glob()` by removing redundant regex matching (#115061) When expanding and filtering paths for a `**` wildcard segment, build an `re.Pattern` object from the subsequent pattern parts, rather than the entire pattern, and match against the `os.DirEntry` object prior to instantiating a path object. Also skip compiling a pattern when expanding a `*` wildcard segment.	2024-02-10 18:12:34 +00:00
Barney Gale	1b1f8398d0	GH-106747 : Make pathlib ABC globbing more consistent with `glob.glob()` (#115056) When expanding `**` wildcards, ensure we add a trailing slash to the topmost directory path. This matches `glob.glob()` behaviour: >>> glob.glob('dirA/**', recursive=True) ['dirA/', 'dirA/dirB', 'dirA/dirB/dirC'] This does not affect `pathlib.Path.glob()`, because trailing slashes aren't supported in pathlib proper.	2024-02-06 02:48:18 +00:00
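The `glob.glob()` behaviour referenced in the entry above (a recursive `**` pattern yields the topmost directory with a trailing separator) can be reproduced in a temporary directory. The directory names follow the commit message; the temporary-directory scaffolding is added for this sketch.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "dirA", "dirB", "dirC"))
    pattern = os.path.join(tmp, "dirA", "**")
    # Strip the temporary prefix by slicing so that any trailing
    # separator in a match is preserved.
    matches = sorted(
        match[len(tmp) + 1:]
        for match in glob.glob(pattern, recursive=True)
    )

# On POSIX this prints ['dirA/', 'dirA/dirB', 'dirA/dirB/dirC'].
print(matches)
```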
Barney Gale	574291963f	pathlib ABCs: drop partial, broken, untested support for `bytes` paths. (#114777) Methods like `full_match()`, `glob()`, etc, are difficult to make work with byte paths, and it's not worth the effort. This patch makes `PurePathBase` raise `TypeError` when given non-`str` path segments.	2024-01-31 00:59:33 +00:00
Barney Gale	1667c28686	pathlib ABCs: raise `UnsupportedOperation` directly. (#114776) Raise `UnsupportedOperation` directly, rather than via an `_unsupported()` helper, to give human readers and IDEs/typecheckers/etc a bigger hint that these methods are abstract.	2024-01-31 00:38:01 +00:00
Barney Gale	fda7445ca5	GH-70303 : Make `pathlib.Path.glob('**')` return both files and directories (#114684) Return files and directories from `pathlib.Path.glob()` if the pattern ends with `**`. This is more compatible with `PurePath.full_match()` and with other glob implementations such as bash and `glob.glob()`. Users can add a trailing slash to match only directories. In my previous patch I added a `FutureWarning` with the intention of fixing this in Python 3.15. Upon further reflection I think this was an unnecessarily cautious remedy to a clear bug.	2024-01-30 19:52:53 +00:00
Barney Gale	809eed4805	GH-114610 : Fix `pathlib._abc.PurePathBase.with_suffix('.ext')` handling of stems (#114613) Raise `ValueError` if `with_suffix('.ext')` is called on a path without a stem. Paths may only have a non-empty suffix if they also have a non-empty stem. ABC-only bugfix; no effect on public classes.	2024-01-30 14:25:16 +00:00
Barney Gale	823a38a960	GH-79634 : Speed up pathlib globbing by removing `joinpath()` call. (#114623) Remove `self.joinpath('')` call that should have been removed in `6313cdde`. This makes `PathBase.glob('')` yield itself without adding a trailing slash. It's hard to say whether this is more or less correct, but at least everything else is faster, and there's no behaviour change in the public classes where empty glob patterns are disallowed.	2024-01-27 19:59:51 +00:00
Barney Gale	7e31d6dea2	gh-88569 : add `ntpath.isreserved()` (#95486) Add `ntpath.isreserved()`, which identifies reserved pathnames such as "NUL", "AUX" and "CON". Deprecate `pathlib.PurePath.is_reserved()`. --------- Co-authored-by: Eryk Sun <eryksun@gmail.com> Co-authored-by: Brett Cannon <brett@python.org> Co-authored-by: Steve Dower <steve.dower@microsoft.com>	2024-01-26 18:14:24 +00:00
Barney Gale	b69548a0f5	GH-73435 : Add `pathlib.PurePath.full_match()` (#114350) In `49f90ba` we added support for the recursive wildcard `**` in `pathlib.PurePath.match()`. This should allow arbitrary prefix and suffix matching, like `p.match('foo/**')` or `p.match('**/foo')`, but there's a problem: for relative patterns only, `match()` implicitly inserts a `**` token on the left hand side, causing all patterns to match from the right. As a result, it's impossible to match relative patterns from the left: `PurePath('foo/bar').match('bar/**')` is true! This commit reverts the changes to `match()`, and instead adds a new `full_match()` method that: - Allows empty patterns - Supports the recursive wildcard `**` - Matches the entire path when given a relative pattern	2024-01-26 01:12:46 +00:00
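The right-anchored matching of relative patterns that motivates the entry above can be observed with `PurePath.match()`. The sketch uses only the long-standing `match()` behaviour for its printed results; the `full_match()` call is guarded because that method is assumed to exist only on Python versions that include this commit.

```python
from pathlib import PurePosixPath

path = PurePosixPath("/a/b/c.py")

# Relative patterns are matched from the right-hand end of the path.
right_anchored = path.match("b/*.py")
# A relative pattern describing the left-hand end does not match.
left_anchored = path.match("a/*.py")

print(right_anchored)  # True
print(left_anchored)   # False

# full_match() matches the entire path; it is only called when available.
whole = path.full_match("/a/b/*.py") if hasattr(path, "full_match") else None
```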
Barney Gale	1e610fb05f	GH-113225 : Speed up `pathlib.Path.walk(top_down=False)` (#113693) Use `_make_child_entry()` rather than `_make_child_relpath()` to retrieve path objects for directories to visit. This saves the allocation of one path object per directory in user subclasses of `PathBase`, and avoids a second loop. This trick does not apply when walking top-down, because users can affect the walk by modifying dirnames in-place. A side effect of this change is that, in bottom-up mode, subdirectories of each directory are visited in reverse order, and that this order doesn't match that of the names in dirnames. I suspect this is fine as the order is arbitrary anyway.	2024-01-20 03:06:00 +00:00
Barney Gale	6313cdde58	GH-79634 : Accept path-like objects as pathlib glob patterns. (#114017) Allow `os.PathLike` objects to be passed as patterns to `pathlib.Path.glob()` and `rglob()`. (It's already possible to use them in `PurePath.match()`) While we're in the area: - Allow empty glob patterns in `PathBase` (but not `Path`) - Speed up globbing in `PathBase` by generating paths with trailing slashes only as a final step, rather than for every intermediate directory. - Simplify and speed up handling of rare patterns involving both `**` and `..` segments.	2024-01-20 02:10:25 +00:00
Barney Gale	4de4e654e5	Replace `pathlib._abc.PathModuleBase.splitroot()` with `splitdrive()` (#114065) This allows users of the `pathlib-abc` PyPI package to use `posixpath` or `ntpath` as a path module in versions of Python lacking `os.path.splitroot()` (3.11 and before).	2024-01-14 23:06:04 +00:00
Barney Gale	ca6cf56330	Add `pathlib._abc.PathModuleBase` (#113893) Path modules provide a subset of the `os.path` API, specifically those functions needed to provide `PurePathBase` functionality. Each `PurePathBase` subclass references its path module via a `pathmod` class attribute. This commit adds a new `PathModuleBase` class, which provides abstract methods that unconditionally raise `UnsupportedOperation`. An instance of this class is assigned to `PurePathBase.pathmod`, replacing `posixpath`. As a result, `PurePathBase` is no longer POSIX-y by default, and all its methods raise `UnsupportedOperation` courtesy of `pathmod`. Users who subclass `PurePathBase` or `PathBase` should choose the path syntax by setting `pathmod` to `posixpath`, `ntpath`, `os.path`, or their own subclass of `PathModuleBase`, as circumstances demand.	2024-01-14 21:49:53 +00:00
Barney Gale	21f83efd10	Add module docstring for `pathlib._abc`. (#113691)	2024-01-13 08:47:00 +00:00
Barney Gale	f20b151a1c	pathlib ABCs: add `_raw_path` property (#113976) It's wrong for the `PurePathBase` methods to rely so much on `__str__()`. Instead, they should treat the raw path(s) as opaque objects and leave the details to `pathmod`. This commit adds a `PurePathBase._raw_path` property and uses it through many of the other ABC methods. These methods are all redefined in `PurePath` and `Path`, so this has no effect on the public classes.	2024-01-13 08:03:21 +00:00
Barney Gale	e4ff131e01	GH-44626 , GH-105476 : Fix `ntpath.isabs()` handling of part-absolute paths (#113829) On Windows, `os.path.isabs()` now returns `False` when given a path that starts with exactly one (back)slash. This is more compatible with other functions in `os.path`, and with Microsoft's own documentation. Also adjust `pathlib.PureWindowsPath.is_absolute()` to call `ntpath.isabs()`, which corrects its handling of partial UNC/device paths like `//foo`. Co-authored-by: Jon Foster <jon@jon-foster.co.uk>	2024-01-13 07:36:05 +00:00
Barney Gale	5d8a3e74b5	pathlib ABCs: Require one or more initialiser arguments (#113885) Refuse to guess what a user means when they initialise a pathlib ABC without any positional arguments. In mainline pathlib it's normalised to `.`, but in the ABCs this guess isn't appropriate; for example, the path type may not represent the current directory as `.`, or may have no concept of a "current directory" at all.	2024-01-10 01:12:58 +00:00

69 Commits