This article covers some non-obvious behavior of wildcards when copying files, the ambiguous behavior of the cp command, and ways to copy a huge number of files correctly, without omissions or failures.
Let's say we need to copy everything from the /source folder to the /target folder.
The first thing that comes to mind is:
cp /source/* /target
Let's change this command to:
cp -a /source/* /target
The -a option preserves all attributes and permissions and enables recursion. When exact reproduction of permissions is not required, the -r option is sufficient.
After copying, we will find that not all files were copied. Files whose names start with a dot are missing, such as:
.profile
.local
.mc
and the like.
Why did this happen?
Because wildcards are expanded by the shell (bash, in the typical case), not by cp. By default, bash does not match names that start with a dot, because it treats such files as hidden. To avoid this, we have to change the behavior of bash with the command:
shopt -s dotglob
To make this change persist after a reboot, you can create a file such as wildcard.sh containing this command in the /etc/profile.d folder (the folder may be different in your distribution).
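To see the effect of dotglob without touching any real data, here is a small sketch. It assumes bash and uses a scratch directory with made-up file names; the option is enabled only inside a command substitution, so it does not leak into the rest of the session.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory with one ordinary and one hidden file.
demo=$(mktemp -d)
touch "$demo/visible.txt" "$demo/.hidden"

# Default: "*" does not match names that start with a dot.
without=$(cd "$demo" && echo *)

# dotglob is enabled inside the command substitution only, so the
# setting does not affect the rest of the script.
with=$(cd "$demo" && shopt -s dotglob && echo *)

echo "without dotglob: $without"   # visible.txt
echo "with dotglob:    $with"      # .hidden visible.txt

rm -rf "$demo"
```

Even with dotglob set, the asterisk never matches the special entries "." and "..".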
If the source directory contains no files, the shell has nothing to substitute for the asterisk. It passes the pattern to cp unchanged, and the copy also fails. Bash has two options for this kind of situation: failglob and nullglob. We need to set failglob, which prevents the command from being executed at all. nullglob will not help: it expands a pattern that matched nothing to nothing at all, so cp receives only /target and reports an error.
However, if the folder contains thousands of files or more, the wildcard approach should be abandoned completely. Bash expands wildcards into a very long command line like:
cp -a /source/a /source/b /source/c … /target
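The difference between the three behaviors is easier to see by counting what the command would receive. This is a sketch that assumes bash; count_args is a helper invented for the illustration, and the empty directory is a scratch one standing in for an empty /source.

```shell
#!/usr/bin/env bash
# Hypothetical empty directory standing in for an empty /source.
empty=$(mktemp -d)

# Helper for this demo only: prints how many arguments it received.
count_args() { echo "$#"; }

# Default: the unmatched pattern is passed through literally, as one word.
default_count=$(count_args "$empty"/*)

# nullglob: the pattern expands to nothing, so cp would see only the target.
nullglob_count=$(shopt -s nullglob; count_args "$empty"/*)

# failglob: bash prints a "no match" error and never runs the command.
failglob_status=0
failglob_out=$(shopt -s failglob; count_args "$empty"/*) || failglob_status=$?

echo "default:  $default_count argument(s)"
echo "nullglob: $nullglob_count argument(s)"
echo "failglob: exit status $failglob_status, output '$failglob_out'"

rmdir "$empty"
```

With failglob the helper produces no output at all, which is exactly what we want from cp in this situation: the command is simply not run.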
There is a limit on the length of the command line, which we can find out using the command:
getconf ARG_MAX
This prints the maximum command line length in bytes:
2097152
Or:
xargs --show-limits
We get something like:
…
Maximum length of command we could actually use: 2089314
…
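As a rough illustration, we can compare the size of an expanded pattern with this limit before running anything. The sketch below assumes bash and a scratch directory with three made-up files; the "half of ARG_MAX" safety margin is an arbitrary assumption, chosen because environment variables also count against the limit.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory; in practice this would be /source.
src=$(mktemp -d)
touch "$src/a" "$src/b" "$src/c"

limit=$(getconf ARG_MAX)

# printf is a shell builtin, so this expansion is not itself subject to
# the limit. Each name is counted together with its terminating NUL byte.
bytes=$(( $(printf '%s\0' "$src"/* | wc -c) ))

# Rough check only: the margin of one half is an assumption, not a rule.
if [ "$bytes" -lt $((limit / 2)) ]; then
    fits=yes
else
    fits=no
fi
echo "expansion: $bytes bytes, ARG_MAX: $limit, fits: $fits"

rm -rf "$src"
```

This is only an estimate, and it does not make wildcards safe for huge folders; it just shows where the limit comes from.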
So, let's do without wildcards at all.
Let's just write
cp -a /source /target
Here we run into the ambiguous behavior of cp. If the /target folder does not exist, we get what we need.
However, if the /target folder already exists, the files will be copied to the /target/source folder.
We cannot always delete the /target folder in advance, since it may contain files we need, and our goal may be, for example, to supplement the files in /target with files from /source.
If the source and destination folders had the same name, for example if we were copying from /source to /home/source, we could use the command:
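Both outcomes can be reproduced safely in a scratch directory. In this sketch the names target_new and target_old are made up for the demonstration; the cp command is the same in both cases, only the state of the destination differs.

```shell
#!/usr/bin/env bash
# Hypothetical scratch tree standing in for /source and /target.
work=$(mktemp -d)
mkdir "$work/source"
touch "$work/source/file.txt"

# Case 1: the target does not exist, so it becomes a copy of source.
cp -a "$work/source" "$work/target_new"

# Case 2: the target already exists, so source is nested inside it.
mkdir "$work/target_old"
cp -a "$work/source" "$work/target_old"

new_listing=$(ls "$work/target_new")
old_listing=$(ls "$work/target_old")
echo "target_new contains: $new_listing"   # file.txt
echo "target_old contains: $old_listing"   # source

rm -rf "$work"
```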
cp -a /source /home
After copying, the files in /home/source would be supplemented with the files from /source.
This is the logical problem: we can add files to the destination directory if the folders have the same name, but if the names differ, the source folder is placed inside the destination. How can we copy files from /source to /target using cp without wildcards?
To get around this annoying limitation, we use a non-obvious solution:
cp -a /source/. /target
Those who are familiar with DOS and Linux have already understood everything: every folder contains two hidden entries, "." and "..", which are pseudo-folders that link to the current and parent directories.
When copying, cp checks whether /target/. exists and tries to create it.
That directory already exists: it is /target itself.
So the files from /source are copied into /target correctly.
So, let's put this in a bold frame, in our memory or on the wall:
cp -a /source/. /target
The behavior of this command is unambiguous. It works without errors whether you have a million files or none at all.
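A quick way to convince yourself is to try it on a scratch tree. The sketch below uses made-up file names and checks three things: hidden files are copied, the existing contents of the target survive, and an empty source does not cause an error.

```shell
#!/usr/bin/env bash
# Hypothetical scratch tree standing in for /source and /target.
work=$(mktemp -d)
mkdir "$work/source" "$work/empty" "$work/target"
touch "$work/source/visible.txt" "$work/source/.hidden"
touch "$work/target/existing.txt"

# Merge source into the existing target, hidden files included.
cp -a "$work/source/." "$work/target"

# An empty source is not an error with this form.
empty_status=0
cp -a "$work/empty/." "$work/target" || empty_status=$?

# Count everything in the target, including dot files.
entries=$(( $(ls -A "$work/target" | wc -l) ))

# Check that no nested "source" folder appeared inside the target.
nested=no
if [ -e "$work/target/source" ]; then
    nested=yes
fi

echo "entries in target: $entries"                # 3
echo "empty source exit status: $empty_status"    # 0
echo "nested source folder: $nested"              # no

rm -rf "$work"
```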
Conclusions
If you need to copy all files from one folder to another, do not use wildcards. Instead, use cp with /. appended to the source folder. This copies all files, including hidden ones, and does not fail with millions of files or with no files at all.