Encoding T1 of A6.7.40 in doc#1 is the same encoding as mov(3) of A7.1.44 in doc#2.
Call this encoding "mov3-T1-encoding". Furthermore, when it is used with both Rd and Rm as low registers, call it "mov3-T1-low-encoding".
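To make the two names concrete, here is a minimal Python sketch of the bit layout as I understand it from the Thumb encoding tables: bits [15:8] are 0x46, bit 7 is the top bit of Rd, bits [6:3] are Rm, and bits [2:0] are the low three bits of Rd. The function names are mine and purely illustrative; they do not come from either document.

```python
def encode_mov3_t1(rd: int, rm: int) -> int:
    """Return the 16-bit halfword for mov3-T1-encoding (non-flag-setting)."""
    if not (0 <= rd <= 15 and 0 <= rm <= 15):
        raise ValueError("register numbers must be in 0..15")
    d = rd >> 3  # top bit of Rd is stored separately, in bit 7
    return 0x4600 | (d << 7) | (rm << 3) | (rd & 0x7)


def is_mov3_t1_low(halfword: int) -> bool:
    """True if the halfword is mov3-T1-encoding with both registers low."""
    # Opcode byte 0x46, with bit 7 (Rd high bit) and bit 6 (Rm high bit) clear.
    return (halfword & 0xFFC0) == 0x4600


print(hex(encode_mov3_t1(8, 0)))  # 0x4680: high Rd, low Rm
print(hex(encode_mov3_t1(1, 6)))  # 0x4631: both low, i.e. mov3-T1-low-encoding
```

The only thing that separates mov3-T1-low-encoding from the general form is that bits 7 and 6 are both zero.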
Summary:
The comment in A6.7.40 of doc#1 is not about whether mov3-T1-encoding can take high registers; it can, even on armv4t. It is about whether mov3-T1-low-encoding is legitimate: it is not on architectures < armv6, and it is on architectures >= armv6.
Some details:
For architectures < armv6, the usage of mov3-T1-low-encoding is declared unpredictable by doc#2.
For architectures >= armv6, doc#1 says that only armv6-m and armv7-m can utilize mov3-T1-low-encoding, which is consistent with the above comment about the unpredictability on architectures < armv6.
However, doc#1 is silent about how to generate mov3-T1-low-encoding. The Operation section for mov3-T1-encoding in doc#1 /seems/ to imply that one can write "mov r1, r6" and mov3-T1-low-encoding will be generated. But that inference is wrong, at least in the case of Arm's gcc toolchain.
doc#2 has more details on how to generate mov3-T1-low-encoding. It says that if "mov Rd, Rm" is given as source, with both Rd and Rm as low registers, the assembler should generate a flag-setting copy by emitting "adds Rd, Rm, #0".
To actually get mov3-T1-low-encoding as output when both Rd and Rm are low registers, the source must use "cpy Rd, Rm", where cpy is a mnemonic intended specifically for generating a non-flag-setting copy between low registers. The mnemonic "cpy" is available on architectures >= armv6.
So, there are two sides to the issue: the behaviour of the cpu when it encounters mov3-T1-low-encoding, and the generation of that encoding.
If a cpu implementing an architecture < armv6 encounters mov3-T1-low-encoding, the results are declared unpredictable. A toolchain is therefore not supposed to generate mov3-T1-low-encoding when building for architectures < armv6.
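The cpu-side rule can be written as a small check over a stream of 16-bit Thumb halfwords. This is only a sketch under simplifying assumptions: the architecture is reduced to a plain integer (4 for armv4t, 5 for armv5t, 6 for armv6, and so on), the input is assumed to contain instructions only (no literal pools or other data), and the name find_unpredictable_copies is hypothetical.

```python
from typing import Iterable, List


def find_unpredictable_copies(halfwords: Iterable[int], arch_version: int) -> List[int]:
    """Return the indices of halfwords that are mov3-T1-low-encoding
    when the target architecture is older than armv6."""
    if arch_version >= 6:
        return []  # the low-low form is legitimate from armv6 onwards
    hits = []
    for index, halfword in enumerate(halfwords):
        # Opcode byte 0x46 with both register high bits (bits 7 and 6) clear.
        if (halfword & 0xFFC0) == 0x4600:
            hits.append(index)
    return hits


code = [0x1C31, 0x4631, 0x4680]  # adds r1, r6, #0 / low-low copy / mov r8, r0
print(find_unpredictable_copies(code, 4))  # [1]
print(find_unpredictable_copies(code, 6))  # []
```

Only the middle halfword is flagged for armv4t: the adds form and the high-register form are both fine there.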
Evidently, Arm's gcc toolchain adheres to this rule. It emits the encoding for "adds r1, r6, #0" when assembling "mov r1, r6" as thumb, not only for armv4 and armv5 but also for armv6 and armv7 (and possibly others that I did not test). That is, a mov between low registers in assembler source code is always treated as flag-setting.
It emits the mov3-T1-low-encoding (which is non-flag-setting) when assembling "cpy Rd, Rm" as thumb, for any high-low combination of legal register-arguments of cpy.
For the "mov Rd, Rm" thumb instruction in the source code:
If [architecture < armv6] and [both registers are low], then generate the [flag-setting encoding for "adds Rd, Rm, #0"]
If [architecture < armv6] and [at least one register is high], then generate the [non-flag-setting mov3-T1-encoding]
If [architecture >= armv6] and [both registers are low], then generate the [flag-setting encoding for "adds Rd, Rm, #0"]
If [architecture >= armv6] and [at least one register is high], then generate the [non-flag-setting mov3-T1-encoding]
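The rules above, together with the cpy behaviour described earlier, can be condensed into a short Python model. It mirrors the behaviour reported in this post and is not the assembler's actual implementation. The function name, the integer arch_version parameter (4 for armv4t, 6 for armv6, and so on) and the returned labels are assumptions made for illustration; the "adds Rd, Rm, #0" bit pattern (0x1C00 with Rm in bits [5:3] and Rd in bits [2:0]) is the standard Thumb add-immediate-3 layout as I understand it.

```python
from typing import Tuple


def assemble_thumb_copy(mnemonic: str, rd: int, rm: int, arch_version: int) -> Tuple[str, int]:
    """Model which encoding is emitted for 'mov Rd, Rm' or 'cpy Rd, Rm'."""
    if not (0 <= rd <= 15 and 0 <= rm <= 15):
        raise ValueError("register numbers must be in 0..15")
    both_low = rd < 8 and rm < 8
    mov3_t1 = 0x4600 | ((rd >> 3) << 7) | (rm << 3) | (rd & 0x7)

    if mnemonic == "cpy":
        if arch_version < 6:
            raise ValueError("cpy is only available on architectures >= armv6")
        return ("mov3-T1-encoding", mov3_t1)  # non-flag-setting, any registers

    if mnemonic == "mov":
        if both_low:
            # Flag-setting copy; the architecture does not change the outcome.
            return ("adds Rd, Rm, #0", 0x1C00 | (rm << 3) | rd)
        return ("mov3-T1-encoding", mov3_t1)

    raise ValueError("unsupported mnemonic: " + mnemonic)


for arch in (4, 7):
    name, halfword = assemble_thumb_copy("mov", 1, 6, arch)
    print(arch, name, hex(halfword))  # same adds encoding, 0x1c31, both times
name, halfword = assemble_thumb_copy("cpy", 1, 6, 7)
print(name, hex(halfword))  # mov3-T1-encoding 0x4631
```

Note that arch_version only matters for cpy; for mov the choice depends solely on whether both registers are low, which is exactly why the four rules collapse into two.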