Opcode/Instruction | Op/En | 64/32 bit Mode Support | CPUID Feature Flag | Description |
---|---|---|---|---|
F3 0F 16 /r MOVSHDUP xmm1, xmm2/m128 | RM | V/V | SSE3 | Move odd index single-precision floating-point values from xmm2/mem and duplicate each element into xmm1. |
VEX.128.F3.0F.WIG 16 /r VMOVSHDUP xmm1, xmm2/m128 | RM | V/V | AVX | Move odd index single-precision floating-point values from xmm2/mem and duplicate each element into xmm1. |
VEX.256.F3.0F.WIG 16 /r VMOVSHDUP ymm1, ymm2/m256 | RM | V/V | AVX | Move odd index single-precision floating-point values from ymm2/mem and duplicate each element into ymm1. |
EVEX.128.F3.0F.W0 16 /r VMOVSHDUP xmm1 {k1}{z}, xmm2/m128 | FVM | V/V | AVX512VL AVX512F | Move odd index single-precision floating-point values from xmm2/m128 and duplicate each element into xmm1 under writemask. |
EVEX.256.F3.0F.W0 16 /r VMOVSHDUP ymm1 {k1}{z}, ymm2/m256 | FVM | V/V | AVX512VL AVX512F | Move odd index single-precision floating-point values from ymm2/m256 and duplicate each element into ymm1 under writemask. |
EVEX.512.F3.0F.W0 16 /r VMOVSHDUP zmm1 {k1}{z}, zmm2/m512 | FVM | V/V | AVX512F | Move odd index single-precision floating-point values from zmm2/m512 and duplicate each element into zmm1 under writemask. |

Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4 |
---|---|---|---|---|
RM | ModRM:reg (w) | ModRM:r/m (r) | NA | NA |
FVM | ModRM:reg (w) | ModRM:r/m (r) | NA | NA |

Duplicates the odd-indexed single-precision floating-point values from the source operand (the second operand) into adjacent element pairs of the destination operand (the first operand): each odd-indexed source element is written both to its own position and to the even-indexed position directly below it. The source operand is an XMM, YMM, or ZMM register or a 128-, 256-, or 512-bit memory location; the destination operand is an XMM, YMM, or ZMM register.
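The duplication pattern can be illustrated with a scalar model. This is only a sketch: `movshdup_model` is a hypothetical helper, not an Intel API, and it models element values rather than exact bit patterns or the handling of upper register bits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model of the MOVSHDUP data movement.
 * n is the number of 32-bit lanes (4, 8, or 16 for 128/256/512 bits).
 * For each pair (k, k+1) with k even, the odd-indexed source element
 * src[k+1] is written to both dst[k] and dst[k+1]. */
static void movshdup_model(float *dst, const float *src, size_t n)
{
    assert(n % 2 == 0);              /* lanes always come in pairs */
    for (size_t k = 0; k < n; k += 2) {
        float odd = src[k + 1];      /* read first so dst may alias src */
        dst[k]     = odd;
        dst[k + 1] = odd;
    }
}
```

With a four-lane source of {1, 2, 3, 4}, the model produces {2, 2, 4, 4}.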
128-bit Legacy SSE version: Bits (MAX_VL-1:128) of the corresponding destination register remain unchanged.
VEX.128 encoded version: Bits (MAX_VL-1:128) of the destination register are zeroed.
VEX.256 encoded version: Bits (MAX_VL-1:256) of the destination register are zeroed.
EVEX encoded version: The destination operand is updated at 32-bit granularity according to the writemask.
Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b; otherwise the instruction will #UD.
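The per-element writemask behavior of the EVEX form can be sketched in scalar C as follows. The function name `movshdup_mask_model` and its `zeroing` flag are assumptions made for illustration; the flag stands in for the EVEX {z} bit, and bits above the vector length are not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of EVEX-encoded VMOVSHDUP with a writemask.
 * kl      : number of 32-bit lanes (4, 8, or 16)
 * k1      : writemask, bit j controls lane j
 * zeroing : nonzero models zeroing-masking, zero models merging-masking */
static void movshdup_mask_model(float *dst, const float *src, size_t kl,
                                uint16_t k1, int zeroing)
{
    float tmp[16];                       /* plays the role of TMP_SRC */
    assert(kl <= 16 && kl % 2 == 0);

    /* Step 1: duplicate each odd-indexed element into its pair. */
    for (size_t j = 0; j < kl; j += 2) {
        tmp[j]     = src[j + 1];
        tmp[j + 1] = src[j + 1];
    }

    /* Step 2: update the destination one 32-bit lane at a time. */
    for (size_t j = 0; j < kl; j++) {
        if ((k1 >> j) & 1u) {
            dst[j] = tmp[j];             /* lane selected by the mask */
        } else if (zeroing) {
            dst[j] = 0.0f;               /* zeroing-masking */
        }                                /* merging-masking: lane unchanged */
    }
}
```

For a source of {1, 2, 3, 4}, a destination initially holding {9, 9, 9, 9}, and mask 0x5, merging yields {2, 9, 4, 9} and zeroing yields {2, 0, 4, 0}.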
VMOVSHDUP (EVEX encoded versions)
(KL, VL) = (4, 128), (8, 256), (16, 512)
TMP_SRC[31:0] ← SRC[63:32]
TMP_SRC[63:32] ← SRC[63:32]
TMP_SRC[95:64] ← SRC[127:96]
TMP_SRC[127:96] ← SRC[127:96]
IF VL >= 256
    TMP_SRC[159:128] ← SRC[191:160]
    TMP_SRC[191:160] ← SRC[191:160]
    TMP_SRC[223:192] ← SRC[255:224]
    TMP_SRC[255:224] ← SRC[255:224]
FI;
IF VL >= 512
    TMP_SRC[287:256] ← SRC[319:288]
    TMP_SRC[319:288] ← SRC[319:288]
    TMP_SRC[351:320] ← SRC[383:352]
    TMP_SRC[383:352] ← SRC[383:352]
    TMP_SRC[415:384] ← SRC[447:416]
    TMP_SRC[447:416] ← SRC[447:416]
    TMP_SRC[479:448] ← SRC[511:480]
    TMP_SRC[511:480] ← SRC[511:480]
FI;
FOR j ← 0 TO KL-1
    i ← j * 32
    IF k1[j] OR *no writemask*
        THEN DEST[i+31:i] ← TMP_SRC[i+31:i]
        ELSE
            IF *merging-masking* ; merging-masking
                THEN *DEST[i+31:i] remains unchanged*
                ELSE ; zeroing-masking
                    DEST[i+31:i] ← 0
            FI
    FI;
ENDFOR
DEST[MAX_VL-1:VL] ← 0

VMOVSHDUP (VEX.256 encoded version)
DEST[31:0] ← SRC[63:32]
DEST[63:32] ← SRC[63:32]
DEST[95:64] ← SRC[127:96]
DEST[127:96] ← SRC[127:96]
DEST[159:128] ← SRC[191:160]
DEST[191:160] ← SRC[191:160]
DEST[223:192] ← SRC[255:224]
DEST[255:224] ← SRC[255:224]
DEST[MAX_VL-1:256] ← 0

VMOVSHDUP (VEX.128 encoded version)
DEST[31:0] ← SRC[63:32]
DEST[63:32] ← SRC[63:32]
DEST[95:64] ← SRC[127:96]
DEST[127:96] ← SRC[127:96]
DEST[MAX_VL-1:128] ← 0

MOVSHDUP (128-bit Legacy SSE version)
DEST[31:0] ← SRC[63:32]
DEST[63:32] ← SRC[63:32]
DEST[95:64] ← SRC[127:96]
DEST[127:96] ← SRC[127:96]
DEST[MAX_VL-1:128] (Unmodified)
VMOVSHDUP __m512 _mm512_movehdup_ps(__m512 a);
VMOVSHDUP __m512 _mm512_mask_movehdup_ps(__m512 s, __mmask16 k, __m512 a);
VMOVSHDUP __m512 _mm512_maskz_movehdup_ps(__mmask16 k, __m512 a);
VMOVSHDUP __m256 _mm256_mask_movehdup_ps(__m256 s, __mmask8 k, __m256 a);
VMOVSHDUP __m256 _mm256_maskz_movehdup_ps(__mmask8 k, __m256 a);
VMOVSHDUP __m128 _mm_mask_movehdup_ps(__m128 s, __mmask8 k, __m128 a);
VMOVSHDUP __m128 _mm_maskz_movehdup_ps(__mmask8 k, __m128 a);
VMOVSHDUP __m256 _mm256_movehdup_ps(__m256 a);
VMOVSHDUP __m128 _mm_movehdup_ps(__m128 a);
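As a usage sketch for the 128-bit intrinsic, the wrapper below calls `_mm_movehdup_ps` when the compiler defines `__SSE3__` and otherwise falls back to equivalent scalar code, so it builds either way. The wrapper name `dup_odd4` is hypothetical; only `_mm_movehdup_ps`, `_mm_loadu_ps`, and `_mm_storeu_ps` are real intrinsics.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE3__)
#include <pmmintrin.h>   /* SSE3 intrinsics, including _mm_movehdup_ps */
#endif

/* Hypothetical wrapper: duplicates the odd-indexed elements of in[0..3]
 * into out[0..3], giving { in[1], in[1], in[3], in[3] }. */
static void dup_odd4(float out[4], const float in[4])
{
    assert(out != NULL && in != NULL);
#if defined(__SSE3__)
    __m128 v = _mm_loadu_ps(in);             /* unaligned 128-bit load */
    _mm_storeu_ps(out, _mm_movehdup_ps(v));  /* MOVSHDUP or VMOVSHDUP */
#else
    /* Scalar fallback with the same result when SSE3 is not enabled. */
    float hi0 = in[1];
    float hi1 = in[3];
    out[0] = hi0;
    out[1] = hi0;
    out[2] = hi1;
    out[3] = hi1;
#endif
}
```

With GCC or Clang, passing `-msse3` (or a wider `-march` setting) defines `__SSE3__` and selects the intrinsic path. An input of {1, 2, 3, 4} gives {2, 2, 4, 4} on both paths.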
SIMD floating-point exceptions: None.
Other exceptions: for the non-EVEX-encoded instruction, see Exceptions Type 4; for the EVEX-encoded instruction, see Exceptions Type E4NF.nb.
#UD: If EVEX.vvvv != 1111B or VEX.vvvv != 1111B.