For years I relied on Félix Cloutier's repository, but when it was discontinued I was somewhat stuck, so I decided to provide
a backup with a different view of the index table,
where all the information I need is accessible at first sight.
An online assembler and disassembler for multiple architectures are also available.
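To make the opcode notation below concrete, here is a minimal NASM sketch of my own (the label `add128` and the register choice are illustrative, not part of the reference): it chains ADD and ADC from the table to add two 128-bit integers, and the opcode column tells you how each line assembles.

```asm
; Minimal sketch, assuming 64-bit mode and NASM syntax: add the 128-bit
; value in RCX:RBX to the one in RDX:RAX.
; ADD sets CF on overflow of the low limbs; ADC then folds CF into the
; high limbs — exactly the "Add with CF" wording used in the table.
add128:
    add rax, rbx        ; ADD r/m64, r64 -> REX.W + 01 /r, sets CF
    adc rdx, rcx        ; ADC r/m64, r64 -> REX.W + 11 /r, adds CF
    ret
```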
Instruction | Opcode | CPU Ext. | Description |
---|---|---|---|
AAA | 37 | | ASCII adjust AL after addition. |
AAD | D5 0A | | ASCII adjust AX before division. |
AAD imm8 | D5 ib | | Adjust AX before division to number base imm8. |
AAM | D4 0A | | ASCII adjust AX after multiply. |
AAM imm8 | D4 ib | | Adjust AX after multiply to number base imm8. |
AAS | 3F | | ASCII adjust AL after subtraction. |
ADC AL, imm8 | 14 ib | | Add with carry imm8 to AL. |
ADC AX, imm16 | 15 iw | | Add with carry imm16 to AX. |
ADC EAX, imm32 | 15 id | | Add with carry imm32 to EAX. |
ADC RAX, imm32 | REX.W + 15 id | | Add with carry imm32 sign extended to 64-bits to RAX. |
ADC r/m8, imm8 | 80 /2 ib | | Add with carry imm8 to r/m8. |
ADC r/m8, imm8 | REX + 80 /2 ib | | Add with carry imm8 to r/m8. |
ADC r/m16, imm16 | 81 /2 iw | | Add with carry imm16 to r/m16. |
ADC r/m32, imm32 | 81 /2 id | | Add with CF imm32 to r/m32. |
ADC r/m64, imm32 | REX.W + 81 /2 id | | Add with CF imm32 sign extended to 64-bits to r/m64. |
ADC r/m16, imm8 | 83 /2 ib | | Add with CF sign-extended imm8 to r/m16. |
ADC r/m32, imm8 | 83 /2 ib | | Add with CF sign-extended imm8 into r/m32. |
ADC r/m64, imm8 | REX.W + 83 /2 ib | | Add with CF sign-extended imm8 into r/m64. |
ADC r/m8, r8 | 10 /r | | Add with carry byte register to r/m8. |
ADC r/m8, r8 | REX + 10 /r | | Add with carry byte register to r/m8. |
ADC r/m16, r16 | 11 /r | | Add with carry r16 to r/m16. |
ADC r/m32, r32 | 11 /r | | Add with CF r32 to r/m32. |
ADC r/m64, r64 | REX.W + 11 /r | | Add with CF r64 to r/m64. |
ADC r8, r/m8 | 12 /r | | Add with carry r/m8 to byte register. |
ADC r8, r/m8 | REX + 12 /r | | Add with carry r/m8 to byte register. |
ADC r16, r/m16 | 13 /r | | Add with carry r/m16 to r16. |
ADC r32, r/m32 | 13 /r | | Add with CF r/m32 to r32. |
ADC r64, r/m64 | REX.W + 13 /r | | Add with CF r/m64 to r64. |
ADCX r32, r/m32 | 66 0F 38 F6 /r | adx | Unsigned addition of r32 with CF, r/m32 to r32, writes CF. |
ADCX r64, r/m64 | 66 REX.W 0F 38 F6 /r | adx | Unsigned addition of r64 with CF, r/m64 to r64, writes CF. |
ADD AL, imm8 | 04 ib | | Add imm8 to AL. |
ADD AX, imm16 | 05 iw | | Add imm16 to AX. |
ADD EAX, imm32 | 05 id | | Add imm32 to EAX. |
ADD RAX, imm32 | REX.W + 05 id | | Add imm32 sign-extended to 64-bits to RAX. |
ADD r/m8, imm8 | 80 /0 ib | | Add imm8 to r/m8. |
ADD r/m8, imm8 | REX + 80 /0 ib | | Add sign-extended imm8 to r/m8. |
ADD r/m16, imm16 | 81 /0 iw | | Add imm16 to r/m16. |
ADD r/m32, imm32 | 81 /0 id | | Add imm32 to r/m32. |
ADD r/m64, imm32 | REX.W + 81 /0 id | | Add imm32 sign-extended to 64-bits to r/m64. |
ADD r/m16, imm8 | 83 /0 ib | | Add sign-extended imm8 to r/m16. |
ADD r/m32, imm8 | 83 /0 ib | | Add sign-extended imm8 to r/m32. |
ADD r/m64, imm8 | REX.W + 83 /0 ib | | Add sign-extended imm8 to r/m64. |
ADD r/m8, r8 | 00 /r | | Add r8 to r/m8. |
ADD r/m8, r8 | REX + 00 /r | | Add r8 to r/m8. |
ADD r/m16, r16 | 01 /r | | Add r16 to r/m16. |
ADD r/m32, r32 | 01 /r | | Add r32 to r/m32. |
ADD r/m64, r64 | REX.W + 01 /r | | Add r64 to r/m64. |
ADD r8, r/m8 | 02 /r | | Add r/m8 to r8. |
ADD r8, r/m8 | REX + 02 /r | | Add r/m8 to r8. |
ADD r16, r/m16 | 03 /r | | Add r/m16 to r16. |
ADD r32, r/m32 | 03 /r | | Add r/m32 to r32. |
ADD r64, r/m64 | REX.W + 03 /r | | Add r/m64 to r64. |
ADDPD xmm1, xmm2/m128 | 66 0F 58 /r | sse2 | Add packed double-precision floating-point values from xmm2/mem to xmm1 and store result in xmm1. |
VADDPD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 58 /r | avx | Add packed double-precision floating-point values from xmm3/mem to xmm2 and store result in xmm1. |
VADDPD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 58 /r | avx | Add packed double-precision floating-point values from ymm3/mem to ymm2 and store result in ymm1. |
VADDPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F.W1 58 /r | avx512 | Add packed double-precision floating-point values from xmm3/m128/m64bcst to xmm2 and store result in xmm1 with writemask k1. |
VADDPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F.W1 58 /r | avx512 | Add packed double-precision floating-point values from ymm3/m256/m64bcst to ymm2 and store result in ymm1 with writemask k1. |
VADDPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.NDS.512.66.0F.W1 58 /r | avx512 | Add packed double-precision floating-point values from zmm3/m512/m64bcst to zmm2 and store result in zmm1 with writemask k1. |
ADDPS xmm1, xmm2/m128 | 0F 58 /r | sse | Add packed single-precision floating-point values from xmm2/m128 to xmm1 and store result in xmm1. |
VADDPS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.0F.WIG 58 /r | avx | Add packed single-precision floating-point values from xmm3/m128 to xmm2 and store result in xmm1. |
VADDPS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.0F.WIG 58 /r | avx | Add packed single-precision floating-point values from ymm3/m256 to ymm2 and store result in ymm1. |
VADDPS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.0F.W0 58 /r | avx512 | Add packed single-precision floating-point values from xmm3/m128/m32bcst to xmm2 and store result in xmm1 with writemask k1. |
VADDPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.0F.W0 58 /r | avx512 | Add packed single-precision floating-point values from ymm3/m256/m32bcst to ymm2 and store result in ymm1 with writemask k1. |
VADDPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.NDS.512.0F.W0 58 /r | avx512 | Add packed single-precision floating-point values from zmm3/m512/m32bcst to zmm2 and store result in zmm1 with writemask k1. |
ADDSD xmm1, xmm2/m64 | F2 0F 58 /r | sse2 | Add the low double-precision floating-point value from xmm2/mem to xmm1 and store the result in xmm1. |
VADDSD xmm1, xmm2, xmm3/m64 | VEX.NDS.128.F2.0F.WIG 58 /r | avx | Add the low double-precision floating-point value from xmm3/mem to xmm2 and store the result in xmm1. |
VADDSD xmm1 {k1}{z}, xmm2, xmm3/m64{er} | EVEX.NDS.LIG.F2.0F.W1 58 /r | avx512 | Add the low double-precision floating-point value from xmm3/m64 to xmm2 and store the result in xmm1 with writemask k1. |
ADDSS xmm1, xmm2/m32 | F3 0F 58 /r | sse | Add the low single-precision floating-point value from xmm2/mem to xmm1 and store the result in xmm1. |
VADDSS xmm1, xmm2, xmm3/m32 | VEX.NDS.128.F3.0F.WIG 58 /r | avx | Add the low single-precision floating-point value from xmm3/mem to xmm2 and store the result in xmm1. |
VADDSS xmm1 {k1}{z}, xmm2, xmm3/m32{er} | EVEX.NDS.LIG.F3.0F.W0 58 /r | avx512 | Add the low single-precision floating-point value from xmm3/m32 to xmm2 and store the result in xmm1 with writemask k1. |
ADDSUBPD xmm1, xmm2/m128 | 66 0F D0 /r | sse3 | Add/subtract double-precision floating-point values from xmm2/m128 to xmm1. |
VADDSUBPD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG D0 /r | avx | Add/subtract packed double-precision floating-point values from xmm3/mem to xmm2 and stores result in xmm1. |
VADDSUBPD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG D0 /r | avx | Add/subtract packed double-precision floating-point values from ymm3/mem to ymm2 and stores result in ymm1. |
ADDSUBPS xmm1, xmm2/m128 | F2 0F D0 /r | sse3 | Add/subtract single-precision floating-point values from xmm2/m128 to xmm1. |
VADDSUBPS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.F2.0F.WIG D0 /r | avx | Add/subtract single-precision floating-point values from xmm3/mem to xmm2 and stores result in xmm1. |
VADDSUBPS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.F2.0F.WIG D0 /r | avx | Add/subtract single-precision floating-point values from ymm3/mem to ymm2 and stores result in ymm1. |
ADOX r32, r/m32 | F3 0F 38 F6 /r | adx | Unsigned addition of r32 with OF, r/m32 to r32, writes OF. |
ADOX r64, r/m64 | F3 REX.W 0F 38 F6 /r | adx | Unsigned addition of r64 with OF, r/m64 to r64, writes OF. |
AESDEC xmm1, xmm2/m128 | 66 0F 38 DE /r | aes | Perform one round of an AES decryption flow, using the Equivalent Inverse Cipher, operating on a 128-bit data (state) from xmm1 with a 128-bit round key from xmm2/m128. |
VAESDEC xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG DE /r | aes | Perform one round of an AES decryption flow, using the Equivalent Inverse Cipher, operating on a 128-bit data (state) from xmm2 with a 128-bit round key from xmm3/m128; store the result in xmm1. |
AESDECLAST xmm1, xmm2/m128 | 66 0F 38 DF /r | aes | Perform the last round of an AES decryption flow, using the Equivalent Inverse Cipher, operating on a 128-bit data (state) from xmm1 with a 128-bit round key from xmm2/m128. |
VAESDECLAST xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG DF /r | aes | Perform the last round of an AES decryption flow, using the Equivalent Inverse Cipher, operating on a 128-bit data (state) from xmm2 with a 128-bit round key from xmm3/m128; store the result in xmm1. |
AESENC xmm1, xmm2/m128 | 66 0F 38 DC /r | aes | Perform one round of an AES encryption flow, operating on a 128-bit data (state) from xmm1 with a 128-bit round key from xmm2/m128. |
VAESENC xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG DC /r | aes | Perform one round of an AES encryption flow, operating on a 128-bit data (state) from xmm2 with a 128-bit round key from the xmm3/m128; store the result in xmm1. |
AESENCLAST xmm1, xmm2/m128 | 66 0F 38 DD /r | aes | Perform the last round of an AES encryption flow, operating on a 128-bit data (state) from xmm1 with a 128-bit round key from xmm2/m128. |
VAESENCLAST xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG DD /r | aes | Perform the last round of an AES encryption flow, operating on a 128-bit data (state) from xmm2 with a 128-bit round key from xmm3/m128; store the result in xmm1. |
AESIMC xmm1, xmm2/m128 | 66 0F 38 DB /r | aes | Perform the InvMixColumn transformation on a 128-bit round key from xmm2/m128 and store the result in xmm1. |
VAESIMC xmm1, xmm2/m128 | VEX.128.66.0F38.WIG DB /r | aes | Perform the InvMixColumn transformation on a 128-bit round key from xmm2/m128 and store the result in xmm1. |
AESKEYGENASSIST xmm1, xmm2/m128, imm8 | 66 0F 3A DF /r ib | aes | Assist in AES round key generation using an 8-bit round constant (RCON) specified in the immediate byte, operating on 128 bits of data specified in xmm2/m128, and store the result in xmm1. |
VAESKEYGENASSIST xmm1, xmm2/m128, imm8 | VEX.128.66.0F3A.WIG DF /r ib | aes | Assist in AES round key generation using an 8-bit round constant (RCON) specified in the immediate byte, operating on 128 bits of data specified in xmm2/m128, and store the result in xmm1. |
AND AL, imm8 | 24 ib | | AL AND imm8. |
AND AX, imm16 | 25 iw | | AX AND imm16. |
AND EAX, imm32 | 25 id | | EAX AND imm32. |
AND RAX, imm32 | REX.W + 25 id | | RAX AND imm32 sign-extended to 64-bits. |
AND r/m8, imm8 | 80 /4 ib | | r/m8 AND imm8. |
AND r/m8, imm8 | REX + 80 /4 ib | | r/m8 AND imm8. |
AND r/m16, imm16 | 81 /4 iw | | r/m16 AND imm16. |
AND r/m32, imm32 | 81 /4 id | | r/m32 AND imm32. |
AND r/m64, imm32 | REX.W + 81 /4 id | | r/m64 AND imm32 sign extended to 64-bits. |
AND r/m16, imm8 | 83 /4 ib | | r/m16 AND imm8 (sign-extended). |
AND r/m32, imm8 | 83 /4 ib | | r/m32 AND imm8 (sign-extended). |
AND r/m64, imm8 | REX.W + 83 /4 ib | | r/m64 AND imm8 (sign-extended). |
AND r/m8, r8 | 20 /r | | r/m8 AND r8. |
AND r/m8, r8 | REX + 20 /r | | r/m8 AND r8. |
AND r/m16, r16 | 21 /r | | r/m16 AND r16. |
AND r/m32, r32 | 21 /r | | r/m32 AND r32. |
AND r/m64, r64 | REX.W + 21 /r | | r/m64 AND r64. |
AND r8, r/m8 | 22 /r | | r8 AND r/m8. |
AND r8, r/m8 | REX + 22 /r | | r8 AND r/m8. |
AND r16, r/m16 | 23 /r | | r16 AND r/m16. |
AND r32, r/m32 | 23 /r | | r32 AND r/m32. |
AND r64, r/m64 | REX.W + 23 /r | | r64 AND r/m64. |
ANDN r32a, r32b, r/m32 | VEX.NDS.LZ.0F38.W0 F2 /r | bmi1 | Bitwise AND of inverted r32b with r/m32, store result in r32a. |
ANDN r64a, r64b, r/m64 | VEX.NDS.LZ.0F38.W1 F2 /r | bmi1 | Bitwise AND of inverted r64b with r/m64, store result in r64a. |
ANDNPD xmm1, xmm2/m128 | 66 0F 55 /r | sse2 | Return the bitwise logical AND NOT of packed double-precision floating-point values in xmm1 and xmm2/mem. |
VANDNPD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F 55 /r | avx | Return the bitwise logical AND NOT of packed double-precision floating-point values in xmm2 and xmm3/mem. |
VANDNPD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F 55 /r | avx | Return the bitwise logical AND NOT of packed double-precision floating-point values in ymm2 and ymm3/mem. |
VANDNPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F.W1 55 /r | avx512 | Return the bitwise logical AND NOT of packed double-precision floating-point values in xmm2 and xmm3/m128/m64bcst subject to writemask k1. |
VANDNPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F.W1 55 /r | avx512 | Return the bitwise logical AND NOT of packed double-precision floating-point values in ymm2 and ymm3/m256/m64bcst subject to writemask k1. |
VANDNPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F.W1 55 /r | avx512 | Return the bitwise logical AND NOT of packed double-precision floating-point values in zmm2 and zmm3/m512/m64bcst subject to writemask k1. |
ANDNPS xmm1, xmm2/m128 | 0F 55 /r | sse | Return the bitwise logical AND NOT of packed single-precision floating-point values in xmm1 and xmm2/mem. |
VANDNPS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.0F 55 /r | avx | Return the bitwise logical AND NOT of packed single-precision floating-point values in xmm2 and xmm3/mem. |
VANDNPS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.0F 55 /r | avx | Return the bitwise logical AND NOT of packed single-precision floating-point values in ymm2 and ymm3/mem. |
VANDNPS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.0F.W0 55 /r | avx512 | Return the bitwise logical AND NOT of packed single-precision floating-point values in xmm2 and xmm3/m128/m32bcst subject to writemask k1. |
VANDNPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.0F.W0 55 /r | avx512 | Return the bitwise logical AND NOT of packed single-precision floating-point values in ymm2 and ymm3/m256/m32bcst subject to writemask k1. |
VANDNPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.0F.W0 55 /r | avx512 | Return the bitwise logical AND NOT of packed single-precision floating-point values in zmm2 and zmm3/m512/m32bcst subject to writemask k1. |
ANDPD xmm1, xmm2/m128 | 66 0F 54 /r | sse2 | Return the bitwise logical AND of packed double-precision floating-point values in xmm1 and xmm2/mem. |
VANDPD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F 54 /r | avx | Return the bitwise logical AND of packed double-precision floating-point values in xmm2 and xmm3/mem. |
VANDPD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F 54 /r | avx | Return the bitwise logical AND of packed double-precision floating-point values in ymm2 and ymm3/mem. |
VANDPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F.W1 54 /r | avx512 | Return the bitwise logical AND of packed double-precision floating-point values in xmm2 and xmm3/m128/m64bcst subject to writemask k1. |
VANDPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F.W1 54 /r | avx512 | Return the bitwise logical AND of packed double-precision floating-point values in ymm2 and ymm3/m256/m64bcst subject to writemask k1. |
VANDPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F.W1 54 /r | avx512 | Return the bitwise logical AND of packed double-precision floating-point values in zmm2 and zmm3/m512/m64bcst subject to writemask k1. |
ANDPS xmm1, xmm2/m128 | 0F 54 /r | sse | Return the bitwise logical AND of packed single-precision floating-point values in xmm1 and xmm2/mem. |
VANDPS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.0F 54 /r | avx | Return the bitwise logical AND of packed single-precision floating-point values in xmm2 and xmm3/mem. |
VANDPS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.0F 54 /r | avx | Return the bitwise logical AND of packed single-precision floating-point values in ymm2 and ymm3/mem. |
VANDPS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.0F.W0 54 /r | avx512 | Return the bitwise logical AND of packed single-precision floating-point values in xmm2 and xmm3/m128/m32bcst subject to writemask k1. |
VANDPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.0F.W0 54 /r | avx512 | Return the bitwise logical AND of packed single-precision floating-point values in ymm2 and ymm3/m256/m32bcst subject to writemask k1. |
VANDPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.0F.W0 54 /r | avx512 | Return the bitwise logical AND of packed single-precision floating-point values in zmm2 and zmm3/m512/m32bcst subject to writemask k1. |
ARPL r/m16, r16 | 63 /r | | Adjust RPL of r/m16 to not less than RPL of r16. |
BEXTR r32a, r/m32, r32b | VEX.NDS.LZ.0F38.W0 F7 /r | bmi1 | Contiguous bitwise extract from r/m32 using r32b as control; store result in r32a. |
BEXTR r64a, r/m64, r64b | VEX.NDS.LZ.0F38.W1 F7 /r | bmi1 | Contiguous bitwise extract from r/m64 using r64b as control; store result in r64a. |
BLENDPD xmm1, xmm2/m128, imm8 | 66 0F 3A 0D /r ib | sse4.1 | Select packed double-precision floating-point values from xmm1 and xmm2/m128 from mask specified in imm8 and store the values into xmm1. |
VBLENDPD xmm1, xmm2, xmm3/m128, imm8 | VEX.NDS.128.66.0F3A.WIG 0D /r ib | avx | Select packed double-precision floating-point values from xmm2 and xmm3/m128 from mask in imm8 and store the values in xmm1. |
VBLENDPD ymm1, ymm2, ymm3/m256, imm8 | VEX.NDS.256.66.0F3A.WIG 0D /r ib | avx | Select packed double-precision floating-point values from ymm2 and ymm3/m256 from mask in imm8 and store the values in ymm1. |
BLENDPS xmm1, xmm2/m128, imm8 | 66 0F 3A 0C /r ib | sse4.1 | Select packed single-precision floating-point values from xmm1 and xmm2/m128 from mask specified in imm8 and store the values into xmm1. |
VBLENDPS xmm1, xmm2, xmm3/m128, imm8 | VEX.NDS.128.66.0F3A.WIG 0C /r ib | avx | Select packed single-precision floating-point values from xmm2 and xmm3/m128 from mask in imm8 and store the values in xmm1. |
VBLENDPS ymm1, ymm2, ymm3/m256, imm8 | VEX.NDS.256.66.0F3A.WIG 0C /r ib | avx | Select packed single-precision floating-point values from ymm2 and ymm3/m256 from mask in imm8 and store the values in ymm1. |
BLENDVPD xmm1, xmm2/m128, <XMM0> | 66 0F 38 15 /r | sse4.1 | Select packed double-precision floating-point values from xmm1 and xmm2 from mask specified in XMM0 and store the values in xmm1. |
VBLENDVPD xmm1, xmm2, xmm3/m128, xmm4 | VEX.NDS.128.66.0F3A.W0 4B /r /is4 | avx | Conditionally copy double-precision floating-point values from xmm2 or xmm3/m128 to xmm1, based on mask bits in the mask operand, xmm4. |
VBLENDVPD ymm1, ymm2, ymm3/m256, ymm4 | VEX.NDS.256.66.0F3A.W0 4B /r /is4 | avx | Conditionally copy double-precision floating-point values from ymm2 or ymm3/m256 to ymm1, based on mask bits in the mask operand, ymm4. |
BLENDVPS xmm1, xmm2/m128, <XMM0> | 66 0F 38 14 /r | sse4.1 | Select packed single-precision floating-point values from xmm1 and xmm2/m128 from mask specified in XMM0 and store the values into xmm1. |
VBLENDVPS xmm1, xmm2, xmm3/m128, xmm4 | VEX.NDS.128.66.0F3A.W0 4A /r /is4 | avx | Conditionally copy single-precision floating-point values from xmm2 or xmm3/m128 to xmm1, based on mask bits in the specified mask operand, xmm4. |
VBLENDVPS ymm1, ymm2, ymm3/m256, ymm4 | VEX.NDS.256.66.0F3A.W0 4A /r /is4 | avx | Conditionally copy single-precision floating-point values from ymm2 or ymm3/m256 to ymm1, based on mask bits in the specified mask register, ymm4. |
BLSI r32, r/m32 | VEX.NDD.LZ.0F38.W0 F3 /3 | bmi1 | Extract lowest set bit from r/m32 and set that bit in r32. |
BLSI r64, r/m64 | VEX.NDD.LZ.0F38.W1 F3 /3 | bmi1 | Extract lowest set bit from r/m64, and set that bit in r64. |
BLSMSK r32, r/m32 | VEX.NDD.LZ.0F38.W0 F3 /2 | bmi1 | Set all lower bits in r32 to “1” starting from bit 0 to lowest set bit in r/m32. |
BLSMSK r64, r/m64 | VEX.NDD.LZ.0F38.W1 F3 /2 | bmi1 | Set all lower bits in r64 to “1” starting from bit 0 to lowest set bit in r/m64. |
BLSR r32, r/m32 | VEX.NDD.LZ.0F38.W0 F3 /1 | bmi1 | Reset lowest set bit of r/m32, keep all other bits of r/m32 and write result to r32. |
BLSR r64, r/m64 | VEX.NDD.LZ.0F38.W1 F3 /1 | bmi1 | Reset lowest set bit of r/m64, keep all other bits of r/m64 and write result to r64. |
BNDCL bnd, r/m32 | F3 0F 1A /r | mpx | Generate a #BR if the address in r/m32 is lower than the lower bound in bnd.LB. |
BNDCL bnd, r/m64 | F3 0F 1A /r | mpx | Generate a #BR if the address in r/m64 is lower than the lower bound in bnd.LB. |
BNDCU bnd, r/m32 | F2 0F 1A /r | mpx | Generate a #BR if the address in r/m32 is higher than the upper bound in bnd.UB (bnd.UB in 1's complement form). |
BNDCU bnd, r/m64 | F2 0F 1A /r | mpx | Generate a #BR if the address in r/m64 is higher than the upper bound in bnd.UB (bnd.UB in 1's complement form). |
BNDCN bnd, r/m32 | F2 0F 1B /r | mpx | Generate a #BR if the address in r/m32 is higher than the upper bound in bnd.UB (bnd.UB not in 1's complement form). |
BNDCN bnd, r/m64 | F2 0F 1B /r | mpx | Generate a #BR if the address in r/m64 is higher than the upper bound in bnd.UB (bnd.UB not in 1's complement form). |
BNDLDX bnd, mib | 0F 1A /r | mpx | Load the bounds stored in a bound table entry (BTE) into bnd with address translation using the base of mib and conditional on the index of mib matching the pointer value in the BTE. |
BNDMK bnd, m32 | F3 0F 1B /r | mpx | Make lower and upper bounds from m32 and store them in bnd. |
BNDMK bnd, m64 | F3 0F 1B /r | mpx | Make lower and upper bounds from m64 and store them in bnd. |
BNDMOV bnd1, bnd2/m64 | 66 0F 1A /r | mpx | Move lower and upper bound from bnd2/m64 to bound register bnd1. |
BNDMOV bnd1, bnd2/m128 | 66 0F 1A /r | mpx | Move lower and upper bound from bnd2/m128 to bound register bnd1. |
BNDMOV bnd1/m64, bnd2 | 66 0F 1B /r | mpx | Move lower and upper bound from bnd2 to bnd1/m64. |
BNDMOV bnd1/m128, bnd2 | 66 0F 1B /r | mpx | Move lower and upper bound from bnd2 to bound register bnd1/m128. |
BNDSTX mib, bnd | 0F 1B /r | mpx | Store the bounds in bnd and the pointer value in the index register of mib to a bound table entry (BTE) with address translation using the base of mib. |
BOUND r16, m16&16 | 62 /r | | Check if r16 (array index) is within bounds specified by m16&16. |
BOUND r32, m32&32 | 62 /r | | Check if r32 (array index) is within bounds specified by m32&32. |
BSF r16, r/m16 | 0F BC /r | | Bit scan forward on r/m16. |
BSF r32, r/m32 | 0F BC /r | | Bit scan forward on r/m32. |
BSF r64, r/m64 | REX.W + 0F BC /r | | Bit scan forward on r/m64. |
BSR r16, r/m16 | 0F BD /r | | Bit scan reverse on r/m16. |
BSR r32, r/m32 | 0F BD /r | | Bit scan reverse on r/m32. |
BSR r64, r/m64 | REX.W + 0F BD /r | | Bit scan reverse on r/m64. |
BSWAP r32 | 0F C8+rd | | Reverses the byte order of a 32-bit register. |
BSWAP r64 | REX.W + 0F C8+rd | | Reverses the byte order of a 64-bit register. |
BT r/m16, r16 | 0F A3 /r | | Store selected bit in CF flag. |
BT r/m32, r32 | 0F A3 /r | | Store selected bit in CF flag. |
BT r/m64, r64 | REX.W + 0F A3 /r | | Store selected bit in CF flag. |
BT r/m16, imm8 | 0F BA /4 ib | | Store selected bit in CF flag. |
BT r/m32, imm8 | 0F BA /4 ib | | Store selected bit in CF flag. |
BT r/m64, imm8 | REX.W + 0F BA /4 ib | | Store selected bit in CF flag. |
BTC r/m16, r16 | 0F BB /r | | Store selected bit in CF flag and complement. |
BTC r/m32, r32 | 0F BB /r | | Store selected bit in CF flag and complement. |
BTC r/m64, r64 | REX.W + 0F BB /r | | Store selected bit in CF flag and complement. |
BTC r/m16, imm8 | 0F BA /7 ib | | Store selected bit in CF flag and complement. |
BTC r/m32, imm8 | 0F BA /7 ib | | Store selected bit in CF flag and complement. |
BTC r/m64, imm8 | REX.W + 0F BA /7 ib | | Store selected bit in CF flag and complement. |
BTR r/m16, r16 | 0F B3 /r | | Store selected bit in CF flag and clear. |
BTR r/m32, r32 | 0F B3 /r | | Store selected bit in CF flag and clear. |
BTR r/m64, r64 | REX.W + 0F B3 /r | | Store selected bit in CF flag and clear. |
BTR r/m16, imm8 | 0F BA /6 ib | | Store selected bit in CF flag and clear. |
BTR r/m32, imm8 | 0F BA /6 ib | | Store selected bit in CF flag and clear. |
BTR r/m64, imm8 | REX.W + 0F BA /6 ib | | Store selected bit in CF flag and clear. |
BTS r/m16, r16 | 0F AB /r | | Store selected bit in CF flag and set. |
BTS r/m32, r32 | 0F AB /r | | Store selected bit in CF flag and set. |
BTS r/m64, r64 | REX.W + 0F AB /r | | Store selected bit in CF flag and set. |
BTS r/m16, imm8 | 0F BA /5 ib | | Store selected bit in CF flag and set. |
BTS r/m32, imm8 | 0F BA /5 ib | | Store selected bit in CF flag and set. |
BTS r/m64, imm8 | REX.W + 0F BA /5 ib | | Store selected bit in CF flag and set. |
BZHI r32a, r/m32, r32b | VEX.NDS.LZ.0F38.W0 F5 /r | bmi2 | Zero bits in r/m32 starting with the position in r32b, write result to r32a. |
BZHI r64a, r/m64, r64b | VEX.NDS.LZ.0F38.W1 F5 /r | bmi2 | Zero bits in r/m64 starting with the position in r64b, write result to r64a. |
CALL rel16 | E8 cw | | Call near, relative, displacement relative to next instruction. |
CALL rel32 | E8 cd | | Call near, relative, displacement relative to next instruction. 32-bit displacement sign extended to 64-bits in 64-bit mode. |
CALL r/m16 | FF /2 | | Call near, absolute indirect, address given in r/m16. |
CALL r/m32 | FF /2 | | Call near, absolute indirect, address given in r/m32. |
CALL r/m64 | FF /2 | | Call near, absolute indirect, address given in r/m64. |
CALL ptr16:16 | 9A cd | | Call far, absolute, address given in operand. |
CALL ptr16:32 | 9A cp | | Call far, absolute, address given in operand. |
CALL m16:16 | FF /3 | | Call far, absolute indirect address given in m16:16. In 32-bit mode: if selector points to a gate, then RIP = 32-bit zero extended displacement taken from gate; else RIP = zero extended 16-bit offset from far pointer referenced in the instruction. |
CALL m16:32 | FF /3 | | In 64-bit mode: If selector points to a gate, then RIP = 64-bit displacement taken from gate; else RIP = zero extended 32-bit offset from far pointer referenced in the instruction. |
CALL m16:64 | REX.W + FF /3 | | In 64-bit mode: If selector points to a gate, then RIP = 64-bit displacement taken from gate; else RIP = 64-bit offset from far pointer referenced in the instruction. |
CBW | 98 | | AX ← sign-extend of AL. |
CWDE | 98 | | EAX ← sign-extend of AX. |
CDQE | REX.W + 98 | | RAX ← sign-extend of EAX. |
CLAC | 0F 01 CA | | Clear the AC flag in the EFLAGS register. |
CLC | F8 | | Clear CF flag. |
CLD | FC | | Clear DF flag. |
CLFLUSH m8 | 0F AE /7 | | Flushes cache line containing m8. |
CLFLUSHOPT m8 | 66 0F AE /7 | | Flushes cache line containing m8. |
CLI | FA | | Clear interrupt flag; interrupts disabled when interrupt flag cleared. |
CLTS | 0F 06 | | Clears TS flag in CR0. |
CMC | F5 | | Complement CF flag. |
CMOVA r16, r/m16 | 0F 47 /r | | Move if above (CF=0 and ZF=0). |
CMOVA r32, r/m32 | 0F 47 /r | | Move if above (CF=0 and ZF=0). |
CMOVA r64, r/m64 | REX.W + 0F 47 /r | | Move if above (CF=0 and ZF=0). |
CMOVAE r16, r/m16 | 0F 43 /r | | Move if above or equal (CF=0). |
CMOVAE r32, r/m32 | 0F 43 /r | | Move if above or equal (CF=0). |
CMOVAE r64, r/m64 | REX.W + 0F 43 /r | | Move if above or equal (CF=0). |
CMOVB r16, r/m16 | 0F 42 /r | | Move if below (CF=1). |
CMOVB r32, r/m32 | 0F 42 /r | | Move if below (CF=1). |
CMOVB r64, r/m64 | REX.W + 0F 42 /r | | Move if below (CF=1). |
CMOVBE r16, r/m16 | 0F 46 /r | | Move if below or equal (CF=1 or ZF=1). |
CMOVBE r32, r/m32 | 0F 46 /r | | Move if below or equal (CF=1 or ZF=1). |
CMOVBE r64, r/m64 | REX.W + 0F 46 /r | | Move if below or equal (CF=1 or ZF=1). |
CMOVC r16, r/m16 | 0F 42 /r | | Move if carry (CF=1). |
CMOVC r32, r/m32 | 0F 42 /r | | Move if carry (CF=1). |
CMOVC r64, r/m64 | REX.W + 0F 42 /r | | Move if carry (CF=1). |
CMOVE r16, r/m16 | 0F 44 /r | | Move if equal (ZF=1). |
CMOVE r32, r/m32 | 0F 44 /r | | Move if equal (ZF=1). |
CMOVE r64, r/m64 | REX.W + 0F 44 /r | | Move if equal (ZF=1). |
CMOVG r16, r/m16 | 0F 4F /r | | Move if greater (ZF=0 and SF=OF). |
CMOVG r32, r/m32 | 0F 4F /r | | Move if greater (ZF=0 and SF=OF). |
CMOVG r64, r/m64 | REX.W + 0F 4F /r | | Move if greater (ZF=0 and SF=OF). |
CMOVGE r16, r/m16 | 0F 4D /r | | Move if greater or equal (SF=OF). |
CMOVGE r32, r/m32 | 0F 4D /r | | Move if greater or equal (SF=OF). |
CMOVGE r64, r/m64 | REX.W + 0F 4D /r | | Move if greater or equal (SF=OF). |
CMOVL r16, r/m16 | 0F 4C /r | | Move if less (SF≠OF). |
CMOVL r32, r/m32 | 0F 4C /r | | Move if less (SF≠OF). |
CMOVL r64, r/m64 | REX.W + 0F 4C /r | | Move if less (SF≠OF). |
CMOVLE r16, r/m16 | 0F 4E /r | | Move if less or equal (ZF=1 or SF≠OF). |
CMOVLE r32, r/m32 | 0F 4E /r | | Move if less or equal (ZF=1 or SF≠OF). |
CMOVLE r64, r/m64 | REX.W + 0F 4E /r | | Move if less or equal (ZF=1 or SF≠OF). |
CMOVNA r16, r/m16 | 0F 46 /r | | Move if not above (CF=1 or ZF=1). |
CMOVNA r32, r/m32 | 0F 46 /r | | Move if not above (CF=1 or ZF=1). |
CMOVNA r64, r/m64 | REX.W + 0F 46 /r | | Move if not above (CF=1 or ZF=1). |
CMOVNAE r16, r/m16 | 0F 42 /r | | Move if not above or equal (CF=1). |
CMOVNAE r32, r/m32 | 0F 42 /r | | Move if not above or equal (CF=1). |
CMOVNAE r64, r/m64 | REX.W + 0F 42 /r | | Move if not above or equal (CF=1). |
CMOVNB r16, r/m16 | 0F 43 /r | | Move if not below (CF=0). |
CMOVNB r32, r/m32 | 0F 43 /r | | Move if not below (CF=0). |
CMOVNB r64, r/m64 | REX.W + 0F 43 /r | | Move if not below (CF=0). |
CMOVNBE r16, r/m16 | 0F 47 /r | | Move if not below or equal (CF=0 and ZF=0). |
CMOVNBE r32, r/m32 | 0F 47 /r | | Move if not below or equal (CF=0 and ZF=0). |
CMOVNBE r64, r/m64 | REX.W + 0F 47 /r | | Move if not below or equal (CF=0 and ZF=0). |
CMOVNC r16, r/m16 | 0F 43 /r | | Move if not carry (CF=0). |
CMOVNC r32, r/m32 | 0F 43 /r | | Move if not carry (CF=0). |
CMOVNC r64, r/m64 | REX.W + 0F 43 /r | | Move if not carry (CF=0). |
CMOVNE r16, r/m16 | 0F 45 /r | | Move if not equal (ZF=0). |
CMOVNE r32, r/m32 | 0F 45 /r | | Move if not equal (ZF=0). |
CMOVNE r64, r/m64 | REX.W + 0F 45 /r | | Move if not equal (ZF=0). |
CMOVNG r16, r/m16 | 0F 4E /r | | Move if not greater (ZF=1 or SF≠OF). |
CMOVNG r32, r/m32 | 0F 4E /r | | Move if not greater (ZF=1 or SF≠OF). |
CMOVNG r64, r/m64 | REX.W + 0F 4E /r | | Move if not greater (ZF=1 or SF≠OF). |
CMOVNGE r16, r/m16 | 0F 4C /r | | Move if not greater or equal (SF≠OF). |
CMOVNGE r32, r/m32 | 0F 4C /r | | Move if not greater or equal (SF≠OF). |
CMOVNGE r64, r/m64 | REX.W + 0F 4C /r | | Move if not greater or equal (SF≠OF). |
CMOVNL r16, r/m16 | 0F 4D /r | | Move if not less (SF=OF). |
CMOVNL r32, r/m32 | 0F 4D /r | | Move if not less (SF=OF). |
CMOVNL r64, r/m64 | REX.W + 0F 4D /r | | Move if not less (SF=OF). |
CMOVNLE r16, r/m16 | 0F 4F /r | | Move if not less or equal (ZF=0 and SF=OF). |
CMOVNLE r32, r/m32 | 0F 4F /r | | Move if not less or equal (ZF=0 and SF=OF). |
CMOVNLE r64, r/m64 | REX.W + 0F 4F /r | | Move if not less or equal (ZF=0 and SF=OF). |
CMOVNO r16, r/m16 | 0F 41 /r | | Move if not overflow (OF=0). |
CMOVNO r32, r/m32 | 0F 41 /r | | Move if not overflow (OF=0). |
CMOVNO r64, r/m64 | REX.W + 0F 41 /r | | Move if not overflow (OF=0). |
CMOVNP r16, r/m16 | 0F 4B /r | | Move if not parity (PF=0). |
CMOVNP r32, r/m32 | 0F 4B /r | | Move if not parity (PF=0). |
CMOVNP r64, r/m64 | REX.W + 0F 4B /r | | Move if not parity (PF=0). |
CMOVNS r16, r/m16 | 0F 49 /r | | Move if not sign (SF=0). |
CMOVNS r32, r/m32 | 0F 49 /r | | Move if not sign (SF=0). |
CMOVNS r64, r/m64 | REX.W + 0F 49 /r | | Move if not sign (SF=0). |
CMOVNZ r16, r/m16 | 0F 45 /r | | Move if not zero (ZF=0). |
CMOVNZ r32, r/m32 | 0F 45 /r | | Move if not zero (ZF=0). |
CMOVNZ r64, r/m64 | REX.W + 0F 45 /r | | Move if not zero (ZF=0). |
CMOVO r16, r/m16 | 0F 40 /r | | Move if overflow (OF=1). |
CMOVO r32, r/m32 | 0F 40 /r | | Move if overflow (OF=1). |
CMOVO r64, r/m64 | REX.W + 0F 40 /r | | Move if overflow (OF=1). |
CMOVP r16, r/m16 | 0F 4A /r | | Move if parity (PF=1). |
CMOVP r32, r/m32 | 0F 4A /r | | Move if parity (PF=1). |
CMOVP r64, r/m64 | REX.W + 0F 4A /r | | Move if parity (PF=1). |
CMOVPE r16, r/m16 | 0F 4A /r | | Move if parity even (PF=1). |
CMOVPE r32, r/m32 | 0F 4A /r | | Move if parity even (PF=1). |
CMOVPE r64, r/m64 | REX.W + 0F 4A /r | | Move if parity even (PF=1). |
CMOVPO r16, r/m16 | 0F 4B /r | | Move if parity odd (PF=0). |
CMOVPO r32, r/m32 | 0F 4B /r | | Move if parity odd (PF=0). |
CMOVPO r64, r/m64 | REX.W + 0F 4B /r | | Move if parity odd (PF=0). |
CMOVS r16, r/m16 | 0F 48 /r | | Move if sign (SF=1). |
CMOVS r32, r/m32 | 0F 48 /r | | Move if sign (SF=1). |
CMOVS r64, r/m64 | REX.W + 0F 48 /r | | Move if sign (SF=1). |
CMOVZ r16, r/m16 | 0F 44 /r | | Move if zero (ZF=1). |
CMOVZ r32, r/m32 | 0F 44 /r | | Move if zero (ZF=1). |
CMOVZ r64, r/m64 | REX.W + 0F 44 /r | | Move if zero (ZF=1). |
CMP AL, imm8 | 3C ib | | Compare imm8 with AL. |
CMP AX, imm16 | 3D iw | | Compare imm16 with AX. |
CMP EAX, imm32 | 3D id | | Compare imm32 with EAX. |
CMP RAX, imm32 | REX.W + 3D id | | Compare imm32 sign-extended to 64-bits with RAX. |
CMP r/m8, imm8 | 80 /7 ib | | Compare imm8 with r/m8. |
CMP r/m8, imm8 | REX + 80 /7 ib | | Compare imm8 with r/m8. |
CMP r/m16, imm16 | 81 /7 iw | | Compare imm16 with r/m16. |
CMP r/m32, imm32 | 81 /7 id | | Compare imm32 with r/m32. |
CMP r/m64, imm32 | REX.W + 81 /7 id | | Compare imm32 sign-extended to 64-bits with r/m64. |
CMP r/m16, imm8 | 83 /7 ib | | Compare imm8 with r/m16. |
CMP r/m32, imm8 | 83 /7 ib | | Compare imm8 with r/m32. |
CMP r/m64, imm8 | REX.W + 83 /7 ib | | Compare imm8 with r/m64. |
CMP r/m8, r8 | 38 /r | | Compare r8 with r/m8. |
CMP r/m8, r8 | REX + 38 /r | | Compare r8 with r/m8. |
CMP r/m16, r16 | 39 /r | | Compare r16 with r/m16. |
CMP r/m32, r32 | 39 /r | | Compare r32 with r/m32. |
CMP r/m64, r64 | REX.W + 39 /r | | Compare r64 with r/m64. |
CMP r8, r/m8 | 3A /r | | Compare r/m8 with r8. |
CMP r8, r/m8 | REX + 3A /r | | Compare r/m8 with r8. |
CMP r16, r/m16 | 3B /r | | Compare r/m16 with r16. |
CMP r32, r/m32 | 3B /r | | Compare r/m32 with r32. |
CMP r64, r/m64 | REX.W + 3B /r | | Compare r/m64 with r64. |
CMPPD xmm1, xmm2/m128, imm8 | 66 0F C2 /r ib | sse2 | Compare packed double-precision floating-point values in xmm2/m128 and xmm1 using bits 2:0 of imm8 as a comparison predicate. |
VCMPPD xmm1, xmm2, xmm3/m128, imm8 | VEX.NDS.128.66.0F.WIG C2 /r ib | avx | Compare packed double-precision floating-point values in xmm3/m128 and xmm2 using bits 4:0 of imm8 as a comparison predicate. |
VCMPPD ymm1, ymm2, ymm3/m256, imm8 | VEX.NDS.256.66.0F.WIG C2 /r ib | avx | Compare packed double-precision floating-point values in ymm3/m256 and ymm2 using bits 4:0 of imm8 as a comparison predicate. |
VCMPPD k1 {k2}, xmm2, xmm3/m128/m64bcst, imm8 | EVEX.NDS.128.66.0F.W1 C2 /r ib | avx512 | Compare packed double-precision floating-point values in xmm3/m128/m64bcst and xmm2 using bits 4:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VCMPPD k1 {k2}, ymm2, ymm3/m256/m64bcst, imm8 | EVEX.NDS.256.66.0F.W1 C2 /r ib | avx512 | Compare packed double-precision floating-point values in ymm3/m256/m64bcst and ymm2 using bits 4:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VCMPPD k1 {k2}, zmm2, zmm3/m512/m64bcst{sae}, imm8 | EVEX.NDS.512.66.0F.W1 C2 /r ib | avx512 | Compare packed double-precision floating-point values in zmm3/m512/m64bcst and zmm2 using bits 4:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
CMPPS xmm1, xmm2/m128, imm8 | 0F C2 /r ib | sse | Compare packed single-precision floating-point values in xmm2/m128 and xmm1 using bits 2:0 of imm8 as a comparison predicate. |
VCMPPS xmm1, xmm2, xmm3/m128, imm8 | VEX.NDS.128.0F.WIG C2 /r ib | avx | Compare packed single-precision floating-point values in xmm3/m128 and xmm2 using bits 4:0 of imm8 as a comparison predicate. |
VCMPPS ymm1, ymm2, ymm3/m256, imm8 | VEX.NDS.256.0F.WIG C2 /r ib | avx | Compare packed single-precision floating-point values in ymm3/m256 and ymm2 using bits 4:0 of imm8 as a comparison predicate. |
VCMPPS k1 {k2}, xmm2, xmm3/m128/m32bcst, imm8 | EVEX.NDS.128.0F.W0 C2 /r ib | avx512 | Compare packed single-precision floating-point values in xmm3/m128/m32bcst and xmm2 using bits 4:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VCMPPS k1 {k2}, ymm2, ymm3/m256/m32bcst, imm8 | EVEX.NDS.256.0F.W0 C2 /r ib | avx512 | Compare packed single-precision floating-point values in ymm3/m256/m32bcst and ymm2 using bits 4:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VCMPPS k1 {k2}, zmm2, zmm3/m512/m32bcst{sae}, imm8 | EVEX.NDS.512.0F.W0 C2 /r ib | avx512 | Compare packed single-precision floating-point values in zmm3/m512/m32bcst and zmm2 using bits 4:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
CMPSD xmm1, xmm2/m64, imm8 | F2 0F C2 /r ib | sse2 | Compare low double-precision floating-point value in xmm2/m64 and xmm1 using bits 2:0 of imm8 as comparison predicate. |
VCMPSD xmm1, xmm2, xmm3/m64, imm8 | VEX.NDS.128.F2.0F.WIG C2 /r ib | avx | Compare low double-precision floating-point value in xmm3/m64 and xmm2 using bits 4:0 of imm8 as comparison predicate. |
VCMPSD k1 {k2}, xmm2, xmm3/m64{sae}, imm8 | EVEX.NDS.LIG.F2.0F.W1 C2 /r ib | avx512 | Compare low double-precision floating-point value in xmm3/m64 and xmm2 using bits 4:0 of imm8 as comparison predicate with writemask k2 and leave the result in mask register k1. |
CMPSS xmm1, xmm2/m32, imm8 | F3 0F C2 /r ib | sse | Compare low single-precision floating-point value in xmm2/m32 and xmm1 using bits 2:0 of imm8 as comparison predicate. |
VCMPSS xmm1, xmm2, xmm3/m32, imm8 | VEX.NDS.128.F3.0F.WIG C2 /r ib | avx | Compare low single-precision floating-point value in xmm3/m32 and xmm2 using bits 4:0 of imm8 as comparison predicate. |
VCMPSS k1 {k2}, xmm2, xmm3/m32{sae}, imm8 | EVEX.NDS.LIG.F3.0F.W0 C2 /r ib | avx512 | Compare low single-precision floating-point value in xmm3/m32 and xmm2 using bits 4:0 of imm8 as comparison predicate with writemask k2 and leave the result in mask register k1. |
CMPS m8, m8 | A6 | | For legacy mode, compare byte at address DS:(E)SI with byte at address ES:(E)DI; For 64-bit mode compare byte at address (R|E)SI to byte at address (R|E)DI. The status flags are set accordingly. |
CMPS m16, m16 | A7 | | For legacy mode, compare word at address DS:(E)SI with word at address ES:(E)DI; For 64-bit mode compare word at address (R|E)SI with word at address (R|E)DI. The status flags are set accordingly. |
CMPS m32, m32 | A7 | | For legacy mode, compare dword at address DS:(E)SI with dword at address ES:(E)DI; For 64-bit mode compare dword at address (R|E)SI with dword at address (R|E)DI. The status flags are set accordingly. |
CMPS m64, m64 | REX.W + A7 | | Compares quadword at address (R|E)SI with quadword at address (R|E)DI and sets the status flags accordingly. |
CMPSB | A6 | | For legacy mode, compare byte at address DS:(E)SI with byte at address ES:(E)DI; For 64-bit mode compare byte at address (R|E)SI with byte at address (R|E)DI. The status flags are set accordingly. |
CMPSW | A7 | | For legacy mode, compare word at address DS:(E)SI with word at address ES:(E)DI; For 64-bit mode compare word at address (R|E)SI with word at address (R|E)DI. The status flags are set accordingly. |
CMPSD | A7 | | For legacy mode, compare dword at address DS:(E)SI with dword at address ES:(E)DI; For 64-bit mode compare dword at address (R|E)SI with dword at address (R|E)DI. The status flags are set accordingly. |
CMPSQ | REX.W + A7 | | Compares quadword at address (R|E)SI with quadword at address (R|E)DI and sets the status flags accordingly. |
CMPXCHG r/m8, r8 | 0F B0 /r | | Compare AL with r/m8. If equal, ZF is set and r8 is loaded into r/m8. Else, clear ZF and load r/m8 into AL. |
CMPXCHG r/m8, r8 | REX + 0F B0 /r | | Compare AL with r/m8. If equal, ZF is set and r8 is loaded into r/m8. Else, clear ZF and load r/m8 into AL. |
CMPXCHG r/m16, r16 | 0F B1 /r | | Compare AX with r/m16. If equal, ZF is set and r16 is loaded into r/m16. Else, clear ZF and load r/m16 into AX. |
CMPXCHG r/m32, r32 | 0F B1 /r | | Compare EAX with r/m32. If equal, ZF is set and r32 is loaded into r/m32. Else, clear ZF and load r/m32 into EAX. |
CMPXCHG r/m64, r64 | REX.W + 0F B1 /r | | Compare RAX with r/m64. If equal, ZF is set and r64 is loaded into r/m64. Else, clear ZF and load r/m64 into RAX. |
CMPXCHG8B m64 | 0F C7 /1 m64 | | Compare EDX:EAX with m64. If equal, set ZF and load ECX:EBX into m64. Else, clear ZF and load m64 into EDX:EAX. |
CMPXCHG16B m128 | REX.W + 0F C7 /1 m128 | | Compare RDX:RAX with m128. If equal, set ZF and load RCX:RBX into m128. Else, clear ZF and load m128 into RDX:RAX. |
COMISD xmm1, xmm2/m64 | 66 0F 2F /r | sse2 | Compare low double-precision floating-point values in xmm1 and xmm2/mem64 and set the EFLAGS flags accordingly. |
VCOMISD xmm1, xmm2/m64 | VEX.128.66.0F.WIG 2F /r | avx | Compare low double-precision floating-point values in xmm1 and xmm2/mem64 and set the EFLAGS flags accordingly. |
VCOMISD xmm1, xmm2/m64{sae} | EVEX.LIG.66.0F.W1 2F /r | avx512 | Compare low double-precision floating-point values in xmm1 and xmm2/mem64 and set the EFLAGS flags accordingly. |
COMISS xmm1, xmm2/m32 | 0F 2F /r | sse | Compare low single-precision floating-point values in xmm1 and xmm2/mem32 and set the EFLAGS flags accordingly. |
VCOMISS xmm1, xmm2/m32 | VEX.128.0F.WIG 2F /r | avx | Compare low single-precision floating-point values in xmm1 and xmm2/mem32 and set the EFLAGS flags accordingly. |
VCOMISS xmm1, xmm2/m32{sae} | EVEX.LIG.0F.W0 2F /r | avx512 | Compare low single-precision floating-point values in xmm1 and xmm2/mem32 and set the EFLAGS flags accordingly. |
CPUID | 0F A2 | | Returns processor identification and feature information to the EAX, EBX, ECX, and EDX registers, as determined by input entered in EAX (in some cases, ECX as well). |
CRC32 r32, r/m8 | F2 0F 38 F0 /r | sse4.2 | Accumulate CRC32 on r/m8. |
CRC32 r32, r/m8 | F2 REX 0F 38 F0 /r | sse4.2 | Accumulate CRC32 on r/m8. |
CRC32 r32, r/m16 | F2 0F 38 F1 /r | sse4.2 | Accumulate CRC32 on r/m16. |
CRC32 r32, r/m32 | F2 0F 38 F1 /r | sse4.2 | Accumulate CRC32 on r/m32. |
CRC32 r64, r/m8 | F2 REX.W 0F 38 F0 /r | sse4.2 | Accumulate CRC32 on r/m8. |
CRC32 r64, r/m64 | F2 REX.W 0F 38 F1 /r | sse4.2 | Accumulate CRC32 on r/m64. |
CVTDQ2PD xmm1, xmm2/m64 | F3 0F E6 /r | sse2 | Convert two packed signed doubleword integers from xmm2/mem to two packed double-precision floating-point values in xmm1. |
VCVTDQ2PD xmm1, xmm2/m64 | VEX.128.F3.0F.WIG E6 /r | avx | Convert two packed signed doubleword integers from xmm2/mem to two packed double-precision floating-point values in xmm1. |
VCVTDQ2PD ymm1, xmm2/m128 | VEX.256.F3.0F.WIG E6 /r | avx | Convert four packed signed doubleword integers from xmm2/mem to four packed double-precision floating-point values in ymm1. |
VCVTDQ2PD xmm1 {k1}{z}, xmm2/m64/m32bcst | EVEX.128.F3.0F.W0 E6 /r | avx512 | Convert two packed signed doubleword integers from xmm2/m64/m32bcst to two packed double-precision floating-point values in xmm1 with writemask k1. |
VCVTDQ2PD ymm1 {k1}{z}, xmm2/m128/m32bcst | EVEX.256.F3.0F.W0 E6 /r | avx512 | Convert four packed signed doubleword integers from xmm2/m128/m32bcst to four packed double-precision floating-point values in ymm1 with writemask k1. |
VCVTDQ2PD zmm1 {k1}{z}, ymm2/m256/m32bcst | EVEX.512.F3.0F.W0 E6 /r | avx512 | Convert eight packed signed doubleword integers from ymm2/m256/m32bcst to eight packed double-precision floating-point values in zmm1 with writemask k1. |
CVTDQ2PS xmm1, xmm2/m128 | 0F 5B /r | sse2 | Convert four packed signed doubleword integers from xmm2/mem to four packed single-precision floating-point values in xmm1. |
VCVTDQ2PS xmm1, xmm2/m128 | VEX.128.0F.WIG 5B /r | avx | Convert four packed signed doubleword integers from xmm2/mem to four packed single-precision floating-point values in xmm1. |
VCVTDQ2PS ymm1, ymm2/m256 | VEX.256.0F.WIG 5B /r | avx | Convert eight packed signed doubleword integers from ymm2/mem to eight packed single-precision floating-point values in ymm1. |
VCVTDQ2PS xmm1 {k1}{z}, xmm2/m128/m32bcst | EVEX.128.0F.W0 5B /r | avx512 | Convert four packed signed doubleword integers from xmm2/m128/m32bcst to four packed single-precision floating-point values in xmm1 with writemask k1. |
VCVTDQ2PS ymm1 {k1}{z}, ymm2/m256/m32bcst | EVEX.256.0F.W0 5B /r | avx512 | Convert eight packed signed doubleword integers from ymm2/m256/m32bcst to eight packed single-precision floating-point values in ymm1 with writemask k1. |
VCVTDQ2PS zmm1 {k1}{z}, zmm2/m512/m32bcst{er} | EVEX.512.0F.W0 5B /r | avx512 | Convert sixteen packed signed doubleword integers from zmm2/m512/m32bcst to sixteen packed single-precision floating-point values in zmm1 with writemask k1. |
CVTPD2DQ xmm1, xmm2/m128 | F2 0F E6 /r | sse2 | Convert two packed double-precision floating-point values in xmm2/mem to two signed doubleword integers in xmm1. |
VCVTPD2DQ xmm1, xmm2/m128 | VEX.128.F2.0F.WIG E6 /r | avx | Convert two packed double-precision floating-point values in xmm2/mem to two signed doubleword integers in xmm1. |
VCVTPD2DQ xmm1, ymm2/m256 | VEX.256.F2.0F.WIG E6 /r | avx | Convert four packed double-precision floating-point values in ymm2/mem to four signed doubleword integers in xmm1. |
VCVTPD2DQ xmm1 {k1}{z}, xmm2/m128/m64bcst | EVEX.128.F2.0F.W1 E6 /r | avx512 | Convert two packed double-precision floating-point values in xmm2/m128/m64bcst to two signed doubleword integers in xmm1 subject to writemask k1. |
VCVTPD2DQ xmm1 {k1}{z}, ymm2/m256/m64bcst | EVEX.256.F2.0F.W1 E6 /r | avx512 | Convert four packed double-precision floating-point values in ymm2/m256/m64bcst to four signed doubleword integers in xmm1 subject to writemask k1. |
VCVTPD2DQ ymm1 {k1}{z}, zmm2/m512/m64bcst{er} | EVEX.512.F2.0F.W1 E6 /r | avx512 | Convert eight packed double-precision floating-point values in zmm2/m512/m64bcst to eight signed doubleword integers in ymm1 subject to writemask k1. |
CVTPD2PI mm, xmm/m128 | 66 0F 2D /r | | Convert two packed double-precision floating-point values from xmm/m128 to two packed signed doubleword integers in mm. |
CVTPD2PS xmm1, xmm2/m128 | 66 0F 5A /r | sse2 | Convert two packed double-precision floating-point values in xmm2/mem to two single-precision floating-point values in xmm1. |
VCVTPD2PS xmm1, xmm2/m128 | VEX.128.66.0F.WIG 5A /r | avx | Convert two packed double-precision floating-point values in xmm2/mem to two single-precision floating-point values in xmm1. |
VCVTPD2PS xmm1, ymm2/m256 | VEX.256.66.0F.WIG 5A /r | avx | Convert four packed double-precision floating-point values in ymm2/mem to four single-precision floating-point values in xmm1. |
VCVTPD2PS xmm1 {k1}{z}, xmm2/m128/m64bcst | EVEX.128.66.0F.W1 5A /r | avx512 | Convert two packed double-precision floating-point values in xmm2/m128/m64bcst to two single-precision floating-point values in xmm1 with writemask k1. |
VCVTPD2PS xmm1 {k1}{z}, ymm2/m256/m64bcst | EVEX.256.66.0F.W1 5A /r | avx512 | Convert four packed double-precision floating-point values in ymm2/m256/m64bcst to four single-precision floating-point values in xmm1 with writemask k1. |
VCVTPD2PS ymm1 {k1}{z}, zmm2/m512/m64bcst{er} | EVEX.512.66.0F.W1 5A /r | avx512 | Convert eight packed double-precision floating-point values in zmm2/m512/m64bcst to eight single-precision floating-point values in ymm1 with writemask k1. |
CVTPI2PD xmm, mm/m64 | 66 0F 2A /r | | Convert two packed signed doubleword integers from mm/mem64 to two packed double-precision floating-point values in xmm. |
CVTPI2PS xmm, mm/m64 | 0F 2A /r | | Convert two signed doubleword integers from mm/m64 to two single-precision floating-point values in xmm. |
CVTPS2DQ xmm1, xmm2/m128 | 66 0F 5B /r | sse2 | Convert four packed single-precision floating-point values from xmm2/mem to four packed signed doubleword values in xmm1. |
VCVTPS2DQ xmm1, xmm2/m128 | VEX.128.66.0F.WIG 5B /r | avx | Convert four packed single-precision floating-point values from xmm2/mem to four packed signed doubleword values in xmm1. |
VCVTPS2DQ ymm1, ymm2/m256 | VEX.256.66.0F.WIG 5B /r | avx | Convert eight packed single-precision floating-point values from ymm2/mem to eight packed signed doubleword values in ymm1. |
VCVTPS2DQ xmm1 {k1}{z}, xmm2/m128/m32bcst | EVEX.128.66.0F.W0 5B /r | avx512 | Convert four packed single precision floating-point values from xmm2/m128/m32bcst to four packed signed doubleword values in xmm1 subject to writemask k1. |
VCVTPS2DQ ymm1 {k1}{z}, ymm2/m256/m32bcst | EVEX.256.66.0F.W0 5B /r | avx512 | Convert eight packed single precision floating-point values from ymm2/m256/m32bcst to eight packed signed doubleword values in ymm1 subject to writemask k1. |
VCVTPS2DQ zmm1 {k1}{z}, zmm2/m512/m32bcst{er} | EVEX.512.66.0F.W0 5B /r | avx512 | Convert sixteen packed single-precision floating-point values from zmm2/m512/m32bcst to sixteen packed signed doubleword values in zmm1 subject to writemask k1. |
CVTPS2PD xmm1, xmm2/m64 | 0F 5A /r | sse2 | Convert two packed single-precision floating-point values in xmm2/m64 to two packed double-precision floating-point values in xmm1. |
VCVTPS2PD xmm1, xmm2/m64 | VEX.128.0F.WIG 5A /r | avx | Convert two packed single-precision floating-point values in xmm2/m64 to two packed double-precision floating-point values in xmm1. |
VCVTPS2PD ymm1, xmm2/m128 | VEX.256.0F.WIG 5A /r | avx | Convert four packed single-precision floating-point values in xmm2/m128 to four packed double-precision floating-point values in ymm1. |
VCVTPS2PD xmm1 {k1}{z}, xmm2/m64/m32bcst | EVEX.128.0F.W0 5A /r | avx512 | Convert two packed single-precision floating-point values in xmm2/m64/m32bcst to packed double-precision floating-point values in xmm1 with writemask k1. |
VCVTPS2PD ymm1 {k1}{z}, xmm2/m128/m32bcst | EVEX.256.0F.W0 5A /r | avx512 | Convert four packed single-precision floating-point values in xmm2/m128/m32bcst to packed double-precision floating-point values in ymm1 with writemask k1. |
VCVTPS2PD zmm1 {k1}{z}, ymm2/m256/m32bcst{sae} | EVEX.512.0F.W0 5A /r | avx512 | Convert eight packed single-precision floating-point values in ymm2/m256/m32bcst to eight packed double-precision floating-point values in zmm1 with writemask k1. |
CVTPS2PI mm, xmm/m64 | 0F 2D /r | | Convert two packed single-precision floating-point values from xmm/m64 to two packed signed doubleword integers in mm. |
CVTSD2SI r32, xmm1/m64 | F2 0F 2D /r | sse2 | Convert one double-precision floating-point value from xmm1/m64 to one signed doubleword integer r32. |
CVTSD2SI r64, xmm1/m64 | F2 REX.W 0F 2D /r | sse2 | Convert one double-precision floating-point value from xmm1/m64 to one signed quadword integer sign-extended into r64. |
VCVTSD2SI r32, xmm1/m64 | VEX.128.F2.0F.W0 2D /r | avx | Convert one double-precision floating-point value from xmm1/m64 to one signed doubleword integer r32. |
VCVTSD2SI r64, xmm1/m64 | VEX.128.F2.0F.W1 2D /r | avx | Convert one double-precision floating-point value from xmm1/m64 to one signed quadword integer sign-extended into r64. |
VCVTSD2SI r32, xmm1/m64{er} | EVEX.LIG.F2.0F.W0 2D /r | avx512 | Convert one double-precision floating-point value from xmm1/m64 to one signed doubleword integer r32. |
VCVTSD2SI r64, xmm1/m64{er} | EVEX.LIG.F2.0F.W1 2D /r | avx512 | Convert one double-precision floating-point value from xmm1/m64 to one signed quadword integer sign-extended into r64. |
CVTSD2SS xmm1, xmm2/m64 | F2 0F 5A /r | sse2 | Convert one double-precision floating-point value in xmm2/m64 to one single-precision floating-point value in xmm1. |
VCVTSD2SS xmm1, xmm2, xmm3/m64 | VEX.NDS.128.F2.0F.WIG 5A /r | avx | Convert one double-precision floating-point value in xmm3/m64 to one single-precision floating-point value and merge with high bits in xmm2. |
VCVTSD2SS xmm1 {k1}{z}, xmm2, xmm3/m64{er} | EVEX.NDS.LIG.F2.0F.W1 5A /r | avx512 | Convert one double-precision floating-point value in xmm3/m64 to one single-precision floating-point value and merge with high bits in xmm2 under writemask k1. |
CVTSI2SD xmm1, r32/m32 | F2 0F 2A /r | sse2 | Convert one signed doubleword integer from r32/m32 to one double-precision floating-point value in xmm1. |
CVTSI2SD xmm1, r/m64 | F2 REX.W 0F 2A /r | sse2 | Convert one signed quadword integer from r/m64 to one double-precision floating-point value in xmm1. |
VCVTSI2SD xmm1, xmm2, r/m32 | VEX.NDS.128.F2.0F.W0 2A /r | avx | Convert one signed doubleword integer from r/m32 to one double-precision floating-point value in xmm1. |
VCVTSI2SD xmm1, xmm2, r/m64 | VEX.NDS.128.F2.0F.W1 2A /r | avx | Convert one signed quadword integer from r/m64 to one double-precision floating-point value in xmm1. |
VCVTSI2SD xmm1, xmm2, r/m32 | EVEX.NDS.LIG.F2.0F.W0 2A /r | avx512 | Convert one signed doubleword integer from r/m32 to one double-precision floating-point value in xmm1. |
VCVTSI2SD xmm1, xmm2, r/m64{er} | EVEX.NDS.LIG.F2.0F.W1 2A /r | avx512 | Convert one signed quadword integer from r/m64 to one double-precision floating-point value in xmm1. |
CVTSI2SS xmm1, r/m32 | F3 0F 2A /r | sse | Convert one signed doubleword integer from r/m32 to one single-precision floating-point value in xmm1. |
CVTSI2SS xmm1, r/m64 | F3 REX.W 0F 2A /r | sse | Convert one signed quadword integer from r/m64 to one single-precision floating-point value in xmm1. |
VCVTSI2SS xmm1, xmm2, r/m32 | VEX.NDS.128.F3.0F.W0 2A /r | avx | Convert one signed doubleword integer from r/m32 to one single-precision floating-point value in xmm1. |
VCVTSI2SS xmm1, xmm2, r/m64 | VEX.NDS.128.F3.0F.W1 2A /r | avx | Convert one signed quadword integer from r/m64 to one single-precision floating-point value in xmm1. |
VCVTSI2SS xmm1, xmm2, r/m32{er} | EVEX.NDS.LIG.F3.0F.W0 2A /r | avx512 | Convert one signed doubleword integer from r/m32 to one single-precision floating-point value in xmm1. |
VCVTSI2SS xmm1, xmm2, r/m64{er} | EVEX.NDS.LIG.F3.0F.W1 2A /r | avx512 | Convert one signed quadword integer from r/m64 to one single-precision floating-point value in xmm1. |
CVTSS2SD xmm1, xmm2/m32 | F3 0F 5A /r | sse2 | Convert one single-precision floating-point value in xmm2/m32 to one double-precision floating-point value in xmm1. |
VCVTSS2SD xmm1, xmm2, xmm3/m32 | VEX.NDS.128.F3.0F.WIG 5A /r | avx | Convert one single-precision floating-point value in xmm3/m32 to one double-precision floating-point value and merge with high bits of xmm2. |
VCVTSS2SD xmm1 {k1}{z}, xmm2, xmm3/m32{sae} | EVEX.NDS.LIG.F3.0F.W0 5A /r | avx512 | Convert one single-precision floating-point value in xmm3/m32 to one double-precision floating-point value and merge with high bits of xmm2 under writemask k1. |
CVTSS2SI r32, xmm1/m32 | F3 0F 2D /r | sse | Convert one single-precision floating-point value from xmm1/m32 to one signed doubleword integer in r32. |
CVTSS2SI r64, xmm1/m32 | F3 REX.W 0F 2D /r | sse | Convert one single-precision floating-point value from xmm1/m32 to one signed quadword integer in r64. |
VCVTSS2SI r32, xmm1/m32 | VEX.128.F3.0F.W0 2D /r | avx | Convert one single-precision floating-point value from xmm1/m32 to one signed doubleword integer in r32. |
VCVTSS2SI r64, xmm1/m32 | VEX.128.F3.0F.W1 2D /r | avx | Convert one single-precision floating-point value from xmm1/m32 to one signed quadword integer in r64. |
VCVTSS2SI r32, xmm1/m32{er} | EVEX.LIG.F3.0F.W0 2D /r | avx512 | Convert one single-precision floating-point value from xmm1/m32 to one signed doubleword integer in r32. |
VCVTSS2SI r64, xmm1/m32{er} | EVEX.LIG.F3.0F.W1 2D /r | avx512 | Convert one single-precision floating-point value from xmm1/m32 to one signed quadword integer in r64. |
CVTTPD2DQ xmm1, xmm2/m128 | 66 0F E6 /r | sse2 | Convert two packed double-precision floating-point values in xmm2/mem to two signed doubleword integers in xmm1 using truncation. |
VCVTTPD2DQ xmm1, xmm2/m128 | VEX.128.66.0F.WIG E6 /r | avx | Convert two packed double-precision floating-point values in xmm2/mem to two signed doubleword integers in xmm1 using truncation. |
VCVTTPD2DQ xmm1, ymm2/m256 | VEX.256.66.0F.WIG E6 /r | avx | Convert four packed double-precision floating-point values in ymm2/mem to four signed doubleword integers in xmm1 using truncation. |
VCVTTPD2DQ xmm1 {k1}{z}, xmm2/m128/m64bcst | EVEX.128.66.0F.W1 E6 /r | avx512 | Convert two packed double-precision floating-point values in xmm2/m128/m64bcst to two signed doubleword integers in xmm1 using truncation subject to writemask k1. |
VCVTTPD2DQ xmm1 {k1}{z}, ymm2/m256/m64bcst | EVEX.256.66.0F.W1 E6 /r | avx512 | Convert four packed double-precision floating-point values in ymm2/m256/m64bcst to four signed doubleword integers in xmm1 using truncation subject to writemask k1. |
VCVTTPD2DQ ymm1 {k1}{z}, zmm2/m512/m64bcst{sae} | EVEX.512.66.0F.W1 E6 /r | avx512 | Convert eight packed double-precision floating-point values in zmm2/m512/m64bcst to eight signed doubleword integers in ymm1 using truncation subject to writemask k1. |
CVTTPD2PI mm, xmm/m128 | 66 0F 2C /r | Convert two packed double-precision floating-point values from xmm/m128 to two packed signed doubleword integers in mm using truncation. | |
CVTTPS2DQ xmm1, xmm2/m128 | F3 0F 5B /r | sse2 | Convert four packed single-precision floating-point values from xmm2/mem to four packed signed doubleword values in xmm1 using truncation. |
VCVTTPS2DQ xmm1, xmm2/m128 | VEX.128.F3.0F.WIG 5B /r | avx | Convert four packed single-precision floating-point values from xmm2/mem to four packed signed doubleword values in xmm1 using truncation. |
VCVTTPS2DQ ymm1, ymm2/m256 | VEX.256.F3.0F.WIG 5B /r | avx | Convert eight packed single-precision floating-point values from ymm2/mem to eight packed signed doubleword values in ymm1 using truncation. |
VCVTTPS2DQ xmm1 {k1}{z}, xmm2/m128/m32bcst | EVEX.128.F3.0F.W0 5B /r | avx512 | Convert four packed single-precision floating-point values from xmm2/m128/m32bcst to four packed signed doubleword values in xmm1 using truncation subject to writemask k1. |
VCVTTPS2DQ ymm1 {k1}{z}, ymm2/m256/m32bcst | EVEX.256.F3.0F.W0 5B /r | avx512 | Convert eight packed single-precision floating-point values from ymm2/m256/m32bcst to eight packed signed doubleword values in ymm1 using truncation subject to writemask k1. |
VCVTTPS2DQ zmm1 {k1}{z}, zmm2/m512/m32bcst{sae} | EVEX.512.F3.0F.W0 5B /r | avx512 | Convert sixteen packed single-precision floating-point values from zmm2/m512/m32bcst to sixteen packed signed doubleword values in zmm1 using truncation subject to writemask k1. |
CVTTPS2PI mm, xmm/m64 | 0F 2C /r | Convert two single-precision floating-point values from xmm/m64 to two signed doubleword integers in mm using truncation. | |
CVTTSD2SI r32, xmm1/m64 | F2 0F 2C /r | sse2 | Convert one double-precision floating-point value from xmm1/m64 to one signed doubleword integer in r32 using truncation. |
CVTTSD2SI r64, xmm1/m64 | F2 REX.W 0F 2C /r | sse2 | Convert one double-precision floating-point value from xmm1/m64 to one signed quadword integer in r64 using truncation. |
VCVTTSD2SI r32, xmm1/m64 | VEX.128.F2.0F.W0 2C /r | avx | Convert one double-precision floating-point value from xmm1/m64 to one signed doubleword integer in r32 using truncation. |
VCVTTSD2SI r64, xmm1/m64 | VEX.128.F2.0F.W1 2C /r | avx | Convert one double-precision floating-point value from xmm1/m64 to one signed quadword integer in r64 using truncation. |
VCVTTSD2SI r32, xmm1/m64{sae} | EVEX.LIG.F2.0F.W0 2C /r | avx512 | Convert one double-precision floating-point value from xmm1/m64 to one signed doubleword integer in r32 using truncation. |
VCVTTSD2SI r64, xmm1/m64{sae} | EVEX.LIG.F2.0F.W1 2C /r | avx512 | Convert one double-precision floating-point value from xmm1/m64 to one signed quadword integer in r64 using truncation. |
CVTTSS2SI r32, xmm1/m32 | F3 0F 2C /r | sse | Convert one single-precision floating-point value from xmm1/m32 to one signed doubleword integer in r32 using truncation. |
CVTTSS2SI r64, xmm1/m32 | F3 REX.W 0F 2C /r | sse | Convert one single-precision floating-point value from xmm1/m32 to one signed quadword integer in r64 using truncation. |
VCVTTSS2SI r32, xmm1/m32 | VEX.128.F3.0F.W0 2C /r | avx | Convert one single-precision floating-point value from xmm1/m32 to one signed doubleword integer in r32 using truncation. |
VCVTTSS2SI r64, xmm1/m32 | VEX.128.F3.0F.W1 2C /r | avx | Convert one single-precision floating-point value from xmm1/m32 to one signed quadword integer in r64 using truncation. |
VCVTTSS2SI r32, xmm1/m32{sae} | EVEX.LIG.F3.0F.W0 2C /r | avx512 | Convert one single-precision floating-point value from xmm1/m32 to one signed doubleword integer in r32 using truncation. |
VCVTTSS2SI r64, xmm1/m32{sae} | EVEX.LIG.F3.0F.W1 2C /r | avx512 | Convert one single-precision floating-point value from xmm1/m32 to one signed quadword integer in r64 using truncation. |
CWD | 99 | DX:AX ← sign-extend of AX. | |
CDQ | 99 | EDX:EAX ← sign-extend of EAX. | |
CQO | REX.W + 99 | RDX:RAX ← sign-extend of RAX. | |
DAA | 27 | Decimal adjust AL after addition. | |
DAS | 2F | Decimal adjust AL after subtraction. | |
DEC r/m8 | FE /1 | Decrement r/m8 by 1. | |
DEC r/m8 | REX + FE /1 | Decrement r/m8 by 1. | |
DEC r/m16 | FF /1 | Decrement r/m16 by 1. | |
DEC r/m32 | FF /1 | Decrement r/m32 by 1. | |
DEC r/m64 | REX.W + FF /1 | Decrement r/m64 by 1. | |
DEC r16 | 48+rw | Decrement r16 by 1. | |
DEC r32 | 48+rd | Decrement r32 by 1. | |
DIV r/m8 | F6 /6 | Unsigned divide AX by r/m8, with result stored in AL ← Quotient, AH ← Remainder. | |
DIV r/m8 | REX + F6 /6 | Unsigned divide AX by r/m8, with result stored in AL ← Quotient, AH ← Remainder. | |
DIV r/m16 | F7 /6 | Unsigned divide DX:AX by r/m16, with result stored in AX ← Quotient, DX ← Remainder. | |
DIV r/m32 | F7 /6 | Unsigned divide EDX:EAX by r/m32, with result stored in EAX ← Quotient, EDX ← Remainder. | |
DIV r/m64 | REX.W + F7 /6 | Unsigned divide RDX:RAX by r/m64, with result stored in RAX ← Quotient, RDX ← Remainder. | |
DIVPD xmm1, xmm2/m128 | 66 0F 5E /r | sse2 | Divide packed double-precision floating-point values in xmm1 by packed double-precision floating-point values in xmm2/mem. |
VDIVPD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 5E /r | avx | Divide packed double-precision floating-point values in xmm2 by packed double-precision floating-point values in xmm3/mem. |
VDIVPD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 5E /r | avx | Divide packed double-precision floating-point values in ymm2 by packed double-precision floating-point values in ymm3/mem. |
VDIVPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F.W1 5E /r | avx512 | Divide packed double-precision floating-point values in xmm2 by packed double-precision floating-point values in xmm3/m128/m64bcst and write results to xmm1 subject to writemask k1. |
VDIVPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F.W1 5E /r | avx512 | Divide packed double-precision floating-point values in ymm2 by packed double-precision floating-point values in ymm3/m256/m64bcst and write results to ymm1 subject to writemask k1. |
VDIVPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.NDS.512.66.0F.W1 5E /r | avx512 | Divide packed double-precision floating-point values in zmm2 by packed double-precision floating-point values in zmm3/m512/m64bcst and write results to zmm1 subject to writemask k1. |
DIVPS xmm1, xmm2/m128 | 0F 5E /r | sse | Divide packed single-precision floating-point values in xmm1 by packed single-precision floating-point values in xmm2/mem. |
VDIVPS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.0F.WIG 5E /r | avx | Divide packed single-precision floating-point values in xmm2 by packed single-precision floating-point values in xmm3/mem. |
VDIVPS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.0F.WIG 5E /r | avx | Divide packed single-precision floating-point values in ymm2 by packed single-precision floating-point values in ymm3/mem. |
VDIVPS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.0F.W0 5E /r | avx512 | Divide packed single-precision floating-point values in xmm2 by packed single-precision floating-point values in xmm3/m128/m32bcst and write results to xmm1 subject to writemask k1. |
VDIVPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.0F.W0 5E /r | avx512 | Divide packed single-precision floating-point values in ymm2 by packed single-precision floating-point values in ymm3/m256/m32bcst and write results to ymm1 subject to writemask k1. |
VDIVPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.NDS.512.0F.W0 5E /r | avx512 | Divide packed single-precision floating-point values in zmm2 by packed single-precision floating-point values in zmm3/m512/m32bcst and write results to zmm1 subject to writemask k1. |
DIVSD xmm1, xmm2/m64 | F2 0F 5E /r | sse2 | Divide low double-precision floating-point value in xmm1 by low double-precision floating-point value in xmm2/m64. |
VDIVSD xmm1, xmm2, xmm3/m64 | VEX.NDS.128.F2.0F.WIG 5E /r | avx | Divide low double-precision floating-point value in xmm2 by low double-precision floating-point value in xmm3/m64. |
VDIVSD xmm1 {k1}{z}, xmm2, xmm3/m64{er} | EVEX.NDS.LIG.F2.0F.W1 5E /r | avx512 | Divide low double-precision floating-point value in xmm2 by low double-precision floating-point value in xmm3/m64. |
DIVSS xmm1, xmm2/m32 | F3 0F 5E /r | sse | Divide low single-precision floating-point value in xmm1 by low single-precision floating-point value in xmm2/m32. |
VDIVSS xmm1, xmm2, xmm3/m32 | VEX.NDS.128.F3.0F.WIG 5E /r | avx | Divide low single-precision floating-point value in xmm2 by low single-precision floating-point value in xmm3/m32. |
VDIVSS xmm1 {k1}{z}, xmm2, xmm3/m32{er} | EVEX.NDS.LIG.F3.0F.W0 5E /r | avx512 | Divide low single-precision floating-point value in xmm2 by low single-precision floating-point value in xmm3/m32. |
DPPD xmm1, xmm2/m128, imm8 | 66 0F 3A 41 /r ib | sse4.1 | Selectively multiply packed DP floating-point values from xmm1 with packed DP floating-point values from xmm2, add and selectively store the packed DP floating-point values to xmm1. |
VDPPD xmm1,xmm2, xmm3/m128, imm8 | VEX.NDS.128.66.0F3A.WIG 41 /r ib | avx | Selectively multiply packed DP floating-point values from xmm2 with packed DP floating-point values from xmm3, add and selectively store the packed DP floating-point values to xmm1. |
DPPS xmm1, xmm2/m128, imm8 | 66 0F 3A 40 /r ib | sse4.1 | Selectively multiply packed SP floating-point values from xmm1 with packed SP floating-point values from xmm2, add and selectively store the packed SP floating-point values or zero values to xmm1. |
VDPPS xmm1,xmm2, xmm3/m128, imm8 | VEX.NDS.128.66.0F3A.WIG 40 /r ib | avx | Multiply packed SP floating-point values from xmm2 with packed SP floating-point values from xmm3/mem, selectively add and store to xmm1. |
VDPPS ymm1, ymm2, ymm3/m256, imm8 | VEX.NDS.256.66.0F3A.WIG 40 /r ib | avx | Multiply packed SP floating-point values from ymm2 with packed SP floating-point values from ymm3/mem, selectively add pairs of elements and store to ymm1. |
EMMS | 0F 77 | Set the x87 FPU tag word to empty. | |
ENTER imm16, 0 | C8 iw 00 | Create a stack frame for a procedure. | |
ENTER imm16, 1 | C8 iw 01 | Create a stack frame with a nested pointer for a procedure. | |
ENTER imm16, imm8 | C8 iw ib | Create a stack frame with nested pointers for a procedure. | |
EXTRACTPS reg/m32, xmm1, imm8 | 66 0F 3A 17 /r ib | sse4.1 | Extract one single-precision floating-point value from xmm1 at the offset specified by imm8 and store the result in reg or m32. Zero extend the results in 64-bit register if applicable. |
VEXTRACTPS reg/m32, xmm1, imm8 | VEX.128.66.0F3A.WIG 17 /r ib | avx | Extract one single-precision floating-point value from xmm1 at the offset specified by imm8 and store the result in reg or m32. Zero extend the results in 64-bit register if applicable. |
VEXTRACTPS reg/m32, xmm1, imm8 | EVEX.128.66.0F3A.WIG 17 /r ib | avx512 | Extract one single-precision floating-point value from xmm1 at the offset specified by imm8 and store the result in reg or m32. Zero extend the results in 64-bit register if applicable. |
F2XM1 | D9 F0 | Replace ST(0) with (2^ST(0) – 1). | |
FABS | D9 E1 | Replace ST(0) with its absolute value. | |
FADD m32fp | D8 /0 | Add m32fp to ST(0) and store result in ST(0). | |
FADD m64fp | DC /0 | Add m64fp to ST(0) and store result in ST(0). | |
FADD ST(0), ST(i) | D8 C0+i | Add ST(0) to ST(i) and store result in ST(0). | |
FADD ST(i), ST(0) | DC C0+i | Add ST(i) to ST(0) and store result in ST(i). | |
FADDP ST(i), ST(0) | DE C0+i | Add ST(0) to ST(i), store result in ST(i), and pop the register stack. | |
FADDP | DE C1 | Add ST(0) to ST(1), store result in ST(1), and pop the register stack. | |
FIADD m32int | DA /0 | Add m32int to ST(0) and store result in ST(0). | |
FIADD m16int | DE /0 | Add m16int to ST(0) and store result in ST(0). | |
FBLD m80dec | DF /4 | Convert BCD value to floating-point and push onto the FPU stack. | |
FBSTP m80bcd | DF /6 | Store ST(0) in m80bcd and pop ST(0). | |
FCHS | D9 E0 | Complement the sign of ST(0). | |
FCLEX | 9B DB E2 | Clear floating-point exception flags after checking for pending unmasked floating-point exceptions. | |
FNCLEX | DB E2 | Clear floating-point exception flags without checking for pending unmasked floating-point exceptions. | |
FCMOVB ST(0), ST(i) | DA C0+i | Move if below (CF=1). | |
FCMOVE ST(0), ST(i) | DA C8+i | Move if equal (ZF=1). | |
FCMOVBE ST(0), ST(i) | DA D0+i | Move if below or equal (CF=1 or ZF=1). | |
FCMOVU ST(0), ST(i) | DA D8+i | Move if unordered (PF=1). | |
FCMOVNB ST(0), ST(i) | DB C0+i | Move if not below (CF=0). | |
FCMOVNE ST(0), ST(i) | DB C8+i | Move if not equal (ZF=0). | |
FCMOVNBE ST(0), ST(i) | DB D0+i | Move if not below or equal (CF=0 and ZF=0). | |
FCMOVNU ST(0), ST(i) | DB D8+i | Move if not unordered (PF=0). | |
FCOMI ST, ST(i) | DB F0+i | Compare ST(0) with ST(i) and set status flags accordingly. | |
FCOMIP ST, ST(i) | DF F0+i | Compare ST(0) with ST(i), set status flags accordingly, and pop register stack. | |
FUCOMI ST, ST(i) | DB E8+i | Compare ST(0) with ST(i), check for ordered values, and set status flags accordingly. | |
FUCOMIP ST, ST(i) | DF E8+i | Compare ST(0) with ST(i), check for ordered values, set status flags accordingly, and pop register stack. | |
FCOM m32fp | D8 /2 | Compare ST(0) with m32fp. | |
FCOM m64fp | DC /2 | Compare ST(0) with m64fp. | |
FCOM ST(i) | D8 D0+i | Compare ST(0) with ST(i). | |
FCOM | D8 D1 | Compare ST(0) with ST(1). | |
FCOMP m32fp | D8 /3 | Compare ST(0) with m32fp and pop register stack. | |
FCOMP m64fp | DC /3 | Compare ST(0) with m64fp and pop register stack. | |
FCOMP ST(i) | D8 D8+i | Compare ST(0) with ST(i) and pop register stack. | |
FCOMP | D8 D9 | Compare ST(0) with ST(1) and pop register stack. | |
FCOMPP | DE D9 | Compare ST(0) with ST(1) and pop register stack twice. | |
FCOS | D9 FF | Replace ST(0) with its approximate cosine. | |
FDECSTP | D9 F6 | Decrement TOP field in FPU status word. | |
FDIVR m32fp | D8 /7 | Divide m32fp by ST(0) and store result in ST(0). | |
FDIVR m64fp | DC /7 | Divide m64fp by ST(0) and store result in ST(0). | |
FDIVR ST(0), ST(i) | D8 F8+i | Divide ST(i) by ST(0) and store result in ST(0). | |
FDIVR ST(i), ST(0) | DC F0+i | Divide ST(0) by ST(i) and store result in ST(i). | |
FDIVRP ST(i), ST(0) | DE F0+i | Divide ST(0) by ST(i), store result in ST(i), and pop the register stack. | |
FDIVRP | DE F1 | Divide ST(0) by ST(1), store result in ST(1), and pop the register stack. | |
FIDIVR m32int | DA /7 | Divide m32int by ST(0) and store result in ST(0). | |
FIDIVR m16int | DE /7 | Divide m16int by ST(0) and store result in ST(0). | |
FDIV m32fp | D8 /6 | Divide ST(0) by m32fp and store result in ST(0). | |
FDIV m64fp | DC /6 | Divide ST(0) by m64fp and store result in ST(0). | |
FDIV ST(0), ST(i) | D8 F0+i | Divide ST(0) by ST(i) and store result in ST(0). | |
FDIV ST(i), ST(0) | DC F8+i | Divide ST(i) by ST(0) and store result in ST(i). | |
FDIVP ST(i), ST(0) | DE F8+i | Divide ST(i) by ST(0), store result in ST(i), and pop the register stack. | |
FDIVP | DE F9 | Divide ST(1) by ST(0), store result in ST(1), and pop the register stack. | |
FIDIV m32int | DA /6 | Divide ST(0) by m32int and store result in ST(0). | |
FIDIV m16int | DE /6 | Divide ST(0) by m16int and store result in ST(0). | |
FFREE ST(i) | DD C0+i | Set tag for ST(i) to empty. | |
FICOM m16int | DE /2 | Compare ST(0) with m16int. | |
FICOM m32int | DA /2 | Compare ST(0) with m32int. | |
FICOMP m16int | DE /3 | Compare ST(0) with m16int and pop stack register. | |
FICOMP m32int | DA /3 | Compare ST(0) with m32int and pop stack register. | |
FILD m16int | DF /0 | Push m16int onto the FPU register stack. | |
FILD m32int | DB /0 | Push m32int onto the FPU register stack. | |
FILD m64int | DF /5 | Push m64int onto the FPU register stack. | |
FINCSTP | D9 F7 | Increment the TOP field in the FPU status register. | |
FINIT | 9B DB E3 | Initialize FPU after checking for pending unmasked floating-point exceptions. | |
FNINIT | DB E3 | Initialize FPU without checking for pending unmasked floating-point exceptions. | |
FISTTP m16int | DF /1 | Store ST(0) in m16int with truncation. | |
FISTTP m32int | DB /1 | Store ST(0) in m32int with truncation. | |
FISTTP m64int | DD /1 | Store ST(0) in m64int with truncation. | |
FIST m16int | DF /2 | Store ST(0) in m16int. | |
FIST m32int | DB /2 | Store ST(0) in m32int. | |
FISTP m16int | DF /3 | Store ST(0) in m16int and pop register stack. | |
FISTP m32int | DB /3 | Store ST(0) in m32int and pop register stack. | |
FISTP m64int | DF /7 | Store ST(0) in m64int and pop register stack. | |
FLD m32fp | D9 /0 | Push m32fp onto the FPU register stack. | |
FLD m64fp | DD /0 | Push m64fp onto the FPU register stack. | |
FLD m80fp | DB /5 | Push m80fp onto the FPU register stack. | |
FLD ST(i) | D9 C0+i | Push ST(i) onto the FPU register stack. | |
FLD1 | D9 E8 | Push +1.0 onto the FPU register stack. | |
FLDL2T | D9 E9 | Push log₂10 onto the FPU register stack. | |
FLDL2E | D9 EA | Push log₂e onto the FPU register stack. | |
FLDPI | D9 EB | Push π onto the FPU register stack. | |
FLDLG2 | D9 EC | Push log₁₀2 onto the FPU register stack. | |
FLDLN2 | D9 ED | Push logₑ2 onto the FPU register stack. | |
FLDZ | D9 EE | Push +0.0 onto the FPU register stack. | |
FLDCW m2byte | D9 /5 | Load FPU control word from m2byte. | |
FLDENV m14/28byte | D9 /4 | Load FPU environment from m14byte or m28byte. | |
FMUL m32fp | D8 /1 | Multiply ST(0) by m32fp and store result in ST(0). | |
FMUL m64fp | DC /1 | Multiply ST(0) by m64fp and store result in ST(0). | |
FMUL ST(0), ST(i) | D8 C8+i | Multiply ST(0) by ST(i) and store result in ST(0). | |
FMUL ST(i), ST(0) | DC C8+i | Multiply ST(i) by ST(0) and store result in ST(i). | |
FMULP ST(i), ST(0) | DE C8+i | Multiply ST(i) by ST(0), store result in ST(i), and pop the register stack. | |
FMULP | DE C9 | Multiply ST(1) by ST(0), store result in ST(1), and pop the register stack. | |
FIMUL m32int | DA /1 | Multiply ST(0) by m32int and store result in ST(0). | |
FIMUL m16int | DE /1 | Multiply ST(0) by m16int and store result in ST(0). | |
FNOP | D9 D0 | No operation is performed. | |
FPATAN | D9 F3 | Replace ST(1) with arctan(ST(1)/ST(0)) and pop the register stack. | |
FPREM | D9 F8 | Replace ST(0) with the remainder obtained from dividing ST(0) by ST(1). | |
FPREM1 | D9 F5 | Replace ST(0) with the IEEE remainder obtained from dividing ST(0) by ST(1). | |
FPTAN | D9 F2 | Replace ST(0) with its approximate tangent and push 1 onto the FPU stack. | |
FRNDINT | D9 FC | Round ST(0) to an integer. | |
FRSTOR m94/108byte | DD /4 | Load FPU state from m94byte or m108byte. | |
FSAVE m94/108byte | 9B DD /6 | Store FPU state to m94byte or m108byte after checking for pending unmasked floating-point exceptions. Then re-initialize the FPU. | |
FNSAVE m94/108byte | DD /6 | Store FPU state to m94byte or m108byte without checking for pending unmasked floating-point exceptions. Then re-initialize the FPU. | |
FSCALE | D9 FD | Scale ST(0) by ST(1). | |
FSIN | D9 FE | Replace ST(0) with its approximate sine. | |
FSINCOS | D9 FB | Compute the sine and cosine of ST(0); replace ST(0) with the approximate sine, and push the approximate cosine onto the register stack. | |
FSQRT | D9 FA | Compute the square root of ST(0) and store the result in ST(0). | |
FSTCW m2byte | 9B D9 /7 | Store FPU control word to m2byte after checking for pending unmasked floating-point exceptions. | |
FNSTCW m2byte | D9 /7 | Store FPU control word to m2byte without checking for pending unmasked floating-point exceptions. | |
FSTENV m14/28byte | 9B D9 /6 | Store FPU environment to m14byte or m28byte after checking for pending unmasked floating-point exceptions. Then mask all floating-point exceptions. | |
FNSTENV m14/28byte | D9 /6 | Store FPU environment to m14byte or m28byte without checking for pending unmasked floating-point exceptions. Then mask all floating-point exceptions. | |
FSTSW m2byte | 9B DD /7 | Store FPU status word at m2byte after checking for pending unmasked floating-point exceptions. | |
FSTSW AX | 9B DF E0 | Store FPU status word in AX register after checking for pending unmasked floating-point exceptions. | |
FNSTSW m2byte | DD /7 | Store FPU status word at m2byte without checking for pending unmasked floating-point exceptions. | |
FNSTSW AX | DF E0 | Store FPU status word in AX register without checking for pending unmasked floating-point exceptions. | |
FST m32fp | D9 /2 | Copy ST(0) to m32fp. | |
FST m64fp | DD /2 | Copy ST(0) to m64fp. | |
FST ST(i) | DD D0+i | Copy ST(0) to ST(i). | |
FSTP m32fp | D9 /3 | Copy ST(0) to m32fp and pop register stack. | |
FSTP m64fp | DD /3 | Copy ST(0) to m64fp and pop register stack. | |
FSTP m80fp | DB /7 | Copy ST(0) to m80fp and pop register stack. | |
FSTP ST(i) | DD D8+i | Copy ST(0) to ST(i) and pop register stack. | |
FSUBR m32fp | D8 /5 | Subtract ST(0) from m32fp and store result in ST(0). | |
FSUBR m64fp | DC /5 | Subtract ST(0) from m64fp and store result in ST(0). | |
FSUBR ST(0), ST(i) | D8 E8+i | Subtract ST(0) from ST(i) and store result in ST(0). | |
FSUBR ST(i), ST(0) | DC E0+i | Subtract ST(i) from ST(0) and store result in ST(i). | |
FSUBRP ST(i), ST(0) | DE E0+i | Subtract ST(i) from ST(0), store result in ST(i), and pop register stack. | |
FSUBRP | DE E1 | Subtract ST(1) from ST(0), store result in ST(1), and pop register stack. | |
FISUBR m32int | DA /5 | Subtract ST(0) from m32int and store result in ST(0). | |
FISUBR m16int | DE /5 | Subtract ST(0) from m16int and store result in ST(0). | |
FSUB m32fp | D8 /4 | Subtract m32fp from ST(0) and store result in ST(0). | |
FSUB m64fp | DC /4 | Subtract m64fp from ST(0) and store result in ST(0). | |
FSUB ST(0), ST(i) | D8 E0+i | Subtract ST(i) from ST(0) and store result in ST(0). | |
FSUB ST(i), ST(0) | DC E8+i | Subtract ST(0) from ST(i) and store result in ST(i). | |
FSUBP ST(i), ST(0) | DE E8+i | Subtract ST(0) from ST(i), store result in ST(i), and pop register stack. | |
FSUBP | DE E9 | Subtract ST(0) from ST(1), store result in ST(1), and pop register stack. | |
FISUB m32int | DA /4 | Subtract m32int from ST(0) and store result in ST(0). | |
FISUB m16int | DE /4 | Subtract m16int from ST(0) and store result in ST(0). | |
FTST | D9 E4 | Compare ST(0) with 0.0. | |
FUCOM ST(i) | DD E0+i | Compare ST(0) with ST(i). | |
FUCOM | DD E1 | Compare ST(0) with ST(1). | |
FUCOMP ST(i) | DD E8+i | Compare ST(0) with ST(i) and pop register stack. | |
FUCOMP | DD E9 | Compare ST(0) with ST(1) and pop register stack. | |
FUCOMPP | DA E9 | Compare ST(0) with ST(1) and pop register stack twice. | |
FXAM | D9 E5 | Classify value or number in ST(0). | |
FXCH ST(i) | D9 C8+i | Exchange the contents of ST(0) and ST(i). | |
FXCH | D9 C9 | Exchange the contents of ST(0) and ST(1). | |
FXRSTOR m512byte | 0F AE /1 | Restore the x87 FPU, MMX, XMM, and MXCSR register state from m512byte. | |
FXRSTOR64 m512byte | REX.W+ 0F AE /1 | Restore the x87 FPU, MMX, XMM, and MXCSR register state from m512byte. | |
FXSAVE m512byte | 0F AE /0 | Save the x87 FPU, MMX, XMM, and MXCSR register state to m512byte. | |
FXSAVE64 m512byte | REX.W+ 0F AE /0 | Save the x87 FPU, MMX, XMM, and MXCSR register state to m512byte. | |
FXTRACT | D9 F4 | Separate value in ST(0) into exponent and significand, store exponent in ST(0), and push the significand onto the register stack. | |
FYL2X | D9 F1 | Replace ST(1) with (ST(1) ∗ log₂ST(0)) and pop the register stack. | |
FYL2XP1 | D9 F9 | Replace ST(1) with ST(1) ∗ log₂(ST(0) + 1.0) and pop the register stack. | |
HADDPD xmm1, xmm2/m128 | 66 0F 7C /r | sse3 | Horizontal add packed double-precision floating-point values from xmm2/m128 to xmm1. |
VHADDPD xmm1,xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 7C /r | avx | Horizontal add packed double-precision floating-point values from xmm2 and xmm3/mem. |
VHADDPD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 7C /r | avx | Horizontal add packed double-precision floating-point values from ymm2 and ymm3/mem. |
HADDPS xmm1, xmm2/m128 | F2 0F 7C /r | sse3 | Horizontal add packed single-precision floating-point values from xmm2/m128 to xmm1. |
VHADDPS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.F2.0F.WIG 7C /r | avx | Horizontal add packed single-precision floating-point values from xmm2 and xmm3/mem. |
VHADDPS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.F2.0F.WIG 7C /r | avx | Horizontal add packed single-precision floating-point values from ymm2 and ymm3/mem. |
HLT | F4 | Halt. | |
HSUBPD xmm1, xmm2/m128 | 66 0F 7D /r | sse3 | Horizontal subtract packed double-precision floating-point values from xmm2/m128 to xmm1. |
VHSUBPD xmm1,xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 7D /r | avx | Horizontal subtract packed double-precision floating-point values from xmm2 and xmm3/mem. |
VHSUBPD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 7D /r | avx | Horizontal subtract packed double-precision floating-point values from ymm2 and ymm3/mem. |
HSUBPS xmm1, xmm2/m128 | F2 0F 7D /r | sse3 | Horizontal subtract packed single-precision floating-point values from xmm2/m128 to xmm1. |
VHSUBPS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.F2.0F.WIG 7D /r | avx | Horizontal subtract packed single-precision floating-point values from xmm2 and xmm3/mem. |
VHSUBPS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.F2.0F.WIG 7D /r | avx | Horizontal subtract packed single-precision floating-point values from ymm2 and ymm3/mem. |
IDIV r/m8 | F6 /7 | Signed divide AX by r/m8, with result stored in AL ← Quotient, AH ← Remainder. | |
IDIV r/m8 | REX + F6 /7 | Signed divide AX by r/m8, with result stored in AL ← Quotient, AH ← Remainder. | |
IDIV r/m16 | F7 /7 | Signed divide DX:AX by r/m16, with result stored in AX ← Quotient, DX ← Remainder. | |
IDIV r/m32 | F7 /7 | Signed divide EDX:EAX by r/m32, with result stored in EAX ← Quotient, EDX ← Remainder. | |
IDIV r/m64 | REX.W + F7 /7 | Signed divide RDX:RAX by r/m64, with result stored in RAX ← Quotient, RDX ← Remainder. | |
IMUL r/m8 | F6 /5 | AX ← AL ∗ r/m byte. | |
IMUL r/m16 | F7 /5 | DX:AX ← AX ∗ r/m word. | |
IMUL r/m32 | F7 /5 | EDX:EAX ← EAX ∗ r/m32. | |
IMUL r/m64 | REX.W + F7 /5 | RDX:RAX ← RAX ∗ r/m64. | |
IMUL r16, r/m16 | 0F AF /r | word register ← word register ∗ r/m16. | |
IMUL r32, r/m32 | 0F AF /r | doubleword register ← doubleword register ∗ r/m32. | |
IMUL r64, r/m64 | REX.W + 0F AF /r | quadword register ← quadword register ∗ r/m64. | |
IMUL r16, r/m16, imm8 | 6B /r ib | word register ← r/m16 ∗ sign-extended immediate byte. | |
IMUL r32, r/m32, imm8 | 6B /r ib | doubleword register ← r/m32 ∗ sign-extended immediate byte. | |
IMUL r64, r/m64, imm8 | REX.W + 6B /r ib | quadword register ← r/m64 ∗ sign-extended immediate byte. | |
IMUL r16, r/m16, imm16 | 69 /r iw | word register ← r/m16 ∗ immediate word. | |
IMUL r32, r/m32, imm32 | 69 /r id | doubleword register ← r/m32 ∗ immediate doubleword. | |
IMUL r64, r/m64, imm32 | REX.W + 69 /r id | quadword register ← r/m64 ∗ immediate doubleword. | |
IN AL, imm8 | E4 ib | Input byte from imm8 I/O port address into AL. | |
IN AX, imm8 | E5 ib | Input word from imm8 I/O port address into AX. | |
IN EAX, imm8 | E5 ib | Input doubleword from imm8 I/O port address into EAX. | |
IN AL,DX | EC | Input byte from I/O port in DX into AL. | |
IN AX,DX | ED | Input word from I/O port in DX into AX. | |
IN EAX,DX | ED | Input doubleword from I/O port in DX into EAX. | |
INC r/m8 | FE /0 | Increment r/m byte by 1. | |
INC r/m8 | REX + FE /0 | Increment r/m byte by 1. | |
INC r/m16 | FF /0 | Increment r/m word by 1. | |
INC r/m32 | FF /0 | Increment r/m doubleword by 1. | |
INC r/m64 | REX.W + FF /0 | Increment r/m quadword by 1. | |
INC r16 | 40+ rw | Increment word register by 1. | |
INC r32 | 40+ rd | Increment doubleword register by 1. | |
INSERTPS xmm1, xmm2/m32, imm8 | 66 0F 3A 21 /r ib | sse4.1 | Insert a single-precision floating-point value selected by imm8 from xmm2/m32 into xmm1 at the destination element specified by imm8 and zero out destination elements in xmm1 as indicated in imm8. |
VINSERTPS xmm1, xmm2, xmm3/m32, imm8 | VEX.NDS.128.66.0F3A.WIG 21 /r ib | avx | Insert a single-precision floating-point value selected by imm8 from xmm3/m32, merge with values in xmm2 at the destination element specified by imm8, write out the result, and zero out destination elements in xmm1 as indicated in imm8. |
VINSERTPS xmm1, xmm2, xmm3/m32, imm8 | EVEX.NDS.128.66.0F3A.W0 21 /r ib | avx512 | Insert a single-precision floating-point value selected by imm8 from xmm3/m32, merge with values in xmm2 at the destination element specified by imm8, write out the result, and zero out destination elements in xmm1 as indicated in imm8. |
INS m8, DX | 6C | Input byte from I/O port specified in DX into memory location specified in ES:(E)DI or RDI. | |
INS m16, DX | 6D | Input word from I/O port specified in DX into memory location specified in ES:(E)DI or RDI. | |
INS m32, DX | 6D | Input doubleword from I/O port specified in DX into memory location specified in ES:(E)DI or RDI. | |
INSB | 6C | Input byte from I/O port specified in DX into memory location specified with ES:(E)DI or RDI. | |
INSW | 6D | Input word from I/O port specified in DX into memory location specified in ES:(E)DI or RDI. | |
INSD | 6D | Input doubleword from I/O port specified in DX into memory location specified in ES:(E)DI or RDI. | |
INT 3 | CC | Interrupt 3—trap to debugger. | |
INT imm8 | CD ib | Interrupt vector specified by immediate byte. | |
INTO | CE | Interrupt 4—if overflow flag is 1. | |
INVD | 0F 08 | Flush internal caches; initiate flushing of external caches. | |
INVLPG m | 0F 01/7 | Invalidate TLB entries for page containing m. | |
INVPCID r32, m128 | 66 0F 38 82 /r | invpcid | Invalidates entries in the TLBs and paging-structure caches based on invalidation type in r32 and descriptor in m128. |
INVPCID r64, m128 | 66 0F 38 82 /r | invpcid | Invalidates entries in the TLBs and paging-structure caches based on invalidation type in r64 and descriptor in m128. |
IRET | CF | Interrupt return (16-bit operand size). | |
IRETD | CF | Interrupt return (32-bit operand size). | |
IRETQ | REX.W + CF | Interrupt return (64-bit operand size). | |
JMP rel8 | EB cb | Jump short, RIP = RIP + 8-bit displacement sign extended to 64-bits. | |
JMP rel16 | E9 cw | Jump near, relative, displacement relative to next instruction. Not supported in 64-bit mode. | |
JMP rel32 | E9 cd | Jump near, relative, RIP = RIP + 32-bit displacement sign extended to 64-bits. | |
JMP r/m16 | FF /4 | Jump near, absolute indirect, address = zero-extended r/m16. Not supported in 64-bit mode. | |
JMP r/m32 | FF /4 | Jump near, absolute indirect, address given in r/m32. Not supported in 64-bit mode. | |
JMP r/m64 | FF /4 | Jump near, absolute indirect, RIP = 64-bit offset from register or memory. | |
JMP ptr16:16 | EA cd | Jump far, absolute, address given in operand. | |
JMP ptr16:32 | EA cp | Jump far, absolute, address given in operand. | |
JMP m16:16 | FF /5 | Jump far, absolute indirect, address given in m16:16. | |
JMP m16:32 | FF /5 | Jump far, absolute indirect, address given in m16:32. | |
JMP m16:64 | REX.W + FF /5 | Jump far, absolute indirect, address given in m16:64. | |
JA rel8 | 77 cb | Jump short if above (CF=0 and ZF=0). | |
JAE rel8 | 73 cb | Jump short if above or equal (CF=0). | |
JB rel8 | 72 cb | Jump short if below (CF=1). | |
JBE rel8 | 76 cb | Jump short if below or equal (CF=1 or ZF=1). | |
JC rel8 | 72 cb | Jump short if carry (CF=1). | |
JCXZ rel8 | E3 cb | Jump short if CX register is 0. | |
JECXZ rel8 | E3 cb | Jump short if ECX register is 0. | |
JRCXZ rel8 | E3 cb | Jump short if RCX register is 0. | |
JE rel8 | 74 cb | Jump short if equal (ZF=1). | |
JG rel8 | 7F cb | Jump short if greater (ZF=0 and SF=OF). | |
JGE rel8 | 7D cb | Jump short if greater or equal (SF=OF). | |
JL rel8 | 7C cb | Jump short if less (SF≠OF). | |
JLE rel8 | 7E cb | Jump short if less or equal (ZF=1 or SF≠OF). | |
JNA rel8 | 76 cb | Jump short if not above (CF=1 or ZF=1). | |
JNAE rel8 | 72 cb | Jump short if not above or equal (CF=1). | |
JNB rel8 | 73 cb | Jump short if not below (CF=0). | |
JNBE rel8 | 77 cb | Jump short if not below or equal (CF=0 and ZF=0). | |
JNC rel8 | 73 cb | Jump short if not carry (CF=0). | |
JNE rel8 | 75 cb | Jump short if not equal (ZF=0). | |
JNG rel8 | 7E cb | Jump short if not greater (ZF=1 or SF≠OF). | |
JNGE rel8 | 7C cb | Jump short if not greater or equal (SF≠OF). | |
JNL rel8 | 7D cb | Jump short if not less (SF=OF). | |
JNLE rel8 | 7F cb | Jump short if not less or equal (ZF=0 and SF=OF). | |
JNO rel8 | 71 cb | Jump short if not overflow (OF=0). | |
JNP rel8 | 7B cb | Jump short if not parity (PF=0). | |
JNS rel8 | 79 cb | Jump short if not sign (SF=0). | |
JNZ rel8 | 75 cb | Jump short if not zero (ZF=0). | |
JO rel8 | 70 cb | Jump short if overflow (OF=1). | |
JP rel8 | 7A cb | Jump short if parity (PF=1). | |
JPE rel8 | 7A cb | Jump short if parity even (PF=1). | |
JPO rel8 | 7B cb | Jump short if parity odd (PF=0). | |
JS rel8 | 78 cb | Jump short if sign (SF=1). | |
JZ rel8 | 74 cb | Jump short if zero (ZF=1). | |
JA rel16 | 0F 87 cw | Jump near if above (CF=0 and ZF=0). Not supported in 64-bit mode. | |
JA rel32 | 0F 87 cd | Jump near if above (CF=0 and ZF=0). | |
JAE rel16 | 0F 83 cw | Jump near if above or equal (CF=0). Not supported in 64-bit mode. | |
JAE rel32 | 0F 83 cd | Jump near if above or equal (CF=0). | |
JB rel16 | 0F 82 cw | Jump near if below (CF=1). Not supported in 64-bit mode. | |
JB rel32 | 0F 82 cd | Jump near if below (CF=1). | |
JBE rel16 | 0F 86 cw | Jump near if below or equal (CF=1 or ZF=1). Not supported in 64-bit mode. | |
JBE rel32 | 0F 86 cd | Jump near if below or equal (CF=1 or ZF=1). | |
JC rel16 | 0F 82 cw | Jump near if carry (CF=1). Not supported in 64-bit mode. | |
JC rel32 | 0F 82 cd | Jump near if carry (CF=1). | |
JE rel16 | 0F 84 cw | Jump near if equal (ZF=1). Not supported in 64-bit mode. | |
JE rel32 | 0F 84 cd | Jump near if equal (ZF=1). | |
JZ rel16 | 0F 84 cw | Jump near if 0 (ZF=1). Not supported in 64-bit mode. | |
JZ rel32 | 0F 84 cd | Jump near if 0 (ZF=1). | |
JG rel16 | 0F 8F cw | Jump near if greater (ZF=0 and SF=OF). Not supported in 64-bit mode. | |
JG rel32 | 0F 8F cd | Jump near if greater (ZF=0 and SF=OF). | |
JGE rel16 | 0F 8D cw | Jump near if greater or equal (SF=OF). Not supported in 64-bit mode. | |
JGE rel32 | 0F 8D cd | Jump near if greater or equal (SF=OF). | |
JL rel16 | 0F 8C cw | Jump near if less (SF≠OF). Not supported in 64-bit mode. | |
JL rel32 | 0F 8C cd | Jump near if less (SF≠OF). | |
JLE rel16 | 0F 8E cw | Jump near if less or equal (ZF=1 or SF≠OF). Not supported in 64-bit mode. | |
JLE rel32 | 0F 8E cd | Jump near if less or equal (ZF=1 or SF≠OF). | |
JNA rel16 | 0F 86 cw | Jump near if not above (CF=1 or ZF=1). Not supported in 64-bit mode. | |
JNA rel32 | 0F 86 cd | Jump near if not above (CF=1 or ZF=1). | |
JNAE rel16 | 0F 82 cw | Jump near if not above or equal (CF=1). Not supported in 64-bit mode. | |
JNAE rel32 | 0F 82 cd | Jump near if not above or equal (CF=1). | |
JNB rel16 | 0F 83 cw | Jump near if not below (CF=0). Not supported in 64-bit mode. | |
JNB rel32 | 0F 83 cd | Jump near if not below (CF=0). | |
JNBE rel16 | 0F 87 cw | Jump near if not below or equal (CF=0 and ZF=0). Not supported in 64-bit mode. | |
JNBE rel32 | 0F 87 cd | Jump near if not below or equal (CF=0 and ZF=0). | |
JNC rel16 | 0F 83 cw | Jump near if not carry (CF=0). Not supported in 64-bit mode. | |
JNC rel32 | 0F 83 cd | Jump near if not carry (CF=0). | |
JNE rel16 | 0F 85 cw | Jump near if not equal (ZF=0). Not supported in 64-bit mode. | |
JNE rel32 | 0F 85 cd | Jump near if not equal (ZF=0). | |
JNG rel16 | 0F 8E cw | Jump near if not greater (ZF=1 or SF≠OF). Not supported in 64-bit mode. | |
JNG rel32 | 0F 8E cd | Jump near if not greater (ZF=1 or SF≠OF). | |
JNGE rel16 | 0F 8C cw | Jump near if not greater or equal (SF≠OF). Not supported in 64-bit mode. | |
JNGE rel32 | 0F 8C cd | Jump near if not greater or equal (SF≠OF). | |
JNL rel16 | 0F 8D cw | Jump near if not less (SF=OF). Not supported in 64-bit mode. | |
JNL rel32 | 0F 8D cd | Jump near if not less (SF=OF). | |
JNLE rel16 | 0F 8F cw | Jump near if not less or equal (ZF=0 and SF=OF). Not supported in 64-bit mode. | |
JNLE rel32 | 0F 8F cd | Jump near if not less or equal (ZF=0 and SF=OF). | |
JNO rel16 | 0F 81 cw | Jump near if not overflow (OF=0). Not supported in 64-bit mode. | |
JNO rel32 | 0F 81 cd | Jump near if not overflow (OF=0). | |
JNP rel16 | 0F 8B cw | Jump near if not parity (PF=0). Not supported in 64-bit mode. | |
JNP rel32 | 0F 8B cd | Jump near if not parity (PF=0). | |
JNS rel16 | 0F 89 cw | Jump near if not sign (SF=0). Not supported in 64-bit mode. | |
JNS rel32 | 0F 89 cd | Jump near if not sign (SF=0). | |
JNZ rel16 | 0F 85 cw | Jump near if not zero (ZF=0). Not supported in 64-bit mode. | |
JNZ rel32 | 0F 85 cd | Jump near if not zero (ZF=0). | |
JO rel16 | 0F 80 cw | Jump near if overflow (OF=1). Not supported in 64-bit mode. | |
JO rel32 | 0F 80 cd | Jump near if overflow (OF=1). | |
JP rel16 | 0F 8A cw | Jump near if parity (PF=1). Not supported in 64-bit mode. | |
JP rel32 | 0F 8A cd | Jump near if parity (PF=1). | |
JPE rel16 | 0F 8A cw | Jump near if parity even (PF=1). Not supported in 64-bit mode. | |
JPE rel32 | 0F 8A cd | Jump near if parity even (PF=1). | |
JPO rel16 | 0F 8B cw | Jump near if parity odd (PF=0). Not supported in 64-bit mode. | |
JPO rel32 | 0F 8B cd | Jump near if parity odd (PF=0). | |
JS rel16 | 0F 88 cw | Jump near if sign (SF=1). Not supported in 64-bit mode. | |
JS rel32 | 0F 88 cd | Jump near if sign (SF=1). | |
JZ rel16 | 0F 84 cw | Jump near if 0 (ZF=1). Not supported in 64-bit mode. | |
JZ rel32 | 0F 84 cd | Jump near if 0 (ZF=1). | |
KADDW k1, k2, k3 | VEX.L1.0F.W0 4A /r | avx512 | Add 16-bit masks in k2 and k3 and place result in k1. |
KADDB k1, k2, k3 | VEX.L1.66.0F.W0 4A /r | avx512 | Add 8-bit masks in k2 and k3 and place result in k1. |
KADDQ k1, k2, k3 | VEX.L1.0F.W1 4A /r | avx512 | Add 64-bit masks in k2 and k3 and place result in k1. |
KADDD k1, k2, k3 | VEX.L1.66.0F.W1 4A /r | avx512 | Add 32-bit masks in k2 and k3 and place result in k1. |
KANDNW k1, k2, k3 | VEX.NDS.L1.0F.W0 42 /r | avx512 | Bitwise AND NOT 16-bit masks k2 and k3 and place result in k1. |
KANDNB k1, k2, k3 | VEX.L1.66.0F.W0 42 /r | avx512 | Bitwise AND NOT 8-bit masks k2 and k3 and place result in k1. |
KANDNQ k1, k2, k3 | VEX.L1.0F.W1 42 /r | avx512 | Bitwise AND NOT 64-bit masks k2 and k3 and place result in k1. |
KANDND k1, k2, k3 | VEX.L1.66.0F.W1 42 /r | avx512 | Bitwise AND NOT 32-bit masks k2 and k3 and place result in k1. |
KANDW k1, k2, k3 | VEX.NDS.L1.0F.W0 41 /r | avx512 | Bitwise AND 16-bit masks k2 and k3 and place result in k1. |
KANDB k1, k2, k3 | VEX.L1.66.0F.W0 41 /r | avx512 | Bitwise AND 8-bit masks k2 and k3 and place result in k1. |
KANDQ k1, k2, k3 | VEX.L1.0F.W1 41 /r | avx512 | Bitwise AND 64-bit masks k2 and k3 and place result in k1. |
KANDD k1, k2, k3 | VEX.L1.66.0F.W1 41 /r | avx512 | Bitwise AND 32-bit masks k2 and k3 and place result in k1. |
KMOVW k1, k2/m16 | VEX.L0.0F.W0 90 /r | avx512 | Move 16-bit mask from k2/m16 and store the result in k1. |
KMOVB k1, k2/m8 | VEX.L0.66.0F.W0 90 /r | avx512 | Move 8-bit mask from k2/m8 and store the result in k1. |
KMOVQ k1, k2/m64 | VEX.L0.0F.W1 90 /r | avx512 | Move 64-bit mask from k2/m64 and store the result in k1. |
KMOVD k1, k2/m32 | VEX.L0.66.0F.W1 90 /r | avx512 | Move 32-bit mask from k2/m32 and store the result in k1. |
KMOVW m16, k1 | VEX.L0.0F.W0 91 /r | avx512 | Move 16-bit mask from k1 and store the result in m16. |
KMOVB m8, k1 | VEX.L0.66.0F.W0 91 /r | avx512 | Move 8-bit mask from k1 and store the result in m8. |
KMOVQ m64, k1 | VEX.L0.0F.W1 91 /r | avx512 | Move 64-bit mask from k1 and store the result in m64. |
KMOVD m32, k1 | VEX.L0.66.0F.W1 91 /r | avx512 | Move 32-bit mask from k1 and store the result in m32. |
KMOVW k1, r32 | VEX.L0.0F.W0 92 /r | avx512 | Move 16-bit mask from r32 to k1. |
KMOVB k1, r32 | VEX.L0.66.0F.W0 92 /r | avx512 | Move 8-bit mask from r32 to k1. |
KMOVQ k1, r64 | VEX.L0.F2.0F.W1 92 /r | avx512 | Move 64-bit mask from r64 to k1. |
KMOVD k1, r32 | VEX.L0.F2.0F.W0 92 /r | avx512 | Move 32-bit mask from r32 to k1. |
KMOVW r32, k1 | VEX.L0.0F.W0 93 /r | avx512 | Move 16-bit mask from k1 to r32. |
KMOVB r32, k1 | VEX.L0.66.0F.W0 93 /r | avx512 | Move 8-bit mask from k1 to r32. |
KMOVQ r64, k1 | VEX.L0.F2.0F.W1 93 /r | avx512 | Move 64-bit mask from k1 to r64. |
KMOVD r32, k1 | VEX.L0.F2.0F.W0 93 /r | avx512 | Move 32-bit mask from k1 to r32. |
KNOTW k1, k2 | VEX.L0.0F.W0 44 /r | avx512 | Bitwise NOT of 16-bit mask k2. |
KNOTB k1, k2 | VEX.L0.66.0F.W0 44 /r | avx512 | Bitwise NOT of 8-bit mask k2. |
KNOTQ k1, k2 | VEX.L0.0F.W1 44 /r | avx512 | Bitwise NOT of 64-bit mask k2. |
KNOTD k1, k2 | VEX.L0.66.0F.W1 44 /r | avx512 | Bitwise NOT of 32-bit mask k2. |
KORTESTW k1, k2 | VEX.L0.0F.W0 98 /r | avx512 | Bitwise OR 16-bit masks k1 and k2 and update ZF and CF accordingly. |
KORTESTB k1, k2 | VEX.L0.66.0F.W0 98 /r | avx512 | Bitwise OR 8-bit masks k1 and k2 and update ZF and CF accordingly. |
KORTESTQ k1, k2 | VEX.L0.0F.W1 98 /r | avx512 | Bitwise OR 64-bit masks k1 and k2 and update ZF and CF accordingly. |
KORTESTD k1, k2 | VEX.L0.66.0F.W1 98 /r | avx512 | Bitwise OR 32-bit masks k1 and k2 and update ZF and CF accordingly. |
KORW k1, k2, k3 | VEX.NDS.L1.0F.W0 45 /r | avx512 | Bitwise OR 16-bit masks k2 and k3 and place result in k1. |
KORB k1, k2, k3 | VEX.L1.66.0F.W0 45 /r | avx512 | Bitwise OR 8-bit masks k2 and k3 and place result in k1. |
KORQ k1, k2, k3 | VEX.L1.0F.W1 45 /r | avx512 | Bitwise OR 64-bit masks k2 and k3 and place result in k1. |
KORD k1, k2, k3 | VEX.L1.66.0F.W1 45 /r | avx512 | Bitwise OR 32-bit masks k2 and k3 and place result in k1. |
KSHIFTLW k1, k2, imm8 | VEX.L0.66.0F3A.W1 32 /r | avx512 | Shift the 16-bit mask in k2 left by imm8 and write the result in k1. |
KSHIFTLB k1, k2, imm8 | VEX.L0.66.0F3A.W0 32 /r | avx512 | Shift the 8-bit mask in k2 left by imm8 and write the result in k1. |
KSHIFTLQ k1, k2, imm8 | VEX.L0.66.0F3A.W1 33 /r | avx512 | Shift the 64-bit mask in k2 left by imm8 and write the result in k1. |
KSHIFTLD k1, k2, imm8 | VEX.L0.66.0F3A.W0 33 /r | avx512 | Shift the 32-bit mask in k2 left by imm8 and write the result in k1. |
KSHIFTRW k1, k2, imm8 | VEX.L0.66.0F3A.W1 30 /r | avx512 | Shift the 16-bit mask in k2 right by imm8 and write the result in k1. |
KSHIFTRB k1, k2, imm8 | VEX.L0.66.0F3A.W0 30 /r | avx512 | Shift the 8-bit mask in k2 right by imm8 and write the result in k1. |
KSHIFTRQ k1, k2, imm8 | VEX.L0.66.0F3A.W1 31 /r | avx512 | Shift the 64-bit mask in k2 right by imm8 and write the result in k1. |
KSHIFTRD k1, k2, imm8 | VEX.L0.66.0F3A.W0 31 /r | avx512 | Shift the 32-bit mask in k2 right by imm8 and write the result in k1. |
KTESTW k1, k2 | VEX.L0.0F.W0 99 /r | avx512 | Set ZF and CF depending on sign bit AND and ANDN of 16-bit mask register sources. |
KTESTB k1, k2 | VEX.L0.66.0F.W0 99 /r | avx512 | Set ZF and CF depending on sign bit AND and ANDN of 8-bit mask register sources. |
KTESTQ k1, k2 | VEX.L0.0F.W1 99 /r | avx512 | Set ZF and CF depending on sign bit AND and ANDN of 64-bit mask register sources. |
KTESTD k1, k2 | VEX.L0.66.0F.W1 99 /r | avx512 | Set ZF and CF depending on sign bit AND and ANDN of 32-bit mask register sources. |
KUNPCKBW k1, k2, k3 | VEX.NDS.L1.66.0F.W0 4B /r | avx512 | Unpack and interleave 8-bit masks in k2 and k3 and write word result in k1. |
KUNPCKWD k1, k2, k3 | VEX.NDS.L1.0F.W0 4B /r | avx512 | Unpack and interleave 16-bit masks in k2 and k3 and write doubleword result in k1. |
KUNPCKDQ k1, k2, k3 | VEX.NDS.L1.0F.W1 4B /r | avx512 | Unpack and interleave 32-bit masks in k2 and k3 and write quadword result in k1. |
KXNORW k1, k2, k3 | VEX.NDS.L1.0F.W0 46 /r | avx512 | Bitwise XNOR 16-bit masks k2 and k3 and place result in k1. |
KXNORB k1, k2, k3 | VEX.L1.66.0F.W0 46 /r | avx512 | Bitwise XNOR 8-bit masks k2 and k3 and place result in k1. |
KXNORQ k1, k2, k3 | VEX.L1.0F.W1 46 /r | avx512 | Bitwise XNOR 64-bit masks k2 and k3 and place result in k1. |
KXNORD k1, k2, k3 | VEX.L1.66.0F.W1 46 /r | avx512 | Bitwise XNOR 32-bit masks k2 and k3 and place result in k1. |
KXORW k1, k2, k3 | VEX.NDS.L1.0F.W0 47 /r | avx512 | Bitwise XOR 16-bit masks k2 and k3 and place result in k1. |
KXORB k1, k2, k3 | VEX.L1.66.0F.W0 47 /r | avx512 | Bitwise XOR 8-bit masks k2 and k3 and place result in k1. |
KXORQ k1, k2, k3 | VEX.L1.0F.W1 47 /r | avx512 | Bitwise XOR 64-bit masks k2 and k3 and place result in k1. |
KXORD k1, k2, k3 | VEX.L1.66.0F.W1 47 /r | avx512 | Bitwise XOR 32-bit masks k2 and k3 and place result in k1. |
LAHF | 9F | Load: AH ← EFLAGS(SF:ZF:0:AF:0:PF:1:CF). | |
LAR r16, r16/m16 | 0F 02 /r | r16 ← access rights referenced by r16/m16. | |
LAR reg, r32/m16 | 0F 02 /r | reg ← access rights referenced by r32/m16. | |
LDDQU xmm1, mem | F2 0F F0 /r | sse3 | Load unaligned data from mem and return double quadword in xmm1. |
VLDDQU xmm1, m128 | VEX.128.F2.0F.WIG F0 /r | avx | Load unaligned packed integer values from mem to xmm1. |
VLDDQU ymm1, m256 | VEX.256.F2.0F.WIG F0 /r | avx | Load unaligned packed integer values from mem to ymm1. |
LDMXCSR m32 | 0F AE /2 | sse | Load MXCSR register from m32. |
VLDMXCSR m32 | VEX.LZ.0F.WIG AE /2 | avx | Load MXCSR register from m32. |
LDS r16,m16:16 | C5 /r | Load DS:r16 with far pointer from memory. | |
LDS r32,m16:32 | C5 /r | Load DS:r32 with far pointer from memory. | |
LSS r16,m16:16 | 0F B2 /r | Load SS:r16 with far pointer from memory. | |
LSS r32,m16:32 | 0F B2 /r | Load SS:r32 with far pointer from memory. | |
LSS r64,m16:64 | REX + 0F B2 /r | Load SS:r64 with far pointer from memory. | |
LES r16,m16:16 | C4 /r | Load ES:r16 with far pointer from memory. | |
LES r32,m16:32 | C4 /r | Load ES:r32 with far pointer from memory. | |
LFS r16,m16:16 | 0F B4 /r | Load FS:r16 with far pointer from memory. | |
LFS r32,m16:32 | 0F B4 /r | Load FS:r32 with far pointer from memory. | |
LFS r64,m16:64 | REX + 0F B4 /r | Load FS:r64 with far pointer from memory. | |
LGS r16,m16:16 | 0F B5 /r | Load GS:r16 with far pointer from memory. | |
LGS r32,m16:32 | 0F B5 /r | Load GS:r32 with far pointer from memory. | |
LGS r64,m16:64 | REX + 0F B5 /r | Load GS:r64 with far pointer from memory. | |
LEA r16,m | 8D /r | Store effective address for m in register r16. | |
LEA r32,m | 8D /r | Store effective address for m in register r32. | |
LEA r64,m | REX.W + 8D /r | Store effective address for m in register r64. | |
LEAVE | C9 | Set SP to BP, then pop BP. | |
LEAVE | C9 | Set ESP to EBP, then pop EBP. | |
LEAVE | C9 | Set RSP to RBP, then pop RBP. | |
LFENCE | 0F AE E8 | Serializes load operations. | |
LGDT m16&32 | 0F 01 /2 | Load m into GDTR. | |
LIDT m16&32 | 0F 01 /3 | Load m into IDTR. | |
LGDT m16&64 | 0F 01 /2 | Load m into GDTR. | |
LIDT m16&64 | 0F 01 /3 | Load m into IDTR. | |
LLDT r/m16 | 0F 00 /2 | Load segment selector r/m16 into LDTR. | |
LMSW r/m16 | 0F 01 /6 | Load r/m16 into the machine status word of CR0. | |
LOCK | F0 | Asserts LOCK# signal for duration of the accompanying instruction. | |
LODS m8 | AC | For legacy mode, load byte at address DS:(E)SI into AL. For 64-bit mode, load byte at address (R)SI into AL. | |
LODS m16 | AD | For legacy mode, load word at address DS:(E)SI into AX. For 64-bit mode, load word at address (R)SI into AX. | |
LODS m32 | AD | For legacy mode, load dword at address DS:(E)SI into EAX. For 64-bit mode, load dword at address (R)SI into EAX. | |
LODS m64 | REX.W + AD | Load qword at address (R)SI into RAX. | |
LODSB | AC | For legacy mode, load byte at address DS:(E)SI into AL. For 64-bit mode, load byte at address (R)SI into AL. | |
LODSW | AD | For legacy mode, load word at address DS:(E)SI into AX. For 64-bit mode, load word at address (R)SI into AX. | |
LODSD | AD | For legacy mode, load dword at address DS:(E)SI into EAX. For 64-bit mode, load dword at address (R)SI into EAX. | |
LODSQ | REX.W + AD | Load qword at address (R)SI into RAX. | |
LOOP rel8 | E2 cb | Decrement count; jump short if count ≠ 0. | |
LOOPE rel8 | E1 cb | Decrement count; jump short if count ≠ 0 and ZF = 1. | |
LOOPNE rel8 | E0 cb | Decrement count; jump short if count ≠ 0 and ZF = 0. | |
LSL r16, r16/m16 | 0F 03 /r | Load: r16 ← segment limit, selector r16/m16. | |
LSL r32, r32/m16 | 0F 03 /r | Load: r32 ← segment limit, selector r32/m16. | |
LSL r64, r32/m16 | REX.W + 0F 03 /r | Load: r64 ← segment limit, selector r32/m16. | |
LTR r/m16 | 0F 00 /3 | Load r/m16 into task register. | |
LZCNT r16, r/m16 | F3 0F BD /r | lzcnt | Count the number of leading zero bits in r/m16, return result in r16. |
LZCNT r32, r/m32 | F3 0F BD /r | lzcnt | Count the number of leading zero bits in r/m32, return result in r32. |
LZCNT r64, r/m64 | F3 REX.W 0F BD /r | lzcnt | Count the number of leading zero bits in r/m64, return result in r64. |
MASKMOVDQU xmm1, xmm2 | 66 0F F7 /r | sse2 | Selectively write bytes from xmm1 to memory location using the byte mask in xmm2. The default memory location is specified by DS:DI/EDI/RDI. |
VMASKMOVDQU xmm1, xmm2 | VEX.128.66.0F.WIG F7 /r | avx | Selectively write bytes from xmm1 to memory location using the byte mask in xmm2. The default memory location is specified by DS:DI/EDI/RDI. |
MASKMOVQ mm1, mm2 | 0F F7 /r | Selectively write bytes from mm1 to memory location using the byte mask in mm2. The default memory location is specified by DS:DI/EDI/RDI. | |
MAXPD xmm1, xmm2/m128 | 66 0F 5F /r | sse2 | Return the maximum double-precision floating-point values between xmm1 and xmm2/m128. |
VMAXPD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 5F /r | avx | Return the maximum double-precision floating-point values between xmm2 and xmm3/m128. |
VMAXPD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 5F /r | avx | Return the maximum packed double-precision floating-point values between ymm2 and ymm3/m256. |
VMAXPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F.W1 5F /r | avx512 | Return the maximum packed double-precision floating-point values between xmm2 and xmm3/m128/m64bcst and store result in xmm1 subject to writemask k1. |
VMAXPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F.W1 5F /r | avx512 | Return the maximum packed double-precision floating-point values between ymm2 and ymm3/m256/m64bcst and store result in ymm1 subject to writemask k1. |
VMAXPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{sae} | EVEX.NDS.512.66.0F.W1 5F /r | avx512 | Return the maximum packed double-precision floating-point values between zmm2 and zmm3/m512/m64bcst and store result in zmm1 subject to writemask k1. |
MAXPS xmm1, xmm2/m128 | 0F 5F /r | sse | Return the maximum single-precision floating-point values between xmm1 and xmm2/mem. |
VMAXPS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.0F.WIG 5F /r | avx | Return the maximum single-precision floating-point values between xmm2 and xmm3/mem. |
VMAXPS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.0F.WIG 5F /r | avx | Return the maximum single-precision floating-point values between ymm2 and ymm3/mem. |
VMAXPS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.0F.W0 5F /r | avx512 | Return the maximum packed single-precision floating-point values between xmm2 and xmm3/m128/m32bcst and store result in xmm1 subject to writemask k1. |
VMAXPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.0F.W0 5F /r | avx512 | Return the maximum packed single-precision floating-point values between ymm2 and ymm3/m256/m32bcst and store result in ymm1 subject to writemask k1. |
VMAXPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{sae} | EVEX.NDS.512.0F.W0 5F /r | avx512 | Return the maximum packed single-precision floating-point values between zmm2 and zmm3/m512/m32bcst and store result in zmm1 subject to writemask k1. |
MAXSD xmm1, xmm2/m64 | F2 0F 5F /r | sse2 | Return the maximum scalar double-precision floating-point value between xmm2/m64 and xmm1. |
VMAXSD xmm1, xmm2, xmm3/m64 | VEX.NDS.128.F2.0F.WIG 5F /r | avx | Return the maximum scalar double-precision floating-point value between xmm3/m64 and xmm2. |
VMAXSD xmm1 {k1}{z}, xmm2, xmm3/m64{sae} | EVEX.NDS.LIG.F2.0F.W1 5F /r | avx512 | Return the maximum scalar double-precision floating-point value between xmm3/m64 and xmm2. |
MAXSS xmm1, xmm2/m32 | F3 0F 5F /r | sse | Return the maximum scalar single-precision floating-point value between xmm2/m32 and xmm1. |
VMAXSS xmm1, xmm2, xmm3/m32 | VEX.NDS.128.F3.0F.WIG 5F /r | avx | Return the maximum scalar single-precision floating-point value between xmm3/m32 and xmm2. |
VMAXSS xmm1 {k1}{z}, xmm2, xmm3/m32{sae} | EVEX.NDS.LIG.F3.0F.W0 5F /r | avx512 | Return the maximum scalar single-precision floating-point value between xmm3/m32 and xmm2. |
MFENCE | 0F AE F0 | Serializes load and store operations. | |
MINPD xmm1, xmm2/m128 | 66 0F 5D /r | sse2 | Return the minimum double-precision floating-point values between xmm1 and xmm2/mem. |
VMINPD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 5D /r | avx | Return the minimum double-precision floating-point values between xmm2 and xmm3/mem. |
VMINPD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 5D /r | avx | Return the minimum packed double-precision floating-point values between ymm2 and ymm3/mem. |
VMINPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F.W1 5D /r | avx512 | Return the minimum packed double-precision floating-point values between xmm2 and xmm3/m128/m64bcst and store result in xmm1 subject to writemask k1. |
VMINPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F.W1 5D /r | avx512 | Return the minimum packed double-precision floating-point values between ymm2 and ymm3/m256/m64bcst and store result in ymm1 subject to writemask k1. |
VMINPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{sae} | EVEX.NDS.512.66.0F.W1 5D /r | avx512 | Return the minimum packed double-precision floating-point values between zmm2 and zmm3/m512/m64bcst and store result in zmm1 subject to writemask k1. |
MINPS xmm1, xmm2/m128 | 0F 5D /r | sse | Return the minimum single-precision floating-point values between xmm1 and xmm2/mem. |
VMINPS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.0F.WIG 5D /r | avx | Return the minimum single-precision floating-point values between xmm2 and xmm3/mem. |
VMINPS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.0F.WIG 5D /r | avx | Return the minimum single-precision floating-point values between ymm2 and ymm3/mem. |
VMINPS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.0F.W0 5D /r | avx512 | Return the minimum packed single-precision floating-point values between xmm2 and xmm3/m128/m32bcst and store result in xmm1 subject to writemask k1. |
VMINPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.0F.W0 5D /r | avx512 | Return the minimum packed single-precision floating-point values between ymm2 and ymm3/m256/m32bcst and store result in ymm1 subject to writemask k1. |
VMINPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{sae} | EVEX.NDS.512.0F.W0 5D /r | avx512 | Return the minimum packed single-precision floating-point values between zmm2 and zmm3/m512/m32bcst and store result in zmm1 subject to writemask k1. |
MINSD xmm1, xmm2/m64 | F2 0F 5D /r | sse2 | Return the minimum scalar double-precision floating-point value between xmm2/m64 and xmm1. |
VMINSD xmm1, xmm2, xmm3/m64 | VEX.NDS.128.F2.0F.WIG 5D /r | avx | Return the minimum scalar double-precision floating-point value between xmm3/m64 and xmm2. |
VMINSD xmm1 {k1}{z}, xmm2, xmm3/m64{sae} | EVEX.NDS.LIG.F2.0F.W1 5D /r | avx512 | Return the minimum scalar double-precision floating-point value between xmm3/m64 and xmm2. |
MINSS xmm1,xmm2/m32 | F3 0F 5D /r | sse | Return the minimum scalar single-precision floating-point value between xmm2/m32 and xmm1. |
VMINSS xmm1,xmm2, xmm3/m32 | VEX.NDS.128.F3.0F.WIG 5D /r | avx | Return the minimum scalar single-precision floating-point value between xmm3/m32 and xmm2. |
VMINSS xmm1 {k1}{z}, xmm2, xmm3/m32{sae} | EVEX.NDS.LIG.F3.0F.W0 5D /r | avx512 | Return the minimum scalar single-precision floating-point value between xmm3/m32 and xmm2. |
MONITOR | 0F 01 C8 | Sets up a linear address range to be monitored by hardware and activates the monitor. The address range should be a write-back memory caching type. The address is DS:EAX (DS:RAX in 64-bit mode). | |
MOV r32, CR0–CR7 | 0F 20/r | Move control register to r32. | |
MOV r64, CR0–CR7 | 0F 20/r | Move extended control register to r64. | |
MOV r64, CR8 | REX.R + 0F 20 /0 | Move extended CR8 to r64. | |
MOV CR0–CR7, r32 | 0F 22 /r | Move r32 to control register. | |
MOV CR0–CR7, r64 | 0F 22 /r | Move r64 to extended control register. | |
MOV CR8, r64 | REX.R + 0F 22 /0 | Move r64 to extended CR8. | |
MOV r32, DR0–DR7 | 0F 21 /r | Move debug register to r32. | |
MOV r64, DR0–DR7 | 0F 21 /r | Move extended debug register to r64. | |
MOV DR0–DR7, r32 | 0F 23 /r | Move r32 to debug register. | |
MOV DR0–DR7, r64 | 0F 23 /r | Move r64 to extended debug register. | |
MOV r/m8, r8 | 88 /r | Move r8 to r/m8. | |
MOV r/m8, r8 | REX + 88 /r | Move r8 to r/m8. | |
MOV r/m16, r16 | 89 /r | Move r16 to r/m16. | |
MOV r/m32, r32 | 89 /r | Move r32 to r/m32. | |
MOV r/m64, r64 | REX.W + 89 /r | Move r64 to r/m64. | |
MOV r8, r/m8 | 8A /r | Move r/m8 to r8. | |
MOV r8, r/m8 | REX + 8A /r | Move r/m8 to r8. | |
MOV r16, r/m16 | 8B /r | Move r/m16 to r16. | |
MOV r32, r/m32 | 8B /r | Move r/m32 to r32. | |
MOV r64, r/m64 | REX.W + 8B /r | Move r/m64 to r64. | |
MOV r/m16, Sreg | 8C /r | Move segment register to r/m16. | |
MOV r/m64, Sreg | REX.W + 8C /r | Move zero extended 16-bit segment register to r/m64. | |
MOV Sreg, r/m16 | 8E /r | Move r/m16 to segment register. | |
MOV Sreg, r/m64 | REX.W + 8E /r | Move lower 16 bits of r/m64 to segment register. | |
MOV AL, moffs8 | A0 | Move byte at (seg:offset) to AL. | |
MOV AL, moffs8 | REX.W + A0 | Move byte at (offset) to AL. | |
MOV AX, moffs16 | A1 | Move word at (seg:offset) to AX. | |
MOV EAX, moffs32 | A1 | Move doubleword at (seg:offset) to EAX. | |
MOV RAX, moffs64 | REX.W + A1 | Move quadword at (offset) to RAX. | |
MOV moffs8, AL | A2 | Move AL to (seg:offset). | |
MOV moffs8, AL | REX.W + A2 | Move AL to (offset). | |
MOV moffs16, AX | A3 | Move AX to (seg:offset). | |
MOV moffs32, EAX | A3 | Move EAX to (seg:offset). | |
MOV moffs64, RAX | REX.W + A3 | Move RAX to (offset). | |
MOV r8, imm8 | B0+ rb ib | Move imm8 to r8. | |
MOV r8, imm8 | REX + B0+ rb ib | Move imm8 to r8. | |
MOV r16, imm16 | B8+ rw iw | Move imm16 to r16. | |
MOV r32, imm32 | B8+ rd id | Move imm32 to r32. | |
MOV r64, imm64 | REX.W + B8+ rd io | Move imm64 to r64. | |
MOV r/m8, imm8 | C6 /0 ib | Move imm8 to r/m8. | |
MOV r/m8, imm8 | REX + C6 /0 ib | Move imm8 to r/m8. | |
MOV r/m16, imm16 | C7 /0 iw | Move imm16 to r/m16. | |
MOV r/m32, imm32 | C7 /0 id | Move imm32 to r/m32. | |
MOV r/m64, imm32 | REX.W + C7 /0 id | Move imm32 sign extended to 64-bits to r/m64. | |
MOVAPD xmm1, xmm2/m128 | 66 0F 28 /r | sse2 | Move aligned packed double-precision floating-point values from xmm2/mem to xmm1. |
MOVAPD xmm2/m128, xmm1 | 66 0F 29 /r | sse2 | Move aligned packed double-precision floating-point values from xmm1 to xmm2/mem. |
VMOVAPD xmm1, xmm2/m128 | VEX.128.66.0F.WIG 28 /r | avx | Move aligned packed double-precision floating-point values from xmm2/mem to xmm1. |
VMOVAPD xmm2/m128, xmm1 | VEX.128.66.0F.WIG 29 /r | avx | Move aligned packed double-precision floating-point values from xmm1 to xmm2/mem. |
VMOVAPD ymm1, ymm2/m256 | VEX.256.66.0F.WIG 28 /r | avx | Move aligned packed double-precision floating-point values from ymm2/mem to ymm1. |
VMOVAPD ymm2/m256, ymm1 | VEX.256.66.0F.WIG 29 /r | avx | Move aligned packed double-precision floating-point values from ymm1 to ymm2/mem. |
VMOVAPD xmm1 {k1}{z}, xmm2/m128 | EVEX.128.66.0F.W1 28 /r | avx512 | Move aligned packed double-precision floating-point values from xmm2/m128 to xmm1 using writemask k1. |
VMOVAPD ymm1 {k1}{z}, ymm2/m256 | EVEX.256.66.0F.W1 28 /r | avx512 | Move aligned packed double-precision floating-point values from ymm2/m256 to ymm1 using writemask k1. |
VMOVAPD zmm1 {k1}{z}, zmm2/m512 | EVEX.512.66.0F.W1 28 /r | avx512 | Move aligned packed double-precision floating-point values from zmm2/m512 to zmm1 using writemask k1. |
VMOVAPD xmm2/m128 {k1}{z}, xmm1 | EVEX.128.66.0F.W1 29 /r | avx512 | Move aligned packed double-precision floating-point values from xmm1 to xmm2/m128 using writemask k1. |
VMOVAPD ymm2/m256 {k1}{z}, ymm1 | EVEX.256.66.0F.W1 29 /r | avx512 | Move aligned packed double-precision floating-point values from ymm1 to ymm2/m256 using writemask k1. |
VMOVAPD zmm2/m512 {k1}{z}, zmm1 | EVEX.512.66.0F.W1 29 /r | avx512 | Move aligned packed double-precision floating-point values from zmm1 to zmm2/m512 using writemask k1. |
MOVAPS xmm1, xmm2/m128 | 0F 28 /r | sse | Move aligned packed single-precision floating-point values from xmm2/mem to xmm1. |
MOVAPS xmm2/m128, xmm1 | 0F 29 /r | sse | Move aligned packed single-precision floating-point values from xmm1 to xmm2/mem. |
VMOVAPS xmm1, xmm2/m128 | VEX.128.0F.WIG 28 /r | avx | Move aligned packed single-precision floating-point values from xmm2/mem to xmm1. |
VMOVAPS xmm2/m128, xmm1 | VEX.128.0F.WIG 29 /r | avx | Move aligned packed single-precision floating-point values from xmm1 to xmm2/mem. |
VMOVAPS ymm1, ymm2/m256 | VEX.256.0F.WIG 28 /r | avx | Move aligned packed single-precision floating-point values from ymm2/mem to ymm1. |
VMOVAPS ymm2/m256, ymm1 | VEX.256.0F.WIG 29 /r | avx | Move aligned packed single-precision floating-point values from ymm1 to ymm2/mem. |
VMOVAPS xmm1 {k1}{z}, xmm2/m128 | EVEX.128.0F.W0 28 /r | avx512 | Move aligned packed single-precision floating-point values from xmm2/m128 to xmm1 using writemask k1. |
VMOVAPS ymm1 {k1}{z}, ymm2/m256 | EVEX.256.0F.W0 28 /r | avx512 | Move aligned packed single-precision floating-point values from ymm2/m256 to ymm1 using writemask k1. |
VMOVAPS zmm1 {k1}{z}, zmm2/m512 | EVEX.512.0F.W0 28 /r | avx512 | Move aligned packed single-precision floating-point values from zmm2/m512 to zmm1 using writemask k1. |
VMOVAPS xmm2/m128 {k1}{z}, xmm1 | EVEX.128.0F.W0 29 /r | avx512 | Move aligned packed single-precision floating-point values from xmm1 to xmm2/m128 using writemask k1. |
VMOVAPS ymm2/m256 {k1}{z}, ymm1 | EVEX.256.0F.W0 29 /r | avx512 | Move aligned packed single-precision floating-point values from ymm1 to ymm2/m256 using writemask k1. |
VMOVAPS zmm2/m512 {k1}{z}, zmm1 | EVEX.512.0F.W0 29 /r | avx512 | Move aligned packed single-precision floating-point values from zmm1 to zmm2/m512 using writemask k1. |
MOVBE r16, m16 | 0F 38 F0 /r | Reverse byte order in m16 and move to r16. | |
MOVBE r32, m32 | 0F 38 F0 /r | Reverse byte order in m32 and move to r32. | |
MOVBE r64, m64 | REX.W + 0F 38 F0 /r | Reverse byte order in m64 and move to r64. | |
MOVBE m16, r16 | 0F 38 F1 /r | Reverse byte order in r16 and move to m16. | |
MOVBE m32, r32 | 0F 38 F1 /r | Reverse byte order in r32 and move to m32. | |
MOVBE m64, r64 | REX.W + 0F 38 F1 /r | Reverse byte order in r64 and move to m64. | |
MOVDDUP xmm1, xmm2/m64 | F2 0F 12 /r | sse3 | Move double-precision floating-point value from xmm2/m64 and duplicate into xmm1. |
VMOVDDUP xmm1, xmm2/m64 | VEX.128.F2.0F.WIG 12 /r | avx | Move double-precision floating-point value from xmm2/m64 and duplicate into xmm1. |
VMOVDDUP ymm1, ymm2/m256 | VEX.256.F2.0F.WIG 12 /r | avx | Move even index double-precision floating-point values from ymm2/mem and duplicate each element into ymm1. |
VMOVDDUP xmm1 {k1}{z}, xmm2/m64 | EVEX.128.F2.0F.W1 12 /r | avx512 | Move double-precision floating-point value from xmm2/m64 and duplicate each element into xmm1 subject to writemask k1. |
VMOVDDUP ymm1 {k1}{z}, ymm2/m256 | EVEX.256.F2.0F.W1 12 /r | avx512 | Move even index double-precision floating-point values from ymm2/m256 and duplicate each element into ymm1 subject to writemask k1. |
VMOVDDUP zmm1 {k1}{z}, zmm2/m512 | EVEX.512.F2.0F.W1 12 /r | avx512 | Move even index double-precision floating-point values from zmm2/m512 and duplicate each element into zmm1 subject to writemask k1. |
MOVDQ2Q mm, xmm | F2 0F D6 /r | Move low quadword from xmm to mmx register. | |
MOVDQA xmm1, xmm2/m128 | 66 0F 6F /r | sse2 | Move aligned packed integer values from xmm2/mem to xmm1. |
MOVDQA xmm2/m128, xmm1 | 66 0F 7F /r | sse2 | Move aligned packed integer values from xmm1 to xmm2/mem. |
VMOVDQA xmm1, xmm2/m128 | VEX.128.66.0F.WIG 6F /r | avx | Move aligned packed integer values from xmm2/mem to xmm1. |
VMOVDQA xmm2/m128, xmm1 | VEX.128.66.0F.WIG 7F /r | avx | Move aligned packed integer values from xmm1 to xmm2/mem. |
VMOVDQA ymm1, ymm2/m256 | VEX.256.66.0F.WIG 6F /r | avx | Move aligned packed integer values from ymm2/mem to ymm1. |
VMOVDQA ymm2/m256, ymm1 | VEX.256.66.0F.WIG 7F /r | avx | Move aligned packed integer values from ymm1 to ymm2/mem. |
VMOVDQA32 xmm1 {k1}{z}, xmm2/m128 | EVEX.128.66.0F.W0 6F /r | avx512 | Move aligned packed doubleword integer values from xmm2/m128 to xmm1 using writemask k1. |
VMOVDQA32 ymm1 {k1}{z}, ymm2/m256 | EVEX.256.66.0F.W0 6F /r | avx512 | Move aligned packed doubleword integer values from ymm2/m256 to ymm1 using writemask k1. |
VMOVDQA32 zmm1 {k1}{z}, zmm2/m512 | EVEX.512.66.0F.W0 6F /r | avx512 | Move aligned packed doubleword integer values from zmm2/m512 to zmm1 using writemask k1. |
VMOVDQA32 xmm2/m128 {k1}{z}, xmm1 | EVEX.128.66.0F.W0 7F /r | avx512 | Move aligned packed doubleword integer values from xmm1 to xmm2/m128 using writemask k1. |
VMOVDQA32 ymm2/m256 {k1}{z}, ymm1 | EVEX.256.66.0F.W0 7F /r | avx512 | Move aligned packed doubleword integer values from ymm1 to ymm2/m256 using writemask k1. |
VMOVDQA32 zmm2/m512 {k1}{z}, zmm1 | EVEX.512.66.0F.W0 7F /r | avx512 | Move aligned packed doubleword integer values from zmm1 to zmm2/m512 using writemask k1. |
VMOVDQA64 xmm1 {k1}{z}, xmm2/m128 | EVEX.128.66.0F.W1 6F /r | avx512 | Move aligned quadword integer values from xmm2/m128 to xmm1 using writemask k1. |
VMOVDQA64 ymm1 {k1}{z}, ymm2/m256 | EVEX.256.66.0F.W1 6F /r | avx512 | Move aligned quadword integer values from ymm2/m256 to ymm1 using writemask k1. |
VMOVDQA64 zmm1 {k1}{z}, zmm2/m512 | EVEX.512.66.0F.W1 6F /r | avx512 | Move aligned packed quadword integer values from zmm2/m512 to zmm1 using writemask k1. |
VMOVDQA64 xmm2/m128 {k1}{z}, xmm1 | EVEX.128.66.0F.W1 7F /r | avx512 | Move aligned packed quadword integer values from xmm1 to xmm2/m128 using writemask k1. |
VMOVDQA64 ymm2/m256 {k1}{z}, ymm1 | EVEX.256.66.0F.W1 7F /r | avx512 | Move aligned packed quadword integer values from ymm1 to ymm2/m256 using writemask k1. |
VMOVDQA64 zmm2/m512 {k1}{z}, zmm1 | EVEX.512.66.0F.W1 7F /r | avx512 | Move aligned packed quadword integer values from zmm1 to zmm2/m512 using writemask k1. |
MOVDQU xmm1, xmm2/m128 | F3 0F 6F /r | sse2 | Move unaligned packed integer values from xmm2/m128 to xmm1. |
MOVDQU xmm2/m128, xmm1 | F3 0F 7F /r | sse2 | Move unaligned packed integer values from xmm1 to xmm2/m128. |
VMOVDQU xmm1, xmm2/m128 | VEX.128.F3.0F.WIG 6F /r | avx | Move unaligned packed integer values from xmm2/m128 to xmm1. |
VMOVDQU xmm2/m128, xmm1 | VEX.128.F3.0F.WIG 7F /r | avx | Move unaligned packed integer values from xmm1 to xmm2/m128. |
VMOVDQU ymm1, ymm2/m256 | VEX.256.F3.0F.WIG 6F /r | avx | Move unaligned packed integer values from ymm2/m256 to ymm1. |
VMOVDQU ymm2/m256, ymm1 | VEX.256.F3.0F.WIG 7F /r | avx | Move unaligned packed integer values from ymm1 to ymm2/m256. |
VMOVDQU8 xmm1 {k1}{z}, xmm2/m128 | EVEX.128.F2.0F.W0 6F /r | avx512 | Move unaligned packed byte integer values from xmm2/m128 to xmm1 using writemask k1. |
VMOVDQU8 ymm1 {k1}{z}, ymm2/m256 | EVEX.256.F2.0F.W0 6F /r | avx512 | Move unaligned packed byte integer values from ymm2/m256 to ymm1 using writemask k1. |
VMOVDQU8 zmm1 {k1}{z}, zmm2/m512 | EVEX.512.F2.0F.W0 6F /r | avx512 | Move unaligned packed byte integer values from zmm2/m512 to zmm1 using writemask k1. |
VMOVDQU8 xmm2/m128 {k1}{z}, xmm1 | EVEX.128.F2.0F.W0 7F /r | avx512 | Move unaligned packed byte integer values from xmm1 to xmm2/m128 using writemask k1. |
VMOVDQU8 ymm2/m256 {k1}{z}, ymm1 | EVEX.256.F2.0F.W0 7F /r | avx512 | Move unaligned packed byte integer values from ymm1 to ymm2/m256 using writemask k1. |
VMOVDQU8 zmm2/m512 {k1}{z}, zmm1 | EVEX.512.F2.0F.W0 7F /r | avx512 | Move unaligned packed byte integer values from zmm1 to zmm2/m512 using writemask k1. |
VMOVDQU16 xmm1 {k1}{z}, xmm2/m128 | EVEX.128.F2.0F.W1 6F /r | avx512 | Move unaligned packed word integer values from xmm2/m128 to xmm1 using writemask k1. |
VMOVDQU16 ymm1 {k1}{z}, ymm2/m256 | EVEX.256.F2.0F.W1 6F /r | avx512 | Move unaligned packed word integer values from ymm2/m256 to ymm1 using writemask k1. |
VMOVDQU16 zmm1 {k1}{z}, zmm2/m512 | EVEX.512.F2.0F.W1 6F /r | avx512 | Move unaligned packed word integer values from zmm2/m512 to zmm1 using writemask k1. |
VMOVDQU16 xmm2/m128 {k1}{z}, xmm1 | EVEX.128.F2.0F.W1 7F /r | avx512 | Move unaligned packed word integer values from xmm1 to xmm2/m128 using writemask k1. |
VMOVDQU16 ymm2/m256 {k1}{z}, ymm1 | EVEX.256.F2.0F.W1 7F /r | avx512 | Move unaligned packed word integer values from ymm1 to ymm2/m256 using writemask k1. |
VMOVDQU16 zmm2/m512 {k1}{z}, zmm1 | EVEX.512.F2.0F.W1 7F /r | avx512 | Move unaligned packed word integer values from zmm1 to zmm2/m512 using writemask k1. |
VMOVDQU32 xmm1 {k1}{z}, xmm2/m128 | EVEX.128.F3.0F.W0 6F /r | avx512 | Move unaligned packed doubleword integer values from xmm2/m128 to xmm1 using writemask k1. |
VMOVDQU32 ymm1 {k1}{z}, ymm2/m256 | EVEX.256.F3.0F.W0 6F /r | avx512 | Move unaligned packed doubleword integer values from ymm2/m256 to ymm1 using writemask k1. |
VMOVDQU32 zmm1 {k1}{z}, zmm2/m512 | EVEX.512.F3.0F.W0 6F /r | avx512 | Move unaligned packed doubleword integer values from zmm2/m512 to zmm1 using writemask k1. |
VMOVDQU32 xmm2/m128 {k1}{z}, xmm1 | EVEX.128.F3.0F.W0 7F /r | avx512 | Move unaligned packed doubleword integer values from xmm1 to xmm2/m128 using writemask k1. |
VMOVDQU32 ymm2/m256 {k1}{z}, ymm1 | EVEX.256.F3.0F.W0 7F /r | avx512 | Move unaligned packed doubleword integer values from ymm1 to ymm2/m256 using writemask k1. |
VMOVDQU32 zmm2/m512 {k1}{z}, zmm1 | EVEX.512.F3.0F.W0 7F /r | avx512 | Move unaligned packed doubleword integer values from zmm1 to zmm2/m512 using writemask k1. |
VMOVDQU64 xmm1 {k1}{z}, xmm2/m128 | EVEX.128.F3.0F.W1 6F /r | avx512 | Move unaligned packed quadword integer values from xmm2/m128 to xmm1 using writemask k1. |
VMOVDQU64 ymm1 {k1}{z}, ymm2/m256 | EVEX.256.F3.0F.W1 6F /r | avx512 | Move unaligned packed quadword integer values from ymm2/m256 to ymm1 using writemask k1. |
VMOVDQU64 zmm1 {k1}{z}, zmm2/m512 | EVEX.512.F3.0F.W1 6F /r | avx512 | Move unaligned packed quadword integer values from zmm2/m512 to zmm1 using writemask k1. |
VMOVDQU64 xmm2/m128 {k1}{z}, xmm1 | EVEX.128.F3.0F.W1 7F /r | avx512 | Move unaligned packed quadword integer values from xmm1 to xmm2/m128 using writemask k1. |
VMOVDQU64 ymm2/m256 {k1}{z}, ymm1 | EVEX.256.F3.0F.W1 7F /r | avx512 | Move unaligned packed quadword integer values from ymm1 to ymm2/m256 using writemask k1. |
VMOVDQU64 zmm2/m512 {k1}{z}, zmm1 | EVEX.512.F3.0F.W1 7F /r | avx512 | Move unaligned packed quadword integer values from zmm1 to zmm2/m512 using writemask k1. |
MOVD mm, r/m32 | 0F 6E /r | mmx | Move doubleword from r/m32 to mm. |
MOVQ mm, r/m64 | REX.W + 0F 6E /r | mmx | Move quadword from r/m64 to mm. |
MOVD r/m32, mm | 0F 7E /r | mmx | Move doubleword from mm to r/m32. |
MOVQ r/m64, mm | REX.W + 0F 7E /r | mmx | Move quadword from mm to r/m64. |
MOVD xmm, r/m32 | 66 0F 6E /r | sse2 | Move doubleword from r/m32 to xmm. |
MOVQ xmm, r/m64 | 66 REX.W 0F 6E /r | sse2 | Move quadword from r/m64 to xmm. |
MOVD r/m32, xmm | 66 0F 7E /r | sse2 | Move doubleword from xmm register to r/m32. |
MOVQ r/m64, xmm | 66 REX.W 0F 7E /r | sse2 | Move quadword from xmm register to r/m64. |
VMOVD xmm1, r32/m32 | VEX.128.66.0F.W0 6E /r | avx | Move doubleword from r/m32 to xmm1. |
VMOVQ xmm1, r64/m64 | VEX.128.66.0F.W1 6E /r | avx | Move quadword from r/m64 to xmm1. |
VMOVD r32/m32, xmm1 | VEX.128.66.0F.W0 7E /r | avx | Move doubleword from xmm1 register to r/m32. |
VMOVQ r64/m64, xmm1 | VEX.128.66.0F.W1 7E /r | avx | Move quadword from xmm1 register to r/m64. |
VMOVD xmm1, r32/m32 | EVEX.128.66.0F.W0 6E /r | avx512 | Move doubleword from r/m32 to xmm1. |
VMOVQ xmm1, r64/m64 | EVEX.128.66.0F.W1 6E /r | avx512 | Move quadword from r/m64 to xmm1. |
VMOVD r32/m32, xmm1 | EVEX.128.66.0F.W0 7E /r | avx512 | Move doubleword from xmm1 register to r/m32. |
VMOVQ r64/m64, xmm1 | EVEX.128.66.0F.W1 7E /r | avx512 | Move quadword from xmm1 register to r/m64. |
MOVHLPS xmm1, xmm2 | 0F 12 /r | sse | Move two packed single-precision floating-point values from high quadword of xmm2 to low quadword of xmm1. |
VMOVHLPS xmm1, xmm2, xmm3 | VEX.NDS.128.0F.WIG 12 /r | avx | Merge two packed single-precision floating-point values from high quadword of xmm3 and low quadword of xmm2. |
VMOVHLPS xmm1, xmm2, xmm3 | EVEX.NDS.128.0F.W0 12 /r | avx512 | Merge two packed single-precision floating-point values from high quadword of xmm3 and low quadword of xmm2. |
MOVHPD xmm1, m64 | 66 0F 16 /r | sse2 | Move double-precision floating-point value from m64 to high quadword of xmm1. |
VMOVHPD xmm2, xmm1, m64 | VEX.NDS.128.66.0F.WIG 16 /r | avx | Merge double-precision floating-point value from m64 and the low quadword of xmm1. |
VMOVHPD xmm2, xmm1, m64 | EVEX.NDS.128.66.0F.W1 16 /r | avx512 | Merge double-precision floating-point value from m64 and the low quadword of xmm1. |
MOVHPD m64, xmm1 | 66 0F 17 /r | sse2 | Move double-precision floating-point value from high quadword of xmm1 to m64. |
VMOVHPD m64, xmm1 | VEX.128.66.0F.WIG 17 /r | avx | Move double-precision floating-point value from high quadword of xmm1 to m64. |
VMOVHPD m64, xmm1 | EVEX.128.66.0F.W1 17 /r | avx512 | Move double-precision floating-point value from high quadword of xmm1 to m64. |
MOVHPS xmm1, m64 | 0F 16 /r | sse | Move two packed single-precision floating-point values from m64 to high quadword of xmm1. |
VMOVHPS xmm2, xmm1, m64 | VEX.NDS.128.0F.WIG 16 /r | avx | Merge two packed single-precision floating-point values from m64 and the low quadword of xmm1. |
VMOVHPS xmm2, xmm1, m64 | EVEX.NDS.128.0F.W0 16 /r | avx512 | Merge two packed single-precision floating-point values from m64 and the low quadword of xmm1. |
MOVHPS m64, xmm1 | 0F 17 /r | sse | Move two packed single-precision floating-point values from high quadword of xmm1 to m64. |
VMOVHPS m64, xmm1 | VEX.128.0F.WIG 17 /r | avx | Move two packed single-precision floating-point values from high quadword of xmm1 to m64. |
VMOVHPS m64, xmm1 | EVEX.128.0F.W0 17 /r | avx512 | Move two packed single-precision floating-point values from high quadword of xmm1 to m64. |
MOVLHPS xmm1, xmm2 | 0F 16 /r | sse | Move two packed single-precision floating-point values from low quadword of xmm2 to high quadword of xmm1. |
VMOVLHPS xmm1, xmm2, xmm3 | VEX.NDS.128.0F.WIG 16 /r | avx | Merge two packed single-precision floating-point values from low quadword of xmm3 and low quadword of xmm2. |
VMOVLHPS xmm1, xmm2, xmm3 | EVEX.NDS.128.0F.W0 16 /r | avx512 | Merge two packed single-precision floating-point values from low quadword of xmm3 and low quadword of xmm2. |
MOVLPD xmm1, m64 | 66 0F 12 /r | sse2 | Move double-precision floating-point value from m64 to low quadword of xmm1. |
VMOVLPD xmm2, xmm1, m64 | VEX.NDS.128.66.0F.WIG 12 /r | avx | Merge double-precision floating-point value from m64 and the high quadword of xmm1. |
VMOVLPD xmm2, xmm1, m64 | EVEX.NDS.128.66.0F.W1 12 /r | avx512 | Merge double-precision floating-point value from m64 and the high quadword of xmm1. |
MOVLPD m64, xmm1 | 66 0F 13 /r | sse2 | Move double-precision floating-point value from low quadword of xmm1 to m64. |
VMOVLPD m64, xmm1 | VEX.128.66.0F.WIG 13 /r | avx | Move double-precision floating-point value from low quadword of xmm1 to m64. |
VMOVLPD m64, xmm1 | EVEX.128.66.0F.W1 13 /r | avx512 | Move double-precision floating-point value from low quadword of xmm1 to m64. |
MOVLPS xmm1, m64 | 0F 12 /r | sse | Move two packed single-precision floating-point values from m64 to low quadword of xmm1. |
VMOVLPS xmm2, xmm1, m64 | VEX.NDS.128.0F.WIG 12 /r | avx | Merge two packed single-precision floating-point values from m64 and the high quadword of xmm1. |
VMOVLPS xmm2, xmm1, m64 | EVEX.NDS.128.0F.W0 12 /r | avx512 | Merge two packed single-precision floating-point values from m64 and the high quadword of xmm1. |
MOVLPS m64, xmm1 | 0F 13 /r | sse | Move two packed single-precision floating-point values from low quadword of xmm1 to m64. |
VMOVLPS m64, xmm1 | VEX.128.0F.WIG 13 /r | avx | Move two packed single-precision floating-point values from low quadword of xmm1 to m64. |
VMOVLPS m64, xmm1 | EVEX.128.0F.W0 13 /r | avx512 | Move two packed single-precision floating-point values from low quadword of xmm1 to m64. |
MOVMSKPD reg, xmm | 66 0F 50 /r | sse2 | Extract 2-bit sign mask from xmm and store in reg. The upper bits of r32 or r64 are filled with zeros. |
VMOVMSKPD reg, xmm2 | VEX.128.66.0F.WIG 50 /r | avx | Extract 2-bit sign mask from xmm2 and store in reg. The upper bits of r32 or r64 are zeroed. |
VMOVMSKPD reg, ymm2 | VEX.256.66.0F.WIG 50 /r | avx | Extract 4-bit sign mask from ymm2 and store in reg. The upper bits of r32 or r64 are zeroed. |
MOVMSKPS reg, xmm | 0F 50 /r | sse | Extract 4-bit sign mask from xmm and store in reg. The upper bits of r32 or r64 are filled with zeros. |
VMOVMSKPS reg, xmm2 | VEX.128.0F.WIG 50 /r | avx | Extract 4-bit sign mask from xmm2 and store in reg. The upper bits of r32 or r64 are zeroed. |
VMOVMSKPS reg, ymm2 | VEX.256.0F.WIG 50 /r | avx | Extract 8-bit sign mask from ymm2 and store in reg. The upper bits of r32 or r64 are zeroed. |
MOVNTDQ m128, xmm1 | 66 0F E7 /r | sse2 | Move packed integer values in xmm1 to m128 using non-temporal hint. |
VMOVNTDQ m128, xmm1 | VEX.128.66.0F.WIG E7 /r | avx | Move packed integer values in xmm1 to m128 using non-temporal hint. |
VMOVNTDQ m256, ymm1 | VEX.256.66.0F.WIG E7 /r | avx | Move packed integer values in ymm1 to m256 using non-temporal hint. |
VMOVNTDQ m128, xmm1 | EVEX.128.66.0F.W0 E7 /r | avx512 | Move packed integer values in xmm1 to m128 using non-temporal hint. |
VMOVNTDQ m256, ymm1 | EVEX.256.66.0F.W0 E7 /r | avx512 | Move packed integer values in ymm1 to m256 using non-temporal hint. |
VMOVNTDQ m512, zmm1 | EVEX.512.66.0F.W0 E7 /r | avx512 | Move packed integer values in zmm1 to m512 using non-temporal hint. |
MOVNTDQA xmm1, m128 | 66 0F 38 2A /r | sse4.1 | Move double quadword from m128 to xmm1 using non-temporal hint if WC memory type. |
VMOVNTDQA xmm1, m128 | VEX.128.66.0F38.WIG 2A /r | avx | Move double quadword from m128 to xmm using non-temporal hint if WC memory type. |
VMOVNTDQA ymm1, m256 | VEX.256.66.0F38.WIG 2A /r | avx2 | Move 256-bit data from m256 to ymm using non-temporal hint if WC memory type. |
VMOVNTDQA xmm1, m128 | EVEX.128.66.0F38.W0 2A /r | avx512 | Move 128-bit data from m128 to xmm using non-temporal hint if WC memory type. |
VMOVNTDQA ymm1, m256 | EVEX.256.66.0F38.W0 2A /r | avx512 | Move 256-bit data from m256 to ymm using non-temporal hint if WC memory type. |
VMOVNTDQA zmm1, m512 | EVEX.512.66.0F38.W0 2A /r | avx512 | Move 512-bit data from m512 to zmm using non-temporal hint if WC memory type. |
MOVNTI m32, r32 | 0F C3 /r | Move doubleword from r32 to m32 using non-temporal hint. | |
MOVNTI m64, r64 | REX.W + 0F C3 /r | Move quadword from r64 to m64 using non-temporal hint. | |
MOVNTPD m128, xmm1 | 66 0F 2B /r | sse2 | Move packed double-precision values in xmm1 to m128 using non-temporal hint. |
VMOVNTPD m128, xmm1 | VEX.128.66.0F.WIG 2B /r | avx | Move packed double-precision values in xmm1 to m128 using non-temporal hint. |
VMOVNTPD m256, ymm1 | VEX.256.66.0F.WIG 2B /r | avx | Move packed double-precision values in ymm1 to m256 using non-temporal hint. |
VMOVNTPD m128, xmm1 | EVEX.128.66.0F.W1 2B /r | avx512 | Move packed double-precision values in xmm1 to m128 using non-temporal hint. |
VMOVNTPD m256, ymm1 | EVEX.256.66.0F.W1 2B /r | avx512 | Move packed double-precision values in ymm1 to m256 using non-temporal hint. |
VMOVNTPD m512, zmm1 | EVEX.512.66.0F.W1 2B /r | avx512 | Move packed double-precision values in zmm1 to m512 using non-temporal hint. |
MOVNTPS m128, xmm1 | 0F 2B /r | sse | Move packed single-precision values in xmm1 to m128 using non-temporal hint. |
VMOVNTPS m128, xmm1 | VEX.128.0F.WIG 2B /r | avx | Move packed single-precision values in xmm1 to m128 using non-temporal hint. |
VMOVNTPS m256, ymm1 | VEX.256.0F.WIG 2B /r | avx | Move packed single-precision values in ymm1 to m256 using non-temporal hint. |
VMOVNTPS m128, xmm1 | EVEX.128.0F.W0 2B /r | avx512 | Move packed single-precision values in xmm1 to m128 using non-temporal hint. |
VMOVNTPS m256, ymm1 | EVEX.256.0F.W0 2B /r | avx512 | Move packed single-precision values in ymm1 to m256 using non-temporal hint. |
VMOVNTPS m512, zmm1 | EVEX.512.0F.W0 2B /r | avx512 | Move packed single-precision values in zmm1 to m512 using non-temporal hint. |
MOVNTQ m64, mm | 0F E7 /r | Move quadword from mm to m64 using non-temporal hint. | |
MOVQ mm, mm/m64 | 0F 6F /r | mmx | Move quadword from mm/m64 to mm. |
MOVQ mm/m64, mm | 0F 7F /r | mmx | Move quadword from mm to mm/m64. |
MOVQ xmm1, xmm2/m64 | F3 0F 7E /r | sse2 | Move quadword from xmm2/mem64 to xmm1. |
VMOVQ xmm1, xmm2/m64 | VEX.128.F3.0F.WIG 7E /r | avx | Move quadword from xmm2/m64 to xmm1. |
VMOVQ xmm1, xmm2/m64 | EVEX.128.F3.0F.W1 7E /r | avx512 | Move quadword from xmm2/m64 to xmm1. |
MOVQ xmm2/m64, xmm1 | 66 0F D6 /r | sse2 | Move quadword from xmm1 to xmm2/mem64. |
VMOVQ xmm1/m64, xmm2 | VEX.128.66.0F.WIG D6 /r | avx | Move quadword from xmm2 register to xmm1/m64. |
VMOVQ xmm1/m64, xmm2 | EVEX.128.66.0F.W1 D6 /r | avx512 | Move quadword from xmm2 register to xmm1/m64. |
MOVQ2DQ xmm, mm | F3 0F D6 /r | Move quadword from mmx to low quadword of xmm. | |
MOVSD xmm1, xmm2 | F2 0F 10 /r | sse2 | Move scalar double-precision floating-point value from xmm2 to xmm1 register. |
MOVSD xmm1, m64 | F2 0F 10 /r | sse2 | Load scalar double-precision floating-point value from m64 to xmm1 register. |
MOVSD xmm1/m64, xmm2 | F2 0F 11 /r | sse2 | Move scalar double-precision floating-point value from xmm2 register to xmm1/m64. |
VMOVSD xmm1, xmm2, xmm3 | VEX.NDS.LIG.F2.0F.WIG 10 /r | avx | Merge scalar double-precision floating-point value from xmm2 and xmm3 to xmm1 register. |
VMOVSD xmm1, m64 | VEX.LIG.F2.0F.WIG 10 /r | avx | Load scalar double-precision floating-point value from m64 to xmm1 register. |
VMOVSD xmm1, xmm2, xmm3 | VEX.NDS.LIG.F2.0F.WIG 11 /r | avx | Merge scalar double-precision floating-point value from xmm2 and xmm3 registers to xmm1. |
VMOVSD m64, xmm1 | VEX.LIG.F2.0F.WIG 11 /r | avx | Store scalar double-precision floating-point value from xmm1 register to m64. |
VMOVSD xmm1 {k1}{z}, xmm2, xmm3 | EVEX.NDS.LIG.F2.0F.W1 10 /r | avx512 | Merge scalar double-precision floating-point value from xmm2 and xmm3 registers to xmm1 under writemask k1. |
VMOVSD xmm1 {k1}{z}, m64 | EVEX.LIG.F2.0F.W1 10 /r | avx512 | Load scalar double-precision floating-point value from m64 to xmm1 register under writemask k1. |
VMOVSD xmm1 {k1}{z}, xmm2, xmm3 | EVEX.NDS.LIG.F2.0F.W1 11 /r | avx512 | Merge scalar double-precision floating-point value from xmm2 and xmm3 registers to xmm1 under writemask k1. |
VMOVSD m64 {k1}, xmm1 | EVEX.LIG.F2.0F.W1 11 /r | avx512 | Store scalar double-precision floating-point value from xmm1 register to m64 under writemask k1. |
MOVSHDUP xmm1, xmm2/m128 | F3 0F 16 /r | sse3 | Move odd index single-precision floating-point values from xmm2/mem and duplicate each element into xmm1. |
VMOVSHDUP xmm1, xmm2/m128 | VEX.128.F3.0F.WIG 16 /r | avx | Move odd index single-precision floating-point values from xmm2/mem and duplicate each element into xmm1. |
VMOVSHDUP ymm1, ymm2/m256 | VEX.256.F3.0F.WIG 16 /r | avx | Move odd index single-precision floating-point values from ymm2/mem and duplicate each element into ymm1. |
VMOVSHDUP xmm1 {k1}{z}, xmm2/m128 | EVEX.128.F3.0F.W0 16 /r | avx512 | Move odd index single-precision floating-point values from xmm2/m128 and duplicate each element into xmm1 under writemask. |
VMOVSHDUP ymm1 {k1}{z}, ymm2/m256 | EVEX.256.F3.0F.W0 16 /r | avx512 | Move odd index single-precision floating-point values from ymm2/m256 and duplicate each element into ymm1 under writemask. |
VMOVSHDUP zmm1 {k1}{z}, zmm2/m512 | EVEX.512.F3.0F.W0 16 /r | avx512 | Move odd index single-precision floating-point values from zmm2/m512 and duplicate each element into zmm1 under writemask. |
MOVSLDUP xmm1, xmm2/m128 | F3 0F 12 /r | sse3 | Move even index single-precision floating-point values from xmm2/mem and duplicate each element into xmm1. |
VMOVSLDUP xmm1, xmm2/m128 | VEX.128.F3.0F.WIG 12 /r | avx | Move even index single-precision floating-point values from xmm2/mem and duplicate each element into xmm1. |
VMOVSLDUP ymm1, ymm2/m256 | VEX.256.F3.0F.WIG 12 /r | avx | Move even index single-precision floating-point values from ymm2/mem and duplicate each element into ymm1. |
VMOVSLDUP xmm1 {k1}{z}, xmm2/m128 | EVEX.128.F3.0F.W0 12 /r | avx512 | Move even index single-precision floating-point values from xmm2/m128 and duplicate each element into xmm1 under writemask. |
VMOVSLDUP ymm1 {k1}{z}, ymm2/m256 | EVEX.256.F3.0F.W0 12 /r | avx512 | Move even index single-precision floating-point values from ymm2/m256 and duplicate each element into ymm1 under writemask. |
VMOVSLDUP zmm1 {k1}{z}, zmm2/m512 | EVEX.512.F3.0F.W0 12 /r | avx512 | Move even index single-precision floating-point values from zmm2/m512 and duplicate each element into zmm1 under writemask. |
MOVSS xmm1, xmm2 | F3 0F 10 /r | sse | Merge scalar single-precision floating-point value from xmm2 to xmm1 register. |
MOVSS xmm1, m32 | F3 0F 10 /r | sse | Load scalar single-precision floating-point value from m32 to xmm1 register. |
VMOVSS xmm1, xmm2, xmm3 | VEX.NDS.LIG.F3.0F.WIG 10 /r | avx | Merge scalar single-precision floating-point value from xmm2 and xmm3 to xmm1 register. |
VMOVSS xmm1, m32 | VEX.LIG.F3.0F.WIG 10 /r | avx | Load scalar single-precision floating-point value from m32 to xmm1 register. |
MOVSS xmm2/m32, xmm1 | F3 0F 11 /r | sse | Move scalar single-precision floating-point value from xmm1 register to xmm2/m32. |
VMOVSS xmm1, xmm2, xmm3 | VEX.NDS.LIG.F3.0F.WIG 11 /r | avx | Merge scalar single-precision floating-point value from xmm2 and xmm3 to xmm1 register. |
VMOVSS m32, xmm1 | VEX.LIG.F3.0F.WIG 11 /r | avx | Move scalar single-precision floating-point value from xmm1 register to m32. |
VMOVSS xmm1 {k1}{z}, xmm2, xmm3 | EVEX.NDS.LIG.F3.0F.W0 10 /r | avx512 | Move scalar single-precision floating-point value from xmm2 and xmm3 to xmm1 register under writemask k1. |
VMOVSS xmm1 {k1}{z}, m32 | EVEX.LIG.F3.0F.W0 10 /r | avx512 | Move scalar single-precision floating-point values from m32 to xmm1 under writemask k1. |
VMOVSS xmm1 {k1}{z}, xmm2, xmm3 | EVEX.NDS.LIG.F3.0F.W0 11 /r | avx512 | Move scalar single-precision floating-point value from xmm2 and xmm3 to xmm1 register under writemask k1. |
VMOVSS m32 {k1}, xmm1 | EVEX.LIG.F3.0F.W0 11 /r | avx512 | Move scalar single-precision floating-point values from xmm1 to m32 under writemask k1. |
MOVSX r16, r/m8 | 0F BE /r | Move byte to word with sign-extension. | |
MOVSX r32, r/m8 | 0F BE /r | Move byte to doubleword with sign-extension. | |
MOVSX r64, r/m8 | REX.W + 0F BE /r | Move byte to quadword with sign-extension. | |
MOVSX r32, r/m16 | 0F BF /r | Move word to doubleword with sign-extension. | |
MOVSX r64, r/m16 | REX.W + 0F BF /r | Move word to quadword with sign-extension. | |
MOVSXD r64, r/m32 | REX.W + 63 /r | Move doubleword to quadword with sign-extension. | |
MOVS m8, m8 | A4 | For legacy mode, move byte from address DS:(E)SI to ES:(E)DI. For 64-bit mode move byte from address (R|E)SI to (R|E)DI. | |
MOVS m16, m16 | A5 | For legacy mode, move word from address DS:(E)SI to ES:(E)DI. For 64-bit mode move word at address (R|E)SI to (R|E)DI. | |
MOVS m32, m32 | A5 | For legacy mode, move dword from address DS:(E)SI to ES:(E)DI. For 64-bit mode move dword from address (R|E)SI to (R|E)DI. | |
MOVS m64, m64 | REX.W + A5 | Move qword from address (R|E)SI to (R|E)DI. | |
MOVSB | A4 | For legacy mode, move byte from address DS:(E)SI to ES:(E)DI. For 64-bit mode move byte from address (R|E)SI to (R|E)DI. | |
MOVSW | A5 | For legacy mode, move word from address DS:(E)SI to ES:(E)DI. For 64-bit mode move word at address (R|E)SI to (R|E)DI. | |
MOVSD | A5 | For legacy mode, move dword from address DS:(E)SI to ES:(E)DI. For 64-bit mode move dword from address (R|E)SI to (R|E)DI. | |
MOVSQ | REX.W + A5 | Move qword from address (R|E)SI to (R|E)DI. | |
MOVUPD xmm1, xmm2/m128 | 66 0F 10 /r | sse2 | Move unaligned packed double-precision floating-point from xmm2/mem to xmm1. |
MOVUPD xmm2/m128, xmm1 | 66 0F 11 /r | sse2 | Move unaligned packed double-precision floating-point from xmm1 to xmm2/mem. |
VMOVUPD xmm1, xmm2/m128 | VEX.128.66.0F.WIG 10 /r | avx | Move unaligned packed double-precision floating-point from xmm2/mem to xmm1. |
VMOVUPD xmm2/m128, xmm1 | VEX.128.66.0F.WIG 11 /r | avx | Move unaligned packed double-precision floating-point from xmm1 to xmm2/mem. |
VMOVUPD ymm1, ymm2/m256 | VEX.256.66.0F.WIG 10 /r | avx | Move unaligned packed double-precision floating-point from ymm2/mem to ymm1. |
VMOVUPD ymm2/m256, ymm1 | VEX.256.66.0F.WIG 11 /r | avx | Move unaligned packed double-precision floating-point from ymm1 to ymm2/mem. |
VMOVUPD xmm1 {k1}{z}, xmm2/m128 | EVEX.128.66.0F.W1 10 /r | avx512 | Move unaligned packed double-precision floating-point from xmm2/m128 to xmm1 using writemask k1. |
VMOVUPD xmm2/m128 {k1}{z}, xmm1 | EVEX.128.66.0F.W1 11 /r | avx512 | Move unaligned packed double-precision floating-point from xmm1 to xmm2/m128 using writemask k1. |
VMOVUPD ymm1 {k1}{z}, ymm2/m256 | EVEX.256.66.0F.W1 10 /r | avx512 | Move unaligned packed double-precision floating-point from ymm2/m256 to ymm1 using writemask k1. |
VMOVUPD ymm2/m256 {k1}{z}, ymm1 | EVEX.256.66.0F.W1 11 /r | avx512 | Move unaligned packed double-precision floating-point from ymm1 to ymm2/m256 using writemask k1. |
VMOVUPD zmm1 {k1}{z}, zmm2/m512 | EVEX.512.66.0F.W1 10 /r | avx512 | Move unaligned packed double-precision floating-point values from zmm2/m512 to zmm1 using writemask k1. |
VMOVUPD zmm2/m512 {k1}{z}, zmm1 | EVEX.512.66.0F.W1 11 /r | avx512 | Move unaligned packed double-precision floating-point values from zmm1 to zmm2/m512 using writemask k1. |
MOVUPS xmm1, xmm2/m128 | 0F 10 /r | sse | Move unaligned packed single-precision floating-point from xmm2/mem to xmm1. |
MOVUPS xmm2/m128, xmm1 | 0F 11 /r | sse | Move unaligned packed single-precision floating-point from xmm1 to xmm2/mem. |
VMOVUPS xmm1, xmm2/m128 | VEX.128.0F.WIG 10 /r | avx | Move unaligned packed single-precision floating-point from xmm2/mem to xmm1. |
VMOVUPS xmm2/m128, xmm1 | VEX.128.0F.WIG 11 /r | avx | Move unaligned packed single-precision floating-point from xmm1 to xmm2/mem. |
VMOVUPS ymm1, ymm2/m256 | VEX.256.0F.WIG 10 /r | avx | Move unaligned packed single-precision floating-point from ymm2/mem to ymm1. |
VMOVUPS ymm2/m256, ymm1 | VEX.256.0F.WIG 11 /r | avx | Move unaligned packed single-precision floating-point from ymm1 to ymm2/mem. |
VMOVUPS xmm1 {k1}{z}, xmm2/m128 | EVEX.128.0F.W0 10 /r | avx512 | Move unaligned packed single-precision floating-point values from xmm2/m128 to xmm1 using writemask k1. |
VMOVUPS ymm1 {k1}{z}, ymm2/m256 | EVEX.256.0F.W0 10 /r | avx512 | Move unaligned packed single-precision floating-point values from ymm2/m256 to ymm1 using writemask k1. |
VMOVUPS zmm1 {k1}{z}, zmm2/m512 | EVEX.512.0F.W0 10 /r | avx512 | Move unaligned packed single-precision floating-point values from zmm2/m512 to zmm1 using writemask k1. |
VMOVUPS xmm2/m128 {k1}{z}, xmm1 | EVEX.128.0F.W0 11 /r | avx512 | Move unaligned packed single-precision floating-point values from xmm1 to xmm2/m128 using writemask k1. |
VMOVUPS ymm2/m256 {k1}{z}, ymm1 | EVEX.256.0F.W0 11 /r | avx512 | Move unaligned packed single-precision floating-point values from ymm1 to ymm2/m256 using writemask k1. |
VMOVUPS zmm2/m512 {k1}{z}, zmm1 | EVEX.512.0F.W0 11 /r | avx512 | Move unaligned packed single-precision floating-point values from zmm1 to zmm2/m512 using writemask k1. |
MOVZX r16, r/m8 | 0F B6 /r | Move byte to word with zero-extension. | |
MOVZX r32, r/m8 | 0F B6 /r | Move byte to doubleword, zero-extension. | |
MOVZX r64, r/m8 | REX.W + 0F B6 /r | Move byte to quadword, zero-extension. | |
MOVZX r32, r/m16 | 0F B7 /r | Move word to doubleword, zero-extension. | |
MOVZX r64, r/m16 | REX.W + 0F B7 /r | Move word to quadword, zero-extension. | |
MPSADBW xmm1, xmm2/m128, imm8 | 66 0F 3A 42 /r ib | sse4.1 | Sums absolute 8-bit integer difference of adjacent groups of 4 byte integers in xmm1 and xmm2/m128 and writes the results in xmm1. Starting offsets within xmm1 and xmm2/m128 are determined by imm8. |
VMPSADBW xmm1, xmm2, xmm3/m128, imm8 | VEX.NDS.128.66.0F3A.WIG 42 /r ib | avx | Sums absolute 8-bit integer difference of adjacent groups of 4 byte integers in xmm2 and xmm3/m128 and writes the results in xmm1. Starting offsets within xmm2 and xmm3/m128 are determined by imm8. |
VMPSADBW ymm1, ymm2, ymm3/m256, imm8 | VEX.NDS.256.66.0F3A.WIG 42 /r ib | avx2 | Sums absolute 8-bit integer difference of adjacent groups of 4 byte integers in ymm2 and ymm3/m256 and writes the results in ymm1. Starting offsets within ymm2 and ymm3/m256 are determined by imm8. |
MUL r/m8 | F6 /4 | Unsigned multiply (AX ← AL ∗ r/m8). | |
MUL r/m8 | REX + F6 /4 | Unsigned multiply (AX ← AL ∗ r/m8). | |
MUL r/m16 | F7 /4 | Unsigned multiply (DX:AX ← AX ∗ r/m16). | |
MUL r/m32 | F7 /4 | Unsigned multiply (EDX:EAX ← EAX ∗ r/m32). | |
MUL r/m64 | REX.W + F7 /4 | Unsigned multiply (RDX:RAX ← RAX ∗ r/m64). | |
MULPD xmm1, xmm2/m128 | 66 0F 59 /r | sse2 | Multiply packed double-precision floating-point values in xmm2/m128 with xmm1 and store result in xmm1. |
VMULPD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 59 /r | avx | Multiply packed double-precision floating-point values in xmm3/m128 with xmm2 and store result in xmm1. |
VMULPD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 59 /r | avx | Multiply packed double-precision floating-point values in ymm3/m256 with ymm2 and store result in ymm1. |
VMULPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F.W1 59 /r | avx512 | Multiply packed double-precision floating-point values from xmm3/m128/m64bcst to xmm2 and store result in xmm1. |
VMULPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F.W1 59 /r | avx512 | Multiply packed double-precision floating-point values from ymm3/m256/m64bcst to ymm2 and store result in ymm1. |
VMULPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.NDS.512.66.0F.W1 59 /r | avx512 | Multiply packed double-precision floating-point values in zmm3/m512/m64bcst with zmm2 and store result in zmm1. |
MULPS xmm1, xmm2/m128 | 0F 59 /r | sse | Multiply packed single-precision floating-point values in xmm2/m128 with xmm1 and store result in xmm1. |
VMULPS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.0F.WIG 59 /r | avx | Multiply packed single-precision floating-point values in xmm3/m128 with xmm2 and store result in xmm1. |
VMULPS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.0F.WIG 59 /r | avx | Multiply packed single-precision floating-point values in ymm3/m256 with ymm2 and store result in ymm1. |
VMULPS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.0F.W0 59 /r | avx512 | Multiply packed single-precision floating-point values from xmm3/m128/m32bcst to xmm2 and store result in xmm1. |
VMULPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.0F.W0 59 /r | avx512 | Multiply packed single-precision floating-point values from ymm3/m256/m32bcst to ymm2 and store result in ymm1. |
VMULPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.NDS.512.0F.W0 59 /r | avx512 | Multiply packed single-precision floating-point values in zmm3/m512/m32bcst with zmm2 and store result in zmm1. |
MULSD xmm1, xmm2/m64 | F2 0F 59 /r | sse2 | Multiply the low double-precision floating-point value in xmm2/m64 by low double-precision floating-point value in xmm1. |
VMULSD xmm1, xmm2, xmm3/m64 | VEX.NDS.128.F2.0F.WIG 59 /r | avx | Multiply the low double-precision floating-point value in xmm3/m64 by low double-precision floating-point value in xmm2. |
VMULSD xmm1 {k1}{z}, xmm2, xmm3/m64 {er} | EVEX.NDS.LIG.F2.0F.W1 59 /r | avx512 | Multiply the low double-precision floating-point value in xmm3/m64 by low double-precision floating-point value in xmm2. |
MULSS xmm1, xmm2/m32 | F3 0F 59 /r | sse | Multiply the low single-precision floating-point value in xmm2/m32 by the low single-precision floating-point value in xmm1. |
VMULSS xmm1, xmm2, xmm3/m32 | VEX.NDS.128.F3.0F.WIG 59 /r | avx | Multiply the low single-precision floating-point value in xmm3/m32 by the low single-precision floating-point value in xmm2. |
VMULSS xmm1 {k1}{z}, xmm2, xmm3/m32 {er} | EVEX.NDS.LIG.F3.0F.W0 59 /r | avx512 | Multiply the low single-precision floating-point value in xmm3/m32 by the low single-precision floating-point value in xmm2. |
MULX r32a, r32b, r/m32 | VEX.NDD.LZ.F2.0F38.W0 F6 /r | bmi2 | Unsigned multiply of r/m32 with EDX without affecting arithmetic flags. |
MULX r64a, r64b, r/m64 | VEX.NDD.LZ.F2.0F38.W1 F6 /r | bmi2 | Unsigned multiply of r/m64 with RDX without affecting arithmetic flags. |
MWAIT | 0F 01 C9 | A hint that allows the processor to stop instruction execution and enter an implementation-dependent optimized state until occurrence of a class of events. | |
NEG r/m8 | F6 /3 | Two's complement negate r/m8. | |
NEG r/m8 | REX + F6 /3 | Two's complement negate r/m8. | |
NEG r/m16 | F7 /3 | Two's complement negate r/m16. | |
NEG r/m32 | F7 /3 | Two's complement negate r/m32. | |
NEG r/m64 | REX.W + F7 /3 | Two's complement negate r/m64. | |
NOP | 90 | One byte no-operation instruction. | |
NOP r/m16 | 0F 1F /0 | Multi-byte no-operation instruction. | |
NOP r/m32 | 0F 1F /0 | Multi-byte no-operation instruction. | |
NOT r/m8 | F6 /2 | Reverse each bit of r/m8. | |
NOT r/m8 | REX + F6 /2 | Reverse each bit of r/m8. | |
NOT r/m16 | F7 /2 | Reverse each bit of r/m16. | |
NOT r/m32 | F7 /2 | Reverse each bit of r/m32. | |
NOT r/m64 | REX.W + F7 /2 | Reverse each bit of r/m64. | |
OR AL, imm8 | 0C ib | AL OR imm8. | |
OR AX, imm16 | 0D iw | AX OR imm16. | |
OR EAX, imm32 | 0D id | EAX OR imm32. | |
OR RAX, imm32 | REX.W + 0D id | RAX OR imm32 (sign-extended). | |
OR r/m8, imm8 | 80 /1 ib | r/m8 OR imm8. | |
OR r/m8, imm8 | REX + 80 /1 ib | r/m8 OR imm8. | |
OR r/m16, imm16 | 81 /1 iw | r/m16 OR imm16. | |
OR r/m32, imm32 | 81 /1 id | r/m32 OR imm32. | |
OR r/m64, imm32 | REX.W + 81 /1 id | r/m64 OR imm32 (sign-extended). | |
OR r/m16, imm8 | 83 /1 ib | r/m16 OR imm8 (sign-extended). | |
OR r/m32, imm8 | 83 /1 ib | r/m32 OR imm8 (sign-extended). | |
OR r/m64, imm8 | REX.W + 83 /1 ib | r/m64 OR imm8 (sign-extended). | |
OR r/m8, r8 | 08 /r | r/m8 OR r8. | |
OR r/m8, r8 | REX + 08 /r | r/m8 OR r8. | |
OR r/m16, r16 | 09 /r | r/m16 OR r16. | |
OR r/m32, r32 | 09 /r | r/m32 OR r32. | |
OR r/m64, r64 | REX.W + 09 /r | r/m64 OR r64. | |
OR r8, r/m8 | 0A /r | r8 OR r/m8. | |
OR r8, r/m8 | REX + 0A /r | r8 OR r/m8. | |
OR r16, r/m16 | 0B /r | r16 OR r/m16. | |
OR r32, r/m32 | 0B /r | r32 OR r/m32. | |
OR r64, r/m64 | REX.W + 0B /r | r64 OR r/m64. | |
ORPD xmm1, xmm2/m128 | 66 0F 56 /r | sse2 | Return the bitwise logical OR of packed double-precision floating-point values in xmm1 and xmm2/mem. |
VORPD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F 56 /r | avx | Return the bitwise logical OR of packed double-precision floating-point values in xmm2 and xmm3/mem. |
VORPD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F 56 /r | avx | Return the bitwise logical OR of packed double-precision floating-point values in ymm2 and ymm3/mem. |
VORPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F.W1 56 /r | avx512 | Return the bitwise logical OR of packed double-precision floating-point values in xmm2 and xmm3/m128/m64bcst subject to writemask k1. |
VORPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F.W1 56 /r | avx512 | Return the bitwise logical OR of packed double-precision floating-point values in ymm2 and ymm3/m256/m64bcst subject to writemask k1. |
VORPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F.W1 56 /r | avx512 | Return the bitwise logical OR of packed double-precision floating-point values in zmm2 and zmm3/m512/m64bcst subject to writemask k1. |
ORPS xmm1, xmm2/m128 | 0F 56 /r | sse | Return the bitwise logical OR of packed single-precision floating-point values in xmm1 and xmm2/mem. |
VORPS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.0F 56 /r | avx | Return the bitwise logical OR of packed single-precision floating-point values in xmm2 and xmm3/mem. |
VORPS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.0F 56 /r | avx | Return the bitwise logical OR of packed single-precision floating-point values in ymm2 and ymm3/mem. |
VORPS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.0F.W0 56 /r | avx512 | Return the bitwise logical OR of packed single-precision floating-point values in xmm2 and xmm3/m128/m32bcst subject to writemask k1. |
VORPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.0F.W0 56 /r | avx512 | Return the bitwise logical OR of packed single-precision floating-point values in ymm2 and ymm3/m256/m32bcst subject to writemask k1. |
VORPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.0F.W0 56 /r | avx512 | Return the bitwise logical OR of packed single-precision floating-point values in zmm2 and zmm3/m512/m32bcst subject to writemask k1. |
OUT imm8, AL | E6 ib | Output byte in AL to I/O port address imm8. | |
OUT imm8, AX | E7 ib | Output word in AX to I/O port address imm8. | |
OUT imm8, EAX | E7 ib | Output doubleword in EAX to I/O port address imm8. | |
OUT DX, AL | EE | Output byte in AL to I/O port address in DX. | |
OUT DX, AX | EF | Output word in AX to I/O port address in DX. | |
OUT DX, EAX | EF | Output doubleword in EAX to I/O port address in DX. | |
OUTS DX, m8 | 6E | Output byte from memory location specified in DS:(E)SI or RSI to I/O port specified in DX. | |
OUTS DX, m16 | 6F | Output word from memory location specified in DS:(E)SI or RSI to I/O port specified in DX. | |
OUTS DX, m32 | 6F | Output doubleword from memory location specified in DS:(E)SI or RSI to I/O port specified in DX. | |
OUTSB | 6E | Output byte from memory location specified in DS:(E)SI or RSI to I/O port specified in DX. | |
OUTSW | 6F | Output word from memory location specified in DS:(E)SI or RSI to I/O port specified in DX. | |
OUTSD | 6F | Output doubleword from memory location specified in DS:(E)SI or RSI to I/O port specified in DX. | |
PABSB mm1, mm2/m64 | 0F 38 1C /r | ssse3 | Compute the absolute value of bytes in mm2/m64 and store UNSIGNED result in mm1. |
PABSB xmm1, xmm2/m128 | 66 0F 38 1C /r | ssse3 | Compute the absolute value of bytes in xmm2/m128 and store UNSIGNED result in xmm1. |
PABSW mm1, mm2/m64 | 0F 38 1D /r | ssse3 | Compute the absolute value of 16-bit integers in mm2/m64 and store UNSIGNED result in mm1. |
PABSW xmm1, xmm2/m128 | 66 0F 38 1D /r | ssse3 | Compute the absolute value of 16-bit integers in xmm2/m128 and store UNSIGNED result in xmm1. |
PABSD mm1, mm2/m64 | 0F 38 1E /r | ssse3 | Compute the absolute value of 32-bit integers in mm2/m64 and store UNSIGNED result in mm1. |
PABSD xmm1, xmm2/m128 | 66 0F 38 1E /r | ssse3 | Compute the absolute value of 32-bit integers in xmm2/m128 and store UNSIGNED result in xmm1. |
VPABSB xmm1, xmm2/m128 | VEX.128.66.0F38.WIG 1C /r | avx | Compute the absolute value of bytes in xmm2/m128 and store UNSIGNED result in xmm1. |
VPABSW xmm1, xmm2/m128 | VEX.128.66.0F38.WIG 1D /r | avx | Compute the absolute value of 16-bit integers in xmm2/m128 and store UNSIGNED result in xmm1. |
VPABSD xmm1, xmm2/m128 | VEX.128.66.0F38.WIG 1E /r | avx | Compute the absolute value of 32-bit integers in xmm2/m128 and store UNSIGNED result in xmm1. |
VPABSB ymm1, ymm2/m256 | VEX.256.66.0F38.WIG 1C /r | avx2 | Compute the absolute value of bytes in ymm2/m256 and store UNSIGNED result in ymm1. |
VPABSW ymm1, ymm2/m256 | VEX.256.66.0F38.WIG 1D /r | avx2 | Compute the absolute value of 16-bit integers in ymm2/m256 and store UNSIGNED result in ymm1. |
VPABSD ymm1, ymm2/m256 | VEX.256.66.0F38.WIG 1E /r | avx2 | Compute the absolute value of 32-bit integers in ymm2/m256 and store UNSIGNED result in ymm1. |
VPABSB xmm1 {k1}{z}, xmm2/m128 | EVEX.128.66.0F38.WIG 1C /r | avx512 | Compute the absolute value of bytes in xmm2/m128 and store UNSIGNED result in xmm1 using writemask k1. |
VPABSB ymm1 {k1}{z}, ymm2/m256 | EVEX.256.66.0F38.WIG 1C /r | avx512 | Compute the absolute value of bytes in ymm2/m256 and store UNSIGNED result in ymm1 using writemask k1. |
VPABSB zmm1 {k1}{z}, zmm2/m512 | EVEX.512.66.0F38.WIG 1C /r | avx512 | Compute the absolute value of bytes in zmm2/m512 and store UNSIGNED result in zmm1 using writemask k1. |
VPABSW xmm1 {k1}{z}, xmm2/m128 | EVEX.128.66.0F38.WIG 1D /r | avx512 | Compute the absolute value of 16-bit integers in xmm2/m128 and store UNSIGNED result in xmm1 using writemask k1. |
PACKSSWB mm1, mm2/m64 | 0F 63 /r | mmx | Converts 4 packed signed word integers from mm1 and from mm2/m64 into 8 packed signed byte integers in mm1 using signed saturation. |
PACKSSWB xmm1, xmm2/m128 | 66 0F 63 /r | sse2 | Converts 8 packed signed word integers from xmm1 and from xmm2/m128 into 16 packed signed byte integers in xmm1 using signed saturation. |
PACKSSDW mm1, mm2/m64 | 0F 6B /r | mmx | Converts 2 packed signed doubleword integers from mm1 and from mm2/m64 into 4 packed signed word integers in mm1 using signed saturation. |
PACKSSDW xmm1, xmm2/m128 | 66 0F 6B /r | sse2 | Converts 4 packed signed doubleword integers from xmm1 and from xmm2/m128 into 8 packed signed word integers in xmm1 using signed saturation. |
VPACKSSWB xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 63 /r | avx | Converts 8 packed signed word integers from xmm2 and from xmm3/m128 into 16 packed signed byte integers in xmm1 using signed saturation. |
VPACKSSDW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 6B /r | avx | Converts 4 packed signed doubleword integers from xmm2 and from xmm3/m128 into 8 packed signed word integers in xmm1 using signed saturation. |
VPACKSSWB ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 63 /r | avx2 | Converts 16 packed signed word integers from ymm2 and from ymm3/m256 into 32 packed signed byte integers in ymm1 using signed saturation. |
VPACKSSDW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 6B /r | avx2 | Converts 8 packed signed doubleword integers from ymm2 and from ymm3/m256 into 16 packed signed word integers in ymm1 using signed saturation. |
VPACKSSWB xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG 63 /r | avx512 | Converts packed signed word integers from xmm2 and from xmm3/m128 into packed signed byte integers in xmm1 using signed saturation under writemask k1. |
VPACKSSWB ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG 63 /r | avx512 | Converts packed signed word integers from ymm2 and from ymm3/m256 into packed signed byte integers in ymm1 using signed saturation under writemask k1. |
VPACKSSWB zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG 63 /r | avx512 | Converts packed signed word integers from zmm2 and from zmm3/m512 into packed signed byte integers in zmm1 using signed saturation under writemask k1. |
VPACKSSDW xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F.W0 6B /r | avx512 | Converts packed signed doubleword integers from xmm2 and from xmm3/m128/m32bcst into packed signed word integers in xmm1 using signed saturation under writemask k1. |
PACKUSDW xmm1, xmm2/m128 | 66 0F 38 2B /r | sse4.1 | Convert 4 packed signed doubleword integers from xmm1 and 4 packed signed doubleword integers from xmm2/m128 into 8 packed unsigned word integers in xmm1 using unsigned saturation. |
VPACKUSDW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38 2B /r | avx | Convert 4 packed signed doubleword integers from xmm2 and 4 packed signed doubleword integers from xmm3/m128 into 8 packed unsigned word integers in xmm1 using unsigned saturation. |
VPACKUSDW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38 2B /r | avx2 | Convert 8 packed signed doubleword integers from ymm2 and 8 packed signed doubleword integers from ymm3/m256 into 16 packed unsigned word integers in ymm1 using unsigned saturation. |
VPACKUSDW xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 2B /r | avx512 | Convert packed signed doubleword integers from xmm2 and packed signed doubleword integers from xmm3/m128/m32bcst into packed unsigned word integers in xmm1 using unsigned saturation under writemask k1. |
VPACKUSDW ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 2B /r | avx512 | Convert packed signed doubleword integers from ymm2 and packed signed doubleword integers from ymm3/m256/m32bcst into packed unsigned word integers in ymm1 using unsigned saturation under writemask k1. |
VPACKUSDW zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F38.W0 2B /r | avx512 | Convert packed signed doubleword integers from zmm2 and packed signed doubleword integers from zmm3/m512/m32bcst into packed unsigned word integers in zmm1 using unsigned saturation under writemask k1. |
PACKUSWB mm, mm/m64 | 0F 67 /r | mmx | Converts 4 signed word integers from mm and 4 signed word integers from mm/m64 into 8 unsigned byte integers in mm using unsigned saturation. |
PACKUSWB xmm1, xmm2/m128 | 66 0F 67 /r | sse2 | Converts 8 signed word integers from xmm1 and 8 signed word integers from xmm2/m128 into 16 unsigned byte integers in xmm1 using unsigned saturation. |
VPACKUSWB xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 67 /r | avx | Converts 8 signed word integers from xmm2 and 8 signed word integers from xmm3/m128 into 16 unsigned byte integers in xmm1 using unsigned saturation. |
VPACKUSWB ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 67 /r | avx2 | Converts 16 signed word integers from ymm2 and 16 signed word integers from ymm3/m256 into 32 unsigned byte integers in ymm1 using unsigned saturation. |
VPACKUSWB xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG 67 /r | avx512 | Converts signed word integers from xmm2 and signed word integers from xmm3/m128 into unsigned byte integers in xmm1 using unsigned saturation under writemask k1. |
VPACKUSWB ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG 67 /r | avx512 | Converts signed word integers from ymm2 and signed word integers from ymm3/m256 into unsigned byte integers in ymm1 using unsigned saturation under writemask k1. |
VPACKUSWB zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG 67 /r | avx512 | Converts signed word integers from zmm2 and signed word integers from zmm3/m512 into unsigned byte integers in zmm1 using unsigned saturation under writemask k1. |
PADDB mm, mm/m64 | 0F FC /r | mmx | Add packed byte integers from mm/m64 and mm. |
PADDW mm, mm/m64 | 0F FD /r | mmx | Add packed word integers from mm/m64 and mm. |
PADDB xmm1, xmm2/m128 | 66 0F FC /r | sse2 | Add packed byte integers from xmm2/m128 and xmm1. |
PADDW xmm1, xmm2/m128 | 66 0F FD /r | sse2 | Add packed word integers from xmm2/m128 and xmm1. |
PADDD xmm1, xmm2/m128 | 66 0F FE /r | sse2 | Add packed doubleword integers from xmm2/m128 and xmm1. |
PADDQ xmm1, xmm2/m128 | 66 0F D4 /r | sse2 | Add packed quadword integers from xmm2/m128 and xmm1. |
VPADDB xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG FC /r | avx | Add packed byte integers from xmm2, and xmm3/m128 and store in xmm1. |
VPADDW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG FD /r | avx | Add packed word integers from xmm2, xmm3/m128 and store in xmm1. |
VPADDD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG FE /r | avx | Add packed doubleword integers from xmm2, xmm3/m128 and store in xmm1. |
VPADDQ xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG D4 /r | avx | Add packed quadword integers from xmm2, xmm3/m128 and store in xmm1. |
VPADDB ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG FC /r | avx2 | Add packed byte integers from ymm2, and ymm3/m256 and store in ymm1. |
VPADDW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG FD /r | avx2 | Add packed word integers from ymm2, ymm3/m256 and store in ymm1. |
VPADDD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG FE /r | avx2 | Add packed doubleword integers from ymm2, ymm3/m256 and store in ymm1. |
VPADDQ ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG D4 /r | avx2 | Add packed quadword integers from ymm2, ymm3/m256 and store in ymm1. |
VPADDB xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG FC /r | avx512 | Add packed byte integers from xmm2, and xmm3/m128 and store in xmm1 using writemask k1. |
VPADDW xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG FD /r | avx512 | Add packed word integers from xmm2, and xmm3/m128 and store in xmm1 using writemask k1. |
VPADDD xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F.W0 FE /r | avx512 | Add packed doubleword integers from xmm2, and xmm3/m128/m32bcst and store in xmm1 using writemask k1. |
VPADDQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F.W1 D4 /r | avx512 | Add packed quadword integers from xmm2, and xmm3/m128/m64bcst and store in xmm1 using writemask k1. |
VPADDB ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG FC /r | avx512 | Add packed byte integers from ymm2, and ymm3/m256 and store in ymm1 using writemask k1. |
VPADDW ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG FD /r | avx512 | Add packed word integers from ymm2, and ymm3/m256 and store in ymm1 using writemask k1. |
VPADDD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F.W0 FE /r | avx512 | Add packed doubleword integers from ymm2, ymm3/m256/m32bcst and store in ymm1 using writemask k1. |
VPADDQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F.W1 D4 /r | avx512 | Add packed quadword integers from ymm2, ymm3/m256/m64bcst and store in ymm1 using writemask k1. |
VPADDB zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG FC /r | avx512 | Add packed byte integers from zmm2, and zmm3/m512 and store in zmm1 using writemask k1. |
VPADDW zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG FD /r | avx512 | Add packed word integers from zmm2, and zmm3/m512 and store in zmm1 using writemask k1. |
VPADDD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F.W0 FE /r | avx512 | Add packed doubleword integers from zmm2, zmm3/m512/m32bcst and store in zmm1 using writemask k1. |
VPADDQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F.W1 D4 /r | avx512 | Add packed quadword integers from zmm2, zmm3/m512/m64bcst and store in zmm1 using writemask k1. |
PADDSB mm, mm/m64 | 0F EC /r | mmx | Add packed signed byte integers from mm/m64 and mm and saturate the results. |
PADDSB xmm1, xmm2/m128 | 66 0F EC /r | sse2 | Add packed signed byte integers from xmm2/m128 and xmm1 and saturate the results. |
PADDSW mm, mm/m64 | 0F ED /r | mmx | Add packed signed word integers from mm/m64 and mm and saturate the results. |
PADDSW xmm1, xmm2/m128 | 66 0F ED /r | sse2 | Add packed signed word integers from xmm2/m128 and xmm1 and saturate the results. |
VPADDSB xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG EC /r | avx | Add packed signed byte integers from xmm3/m128 and xmm2 and saturate the results. |
VPADDSW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG ED /r | avx | Add packed signed word integers from xmm3/m128 and xmm2 and saturate the results. |
VPADDSB ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG EC /r | avx2 | Add packed signed byte integers from ymm2, and ymm3/m256 and store the saturated results in ymm1. |
VPADDSW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG ED /r | avx2 | Add packed signed word integers from ymm2, and ymm3/m256 and store the saturated results in ymm1. |
VPADDSB xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG EC /r | avx512 | Add packed signed byte integers from xmm2, and xmm3/m128 and store the saturated results in xmm1 under writemask k1. |
VPADDSB ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG EC /r | avx512 | Add packed signed byte integers from ymm2, and ymm3/m256 and store the saturated results in ymm1 under writemask k1. |
VPADDSB zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG EC /r | avx512 | Add packed signed byte integers from zmm2, and zmm3/m512 and store the saturated results in zmm1 under writemask k1. |
VPADDSW xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG ED /r | avx512 | Add packed signed word integers from xmm2, and xmm3/m128 and store the saturated results in xmm1 under writemask k1. |
VPADDSW ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG ED /r | avx512 | Add packed signed word integers from ymm2, and ymm3/m256 and store the saturated results in ymm1 under writemask k1. |
VPADDSW zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG ED /r | avx512 | Add packed signed word integers from zmm2, and zmm3/m512 and store the saturated results in zmm1 under writemask k1. |
PADDUSB mm, mm/m64 | 0F DC /r | mmx | Add packed unsigned byte integers from mm/m64 and mm and saturate the results. |
PADDUSB xmm1, xmm2/m128 | 66 0F DC /r | sse2 | Add packed unsigned byte integers from xmm2/m128 and xmm1 and saturate the results. |
PADDUSW mm, mm/m64 | 0F DD /r | mmx | Add packed unsigned word integers from mm/m64 and mm and saturate the results. |
PADDUSW xmm1, xmm2/m128 | 66 0F DD /r | sse2 | Add packed unsigned word integers from xmm2/m128 to xmm1 and saturate the results. |
VPADDUSB xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG DC /r | avx | Add packed unsigned byte integers from xmm3/m128 to xmm2 and saturate the results. |
VPADDUSW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG DD /r | avx | Add packed unsigned word integers from xmm3/m128 to xmm2 and saturate the results. |
VPADDUSB ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG DC /r | avx2 | Add packed unsigned byte integers from ymm2, and ymm3/m256 and store the saturated results in ymm1. |
VPADDUSW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG DD /r | avx2 | Add packed unsigned word integers from ymm2, and ymm3/m256 and store the saturated results in ymm1. |
VPADDUSB xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG DC /r | avx512 | Add packed unsigned byte integers from xmm2, and xmm3/m128 and store the saturated results in xmm1 under writemask k1. |
VPADDUSB ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG DC /r | avx512 | Add packed unsigned byte integers from ymm2, and ymm3/m256 and store the saturated results in ymm1 under writemask k1. |
VPADDUSB zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG DC /r | avx512 | Add packed unsigned byte integers from zmm2, and zmm3/m512 and store the saturated results in zmm1 under writemask k1. |
VPADDUSW xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG DD /r | avx512 | Add packed unsigned word integers from xmm2, and xmm3/m128 and store the saturated results in xmm1 under writemask k1. |
VPADDUSW ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG DD /r | avx512 | Add packed unsigned word integers from ymm2, and ymm3/m256 and store the saturated results in ymm1 under writemask k1. |
PALIGNR mm1, mm2/m64, imm8 | 0F 3A 0F /r ib | ssse3 | Concatenate destination and source operands, extract byte-aligned result shifted to the right by constant value in imm8 into mm1. |
PALIGNR xmm1, xmm2/m128, imm8 | 66 0F 3A 0F /r ib | ssse3 | Concatenate destination and source operands, extract byte-aligned result shifted to the right by constant value in imm8 into xmm1. |
VPALIGNR xmm1, xmm2, xmm3/m128, imm8 | VEX.NDS.128.66.0F3A.WIG 0F /r ib | avx | Concatenate xmm2 and xmm3/m128, extract byte aligned result shifted to the right by constant value in imm8 and result is stored in xmm1. |
VPALIGNR ymm1, ymm2, ymm3/m256, imm8 | VEX.NDS.256.66.0F3A.WIG 0F /r ib | avx2 | Concatenate pairs of 16 bytes in ymm2 and ymm3/m256 into 32-byte intermediate result, extract byte-aligned, 16-byte result shifted to the right by constant values in imm8 from each intermediate result, and two 16-byte results are stored in ymm1. |
VPALIGNR xmm1 {k1}{z}, xmm2, xmm3/m128, imm8 | EVEX.NDS.128.66.0F3A.WIG 0F /r ib | avx512 | Concatenate xmm2 and xmm3/m128 into a 32-byte intermediate result, extract byte aligned result shifted to the right by constant value in imm8 and result is stored in xmm1. |
VPALIGNR ymm1 {k1}{z}, ymm2, ymm3/m256, imm8 | EVEX.NDS.256.66.0F3A.WIG 0F /r ib | avx512 | Concatenate pairs of 16 bytes in ymm2 and ymm3/m256 into 32-byte intermediate result, extract byte-aligned, 16-byte result shifted to the right by constant values in imm8 from each intermediate result, and two 16-byte results are stored in ymm1. |
VPALIGNR zmm1 {k1}{z}, zmm2, zmm3/m512, imm8 | EVEX.NDS.512.66.0F3A.WIG 0F /r ib | avx512 | Concatenate pairs of 16 bytes in zmm2 and zmm3/m512 into 32-byte intermediate results, extract byte-aligned, 16-byte results shifted to the right by constant values in imm8 from each intermediate result, and four 16-byte results are stored in zmm1. |
PAND mm, mm/m64 | 0F DB /r | mmx | Bitwise AND mm/m64 and mm. |
PAND xmm1, xmm2/m128 | 66 0F DB /r | sse2 | Bitwise AND of xmm2/m128 and xmm1. |
VPAND xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG DB /r | avx | Bitwise AND of xmm3/m128 and xmm2. |
VPAND ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG DB /r | avx2 | Bitwise AND of ymm2, and ymm3/m256 and store result in ymm1. |
VPANDD xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F.W0 DB /r | avx512 | Bitwise AND of packed doubleword integers in xmm2 and xmm3/m128/m32bcst and store result in xmm1 using writemask k1. |
VPANDD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F.W0 DB /r | avx512 | Bitwise AND of packed doubleword integers in ymm2 and ymm3/m256/m32bcst and store result in ymm1 using writemask k1. |
VPANDD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F.W0 DB /r | avx512 | Bitwise AND of packed doubleword integers in zmm2 and zmm3/m512/m32bcst and store result in zmm1 using writemask k1. |
VPANDQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F.W1 DB /r | avx512 | Bitwise AND of packed quadword integers in xmm2 and xmm3/m128/m64bcst and store result in xmm1 using writemask k1. |
VPANDQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F.W1 DB /r | avx512 | Bitwise AND of packed quadword integers in ymm2 and ymm3/m256/m64bcst and store result in ymm1 using writemask k1. |
VPANDQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F.W1 DB /r | avx512 | Bitwise AND of packed quadword integers in zmm2 and zmm3/m512/m64bcst and store result in zmm1 using writemask k1. |
PANDN mm, mm/m64 | 0F DF /r | mmx | Bitwise AND NOT of mm/m64 and mm. |
PANDN xmm1, xmm2/m128 | 66 0F DF /r | sse2 | Bitwise AND NOT of xmm2/m128 and xmm1. |
VPANDN xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG DF /r | avx | Bitwise AND NOT of xmm3/m128 and xmm2. |
VPANDN ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG DF /r | avx2 | Bitwise AND NOT of ymm2, and ymm3/m256 and store result in ymm1. |
VPANDND xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F.W0 DF /r | avx512 | Bitwise AND NOT of packed doubleword integers in xmm2 and xmm3/m128/m32bcst and store result in xmm1 using writemask k1. |
VPANDND ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F.W0 DF /r | avx512 | Bitwise AND NOT of packed doubleword integers in ymm2 and ymm3/m256/m32bcst and store result in ymm1 using writemask k1. |
VPANDND zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F.W0 DF /r | avx512 | Bitwise AND NOT of packed doubleword integers in zmm2 and zmm3/m512/m32bcst and store result in zmm1 using writemask k1. |
VPANDNQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F.W1 DF /r | avx512 | Bitwise AND NOT of packed quadword integers in xmm2 and xmm3/m128/m64bcst and store result in xmm1 using writemask k1. |
VPANDNQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F.W1 DF /r | avx512 | Bitwise AND NOT of packed quadword integers in ymm2 and ymm3/m256/m64bcst and store result in ymm1 using writemask k1. |
VPANDNQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F.W1 DF /r | avx512 | Bitwise AND NOT of packed quadword integers in zmm2 and zmm3/m512/m64bcst and store result in zmm1 using writemask k1. |
PAUSE | F3 90 | Gives hint to processor that improves performance of spin-wait loops. | |
PAVGB mm1, mm2/m64 | 0F E0 /r | sse | Average packed unsigned byte integers from mm2/m64 and mm1 with rounding. |
PAVGB xmm1, xmm2/m128 | 66 0F E0 /r | sse2 | Average packed unsigned byte integers from xmm2/m128 and xmm1 with rounding. |
PAVGW mm1, mm2/m64 | 0F E3 /r | sse | Average packed unsigned word integers from mm2/m64 and mm1 with rounding. |
PAVGW xmm1, xmm2/m128 | 66 0F E3 /r | sse2 | Average packed unsigned word integers from xmm2/m128 and xmm1 with rounding. |
VPAVGB xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG E0 /r | avx | Average packed unsigned byte integers from xmm3/m128 and xmm2 with rounding. |
VPAVGW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG E3 /r | avx | Average packed unsigned word integers from xmm3/m128 and xmm2 with rounding. |
VPAVGB ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG E0 /r | avx2 | Average packed unsigned byte integers from ymm2, and ymm3/m256 with rounding and store to ymm1. |
VPAVGW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG E3 /r | avx2 | Average packed unsigned word integers from ymm2, ymm3/m256 with rounding to ymm1. |
VPAVGB xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG E0 /r | avx512 | Average packed unsigned byte integers from xmm2, and xmm3/m128 with rounding and store to xmm1 under writemask k1. |
VPAVGB ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG E0 /r | avx512 | Average packed unsigned byte integers from ymm2, and ymm3/m256 with rounding and store to ymm1 under writemask k1. |
VPAVGB zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG E0 /r | avx512 | Average packed unsigned byte integers from zmm2, and zmm3/m512 with rounding and store to zmm1 under writemask k1. |
VPAVGW xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG E3 /r | avx512 | Average packed unsigned word integers from xmm2, xmm3/m128 with rounding to xmm1 under writemask k1. |
VPAVGW ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG E3 /r | avx512 | Average packed unsigned word integers from ymm2, ymm3/m256 with rounding to ymm1 under writemask k1. |
VPAVGW zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG E3 /r | avx512 | Average packed unsigned word integers from zmm2, zmm3/m512 with rounding to zmm1 under writemask k1. |
PBLENDVB xmm1, xmm2/m128, <XMM0> | 66 0F 38 10 /r | sse4.1 | Select byte values from xmm1 and xmm2/m128 from mask specified in the high bit of each byte in XMM0 and store the values into xmm1. |
VPBLENDVB xmm1, xmm2, xmm3/m128, xmm4 | VEX.NDS.128.66.0F3A.W0 4C /r /is4 | avx | Select byte values from xmm2 and xmm3/m128 using mask bits in the specified mask register, xmm4, and store the values into xmm1. |
VPBLENDVB ymm1, ymm2, ymm3/m256, ymm4 | VEX.NDS.256.66.0F3A.W0 4C /r /is4 | avx2 | Select byte values from ymm2 and ymm3/m256 from mask specified in the high bit of each byte in ymm4 and store the values into ymm1. |
PBLENDW xmm1, xmm2/m128, imm8 | 66 0F 3A 0E /r ib | sse4.1 | Select words from xmm1 and xmm2/m128 from mask specified in imm8 and store the values into xmm1. |
VPBLENDW xmm1, xmm2, xmm3/m128, imm8 | VEX.NDS.128.66.0F3A.WIG 0E /r ib | avx | Select words from xmm2 and xmm3/m128 from mask specified in imm8 and store the values into xmm1. |
VPBLENDW ymm1, ymm2, ymm3/m256, imm8 | VEX.NDS.256.66.0F3A.WIG 0E /r ib | avx2 | Select words from ymm2 and ymm3/m256 from mask specified in imm8 and store the values into ymm1. |
PCLMULQDQ xmm1, xmm2/m128, imm8 | 66 0F 3A 44 /r ib | clmul | Carry-less multiplication of one quadword of xmm1 by one quadword of xmm2/m128, stores the 128-bit result in xmm1. The immediate is used to determine which quadwords of xmm1 and xmm2/m128 should be used. |
VPCLMULQDQ xmm1, xmm2, xmm3/m128, imm8 | VEX.NDS.128.66.0F3A.WIG 44 /r ib | avx | Carry-less multiplication of one quadword of xmm2 by one quadword of xmm3/m128, stores the 128-bit result in xmm1. The immediate is used to determine which quadwords of xmm2 and xmm3/m128 should be used. |
PCMPEQB mm, mm/m64 | 0F 74 /r | mmx | Compare packed bytes in mm/m64 and mm for equality. |
PCMPEQB xmm1, xmm2/m128 | 66 0F 74 /r | sse2 | Compare packed bytes in xmm2/m128 and xmm1 for equality. |
PCMPEQW mm, mm/m64 | 0F 75 /r | mmx | Compare packed words in mm/m64 and mm for equality. |
PCMPEQW xmm1, xmm2/m128 | 66 0F 75 /r | sse2 | Compare packed words in xmm2/m128 and xmm1 for equality. |
PCMPEQD mm, mm/m64 | 0F 76 /r | mmx | Compare packed doublewords in mm/m64 and mm for equality. |
PCMPEQD xmm1, xmm2/m128 | 66 0F 76 /r | sse2 | Compare packed doublewords in xmm2/m128 and xmm1 for equality. |
VPCMPEQB xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 74 /r | avx | Compare packed bytes in xmm3/m128 and xmm2 for equality. |
VPCMPEQW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 75 /r | avx | Compare packed words in xmm3/m128 and xmm2 for equality. |
VPCMPEQD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 76 /r | avx | Compare packed doublewords in xmm3/m128 and xmm2 for equality. |
VPCMPEQB ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 74 /r | avx2 | Compare packed bytes in ymm3/m256 and ymm2 for equality. |
VPCMPEQW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 75 /r | avx2 | Compare packed words in ymm3/m256 and ymm2 for equality. |
VPCMPEQD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 76 /r | avx2 | Compare packed doublewords in ymm3/m256 and ymm2 for equality. |
VPCMPEQD k1 {k2}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F.W0 76 /r | avx512 | Compare Equal between int32 vector xmm2 and int32 vector xmm3/m128/m32bcst, and set vector mask k1 to reflect the zero/nonzero status of each element of the result, under writemask. |
VPCMPEQD k1 {k2}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F.W0 76 /r | avx512 | Compare Equal between int32 vector ymm2 and int32 vector ymm3/m256/m32bcst, and set vector mask k1 to reflect the zero/nonzero status of each element of the result, under writemask. |
VPCMPEQD k1 {k2}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F.W0 76 /r | avx512 | Compare Equal between int32 vectors in zmm2 and zmm3/m512/m32bcst, and set destination k1 according to the comparison results under writemask k2. |
VPCMPEQB k1 {k2}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG 74 /r | avx512 | Compare packed bytes in xmm3/m128 and xmm2 for equality and set vector mask k1 to reflect the zero/nonzero status of each element of the result, under writemask. |
PCMPEQQ xmm1, xmm2/m128 | 66 0F 38 29 /r | sse4.1 | Compare packed qwords in xmm2/m128 and xmm1 for equality. |
VPCMPEQQ xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 29 /r | avx | Compare packed quadwords in xmm3/m128 and xmm2 for equality. |
VPCMPEQQ ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 29 /r | avx2 | Compare packed quadwords in ymm3/m256 and ymm2 for equality. |
VPCMPEQQ k1 {k2}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 29 /r | avx512 | Compare Equal between int64 vector xmm2 and int64 vector xmm3/m128/m64bcst, and set vector mask k1 to reflect the zero/nonzero status of each element of the result, under writemask. |
VPCMPEQQ k1 {k2}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 29 /r | avx512 | Compare Equal between int64 vector ymm2 and int64 vector ymm3/m256/m64bcst, and set vector mask k1 to reflect the zero/nonzero status of each element of the result, under writemask. |
VPCMPEQQ k1 {k2}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F38.W1 29 /r | avx512 | Compare Equal between int64 vector zmm2 and int64 vector zmm3/m512/m64bcst, and set vector mask k1 to reflect the zero/nonzero status of each element of the result, under writemask. |
PCMPESTRI xmm1, xmm2/m128, imm8 | 66 0F 3A 61 /r imm8 | sse4.2 | Perform a packed comparison of string data with explicit lengths, generating an index, and storing the result in ECX. |
VPCMPESTRI xmm1, xmm2/m128, imm8 | VEX.128.66.0F3A.WIG 61 /r ib | avx | Perform a packed comparison of string data with explicit lengths, generating an index, and storing the result in ECX. |
PCMPESTRM xmm1, xmm2/m128, imm8 | 66 0F 3A 60 /r imm8 | sse4.2 | Perform a packed comparison of string data with explicit lengths, generating a mask, and storing the result in XMM0. |
VPCMPESTRM xmm1, xmm2/m128, imm8 | VEX.128.66.0F3A.WIG 60 /r ib | avx | Perform a packed comparison of string data with explicit lengths, generating a mask, and storing the result in XMM0. |
PCMPGTB mm, mm/m64 | 0F 64 /r | mmx | Compare packed signed byte integers in mm and mm/m64 for greater than. |
PCMPGTB xmm1, xmm2/m128 | 66 0F 64 /r | sse2 | Compare packed signed byte integers in xmm1 and xmm2/m128 for greater than. |
PCMPGTW mm, mm/m64 | 0F 65 /r | mmx | Compare packed signed word integers in mm and mm/m64 for greater than. |
PCMPGTW xmm1, xmm2/m128 | 66 0F 65 /r | sse2 | Compare packed signed word integers in xmm1 and xmm2/m128 for greater than. |
PCMPGTD mm, mm/m64 | 0F 66 /r | mmx | Compare packed signed doubleword integers in mm and mm/m64 for greater than. |
PCMPGTD xmm1, xmm2/m128 | 66 0F 66 /r | sse2 | Compare packed signed doubleword integers in xmm1 and xmm2/m128 for greater than. |
VPCMPGTB xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 64 /r | avx | Compare packed signed byte integers in xmm2 and xmm3/m128 for greater than. |
VPCMPGTW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 65 /r | avx | Compare packed signed word integers in xmm2 and xmm3/m128 for greater than. |
VPCMPGTD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 66 /r | avx | Compare packed signed doubleword integers in xmm2 and xmm3/m128 for greater than. |
VPCMPGTB ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 64 /r | avx2 | Compare packed signed byte integers in ymm2 and ymm3/m256 for greater than. |
VPCMPGTW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 65 /r | avx2 | Compare packed signed word integers in ymm2 and ymm3/m256 for greater than. |
VPCMPGTD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 66 /r | avx2 | Compare packed signed doubleword integers in ymm2 and ymm3/m256 for greater than. |
VPCMPGTD k1 {k2}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F.W0 66 /r | avx512 | Compare Greater between int32 vector xmm2 and int32 vector xmm3/m128/m32bcst, and set vector mask k1 to reflect the zero/nonzero status of each element of the result, under writemask. |
VPCMPGTD k1 {k2}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F.W0 66 /r | avx512 | Compare Greater between int32 vector ymm2 and int32 vector ymm3/m256/m32bcst, and set vector mask k1 to reflect the zero/nonzero status of each element of the result, under writemask. |
VPCMPGTD k1 {k2}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F.W0 66 /r | avx512 | Compare Greater between int32 elements in zmm2 and zmm3/m512/m32bcst, and set destination k1 according to the comparison results under writemask k2. |
VPCMPGTB k1 {k2}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG 64 /r | avx512 | Compare packed signed byte integers in xmm2 and xmm3/m128 for greater than, and set vector mask k1 to reflect the zero/nonzero status of each element of the result, under writemask. |
VPCMPGTB k1 {k2}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG 64 /r | avx512 | Compare packed signed byte integers in ymm2 and ymm3/m256 for greater than, and set vector mask k1 to reflect the zero/nonzero status of each element of the result, under writemask. |
PCMPGTQ xmm1, xmm2/m128 | 66 0F 38 37 /r | sse4.2 | Compare packed signed qwords in xmm2/m128 and xmm1 for greater than. |
VPCMPGTQ xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 37 /r | avx | Compare packed signed qwords in xmm2 and xmm3/m128 for greater than. |
VPCMPGTQ ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 37 /r | avx2 | Compare packed signed qwords in ymm2 and ymm3/m256 for greater than. |
VPCMPGTQ k1 {k2}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 37 /r | avx512 | Compare Greater between int64 vector xmm2 and int64 vector xmm3/m128/m64bcst, and set vector mask k1 to reflect the zero/nonzero status of each element of the result, under writemask. |
VPCMPGTQ k1 {k2}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 37 /r | avx512 | Compare Greater between int64 vector ymm2 and int64 vector ymm3/m256/m64bcst, and set vector mask k1 to reflect the zero/nonzero status of each element of the result, under writemask. |
VPCMPGTQ k1 {k2}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F38.W1 37 /r | avx512 | Compare Greater between int64 vector zmm2 and int64 vector zmm3/m512/m64bcst, and set vector mask k1 to reflect the zero/nonzero status of each element of the result, under writemask. |
PCMPISTRI xmm1, xmm2/m128, imm8 | 66 0F 3A 63 /r imm8 | sse4.2 | Perform a packed comparison of string data with implicit lengths, generating an index, and storing the result in ECX. |
VPCMPISTRI xmm1, xmm2/m128, imm8 | VEX.128.66.0F3A.WIG 63 /r ib | avx | Perform a packed comparison of string data with implicit lengths, generating an index, and storing the result in ECX. |
PCMPISTRM xmm1, xmm2/m128, imm8 | 66 0F 3A 62 /r imm8 | sse4.2 | Perform a packed comparison of string data with implicit lengths, generating a mask, and storing the result in XMM0. |
VPCMPISTRM xmm1, xmm2/m128, imm8 | VEX.128.66.0F3A.WIG 62 /r ib | avx | Perform a packed comparison of string data with implicit lengths, generating a mask, and storing the result in XMM0. |
PDEP r32a, r32b, r/m32 | VEX.NDS.LZ.F2.0F38.W0 F5 /r | bmi2 | Parallel deposit of bits from r32b using mask in r/m32, result is written to r32a. |
PDEP r64a, r64b, r/m64 | VEX.NDS.LZ.F2.0F38.W1 F5 /r | bmi2 | Parallel deposit of bits from r64b using mask in r/m64, result is written to r64a. |
PEXT r32a, r32b, r/m32 | VEX.NDS.LZ.F3.0F38.W0 F5 /r | bmi2 | Parallel extract of bits from r32b using mask in r/m32, result is written to r32a. |
PEXT r64a, r64b, r/m64 | VEX.NDS.LZ.F3.0F38.W1 F5 /r | bmi2 | Parallel extract of bits from r64b using mask in r/m64, result is written to r64a. |
PEXTRB reg/m8, xmm2, imm8 | 66 0F 3A 14 /r ib | sse4.1 | Extract a byte integer value from xmm2 at the source byte offset specified by imm8 into reg or m8. The upper bits of r32 or r64 are zeroed. |
PEXTRD r/m32, xmm2, imm8 | 66 0F 3A 16 /r ib | sse4.1 | Extract a dword integer value from xmm2 at the source dword offset specified by imm8 into r/m32. |
PEXTRQ r/m64, xmm2, imm8 | 66 REX.W 0F 3A 16 /r ib | sse4.1 | Extract a qword integer value from xmm2 at the source qword offset specified by imm8 into r/m64. |
VPEXTRB reg/m8, xmm2, imm8 | VEX.128.66.0F3A.W0 14 /r ib | avx | Extract a byte integer value from xmm2 at the source byte offset specified by imm8 into reg or m8. The upper bits of r64/r32 are filled with zeros. |
VPEXTRD r32/m32, xmm2, imm8 | VEX.128.66.0F3A.W0 16 /r ib | avx | Extract a dword integer value from xmm2 at the source dword offset specified by imm8 into r32/m32. |
VPEXTRQ r64/m64, xmm2, imm8 | VEX.128.66.0F3A.W1 16 /r ib | avx | Extract a qword integer value from xmm2 at the source qword offset specified by imm8 into r64/m64. |
VPEXTRB reg/m8, xmm2, imm8 | EVEX.128.66.0F3A.WIG 14 /r ib | avx512 | Extract a byte integer value from xmm2 at the source byte offset specified by imm8 into reg or m8. The upper bits of r64/r32 are filled with zeros. |
VPEXTRD r32/m32, xmm2, imm8 | EVEX.128.66.0F3A.W0 16 /r ib | avx512 | Extract a dword integer value from xmm2 at the source dword offset specified by imm8 into r32/m32. |
VPEXTRQ r64/m64, xmm2, imm8 | EVEX.128.66.0F3A.W1 16 /r ib | avx512 | Extract a qword integer value from xmm2 at the source qword offset specified by imm8 into r64/m64. |
PEXTRW reg, mm, imm8 | 0F C5 /r ib | sse | Extract the word specified by imm8 from mm and move it to reg, bits 15:0. The upper bits of r32 or r64 are zeroed. |
PEXTRW reg, xmm, imm8 | 66 0F C5 /r ib | sse2 | Extract the word specified by imm8 from xmm and move it to reg, bits 15:0. The upper bits of r32 or r64 are zeroed. |
PEXTRW reg/m16, xmm, imm8 | 66 0F 3A 15 /r ib | sse4.1 | Extract the word specified by imm8 from xmm and copy it to lowest 16 bits of reg or m16. Zero-extend the result in the destination, r32 or r64. |
VPEXTRW reg, xmm1, imm8 | VEX.128.66.0F.W0 C5 /r ib | avx | Extract the word specified by imm8 from xmm1 and move it to reg, bits 15:0. Zero-extend the result. The upper bits of r64/r32 are filled with zeros. |
VPEXTRW reg/m16, xmm2, imm8 | VEX.128.66.0F3A.W0 15 /r ib | avx | Extract a word integer value from xmm2 at the source word offset specified by imm8 into reg or m16. The upper bits of r64/r32 are filled with zeros. |
VPEXTRW reg, xmm1, imm8 | EVEX.128.66.0F.WIG C5 /r ib | avx512 | Extract the word specified by imm8 from xmm1 and move it to reg, bits 15:0. Zero-extend the result. The upper bits of r64/r32 are filled with zeros. |
VPEXTRW reg/m16, xmm2, imm8 | EVEX.128.66.0F3A.WIG 15 /r ib | avx512 | Extract a word integer value from xmm2 at the source word offset specified by imm8 into reg or m16. The upper bits of r64/r32 are filled with zeros. |
PHADDSW mm1, mm2/m64 | 0F 38 03 /r | ssse3 | Add 16-bit signed integers horizontally, pack saturated integers to mm1. |
PHADDSW xmm1, xmm2/m128 | 66 0F 38 03 /r | ssse3 | Add 16-bit signed integers horizontally, pack saturated integers to xmm1. |
VPHADDSW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 03 /r | avx | Add 16-bit signed integers horizontally, pack saturated integers to xmm1. |
VPHADDSW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 03 /r | avx2 | Add 16-bit signed integers horizontally, pack saturated integers to ymm1. |
PHADDW mm1, mm2/m64 | 0F 38 01 /r | ssse3 | Add 16-bit integers horizontally, pack to mm1. |
PHADDW xmm1, xmm2/m128 | 66 0F 38 01 /r | ssse3 | Add 16-bit integers horizontally, pack to xmm1. |
PHADDD mm1, mm2/m64 | 0F 38 02 /r | ssse3 | Add 32-bit integers horizontally, pack to mm1. |
PHADDD xmm1, xmm2/m128 | 66 0F 38 02 /r | ssse3 | Add 32-bit integers horizontally, pack to xmm1. |
VPHADDW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 01 /r | avx | Add 16-bit integers horizontally, pack to xmm1. |
VPHADDD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 02 /r | avx | Add 32-bit integers horizontally, pack to xmm1. |
VPHADDW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 01 /r | avx2 | Add 16-bit signed integers horizontally, pack to ymm1. |
VPHADDD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 02 /r | avx2 | Add 32-bit signed integers horizontally, pack to ymm1. |
PHMINPOSUW xmm1, xmm2/m128 | 66 0F 38 41 /r | sse4.1 | Find the minimum unsigned word in xmm2/m128 and place its value in the low word of xmm1 and its index in the second-lowest word of xmm1. |
VPHMINPOSUW xmm1, xmm2/m128 | VEX.128.66.0F38.WIG 41 /r | avx | Find the minimum unsigned word in xmm2/m128 and place its value in the low word of xmm1 and its index in the second-lowest word of xmm1. |
PHSUBSW mm1, mm2/m64 | 0F 38 07 /r | ssse3 | Subtract 16-bit signed integer horizontally, pack saturated integers to mm1. |
PHSUBSW xmm1, xmm2/m128 | 66 0F 38 07 /r | ssse3 | Subtract 16-bit signed integer horizontally, pack saturated integers to xmm1. |
VPHSUBSW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 07 /r | avx | Subtract 16-bit signed integer horizontally, pack saturated integers to xmm1. |
VPHSUBSW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 07 /r | avx2 | Subtract 16-bit signed integer horizontally, pack saturated integers to ymm1. |
PHSUBW mm1, mm2/m64 | 0F 38 05 /r | ssse3 | Subtract 16-bit signed integers horizontally, pack to mm1. |
PHSUBW xmm1, xmm2/m128 | 66 0F 38 05 /r | ssse3 | Subtract 16-bit signed integers horizontally, pack to xmm1. |
PHSUBD mm1, mm2/m64 | 0F 38 06 /r | ssse3 | Subtract 32-bit signed integers horizontally, pack to mm1. |
PHSUBD xmm1, xmm2/m128 | 66 0F 38 06 /r | ssse3 | Subtract 32-bit signed integers horizontally, pack to xmm1. |
VPHSUBW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 05 /r | avx | Subtract 16-bit signed integers horizontally, pack to xmm1. |
VPHSUBD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 06 /r | avx | Subtract 32-bit signed integers horizontally, pack to xmm1. |
VPHSUBW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 05 /r | avx2 | Subtract 16-bit signed integers horizontally, pack to ymm1. |
VPHSUBD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 06 /r | avx2 | Subtract 32-bit signed integers horizontally, pack to ymm1. |
PINSRB xmm1, r32/m8, imm8 | 66 0F 3A 20 /r ib | sse4.1 | Insert a byte integer value from r32/m8 into xmm1 at the destination element specified by imm8. |
PINSRD xmm1, r/m32, imm8 | 66 0F 3A 22 /r ib | sse4.1 | Insert a dword integer value from r/m32 into the xmm1 at the destination element specified by imm8. |
PINSRQ xmm1, r/m64, imm8 | 66 REX.W 0F 3A 22 /r ib | sse4.1 | Insert a qword integer value from r/m64 into the xmm1 at the destination element specified by imm8. |
VPINSRB xmm1, xmm2, r32/m8, imm8 | VEX.NDS.128.66.0F3A.W0 20 /r ib | avx | Merge a byte integer value from r32/m8 and rest from xmm2 into xmm1 at the byte offset in imm8. |
VPINSRD xmm1, xmm2, r/m32, imm8 | VEX.NDS.128.66.0F3A.W0 22 /r ib | avx | Insert a dword integer value from r32/m32 and rest from xmm2 into xmm1 at the dword offset in imm8. |
VPINSRQ xmm1, xmm2, r/m64, imm8 | VEX.NDS.128.66.0F3A.W1 22 /r ib | avx | Insert a qword integer value from r64/m64 and rest from xmm2 into xmm1 at the qword offset in imm8. |
VPINSRB xmm1, xmm2, r32/m8, imm8 | EVEX.NDS.128.66.0F3A.WIG 20 /r ib | avx512 | Merge a byte integer value from r32/m8 and rest from xmm2 into xmm1 at the byte offset in imm8. |
VPINSRD xmm1, xmm2, r32/m32, imm8 | EVEX.NDS.128.66.0F3A.W0 22 /r ib | avx512 | Insert a dword integer value from r32/m32 and rest from xmm2 into xmm1 at the dword offset in imm8. |
VPINSRQ xmm1, xmm2, r64/m64, imm8 | EVEX.NDS.128.66.0F3A.W1 22 /r ib | avx512 | Insert a qword integer value from r64/m64 and rest from xmm2 into xmm1 at the qword offset in imm8. |
PINSRW mm, r32/m16, imm8 | 0F C4 /r ib | sse | Insert the low word from r32 or from m16 into mm at the word position specified by imm8. |
PINSRW xmm, r32/m16, imm8 | 66 0F C4 /r ib | sse2 | Move the low word of r32 or from m16 into xmm at the word position specified by imm8. |
VPINSRW xmm1, xmm2, r32/m16, imm8 | VEX.NDS.128.66.0F.W0 C4 /r ib | avx | Insert a word integer value from r32/m16 and rest from xmm2 into xmm1 at the word offset in imm8. |
VPINSRW xmm1, xmm2, r32/m16, imm8 | EVEX.NDS.128.66.0F.WIG C4 /r ib | avx512 | Insert a word integer value from r32/m16 and rest from xmm2 into xmm1 at the word offset in imm8. |
PMADDUBSW mm1, mm2/m64 | 0F 38 04 /r | ssse3 | Multiply signed and unsigned bytes, add horizontal pair of signed words, pack saturated signed-words to mm1. |
PMADDUBSW xmm1, xmm2/m128 | 66 0F 38 04 /r | ssse3 | Multiply signed and unsigned bytes, add horizontal pair of signed words, pack saturated signed-words to xmm1. |
VPMADDUBSW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 04 /r | avx | Multiply signed and unsigned bytes, add horizontal pair of signed words, pack saturated signed-words to xmm1. |
VPMADDUBSW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 04 /r | avx2 | Multiply signed and unsigned bytes, add horizontal pair of signed words, pack saturated signed-words to ymm1. |
VPMADDUBSW xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F38.WIG 04 /r | avx512 | Multiply signed and unsigned bytes, add horizontal pair of signed words, pack saturated signed-words to xmm1 under writemask k1. |
VPMADDUBSW ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F38.WIG 04 /r | avx512 | Multiply signed and unsigned bytes, add horizontal pair of signed words, pack saturated signed-words to ymm1 under writemask k1. |
VPMADDUBSW zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F38.WIG 04 /r | avx512 | Multiply signed and unsigned bytes, add horizontal pair of signed words, pack saturated signed-words to zmm1 under writemask k1. |
PMADDWD mm, mm/m64 | 0F F5 /r | mmx | Multiply the packed words in mm by the packed words in mm/m64, add adjacent doubleword results, and store in mm. |
PMADDWD xmm1, xmm2/m128 | 66 0F F5 /r | sse2 | Multiply the packed word integers in xmm1 by the packed word integers in xmm2/m128, add adjacent doubleword results, and store in xmm1. |
VPMADDWD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG F5 /r | avx | Multiply the packed word integers in xmm2 by the packed word integers in xmm3/m128, add adjacent doubleword results, and store in xmm1. |
VPMADDWD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG F5 /r | avx2 | Multiply the packed word integers in ymm2 by the packed word integers in ymm3/m256, add adjacent doubleword results, and store in ymm1. |
VPMADDWD xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG F5 /r | avx512 | Multiply the packed word integers in xmm2 by the packed word integers in xmm3/m128, add adjacent doubleword results, and store in xmm1 under writemask k1. |
VPMADDWD ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG F5 /r | avx512 | Multiply the packed word integers in ymm2 by the packed word integers in ymm3/m256, add adjacent doubleword results, and store in ymm1 under writemask k1. |
VPMADDWD zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG F5 /r | avx512 | Multiply the packed word integers in zmm2 by the packed word integers in zmm3/m512, add adjacent doubleword results, and store in zmm1 under writemask k1. |
PMAXSW mm1, mm2/m64 | 0F EE /r | sse | Compare signed word integers in mm2/m64 and mm1 and return maximum values. |
PMAXSB xmm1, xmm2/m128 | 66 0F 38 3C /r | sse4.1 | Compare packed signed byte integers in xmm1 and xmm2/m128 and store packed maximum values in xmm1. |
PMAXSW xmm1, xmm2/m128 | 66 0F EE /r | sse2 | Compare packed signed word integers in xmm2/m128 and xmm1 and stores maximum packed values in xmm1. |
PMAXSD xmm1, xmm2/m128 | 66 0F 38 3D /r | sse4.1 | Compare packed signed dword integers in xmm1 and xmm2/m128 and store packed maximum values in xmm1. |
VPMAXSB xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 3C /r | avx | Compare packed signed byte integers in xmm2 and xmm3/m128 and store packed maximum values in xmm1. |
VPMAXSW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG EE /r | avx | Compare packed signed word integers in xmm3/m128 and xmm2 and store packed maximum values in xmm1. |
VPMAXSD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 3D /r | avx | Compare packed signed dword integers in xmm2 and xmm3/m128 and store packed maximum values in xmm1. |
VPMAXSB ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 3C /r | avx2 | Compare packed signed byte integers in ymm2 and ymm3/m256 and store packed maximum values in ymm1. |
VPMAXSW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG EE /r | avx2 | Compare packed signed word integers in ymm3/m256 and ymm2 and store packed maximum values in ymm1. |
VPMAXSD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 3D /r | avx2 | Compare packed signed dword integers in ymm2 and ymm3/m256 and store packed maximum values in ymm1. |
VPMAXSB xmm1{k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F38.WIG 3C /r | avx512 | Compare packed signed byte integers in xmm2 and xmm3/m128 and store packed maximum values in xmm1 under writemask k1. |
VPMAXSB ymm1{k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F38.WIG 3C /r | avx512 | Compare packed signed byte integers in ymm2 and ymm3/m256 and store packed maximum values in ymm1 under writemask k1. |
VPMAXSB zmm1{k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F38.WIG 3C /r | avx512 | Compare packed signed byte integers in zmm2 and zmm3/m512 and store packed maximum values in zmm1 under writemask k1. |
VPMAXSW xmm1{k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG EE /r | avx512 | Compare packed signed word integers in xmm2 and xmm3/m128 and store packed maximum values in xmm1 under writemask k1. |
VPMAXSW ymm1{k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG EE /r | avx512 | Compare packed signed word integers in ymm2 and ymm3/m256 and store packed maximum values in ymm1 under writemask k1. |
VPMAXSW zmm1{k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG EE /r | avx512 | Compare packed signed word integers in zmm2 and zmm3/m512 and store packed maximum values in zmm1 under writemask k1. |
VPMAXSD xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 3D /r | avx512 | Compare packed signed dword integers in xmm2 and xmm3/m128/m32bcst and store packed maximum values in xmm1 using writemask k1. |
VPMAXSD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 3D /r | avx512 | Compare packed signed dword integers in ymm2 and ymm3/m256/m32bcst and store packed maximum values in ymm1 using writemask k1. |
VPMAXSD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F38.W0 3D /r | avx512 | Compare packed signed dword integers in zmm2 and zmm3/m512/m32bcst and store packed maximum values in zmm1 using writemask k1. |
VPMAXSQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 3D /r | avx512 | Compare packed signed qword integers in xmm2 and xmm3/m128/m64bcst and store packed maximum values in xmm1 using writemask k1. |
VPMAXSQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 3D /r | avx512 | Compare packed signed qword integers in ymm2 and ymm3/m256/m64bcst and store packed maximum values in ymm1 using writemask k1. |
VPMAXSQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F38.W1 3D /r | avx512 | Compare packed signed qword integers in zmm2 and zmm3/m512/m64bcst and store packed maximum values in zmm1 using writemask k1. |
PMAXUB mm1, mm2/m64 | 0F DE /r | sse | Compare unsigned byte integers in mm2/m64 and mm1 and return maximum values. |
PMAXUB xmm1, xmm2/m128 | 66 0F DE /r | sse2 | Compare packed unsigned byte integers in xmm1 and xmm2/m128 and store packed maximum values in xmm1. |
PMAXUW xmm1, xmm2/m128 | 66 0F 38 3E /r | sse4.1 | Compare packed unsigned word integers in xmm2/m128 and xmm1 and stores maximum packed values in xmm1. |
VPMAXUB xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F DE /r | avx | Compare packed unsigned byte integers in xmm2 and xmm3/m128 and store packed maximum values in xmm1. |
VPMAXUW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38 3E /r | avx | Compare packed unsigned word integers in xmm3/m128 and xmm2 and store maximum packed values in xmm1. |
VPMAXUB ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F DE /r | avx2 | Compare packed unsigned byte integers in ymm2 and ymm3/m256 and store packed maximum values in ymm1. |
VPMAXUW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38 3E /r | avx2 | Compare packed unsigned word integers in ymm3/m256 and ymm2 and store maximum packed values in ymm1. |
VPMAXUB xmm1{k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG DE /r | avx512 | Compare packed unsigned byte integers in xmm2 and xmm3/m128 and store packed maximum values in xmm1 under writemask k1. |
VPMAXUB ymm1{k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG DE /r | avx512 | Compare packed unsigned byte integers in ymm2 and ymm3/m256 and store packed maximum values in ymm1 under writemask k1. |
VPMAXUB zmm1{k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG DE /r | avx512 | Compare packed unsigned byte integers in zmm2 and zmm3/m512 and store packed maximum values in zmm1 under writemask k1. |
VPMAXUW xmm1{k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F38.WIG 3E /r | avx512 | Compare packed unsigned word integers in xmm2 and xmm3/m128 and store packed maximum values in xmm1 under writemask k1. |
VPMAXUW ymm1{k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F38.WIG 3E /r | avx512 | Compare packed unsigned word integers in ymm2 and ymm3/m256 and store packed maximum values in ymm1 under writemask k1. |
VPMAXUW zmm1{k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F38.WIG 3E /r | avx512 | Compare packed unsigned word integers in zmm2 and zmm3/m512 and store packed maximum values in zmm1 under writemask k1. |
PMAXUD xmm1, xmm2/m128 | 66 0F 38 3F /r | sse4.1 | Compare packed unsigned dword integers in xmm1 and xmm2/m128 and store packed maximum values in xmm1. |
VPMAXUD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 3F /r | avx | Compare packed unsigned dword integers in xmm2 and xmm3/m128 and store packed maximum values in xmm1. |
VPMAXUD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 3F /r | avx2 | Compare packed unsigned dword integers in ymm2 and ymm3/m256 and store packed maximum values in ymm1. |
VPMAXUD xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 3F /r | avx512 | Compare packed unsigned dword integers in xmm2 and xmm3/m128/m32bcst and store packed maximum values in xmm1 under writemask k1. |
VPMAXUD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 3F /r | avx512 | Compare packed unsigned dword integers in ymm2 and ymm3/m256/m32bcst and store packed maximum values in ymm1 under writemask k1. |
VPMAXUD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F38.W0 3F /r | avx512 | Compare packed unsigned dword integers in zmm2 and zmm3/m512/m32bcst and store packed maximum values in zmm1 under writemask k1. |
VPMAXUQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 3F /r | avx512 | Compare packed unsigned qword integers in xmm2 and xmm3/m128/m64bcst and store packed maximum values in xmm1 under writemask k1. |
VPMAXUQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 3F /r | avx512 | Compare packed unsigned qword integers in ymm2 and ymm3/m256/m64bcst and store packed maximum values in ymm1 under writemask k1. |
VPMAXUQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F38.W1 3F /r | avx512 | Compare packed unsigned qword integers in zmm2 and zmm3/m512/m64bcst and store packed maximum values in zmm1 under writemask k1. |
PMINSW mm1, mm2/m64 | 0F EA /r | sse | Compare signed word integers in mm2/m64 and mm1 and return minimum values. |
PMINSB xmm1, xmm2/m128 | 66 0F 38 38 /r | sse4.1 | Compare packed signed byte integers in xmm1 and xmm2/m128 and store packed minimum values in xmm1. |
PMINSW xmm1, xmm2/m128 | 66 0F EA /r | sse2 | Compare packed signed word integers in xmm2/m128 and xmm1 and store packed minimum values in xmm1. |
VPMINSB xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38 38 /r | avx | Compare packed signed byte integers in xmm2 and xmm3/m128 and store packed minimum values in xmm1. |
VPMINSW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F EA /r | avx | Compare packed signed word integers in xmm3/m128 and xmm2 and return packed minimum values in xmm1. |
VPMINSB ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38 38 /r | avx2 | Compare packed signed byte integers in ymm2 and ymm3/m256 and store packed minimum values in ymm1. |
VPMINSW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F EA /r | avx2 | Compare packed signed word integers in ymm3/m256 and ymm2 and return packed minimum values in ymm1. |
VPMINSB xmm1{k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F38.WIG 38 /r | avx512 | Compare packed signed byte integers in xmm2 and xmm3/m128 and store packed minimum values in xmm1 under writemask k1. |
VPMINSB ymm1{k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F38.WIG 38 /r | avx512 | Compare packed signed byte integers in ymm2 and ymm3/m256 and store packed minimum values in ymm1 under writemask k1. |
VPMINSB zmm1{k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F38.WIG 38 /r | avx512 | Compare packed signed byte integers in zmm2 and zmm3/m512 and store packed minimum values in zmm1 under writemask k1. |
VPMINSW xmm1{k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG EA /r | avx512 | Compare packed signed word integers in xmm2 and xmm3/m128 and store packed minimum values in xmm1 under writemask k1. |
VPMINSW ymm1{k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG EA /r | avx512 | Compare packed signed word integers in ymm2 and ymm3/m256 and store packed minimum values in ymm1 under writemask k1. |
VPMINSW zmm1{k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG EA /r | avx512 | Compare packed signed word integers in zmm2 and zmm3/m512 and store packed minimum values in zmm1 under writemask k1. |
PMINSD xmm1, xmm2/m128 | 66 0F 38 39 /r | sse4.1 | Compare packed signed dword integers in xmm1 and xmm2/m128 and store packed minimum values in xmm1. |
VPMINSD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 39 /r | avx | Compare packed signed dword integers in xmm2 and xmm3/m128 and store packed minimum values in xmm1. |
VPMINSD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 39 /r | avx2 | Compare packed signed dword integers in ymm2 and ymm3/m256 and store packed minimum values in ymm1. |
VPMINSD xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 39 /r | avx512 | Compare packed signed dword integers in xmm2 and xmm3/m128/m32bcst and store packed minimum values in xmm1 under writemask k1. |
VPMINSD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 39 /r | avx512 | Compare packed signed dword integers in ymm2 and ymm3/m256/m32bcst and store packed minimum values in ymm1 under writemask k1. |
VPMINSD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F38.W0 39 /r | avx512 | Compare packed signed dword integers in zmm2 and zmm3/m512/m32bcst and store packed minimum values in zmm1 under writemask k1. |
VPMINSQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 39 /r | avx512 | Compare packed signed qword integers in xmm2 and xmm3/m128/m64bcst and store packed minimum values in xmm1 under writemask k1. |
VPMINSQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 39 /r | avx512 | Compare packed signed qword integers in ymm2 and ymm3/m256/m64bcst and store packed minimum values in ymm1 under writemask k1. |
VPMINSQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F38.W1 39 /r | avx512 | Compare packed signed qword integers in zmm2 and zmm3/m512/m64bcst and store packed minimum values in zmm1 under writemask k1. |
PMINUB mm1, mm2/m64 | 0F DA /r | sse | Compare unsigned byte integers in mm2/m64 and mm1 and return minimum values. |
PMINUB xmm1, xmm2/m128 | 66 0F DA /r | sse2 | Compare packed unsigned byte integers in xmm1 and xmm2/m128 and store packed minimum values in xmm1. |
PMINUW xmm1, xmm2/m128 | 66 0F 38 3A /r | sse4.1 | Compare packed unsigned word integers in xmm2/m128 and xmm1 and store packed minimum values in xmm1. |
VPMINUB xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F DA /r | avx | Compare packed unsigned byte integers in xmm2 and xmm3/m128 and store packed minimum values in xmm1. |
VPMINUW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38 3A /r | avx | Compare packed unsigned word integers in xmm3/m128 and xmm2 and return packed minimum values in xmm1. |
VPMINUB ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F DA /r | avx2 | Compare packed unsigned byte integers in ymm2 and ymm3/m256 and store packed minimum values in ymm1. |
VPMINUW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38 3A /r | avx2 | Compare packed unsigned word integers in ymm3/m256 and ymm2 and return packed minimum values in ymm1. |
VPMINUB xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG DA /r | avx512 | Compare packed unsigned byte integers in xmm2 and xmm3/m128 and store packed minimum values in xmm1 under writemask k1. |
VPMINUB ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG DA /r | avx512 | Compare packed unsigned byte integers in ymm2 and ymm3/m256 and store packed minimum values in ymm1 under writemask k1. |
VPMINUB zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG DA /r | avx512 | Compare packed unsigned byte integers in zmm2 and zmm3/m512 and store packed minimum values in zmm1 under writemask k1. |
VPMINUW xmm1{k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F38.WIG 3A /r | avx512 | Compare packed unsigned word integers in xmm3/m128 and xmm2 and return packed minimum values in xmm1 under writemask k1. |
VPMINUW ymm1{k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F38.WIG 3A /r | avx512 | Compare packed unsigned word integers in ymm3/m256 and ymm2 and return packed minimum values in ymm1 under writemask k1. |
VPMINUW zmm1{k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F38.WIG 3A /r | avx512 | Compare packed unsigned word integers in zmm3/m512 and zmm2 and return packed minimum values in zmm1 under writemask k1. |
PMINUD xmm1, xmm2/m128 | 66 0F 38 3B /r | sse4.1 | Compare packed unsigned dword integers in xmm1 and xmm2/m128 and store packed minimum values in xmm1. |
VPMINUD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 3B /r | avx | Compare packed unsigned dword integers in xmm2 and xmm3/m128 and store packed minimum values in xmm1. |
VPMINUD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 3B /r | avx2 | Compare packed unsigned dword integers in ymm2 and ymm3/m256 and store packed minimum values in ymm1. |
VPMINUD xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 3B /r | avx512 | Compare packed unsigned dword integers in xmm2 and xmm3/m128/m32bcst and store packed minimum values in xmm1 under writemask k1. |
VPMINUD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 3B /r | avx512 | Compare packed unsigned dword integers in ymm2 and ymm3/m256/m32bcst and store packed minimum values in ymm1 under writemask k1. |
VPMINUD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F38.W0 3B /r | avx512 | Compare packed unsigned dword integers in zmm2 and zmm3/m512/m32bcst and store packed minimum values in zmm1 under writemask k1. |
VPMINUQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 3B /r | avx512 | Compare packed unsigned qword integers in xmm2 and xmm3/m128/m64bcst and store packed minimum values in xmm1 under writemask k1. |
VPMINUQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 3B /r | avx512 | Compare packed unsigned qword integers in ymm2 and ymm3/m256/m64bcst and store packed minimum values in ymm1 under writemask k1. |
VPMINUQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F38.W1 3B /r | avx512 | Compare packed unsigned qword integers in zmm2 and zmm3/m512/m64bcst and store packed minimum values in zmm1 under writemask k1. |
PMOVMSKB reg, mm | 0F D7 /r | sse | Move a byte mask of mm to reg. The upper bits of r32 or r64 are zeroed. |
PMOVMSKB reg, xmm | 66 0F D7 /r | sse2 | Move a byte mask of xmm to reg. The upper bits of r32 or r64 are zeroed. |
VPMOVMSKB reg, xmm1 | VEX.128.66.0F.WIG D7 /r | avx | Move a byte mask of xmm1 to reg. The upper bits of r32 or r64 are filled with zeros. |
VPMOVMSKB reg, ymm1 | VEX.256.66.0F.WIG D7 /r | avx2 | Move a 32-bit mask of ymm1 to reg. The upper bits of r64 are filled with zeros. |
PMOVSXBW xmm1, xmm2/m64 | 66 0F 38 20 /r | sse4.1 | Sign extend 8 packed 8-bit integers in the low 8 bytes of xmm2/m64 to 8 packed 16-bit integers in xmm1. |
PMOVSXBD xmm1, xmm2/m32 | 66 0F 38 21 /r | sse4.1 | Sign extend 4 packed 8-bit integers in the low 4 bytes of xmm2/m32 to 4 packed 32-bit integers in xmm1. |
PMOVSXBQ xmm1, xmm2/m16 | 66 0F 38 22 /r | sse4.1 | Sign extend 2 packed 8-bit integers in the low 2 bytes of xmm2/m16 to 2 packed 64-bit integers in xmm1. |
PMOVSXWD xmm1, xmm2/m64 | 66 0F 38 23 /r | sse4.1 | Sign extend 4 packed 16-bit integers in the low 8 bytes of xmm2/m64 to 4 packed 32-bit integers in xmm1. |
PMOVSXWQ xmm1, xmm2/m32 | 66 0F 38 24 /r | sse4.1 | Sign extend 2 packed 16-bit integers in the low 4 bytes of xmm2/m32 to 2 packed 64-bit integers in xmm1. |
PMOVSXDQ xmm1, xmm2/m64 | 66 0F 38 25 /r | sse4.1 | Sign extend 2 packed 32-bit integers in the low 8 bytes of xmm2/m64 to 2 packed 64-bit integers in xmm1. |
VPMOVSXBW xmm1, xmm2/m64 | VEX.128.66.0F38.WIG 20 /r | avx | Sign extend 8 packed 8-bit integers in the low 8 bytes of xmm2/m64 to 8 packed 16-bit integers in xmm1. |
VPMOVSXBD xmm1, xmm2/m32 | VEX.128.66.0F38.WIG 21 /r | avx | Sign extend 4 packed 8-bit integers in the low 4 bytes of xmm2/m32 to 4 packed 32-bit integers in xmm1. |
VPMOVSXBQ xmm1, xmm2/m16 | VEX.128.66.0F38.WIG 22 /r | avx | Sign extend 2 packed 8-bit integers in the low 2 bytes of xmm2/m16 to 2 packed 64-bit integers in xmm1. |
VPMOVSXWD xmm1, xmm2/m64 | VEX.128.66.0F38.WIG 23 /r | avx | Sign extend 4 packed 16-bit integers in the low 8 bytes of xmm2/m64 to 4 packed 32-bit integers in xmm1. |
VPMOVSXWQ xmm1, xmm2/m32 | VEX.128.66.0F38.WIG 24 /r | avx | Sign extend 2 packed 16-bit integers in the low 4 bytes of xmm2/m32 to 2 packed 64-bit integers in xmm1. |
VPMOVSXDQ xmm1, xmm2/m64 | VEX.128.66.0F38.WIG 25 /r | avx | Sign extend 2 packed 32-bit integers in the low 8 bytes of xmm2/m64 to 2 packed 64-bit integers in xmm1. |
VPMOVSXBW ymm1, xmm2/m128 | VEX.256.66.0F38.WIG 20 /r | avx2 | Sign extend 16 packed 8-bit integers in xmm2/m128 to 16 packed 16-bit integers in ymm1. |
VPMOVSXBD ymm1, xmm2/m64 | VEX.256.66.0F38.WIG 21 /r | avx2 | Sign extend 8 packed 8-bit integers in the low 8 bytes of xmm2/m64 to 8 packed 32-bit integers in ymm1. |
VPMOVSXBQ ymm1, xmm2/m32 | VEX.256.66.0F38.WIG 22 /r | avx2 | Sign extend 4 packed 8-bit integers in the low 4 bytes of xmm2/m32 to 4 packed 64-bit integers in ymm1. |
VPMOVSXWD ymm1, xmm2/m128 | VEX.256.66.0F38.WIG 23 /r | avx2 | Sign extend 8 packed 16-bit integers in the low 16 bytes of xmm2/m128 to 8 packed 32-bit integers in ymm1. |
PMOVZXBW xmm1, xmm2/m64 | 66 0F 38 30 /r | sse4.1 | Zero extend 8 packed 8-bit integers in the low 8 bytes of xmm2/m64 to 8 packed 16-bit integers in xmm1. |
PMOVZXBD xmm1, xmm2/m32 | 66 0F 38 31 /r | sse4.1 | Zero extend 4 packed 8-bit integers in the low 4 bytes of xmm2/m32 to 4 packed 32-bit integers in xmm1. |
PMOVZXBQ xmm1, xmm2/m16 | 66 0F 38 32 /r | sse4.1 | Zero extend 2 packed 8-bit integers in the low 2 bytes of xmm2/m16 to 2 packed 64-bit integers in xmm1. |
PMOVZXWD xmm1, xmm2/m64 | 66 0F 38 33 /r | sse4.1 | Zero extend 4 packed 16-bit integers in the low 8 bytes of xmm2/m64 to 4 packed 32-bit integers in xmm1. |
PMOVZXWQ xmm1, xmm2/m32 | 66 0F 38 34 /r | sse4.1 | Zero extend 2 packed 16-bit integers in the low 4 bytes of xmm2/m32 to 2 packed 64-bit integers in xmm1. |
PMOVZXDQ xmm1, xmm2/m64 | 66 0F 38 35 /r | sse4.1 | Zero extend 2 packed 32-bit integers in the low 8 bytes of xmm2/m64 to 2 packed 64-bit integers in xmm1. |
VPMOVZXBW xmm1, xmm2/m64 | VEX.128.66.0F38.WIG 30 /r | avx | Zero extend 8 packed 8-bit integers in the low 8 bytes of xmm2/m64 to 8 packed 16-bit integers in xmm1. |
VPMOVZXBD xmm1, xmm2/m32 | VEX.128.66.0F38.WIG 31 /r | avx | Zero extend 4 packed 8-bit integers in the low 4 bytes of xmm2/m32 to 4 packed 32-bit integers in xmm1. |
VPMOVZXBQ xmm1, xmm2/m16 | VEX.128.66.0F38.WIG 32 /r | avx | Zero extend 2 packed 8-bit integers in the low 2 bytes of xmm2/m16 to 2 packed 64-bit integers in xmm1. |
VPMOVZXWD xmm1, xmm2/m64 | VEX.128.66.0F38.WIG 33 /r | avx | Zero extend 4 packed 16-bit integers in the low 8 bytes of xmm2/m64 to 4 packed 32-bit integers in xmm1. |
VPMOVZXWQ xmm1, xmm2/m32 | VEX.128.66.0F38.WIG 34 /r | avx | Zero extend 2 packed 16-bit integers in the low 4 bytes of xmm2/m32 to 2 packed 64-bit integers in xmm1. |
VPMOVZXDQ xmm1, xmm2/m64 | VEX.128.66.0F38.WIG 35 /r | avx | Zero extend 2 packed 32-bit integers in the low 8 bytes of xmm2/m64 to 2 packed 64-bit integers in xmm1. |
VPMOVZXBW ymm1, xmm2/m128 | VEX.256.66.0F38.WIG 30 /r | avx2 | Zero extend 16 packed 8-bit integers in xmm2/m128 to 16 packed 16-bit integers in ymm1. |
VPMOVZXBD ymm1, xmm2/m64 | VEX.256.66.0F38.WIG 31 /r | avx2 | Zero extend 8 packed 8-bit integers in the low 8 bytes of xmm2/m64 to 8 packed 32-bit integers in ymm1. |
VPMOVZXBQ ymm1, xmm2/m32 | VEX.256.66.0F38.WIG 32 /r | avx2 | Zero extend 4 packed 8-bit integers in the low 4 bytes of xmm2/m32 to 4 packed 64-bit integers in ymm1. |
VPMOVZXWD ymm1, xmm2/m128 | VEX.256.66.0F38.WIG 33 /r | avx2 | Zero extend 8 packed 16-bit integers in xmm2/m128 to 8 packed 32-bit integers in ymm1. |
PMULDQ xmm1, xmm2/m128 | 66 0F 38 28 /r | sse4.1 | Multiply packed signed doubleword integers in xmm1 by packed signed doubleword integers in xmm2/m128, and store the quadword results in xmm1. |
VPMULDQ xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 28 /r | avx | Multiply packed signed doubleword integers in xmm2 by packed signed doubleword integers in xmm3/m128, and store the quadword results in xmm1. |
VPMULDQ ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 28 /r | avx2 | Multiply packed signed doubleword integers in ymm2 by packed signed doubleword integers in ymm3/m256, and store the quadword results in ymm1. |
VPMULDQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 28 /r | avx512 | Multiply packed signed doubleword integers in xmm2 by packed signed doubleword integers in xmm3/m128/m64bcst, and store the quadword results in xmm1 using writemask k1. |
VPMULDQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 28 /r | avx512 | Multiply packed signed doubleword integers in ymm2 by packed signed doubleword integers in ymm3/m256/m64bcst, and store the quadword results in ymm1 using writemask k1. |
VPMULDQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F38.W1 28 /r | avx512 | Multiply packed signed doubleword integers in zmm2 by packed signed doubleword integers in zmm3/m512/m64bcst, and store the quadword results in zmm1 using writemask k1. |
PMULHRSW mm1, mm2/m64 | 0F 38 0B /r | ssse3 | Multiply 16-bit signed words, scale and round signed doublewords, pack high 16 bits to mm1. |
PMULHRSW xmm1, xmm2/m128 | 66 0F 38 0B /r | ssse3 | Multiply 16-bit signed words, scale and round signed doublewords, pack high 16 bits to xmm1. |
VPMULHRSW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 0B /r | avx | Multiply 16-bit signed words, scale and round signed doublewords, pack high 16 bits to xmm1. |
VPMULHRSW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 0B /r | avx2 | Multiply 16-bit signed words, scale and round signed doublewords, pack high 16 bits to ymm1. |
VPMULHRSW xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F38.WIG 0B /r | avx512 | Multiply 16-bit signed words, scale and round signed doublewords, pack high 16 bits to xmm1 under writemask k1. |
VPMULHRSW ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F38.WIG 0B /r | avx512 | Multiply 16-bit signed words, scale and round signed doublewords, pack high 16 bits to ymm1 under writemask k1. |
VPMULHRSW zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F38.WIG 0B /r | avx512 | Multiply 16-bit signed words, scale and round signed doublewords, pack high 16 bits to zmm1 under writemask k1. |
PMULHUW mm1, mm2/m64 | 0F E4 /r | sse | Multiply the packed unsigned word integers in mm1 register and mm2/m64, and store the high 16 bits of the results in mm1. |
PMULHUW xmm1, xmm2/m128 | 66 0F E4 /r | sse2 | Multiply the packed unsigned word integers in xmm1 and xmm2/m128, and store the high 16 bits of the results in xmm1. |
VPMULHUW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG E4 /r | avx | Multiply the packed unsigned word integers in xmm2 and xmm3/m128, and store the high 16 bits of the results in xmm1. |
VPMULHUW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG E4 /r | avx2 | Multiply the packed unsigned word integers in ymm2 and ymm3/m256, and store the high 16 bits of the results in ymm1. |
VPMULHUW xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG E4 /r | avx512 | Multiply the packed unsigned word integers in xmm2 and xmm3/m128, and store the high 16 bits of the results in xmm1 under writemask k1. |
VPMULHUW ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG E4 /r | avx512 | Multiply the packed unsigned word integers in ymm2 and ymm3/m256, and store the high 16 bits of the results in ymm1 under writemask k1. |
VPMULHUW zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG E4 /r | avx512 | Multiply the packed unsigned word integers in zmm2 and zmm3/m512, and store the high 16 bits of the results in zmm1 under writemask k1. |
PMULHW mm1, mm2/m64 | 0F E5 /r | mmx | Multiply the packed signed word integers in mm1 register and mm2/m64, and store the high 16 bits of the results in mm1. |
PMULHW xmm1, xmm2/m128 | 66 0F E5 /r | sse2 | Multiply the packed signed word integers in xmm1 and xmm2/m128, and store the high 16 bits of the results in xmm1. |
VPMULHW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG E5 /r | avx | Multiply the packed signed word integers in xmm2 and xmm3/m128, and store the high 16 bits of the results in xmm1. |
VPMULHW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG E5 /r | avx2 | Multiply the packed signed word integers in ymm2 and ymm3/m256, and store the high 16 bits of the results in ymm1. |
VPMULHW xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG E5 /r | avx512 | Multiply the packed signed word integers in xmm2 and xmm3/m128, and store the high 16 bits of the results in xmm1 under writemask k1. |
VPMULHW ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG E5 /r | avx512 | Multiply the packed signed word integers in ymm2 and ymm3/m256, and store the high 16 bits of the results in ymm1 under writemask k1. |
VPMULHW zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG E5 /r | avx512 | Multiply the packed signed word integers in zmm2 and zmm3/m512, and store the high 16 bits of the results in zmm1 under writemask k1. |
PMULLD xmm1, xmm2/m128 | 66 0F 38 40 /r | sse4.1 | Multiply the packed dword signed integers in xmm1 and xmm2/m128 and store the low 32 bits of each product in xmm1. |
VPMULLD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 40 /r | avx | Multiply the packed dword signed integers in xmm2 and xmm3/m128 and store the low 32 bits of each product in xmm1. |
VPMULLD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 40 /r | avx2 | Multiply the packed dword signed integers in ymm2 and ymm3/m256 and store the low 32 bits of each product in ymm1. |
VPMULLD xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 40 /r | avx512 | Multiply the packed dword signed integers in xmm2 and xmm3/m128/m32bcst and store the low 32 bits of each product in xmm1 under writemask k1. |
VPMULLD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 40 /r | avx512 | Multiply the packed dword signed integers in ymm2 and ymm3/m256/m32bcst and store the low 32 bits of each product in ymm1 under writemask k1. |
VPMULLD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F38.W0 40 /r | avx512 | Multiply the packed dword signed integers in zmm2 and zmm3/m512/m32bcst and store the low 32 bits of each product in zmm1 under writemask k1. |
VPMULLQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 40 /r | avx512 | Multiply the packed qword signed integers in xmm2 and xmm3/m128/m64bcst and store the low 64 bits of each product in xmm1 under writemask k1. |
VPMULLQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 40 /r | avx512 | Multiply the packed qword signed integers in ymm2 and ymm3/m256/m64bcst and store the low 64 bits of each product in ymm1 under writemask k1. |
VPMULLQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F38.W1 40 /r | avx512 | Multiply the packed qword signed integers in zmm2 and zmm3/m512/m64bcst and store the low 64 bits of each product in zmm1 under writemask k1. |
PMULLW mm1, mm2/m64 | 0F D5 /r | mmx | Multiply the packed signed word integers in mm1 register and mm2/m64, and store the low 16 bits of the results in mm1. |
PMULLW xmm1, xmm2/m128 | 66 0F D5 /r | sse2 | Multiply the packed signed word integers in xmm1 and xmm2/m128, and store the low 16 bits of the results in xmm1. |
VPMULLW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG D5 /r | avx | Multiply the packed signed word integers in xmm2 and xmm3/m128, and store the low 16 bits of the results in xmm1. |
VPMULLW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG D5 /r | avx2 | Multiply the packed signed word integers in ymm2 and ymm3/m256, and store the low 16 bits of the results in ymm1. |
VPMULLW xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG D5 /r | avx512 | Multiply the packed signed word integers in xmm2 and xmm3/m128, and store the low 16 bits of the results in xmm1 under writemask k1. |
VPMULLW ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG D5 /r | avx512 | Multiply the packed signed word integers in ymm2 and ymm3/m256, and store the low 16 bits of the results in ymm1 under writemask k1. |
VPMULLW zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG D5 /r | avx512 | Multiply the packed signed word integers in zmm2 and zmm3/m512, and store the low 16 bits of the results in zmm1 under writemask k1. |
PMULUDQ mm1, mm2/m64 | 0F F4 /r | sse2 | Multiply unsigned doubleword integer in mm1 by unsigned doubleword integer in mm2/m64, and store the quadword result in mm1. |
PMULUDQ xmm1, xmm2/m128 | 66 0F F4 /r | sse2 | Multiply packed unsigned doubleword integers in xmm1 by packed unsigned doubleword integers in xmm2/m128, and store the quadword results in xmm1. |
VPMULUDQ xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG F4 /r | avx | Multiply packed unsigned doubleword integers in xmm2 by packed unsigned doubleword integers in xmm3/m128, and store the quadword results in xmm1. |
VPMULUDQ ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG F4 /r | avx2 | Multiply packed unsigned doubleword integers in ymm2 by packed unsigned doubleword integers in ymm3/m256, and store the quadword results in ymm1. |
VPMULUDQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F.W1 F4 /r | avx512 | Multiply packed unsigned doubleword integers in xmm2 by packed unsigned doubleword integers in xmm3/m128/m64bcst, and store the quadword results in xmm1 under writemask k1. |
VPMULUDQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F.W1 F4 /r | avx512 | Multiply packed unsigned doubleword integers in ymm2 by packed unsigned doubleword integers in ymm3/m256/m64bcst, and store the quadword results in ymm1 under writemask k1. |
VPMULUDQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F.W1 F4 /r | avx512 | Multiply packed unsigned doubleword integers in zmm2 by packed unsigned doubleword integers in zmm3/m512/m64bcst, and store the quadword results in zmm1 under writemask k1. |
POP r/m16 | 8F /0 | Pop top of stack into m16; increment stack pointer. | |
POP r/m32 | 8F /0 | Pop top of stack into m32; increment stack pointer. | |
POP r/m64 | 8F /0 | Pop top of stack into m64; increment stack pointer. Cannot encode 32-bit operand size. | |
POP r16 | 58+rw | Pop top of stack into r16; increment stack pointer. | 
POP r32 | 58+rd | Pop top of stack into r32; increment stack pointer. | 
POP r64 | 58+rd | Pop top of stack into r64; increment stack pointer. Cannot encode 32-bit operand size. | 
POP DS | 1F | Pop top of stack into DS; increment stack pointer. | |
POP ES | 07 | Pop top of stack into ES; increment stack pointer. | |
POP SS | 17 | Pop top of stack into SS; increment stack pointer. | |
POP FS | 0F A1 | Pop top of stack into FS; increment stack pointer by 16 bits. | |
POP FS | 0F A1 | Pop top of stack into FS; increment stack pointer by 32 bits. | |
POP FS | 0F A1 | Pop top of stack into FS; increment stack pointer by 64 bits. | |
POP GS | 0F A9 | Pop top of stack into GS; increment stack pointer by 16 bits. | |
POP GS | 0F A9 | Pop top of stack into GS; increment stack pointer by 32 bits. | |
POP GS | 0F A9 | Pop top of stack into GS; increment stack pointer by 64 bits. | |
POPA | 61 | Pop DI, SI, BP, BX, DX, CX, and AX. | |
POPAD | 61 | Pop EDI, ESI, EBP, EBX, EDX, ECX, and EAX. | |
POPCNT r16, r/m16 | F3 0F B8 /r | popcnt | Count the number of bits set to 1 in r/m16 and store the count in r16. |
POPCNT r32, r/m32 | F3 0F B8 /r | popcnt | Count the number of bits set to 1 in r/m32 and store the count in r32. |
POPCNT r64, r/m64 | F3 REX.W 0F B8 /r | popcnt | Count the number of bits set to 1 in r/m64 and store the count in r64. |
POPF | 9D | Pop top of stack into lower 16 bits of EFLAGS. | |
POPFD | 9D | Pop top of stack into EFLAGS. | |
POPFQ | 9D | Pop top of stack and zero-extend into RFLAGS. | |
POR mm, mm/m64 | 0F EB /r | mmx | Bitwise OR of mm/m64 and mm. |
POR xmm1, xmm2/m128 | 66 0F EB /r | sse2 | Bitwise OR of xmm2/m128 and xmm1. |
VPOR xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG EB /r | avx | Bitwise OR of xmm2 and xmm3/m128. |
VPOR ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG EB /r | avx2 | Bitwise OR of ymm2 and ymm3/m256. |
VPORD xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F.W0 EB /r | avx512 | Bitwise OR of packed doubleword integers in xmm2 and xmm3/m128/m32bcst using writemask k1. |
VPORD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F.W0 EB /r | avx512 | Bitwise OR of packed doubleword integers in ymm2 and ymm3/m256/m32bcst using writemask k1. |
VPORD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F.W0 EB /r | avx512 | Bitwise OR of packed doubleword integers in zmm2 and zmm3/m512/m32bcst using writemask k1. |
VPORQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F.W1 EB /r | avx512 | Bitwise OR of packed quadword integers in xmm2 and xmm3/m128/m64bcst using writemask k1. |
VPORQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F.W1 EB /r | avx512 | Bitwise OR of packed quadword integers in ymm2 and ymm3/m256/m64bcst using writemask k1. |
VPORQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F.W1 EB /r | avx512 | Bitwise OR of packed quadword integers in zmm2 and zmm3/m512/m64bcst using writemask k1. |
PREFETCHW m8 | 0F 0D /1 | prfchw | Move data from m8 closer to the processor in anticipation of a write. |
PREFETCHWT1 m8 | 0F 0D /2 | prefetchwt1 | Move data from m8 closer to the processor using T1 hint with intent to write. |
PREFETCHT0 m8 | 0F 18 /1 | Move data from m8 closer to the processor using T0 hint. | |
PREFETCHT1 m8 | 0F 18 /2 | Move data from m8 closer to the processor using T1 hint. | |
PREFETCHT2 m8 | 0F 18 /3 | Move data from m8 closer to the processor using T2 hint. | |
PREFETCHNTA m8 | 0F 18 /0 | Move data from m8 closer to the processor using NTA hint. | |
VPROLVD xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 15 /r | avx512 | Rotate doublewords in xmm2 left by count in the corresponding element of xmm3/m128/m32bcst. Result written to xmm1 under writemask k1. |
VPROLD xmm1 {k1}{z}, xmm2/m128/m32bcst, imm8 | EVEX.NDD.128.66.0F.W0 72 /1 ib | avx512 | Rotate doublewords in xmm2/m128/m32bcst left by imm8. Result written to xmm1 using writemask k1. |
VPROLVQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 15 /r | avx512 | Rotate quadwords in xmm2 left by count in the corresponding element of xmm3/m128/m64bcst. Result written to xmm1 under writemask k1. |
VPROLQ xmm1 {k1}{z}, xmm2/m128/m64bcst, imm8 | EVEX.NDD.128.66.0F.W1 72 /1 ib | avx512 | Rotate quadwords in xmm2/m128/m64bcst left by imm8. Result written to xmm1 using writemask k1. |
VPROLVD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 15 /r | avx512 | Rotate doublewords in ymm2 left by count in the corresponding element of ymm3/m256/m32bcst. Result written to ymm1 under writemask k1. |
VPROLD ymm1 {k1}{z}, ymm2/m256/m32bcst, imm8 | EVEX.NDD.256.66.0F.W0 72 /1 ib | avx512 | Rotate doublewords in ymm2/m256/m32bcst left by imm8. Result written to ymm1 using writemask k1. |
VPROLVQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 15 /r | avx512 | Rotate quadwords in ymm2 left by count in the corresponding element of ymm3/m256/m64bcst. Result written to ymm1 under writemask k1. |
VPROLQ ymm1 {k1}{z}, ymm2/m256/m64bcst, imm8 | EVEX.NDD.256.66.0F.W1 72 /1 ib | avx512 | Rotate quadwords in ymm2/m256/m64bcst left by imm8. Result written to ymm1 using writemask k1. |
VPROLVD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F38.W0 15 /r | avx512 | Rotate doublewords in zmm2 left by count in the corresponding element of zmm3/m512/m32bcst. Result written to zmm1 under writemask k1. |
VPROLD zmm1 {k1}{z}, zmm2/m512/m32bcst, imm8 | EVEX.NDD.512.66.0F.W0 72 /1 ib | avx512 | Rotate doublewords in zmm2/m512/m32bcst left by imm8. Result written to zmm1 using writemask k1. |
VPROLVQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F38.W1 15 /r | avx512 | Rotate quadwords in zmm2 left by count in the corresponding element of zmm3/m512/m64bcst. Result written to zmm1 under writemask k1. |
VPROLQ zmm1 {k1}{z}, zmm2/m512/m64bcst, imm8 | EVEX.NDD.512.66.0F.W1 72 /1 ib | avx512 | Rotate quadwords in zmm2/m512/m64bcst left by imm8. Result written to zmm1 using writemask k1. |
VPRORVD xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 14 /r | avx512 | Rotate doublewords in xmm2 right by count in the corresponding element of xmm3/m128/m32bcst, store result using writemask k1. |
VPRORD xmm1 {k1}{z}, xmm2/m128/m32bcst, imm8 | EVEX.NDD.128.66.0F.W0 72 /0 ib | avx512 | Rotate doublewords in xmm2/m128/m32bcst right by imm8, store result using writemask k1. |
VPRORVQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 14 /r | avx512 | Rotate quadwords in xmm2 right by count in the corresponding element of xmm3/m128/m64bcst, store result using writemask k1. |
VPRORQ xmm1 {k1}{z}, xmm2/m128/m64bcst, imm8 | EVEX.NDD.128.66.0F.W1 72 /0 ib | avx512 | Rotate quadwords in xmm2/m128/m64bcst right by imm8, store result using writemask k1. |
VPRORVD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 14 /r | avx512 | Rotate doublewords in ymm2 right by count in the corresponding element of ymm3/m256/m32bcst, store result using writemask k1. |
VPRORD ymm1 {k1}{z}, ymm2/m256/m32bcst, imm8 | EVEX.NDD.256.66.0F.W0 72 /0 ib | avx512 | Rotate doublewords in ymm2/m256/m32bcst right by imm8, store result using writemask k1. |
VPRORVQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 14 /r | avx512 | Rotate quadwords in ymm2 right by count in the corresponding element of ymm3/m256/m64bcst, store result using writemask k1. |
VPRORQ ymm1 {k1}{z}, ymm2/m256/m64bcst, imm8 | EVEX.NDD.256.66.0F.W1 72 /0 ib | avx512 | Rotate quadwords in ymm2/m256/m64bcst right by imm8, store result using writemask k1. |
VPRORVD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F38.W0 14 /r | avx512 | Rotate doublewords in zmm2 right by count in the corresponding element of zmm3/m512/m32bcst, store result using writemask k1. |
VPRORD zmm1 {k1}{z}, zmm2/m512/m32bcst, imm8 | EVEX.NDD.512.66.0F.W0 72 /0 ib | avx512 | Rotate doublewords in zmm2/m512/m32bcst right by imm8, store result using writemask k1. |
VPRORVQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F38.W1 14 /r | avx512 | Rotate quadwords in zmm2 right by count in the corresponding element of zmm3/m512/m64bcst, store result using writemask k1. |
VPRORQ zmm1 {k1}{z}, zmm2/m512/m64bcst, imm8 | EVEX.NDD.512.66.0F.W1 72 /0 ib | avx512 | Rotate quadwords in zmm2/m512/m64bcst right by imm8, store result using writemask k1. |
PSADBW mm1, mm2/m64 | 0F F6 /r | sse | Computes the absolute differences of the packed unsigned byte integers from mm2/m64 and mm1; differences are then summed to produce an unsigned word integer result. |
PSADBW xmm1, xmm2/m128 | 66 0F F6 /r | sse2 | Computes the absolute differences of the packed unsigned byte integers from xmm2/m128 and xmm1; the 8 low differences and 8 high differences are then summed separately to produce two unsigned word integer results. |
VPSADBW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG F6 /r | avx | Computes the absolute differences of the packed unsigned byte integers from xmm3/m128 and xmm2; the 8 low differences and 8 high differences are then summed separately to produce two unsigned word integer results. |
VPSADBW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG F6 /r | avx2 | Computes the absolute differences of the packed unsigned byte integers from ymm3/m256 and ymm2; then each consecutive 8 differences are summed separately to produce four unsigned word integer results. |
VPSADBW xmm1, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG F6 /r | avx512 | Computes the absolute differences of the packed unsigned byte integers from xmm3/m128 and xmm2; then each consecutive 8 differences are summed separately to produce two unsigned word integer results. |
VPSADBW ymm1, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG F6 /r | avx512 | Computes the absolute differences of the packed unsigned byte integers from ymm3/m256 and ymm2; then each consecutive 8 differences are summed separately to produce four unsigned word integer results. |
VPSADBW zmm1, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG F6 /r | avx512 | Computes the absolute differences of the packed unsigned byte integers from zmm3/m512 and zmm2; then each consecutive 8 differences are summed separately to produce eight unsigned word integer results. |
PSHUFB mm1, mm2/m64 | 0F 38 00 /r | ssse3 | Shuffle bytes in mm1 according to contents of mm2/m64. |
PSHUFB xmm1, xmm2/m128 | 66 0F 38 00 /r | ssse3 | Shuffle bytes in xmm1 according to contents of xmm2/m128. |
VPSHUFB xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 00 /r | avx | Shuffle bytes in xmm2 according to contents of xmm3/m128. |
VPSHUFB ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 00 /r | avx2 | Shuffle bytes in ymm2 according to contents of ymm3/m256. |
VPSHUFB xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F38.WIG 00 /r | avx512 | Shuffle bytes in xmm2 according to contents of xmm3/m128 under write mask k1. |
VPSHUFB ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F38.WIG 00 /r | avx512 | Shuffle bytes in ymm2 according to contents of ymm3/m256 under write mask k1. |
VPSHUFB zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F38.WIG 00 /r | avx512 | Shuffle bytes in zmm2 according to contents of zmm3/m512 under write mask k1. |
PSHUFD xmm1, xmm2/m128, imm8 | 66 0F 70 /r ib | sse2 | Shuffle the doublewords in xmm2/m128 based on the encoding in imm8 and store the result in xmm1. |
VPSHUFD xmm1, xmm2/m128, imm8 | VEX.128.66.0F.WIG 70 /r ib | avx | Shuffle the doublewords in xmm2/m128 based on the encoding in imm8 and store the result in xmm1. |
VPSHUFD ymm1, ymm2/m256, imm8 | VEX.256.66.0F.WIG 70 /r ib | avx2 | Shuffle the doublewords in ymm2/m256 based on the encoding in imm8 and store the result in ymm1. |
VPSHUFD xmm1 {k1}{z}, xmm2/m128/m32bcst, imm8 | EVEX.128.66.0F.W0 70 /r ib | avx512 | Shuffle the doublewords in xmm2/m128/m32bcst based on the encoding in imm8 and store the result in xmm1 using writemask k1. |
VPSHUFD ymm1 {k1}{z}, ymm2/m256/m32bcst, imm8 | EVEX.256.66.0F.W0 70 /r ib | avx512 | Shuffle the doublewords in ymm2/m256/m32bcst based on the encoding in imm8 and store the result in ymm1 using writemask k1. |
VPSHUFD zmm1 {k1}{z}, zmm2/m512/m32bcst, imm8 | EVEX.512.66.0F.W0 70 /r ib | avx512 | Shuffle the doublewords in zmm2/m512/m32bcst based on the encoding in imm8 and store the result in zmm1 using writemask k1. |
PSHUFHW xmm1, xmm2/m128, imm8 | F3 0F 70 /r ib | sse2 | Shuffle the high words in xmm2/m128 based on the encoding in imm8 and store the result in xmm1. |
VPSHUFHW xmm1, xmm2/m128, imm8 | VEX.128.F3.0F.WIG 70 /r ib | avx | Shuffle the high words in xmm2/m128 based on the encoding in imm8 and store the result in xmm1. |
VPSHUFHW ymm1, ymm2/m256, imm8 | VEX.256.F3.0F.WIG 70 /r ib | avx2 | Shuffle the high words in ymm2/m256 based on the encoding in imm8 and store the result in ymm1. |
VPSHUFHW xmm1 {k1}{z}, xmm2/m128, imm8 | EVEX.128.F3.0F.WIG 70 /r ib | avx512 | Shuffle the high words in xmm2/m128 based on the encoding in imm8 and store the result in xmm1 under write mask k1. |
VPSHUFHW ymm1 {k1}{z}, ymm2/m256, imm8 | EVEX.256.F3.0F.WIG 70 /r ib | avx512 | Shuffle the high words in ymm2/m256 based on the encoding in imm8 and store the result in ymm1 under write mask k1. |
VPSHUFHW zmm1 {k1}{z}, zmm2/m512, imm8 | EVEX.512.F3.0F.WIG 70 /r ib | avx512 | Shuffle the high words in zmm2/m512 based on the encoding in imm8 and store the result in zmm1 under write mask k1. |
PSHUFLW xmm1, xmm2/m128, imm8 | F2 0F 70 /r ib | sse2 | Shuffle the low words in xmm2/m128 based on the encoding in imm8 and store the result in xmm1. |
VPSHUFLW xmm1, xmm2/m128, imm8 | VEX.128.F2.0F.WIG 70 /r ib | avx | Shuffle the low words in xmm2/m128 based on the encoding in imm8 and store the result in xmm1. |
VPSHUFLW ymm1, ymm2/m256, imm8 | VEX.256.F2.0F.WIG 70 /r ib | avx2 | Shuffle the low words in ymm2/m256 based on the encoding in imm8 and store the result in ymm1. |
VPSHUFLW xmm1 {k1}{z}, xmm2/m128, imm8 | EVEX.128.F2.0F.WIG 70 /r ib | avx512 | Shuffle the low words in xmm2/m128 based on the encoding in imm8 and store the result in xmm1 under write mask k1. |
VPSHUFLW ymm1 {k1}{z}, ymm2/m256, imm8 | EVEX.256.F2.0F.WIG 70 /r ib | avx512 | Shuffle the low words in ymm2/m256 based on the encoding in imm8 and store the result in ymm1 under write mask k1. |
VPSHUFLW zmm1 {k1}{z}, zmm2/m512, imm8 | EVEX.512.F2.0F.WIG 70 /r ib | avx512 | Shuffle the low words in zmm2/m512 based on the encoding in imm8 and store the result in zmm1 under write mask k1. |
PSHUFW mm1, mm2/m64, imm8 | 0F 70 /r ib | sse | Shuffle the words in mm2/m64 based on the encoding in imm8 and store the result in mm1. |
PSIGNB mm1, mm2/m64 | 0F 38 08 /r | ssse3 | Negate/zero/preserve packed byte integers in mm1 depending on the corresponding sign in mm2/m64. |
PSIGNB xmm1, xmm2/m128 | 66 0F 38 08 /r | ssse3 | Negate/zero/preserve packed byte integers in xmm1 depending on the corresponding sign in xmm2/m128. |
PSIGNW mm1, mm2/m64 | 0F 38 09 /r | ssse3 | Negate/zero/preserve packed word integers in mm1 depending on the corresponding sign in mm2/m64. |
PSIGNW xmm1, xmm2/m128 | 66 0F 38 09 /r | ssse3 | Negate/zero/preserve packed word integers in xmm1 depending on the corresponding sign in xmm2/m128. |
PSIGND mm1, mm2/m64 | 0F 38 0A /r | ssse3 | Negate/zero/preserve packed doubleword integers in mm1 depending on the corresponding sign in mm2/m64. |
PSIGND xmm1, xmm2/m128 | 66 0F 38 0A /r | ssse3 | Negate/zero/preserve packed doubleword integers in xmm1 depending on the corresponding sign in xmm2/m128. |
VPSIGNB xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 08 /r | avx | Negate/zero/preserve packed byte integers in xmm2 depending on the corresponding sign in xmm3/m128. |
VPSIGNW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 09 /r | avx | Negate/zero/preserve packed word integers in xmm2 depending on the corresponding sign in xmm3/m128. |
VPSIGND xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.WIG 0A /r | avx | Negate/zero/preserve packed doubleword integers in xmm2 depending on the corresponding sign in xmm3/m128. |
VPSIGNB ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 08 /r | avx2 | Negate packed byte integers in ymm2 if the corresponding sign in ymm3/m256 is less than zero. |
VPSIGNW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 09 /r | avx2 | Negate packed 16-bit integers in ymm2 if the corresponding sign in ymm3/m256 is less than zero. |
VPSIGND ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.WIG 0A /r | avx2 | Negate packed doubleword integers in ymm2 if the corresponding sign in ymm3/m256 is less than zero. |
PSLLDQ xmm1, imm8 | 66 0F 73 /7 ib | sse2 | Shift xmm1 left by imm8 bytes while shifting in 0s. |
VPSLLDQ xmm1, xmm2, imm8 | VEX.NDD.128.66.0F.WIG 73 /7 ib | avx | Shift xmm2 left by imm8 bytes while shifting in 0s and store result in xmm1. |
VPSLLDQ ymm1, ymm2, imm8 | VEX.NDD.256.66.0F.WIG 73 /7 ib | avx2 | Shift ymm2 left by imm8 bytes while shifting in 0s and store result in ymm1. |
VPSLLDQ xmm1, xmm2/m128, imm8 | EVEX.NDD.128.66.0F.WIG 73 /7 ib | avx512 | Shift xmm2/m128 left by imm8 bytes while shifting in 0s and store result in xmm1. |
VPSLLDQ ymm1, ymm2/m256, imm8 | EVEX.NDD.256.66.0F.WIG 73 /7 ib | avx512 | Shift ymm2/m256 left by imm8 bytes while shifting in 0s and store result in ymm1. |
VPSLLDQ zmm1, zmm2/m512, imm8 | EVEX.NDD.512.66.0F.WIG 73 /7 ib | avx512 | Shift zmm2/m512 left by imm8 bytes while shifting in 0s and store result in zmm1. |
PSLLW mm, mm/m64 | 0F F1 /r | mmx | Shift words in mm left by mm/m64 while shifting in 0s. |
PSLLW xmm1, xmm2/m128 | 66 0F F1 /r | sse2 | Shift words in xmm1 left by xmm2/m128 while shifting in 0s. |
PSLLW mm1, imm8 | 0F 71 /6 ib | mmx | Shift words in mm left by imm8 while shifting in 0s. |
PSLLW xmm1, imm8 | 66 0F 71 /6 ib | sse2 | Shift words in xmm1 left by imm8 while shifting in 0s. |
PSLLD mm, mm/m64 | 0F F2 /r | mmx | Shift doublewords in mm left by mm/m64 while shifting in 0s. |
PSLLD xmm1, xmm2/m128 | 66 0F F2 /r | sse2 | Shift doublewords in xmm1 left by xmm2/m128 while shifting in 0s. |
PSLLD mm, imm8 | 0F 72 /6 ib | mmx | Shift doublewords in mm left by imm8 while shifting in 0s. |
PSLLD xmm1, imm8 | 66 0F 72 /6 ib | sse2 | Shift doublewords in xmm1 left by imm8 while shifting in 0s. |
PSLLQ mm, mm/m64 | 0F F3 /r | mmx | Shift quadword in mm left by mm/m64 while shifting in 0s. |
PSLLQ xmm1, xmm2/m128 | 66 0F F3 /r | sse2 | Shift quadwords in xmm1 left by xmm2/m128 while shifting in 0s. |
PSLLQ mm, imm8 | 0F 73 /6 ib | mmx | Shift quadword in mm left by imm8 while shifting in 0s. |
PSLLQ xmm1, imm8 | 66 0F 73 /6 ib | sse2 | Shift quadwords in xmm1 left by imm8 while shifting in 0s. |
VPSLLW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG F1 /r | avx | Shift words in xmm2 left by amount specified in xmm3/m128 while shifting in 0s. |
VPSLLW xmm1, xmm2, imm8 | VEX.NDD.128.66.0F.WIG 71 /6 ib | avx | Shift words in xmm2 left by imm8 while shifting in 0s. |
VPSLLD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG F2 /r | avx | Shift doublewords in xmm2 left by amount specified in xmm3/m128 while shifting in 0s. |
VPSLLD xmm1, xmm2, imm8 | VEX.NDD.128.66.0F.WIG 72 /6 ib | avx | Shift doublewords in xmm2 left by imm8 while shifting in 0s. |
VPSLLQ xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG F3 /r | avx | Shift quadwords in xmm2 left by amount specified in xmm3/m128 while shifting in 0s. |
VPSLLQ xmm1, xmm2, imm8 | VEX.NDD.128.66.0F.WIG 73 /6 ib | avx | Shift quadwords in xmm2 left by imm8 while shifting in 0s. |
VPSLLW ymm1, ymm2, xmm3/m128 | VEX.NDS.256.66.0F.WIG F1 /r | avx2 | Shift words in ymm2 left by amount specified in xmm3/m128 while shifting in 0s. |
VPSLLW ymm1, ymm2, imm8 | VEX.NDD.256.66.0F.WIG 71 /6 ib | avx2 | Shift words in ymm2 left by imm8 while shifting in 0s. |
PSRAW mm, mm/m64 | 0F E1 /r | mmx | Shift words in mm right by mm/m64 while shifting in sign bits. |
PSRAW xmm1, xmm2/m128 | 66 0F E1 /r | sse2 | Shift words in xmm1 right by xmm2/m128 while shifting in sign bits. |
PSRAW mm, imm8 | 0F 71 /4 ib | mmx | Shift words in mm right by imm8 while shifting in sign bits. |
PSRAW xmm1, imm8 | 66 0F 71 /4 ib | sse2 | Shift words in xmm1 right by imm8 while shifting in sign bits. |
PSRAD mm, mm/m64 | 0F E2 /r | mmx | Shift doublewords in mm right by mm/m64 while shifting in sign bits. |
PSRAD xmm1, xmm2/m128 | 66 0F E2 /r | sse2 | Shift doublewords in xmm1 right by xmm2/m128 while shifting in sign bits. |
PSRAD mm, imm8 | 0F 72 /4 ib | mmx | Shift doublewords in mm right by imm8 while shifting in sign bits. |
PSRAD xmm1, imm8 | 66 0F 72 /4 ib | sse2 | Shift doublewords in xmm1 right by imm8 while shifting in sign bits. |
VPSRAW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG E1 /r | avx | Shift words in xmm2 right by amount specified in xmm3/m128 while shifting in sign bits. |
VPSRAW xmm1, xmm2, imm8 | VEX.NDD.128.66.0F.WIG 71 /4 ib | avx | Shift words in xmm2 right by imm8 while shifting in sign bits. |
VPSRAD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG E2 /r | avx | Shift doublewords in xmm2 right by amount specified in xmm3/m128 while shifting in sign bits. |
VPSRAD xmm1, xmm2, imm8 | VEX.NDD.128.66.0F.WIG 72 /4 ib | avx | Shift doublewords in xmm2 right by imm8 while shifting in sign bits. |
VPSRAW ymm1, ymm2, xmm3/m128 | VEX.NDS.256.66.0F.WIG E1 /r | avx2 | Shift words in ymm2 right by amount specified in xmm3/m128 while shifting in sign bits. |
VPSRAW ymm1, ymm2, imm8 | VEX.NDD.256.66.0F.WIG 71 /4 ib | avx2 | Shift words in ymm2 right by imm8 while shifting in sign bits. |
VPSRAD ymm1, ymm2, xmm3/m128 | VEX.NDS.256.66.0F.WIG E2 /r | avx2 | Shift doublewords in ymm2 right by amount specified in xmm3/m128 while shifting in sign bits. |
VPSRAD ymm1, ymm2, imm8 | VEX.NDD.256.66.0F.WIG 72 /4 ib | avx2 | Shift doublewords in ymm2 right by imm8 while shifting in sign bits. |
VPSRAW xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG E1 /r | avx512 | Shift words in xmm2 right by amount specified in xmm3/m128 while shifting in sign bits using writemask k1. |
VPSRAW ymm1 {k1}{z}, ymm2, xmm3/m128 | EVEX.NDS.256.66.0F.WIG E1 /r | avx512 | Shift words in ymm2 right by amount specified in xmm3/m128 while shifting in sign bits using writemask k1. |
PSRLDQ xmm1, imm8 | 66 0F 73 /3 ib | sse2 | Shift xmm1 right by imm8 bytes while shifting in 0s. |
VPSRLDQ xmm1, xmm2, imm8 | VEX.NDD.128.66.0F.WIG 73 /3 ib | avx | Shift xmm2 right by imm8 bytes while shifting in 0s. |
VPSRLDQ ymm1, ymm2, imm8 | VEX.NDD.256.66.0F.WIG 73 /3 ib | avx2 | Shift ymm2 right by imm8 bytes while shifting in 0s. |
VPSRLDQ xmm1, xmm2/m128, imm8 | EVEX.NDD.128.66.0F.WIG 73 /3 ib | avx512 | Shift xmm2/m128 right by imm8 bytes while shifting in 0s and store result in xmm1. |
VPSRLDQ ymm1, ymm2/m256, imm8 | EVEX.NDD.256.66.0F.WIG 73 /3 ib | avx512 | Shift ymm2/m256 right by imm8 bytes while shifting in 0s and store result in ymm1. |
VPSRLDQ zmm1, zmm2/m512, imm8 | EVEX.NDD.512.66.0F.WIG 73 /3 ib | avx512 | Shift zmm2/m512 right by imm8 bytes while shifting in 0s and store result in zmm1. |
PSRLW mm, mm/m64 | 0F D1 /r | mmx | Shift words in mm right by amount specified in mm/m64 while shifting in 0s. |
PSRLW xmm1, xmm2/m128 | 66 0F D1 /r | sse2 | Shift words in xmm1 right by amount specified in xmm2/m128 while shifting in 0s. |
PSRLW mm, imm8 | 0F 71 /2 ib | mmx | Shift words in mm right by imm8 while shifting in 0s. |
PSRLW xmm1, imm8 | 66 0F 71 /2 ib | sse2 | Shift words in xmm1 right by imm8 while shifting in 0s. |
PSRLD mm, mm/m64 | 0F D2 /r | mmx | Shift doublewords in mm right by amount specified in mm/m64 while shifting in 0s. |
PSRLD xmm1, xmm2/m128 | 66 0F D2 /r | sse2 | Shift doublewords in xmm1 right by amount specified in xmm2/m128 while shifting in 0s. |
PSRLD mm, imm8 | 0F 72 /2 ib | mmx | Shift doublewords in mm right by imm8 while shifting in 0s. |
PSRLD xmm1, imm8 | 66 0F 72 /2 ib | sse2 | Shift doublewords in xmm1 right by imm8 while shifting in 0s. |
PSRLQ mm, mm/m64 | 0F D3 /r | mmx | Shift mm right by amount specified in mm/m64 while shifting in 0s. |
PSRLQ xmm1, xmm2/m128 | 66 0F D3 /r | sse2 | Shift quadwords in xmm1 right by amount specified in xmm2/m128 while shifting in 0s. |
PSRLQ mm, imm8 | 0F 73 /2 ib | mmx | Shift mm right by imm8 while shifting in 0s. |
PSRLQ xmm1, imm8 | 66 0F 73 /2 ib | sse2 | Shift quadwords in xmm1 right by imm8 while shifting in 0s. |
VPSRLW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG D1 /r | avx | Shift words in xmm2 right by amount specified in xmm3/m128 while shifting in 0s. |
VPSRLW xmm1, xmm2, imm8 | VEX.NDD.128.66.0F.WIG 71 /2 ib | avx | Shift words in xmm2 right by imm8 while shifting in 0s. |
VPSRLD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG D2 /r | avx | Shift doublewords in xmm2 right by amount specified in xmm3/m128 while shifting in 0s. |
VPSRLD xmm1, xmm2, imm8 | VEX.NDD.128.66.0F.WIG 72 /2 ib | avx | Shift doublewords in xmm2 right by imm8 while shifting in 0s. |
VPSRLQ xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG D3 /r | avx | Shift quadwords in xmm2 right by amount specified in xmm3/m128 while shifting in 0s. |
VPSRLQ xmm1, xmm2, imm8 | VEX.NDD.128.66.0F.WIG 73 /2 ib | avx | Shift quadwords in xmm2 right by imm8 while shifting in 0s. |
VPSRLW ymm1, ymm2, xmm3/m128 | VEX.NDS.256.66.0F.WIG D1 /r | avx2 | Shift words in ymm2 right by amount specified in xmm3/m128 while shifting in 0s. |
VPSRLW ymm1, ymm2, imm8 | VEX.NDD.256.66.0F.WIG 71 /2 ib | avx2 | Shift words in ymm2 right by imm8 while shifting in 0s. |
PSUBB mm, mm/m64 | 0F F8 /r | mmx | Subtract packed byte integers in mm/m64 from packed byte integers in mm. |
PSUBB xmm1, xmm2/m128 | 66 0F F8 /r | sse2 | Subtract packed byte integers in xmm2/m128 from packed byte integers in xmm1. |
PSUBW mm, mm/m64 | 0F F9 /r | mmx | Subtract packed word integers in mm/m64 from packed word integers in mm. |
PSUBW xmm1, xmm2/m128 | 66 0F F9 /r | sse2 | Subtract packed word integers in xmm2/m128 from packed word integers in xmm1. |
PSUBD mm, mm/m64 | 0F FA /r | mmx | Subtract packed doubleword integers in mm/m64 from packed doubleword integers in mm. |
PSUBD xmm1, xmm2/m128 | 66 0F FA /r | sse2 | Subtract packed doubleword integers in xmm2/m128 from packed doubleword integers in xmm1. |
VPSUBB xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG F8 /r | avx | Subtract packed byte integers in xmm3/m128 from xmm2. |
VPSUBW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG F9 /r | avx | Subtract packed word integers in xmm3/m128 from xmm2. |
VPSUBD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG FA /r | avx | Subtract packed doubleword integers in xmm3/m128 from xmm2. |
VPSUBB ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG F8 /r | avx2 | Subtract packed byte integers in ymm3/m256 from ymm2. |
VPSUBW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG F9 /r | avx2 | Subtract packed word integers in ymm3/m256 from ymm2. |
VPSUBD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG FA /r | avx2 | Subtract packed doubleword integers in ymm3/m256 from ymm2. |
VPSUBB xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG F8 /r | avx512 | Subtract packed byte integers in xmm3/m128 from xmm2 and store in xmm1 using writemask k1. |
VPSUBB ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG F8 /r | avx512 | Subtract packed byte integers in ymm3/m256 from ymm2 and store in ymm1 using writemask k1. |
VPSUBB zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG F8 /r | avx512 | Subtract packed byte integers in zmm3/m512 from zmm2 and store in zmm1 using writemask k1. |
VPSUBW xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG F9 /r | avx512 | Subtract packed word integers in xmm3/m128 from xmm2 and store in xmm1 using writemask k1. |
VPSUBW ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG F9 /r | avx512 | Subtract packed word integers in ymm3/m256 from ymm2 and store in ymm1 using writemask k1. |
VPSUBW zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG F9 /r | avx512 | Subtract packed word integers in zmm3/m512 from zmm2 and store in zmm1 using writemask k1. |
PSUBQ mm1, mm2/m64 | 0F FB /r | sse2 | Subtract quadword integer in mm2/m64 from mm1. |
PSUBQ xmm1, xmm2/m128 | 66 0F FB /r | sse2 | Subtract packed quadword integers in xmm2/m128 from xmm1. |
VPSUBQ xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG FB /r | avx | Subtract packed quadword integers in xmm3/m128 from xmm2. |
VPSUBQ ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG FB /r | avx2 | Subtract packed quadword integers in ymm3/m256 from ymm2. |
VPSUBQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F.W1 FB /r | avx512 | Subtract packed quadword integers in xmm3/m128/m64bcst from xmm2 and store in xmm1 using writemask k1. |
VPSUBQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F.W1 FB /r | avx512 | Subtract packed quadword integers in ymm3/m256/m64bcst from ymm2 and store in ymm1 using writemask k1. |
VPSUBQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F.W1 FB /r | avx512 | Subtract packed quadword integers in zmm3/m512/m64bcst from zmm2 and store in zmm1 using writemask k1. |
PSUBSB mm, mm/m64 | 0F E8 /r | mmx | Subtract signed packed bytes in mm/m64 from signed packed bytes in mm and saturate results. |
PSUBSB xmm1, xmm2/m128 | 66 0F E8 /r | sse2 | Subtract packed signed byte integers in xmm2/m128 from packed signed byte integers in xmm1 and saturate results. |
PSUBSW mm, mm/m64 | 0F E9 /r | mmx | Subtract signed packed words in mm/m64 from signed packed words in mm and saturate results. |
PSUBSW xmm1, xmm2/m128 | 66 0F E9 /r | sse2 | Subtract packed signed word integers in xmm2/m128 from packed signed word integers in xmm1 and saturate results. |
VPSUBSB xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG E8 /r | avx | Subtract packed signed byte integers in xmm3/m128 from packed signed byte integers in xmm2 and saturate results. |
VPSUBSW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG E9 /r | avx | Subtract packed signed word integers in xmm3/m128 from packed signed word integers in xmm2 and saturate results. |
VPSUBSB ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG E8 /r | avx2 | Subtract packed signed byte integers in ymm3/m256 from packed signed byte integers in ymm2 and saturate results. |
VPSUBSW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG E9 /r | avx2 | Subtract packed signed word integers in ymm3/m256 from packed signed word integers in ymm2 and saturate results. |
VPSUBSB xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG E8 /r | avx512 | Subtract packed signed byte integers in xmm3/m128 from packed signed byte integers in xmm2 and saturate results and store in xmm1 using writemask k1. |
VPSUBSB ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG E8 /r | avx512 | Subtract packed signed byte integers in ymm3/m256 from packed signed byte integers in ymm2 and saturate results and store in ymm1 using writemask k1. |
VPSUBSB zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG E8 /r | avx512 | Subtract packed signed byte integers in zmm3/m512 from packed signed byte integers in zmm2 and saturate results and store in zmm1 using writemask k1. |
VPSUBSW xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG E9 /r | avx512 | Subtract packed signed word integers in xmm3/m128 from packed signed word integers in xmm2 and saturate results and store in xmm1 using writemask k1. |
VPSUBSW ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG E9 /r | avx512 | Subtract packed signed word integers in ymm3/m256 from packed signed word integers in ymm2 and saturate results and store in ymm1 using writemask k1. |
PSUBUSB mm, mm/m64 | 0F D8 /r | mmx | Subtract unsigned packed bytes in mm/m64 from unsigned packed bytes in mm and saturate result. |
PSUBUSB xmm1, xmm2/m128 | 66 0F D8 /r | sse2 | Subtract packed unsigned byte integers in xmm2/m128 from packed unsigned byte integers in xmm1 and saturate result. |
PSUBUSW mm, mm/m64 | 0F D9 /r | mmx | Subtract unsigned packed words in mm/m64 from unsigned packed words in mm and saturate result. |
PSUBUSW xmm1, xmm2/m128 | 66 0F D9 /r | sse2 | Subtract packed unsigned word integers in xmm2/m128 from packed unsigned word integers in xmm1 and saturate result. |
VPSUBUSB xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG D8 /r | avx | Subtract packed unsigned byte integers in xmm3/m128 from packed unsigned byte integers in xmm2 and saturate result. |
VPSUBUSW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG D9 /r | avx | Subtract packed unsigned word integers in xmm3/m128 from packed unsigned word integers in xmm2 and saturate result. |
VPSUBUSB ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG D8 /r | avx2 | Subtract packed unsigned byte integers in ymm3/m256 from packed unsigned byte integers in ymm2 and saturate result. |
VPSUBUSW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG D9 /r | avx2 | Subtract packed unsigned word integers in ymm3/m256 from packed unsigned word integers in ymm2 and saturate result. |
VPSUBUSB xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG D8 /r | avx512 | Subtract packed unsigned byte integers in xmm3/m128 from packed unsigned byte integers in xmm2, saturate results and store in xmm1 using writemask k1. |
VPSUBUSB ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG D8 /r | avx512 | Subtract packed unsigned byte integers in ymm3/m256 from packed unsigned byte integers in ymm2, saturate results and store in ymm1 using writemask k1. |
VPSUBUSB zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F.WIG D8 /r | avx512 | Subtract packed unsigned byte integers in zmm3/m512 from packed unsigned byte integers in zmm2, saturate results and store in zmm1 using writemask k1. |
VPSUBUSW xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG D9 /r | avx512 | Subtract packed unsigned word integers in xmm3/m128 from packed unsigned word integers in xmm2 and saturate results and store in xmm1 using writemask k1. |
VPSUBUSW ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F.WIG D9 /r | avx512 | Subtract packed unsigned word integers in ymm3/m256 from packed unsigned word integers in ymm2, saturate results and store in ymm1 using writemask k1. |
PTEST xmm1, xmm2/m128 | 66 0F 38 17 /r | sse4.1 | Set ZF if xmm2/m128 AND xmm1 result is all 0s. Set CF if xmm2/m128 AND NOT xmm1 result is all 0s. |
VPTEST xmm1, xmm2/m128 | VEX.128.66.0F38.WIG 17 /r | avx | Set ZF and CF depending on bitwise AND and ANDN of sources. |
VPTEST ymm1, ymm2/m256 | VEX.256.66.0F38.WIG 17 /r | avx | Set ZF and CF depending on bitwise AND and ANDN of sources. |
PTWRITE r64/m64 | F3 REX.W 0F AE /4 | Reads the data from r64/m64 to encode into a PTW packet if dependencies are met (see details below). | 
PTWRITE r32/m32 | F3 0F AE /4 | Reads the data from r32/m32 to encode into a PTW packet if dependencies are met (see details below). | |
PUNPCKHBW mm, mm/m64 | 0F 68 /r | mmx | Unpack and interleave high-order bytes from mm and mm/m64 into mm. |
PUNPCKHBW xmm1, xmm2/m128 | 66 0F 68 /r | sse2 | Unpack and interleave high-order bytes from xmm1 and xmm2/m128 into xmm1. |
PUNPCKHWD mm, mm/m64 | 0F 69 /r | mmx | Unpack and interleave high-order words from mm and mm/m64 into mm. |
PUNPCKHWD xmm1, xmm2/m128 | 66 0F 69 /r | sse2 | Unpack and interleave high-order words from xmm1 and xmm2/m128 into xmm1. |
PUNPCKHDQ mm, mm/m64 | 0F 6A /r | mmx | Unpack and interleave high-order doublewords from mm and mm/m64 into mm. |
PUNPCKHDQ xmm1, xmm2/m128 | 66 0F 6A /r | sse2 | Unpack and interleave high-order doublewords from xmm1 and xmm2/m128 into xmm1. |
PUNPCKHQDQ xmm1, xmm2/m128 | 66 0F 6D /r | sse2 | Unpack and interleave high-order quadwords from xmm1 and xmm2/m128 into xmm1. |
VPUNPCKHBW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 68 /r | avx | Interleave high-order bytes from xmm2 and xmm3/m128 into xmm1. |
VPUNPCKHWD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 69 /r | avx | Interleave high-order words from xmm2 and xmm3/m128 into xmm1. |
VPUNPCKHDQ xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 6A /r | avx | Interleave high-order doublewords from xmm2 and xmm3/m128 into xmm1. |
VPUNPCKHQDQ xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 6D /r | avx | Interleave high-order quadword from xmm2 and xmm3/m128 into xmm1 register. |
VPUNPCKHBW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 68 /r | avx2 | Interleave high-order bytes from ymm2 and ymm3/m256 into ymm1 register. |
VPUNPCKHWD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 69 /r | avx2 | Interleave high-order words from ymm2 and ymm3/m256 into ymm1 register. |
VPUNPCKHDQ ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 6A /r | avx2 | Interleave high-order doublewords from ymm2 and ymm3/m256 into ymm1 register. |
VPUNPCKHQDQ ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 6D /r | avx2 | Interleave high-order quadword from ymm2 and ymm3/m256 into ymm1 register. |
VPUNPCKHBW xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG 68 /r | avx512 | Interleave high-order bytes from xmm2 and xmm3/m128 into xmm1 register using k1 write mask. |
VPUNPCKHWD xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG 69 /r | avx512 | Interleave high-order words from xmm2 and xmm3/m128 into xmm1 register using k1 write mask. |
VPUNPCKHDQ xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F.W0 6A /r | avx512 | Interleave high-order doublewords from xmm2 and xmm3/m128/m32bcst into xmm1 register using k1 write mask. |
PUNPCKLBW mm, mm/m32 | 0F 60 /r | mmx | Interleave low-order bytes from mm and mm/m32 into mm. |
PUNPCKLBW xmm1, xmm2/m128 | 66 0F 60 /r | sse2 | Interleave low-order bytes from xmm1 and xmm2/m128 into xmm1. |
PUNPCKLWD mm, mm/m32 | 0F 61 /r | mmx | Interleave low-order words from mm and mm/m32 into mm. |
PUNPCKLWD xmm1, xmm2/m128 | 66 0F 61 /r | sse2 | Interleave low-order words from xmm1 and xmm2/m128 into xmm1. |
PUNPCKLDQ mm, mm/m32 | 0F 62 /r | mmx | Interleave low-order doublewords from mm and mm/m32 into mm. |
PUNPCKLDQ xmm1, xmm2/m128 | 66 0F 62 /r | sse2 | Interleave low-order doublewords from xmm1 and xmm2/m128 into xmm1. |
PUNPCKLQDQ xmm1, xmm2/m128 | 66 0F 6C /r | sse2 | Interleave low-order quadword from xmm1 and xmm2/m128 into xmm1 register. |
VPUNPCKLBW xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 60 /r | avx | Interleave low-order bytes from xmm2 and xmm3/m128 into xmm1. |
VPUNPCKLWD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 61 /r | avx | Interleave low-order words from xmm2 and xmm3/m128 into xmm1. |
VPUNPCKLDQ xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 62 /r | avx | Interleave low-order doublewords from xmm2 and xmm3/m128 into xmm1. |
VPUNPCKLQDQ xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 6C /r | avx | Interleave low-order quadword from xmm2 and xmm3/m128 into xmm1 register. |
VPUNPCKLBW ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 60 /r | avx2 | Interleave low-order bytes from ymm2 and ymm3/m256 into ymm1 register. |
VPUNPCKLWD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 61 /r | avx2 | Interleave low-order words from ymm2 and ymm3/m256 into ymm1 register. |
VPUNPCKLDQ ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 62 /r | avx2 | Interleave low-order doublewords from ymm2 and ymm3/m256 into ymm1 register. |
VPUNPCKLQDQ ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 6C /r | avx2 | Interleave low-order quadword from ymm2 and ymm3/m256 into ymm1 register. |
VPUNPCKLBW xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG 60 /r | avx512 | Interleave low-order bytes from xmm2 and xmm3/m128 into xmm1 register subject to write mask k1. |
VPUNPCKLWD xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F.WIG 61 /r | avx512 | Interleave low-order words from xmm2 and xmm3/m128 into xmm1 register subject to write mask k1. |
VPUNPCKLDQ xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F.W0 62 /r | avx512 | Interleave low-order doublewords from xmm2 and xmm3/m128/m32bcst into xmm1 register subject to write mask k1. |
VPUNPCKLQDQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F.W1 6C /r | avx512 | Interleave low-order quadword from xmm2 and xmm3/m128/m64bcst into xmm1 register subject to write mask k1. |
PUSH r/m16 | FF /6 | Push r/m16. | |
PUSH r/m32 | FF /6 | Push r/m32. | |
PUSH r/m64 | FF /6 | Push r/m64. | |
PUSH r16 | 50+rw | Push r16. | |
PUSH r32 | 50+rd | Push r32. | |
PUSH r64 | 50+rd | Push r64. | |
PUSH imm8 | 6A ib | Push imm8. | |
PUSH imm16 | 68 iw | Push imm16. | |
PUSH imm32 | 68 id | Push imm32. | |
PUSH CS | 0E | Push CS. | |
PUSH SS | 16 | Push SS. | |
PUSH DS | 1E | Push DS. | |
PUSH ES | 06 | Push ES. | |
PUSH FS | 0F A0 | Push FS. | |
PUSH GS | 0F A8 | Push GS. | |
PUSHA | 60 | Push AX, CX, DX, BX, original SP, BP, SI, and DI. | |
PUSHAD | 60 | Push EAX, ECX, EDX, EBX, original ESP, EBP, ESI, and EDI. | |
PUSHF | 9C | Push lower 16 bits of EFLAGS. | |
PUSHFD | 9C | Push EFLAGS. | |
PUSHFQ | 9C | Push RFLAGS. | |
PXOR mm, mm/m64 | 0F EF /r | mmx | Bitwise XOR of mm/m64 and mm. |
PXOR xmm1, xmm2/m128 | 66 0F EF /r | sse2 | Bitwise XOR of xmm2/m128 and xmm1. |
VPXOR xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG EF /r | avx | Bitwise XOR of xmm3/m128 and xmm2. |
VPXOR ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG EF /r | avx2 | Bitwise XOR of ymm3/m256 and ymm2. |
VPXORD xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F.W0 EF /r | avx512 | Bitwise XOR of packed doubleword integers in xmm2 and xmm3/m128/m32bcst using writemask k1. |
VPXORD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F.W0 EF /r | avx512 | Bitwise XOR of packed doubleword integers in ymm2 and ymm3/m256/m32bcst using writemask k1. |
VPXORD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F.W0 EF /r | avx512 | Bitwise XOR of packed doubleword integers in zmm2 and zmm3/m512/m32bcst using writemask k1. |
VPXORQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F.W1 EF /r | avx512 | Bitwise XOR of packed quadword integers in xmm2 and xmm3/m128/m64bcst using writemask k1. |
VPXORQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F.W1 EF /r | avx512 | Bitwise XOR of packed quadword integers in ymm2 and ymm3/m256/m64bcst using writemask k1. |
VPXORQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F.W1 EF /r | avx512 | Bitwise XOR of packed quadword integers in zmm2 and zmm3/m512/m64bcst using writemask k1. |
RCL r/m8, 1 | D0 /2 | Rotate 9 bits (CF, r/m8) left once. | |
RCL r/m8, 1 | REX + D0 /2 | Rotate 9 bits (CF, r/m8) left once. | 
RCL r/m8, CL | D2 /2 | Rotate 9 bits (CF, r/m8) left CL times. | 
RCL r/m8, CL | REX + D2 /2 | Rotate 9 bits (CF, r/m8) left CL times. | 
RCL r/m8, imm8 | C0 /2 ib | Rotate 9 bits (CF, r/m8) left imm8 times. | 
RCL r/m8, imm8 | REX + C0 /2 ib | Rotate 9 bits (CF, r/m8) left imm8 times. | 
RCL r/m16, 1 | D1 /2 | Rotate 17 bits (CF, r/m16) left once. | |
RCL r/m16, CL | D3 /2 | Rotate 17 bits (CF, r/m16) left CL times. | |
RCL r/m16, imm8 | C1 /2 ib | Rotate 17 bits (CF, r/m16) left imm8 times. | |
RCL r/m32, 1 | D1 /2 | Rotate 33 bits (CF, r/m32) left once. | |
RCL r/m64, 1 | REX.W + D1 /2 | Rotate 65 bits (CF, r/m64) left once. Uses a 6 bit count. | |
RCL r/m32, CL | D3 /2 | Rotate 33 bits (CF, r/m32) left CL times. | |
RCL r/m64, CL | REX.W + D3 /2 | Rotate 65 bits (CF, r/m64) left CL times. Uses a 6 bit count. | |
RCL r/m32, imm8 | C1 /2 ib | Rotate 33 bits (CF, r/m32) left imm8 times. | |
RCL r/m64, imm8 | REX.W + C1 /2 ib | Rotate 65 bits (CF, r/m64) left imm8 times. Uses a 6 bit count. | |
RCR r/m8, 1 | D0 /3 | Rotate 9 bits (CF, r/m8) right once. | |
RCR r/m8, 1 | REX + D0 /3 | Rotate 9 bits (CF, r/m8) right once. | 
RCR r/m8, CL | D2 /3 | Rotate 9 bits (CF, r/m8) right CL times. | 
RCR r/m8, CL | REX + D2 /3 | Rotate 9 bits (CF, r/m8) right CL times. | 
RCR r/m8, imm8 | C0 /3 ib | Rotate 9 bits (CF, r/m8) right imm8 times. | 
RCR r/m8, imm8 | REX + C0 /3 ib | Rotate 9 bits (CF, r/m8) right imm8 times. | 
RCR r/m16, 1 | D1 /3 | Rotate 17 bits (CF, r/m16) right once. | |
RCR r/m16, CL | D3 /3 | Rotate 17 bits (CF, r/m16) right CL times. | |
RCR r/m16, imm8 | C1 /3 ib | Rotate 17 bits (CF, r/m16) right imm8 times. | |
RCR r/m32, 1 | D1 /3 | Rotate 33 bits (CF, r/m32) right once. | 
RCR r/m64, 1 | REX.W + D1 /3 | Rotate 65 bits (CF, r/m64) right once. Uses a 6 bit count. | |
RCR r/m32, CL | D3 /3 | Rotate 33 bits (CF, r/m32) right CL times. | |
RCR r/m64, CL | REX.W + D3 /3 | Rotate 65 bits (CF, r/m64) right CL times. Uses a 6 bit count. | |
RCR r/m32, imm8 | C1 /3 ib | Rotate 33 bits (CF, r/m32) right imm8 times. | |
RCR r/m64, imm8 | REX.W + C1 /3 ib | Rotate 65 bits (CF, r/m64) right imm8 times. Uses a 6 bit count. | |
ROL r/m8, 1 | D0 /0 | Rotate 8 bits r/m8 left once. | |
ROL r/m8, 1 | REX + D0 /0 | Rotate 8 bits r/m8 left once. | 
ROL r/m8, CL | D2 /0 | Rotate 8 bits r/m8 left CL times. | |
ROL r/m8, CL | REX + D2 /0 | Rotate 8 bits r/m8 left CL times. | |
ROL r/m8, imm8 | C0 /0 ib | Rotate 8 bits r/m8 left imm8 times. | |
ROL r/m8, imm8 | REX + C0 /0 ib | Rotate 8 bits r/m8 left imm8 times. | |
ROL r/m16, 1 | D1 /0 | Rotate 16 bits r/m16 left once. | |
ROL r/m16, CL | D3 /0 | Rotate 16 bits r/m16 left CL times. | |
ROL r/m16, imm8 | C1 /0 ib | Rotate 16 bits r/m16 left imm8 times. | |
ROL r/m32, 1 | D1 /0 | Rotate 32 bits r/m32 left once. | |
ROL r/m64, 1 | REX.W + D1 /0 | Rotate 64 bits r/m64 left once. Uses a 6 bit count. | |
ROL r/m32, CL | D3 /0 | Rotate 32 bits r/m32 left CL times. | |
ROL r/m64, CL | REX.W + D3 /0 | Rotate 64 bits r/m64 left CL times. Uses a 6 bit count. | |
ROL r/m32, imm8 | C1 /0 ib | Rotate 32 bits r/m32 left imm8 times. | |
ROL r/m64, imm8 | REX.W + C1 /0 ib | Rotate 64 bits r/m64 left imm8 times. Uses a 6 bit count. | |
ROR r/m8, 1 | D0 /1 | Rotate 8 bits r/m8 right once. | |
ROR r/m8, 1 | REX + D0 /1 | Rotate 8 bits r/m8 right once. | |
ROR r/m8, CL | D2 /1 | Rotate 8 bits r/m8 right CL times. | |
ROR r/m8, CL | REX + D2 /1 | Rotate 8 bits r/m8 right CL times. | |
ROR r/m8, imm8 | C0 /1 ib | Rotate 8 bits r/m8 right imm8 times. | |
ROR r/m8, imm8 | REX + C0 /1 ib | Rotate 8 bits r/m8 right imm8 times. | |
ROR r/m16, 1 | D1 /1 | Rotate 16 bits r/m16 right once. | |
ROR r/m16, CL | D3 /1 | Rotate 16 bits r/m16 right CL times. | |
ROR r/m16, imm8 | C1 /1 ib | Rotate 16 bits r/m16 right imm8 times. | |
ROR r/m32, 1 | D1 /1 | Rotate 32 bits r/m32 right once. | |
ROR r/m64, 1 | REX.W + D1 /1 | Rotate 64 bits r/m64 right once. Uses a 6-bit count. | |
ROR r/m32, CL | D3 /1 | Rotate 32 bits r/m32 right CL times. | |
ROR r/m64, CL | REX.W + D3 /1 | Rotate 64 bits r/m64 right CL times. Uses a 6-bit count. | |
ROR r/m32, imm8 | C1 /1 ib | Rotate 32 bits r/m32 right imm8 times. | |
ROR r/m64, imm8 | REX.W + C1 /1 ib | Rotate 64 bits r/m64 right imm8 times. Uses a 6-bit count. | |
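Plain rotates are lossless, and rotating back by the same count in the opposite direction restores the value, which makes them handy for cheap bit mixing. A small NASM-syntax sketch:

```asm
ror al, 4      ; swap the two nibbles of AL (rotate 8 bits by 4)
rol eax, 13    ; mix bits...
ror eax, 13    ; ...and rotating back by the same count restores EAX
```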
RCPPS xmm1, xmm2/m128 | 0F 53 /r | sse | Computes the approximate reciprocals of the packed single-precision floating-point values in xmm2/m128 and stores the results in xmm1. |
VRCPPS xmm1, xmm2/m128 | VEX.128.0F.WIG 53 /r | avx | Computes the approximate reciprocals of packed single-precision values in xmm2/mem and stores the results in xmm1. |
VRCPPS ymm1, ymm2/m256 | VEX.256.0F.WIG 53 /r | avx | Computes the approximate reciprocals of packed single-precision values in ymm2/mem and stores the results in ymm1. |
RCPSS xmm1, xmm2/m32 | F3 0F 53 /r | sse | Computes the approximate reciprocal of the scalar single-precision floating-point value in xmm2/m32 and stores the result in xmm1. |
VRCPSS xmm1, xmm2, xmm3/m32 | VEX.NDS.LIG.F3.0F.WIG 53 /r | avx | Computes the approximate reciprocal of the scalar single-precision floating-point value in xmm3/m32 and stores the result in xmm1. Also, upper single precision floating-point values (bits[127:32]) from xmm2 are copied to xmm1[127:32]. |
RDFSBASE r32 | F3 0F AE /0 | fsgsbase | Load the 32-bit destination register with the FS base address. |
RDFSBASE r64 | F3 REX.W 0F AE /0 | fsgsbase | Load the 64-bit destination register with the FS base address. |
RDGSBASE r32 | F3 0F AE /1 | fsgsbase | Load the 32-bit destination register with the GS base address. |
RDGSBASE r64 | F3 REX.W 0F AE /1 | fsgsbase | Load the 64-bit destination register with the GS base address. |
RDMSR | 0F 32 | Read MSR specified by ECX into EDX:EAX. | |
RDPID r32 | F3 0F C7 /7 | rdpid | Read IA32_TSC_AUX into r32. |
RDPID r64 | F3 0F C7 /7 | rdpid | Read IA32_TSC_AUX into r64. |
RDPKRU | 0F 01 EE | ospke | Reads PKRU into EAX. |
RDPMC | 0F 33 | Read performance-monitoring counter specified by ECX into EDX:EAX. | |
RDRAND r16 | 0F C7 /6 | rdrand | Read a 16-bit random number and store in the destination register. |
RDRAND r32 | 0F C7 /6 | rdrand | Read a 32-bit random number and store in the destination register. |
RDRAND r64 | REX.W + 0F C7 /6 | rdrand | Read a 64-bit random number and store in the destination register. |
RDSEED r16 | 0F C7 /7 | rdseed | Read a 16-bit NIST SP800-90B & C compliant random value and store in the destination register. |
RDSEED r32 | 0F C7 /7 | rdseed | Read a 32-bit NIST SP800-90B & C compliant random value and store in the destination register. |
RDSEED r64 | REX.W + 0F C7 /7 | rdseed | Read a 64-bit NIST SP800-90B & C compliant random value and store in the destination register. |
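Both RDRAND and RDSEED report success in CF; CF clear means no entropy was available yet and the instruction should be retried. A minimal NASM-syntax sketch (a production loop should bound the retry count):

```asm
retry:
    rdrand rax    ; CF=1: RAX holds a valid random value
    jnc retry     ; CF=0: no data available yet, try again
```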
RDTSC | 0F 31 | Read time-stamp counter into EDX:EAX. | |
RDTSCP | 0F 01 F9 | Read 64-bit time-stamp counter and IA32_TSC_AUX value into EDX:EAX and ECX. | |
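RDTSC returns the 64-bit counter split across EDX:EAX, so 64-bit code usually stitches the halves back together; an LFENCE is commonly placed before the read to limit reordering. A NASM-syntax sketch:

```asm
lfence        ; keep earlier instructions from drifting past the read
rdtsc         ; EDX = high 32 bits, EAX = low 32 bits
shl rdx, 32
or  rax, rdx  ; RAX = full 64-bit timestamp
```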
REP INS m8, DX | F3 6C | Input (E)CX bytes from port DX into ES:[(E)DI]. | |
REP INS m8, DX | F3 6C | Input RCX bytes from port DX into [RDI]. | |
REP INS m16, DX | F3 6D | Input (E)CX words from port DX into ES:[(E)DI]. | |
REP INS m32, DX | F3 6D | Input (E)CX doublewords from port DX into ES:[(E)DI]. | |
REP INS r/m32, DX | F3 6D | Input RCX default size from port DX into [RDI]. | |
REP MOVS m8, m8 | F3 A4 | Move (E)CX bytes from DS:[(E)SI] to ES:[(E)DI]. | |
REP MOVS m8, m8 | F3 REX.W A4 | Move RCX bytes from [RSI] to [RDI]. | |
REP MOVS m16, m16 | F3 A5 | Move (E)CX words from DS:[(E)SI] to ES:[(E)DI]. | |
REP MOVS m32, m32 | F3 A5 | Move (E)CX doublewords from DS:[(E)SI] to ES:[(E)DI]. | |
REP MOVS m64, m64 | F3 REX.W A5 | Move RCX quadwords from [RSI] to [RDI]. | |
REP OUTS DX, r/m8 | F3 6E | Output (E)CX bytes from DS:[(E)SI] to port DX. | |
REP OUTS DX, r/m8* | F3 REX.W 6E | Output RCX bytes from [RSI] to port DX. | |
REP OUTS DX, r/m16 | F3 6F | Output (E)CX words from DS:[(E)SI] to port DX. | |
REP OUTS DX, r/m32 | F3 6F | Output (E)CX doublewords from DS:[(E)SI] to port DX. | |
REP OUTS DX, r/m32 | F3 REX.W 6F | Output RCX default size from [RSI] to port DX. | |
REP LODS AL | F3 AC | Load (E)CX bytes from DS:[(E)SI] to AL. | |
REP LODS AL | F3 REX.W AC | Load RCX bytes from [RSI] to AL. | |
REP LODS AX | F3 AD | Load (E)CX words from DS:[(E)SI] to AX. | |
REP LODS EAX | F3 AD | Load (E)CX doublewords from DS:[(E)SI] to EAX. | |
REP LODS RAX | F3 REX.W AD | Load RCX quadwords from [RSI] to RAX. | |
REP STOS m8 | F3 AA | Fill (E)CX bytes at ES:[(E)DI] with AL. | |
REP STOS m8 | F3 REX.W AA | Fill RCX bytes at [RDI] with AL. | |
REP STOS m16 | F3 AB | Fill (E)CX words at ES:[(E)DI] with AX. | |
REP STOS m32 | F3 AB | Fill (E)CX doublewords at ES:[(E)DI] with EAX. | |
REP STOS m64 | F3 REX.W AB | Fill RCX quadwords at [RDI] with RAX. | |
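The REP-prefixed string instructions give compact memory fills and copies: RCX (or (E)CX) holds the count and DF sets the direction. A NASM-syntax sketch of byte-wise copy and fill (the RSI/RDI/RCX roles shown are the instruction's implicit operands):

```asm
; copy: RSI = source, RDI = destination, RCX = byte count
cld             ; DF=0 so RSI/RDI increment
rep movsb       ; copy RCX bytes from [RSI] to [RDI]

; fill: RDI = destination, RCX = byte count (reload RCX after the copy above)
xor eax, eax
rep stosb       ; store AL into RCX bytes at [RDI]
```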
REPE CMPS m8, m8 | F3 A6 | Find non-matching bytes in ES:[(E)DI] and DS:[(E)SI]. | |
REPE CMPS m8, m8 | F3 REX.W A6 | Find non-matching bytes in [RDI] and [RSI]. | |
REPE CMPS m16, m16 | F3 A7 | Find non-matching words in ES:[(E)DI] and DS:[(E)SI]. | |
REPE CMPS m32, m32 | F3 A7 | Find non-matching doublewords in ES:[(E)DI] and DS:[(E)SI]. | |
REPE CMPS m64, m64 | F3 REX.W A7 | Find non-matching quadwords in [RDI] and [RSI]. | |
REPE SCAS m8 | F3 AE | Find non-AL byte starting at ES:[(E)DI]. | |
REPE SCAS m8 | F3 REX.W AE | Find non-AL byte starting at [RDI]. | |
REPE SCAS m16 | F3 AF | Find non-AX word starting at ES:[(E)DI]. | |
REPE SCAS m32 | F3 AF | Find non-EAX doubleword starting at ES:[(E)DI]. | |
REPE SCAS m64 | F3 REX.W AF | Find non-RAX quadword starting at [RDI]. | |
REPNE CMPS m8, m8 | F2 A6 | Find matching bytes in ES:[(E)DI] and DS:[(E)SI]. | |
REPNE CMPS m8, m8 | F2 REX.W A6 | Find matching bytes in [RDI] and [RSI]. | |
REPNE CMPS m16, m16 | F2 A7 | Find matching words in ES:[(E)DI] and DS:[(E)SI]. | |
REPNE CMPS m32, m32 | F2 A7 | Find matching doublewords in ES:[(E)DI] and DS:[(E)SI]. | |
REPNE CMPS m64, m64 | F2 REX.W A7 | Find matching quadwords in [RDI] and [RSI]. | |
REPNE SCAS m8 | F2 AE | Find AL, starting at ES:[(E)DI]. | |
REPNE SCAS m8 | F2 REX.W AE | Find AL, starting at [RDI]. | |
REPNE SCAS m16 | F2 AF | Find AX, starting at ES:[(E)DI]. | |
REPNE SCAS m32 | F2 AF | Find EAX, starting at ES:[(E)DI]. | |
REPNE SCAS m64 | F2 REX.W AF | Find RAX, starting at [RDI]. | |
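REPNE SCASB scans until the accumulator matches a byte, which yields the classic strlen idiom. A NASM-syntax sketch (assuming RDI points at a NUL-terminated string):

```asm
xor al, al      ; scan for the NUL byte
mov rcx, -1     ; effectively unbounded count
repne scasb     ; decrement RCX per byte until [RDI] == AL
not rcx
dec rcx         ; RCX = string length, excluding the NUL
```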
RET | C3 | Near return to calling procedure. | |
RET | CB | Far return to calling procedure. | |
RET imm16 | C2 iw | Near return to calling procedure and pop imm16 bytes from stack. | |
RET imm16 | CA iw | Far return to calling procedure and pop imm16 bytes from stack. | |
RORX r32, r/m32, imm8 | VEX.LZ.F2.0F3A.W0 F0 /r ib | bmi2 | Rotate 32-bit r/m32 right imm8 times without affecting arithmetic flags. |
RORX r64, r/m64, imm8 | VEX.LZ.F2.0F3A.W1 F0 /r ib | bmi2 | Rotate 64-bit r/m64 right imm8 times without affecting arithmetic flags. |
ROUNDPD xmm1, xmm2/m128, imm8 | 66 0F 3A 09 /r ib | sse4.1 | Round packed double-precision floating-point values in xmm2/m128 and place the result in xmm1. The rounding mode is determined by imm8. |
VROUNDPD xmm1, xmm2/m128, imm8 | VEX.128.66.0F3A.WIG 09 /r ib | avx | Round packed double-precision floating-point values in xmm2/m128 and place the result in xmm1. The rounding mode is determined by imm8. |
VROUNDPD ymm1, ymm2/m256, imm8 | VEX.256.66.0F3A.WIG 09 /r ib | avx | Round packed double-precision floating-point values in ymm2/m256 and place the result in ymm1. The rounding mode is determined by imm8. |
ROUNDPS xmm1, xmm2/m128, imm8 | 66 0F 3A 08 /r ib | sse4.1 | Round packed single-precision floating-point values in xmm2/m128 and place the result in xmm1. The rounding mode is determined by imm8. |
VROUNDPS xmm1, xmm2/m128, imm8 | VEX.128.66.0F3A.WIG 08 /r ib | avx | Round packed single-precision floating-point values in xmm2/m128 and place the result in xmm1. The rounding mode is determined by imm8. |
VROUNDPS ymm1, ymm2/m256, imm8 | VEX.256.66.0F3A.WIG 08 /r ib | avx | Round packed single-precision floating-point values in ymm2/m256 and place the result in ymm1. The rounding mode is determined by imm8. |
ROUNDSD xmm1, xmm2/m64, imm8 | 66 0F 3A 0B /r ib | sse4.1 | Round the low packed double-precision floating-point value in xmm2/m64 and place the result in xmm1. The rounding mode is determined by imm8. |
VROUNDSD xmm1, xmm2, xmm3/m64, imm8 | VEX.NDS.LIG.66.0F3A.WIG 0B /r ib | avx | Round the low packed double-precision floating-point value in xmm3/m64 and place the result in xmm1. The rounding mode is determined by imm8. Upper packed double-precision floating-point value (bits[127:64]) from xmm2 is copied to xmm1[127:64]. |
ROUNDSS xmm1, xmm2/m32, imm8 | 66 0F 3A 0A /r ib | sse4.1 | Round the low packed single-precision floating-point value in xmm2/m32 and place the result in xmm1. The rounding mode is determined by imm8. |
VROUNDSS xmm1, xmm2, xmm3/m32, imm8 | VEX.NDS.LIG.66.0F3A.WIG 0A /r ib | avx | Round the low packed single-precision floating-point value in xmm3/m32 and place the result in xmm1. The rounding mode is determined by imm8. Also, upper packed single-precision floating-point values (bits[127:32]) from xmm2 are copied to xmm1[127:32]. |
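For the ROUND* family, imm8 bit 2 selects between an explicit mode (bits 1:0 — 00 nearest, 01 down, 10 up, 11 truncate) and the current MXCSR.RC, and bit 3 suppresses the precision exception. A NASM-syntax sketch:

```asm
roundsd xmm0, xmm1, 0b0001   ; floor: round the low double toward -infinity
roundsd xmm0, xmm1, 0b0011   ; trunc: round the low double toward zero
```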
RSM | 0F AA | Resume operation of interrupted program. | |
RSQRTPS xmm1, xmm2/m128 | 0F 52 /r | sse | Computes the approximate reciprocals of the square roots of the packed single-precision floating-point values in xmm2/m128 and stores the results in xmm1. |
VRSQRTPS xmm1, xmm2/m128 | VEX.128.0F.WIG 52 /r | avx | Computes the approximate reciprocals of the square roots of packed single-precision values in xmm2/mem and stores the results in xmm1. |
VRSQRTPS ymm1, ymm2/m256 | VEX.256.0F.WIG 52 /r | avx | Computes the approximate reciprocals of the square roots of packed single-precision values in ymm2/mem and stores the results in ymm1. |
RSQRTSS xmm1, xmm2/m32 | F3 0F 52 /r | sse | Computes the approximate reciprocal of the square root of the low single-precision floating-point value in xmm2/m32 and stores the results in xmm1. |
VRSQRTSS xmm1, xmm2, xmm3/m32 | VEX.NDS.LIG.F3.0F.WIG 52 /r | avx | Computes the approximate reciprocal of the square root of the low single precision floating-point value in xmm3/m32 and stores the results in xmm1. Also, upper single precision floating-point values (bits[127:32]) from xmm2 are copied to xmm1[127:32]. |
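RSQRTSS is only accurate to roughly 12 bits, so it is normally followed by one Newton–Raphson step, y1 = y0 * (1.5 − 0.5 * x * y0²), to approach full single-precision accuracy. A NASM-syntax sketch (the labels are illustrative names, not part of any API):

```asm
section .rodata
half:   dd 0.5
threeh: dd 1.5

section .text
; input: x in xmm0; output: refined ~1/sqrt(x) in xmm0
refine_rsqrt:
    rsqrtss xmm1, xmm0          ; y0, ~12-bit estimate
    movss   xmm2, xmm1
    mulss   xmm2, xmm1          ; y0*y0
    mulss   xmm2, xmm0          ; x*y0*y0
    mulss   xmm2, [rel half]    ; 0.5*x*y0*y0
    movss   xmm0, [rel threeh]
    subss   xmm0, xmm2          ; 1.5 - 0.5*x*y0*y0
    mulss   xmm0, xmm1          ; y1 = y0 * (1.5 - 0.5*x*y0*y0)
    ret
```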
SAHF | 9E | Loads SF, ZF, AF, PF, and CF from AH into the EFLAGS register. | |
SAL r/m8, 1 | D0 /4 | Multiply r/m8 by 2, once. | |
SAL r/m8, 1 | REX + D0 /4 | Multiply r/m8 by 2, once. | |
SAL r/m8, CL | D2 /4 | Multiply r/m8 by 2, CL times. | |
SAL r/m8, CL | REX + D2 /4 | Multiply r/m8 by 2, CL times. | |
SAL r/m8, imm8 | C0 /4 ib | Multiply r/m8 by 2, imm8 times. | |
SAL r/m8, imm8 | REX + C0 /4 ib | Multiply r/m8 by 2, imm8 times. | |
SAL r/m16, 1 | D1 /4 | Multiply r/m16 by 2, once. | |
SAL r/m16, CL | D3 /4 | Multiply r/m16 by 2, CL times. | |
SAL r/m16, imm8 | C1 /4 ib | Multiply r/m16 by 2, imm8 times. | |
SAL r/m32, 1 | D1 /4 | Multiply r/m32 by 2, once. | |
SAL r/m64, 1 | REX.W + D1 /4 | Multiply r/m64 by 2, once. | |
SAL r/m32, CL | D3 /4 | Multiply r/m32 by 2, CL times. | |
SAL r/m64, CL | REX.W + D3 /4 | Multiply r/m64 by 2, CL times. | |
SAL r/m32, imm8 | C1 /4 ib | Multiply r/m32 by 2, imm8 times. | |
SAL r/m64, imm8 | REX.W + C1 /4 ib | Multiply r/m64 by 2, imm8 times. | |
SAR r/m8, 1 | D0 /7 | Signed divide* r/m8 by 2, once. | |
SAR r/m8, 1 | REX + D0 /7 | Signed divide* r/m8 by 2, once. | |
SAR r/m8, CL | D2 /7 | Signed divide* r/m8 by 2, CL times. | |
SAR r/m8, CL | REX + D2 /7 | Signed divide* r/m8 by 2, CL times. | |
SAR r/m8, imm8 | C0 /7 ib | Signed divide* r/m8 by 2, imm8 times. | |
SAR r/m8, imm8 | REX + C0 /7 ib | Signed divide* r/m8 by 2, imm8 times. | |
SAR r/m16, 1 | D1 /7 | Signed divide* r/m16 by 2, once. | |
SAR r/m16, CL | D3 /7 | Signed divide* r/m16 by 2, CL times. | |
SAR r/m16, imm8 | C1 /7 ib | Signed divide* r/m16 by 2, imm8 times. | |
SAR r/m32, 1 | D1 /7 | Signed divide* r/m32 by 2, once. | |
SAR r/m64, 1 | REX.W + D1 /7 | Signed divide* r/m64 by 2, once. | |
SAR r/m32, CL | D3 /7 | Signed divide* r/m32 by 2, CL times. | |
SAR r/m64, CL | REX.W + D3 /7 | Signed divide* r/m64 by 2, CL times. | |
SAR r/m32, imm8 | C1 /7 ib | Signed divide* r/m32 by 2, imm8 times. | |
SAR r/m64, imm8 | REX.W + C1 /7 ib | Signed divide* r/m64 by 2, imm8 times. | |
SHL r/m8, 1 | D0 /4 | Multiply r/m8 by 2, once. | |
SHL r/m8, 1 | REX + D0 /4 | Multiply r/m8 by 2, once. | |
SHL r/m8, CL | D2 /4 | Multiply r/m8 by 2, CL times. | |
SHL r/m8, CL | REX + D2 /4 | Multiply r/m8 by 2, CL times. | |
SHL r/m8, imm8 | C0 /4 ib | Multiply r/m8 by 2, imm8 times. | |
SHL r/m8, imm8 | REX + C0 /4 ib | Multiply r/m8 by 2, imm8 times. | |
SHL r/m16, 1 | D1 /4 | Multiply r/m16 by 2, once. | |
SHL r/m16, CL | D3 /4 | Multiply r/m16 by 2, CL times. | |
SHL r/m16, imm8 | C1 /4 ib | Multiply r/m16 by 2, imm8 times. | |
SHL r/m32, 1 | D1 /4 | Multiply r/m32 by 2, once. | |
SHL r/m64, 1 | REX.W + D1 /4 | Multiply r/m64 by 2, once. | |
SHL r/m32, CL | D3 /4 | Multiply r/m32 by 2, CL times. | |
SHL r/m64, CL | REX.W + D3 /4 | Multiply r/m64 by 2, CL times. | |
SHL r/m32, imm8 | C1 /4 ib | Multiply r/m32 by 2, imm8 times. | |
SHL r/m64, imm8 | REX.W + C1 /4 ib | Multiply r/m64 by 2, imm8 times. | |
SHR r/m8, 1 | D0 /5 | Unsigned divide r/m8 by 2, once. | |
SHR r/m8, 1 | REX + D0 /5 | Unsigned divide r/m8 by 2, once. | |
SHR r/m8, CL | D2 /5 | Unsigned divide r/m8 by 2, CL times. | |
SHR r/m8, CL | REX + D2 /5 | Unsigned divide r/m8 by 2, CL times. | |
SHR r/m8, imm8 | C0 /5 ib | Unsigned divide r/m8 by 2, imm8 times. | |
SHR r/m8, imm8 | REX + C0 /5 ib | Unsigned divide r/m8 by 2, imm8 times. | |
SHR r/m16, 1 | D1 /5 | Unsigned divide r/m16 by 2, once. | |
SHR r/m16, CL | D3 /5 | Unsigned divide r/m16 by 2, CL times. | |
SHR r/m16, imm8 | C1 /5 ib | Unsigned divide r/m16 by 2, imm8 times. | |
SHR r/m32, 1 | D1 /5 | Unsigned divide r/m32 by 2, once. | |
SHR r/m64, 1 | REX.W + D1 /5 | Unsigned divide r/m64 by 2, once. | |
SHR r/m32, CL | D3 /5 | Unsigned divide r/m32 by 2, CL times. | |
SHR r/m64, CL | REX.W + D3 /5 | Unsigned divide r/m64 by 2, CL times. | |
SHR r/m32, imm8 | C1 /5 ib | Unsigned divide r/m32 by 2, imm8 times. | |
SHR r/m64, imm8 | REX.W + C1 /5 ib | Unsigned divide r/m64 by 2, imm8 times. | |
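The difference between the arithmetic and logical right shifts is what fills the vacated high bits: SAR replicates the sign bit (rounding toward negative infinity, hence the asterisk on "divide"), while SHR fills with zeros. A NASM-syntax sketch:

```asm
mov eax, -8
sar eax, 1      ; EAX = -4         (sign bit replicated)
mov eax, -8
shr eax, 1      ; EAX = 0x7FFFFFFC (zero filled)
mov eax, -7
sar eax, 1      ; EAX = -4, not -3: SAR rounds toward -infinity
```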
SARX r32a, r/m32, r32b | VEX.NDS.LZ.F3.0F38.W0 F7 /r | bmi2 | Shift r/m32 arithmetically right with count specified in r32b. |
SHLX r32a, r/m32, r32b | VEX.NDS.LZ.66.0F38.W0 F7 /r | bmi2 | Shift r/m32 logically left with count specified in r32b. |
SHRX r32a, r/m32, r32b | VEX.NDS.LZ.F2.0F38.W0 F7 /r | bmi2 | Shift r/m32 logically right with count specified in r32b. |
SARX r64a, r/m64, r64b | VEX.NDS.LZ.F3.0F38.W1 F7 /r | bmi2 | Shift r/m64 arithmetically right with count specified in r64b. |
SHLX r64a, r/m64, r64b | VEX.NDS.LZ.66.0F38.W1 F7 /r | bmi2 | Shift r/m64 logically left with count specified in r64b. |
SHRX r64a, r/m64, r64b | VEX.NDS.LZ.F2.0F38.W1 F7 /r | bmi2 | Shift r/m64 logically right with count specified in r64b. |
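The BMI2 shift forms take the count from any general register, leave EFLAGS untouched, and are non-destructive, which removes the classic CL and flag dependencies of the legacy shifts. A NASM-syntax sketch:

```asm
mov  ecx, 5
shlx eax, ebx, ecx   ; EAX = EBX << 5; EBX and EFLAGS are unchanged
rorx edx, ebx, 7     ; EDX = EBX rotated right by 7, also without touching flags
```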
SBB AL, imm8 | 1C ib | Subtract with borrow imm8 from AL. | |
SBB AX, imm16 | 1D iw | Subtract with borrow imm16 from AX. | |
SBB EAX, imm32 | 1D id | Subtract with borrow imm32 from EAX. | |
SBB RAX, imm32 | REX.W + 1D id | Subtract with borrow sign-extended imm32 to 64-bits from RAX. | |
SBB r/m8, imm8 | 80 /3 ib | Subtract with borrow imm8 from r/m8. | |
SBB r/m8, imm8 | REX + 80 /3 ib | Subtract with borrow imm8 from r/m8. | |
SBB r/m16, imm16 | 81 /3 iw | Subtract with borrow imm16 from r/m16. | |
SBB r/m32, imm32 | 81 /3 id | Subtract with borrow imm32 from r/m32. | |
SBB r/m64, imm32 | REX.W + 81 /3 id | Subtract with borrow sign-extended imm32 to 64-bits from r/m64. | |
SBB r/m16, imm8 | 83 /3 ib | Subtract with borrow sign-extended imm8 from r/m16. | |
SBB r/m32, imm8 | 83 /3 ib | Subtract with borrow sign-extended imm8 from r/m32. | |
SBB r/m64, imm8 | REX.W + 83 /3 ib | Subtract with borrow sign-extended imm8 from r/m64. | |
SBB r/m8, r8 | 18 /r | Subtract with borrow r8 from r/m8. | |
SBB r/m8, r8 | REX + 18 /r | Subtract with borrow r8 from r/m8. | |
SBB r/m16, r16 | 19 /r | Subtract with borrow r16 from r/m16. | |
SBB r/m32, r32 | 19 /r | Subtract with borrow r32 from r/m32. | |
SBB r/m64, r64 | REX.W + 19 /r | Subtract with borrow r64 from r/m64. | |
SBB r8, r/m8 | 1A /r | Subtract with borrow r/m8 from r8. | |
SBB r8, r/m8 | REX + 1A /r | Subtract with borrow r/m8 from r8. | |
SBB r16, r/m16 | 1B /r | Subtract with borrow r/m16 from r16. | |
SBB r32, r/m32 | 1B /r | Subtract with borrow r/m32 from r32. | |
SBB r64, r/m64 | REX.W + 1B /r | Subtract with borrow r/m64 from r64. | |
SCAS m8 | AE | Compare AL with byte at ES:(E)DI or RDI, then set status flags. | |
SCAS m16 | AF | Compare AX with word at ES:(E)DI or RDI, then set status flags. | |
SCAS m32 | AF | Compare EAX with doubleword at ES:(E)DI or RDI, then set status flags. | |
SCAS m64 | REX.W + AF | Compare RAX with quadword at RDI or EDI, then set status flags. | |
SCASB | AE | Compare AL with byte at ES:(E)DI or RDI, then set status flags. | |
SCASW | AF | Compare AX with word at ES:(E)DI or RDI, then set status flags. | |
SCASD | AF | Compare EAX with doubleword at ES:(E)DI or RDI, then set status flags. | |
SCASQ | REX.W + AF | Compare RAX with quadword at RDI or EDI, then set status flags. | |
SETA r/m8 | 0F 97 | Set byte if above (CF=0 and ZF=0). | |
SETA r/m8 | REX + 0F 97 | Set byte if above (CF=0 and ZF=0). | |
SETAE r/m8 | 0F 93 | Set byte if above or equal (CF=0). | |
SETAE r/m8 | REX + 0F 93 | Set byte if above or equal (CF=0). | |
SETB r/m8 | 0F 92 | Set byte if below (CF=1). | |
SETB r/m8 | REX + 0F 92 | Set byte if below (CF=1). | |
SETBE r/m8 | 0F 96 | Set byte if below or equal (CF=1 or ZF=1). | |
SETBE r/m8 | REX + 0F 96 | Set byte if below or equal (CF=1 or ZF=1). | |
SETC r/m8 | 0F 92 | Set byte if carry (CF=1). | |
SETC r/m8 | REX + 0F 92 | Set byte if carry (CF=1). | |
SETE r/m8 | 0F 94 | Set byte if equal (ZF=1). | |
SETE r/m8 | REX + 0F 94 | Set byte if equal (ZF=1). | |
SETG r/m8 | 0F 9F | Set byte if greater (ZF=0 and SF=OF). | |
SETG r/m8 | REX + 0F 9F | Set byte if greater (ZF=0 and SF=OF). | |
SETGE r/m8 | 0F 9D | Set byte if greater or equal (SF=OF). | |
SETGE r/m8 | REX + 0F 9D | Set byte if greater or equal (SF=OF). | |
SETL r/m8 | 0F 9C | Set byte if less (SF≠OF). | |
SETL r/m8 | REX + 0F 9C | Set byte if less (SF≠OF). | |
SETLE r/m8 | 0F 9E | Set byte if less or equal (ZF=1 or SF≠OF). | |
SETLE r/m8 | REX + 0F 9E | Set byte if less or equal (ZF=1 or SF≠OF). | |
SETNA r/m8 | 0F 96 | Set byte if not above (CF=1 or ZF=1). | |
SETNA r/m8 | REX + 0F 96 | Set byte if not above (CF=1 or ZF=1). | |
SETNAE r/m8 | 0F 92 | Set byte if not above or equal (CF=1). | |
SETNAE r/m8 | REX + 0F 92 | Set byte if not above or equal (CF=1). | |
SETNB r/m8 | 0F 93 | Set byte if not below (CF=0). | |
SETNB r/m8 | REX + 0F 93 | Set byte if not below (CF=0). | |
SETNBE r/m8 | 0F 97 | Set byte if not below or equal (CF=0 and ZF=0). | |
SETNBE r/m8 | REX + 0F 97 | Set byte if not below or equal (CF=0 and ZF=0). | |
SETNC r/m8 | 0F 93 | Set byte if not carry (CF=0). | |
SETNC r/m8 | REX + 0F 93 | Set byte if not carry (CF=0). | |
SETNE r/m8 | 0F 95 | Set byte if not equal (ZF=0). | |
SETNE r/m8 | REX + 0F 95 | Set byte if not equal (ZF=0). | |
SETNG r/m8 | 0F 9E | Set byte if not greater (ZF=1 or SF≠OF). | |
SETNG r/m8 | REX + 0F 9E | Set byte if not greater (ZF=1 or SF≠OF). | |
SETNGE r/m8 | 0F 9C | Set byte if not greater or equal (SF≠OF). | |
SETNGE r/m8 | REX + 0F 9C | Set byte if not greater or equal (SF≠OF). | |
SETNL r/m8 | 0F 9D | Set byte if not less (SF=OF). | |
SETNL r/m8 | REX + 0F 9D | Set byte if not less (SF=OF). | |
SETNLE r/m8 | 0F 9F | Set byte if not less or equal (ZF=0 and SF=OF). | |
SETNLE r/m8 | REX + 0F 9F | Set byte if not less or equal (ZF=0 and SF=OF). | |
SETNO r/m8 | 0F 91 | Set byte if not overflow (OF=0). | |
SETNO r/m8 | REX + 0F 91 | Set byte if not overflow (OF=0). | |
SETNP r/m8 | 0F 9B | Set byte if not parity (PF=0). | |
SETNP r/m8 | REX + 0F 9B | Set byte if not parity (PF=0). | |
SETNS r/m8 | 0F 99 | Set byte if not sign (SF=0). | |
SETNS r/m8 | REX + 0F 99 | Set byte if not sign (SF=0). | |
SETNZ r/m8 | 0F 95 | Set byte if not zero (ZF=0). | |
SETNZ r/m8 | REX + 0F 95 | Set byte if not zero (ZF=0). | |
SETO r/m8 | 0F 90 | Set byte if overflow (OF=1). | |
SETO r/m8 | REX + 0F 90 | Set byte if overflow (OF=1). | |
SETP r/m8 | 0F 9A | Set byte if parity (PF=1). | |
SETP r/m8 | REX + 0F 9A | Set byte if parity (PF=1). | |
SETPE r/m8 | 0F 9A | Set byte if parity even (PF=1). | |
SETPE r/m8 | REX + 0F 9A | Set byte if parity even (PF=1). | |
SETPO r/m8 | 0F 9B | Set byte if parity odd (PF=0). | |
SETPO r/m8 | REX + 0F 9B | Set byte if parity odd (PF=0). | |
SETS r/m8 | 0F 98 | Set byte if sign (SF=1). | |
SETS r/m8 | REX + 0F 98 | Set byte if sign (SF=1). | |
SETZ r/m8 | 0F 94 | Set byte if zero (ZF=1). | |
SETZ r/m8 | REX + 0F 94 | Set byte if zero (ZF=1). | |
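SETcc turns a flag condition into a 0/1 byte, which is the standard way to materialize a comparison result without a branch. A minimal NASM-syntax sketch (the register roles are illustrative):

```asm
; EAX = (EDI < ESI) ? 1 : 0, signed compare, branch-free
cmp   edi, esi
setl  al          ; AL = 1 when SF≠OF
movzx eax, al     ; zero-extend the byte to the full register
```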
SFENCE | 0F AE F8 | Serializes store operations. | |
SGDT m | 0F 01 /0 | Store GDTR to m. | |
SHA1MSG1 xmm1, xmm2/m128 | 0F 38 C9 /r | sha | Performs an intermediate calculation for the next four SHA1 message dwords using previous message dwords from xmm1 and xmm2/m128, storing the result in xmm1. |
SHA1MSG2 xmm1, xmm2/m128 | 0F 38 CA /r | sha | Performs the final calculation for the next four SHA1 message dwords using intermediate results from xmm1 and the previous message dwords from xmm2/m128, storing the result in xmm1. |
SHA1NEXTE xmm1, xmm2/m128 | 0F 38 C8 /r | sha | Calculates SHA1 state variable E after four rounds of operation from the current SHA1 state variable A in xmm1. The calculated value of the SHA1 state variable E is added to the scheduled dwords in xmm2/m128, and stored with some of the scheduled dwords in xmm1. |
SHA1RNDS4 xmm1, xmm2/m128, imm8 | 0F 3A CC /r ib | sha | Performs four rounds of SHA1 operation operating on SHA1 state (A,B,C,D) from xmm1, with a pre-computed sum of the next 4 round message dwords and state variable E from xmm2/m128. The immediate byte controls logic functions and round constants. |
SHA256MSG1 xmm1, xmm2/m128 | 0F 38 CC /r | sha | Performs an intermediate calculation for the next four SHA256 message dwords using previous message dwords from xmm1 and xmm2/m128, storing the result in xmm1. |
SHA256MSG2 xmm1, xmm2/m128 | 0F 38 CD /r | sha | Performs the final calculation for the next four SHA256 message dwords using previous message dwords from xmm1 and xmm2/m128, storing the result in xmm1. |
SHA256RNDS2 xmm1, xmm2/m128, <XMM0> | 0F 38 CB /r | sha | Performs 2 rounds of SHA256 operation using an initial SHA256 state (C,D,G,H) from xmm1, an initial SHA256 state (A,B,E,F) from xmm2/m128, and a pre-computed sum of the next 2 round message dwords and the corresponding round constants from the implicit operand XMM0, storing the updated SHA256 state (A,B,E,F) result in xmm1. |
SHLD r/m16, r16, imm8 | 0F A4 /r ib | Shift r/m16 to left imm8 places while shifting bits from r16 in from the right. | |
SHLD r/m16, r16, CL | 0F A5 /r | Shift r/m16 to left CL places while shifting bits from r16 in from the right. | |
SHLD r/m32, r32, imm8 | 0F A4 /r ib | Shift r/m32 to left imm8 places while shifting bits from r32 in from the right. | |
SHLD r/m64, r64, imm8 | REX.W + 0F A4 /r ib | Shift r/m64 to left imm8 places while shifting bits from r64 in from the right. | |
SHLD r/m32, r32, CL | 0F A5 /r | Shift r/m32 to left CL places while shifting bits from r32 in from the right. | |
SHLD r/m64, r64, CL | REX.W + 0F A5 /r | Shift r/m64 to left CL places while shifting bits from r64 in from the right. | |
SHRD r/m16, r16, imm8 | 0F AC /r ib | Shift r/m16 to right imm8 places while shifting bits from r16 in from the left. | |
SHRD r/m16, r16, CL | 0F AD /r | Shift r/m16 to right CL places while shifting bits from r16 in from the left. | |
SHRD r/m32, r32, imm8 | 0F AC /r ib | Shift r/m32 to right imm8 places while shifting bits from r32 in from the left. | |
SHRD r/m64, r64, imm8 | REX.W + 0F AC /r ib | Shift r/m64 to right imm8 places while shifting bits from r64 in from the left. | |
SHRD r/m32, r32, CL | 0F AD /r | Shift r/m32 to right CL places while shifting bits from r32 in from the left. | |
SHRD r/m64, r64, CL | REX.W + 0F AD /r | Shift r/m64 to right CL places while shifting bits from r64 in from the left. | |
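SHLD/SHRD shift bits across a register pair, which is how pre-BMI2 code implements wide shifts. A NASM-syntax sketch of a 128-bit left shift by CL (valid for counts 0–63):

```asm
; value in RDX:RAX, high half in RDX
shld rdx, rax, cl   ; high half takes the bits leaving the low half
shl  rax, cl        ; then shift the low half
```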
SHUFPD xmm1, xmm2/m128, imm8 | 66 0F C6 /r ib | sse2 | Shuffle two pairs of double-precision floating-point values from xmm1 and xmm2/m128 using imm8 to select from each pair, interleaved result is stored in xmm1. |
VSHUFPD xmm1, xmm2, xmm3/m128, imm8 | VEX.NDS.128.66.0F.WIG C6 /r ib | avx | Shuffle two pairs of double-precision floating-point values from xmm2 and xmm3/m128 using imm8 to select from each pair, interleaved result is stored in xmm1. |
VSHUFPD ymm1, ymm2, ymm3/m256, imm8 | VEX.NDS.256.66.0F.WIG C6 /r ib | avx | Shuffle four pairs of double-precision floating-point values from ymm2 and ymm3/m256 using imm8 to select from each pair, interleaved result is stored in ymm1. |
VSHUFPD xmm1{k1}{z}, xmm2, xmm3/m128/m64bcst, imm8 | EVEX.NDS.128.66.0F.W1 C6 /r ib | avx512 | Shuffle two pairs of double-precision floating-point values from xmm2 and xmm3/m128/m64bcst using imm8 to select from each pair. Store interleaved results in xmm1 subject to writemask k1. |
VSHUFPD ymm1{k1}{z}, ymm2, ymm3/m256/m64bcst, imm8 | EVEX.NDS.256.66.0F.W1 C6 /r ib | avx512 | Shuffle four pairs of double-precision floating-point values from ymm2 and ymm3/m256/m64bcst using imm8 to select from each pair. Store interleaved results in ymm1 subject to writemask k1. |
VSHUFPD zmm1{k1}{z}, zmm2, zmm3/m512/m64bcst, imm8 | EVEX.NDS.512.66.0F.W1 C6 /r ib | avx512 | Shuffle eight pairs of double-precision floating-point values from zmm2 and zmm3/m512/m64bcst using imm8 to select from each pair. Store interleaved results in zmm1 subject to writemask k1. |
SHUFPS xmm1, xmm2/m128, imm8 | 0F C6 /r ib | sse | Select from quadruplet of single-precision floating-point values in xmm1 and xmm2/m128 using imm8, interleaved result pairs are stored in xmm1. |
VSHUFPS xmm1, xmm2, xmm3/m128, imm8 | VEX.NDS.128.0F.WIG C6 /r ib | avx | Select from quadruplet of single-precision floating-point values in xmm2 and xmm3/m128 using imm8, interleaved result pairs are stored in xmm1. |
VSHUFPS ymm1, ymm2, ymm3/m256, imm8 | VEX.NDS.256.0F.WIG C6 /r ib | avx | Select from quadruplet of single-precision floating-point values in ymm2 and ymm3/m256 using imm8, interleaved result pairs are stored in ymm1. |
VSHUFPS xmm1{k1}{z}, xmm2, xmm3/m128/m32bcst, imm8 | EVEX.NDS.128.0F.W0 C6 /r ib | avx512 | Select from quadruplet of single-precision floating-point values in xmm2 and xmm3/m128/m32bcst using imm8, interleaved result pairs are stored in xmm1, subject to writemask k1. |
VSHUFPS ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst, imm8 | EVEX.NDS.256.0F.W0 C6 /r ib | avx512 | Select from quadruplet of single-precision floating-point values in ymm2 and ymm3/m256/m32bcst using imm8, interleaved result pairs are stored in ymm1, subject to writemask k1. |
VSHUFPS zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst, imm8 | EVEX.NDS.512.0F.W0 C6 /r ib | avx512 | Select from quadruplet of single-precision floating-point values in zmm2 and zmm3/m512/m32bcst using imm8, interleaved result pairs are stored in zmm1, subject to writemask k1. |
SIDT m | 0F 01 /1 | Store IDTR to m. | |
SLDT r/m16 | 0F 00 /0 | Stores segment selector from LDTR in r/m16. | |
SLDT r64/m16 | REX.W + 0F 00 /0 | Stores segment selector from LDTR in r64/m16. | |
SMSW r/m16 | 0F 01 /4 | Store machine status word to r/m16. | |
SMSW r32/m16 | 0F 01 /4 | Store machine status word in low-order 16 bits of r32/m16; high-order 16 bits of r32 are undefined. | |
SMSW r64/m16 | REX.W + 0F 01 /4 | Store machine status word in low-order 16 bits of r64/m16; high-order bits of r64 are undefined. | |
SQRTPD xmm1, xmm2/m128 | 66 0F 51 /r | sse2 | Computes square roots of the packed double-precision floating-point values in xmm2/m128 and stores the result in xmm1. |
VSQRTPD xmm1, xmm2/m128 | VEX.128.66.0F.WIG 51 /r | avx | Computes square roots of the packed double-precision floating-point values in xmm2/m128 and stores the result in xmm1. |
VSQRTPD ymm1, ymm2/m256 | VEX.256.66.0F.WIG 51 /r | avx | Computes square roots of the packed double-precision floating-point values in ymm2/m256 and stores the result in ymm1. |
VSQRTPD xmm1 {k1}{z}, xmm2/m128/m64bcst | EVEX.128.66.0F.W1 51 /r | avx512 | Computes square roots of the packed double-precision floating-point values in xmm2/m128/m64bcst and stores the result in xmm1 subject to writemask k1. |
VSQRTPD ymm1 {k1}{z}, ymm2/m256/m64bcst | EVEX.256.66.0F.W1 51 /r | avx512 | Computes square roots of the packed double-precision floating-point values in ymm2/m256/m64bcst and stores the result in ymm1 subject to writemask k1. |
VSQRTPD zmm1 {k1}{z}, zmm2/m512/m64bcst{er} | EVEX.512.66.0F.W1 51 /r | avx512 | Computes square roots of the packed double-precision floating-point values in zmm2/m512/m64bcst and stores the result in zmm1 subject to writemask k1. |
SQRTPS xmm1, xmm2/m128 | 0F 51 /r | sse | Computes square roots of the packed single-precision floating-point values in xmm2/m128 and stores the result in xmm1. |
VSQRTPS xmm1, xmm2/m128 | VEX.128.0F.WIG 51 /r | avx | Computes square roots of the packed single-precision floating-point values in xmm2/m128 and stores the result in xmm1. |
VSQRTPS ymm1, ymm2/m256 | VEX.256.0F.WIG 51 /r | avx | Computes square roots of the packed single-precision floating-point values in ymm2/m256 and stores the result in ymm1. |
VSQRTPS xmm1 {k1}{z}, xmm2/m128/m32bcst | EVEX.128.0F.W0 51 /r | avx512 | Computes square roots of the packed single-precision floating-point values in xmm2/m128/m32bcst and stores the result in xmm1 subject to writemask k1. |
VSQRTPS ymm1 {k1}{z}, ymm2/m256/m32bcst | EVEX.256.0F.W0 51 /r | avx512 | Computes square roots of the packed single-precision floating-point values in ymm2/m256/m32bcst and stores the result in ymm1 subject to writemask k1. |
VSQRTPS zmm1 {k1}{z}, zmm2/m512/m32bcst{er} | EVEX.512.0F.W0 51 /r | avx512 | Computes square roots of the packed single-precision floating-point values in zmm2/m512/m32bcst and stores the result in zmm1 subject to writemask k1. |
SQRTSD xmm1, xmm2/m64 | F2 0F 51 /r | sse2 | Computes square root of the low double-precision floating-point value in xmm2/m64 and stores the results in xmm1. |
VSQRTSD xmm1, xmm2, xmm3/m64 | VEX.NDS.128.F2.0F.WIG 51 /r | avx | Computes square root of the low double-precision floating-point value in xmm3/m64 and stores the results in xmm1. Also, upper double-precision floating-point value (bits[127:64]) from xmm2 is copied to xmm1[127:64]. |
VSQRTSD xmm1 {k1}{z}, xmm2, xmm3/m64{er} | EVEX.NDS.LIG.F2.0F.W1 51 /r | avx512 | Computes square root of the low double-precision floating-point value in xmm3/m64 and stores the results in xmm1 under writemask k1. Also, upper double-precision floating-point value (bits[127:64]) from xmm2 is copied to xmm1[127:64]. |
SQRTSS xmm1, xmm2/m32 | F3 0F 51 /r | sse | Computes square root of the low single-precision floating-point value in xmm2/m32 and stores the results in xmm1. |
VSQRTSS xmm1, xmm2, xmm3/m32 | VEX.NDS.128.F3.0F.WIG 51 /r | avx | Computes square root of the low single-precision floating-point value in xmm3/m32 and stores the results in xmm1. Also, upper single-precision floating-point values (bits[127:32]) from xmm2 are copied to xmm1[127:32]. |
VSQRTSS xmm1 {k1}{z}, xmm2, xmm3/m32{er} | EVEX.NDS.LIG.F3.0F.W0 51 /r | avx512 | Computes square root of the low single-precision floating-point value in xmm3/m32 and stores the results in xmm1 under writemask k1. Also, upper single-precision floating-point values (bits[127:32]) from xmm2 are copied to xmm1[127:32]. |
STAC | 0F 01 CB | Set the AC flag in the EFLAGS register. | |
STC | F9 | Set CF flag. | |
STD | FD | Set DF flag. | |
STI | FB | Set interrupt flag; external, maskable interrupts enabled at the end of the next instruction. | |
STMXCSR m32 | 0F AE /3 | sse | Store contents of MXCSR register to m32. |
VSTMXCSR m32 | VEX.LZ.0F.WIG AE /3 | avx | Store contents of MXCSR register to m32. |
STOS m8 | AA | For legacy mode, store AL at address ES:(E)DI; For 64-bit mode store AL at address RDI or EDI. | |
STOS m16 | AB | For legacy mode, store AX at address ES:(E)DI; For 64-bit mode store AX at address RDI or EDI. | |
STOS m32 | AB | For legacy mode, store EAX at address ES:(E)DI; For 64-bit mode store EAX at address RDI or EDI. | |
STOS m64 | REX.W + AB | Store RAX at address RDI or EDI. | |
STOSB | AA | For legacy mode, store AL at address ES:(E)DI; For 64-bit mode store AL at address RDI or EDI. | |
STOSW | AB | For legacy mode, store AX at address ES:(E)DI; For 64-bit mode store AX at address RDI or EDI. | |
STOSD | AB | For legacy mode, store EAX at address ES:(E)DI; For 64-bit mode store EAX at address RDI or EDI. | |
STOSQ | REX.W + AB | Store RAX at address RDI or EDI. | |
STR r/m16 | 0F 00 /1 | Stores segment selector from TR in r/m16. | |
SUB AL, imm8 | 2C ib | Subtract imm8 from AL. | |
SUB AX, imm16 | 2D iw | Subtract imm16 from AX. | |
SUB EAX, imm32 | 2D id | Subtract imm32 from EAX. | |
SUB RAX, imm32 | REX.W + 2D id | Subtract imm32 sign-extended to 64-bits from RAX. | |
SUB r/m8, imm8 | 80 /5 ib | Subtract imm8 from r/m8. | |
SUB r/m8, imm8 | REX + 80 /5 ib | Subtract imm8 from r/m8. | |
SUB r/m16, imm16 | 81 /5 iw | Subtract imm16 from r/m16. | |
SUB r/m32, imm32 | 81 /5 id | Subtract imm32 from r/m32. | |
SUB r/m64, imm32 | REX.W + 81 /5 id | Subtract imm32 sign-extended to 64-bits from r/m64. | |
SUB r/m16, imm8 | 83 /5 ib | Subtract sign-extended imm8 from r/m16. | |
SUB r/m32, imm8 | 83 /5 ib | Subtract sign-extended imm8 from r/m32. | |
SUB r/m64, imm8 | REX.W + 83 /5 ib | Subtract sign-extended imm8 from r/m64. | |
SUB r/m8, r8 | 28 /r | Subtract r8 from r/m8. | |
SUB r/m8, r8 | REX + 28 /r | Subtract r8 from r/m8. | |
SUB r/m16, r16 | 29 /r | Subtract r16 from r/m16. | |
SUB r/m32, r32 | 29 /r | Subtract r32 from r/m32. | |
SUB r/m64, r64 | REX.W + 29 /r | Subtract r64 from r/m64. | |
SUB r8, r/m8 | 2A /r | Subtract r/m8 from r8. | |
SUB r8, r/m8 | REX + 2A /r | Subtract r/m8 from r8. | |
SUB r16, r/m16 | 2B /r | Subtract r/m16 from r16. | |
SUB r32, r/m32 | 2B /r | Subtract r/m32 from r32. | |
SUB r64, r/m64 | REX.W + 2B /r | Subtract r/m64 from r64. | |
SUBPD xmm1, xmm2/m128 | 66 0F 5C /r | sse2 | Subtract packed double-precision floating-point values in xmm2/mem from xmm1 and store result in xmm1. |
VSUBPD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 5C /r | avx | Subtract packed double-precision floating-point values in xmm3/mem from xmm2 and store result in xmm1. |
VSUBPD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 5C /r | avx | Subtract packed double-precision floating-point values in ymm3/mem from ymm2 and store result in ymm1. |
VSUBPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F.W1 5C /r | avx512 | Subtract packed double-precision floating-point values in xmm3/m128/m64bcst from xmm2 and store result in xmm1 with writemask k1. |
VSUBPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F.W1 5C /r | avx512 | Subtract packed double-precision floating-point values in ymm3/m256/m64bcst from ymm2 and store result in ymm1 with writemask k1. |
VSUBPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.NDS.512.66.0F.W1 5C /r | avx512 | Subtract packed double-precision floating-point values in zmm3/m512/m64bcst from zmm2 and store result in zmm1 with writemask k1. |
SUBPS xmm1, xmm2/m128 | 0F 5C /r | sse | Subtract packed single-precision floating-point values in xmm2/mem from xmm1 and store result in xmm1. |
VSUBPS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.0F.WIG 5C /r | avx | Subtract packed single-precision floating-point values in xmm3/mem from xmm2 and store result in xmm1. |
VSUBPS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.0F.WIG 5C /r | avx | Subtract packed single-precision floating-point values in ymm3/mem from ymm2 and store result in ymm1. |
VSUBPS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.0F.W0 5C /r | avx512 | Subtract packed single-precision floating-point values in xmm3/m128/m32bcst from xmm2 and store result in xmm1 with writemask k1. |
VSUBPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.0F.W0 5C /r | avx512 | Subtract packed single-precision floating-point values in ymm3/m256/m32bcst from ymm2 and store result in ymm1 with writemask k1. |
VSUBPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.NDS.512.0F.W0 5C /r | avx512 | Subtract packed single-precision floating-point values in zmm3/m512/m32bcst from zmm2 and store result in zmm1 with writemask k1. |
SUBSD xmm1, xmm2/m64 | F2 0F 5C /r | sse2 | Subtract the low double-precision floating-point value in xmm2/m64 from xmm1 and store the result in xmm1. |
VSUBSD xmm1, xmm2, xmm3/m64 | VEX.NDS.128.F2.0F.WIG 5C /r | avx | Subtract the low double-precision floating-point value in xmm3/m64 from xmm2 and store the result in xmm1. |
VSUBSD xmm1 {k1}{z}, xmm2, xmm3/m64{er} | EVEX.NDS.LIG.F2.0F.W1 5C /r | avx512 | Subtract the low double-precision floating-point value in xmm3/m64 from xmm2 and store the result in xmm1 under writemask k1. |
SUBSS xmm1, xmm2/m32 | F3 0F 5C /r | sse | Subtract the low single-precision floating-point value in xmm2/m32 from xmm1 and store the result in xmm1. |
VSUBSS xmm1, xmm2, xmm3/m32 | VEX.NDS.128.F3.0F.WIG 5C /r | avx | Subtract the low single-precision floating-point value in xmm3/m32 from xmm2 and store the result in xmm1. |
VSUBSS xmm1 {k1}{z}, xmm2, xmm3/m32{er} | EVEX.NDS.LIG.F3.0F.W0 5C /r | avx512 | Subtract the low single-precision floating-point value in xmm3/m32 from xmm2 and store the result in xmm1 under writemask k1. |
SWAPGS | 0F 01 F8 | Exchanges the current GS base register value with the value contained in MSR address C0000102H. | |
SYSCALL | 0F 05 | Fast call to privilege level 0 system procedures. | |
SYSENTER | 0F 34 | Fast call to privilege level 0 system procedures. | |
SYSEXIT | 0F 35 | Fast return to privilege level 3 user code. | |
SYSEXIT | REX.W + 0F 35 | Fast return to 64-bit mode privilege level 3 user code. | |
SYSRET | 0F 07 | Return to compatibility mode from fast system call. | |
SYSRET | REX.W + 0F 07 | Return to 64-bit mode from fast system call. | |
TEST AL, imm8 | A8 ib | AND imm8 with AL; set SF, ZF, PF according to result. | |
TEST AX, imm16 | A9 iw | AND imm16 with AX; set SF, ZF, PF according to result. | |
TEST EAX, imm32 | A9 id | AND imm32 with EAX; set SF, ZF, PF according to result. | |
TEST RAX, imm32 | REX.W + A9 id | AND imm32 sign-extended to 64-bits with RAX; set SF, ZF, PF according to result. | |
TEST r/m8, imm8 | F6 /0 ib | AND imm8 with r/m8; set SF, ZF, PF according to result. | |
TEST r/m8, imm8 | REX + F6 /0 ib | AND imm8 with r/m8; set SF, ZF, PF according to result. | |
TEST r/m16, imm16 | F7 /0 iw | AND imm16 with r/m16; set SF, ZF, PF according to result. | |
TEST r/m32, imm32 | F7 /0 id | AND imm32 with r/m32; set SF, ZF, PF according to result. | |
TEST r/m64, imm32 | REX.W + F7 /0 id | AND imm32 sign-extended to 64-bits with r/m64; set SF, ZF, PF according to result. | |
TEST r/m8, r8 | 84 /r | AND r8 with r/m8; set SF, ZF, PF according to result. | |
TEST r/m8, r8 | REX + 84 /r | AND r8 with r/m8; set SF, ZF, PF according to result. | |
TEST r/m16, r16 | 85 /r | AND r16 with r/m16; set SF, ZF, PF according to result. | |
TEST r/m32, r32 | 85 /r | AND r32 with r/m32; set SF, ZF, PF according to result. | |
TEST r/m64, r64 | REX.W + 85 /r | AND r64 with r/m64; set SF, ZF, PF according to result. | |
TZCNT r16, r/m16 | F3 0F BC /r | bmi1 | Count the number of trailing zero bits in r/m16, return result in r16. |
TZCNT r32, r/m32 | F3 0F BC /r | bmi1 | Count the number of trailing zero bits in r/m32, return result in r32. |
TZCNT r64, r/m64 | F3 REX.W 0F BC /r | bmi1 | Count the number of trailing zero bits in r/m64, return result in r64. |
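TZCNT gives the index of the lowest set bit directly, with CF flagging a zero source (in which case the result is the operand width). A NASM-syntax sketch:

```asm
mov   eax, 40      ; binary 101000
tzcnt ecx, eax     ; ECX = 3: index of the lowest set bit
xor   eax, eax
tzcnt ecx, eax     ; zero source: ECX = 32 (operand size) and CF = 1
```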
UCOMISD xmm1, xmm2/m64 | 66 0F 2E /r | sse2 | Compare low double-precision floating-point values in xmm1 and xmm2/mem64 and set the EFLAGS flags accordingly. |
VUCOMISD xmm1, xmm2/m64 | VEX.128.66.0F.WIG 2E /r | avx | Compare low double-precision floating-point values in xmm1 and xmm2/mem64 and set the EFLAGS flags accordingly. |
VUCOMISD xmm1, xmm2/m64{sae} | EVEX.LIG.66.0F.W1 2E /r | avx512 | Compare low double-precision floating-point values in xmm1 and xmm2/m64 and set the EFLAGS flags accordingly. |
UCOMISS xmm1, xmm2/m32 | 0F 2E /r | sse | Compare low single-precision floating-point values in xmm1 and xmm2/mem32 and set the EFLAGS flags accordingly. |
VUCOMISS xmm1, xmm2/m32 | VEX.128.0F.WIG 2E /r | avx | Compare low single-precision floating-point values in xmm1 and xmm2/mem32 and set the EFLAGS flags accordingly. |
VUCOMISS xmm1, xmm2/m32{sae} | EVEX.LIG.0F.W0 2E /r | avx512 | Compare low single-precision floating-point values in xmm1 and xmm2/mem32 and set the EFLAGS flags accordingly. |
UD2 | 0F 0B | Raise invalid opcode exception. | |
UNPCKHPD xmm1, xmm2/m128 | 66 0F 15 /r | sse2 | Unpacks and interleaves double-precision floating-point values from high quadwords of xmm1 and xmm2/m128. |
VUNPCKHPD xmm1, xmm2, xmm3/m128 | VEX.128.66.0F.WIG 15 /r | avx | Unpacks and interleaves double-precision floating-point values from high quadwords of xmm2 and xmm3/m128. |
VUNPCKHPD ymm1, ymm2, ymm3/m256 | VEX.256.66.0F.WIG 15 /r | avx | Unpacks and interleaves double-precision floating-point values from high quadwords of ymm2 and ymm3/m256. |
VUNPCKHPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.128.66.0F.W1 15 /r | avx512 | Unpacks and interleaves double-precision floating-point values from high quadwords of xmm2 and xmm3/m128/m64bcst subject to writemask k1. |
VUNPCKHPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.256.66.0F.W1 15 /r | avx512 | Unpacks and interleaves double-precision floating-point values from high quadwords of ymm2 and ymm3/m256/m64bcst subject to writemask k1. |
VUNPCKHPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.512.66.0F.W1 15 /r | avx512 | Unpacks and interleaves double-precision floating-point values from high quadwords of zmm2 and zmm3/m512/m64bcst subject to writemask k1. |
UNPCKHPS xmm1, xmm2/m128 | 0F 15 /r | sse | Unpacks and interleaves single-precision floating-point values from high quadwords of xmm1 and xmm2/m128. |
VUNPCKHPS xmm1, xmm2, xmm3/m128 | VEX.128.0F.WIG 15 /r | avx | Unpacks and interleaves single-precision floating-point values from high quadwords of xmm2 and xmm3/m128. |
VUNPCKHPS ymm1, ymm2, ymm3/m256 | VEX.256.0F.WIG 15 /r | avx | Unpacks and interleaves single-precision floating-point values from high quadwords of ymm2 and ymm3/m256. |
VUNPCKHPS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.128.0F.W0 15 /r | avx512 | Unpacks and interleaves single-precision floating-point values from high quadwords of xmm2 and xmm3/m128/m32bcst and write result to xmm1 subject to writemask k1. |
VUNPCKHPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.256.0F.W0 15 /r | avx512 | Unpacks and interleaves single-precision floating-point values from high quadwords of ymm2 and ymm3/m256/m32bcst and write result to ymm1 subject to writemask k1. |
VUNPCKHPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.512.0F.W0 15 /r | avx512 | Unpacks and interleaves single-precision floating-point values from high quadwords of zmm2 and zmm3/m512/m32bcst and write result to zmm1 subject to writemask k1. |
UNPCKLPD xmm1, xmm2/m128 | 66 0F 14 /r | sse2 | Unpacks and interleaves double-precision floating-point values from low quadwords of xmm1 and xmm2/m128. |
VUNPCKLPD xmm1, xmm2, xmm3/m128 | VEX.128.66.0F.WIG 14 /r | avx | Unpacks and interleaves double-precision floating-point values from low quadwords of xmm2 and xmm3/m128. |
VUNPCKLPD ymm1, ymm2, ymm3/m256 | VEX.256.66.0F.WIG 14 /r | avx | Unpacks and interleaves double-precision floating-point values from low quadwords of ymm2 and ymm3/m256. |
VUNPCKLPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.128.66.0F.W1 14 /r | avx512 | Unpacks and interleaves double-precision floating-point values from low quadwords of xmm2 and xmm3/m128/m64bcst subject to writemask k1. |
VUNPCKLPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.256.66.0F.W1 14 /r | avx512 | Unpacks and interleaves double-precision floating-point values from low quadwords of ymm2 and ymm3/m256/m64bcst subject to writemask k1. |
VUNPCKLPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.512.66.0F.W1 14 /r | avx512 | Unpacks and interleaves double-precision floating-point values from low quadwords of zmm2 and zmm3/m512/m64bcst subject to writemask k1. |
UNPCKLPS xmm1, xmm2/m128 | 0F 14 /r | sse | Unpacks and interleaves single-precision floating-point values from low quadwords of xmm1 and xmm2/m128. |
VUNPCKLPS xmm1, xmm2, xmm3/m128 | VEX.128.0F.WIG 14 /r | avx | Unpacks and interleaves single-precision floating-point values from low quadwords of xmm2 and xmm3/m128. |
VUNPCKLPS ymm1, ymm2, ymm3/m256 | VEX.256.0F.WIG 14 /r | avx | Unpacks and interleaves single-precision floating-point values from low quadwords of ymm2 and ymm3/m256. |
VUNPCKLPS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.128.0F.W0 14 /r | avx512 | Unpacks and interleaves single-precision floating-point values from low quadwords of xmm2 and xmm3/mem and write result to xmm1 subject to writemask k1. |
VUNPCKLPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.256.0F.W0 14 /r | avx512 | Unpacks and interleaves single-precision floating-point values from low quadwords of ymm2 and ymm3/mem and write result to ymm1 subject to writemask k1. |
VUNPCKLPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.512.0F.W0 14 /r | avx512 | Unpacks and interleaves single-precision floating-point values from low quadwords of zmm2 and zmm3/m512/m32bcst and write result to zmm1 subject to writemask k1. |
VALIGND xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst, imm8 | EVEX.NDS.128.66.0F3A.W0 03 /r ib | avx512 | Shift right and merge vectors xmm2 and xmm3/m128/m32bcst with double-word granularity using imm8 as number of elements to shift, and store the final result in xmm1, under writemask. |
VALIGNQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst, imm8 | EVEX.NDS.128.66.0F3A.W1 03 /r ib | avx512 | Shift right and merge vectors xmm2 and xmm3/m128/m64bcst with quad-word granularity using imm8 as number of elements to shift, and store the final result in xmm1, under writemask. |
VALIGND ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst, imm8 | EVEX.NDS.256.66.0F3A.W0 03 /r ib | avx512 | Shift right and merge vectors ymm2 and ymm3/m256/m32bcst with double-word granularity using imm8 as number of elements to shift, and store the final result in ymm1, under writemask. |
VALIGNQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst, imm8 | EVEX.NDS.256.66.0F3A.W1 03 /r ib | avx512 | Shift right and merge vectors ymm2 and ymm3/m256/m64bcst with quad-word granularity using imm8 as number of elements to shift, and store the final result in ymm1, under writemask. |
VALIGND zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst, imm8 | EVEX.NDS.512.66.0F3A.W0 03 /r ib | avx512 | Shift right and merge vectors zmm2 and zmm3/m512/m32bcst with double-word granularity using imm8 as number of elements to shift, and store the final result in zmm1, under writemask. |
VALIGNQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst, imm8 | EVEX.NDS.512.66.0F3A.W1 03 /r ib | avx512 | Shift right and merge vectors zmm2 and zmm3/m512/m64bcst with quad-word granularity using imm8 as number of elements to shift, and store the final result in zmm1, under writemask. |
VBLENDMPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 65 /r | avx512 | Blend double-precision vector xmm2 and double-precision vector xmm3/m128/m64bcst and store the result in xmm1, under control mask. |
VBLENDMPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 65 /r | avx512 | Blend double-precision vector ymm2 and double-precision vector ymm3/m256/m64bcst and store the result in ymm1, under control mask. |
VBLENDMPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F38.W1 65 /r | avx512 | Blend double-precision vector zmm2 and double-precision vector zmm3/m512/m64bcst and store the result in zmm1, under control mask. |
VBLENDMPS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 65 /r | avx512 | Blend single-precision vector xmm2 and single-precision vector xmm3/m128/m32bcst and store the result in xmm1, under control mask. |
VBLENDMPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 65 /r | avx512 | Blend single-precision vector ymm2 and single-precision vector ymm3/m256/m32bcst and store the result in ymm1, under control mask. |
VBLENDMPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F38.W0 65 /r | avx512 | Blend single-precision vector zmm2 and single-precision vector zmm3/m512/m32bcst using k1 as select control and store the result in zmm1. |
VBROADCASTSS xmm1, m32 | VEX.128.66.0F38.W0 18 /r | avx | Broadcast single-precision floating-point element in mem to four locations in xmm1. |
VBROADCASTSS ymm1, m32 | VEX.256.66.0F38.W0 18 /r | avx | Broadcast single-precision floating-point element in mem to eight locations in ymm1. |
VBROADCASTSD ymm1, m64 | VEX.256.66.0F38.W0 19 /r | avx | Broadcast double-precision floating-point element in mem to four locations in ymm1. |
VBROADCASTF128 ymm1, m128 | VEX.256.66.0F38.W0 1A /r | avx | Broadcast 128 bits of floating-point data in mem to low and high 128-bits in ymm1. |
VBROADCASTSD ymm1 {k1}{z}, xmm2/m64 | EVEX.256.66.0F38.W1 19 /r | avx512 | Broadcast low double-precision floating-point element in xmm2/m64 to four locations in ymm1 using writemask k1. |
VBROADCASTSD zmm1 {k1}{z}, xmm2/m64 | EVEX.512.66.0F38.W1 19 /r | avx512 | Broadcast low double-precision floating-point element in xmm2/m64 to eight locations in zmm1 using writemask k1. |
VBROADCASTF32X2 ymm1 {k1}{z}, xmm2/m64 | EVEX.256.66.0F38.W0 19 /r | avx512 | Broadcast two single-precision floating-point elements in xmm2/m64 to locations in ymm1 using writemask k1. |
VBROADCASTF32X2 zmm1 {k1}{z}, xmm2/m64 | EVEX.512.66.0F38.W0 19 /r | avx512 | Broadcast two single-precision floating-point elements in xmm2/m64 to locations in zmm1 using writemask k1. |
VBROADCASTSS xmm1 {k1}{z}, xmm2/m32 | EVEX.128.66.0F38.W0 18 /r | avx512 | Broadcast low single-precision floating-point element in xmm2/m32 to all locations in xmm1 using writemask k1. |
VBROADCASTSS ymm1 {k1}{z}, xmm2/m32 | EVEX.256.66.0F38.W0 18 /r | avx512 | Broadcast low single-precision floating-point element in xmm2/m32 to all locations in ymm1 using writemask k1. |
VBROADCASTSS zmm1 {k1}{z}, xmm2/m32 | EVEX.512.66.0F38.W0 18 /r | avx512 | Broadcast low single-precision floating-point element in xmm2/m32 to all locations in zmm1 using writemask k1. |
VBROADCASTF32X4 ymm1 {k1}{z}, m128 | EVEX.256.66.0F38.W0 1A /r | avx512 | Broadcast 128 bits of 4 single-precision floating-point data in mem to locations in ymm1 using writemask k1. |
VBROADCASTF32X4 zmm1 {k1}{z}, m128 | EVEX.512.66.0F38.W0 1A /r | avx512 | Broadcast 128 bits of 4 single-precision floating-point data in mem to locations in zmm1 using writemask k1. |
VBROADCASTF64X2 ymm1 {k1}{z}, m128 | EVEX.256.66.0F38.W1 1A /r | avx512 | Broadcast 128 bits of 2 double-precision floating-point data in mem to locations in ymm1 using writemask k1. |
VBROADCASTF64X2 zmm1 {k1}{z}, m128 | EVEX.512.66.0F38.W1 1A /r | avx512 | Broadcast 128 bits of 2 double-precision floating-point data in mem to locations in zmm1 using writemask k1. |
VBROADCASTF32X8 zmm1 {k1}{z}, m256 | EVEX.512.66.0F38.W0 1B /r | avx512 | Broadcast 256 bits of 8 single-precision floating-point data in mem to locations in zmm1 using writemask k1. |
VBROADCASTF64X4 zmm1 {k1}{z}, m256 | EVEX.512.66.0F38.W1 1B /r | avx512 | Broadcast 256 bits of 4 double-precision floating-point data in mem to locations in zmm1 using writemask k1. |
VCOMPRESSPD xmm1/m128 {k1}{z}, xmm2 | EVEX.128.66.0F38.W1 8A /r | avx512 | Compress packed double-precision floating-point values from xmm2 to xmm1/m128 using writemask k1. |
VCOMPRESSPD ymm1/m256 {k1}{z}, ymm2 | EVEX.256.66.0F38.W1 8A /r | avx512 | Compress packed double-precision floating-point values from ymm2 to ymm1/m256 using writemask k1. |
VCOMPRESSPD zmm1/m512 {k1}{z}, zmm2 | EVEX.512.66.0F38.W1 8A /r | avx512 | Compress packed double-precision floating-point values from zmm2 to zmm1/m512 using writemask k1. |
VCOMPRESSPS xmm1/m128 {k1}{z}, xmm2 | EVEX.128.66.0F38.W0 8A /r | avx512 | Compress packed single-precision floating-point values from xmm2 to xmm1/m128 using writemask k1. |
VCOMPRESSPS ymm1/m256 {k1}{z}, ymm2 | EVEX.256.66.0F38.W0 8A /r | avx512 | Compress packed single-precision floating-point values from ymm2 to ymm1/m256 using writemask k1. |
VCOMPRESSPS zmm1/m512 {k1}{z}, zmm2 | EVEX.512.66.0F38.W0 8A /r | avx512 | Compress packed single-precision floating-point values from zmm2 to zmm1/m512 using writemask k1. |
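VCOMPRESSPS is the vector "left-pack" primitive: the lanes selected by the mask are gathered contiguously to the low end of the destination. A NASM-syntax sketch (assuming k1 was already loaded with the selection mask):

```asm
; keep only the lanes of zmm0 selected by k1, packed to the bottom of zmm1
vcompressps zmm1 {k1}{z}, zmm0   ; unselected tail lanes are zeroed by {z}
```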
VCVTPD2QQ xmm1 {k1}{z}, xmm2/m128/m64bcst | EVEX.128.66.0F.W1 7B /r | avx512 | Convert two packed double-precision floating-point values from xmm2/m128/m64bcst to two packed quadword integers in xmm1 with writemask k1. |
VCVTPD2QQ ymm1 {k1}{z}, ymm2/m256/m64bcst | EVEX.256.66.0F.W1 7B /r | avx512 | Convert four packed double-precision floating-point values from ymm2/m256/m64bcst to four packed quadword integers in ymm1 with writemask k1. |
VCVTPD2QQ zmm1 {k1}{z}, zmm2/m512/m64bcst{er} | EVEX.512.66.0F.W1 7B /r | avx512 | Convert eight packed double-precision floating-point values from zmm2/m512/m64bcst to eight packed quadword integers in zmm1 with writemask k1. |
VCVTPD2UDQ xmm1 {k1}{z}, xmm2/m128/m64bcst | EVEX.128.0F.W1 79 /r | avx512 | Convert two packed double-precision floating-point values in xmm2/m128/m64bcst to two unsigned doubleword integers in xmm1 subject to writemask k1. |
VCVTPD2UDQ xmm1 {k1}{z}, ymm2/m256/m64bcst | EVEX.256.0F.W1 79 /r | avx512 | Convert four packed double-precision floating-point values in ymm2/m256/m64bcst to four unsigned doubleword integers in xmm1 subject to writemask k1. |
VCVTPD2UDQ ymm1 {k1}{z}, zmm2/m512/m64bcst{er} | EVEX.512.0F.W1 79 /r | avx512 | Convert eight packed double-precision floating-point values in zmm2/m512/m64bcst to eight unsigned doubleword integers in ymm1 subject to writemask k1. |
VCVTPD2UQQ xmm1 {k1}{z}, xmm2/m128/m64bcst | EVEX.128.66.0F.W1 79 /r | avx512 | Convert two packed double-precision floating-point values from xmm2/mem to two packed unsigned quadword integers in xmm1 with writemask k1. |
VCVTPD2UQQ ymm1 {k1}{z}, ymm2/m256/m64bcst | EVEX.256.66.0F.W1 79 /r | avx512 | Convert four packed double-precision floating-point values from ymm2/mem to four packed unsigned quadword integers in ymm1 with writemask k1. |
VCVTPD2UQQ zmm1 {k1}{z}, zmm2/m512/m64bcst{er} | EVEX.512.66.0F.W1 79 /r | avx512 | Convert eight packed double-precision floating-point values from zmm2/mem to eight packed unsigned quadword integers in zmm1 with writemask k1. |
VCVTPH2PS xmm1, xmm2/m64 | VEX.128.66.0F38.W0 13 /r | f16c | Convert four packed half precision (16-bit) floating-point values in xmm2/m64 to packed single-precision floating-point values in xmm1. |
VCVTPH2PS ymm1, xmm2/m128 | VEX.256.66.0F38.W0 13 /r | f16c | Convert eight packed half precision (16-bit) floating-point values in xmm2/m128 to packed single-precision floating-point values in ymm1. |
VCVTPH2PS xmm1 {k1}{z}, xmm2/m64 | EVEX.128.66.0F38.W0 13 /r | avx512 | Convert four packed half precision (16-bit) floating-point values in xmm2/m64 to packed single-precision floating-point values in xmm1. |
VCVTPH2PS ymm1 {k1}{z}, xmm2/m128 | EVEX.256.66.0F38.W0 13 /r | avx512 | Convert eight packed half precision (16-bit) floating-point values in xmm2/m128 to packed single-precision floating-point values in ymm1. |
VCVTPH2PS zmm1 {k1}{z}, ymm2/m256 {sae} | EVEX.512.66.0F38.W0 13 /r | avx512 | Convert sixteen packed half precision (16-bit) floating-point values in ymm2/m256 to packed single-precision floating-point values in zmm1. |
VCVTPS2PH xmm1/m64, xmm2, imm8 | VEX.128.66.0F3A.W0 1D /r ib | f16c | Convert four packed single-precision floating-point values in xmm2 to packed half-precision (16-bit) floating-point values in xmm1/m64. Imm8 provides rounding controls. |
VCVTPS2PH xmm1/m128, ymm2, imm8 | VEX.256.66.0F3A.W0 1D /r ib | f16c | Convert eight packed single-precision floating-point values in ymm2 to packed half-precision (16-bit) floating-point values in xmm1/m128. Imm8 provides rounding controls. |
VCVTPS2PH xmm1/m64 {k1}{z}, xmm2, imm8 | EVEX.128.66.0F3A.W0 1D /r ib | avx512 | Convert four packed single-precision floating-point values in xmm2 to packed half-precision (16-bit) floating-point values in xmm1/m64. Imm8 provides rounding controls. |
VCVTPS2PH xmm1/m128 {k1}{z}, ymm2, imm8 | EVEX.256.66.0F3A.W0 1D /r ib | avx512 | Convert eight packed single-precision floating-point values in ymm2 to packed half-precision (16-bit) floating-point values in xmm1/m128. Imm8 provides rounding controls. |
VCVTPS2PH ymm1/m256 {k1}{z}, zmm2{sae}, imm8 | EVEX.512.66.0F3A.W0 1D /r ib | avx512 | Convert sixteen packed single-precision floating-point values in zmm2 to packed half-precision (16-bit) floating-point values in ymm1/m256. Imm8 provides rounding controls. |
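The two halves of the F16C pair make an easy round trip; the imm8 on VCVTPS2PH picks the rounding used when narrowing. A sketch assuming the `_mm256_cvtps_ph`/`_mm256_cvtph_ps` intrinsics, compiled with `-mf16c`:

```c
#include <immintrin.h>

/* Narrow eight floats to half precision and widen them back. */
void f16_roundtrip(float v[8]) {
    __m256 f = _mm256_loadu_ps(v);
    __m128i h = _mm256_cvtps_ph(f, _MM_FROUND_TO_NEAREST_INT |
                                   _MM_FROUND_NO_EXC);   /* VCVTPS2PH */
    _mm256_storeu_ps(v, _mm256_cvtph_ps(h));             /* VCVTPH2PS */
}
```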
VCVTPS2QQ xmm1 {k1}{z}, xmm2/m64/m32bcst | EVEX.128.66.0F.W0 7B /r | avx512 | Convert two packed single precision floating-point values from xmm2/m64/m32bcst to two packed signed quadword values in xmm1 subject to writemask k1. |
VCVTPS2QQ ymm1 {k1}{z}, xmm2/m128/m32bcst | EVEX.256.66.0F.W0 7B /r | avx512 | Convert four packed single precision floating-point values from xmm2/m128/m32bcst to four packed signed quadword values in ymm1 subject to writemask k1. |
VCVTPS2QQ zmm1 {k1}{z}, ymm2/m256/m32bcst{er} | EVEX.512.66.0F.W0 7B /r | avx512 | Convert eight packed single precision floating-point values from ymm2/m256/m32bcst to eight packed signed quadword values in zmm1 subject to writemask k1. |
VCVTPS2UDQ xmm1 {k1}{z}, xmm2/m128/m32bcst | EVEX.128.0F.W0 79 /r | avx512 | Convert four packed single precision floating-point values from xmm2/m128/m32bcst to four packed unsigned doubleword values in xmm1 subject to writemask k1. |
VCVTPS2UDQ ymm1 {k1}{z}, ymm2/m256/m32bcst | EVEX.256.0F.W0 79 /r | avx512 | Convert eight packed single precision floating-point values from ymm2/m256/m32bcst to eight packed unsigned doubleword values in ymm1 subject to writemask k1. |
VCVTPS2UDQ zmm1 {k1}{z}, zmm2/m512/m32bcst{er} | EVEX.512.0F.W0 79 /r | avx512 | Convert sixteen packed single-precision floating-point values from zmm2/m512/m32bcst to sixteen packed unsigned doubleword values in zmm1 subject to writemask k1. |
VCVTPS2UQQ xmm1 {k1}{z}, xmm2/m64/m32bcst | EVEX.128.66.0F.W0 79 /r | avx512 | Convert two packed single precision floating-point values from xmm2/m64/m32bcst to two packed unsigned quadword values in xmm1 subject to writemask k1. |
VCVTPS2UQQ ymm1 {k1}{z}, xmm2/m128/m32bcst | EVEX.256.66.0F.W0 79 /r | avx512 | Convert four packed single precision floating-point values from xmm2/m128/m32bcst to four packed unsigned quadword values in ymm1 subject to writemask k1. |
VCVTPS2UQQ zmm1 {k1}{z}, ymm2/m256/m32bcst{er} | EVEX.512.66.0F.W0 79 /r | avx512 | Convert eight packed single precision floating-point values from ymm2/m256/m32bcst to eight packed unsigned quadword values in zmm1 subject to writemask k1. |
VCVTQQ2PD xmm1 {k1}{z}, xmm2/m128/m64bcst | EVEX.128.F3.0F.W1 E6 /r | avx512 | Convert two packed quadword integers from xmm2/m128/m64bcst to packed double-precision floating-point values in xmm1 with writemask k1. |
VCVTQQ2PD ymm1 {k1}{z}, ymm2/m256/m64bcst | EVEX.256.F3.0F.W1 E6 /r | avx512 | Convert four packed quadword integers from ymm2/m256/m64bcst to packed double-precision floating-point values in ymm1 with writemask k1. |
VCVTQQ2PD zmm1 {k1}{z}, zmm2/m512/m64bcst{er} | EVEX.512.F3.0F.W1 E6 /r | avx512 | Convert eight packed quadword integers from zmm2/m512/m64bcst to eight packed double-precision floating-point values in zmm1 with writemask k1. |
VCVTQQ2PS xmm1 {k1}{z}, xmm2/m128/m64bcst | EVEX.128.0F.W1 5B /r | avx512 | Convert two packed quadword integers from xmm2/mem to packed single-precision floating-point values in xmm1 with writemask k1. |
VCVTQQ2PS xmm1 {k1}{z}, ymm2/m256/m64bcst | EVEX.256.0F.W1 5B /r | avx512 | Convert four packed quadword integers from ymm2/mem to packed single-precision floating-point values in xmm1 with writemask k1. |
VCVTQQ2PS ymm1 {k1}{z}, zmm2/m512/m64bcst{er} | EVEX.512.0F.W1 5B /r | avx512 | Convert eight packed quadword integers from zmm2/mem to eight packed single-precision floating-point values in ymm1 with writemask k1. |
VCVTSD2USI r32, xmm1/m64{er} | EVEX.LIG.F2.0F.W0 79 /r | avx512 | Convert one double-precision floating-point value from xmm1/m64 to one unsigned doubleword integer in r32. |
VCVTSD2USI r64, xmm1/m64{er} | EVEX.LIG.F2.0F.W1 79 /r | avx512 | Convert one double-precision floating-point value from xmm1/m64 to one unsigned quadword integer zero-extended into r64. |
VCVTSS2USI r32, xmm1/m32{er} | EVEX.LIG.F3.0F.W0 79 /r | avx512 | Convert one single-precision floating-point value from xmm1/m32 to one unsigned doubleword integer in r32. |
VCVTSS2USI r64, xmm1/m32{er} | EVEX.LIG.F3.0F.W1 79 /r | avx512 | Convert one single-precision floating-point value from xmm1/m32 to one unsigned quadword integer in r64. |
VCVTTPD2QQ xmm1 {k1}{z}, xmm2/m128/m64bcst | EVEX.128.66.0F.W1 7A /r | avx512 | Convert two packed double-precision floating-point values from xmm2/m128/m64bcst to two packed quadword integers in xmm1 using truncation with writemask k1. |
VCVTTPD2QQ ymm1 {k1}{z}, ymm2/m256/m64bcst | EVEX.256.66.0F.W1 7A /r | avx512 | Convert four packed double-precision floating-point values from ymm2/m256/m64bcst to four packed quadword integers in ymm1 using truncation with writemask k1. |
VCVTTPD2QQ zmm1 {k1}{z}, zmm2/m512/m64bcst{sae} | EVEX.512.66.0F.W1 7A /r | avx512 | Convert eight packed double-precision floating-point values from zmm2/m512/m64bcst to eight packed quadword integers in zmm1 using truncation with writemask k1. |
VCVTTPD2UDQ xmm1 {k1}{z}, xmm2/m128/m64bcst | EVEX.128.0F.W1 78 /r | avx512 | Convert two packed double-precision floating-point values in xmm2/m128/m64bcst to two unsigned doubleword integers in xmm1 using truncation subject to writemask k1. |
VCVTTPD2UDQ xmm1 {k1}{z}, ymm2/m256/m64bcst | EVEX.256.0F.W1 78 /r | avx512 | Convert four packed double-precision floating-point values in ymm2/m256/m64bcst to four unsigned doubleword integers in xmm1 using truncation subject to writemask k1. |
VCVTTPD2UDQ ymm1 {k1}{z}, zmm2/m512/m64bcst{sae} | EVEX.512.0F.W1 78 /r | avx512 | Convert eight packed double-precision floating-point values in zmm2/m512/m64bcst to eight unsigned doubleword integers in ymm1 using truncation subject to writemask k1. |
VCVTTPD2UQQ xmm1 {k1}{z}, xmm2/m128/m64bcst | EVEX.128.66.0F.W1 78 /r | avx512 | Convert two packed double-precision floating-point values from xmm2/m128/m64bcst to two packed unsigned quadword integers in xmm1 using truncation with writemask k1. |
VCVTTPD2UQQ ymm1 {k1}{z}, ymm2/m256/m64bcst | EVEX.256.66.0F.W1 78 /r | avx512 | Convert four packed double-precision floating-point values from ymm2/m256/m64bcst to four packed unsigned quadword integers in ymm1 using truncation with writemask k1. |
VCVTTPD2UQQ zmm1 {k1}{z}, zmm2/m512/m64bcst{sae} | EVEX.512.66.0F.W1 78 /r | avx512 | Convert eight packed double-precision floating-point values from zmm2/m512/m64bcst to eight packed unsigned quadword integers in zmm1 using truncation with writemask k1. |
VCVTTPS2QQ xmm1 {k1}{z}, xmm2/m64/m32bcst | EVEX.128.66.0F.W0 7A /r | avx512 | Convert two packed single precision floating-point values from xmm2/m64/m32bcst to two packed signed quadword values in xmm1 using truncation subject to writemask k1. |
VCVTTPS2QQ ymm1 {k1}{z}, xmm2/m128/m32bcst | EVEX.256.66.0F.W0 7A /r | avx512 | Convert four packed single precision floating-point values from xmm2/m128/m32bcst to four packed signed quadword values in ymm1 using truncation subject to writemask k1. |
VCVTTPS2QQ zmm1 {k1}{z}, ymm2/m256/m32bcst{sae} | EVEX.512.66.0F.W0 7A /r | avx512 | Convert eight packed single precision floating-point values from ymm2/m256/m32bcst to eight packed signed quadword values in zmm1 using truncation subject to writemask k1. |
VCVTTPS2UDQ xmm1 {k1}{z}, xmm2/m128/m32bcst | EVEX.128.0F.W0 78 /r | avx512 | Convert four packed single precision floating-point values from xmm2/m128/m32bcst to four packed unsigned doubleword values in xmm1 using truncation subject to writemask k1. |
VCVTTPS2UDQ ymm1 {k1}{z}, ymm2/m256/m32bcst | EVEX.256.0F.W0 78 /r | avx512 | Convert eight packed single precision floating-point values from ymm2/m256/m32bcst to eight packed unsigned doubleword values in ymm1 using truncation subject to writemask k1. |
VCVTTPS2UDQ zmm1 {k1}{z}, zmm2/m512/m32bcst{sae} | EVEX.512.0F.W0 78 /r | avx512 | Convert sixteen packed single-precision floating-point values from zmm2/m512/m32bcst to sixteen packed unsigned doubleword values in zmm1 using truncation subject to writemask k1. |
VCVTTPS2UQQ xmm1 {k1}{z}, xmm2/m64/m32bcst | EVEX.128.66.0F.W0 78 /r | avx512 | Convert two packed single precision floating-point values from xmm2/m64/m32bcst to two packed unsigned quadword values in xmm1 using truncation subject to writemask k1. |
VCVTTPS2UQQ ymm1 {k1}{z}, xmm2/m128/m32bcst | EVEX.256.66.0F.W0 78 /r | avx512 | Convert four packed single precision floating-point values from xmm2/m128/m32bcst to four packed unsigned quadword values in ymm1 using truncation subject to writemask k1. |
VCVTTPS2UQQ zmm1 {k1}{z}, ymm2/m256/m32bcst{sae} | EVEX.512.66.0F.W0 78 /r | avx512 | Convert eight packed single precision floating-point values from ymm2/m256/m32bcst to eight packed unsigned quadword values in zmm1 using truncation subject to writemask k1. |
VCVTTSD2USI r32, xmm1/m64{sae} | EVEX.LIG.F2.0F.W0 78 /r | avx512 | Convert one double-precision floating-point value from xmm1/m64 to one unsigned doubleword integer in r32 using truncation. |
VCVTTSD2USI r64, xmm1/m64{sae} | EVEX.LIG.F2.0F.W1 78 /r | avx512 | Convert one double-precision floating-point value from xmm1/m64 to one unsigned quadword integer zero-extended into r64 using truncation. |
VCVTTSS2USI r32, xmm1/m32{sae} | EVEX.LIG.F3.0F.W0 78 /r | avx512 | Convert one single-precision floating-point value from xmm1/m32 to one unsigned doubleword integer in r32 using truncation. |
VCVTTSS2USI r64, xmm1/m32{sae} | EVEX.LIG.F3.0F.W1 78 /r | avx512 | Convert one single-precision floating-point value from xmm1/m32 to one unsigned quadword integer in r64 using truncation. |
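The difference between the plain and the T-prefixed scalar forms is only the rounding step: VCVTSD2USI uses the current rounding mode (round-to-nearest by default), VCVTTSD2USI truncates toward zero. A sketch with the `_mm_cvtsd_u32`/`_mm_cvttsd_u32` intrinsics (AVX512F):

```c
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    __m128d x = _mm_set_sd(3.7);
    printf("%u %u\n",
           _mm_cvtsd_u32(x),    /* VCVTSD2USI  -> 4 (rounds)    */
           _mm_cvttsd_u32(x));  /* VCVTTSD2USI -> 3 (truncates) */
    return 0;
}
```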
VCVTUDQ2PD xmm1 {k1}{z}, xmm2/m64/m32bcst | EVEX.128.F3.0F.W0 7A /r | avx512 | Convert two packed unsigned doubleword integers from xmm2/m64/m32bcst to packed double-precision floating-point values in xmm1 with writemask k1. |
VCVTUDQ2PD ymm1 {k1}{z}, xmm2/m128/m32bcst | EVEX.256.F3.0F.W0 7A /r | avx512 | Convert four packed unsigned doubleword integers from xmm2/m128/m32bcst to packed double-precision floating-point values in ymm1 with writemask k1. |
VCVTUDQ2PD zmm1 {k1}{z}, ymm2/m256/m32bcst | EVEX.512.F3.0F.W0 7A /r | avx512 | Convert eight packed unsigned doubleword integers from ymm2/m256/m32bcst to eight packed double-precision floating-point values in zmm1 with writemask k1. |
VCVTUDQ2PS xmm1 {k1}{z}, xmm2/m128/m32bcst | EVEX.128.F2.0F.W0 7A /r | avx512 | Convert four packed unsigned doubleword integers from xmm2/m128/m32bcst to packed single-precision floating-point values in xmm1 with writemask k1. |
VCVTUDQ2PS ymm1 {k1}{z}, ymm2/m256/m32bcst | EVEX.256.F2.0F.W0 7A /r | avx512 | Convert eight packed unsigned doubleword integers from ymm2/m256/m32bcst to packed single-precision floating-point values in ymm1 with writemask k1. |
VCVTUDQ2PS zmm1 {k1}{z}, zmm2/m512/m32bcst{er} | EVEX.512.F2.0F.W0 7A /r | avx512 | Convert sixteen packed unsigned doubleword integers from zmm2/m512/m32bcst to sixteen packed single-precision floating-point values in zmm1 with writemask k1. |
VCVTUQQ2PD xmm1 {k1}{z}, xmm2/m128/m64bcst | EVEX.128.F3.0F.W1 7A /r | avx512 | Convert two packed unsigned quadword integers from xmm2/m128/m64bcst to two packed double-precision floating-point values in xmm1 with writemask k1. |
VCVTUQQ2PD ymm1 {k1}{z}, ymm2/m256/m64bcst | EVEX.256.F3.0F.W1 7A /r | avx512 | Convert four packed unsigned quadword integers from ymm2/m256/m64bcst to packed double-precision floating-point values in ymm1 with writemask k1. |
VCVTUQQ2PD zmm1 {k1}{z}, zmm2/m512/m64bcst{er} | EVEX.512.F3.0F.W1 7A /r | avx512 | Convert eight packed unsigned quadword integers from zmm2/m512/m64bcst to eight packed double-precision floating-point values in zmm1 with writemask k1. |
VCVTUQQ2PS xmm1 {k1}{z}, xmm2/m128/m64bcst | EVEX.128.F2.0F.W1 7A /r | avx512 | Convert two packed unsigned quadword integers from xmm2/m128/m64bcst to packed single-precision floating-point values in xmm1 with writemask k1. |
VCVTUQQ2PS xmm1 {k1}{z}, ymm2/m256/m64bcst | EVEX.256.F2.0F.W1 7A /r | avx512 | Convert four packed unsigned quadword integers from ymm2/m256/m64bcst to packed single-precision floating-point values in xmm1 with writemask k1. |
VCVTUQQ2PS ymm1 {k1}{z}, zmm2/m512/m64bcst{er} | EVEX.512.F2.0F.W1 7A /r | avx512 | Convert eight packed unsigned quadword integers from zmm2/m512/m64bcst to eight packed single-precision floating-point values in zmm1 with writemask k1. |
VCVTUSI2SD xmm1, xmm2, r/m32 | EVEX.NDS.LIG.F2.0F.W0 7B /r | avx512 | Convert one unsigned doubleword integer from r/m32 to one double-precision floating-point value in xmm1. |
VCVTUSI2SD xmm1, xmm2, r/m64{er} | EVEX.NDS.LIG.F2.0F.W1 7B /r | avx512 | Convert one unsigned quadword integer from r/m64 to one double-precision floating-point value in xmm1. |
VCVTUSI2SS xmm1, xmm2, r/m32{er} | EVEX.NDS.LIG.F3.0F.W0 7B /r | avx512 | Convert one unsigned doubleword integer from r/m32 to one single-precision floating-point value in xmm1. |
VCVTUSI2SS xmm1, xmm2, r/m64{er} | EVEX.NDS.LIG.F3.0F.W1 7B /r | avx512 | Convert one unsigned quadword integer from r/m64 to one single-precision floating-point value in xmm1. |
VDBPSADBW xmm1 {k1}{z}, xmm2, xmm3/m128, imm8 | EVEX.NDS.128.66.0F3A.W0 42 /r ib | avx512 | Compute packed SAD word results of unsigned bytes in dword block from xmm2 with unsigned bytes of dword blocks transformed from xmm3/m128 using the shuffle controls in imm8. Results are written to xmm1 under the writemask k1. |
VDBPSADBW ymm1 {k1}{z}, ymm2, ymm3/m256, imm8 | EVEX.NDS.256.66.0F3A.W0 42 /r ib | avx512 | Compute packed SAD word results of unsigned bytes in dword block from ymm2 with unsigned bytes of dword blocks transformed from ymm3/m256 using the shuffle controls in imm8. Results are written to ymm1 under the writemask k1. |
VDBPSADBW zmm1 {k1}{z}, zmm2, zmm3/m512, imm8 | EVEX.NDS.512.66.0F3A.W0 42 /r ib | avx512 | Compute packed SAD word results of unsigned bytes in dword block from zmm2 with unsigned bytes of dword blocks transformed from zmm3/m512 using the shuffle controls in imm8. Results are written to zmm1 under the writemask k1. |
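VDBPSADBW is easier to read from C: imm8 first shuffles the 32-bit blocks of the second source, then byte-wise sums of absolute differences are accumulated into word results (a motion-estimation building block). A minimal sketch assuming the `_mm512_dbsad_epu8` intrinsic (AVX512BW):

```c
#include <immintrin.h>

/* Double-block packed SAD; 0xE4 is the identity shuffle for the
   dword blocks of b, so this is the "unshuffled" case. */
__m512i dbsad_identity(__m512i a, __m512i b) {
    return _mm512_dbsad_epu8(a, b, 0xE4);   /* VDBPSADBW */
}
```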
VERR r/m16 | 0F 00 /4 | Set ZF=1 if segment specified with r/m16 can be read. | |
VERW r/m16 | 0F 00 /5 | Set ZF=1 if segment specified with r/m16 can be written. | |
VEXP2PD zmm1 {k1}{z}, zmm2/m512/m64bcst {sae} | EVEX.512.66.0F38.W1 C8 /r | avx512 | Computes approximations to the exponential 2^x (with less than 2^-23 of maximum relative error) of the packed double-precision floating-point values from zmm2/m512/m64bcst and stores the floating-point result in zmm1 with writemask k1. |
VEXP2PS zmm1 {k1}{z}, zmm2/m512/m32bcst {sae} | EVEX.512.66.0F38.W0 C8 /r | avx512 | Computes approximations to the exponential 2^x (with less than 2^-23 of maximum relative error) of the packed single-precision floating-point values from zmm2/m512/m32bcst and stores the floating-point result in zmm1 with writemask k1. |
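If my reading of the intrinsics guide is right, these AVX512ER approximations map to the `_mm512_exp2a23_ps`/`_mm512_exp2a23_pd` intrinsics (Knights Landing class hardware only):

```c
#include <immintrin.h>

/* Approximate 2^x per lane, < 2^-23 maximum relative error. */
__m512 exp2_approx(__m512 x) {
    return _mm512_exp2a23_ps(x);   /* VEXP2PS */
}
```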
VEXPANDPD xmm1 {k1}{z}, xmm2/m128 | EVEX.128.66.0F38.W1 88 /r | avx512 | Expand packed double-precision floating-point values from xmm2/m128 to xmm1 using writemask k1. |
VEXPANDPD ymm1 {k1}{z}, ymm2/m256 | EVEX.256.66.0F38.W1 88 /r | avx512 | Expand packed double-precision floating-point values from ymm2/m256 to ymm1 using writemask k1. |
VEXPANDPD zmm1 {k1}{z}, zmm2/m512 | EVEX.512.66.0F38.W1 88 /r | avx512 | Expand packed double-precision floating-point values from zmm2/m512 to zmm1 using writemask k1. |
VEXPANDPS xmm1 {k1}{z}, xmm2/m128 | EVEX.128.66.0F38.W0 88 /r | avx512 | Expand packed single-precision floating-point values from xmm2/m128 to xmm1 using writemask k1. |
VEXPANDPS ymm1 {k1}{z}, ymm2/m256 | EVEX.256.66.0F38.W0 88 /r | avx512 | Expand packed single-precision floating-point values from ymm2/m256 to ymm1 using writemask k1. |
VEXPANDPS zmm1 {k1}{z}, zmm2/m512 | EVEX.512.66.0F38.W0 88 /r | avx512 | Expand packed single-precision floating-point values from zmm2/m512 to zmm1 using writemask k1. |
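Expand is the inverse of the compress example earlier: source lanes are consumed in order from the low end and scattered to the positions where the mask has a 1 bit. A sketch assuming `_mm512_maskz_expand_ps`:

```c
#include <immintrin.h>

/* Undo a compress: packed lanes go back to the k-selected slots,
   masked-off slots become zero in the {z} form. */
__m512 expand_ps(__mmask16 k, __m512 packed) {
    return _mm512_maskz_expand_ps(k, packed);   /* VEXPANDPS */
}
```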
VEXTRACTF128 xmm1/m128, ymm2, imm8 | VEX.256.66.0F3A.W0 19 /r ib | avx | Extract 128 bits of packed floating-point values from ymm2 and store results in xmm1/m128. |
VEXTRACTF32x4 xmm1/m128 {k1}{z}, ymm2, imm8 | EVEX.256.66.0F3A.W0 19 /r ib | avx512 | Extract 128 bits of packed single-precision floating-point values from ymm2 and store results in xmm1/m128 subject to writemask k1. |
VEXTRACTF32x4 xmm1/m128 {k1}{z}, zmm2, imm8 | EVEX.512.66.0F3A.W0 19 /r ib | avx512 | Extract 128 bits of packed single-precision floating-point values from zmm2 and store results in xmm1/m128 subject to writemask k1. |
VEXTRACTF64x2 xmm1/m128 {k1}{z}, ymm2, imm8 | EVEX.256.66.0F3A.W1 19 /r ib | avx512 | Extract 128 bits of packed double-precision floating-point values from ymm2 and store results in xmm1/m128 subject to writemask k1. |
VEXTRACTF64x2 xmm1/m128 {k1}{z}, zmm2, imm8 | EVEX.512.66.0F3A.W1 19 /r ib | avx512 | Extract 128 bits of packed double-precision floating-point values from zmm2 and store results in xmm1/m128 subject to writemask k1. |
VEXTRACTF32x8 ymm1/m256 {k1}{z}, zmm2, imm8 | EVEX.512.66.0F3A.W0 1B /r ib | avx512 | Extract 256 bits of packed single-precision floating-point values from zmm2 and store results in ymm1/m256 subject to writemask k1. |
VEXTRACTF64x4 ymm1/m256 {k1}{z}, zmm2, imm8 | EVEX.512.66.0F3A.W1 1B /r ib | avx512 | Extract 256 bits of packed double-precision floating-point values from zmm2 and store results in ymm1/m256 subject to writemask k1. |
VEXTRACTI128 xmm1/m128, ymm2, imm8 | VEX.256.66.0F3A.W0 39 /r ib | avx2 | Extract 128 bits of integer data from ymm2 and store results in xmm1/m128. |
VEXTRACTI32x4 xmm1/m128 {k1}{z}, ymm2, imm8 | EVEX.256.66.0F3A.W0 39 /r ib | avx512 | Extract 128 bits of double-word integer values from ymm2 and store results in xmm1/m128 subject to writemask k1. |
VEXTRACTI32x4 xmm1/m128 {k1}{z}, zmm2, imm8 | EVEX.512.66.0F3A.W0 39 /r ib | avx512 | Extract 128 bits of double-word integer values from zmm2 and store results in xmm1/m128 subject to writemask k1. |
VEXTRACTI64x2 xmm1/m128 {k1}{z}, ymm2, imm8 | EVEX.256.66.0F3A.W1 39 /r ib | avx512 | Extract 128 bits of quad-word integer values from ymm2 and store results in xmm1/m128 subject to writemask k1. |
VEXTRACTI64x2 xmm1/m128 {k1}{z}, zmm2, imm8 | EVEX.512.66.0F3A.W1 39 /r ib | avx512 | Extract 128 bits of quad-word integer values from zmm2 and store results in xmm1/m128 subject to writemask k1. |
VEXTRACTI32x8 ymm1/m256 {k1}{z}, zmm2, imm8 | EVEX.512.66.0F3A.W0 3B /r ib | avx512 | Extract 256 bits of double-word integer values from zmm2 and store results in ymm1/m256 subject to writemask k1. |
VEXTRACTI64x4 ymm1/m256 {k1}{z}, zmm2, imm8 | EVEX.512.66.0F3A.W1 3B /r ib | avx512 | Extract 256 bits of quad-word integer values from zmm2 and store results in ymm1/m256 subject to writemask k1. |
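For all the extract forms, imm8 simply selects which 128-bit (or 256-bit) lane to pull out of the wider register. A sketch assuming the `_mm512_extractf32x4_ps` and `_mm256_extracti128_si256` intrinsic names:

```c
#include <immintrin.h>

/* Pull the highest 128-bit float lane out of a zmm register. */
__m128 top_lane_f32(__m512 z) {
    return _mm512_extractf32x4_ps(z, 3);     /* VEXTRACTF32x4 */
}

/* Pull the upper 128-bit integer half out of a ymm register. */
__m128i high_half_i(__m256i y) {
    return _mm256_extracti128_si256(y, 1);   /* VEXTRACTI128 */
}
```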
VFIXUPIMMPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst, imm8 | EVEX.NDS.128.66.0F3A.W1 54 /r ib | avx512 | Fix up special numbers in float64 vector xmm1, float64 vector xmm2 and int64 vector xmm3/m128/m64bcst and store the result in xmm1, under writemask. |
VFIXUPIMMPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst, imm8 | EVEX.NDS.256.66.0F3A.W1 54 /r ib | avx512 | Fix up special numbers in float64 vector ymm1, float64 vector ymm2 and int64 vector ymm3/m256/m64bcst and store the result in ymm1, under writemask. |
VFIXUPIMMPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{sae}, imm8 | EVEX.NDS.512.66.0F3A.W1 54 /r ib | avx512 | Fix up elements of float64 vector in zmm2 using int64 vector table in zmm3/m512/m64bcst, combine with preserved elements from zmm1, and store the result in zmm1. |
VFIXUPIMMPS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst, imm8 | EVEX.NDS.128.66.0F3A.W0 54 /r ib | avx512 | Fix up special numbers in float32 vector xmm1, float32 vector xmm2 and int32 vector xmm3/m128/m32bcst and store the result in xmm1, under writemask. |
VFIXUPIMMPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst, imm8 | EVEX.NDS.256.66.0F3A.W0 54 /r ib | avx512 | Fix up special numbers in float32 vector ymm1, float32 vector ymm2 and int32 vector ymm3/m256/m32bcst and store the result in ymm1, under writemask. |
VFIXUPIMMPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{sae}, imm8 | EVEX.NDS.512.66.0F3A.W0 54 /r ib | avx512 | Fix up elements of float32 vector in zmm2 using int32 vector table in zmm3/m512/m32bcst, combine with preserved elements from zmm1, and store the result in zmm1. |
VFIXUPIMMSD xmm1 {k1}{z}, xmm2, xmm3/m64{sae}, imm8 | EVEX.NDS.LIG.66.0F3A.W1 55 /r ib | avx512 | Fix up a float64 number in the low quadword element of xmm2 using scalar int32 table in xmm3/m64 and store the result in xmm1. |
VFIXUPIMMSS xmm1 {k1}{z}, xmm2, xmm3/m32{sae}, imm8 | EVEX.NDS.LIG.66.0F3A.W0 55 /r ib | avx512 | Fix up a float32 number in the low doubleword element in xmm2 using scalar int32 table in xmm3/m32 and store the result in xmm1. |
VFMADD132PD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W1 98 /r | fma | Multiply packed double-precision floating-point values from xmm1 and xmm3/mem, add to xmm2 and put result in xmm1. |
VFMADD213PD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W1 A8 /r | fma | Multiply packed double-precision floating-point values from xmm1 and xmm2, add to xmm3/mem and put result in xmm1. |
VFMADD231PD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W1 B8 /r | fma | Multiply packed double-precision floating-point values from xmm2 and xmm3/mem, add to xmm1 and put result in xmm1. |
VFMADD132PD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W1 98 /r | fma | Multiply packed double-precision floating-point values from ymm1 and ymm3/mem, add to ymm2 and put result in ymm1. |
VFMADD213PD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W1 A8 /r | fma | Multiply packed double-precision floating-point values from ymm1 and ymm2, add to ymm3/mem and put result in ymm1. |
VFMADD231PD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W1 B8 /r | fma | Multiply packed double-precision floating-point values from ymm2 and ymm3/mem, add to ymm1 and put result in ymm1. |
VFMADD132PD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 98 /r | avx512 | Multiply packed double-precision floating-point values from xmm1 and xmm3/m128/m64bcst, add to xmm2 and put result in xmm1. |
VFMADD213PD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 A8 /r | avx512 | Multiply packed double-precision floating-point values from xmm1 and xmm2, add to xmm3/m128/m64bcst and put result in xmm1. |
VFMADD231PD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 B8 /r | avx512 | Multiply packed double-precision floating-point values from xmm2 and xmm3/m128/m64bcst, add to xmm1 and put result in xmm1. |
VFMADD132PD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 98 /r | avx512 | Multiply packed double-precision floating-point values from ymm1 and ymm3/m256/m64bcst, add to ymm2 and put result in ymm1. |
VFMADD213PD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 A8 /r | avx512 | Multiply packed double-precision floating-point values from ymm1 and ymm2, add to ymm3/m256/m64bcst and put result in ymm1. |
VFMADD231PD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 B8 /r | avx512 | Multiply packed double-precision floating-point values from ymm2 and ymm3/m256/m64bcst, add to ymm1 and put result in ymm1. |
VFMADD132PD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.NDS.512.66.0F38.W1 98 /r | avx512 | Multiply packed double-precision floating-point values from zmm1 and zmm3/m512/m64bcst, add to zmm2 and put result in zmm1. |
VFMADD213PD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.NDS.512.66.0F38.W1 A8 /r | avx512 | Multiply packed double-precision floating-point values from zmm1 and zmm2, add to zmm3/m512/m64bcst and put result in zmm1. |
VFMADD231PD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.NDS.512.66.0F38.W1 B8 /r | avx512 | Multiply packed double-precision floating-point values from zmm2 and zmm3/m512/m64bcst, add to zmm1 and put result in zmm1. |
VFMADD132PS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W0 98 /r | fma | Multiply packed single-precision floating-point values from xmm1 and xmm3/mem, add to xmm2 and put result in xmm1. |
VFMADD213PS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W0 A8 /r | fma | Multiply packed single-precision floating-point values from xmm1 and xmm2, add to xmm3/mem and put result in xmm1. |
VFMADD231PS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W0 B8 /r | fma | Multiply packed single-precision floating-point values from xmm2 and xmm3/mem, add to xmm1 and put result in xmm1. |
VFMADD132PS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W0 98 /r | fma | Multiply packed single-precision floating-point values from ymm1 and ymm3/mem, add to ymm2 and put result in ymm1. |
VFMADD213PS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W0 A8 /r | fma | Multiply packed single-precision floating-point values from ymm1 and ymm2, add to ymm3/mem and put result in ymm1. |
VFMADD231PS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W0 B8 /r | fma | Multiply packed single-precision floating-point values from ymm2 and ymm3/mem, add to ymm1 and put result in ymm1. |
VFMADD132PS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 98 /r | avx512 | Multiply packed single-precision floating-point values from xmm1 and xmm3/m128/m32bcst, add to xmm2 and put result in xmm1. |
VFMADD213PS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 A8 /r | avx512 | Multiply packed single-precision floating-point values from xmm1 and xmm2, add to xmm3/m128/m32bcst and put result in xmm1. |
VFMADD231PS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 B8 /r | avx512 | Multiply packed single-precision floating-point values from xmm2 and xmm3/m128/m32bcst, add to xmm1 and put result in xmm1. |
VFMADD132PS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 98 /r | avx512 | Multiply packed single-precision floating-point values from ymm1 and ymm3/m256/m32bcst, add to ymm2 and put result in ymm1. |
VFMADD213PS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 A8 /r | avx512 | Multiply packed single-precision floating-point values from ymm1 and ymm2, add to ymm3/m256/m32bcst and put result in ymm1. |
VFMADD231PS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 B8 /r | avx512 | Multiply packed single-precision floating-point values from ymm2 and ymm3/m256/m32bcst, add to ymm1 and put result in ymm1. |
VFMADD132PS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.NDS.512.66.0F38.W0 98 /r | avx512 | Multiply packed single-precision floating-point values from zmm1 and zmm3/m512/m32bcst, add to zmm2 and put result in zmm1. |
VFMADD213PS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.NDS.512.66.0F38.W0 A8 /r | avx512 | Multiply packed single-precision floating-point values from zmm1 and zmm2, add to zmm3/m512/m32bcst and put result in zmm1. |
VFMADD231PS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.NDS.512.66.0F38.W0 B8 /r | avx512 | Multiply packed single-precision floating-point values from zmm2 and zmm3/m512/m32bcst, add to zmm1 and put result in zmm1. |
VFMADD132SD xmm1, xmm2, xmm3/m64 | VEX.DDS.LIG.66.0F38.W1 99 /r | fma | Multiply scalar double-precision floating-point value from xmm1 and xmm3/m64, add to xmm2 and put result in xmm1. |
VFMADD213SD xmm1, xmm2, xmm3/m64 | VEX.DDS.LIG.66.0F38.W1 A9 /r | fma | Multiply scalar double-precision floating-point value from xmm1 and xmm2, add to xmm3/m64 and put result in xmm1. |
VFMADD231SD xmm1, xmm2, xmm3/m64 | VEX.DDS.LIG.66.0F38.W1 B9 /r | fma | Multiply scalar double-precision floating-point value from xmm2 and xmm3/m64, add to xmm1 and put result in xmm1. |
VFMADD132SD xmm1 {k1}{z}, xmm2, xmm3/m64{er} | EVEX.DDS.LIG.66.0F38.W1 99 /r | avx512 | Multiply scalar double-precision floating-point value from xmm1 and xmm3/m64, add to xmm2 and put result in xmm1. |
VFMADD213SD xmm1 {k1}{z}, xmm2, xmm3/m64{er} | EVEX.DDS.LIG.66.0F38.W1 A9 /r | avx512 | Multiply scalar double-precision floating-point value from xmm1 and xmm2, add to xmm3/m64 and put result in xmm1. |
VFMADD231SD xmm1 {k1}{z}, xmm2, xmm3/m64{er} | EVEX.DDS.LIG.66.0F38.W1 B9 /r | avx512 | Multiply scalar double-precision floating-point value from xmm2 and xmm3/m64, add to xmm1 and put result in xmm1. |
VFMADD132SS xmm1, xmm2, xmm3/m32 | VEX.DDS.LIG.66.0F38.W0 99 /r | fma | Multiply scalar single-precision floating-point value from xmm1 and xmm3/m32, add to xmm2 and put result in xmm1. |
VFMADD213SS xmm1, xmm2, xmm3/m32 | VEX.DDS.LIG.66.0F38.W0 A9 /r | fma | Multiply scalar single-precision floating-point value from xmm1 and xmm2, add to xmm3/m32 and put result in xmm1. |
VFMADD231SS xmm1, xmm2, xmm3/m32 | VEX.DDS.LIG.66.0F38.W0 B9 /r | fma | Multiply scalar single-precision floating-point value from xmm2 and xmm3/m32, add to xmm1 and put result in xmm1. |
VFMADD132SS xmm1 {k1}{z}, xmm2, xmm3/m32{er} | EVEX.DDS.LIG.66.0F38.W0 99 /r | avx512 | Multiply scalar single-precision floating-point value from xmm1 and xmm3/m32, add to xmm2 and put result in xmm1. |
VFMADD213SS xmm1 {k1}{z}, xmm2, xmm3/m32{er} | EVEX.DDS.LIG.66.0F38.W0 A9 /r | avx512 | Multiply scalar single-precision floating-point value from xmm1 and xmm2, add to xmm3/m32 and put result in xmm1. |
VFMADD231SS xmm1 {k1}{z}, xmm2, xmm3/m32{er} | EVEX.DDS.LIG.66.0F38.W0 B9 /r | avx512 | Multiply scalar single-precision floating-point value from xmm2 and xmm3/m32, add to xmm1 and put result in xmm1. |
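The 132/213/231 suffixes only encode the operand order: multiply the operands named by the first two digits, add the one named by the third, and the result always lands in operand 1 (so 132 is op1*op3+op2, 213 is op2*op1+op3, 231 is op2*op3+op1). All three compute the same fused a*b+c with a single rounding, which is why compilers expose one intrinsic and pick whichever encoding saves a register move. A sketch assuming `_mm256_fmadd_pd` and `_mm_fmadd_sd` (FMA3):

```c
#include <immintrin.h>

/* Fused a*b + c; the compiler chooses VFMADD{132,213,231}PD. */
__m256d fma_pd(__m256d a, __m256d b, __m256d c) {
    return _mm256_fmadd_pd(a, b, c);
}

/* Scalar low-lane form (VFMADD...SD). */
__m128d fma_sd(__m128d a, __m128d b, __m128d c) {
    return _mm_fmadd_sd(a, b, c);
}
```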
VFMADDSUB132PD xmm1, xmm2, xmm3/m128 | VEX.DDS.128.66.0F38.W1 96 /r | fma | Multiply packed double-precision floating-point values from xmm1 and xmm3/mem, add/subtract elements in xmm2 and put result in xmm1. |
VFMADDSUB213PD xmm1, xmm2, xmm3/m128 | VEX.DDS.128.66.0F38.W1 A6 /r | fma | Multiply packed double-precision floating-point values from xmm1 and xmm2, add/subtract elements in xmm3/mem and put result in xmm1. |
VFMADDSUB231PD xmm1, xmm2, xmm3/m128 | VEX.DDS.128.66.0F38.W1 B6 /r | fma | Multiply packed double-precision floating-point values from xmm2 and xmm3/mem, add/subtract elements in xmm1 and put result in xmm1. |
VFMADDSUB132PD ymm1, ymm2, ymm3/m256 | VEX.DDS.256.66.0F38.W1 96 /r | fma | Multiply packed double-precision floating-point values from ymm1 and ymm3/mem, add/subtract elements in ymm2 and put result in ymm1. |
VFMADDSUB213PD ymm1, ymm2, ymm3/m256 | VEX.DDS.256.66.0F38.W1 A6 /r | fma | Multiply packed double-precision floating-point values from ymm1 and ymm2, add/subtract elements in ymm3/mem and put result in ymm1. |
VFMADDSUB231PD ymm1, ymm2, ymm3/m256 | VEX.DDS.256.66.0F38.W1 B6 /r | fma | Multiply packed double-precision floating-point values from ymm2 and ymm3/mem, add/subtract elements in ymm1 and put result in ymm1. |
VFMADDSUB213PD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.DDS.128.66.0F38.W1 A6 /r | avx512 | Multiply packed double-precision floating-point values from xmm1 and xmm2, add/subtract elements in xmm3/m128/m64bcst and put result in xmm1 subject to writemask k1. |
VFMADDSUB231PD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.DDS.128.66.0F38.W1 B6 /r | avx512 | Multiply packed double-precision floating-point values from xmm2 and xmm3/m128/m64bcst, add/subtract elements in xmm1 and put result in xmm1 subject to writemask k1. |
VFMADDSUB132PD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.DDS.128.66.0F38.W1 96 /r | avx512 | Multiply packed double-precision floating-point values from xmm1 and xmm3/m128/m64bcst, add/subtract elements in xmm2 and put result in xmm1 subject to writemask k1. |
VFMADDSUB213PD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.DDS.256.66.0F38.W1 A6 /r | avx512 | Multiply packed double-precision floating-point values from ymm1 and ymm2, add/subtract elements in ymm3/m256/m64bcst and put result in ymm1 subject to writemask k1. |
VFMADDSUB231PD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.DDS.256.66.0F38.W1 B6 /r | avx512 | Multiply packed double-precision floating-point values from ymm2 and ymm3/m256/m64bcst, add/subtract elements in ymm1 and put result in ymm1 subject to writemask k1. |
VFMADDSUB132PD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.DDS.256.66.0F38.W1 96 /r | avx512 | Multiply packed double-precision floating-point values from ymm1 and ymm3/m256/m64bcst, add/subtract elements in ymm2 and put result in ymm1 subject to writemask k1. |
VFMADDSUB213PD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.DDS.512.66.0F38.W1 A6 /r | avx512 | Multiply packed double-precision floating-point values from zmm1 and zmm2, add/subtract elements in zmm3/m512/m64bcst and put result in zmm1 subject to writemask k1. |
VFMADDSUB231PD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.DDS.512.66.0F38.W1 B6 /r | avx512 | Multiply packed double-precision floating-point values from zmm2 and zmm3/m512/m64bcst, add/subtract elements in zmm1 and put result in zmm1 subject to writemask k1. |
VFMADDSUB132PD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.DDS.512.66.0F38.W1 96 /r | avx512 | Multiply packed double-precision floating-point values from zmm1 and zmm3/m512/m64bcst, add/subtract elements in zmm2 and put result in zmm1 subject to writemask k1. |
VFMADDSUB132PS xmm1, xmm2, xmm3/m128 | VEX.DDS.128.66.0F38.W0 96 /r | fma | Multiply packed single-precision floating-point values from xmm1 and xmm3/mem, add/subtract elements in xmm2 and put result in xmm1. |
VFMADDSUB213PS xmm1, xmm2, xmm3/m128 | VEX.DDS.128.66.0F38.W0 A6 /r | fma | Multiply packed single-precision floating-point values from xmm1 and xmm2, add/subtract elements in xmm3/mem and put result in xmm1. |
VFMADDSUB231PS xmm1, xmm2, xmm3/m128 | VEX.DDS.128.66.0F38.W0 B6 /r | fma | Multiply packed single-precision floating-point values from xmm2 and xmm3/mem, add/subtract elements in xmm1 and put result in xmm1. |
VFMADDSUB132PS ymm1, ymm2, ymm3/m256 | VEX.DDS.256.66.0F38.W0 96 /r | fma | Multiply packed single-precision floating-point values from ymm1 and ymm3/mem, add/subtract elements in ymm2 and put result in ymm1. |
VFMADDSUB213PS ymm1, ymm2, ymm3/m256 | VEX.DDS.256.66.0F38.W0 A6 /r | fma | Multiply packed single-precision floating-point values from ymm1 and ymm2, add/subtract elements in ymm3/mem and put result in ymm1. |
VFMADDSUB231PS ymm1, ymm2, ymm3/m256 | VEX.DDS.256.66.0F38.W0 B6 /r | fma | Multiply packed single-precision floating-point values from ymm2 and ymm3/mem, add/subtract elements in ymm1 and put result in ymm1. |
VFMADDSUB213PS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.DDS.128.66.0F38.W0 A6 /r | avx512 | Multiply packed single-precision floating-point values from xmm1 and xmm2, add/subtract elements in xmm3/m128/m32bcst and put result in xmm1 subject to writemask k1. |
VFMADDSUB231PS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.DDS.128.66.0F38.W0 B6 /r | avx512 | Multiply packed single-precision floating-point values from xmm2 and xmm3/m128/m32bcst, add/subtract elements in xmm1 and put result in xmm1 subject to writemask k1. |
VFMADDSUB132PS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.DDS.128.66.0F38.W0 96 /r | avx512 | Multiply packed single-precision floating-point values from xmm1 and xmm3/m128/m32bcst, add/subtract elements in xmm2 and put result in xmm1 subject to writemask k1. |
VFMADDSUB213PS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.DDS.256.66.0F38.W0 A6 /r | avx512 | Multiply packed single-precision floating-point values from ymm1 and ymm2, add/subtract elements in ymm3/m256/m32bcst and put result in ymm1 subject to writemask k1. |
VFMADDSUB231PS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.DDS.256.66.0F38.W0 B6 /r | avx512 | Multiply packed single-precision floating-point values from ymm2 and ymm3/m256/m32bcst, add/subtract elements in ymm1 and put result in ymm1 subject to writemask k1. |
VFMADDSUB132PS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.DDS.256.66.0F38.W0 96 /r | avx512 | Multiply packed single-precision floating-point values from ymm1 and ymm3/m256/m32bcst, add/subtract elements in ymm2 and put result in ymm1 subject to writemask k1. |
VFMADDSUB213PS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.DDS.512.66.0F38.W0 A6 /r | avx512 | Multiply packed single-precision floating-point values from zmm1 and zmm2, add/subtract elements in zmm3/m512/m32bcst and put result in zmm1 subject to writemask k1. |
VFMADDSUB231PS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.DDS.512.66.0F38.W0 B6 /r | avx512 | Multiply packed single-precision floating-point values from zmm2 and zmm3/m512/m32bcst, add/subtract elements in zmm1 and put result in zmm1 subject to writemask k1. |
VFMADDSUB132PS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.DDS.512.66.0F38.W0 96 /r | avx512 | Multiply packed single-precision floating-point values from zmm1 and zmm3/m512/m32bcst, add/subtract elements in zmm2 and put result in zmm1 subject to writemask k1. |
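The ADDSUB forms interleave the two operations: even-indexed lanes get a*b-c, odd-indexed lanes get a*b+c, which is exactly the pattern complex-multiplication kernels need. A sketch assuming `_mm256_fmaddsub_pd`:

```c
#include <immintrin.h>

/* a*b - c in even lanes, a*b + c in odd lanes, fused. */
__m256d fmaddsub_pd(__m256d a, __m256d b, __m256d c) {
    return _mm256_fmaddsub_pd(a, b, c);  /* VFMADDSUB{132,213,231}PD */
}
```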
VFMSUB132PD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W1 9A /r | fma | Multiply packed double-precision floating-point values from xmm1 and xmm3/mem, subtract xmm2 and put result in xmm1. |
VFMSUB213PD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W1 AA /r | fma | Multiply packed double-precision floating-point values from xmm1 and xmm2, subtract xmm3/mem and put result in xmm1. |
VFMSUB231PD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W1 BA /r | fma | Multiply packed double-precision floating-point values from xmm2 and xmm3/mem, subtract xmm1 and put result in xmm1. |
VFMSUB132PD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W1 9A /r | fma | Multiply packed double-precision floating-point values from ymm1 and ymm3/mem, subtract ymm2 and put result in ymm1. |
VFMSUB213PD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W1 AA /r | fma | Multiply packed double-precision floating-point values from ymm1 and ymm2, subtract ymm3/mem and put result in ymm1. |
VFMSUB231PD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W1 BA /r | fma | Multiply packed double-precision floating-point values from ymm2 and ymm3/mem, subtract ymm1 and put result in ymm1. |
VFMSUB132PD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 9A /r | avx512 | Multiply packed double-precision floating-point values from xmm1 and xmm3/m128/m64bcst, subtract xmm2 and put result in xmm1 subject to writemask k1. |
VFMSUB213PD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 AA /r | avx512 | Multiply packed double-precision floating-point values from xmm1 and xmm2, subtract xmm3/m128/m64bcst and put result in xmm1 subject to writemask k1. |
VFMSUB231PD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 BA /r | avx512 | Multiply packed double-precision floating-point values from xmm2 and xmm3/m128/m64bcst, subtract xmm1 and put result in xmm1 subject to writemask k1. |
VFMSUB132PD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 9A /r | avx512 | Multiply packed double-precision floating-point values from ymm1 and ymm3/m256/m64bcst, subtract ymm2 and put result in ymm1 subject to writemask k1. |
VFMSUB213PD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 AA /r | avx512 | Multiply packed double-precision floating-point values from ymm1 and ymm2, subtract ymm3/m256/m64bcst and put result in ymm1 subject to writemask k1. |
VFMSUB231PD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 BA /r | avx512 | Multiply packed double-precision floating-point values from ymm2 and ymm3/m256/m64bcst, subtract ymm1 and put result in ymm1 subject to writemask k1. |
VFMSUB132PD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.NDS.512.66.0F38.W1 9A /r | avx512 | Multiply packed double-precision floating-point values from zmm1 and zmm3/m512/m64bcst, subtract zmm2 and put result in zmm1 subject to writemask k1. |
VFMSUB213PD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.NDS.512.66.0F38.W1 AA /r | avx512 | Multiply packed double-precision floating-point values from zmm1 and zmm2, subtract zmm3/m512/m64bcst and put result in zmm1 subject to writemask k1. |
VFMSUB231PD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.NDS.512.66.0F38.W1 BA /r | avx512 | Multiply packed double-precision floating-point values from zmm2 and zmm3/m512/m64bcst, subtract zmm1 and put result in zmm1 subject to writemask k1. |
VFMSUB132PS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W0 9A /r | fma | Multiply packed single-precision floating-point values from xmm1 and xmm3/mem, subtract xmm2 and put result in xmm1. |
VFMSUB213PS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W0 AA /r | fma | Multiply packed single-precision floating-point values from xmm1 and xmm2, subtract xmm3/mem and put result in xmm1. |
VFMSUB231PS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W0 BA /r | fma | Multiply packed single-precision floating-point values from xmm2 and xmm3/mem, subtract xmm1 and put result in xmm1. |
VFMSUB132PS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W0 9A /r | fma | Multiply packed single-precision floating-point values from ymm1 and ymm3/mem, subtract ymm2 and put result in ymm1. |
VFMSUB213PS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W0 AA /r | fma | Multiply packed single-precision floating-point values from ymm1 and ymm2, subtract ymm3/mem and put result in ymm1. |
VFMSUB231PS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W0 BA /r | fma | Multiply packed single-precision floating-point values from ymm2 and ymm3/mem, subtract ymm1 and put result in ymm1. |
VFMSUB132PS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 9A /r | avx512 | Multiply packed single-precision floating-point values from xmm1 and xmm3/m128/m32bcst, subtract xmm2 and put result in xmm1. |
VFMSUB213PS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 AA /r | avx512 | Multiply packed single-precision floating-point values from xmm1 and xmm2, subtract xmm3/m128/m32bcst and put result in xmm1. |
VFMSUB231PS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 BA /r | avx512 | Multiply packed single-precision floating-point values from xmm2 and xmm3/m128/m32bcst, subtract xmm1 and put result in xmm1. |
VFMSUB132PS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 9A /r | avx512 | Multiply packed single-precision floating-point values from ymm1 and ymm3/m256/m32bcst, subtract ymm2 and put result in ymm1. |
VFMSUB213PS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 AA /r | avx512 | Multiply packed single-precision floating-point values from ymm1 and ymm2, subtract ymm3/m256/m32bcst and put result in ymm1. |
VFMSUB231PS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 BA /r | avx512 | Multiply packed single-precision floating-point values from ymm2 and ymm3/m256/m32bcst, subtract ymm1 and put result in ymm1. |
VFMSUB132PS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.NDS.512.66.0F38.W0 9A /r | avx512 | Multiply packed single-precision floating-point values from zmm1 and zmm3/m512/m32bcst, subtract zmm2 and put result in zmm1. |
VFMSUB213PS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.NDS.512.66.0F38.W0 AA /r | avx512 | Multiply packed single-precision floating-point values from zmm1 and zmm2, subtract zmm3/m512/m32bcst and put result in zmm1. |
VFMSUB231PS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.NDS.512.66.0F38.W0 BA /r | avx512 | Multiply packed single-precision floating-point values from zmm2 and zmm3/m512/m32bcst, subtract zmm1 and put result in zmm1. |
VFMSUB132SD xmm1, xmm2, xmm3/m64 | VEX.DDS.LIG.66.0F38.W1 9B /r | fma | Multiply scalar double-precision floating-point value from xmm1 and xmm3/m64, subtract xmm2 and put result in xmm1. |
VFMSUB213SD xmm1, xmm2, xmm3/m64 | VEX.DDS.LIG.66.0F38.W1 AB /r | fma | Multiply scalar double-precision floating-point value from xmm1 and xmm2, subtract xmm3/m64 and put result in xmm1. |
VFMSUB231SD xmm1, xmm2, xmm3/m64 | VEX.DDS.LIG.66.0F38.W1 BB /r | fma | Multiply scalar double-precision floating-point value from xmm2 and xmm3/m64, subtract xmm1 and put result in xmm1. |
VFMSUB132SD xmm1 {k1}{z}, xmm2, xmm3/m64{er} | EVEX.DDS.LIG.66.0F38.W1 9B /r | avx512 | Multiply scalar double-precision floating-point value from xmm1 and xmm3/m64, subtract xmm2 and put result in xmm1. |
VFMSUB213SD xmm1 {k1}{z}, xmm2, xmm3/m64{er} | EVEX.DDS.LIG.66.0F38.W1 AB /r | avx512 | Multiply scalar double-precision floating-point value from xmm1 and xmm2, subtract xmm3/m64 and put result in xmm1. |
VFMSUB231SD xmm1 {k1}{z}, xmm2, xmm3/m64{er} | EVEX.DDS.LIG.66.0F38.W1 BB /r | avx512 | Multiply scalar double-precision floating-point value from xmm2 and xmm3/m64, subtract xmm1 and put result in xmm1. |
VFMSUB132SS xmm1, xmm2, xmm3/m32 | VEX.DDS.LIG.66.0F38.W0 9B /r | fma | Multiply scalar single-precision floating-point value from xmm1 and xmm3/m32, subtract xmm2 and put result in xmm1. |
VFMSUB213SS xmm1, xmm2, xmm3/m32 | VEX.DDS.LIG.66.0F38.W0 AB /r | fma | Multiply scalar single-precision floating-point value from xmm1 and xmm2, subtract xmm3/m32 and put result in xmm1. |
VFMSUB231SS xmm1, xmm2, xmm3/m32 | VEX.DDS.LIG.66.0F38.W0 BB /r | fma | Multiply scalar single-precision floating-point value from xmm2 and xmm3/m32, subtract xmm1 and put result in xmm1. |
VFMSUB132SS xmm1 {k1}{z}, xmm2, xmm3/m32{er} | EVEX.DDS.LIG.66.0F38.W0 9B /r | avx512 | Multiply scalar single-precision floating-point value from xmm1 and xmm3/m32, subtract xmm2 and put result in xmm1. |
VFMSUB213SS xmm1 {k1}{z}, xmm2, xmm3/m32{er} | EVEX.DDS.LIG.66.0F38.W0 AB /r | avx512 | Multiply scalar single-precision floating-point value from xmm1 and xmm2, subtract xmm3/m32 and put result in xmm1. |
VFMSUB231SS xmm1 {k1}{z}, xmm2, xmm3/m32{er} | EVEX.DDS.LIG.66.0F38.W0 BB /r | avx512 | Multiply scalar single-precision floating-point value from xmm2 and xmm3/m32, subtract xmm1 and put result in xmm1. |
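Same 132/213/231 scheme as the FMADD group, with the addend subtracted: every form computes a fused a*b-c. A sketch assuming `_mm256_fmsub_pd`:

```c
#include <immintrin.h>

/* Fused a*b - c; the compiler picks VFMSUB{132,213,231}PD. */
__m256d fms_pd(__m256d a, __m256d b, __m256d c) {
    return _mm256_fmsub_pd(a, b, c);
}
```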
VFMSUBADD132PD xmm1, xmm2, xmm3/m128 | VEX.DDS.128.66.0F38.W1 97 /r | fma | Multiply packed double-precision floating-point values from xmm1 and xmm3/mem, subtract/add elements in xmm2 and put result in xmm1. |
VFMSUBADD213PD xmm1, xmm2, xmm3/m128 | VEX.DDS.128.66.0F38.W1 A7 /r | fma | Multiply packed double-precision floating-point values from xmm1 and xmm2, subtract/add elements in xmm3/mem and put result in xmm1. |
VFMSUBADD231PD xmm1, xmm2, xmm3/m128 | VEX.DDS.128.66.0F38.W1 B7 /r | fma | Multiply packed double-precision floating-point values from xmm2 and xmm3/mem, subtract/add elements in xmm1 and put result in xmm1. |
VFMSUBADD132PD ymm1, ymm2, ymm3/m256 | VEX.DDS.256.66.0F38.W1 97 /r | fma | Multiply packed double-precision floating-point values from ymm1 and ymm3/mem, subtract/add elements in ymm2 and put result in ymm1. |
VFMSUBADD213PD ymm1, ymm2, ymm3/m256 | VEX.DDS.256.66.0F38.W1 A7 /r | fma | Multiply packed double-precision floating-point values from ymm1 and ymm2, subtract/add elements in ymm3/mem and put result in ymm1. |
VFMSUBADD231PD ymm1, ymm2, ymm3/m256 | VEX.DDS.256.66.0F38.W1 B7 /r | fma | Multiply packed double-precision floating-point values from ymm2 and ymm3/mem, subtract/add elements in ymm1 and put result in ymm1. |
VFMSUBADD132PD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.DDS.128.66.0F38.W1 97 /r | avx512 | Multiply packed double-precision floating-point values from xmm1 and xmm3/m128/m64bcst, subtract/add elements in xmm2 and put result in xmm1 subject to writemask k1. |
VFMSUBADD213PD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.DDS.128.66.0F38.W1 A7 /r | avx512 | Multiply packed double-precision floating-point values from xmm1 and xmm2, subtract/add elements in xmm3/m128/m64bcst and put result in xmm1 subject to writemask k1. |
VFMSUBADD231PD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.DDS.128.66.0F38.W1 B7 /r | avx512 | Multiply packed double-precision floating-point values from xmm2 and xmm3/m128/m64bcst, subtract/add elements in xmm1 and put result in xmm1 subject to writemask k1. |
VFMSUBADD132PD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.DDS.256.66.0F38.W1 97 /r | avx512 | Multiply packed double-precision floating-point values from ymm1 and ymm3/m256/m64bcst, subtract/add elements in ymm2 and put result in ymm1 subject to writemask k1. |
VFMSUBADD213PD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.DDS.256.66.0F38.W1 A7 /r | avx512 | Multiply packed double-precision floating-point values from ymm1 and ymm2, subtract/add elements in ymm3/m256/m64bcst and put result in ymm1 subject to writemask k1. |
VFMSUBADD231PD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.DDS.256.66.0F38.W1 B7 /r | avx512 | Multiply packed double-precision floating-point values from ymm2 and ymm3/m256/m64bcst, subtract/add elements in ymm1 and put result in ymm1 subject to writemask k1. |
VFMSUBADD132PD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.DDS.512.66.0F38.W1 97 /r | avx512 | Multiply packed double-precision floating-point values from zmm1 and zmm3/m512/m64bcst, subtract/add elements in zmm2 and put result in zmm1 subject to writemask k1. |
VFMSUBADD213PD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.DDS.512.66.0F38.W1 A7 /r | avx512 | Multiply packed double-precision floating-point values from zmm1 and zmm2, subtract/add elements in zmm3/m512/m64bcst and put result in zmm1 subject to writemask k1. |
VFMSUBADD231PD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.DDS.512.66.0F38.W1 B7 /r | avx512 | Multiply packed double-precision floating-point values from zmm2 and zmm3/m512/m64bcst, subtract/add elements in zmm1 and put result in zmm1 subject to writemask k1. |
VFMSUBADD132PS xmm1, xmm2, xmm3/m128 | VEX.DDS.128.66.0F38.W0 97 /r | fma | Multiply packed single-precision floating-point values from xmm1 and xmm3/mem, subtract/add elements in xmm2 and put result in xmm1. |
VFMSUBADD213PS xmm1, xmm2, xmm3/m128 | VEX.DDS.128.66.0F38.W0 A7 /r | fma | Multiply packed single-precision floating-point values from xmm1 and xmm2, subtract/add elements in xmm3/mem and put result in xmm1. |
VFMSUBADD231PS xmm1, xmm2, xmm3/m128 | VEX.DDS.128.66.0F38.W0 B7 /r | fma | Multiply packed single-precision floating-point values from xmm2 and xmm3/mem, subtract/add elements in xmm1 and put result in xmm1. |
VFMSUBADD132PS ymm1, ymm2, ymm3/m256 | VEX.DDS.256.66.0F38.W0 97 /r | fma | Multiply packed single-precision floating-point values from ymm1 and ymm3/mem, subtract/add elements in ymm2 and put result in ymm1. |
VFMSUBADD213PS ymm1, ymm2, ymm3/m256 | VEX.DDS.256.66.0F38.W0 A7 /r | fma | Multiply packed single-precision floating-point values from ymm1 and ymm2, subtract/add elements in ymm3/mem and put result in ymm1. |
VFMSUBADD231PS ymm1, ymm2, ymm3/m256 | VEX.DDS.256.66.0F38.W0 B7 /r | fma | Multiply packed single-precision floating-point values from ymm2 and ymm3/mem, subtract/add elements in ymm1 and put result in ymm1. |
VFMSUBADD132PS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.DDS.128.66.0F38.W0 97 /r | avx512 | Multiply packed single-precision floating-point values from xmm1 and xmm3/m128/m32bcst, subtract/add elements in xmm2 and put result in xmm1 subject to writemask k1. |
VFMSUBADD213PS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.DDS.128.66.0F38.W0 A7 /r | avx512 | Multiply packed single-precision floating-point values from xmm1 and xmm2, subtract/add elements in xmm3/m128/m32bcst and put result in xmm1 subject to writemask k1. |
VFMSUBADD231PS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.DDS.128.66.0F38.W0 B7 /r | avx512 | Multiply packed single-precision floating-point values from xmm2 and xmm3/m128/m32bcst, subtract/add elements in xmm1 and put result in xmm1 subject to writemask k1. |
VFMSUBADD132PS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.DDS.256.66.0F38.W0 97 /r | avx512 | Multiply packed single-precision floating-point values from ymm1 and ymm3/m256/m32bcst, subtract/add elements in ymm2 and put result in ymm1 subject to writemask k1. |
VFMSUBADD213PS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.DDS.256.66.0F38.W0 A7 /r | avx512 | Multiply packed single-precision floating-point values from ymm1 and ymm2, subtract/add elements in ymm3/m256/m32bcst and put result in ymm1 subject to writemask k1. |
VFMSUBADD231PS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.DDS.256.66.0F38.W0 B7 /r | avx512 | Multiply packed single-precision floating-point values from ymm2 and ymm3/m256/m32bcst, subtract/add elements in ymm1 and put result in ymm1 subject to writemask k1. |
VFMSUBADD132PS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.DDS.512.66.0F38.W0 97 /r | avx512 | Multiply packed single-precision floating-point values from zmm1 and zmm3/m512/m32bcst, subtract/add elements in zmm2 and put result in zmm1 subject to writemask k1. |
VFMSUBADD213PS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.DDS.512.66.0F38.W0 A7 /r | avx512 | Multiply packed single-precision floating-point values from zmm1 and zmm2, subtract/add elements in zmm3/m512/m32bcst and put result in zmm1 subject to writemask k1. |
VFMSUBADD231PS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.DDS.512.66.0F38.W0 B7 /r | avx512 | Multiply packed single-precision floating-point values from zmm2 and zmm3/m512/m32bcst, subtract/add elements in zmm1 and put result in zmm1 subject to writemask k1. |
VFNMADD132PD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W1 9C /r | fma | Multiply packed double-precision floating-point values from xmm1 and xmm3/mem, negate the multiplication result and add to xmm2 and put result in xmm1. |
VFNMADD213PD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W1 AC /r | fma | Multiply packed double-precision floating-point values from xmm1 and xmm2, negate the multiplication result and add to xmm3/mem and put result in xmm1. |
VFNMADD231PD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W1 BC /r | fma | Multiply packed double-precision floating-point values from xmm2 and xmm3/mem, negate the multiplication result and add to xmm1 and put result in xmm1. |
VFNMADD132PD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W1 9C /r | fma | Multiply packed double-precision floating-point values from ymm1 and ymm3/mem, negate the multiplication result and add to ymm2 and put result in ymm1. |
VFNMADD213PD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W1 AC /r | fma | Multiply packed double-precision floating-point values from ymm1 and ymm2, negate the multiplication result and add to ymm3/mem and put result in ymm1. |
VFNMADD231PD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W1 BC /r | fma | Multiply packed double-precision floating-point values from ymm2 and ymm3/mem, negate the multiplication result and add to ymm1 and put result in ymm1. |
VFNMADD132PD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 9C /r | avx512 | Multiply packed double-precision floating-point values from xmm1 and xmm3/m128/m64bcst, negate the multiplication result and add to xmm2 and put result in xmm1. |
VFNMADD213PD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 AC /r | avx512 | Multiply packed double-precision floating-point values from xmm1 and xmm2, negate the multiplication result and add to xmm3/m128/m64bcst and put result in xmm1. |
VFNMADD231PD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 BC /r | avx512 | Multiply packed double-precision floating-point values from xmm2 and xmm3/m128/m64bcst, negate the multiplication result and add to xmm1 and put result in xmm1. |
VFNMADD132PD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 9C /r | avx512 | Multiply packed double-precision floating-point values from ymm1 and ymm3/m256/m64bcst, negate the multiplication result and add to ymm2 and put result in ymm1. |
VFNMADD213PD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 AC /r | avx512 | Multiply packed double-precision floating-point values from ymm1 and ymm2, negate the multiplication result and add to ymm3/m256/m64bcst and put result in ymm1. |
VFNMADD231PD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 BC /r | avx512 | Multiply packed double-precision floating-point values from ymm2 and ymm3/m256/m64bcst, negate the multiplication result and add to ymm1 and put result in ymm1. |
VFNMADD132PD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.NDS.512.66.0F38.W1 9C /r | avx512 | Multiply packed double-precision floating-point values from zmm1 and zmm3/m512/m64bcst, negate the multiplication result and add to zmm2 and put result in zmm1. |
VFNMADD213PD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.NDS.512.66.0F38.W1 AC /r | avx512 | Multiply packed double-precision floating-point values from zmm1 and zmm2, negate the multiplication result and add to zmm3/m512/m64bcst and put result in zmm1. |
VFNMADD231PD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.NDS.512.66.0F38.W1 BC /r | avx512 | Multiply packed double-precision floating-point values from zmm2 and zmm3/m512/m64bcst, negate the multiplication result and add to zmm1 and put result in zmm1. |
VFNMADD132PS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W0 9C /r | fma | Multiply packed single-precision floating-point values from xmm1 and xmm3/mem, negate the multiplication result and add to xmm2 and put result in xmm1. |
VFNMADD213PS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W0 AC /r | fma | Multiply packed single-precision floating-point values from xmm1 and xmm2, negate the multiplication result and add to xmm3/mem and put result in xmm1. |
VFNMADD231PS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W0 BC /r | fma | Multiply packed single-precision floating-point values from xmm2 and xmm3/mem, negate the multiplication result and add to xmm1 and put result in xmm1. |
VFNMADD132PS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W0 9C /r | fma | Multiply packed single-precision floating-point values from ymm1 and ymm3/mem, negate the multiplication result and add to ymm2 and put result in ymm1. |
VFNMADD213PS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W0 AC /r | fma | Multiply packed single-precision floating-point values from ymm1 and ymm2, negate the multiplication result and add to ymm3/mem and put result in ymm1. |
VFNMADD231PS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W0 BC /r | fma | Multiply packed single-precision floating-point values from ymm2 and ymm3/mem, negate the multiplication result and add to ymm1 and put result in ymm1. |
VFNMADD132PS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 9C /r | avx512 | Multiply packed single-precision floating-point values from xmm1 and xmm3/m128/m32bcst, negate the multiplication result and add to xmm2 and put result in xmm1. |
VFNMADD213PS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 AC /r | avx512 | Multiply packed single-precision floating-point values from xmm1 and xmm2, negate the multiplication result and add to xmm3/m128/m32bcst and put result in xmm1. |
VFNMADD231PS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 BC /r | avx512 | Multiply packed single-precision floating-point values from xmm2 and xmm3/m128/m32bcst, negate the multiplication result and add to xmm1 and put result in xmm1. |
VFNMADD132PS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 9C /r | avx512 | Multiply packed single-precision floating-point values from ymm1 and ymm3/m256/m32bcst, negate the multiplication result and add to ymm2 and put result in ymm1. |
VFNMADD213PS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 AC /r | avx512 | Multiply packed single-precision floating-point values from ymm1 and ymm2, negate the multiplication result and add to ymm3/m256/m32bcst and put result in ymm1. |
VFNMADD231PS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 BC /r | avx512 | Multiply packed single-precision floating-point values from ymm2 and ymm3/m256/m32bcst, negate the multiplication result and add to ymm1 and put result in ymm1. |
VFNMADD132PS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.NDS.512.66.0F38.W0 9C /r | avx512 | Multiply packed single-precision floating-point values from zmm1 and zmm3/m512/m32bcst, negate the multiplication result and add to zmm2 and put result in zmm1. |
VFNMADD213PS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.NDS.512.66.0F38.W0 AC /r | avx512 | Multiply packed single-precision floating-point values from zmm1 and zmm2, negate the multiplication result and add to zmm3/m512/m32bcst and put result in zmm1. |
VFNMADD231PS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.NDS.512.66.0F38.W0 BC /r | avx512 | Multiply packed single-precision floating-point values from zmm2 and zmm3/m512/m32bcst, negate the multiplication result and add to zmm1 and put result in zmm1. |
VFNMADD132SD xmm1, xmm2, xmm3/m64 | VEX.DDS.LIG.66.0F38.W1 9D /r | fma | Multiply scalar double-precision floating-point value from xmm1 and xmm3/mem, negate the multiplication result and add to xmm2 and put result in xmm1. |
VFNMADD213SD xmm1, xmm2, xmm3/m64 | VEX.DDS.LIG.66.0F38.W1 AD /r | fma | Multiply scalar double-precision floating-point value from xmm1 and xmm2, negate the multiplication result and add to xmm3/mem and put result in xmm1. |
VFNMADD231SD xmm1, xmm2, xmm3/m64 | VEX.DDS.LIG.66.0F38.W1 BD /r | fma | Multiply scalar double-precision floating-point value from xmm2 and xmm3/mem, negate the multiplication result and add to xmm1 and put result in xmm1. |
VFNMADD132SD xmm1 {k1}{z}, xmm2, xmm3/m64{er} | EVEX.DDS.LIG.66.0F38.W1 9D /r | avx512 | Multiply scalar double-precision floating-point value from xmm1 and xmm3/m64, negate the multiplication result and add to xmm2 and put result in xmm1. |
VFNMADD213SD xmm1 {k1}{z}, xmm2, xmm3/m64{er} | EVEX.DDS.LIG.66.0F38.W1 AD /r | avx512 | Multiply scalar double-precision floating-point value from xmm1 and xmm2, negate the multiplication result and add to xmm3/m64 and put result in xmm1. |
VFNMADD231SD xmm1 {k1}{z}, xmm2, xmm3/m64{er} | EVEX.DDS.LIG.66.0F38.W1 BD /r | avx512 | Multiply scalar double-precision floating-point value from xmm2 and xmm3/m64, negate the multiplication result and add to xmm1 and put result in xmm1. |
VFNMADD132SS xmm1, xmm2, xmm3/m32 | VEX.DDS.LIG.66.0F38.W0 9D /r | fma | Multiply scalar single-precision floating-point value from xmm1 and xmm3/m32, negate the multiplication result and add to xmm2 and put result in xmm1. |
VFNMADD213SS xmm1, xmm2, xmm3/m32 | VEX.DDS.LIG.66.0F38.W0 AD /r | fma | Multiply scalar single-precision floating-point value from xmm1 and xmm2, negate the multiplication result and add to xmm3/m32 and put result in xmm1. |
VFNMADD231SS xmm1, xmm2, xmm3/m32 | VEX.DDS.LIG.66.0F38.W0 BD /r | fma | Multiply scalar single-precision floating-point value from xmm2 and xmm3/m32, negate the multiplication result and add to xmm1 and put result in xmm1. |
VFNMADD132SS xmm1 {k1}{z}, xmm2, xmm3/m32{er} | EVEX.DDS.LIG.66.0F38.W0 9D /r | avx512 | Multiply scalar single-precision floating-point value from xmm1 and xmm3/m32, negate the multiplication result and add to xmm2 and put result in xmm1. |
VFNMADD213SS xmm1 {k1}{z}, xmm2, xmm3/m32{er} | EVEX.DDS.LIG.66.0F38.W0 AD /r | avx512 | Multiply scalar single-precision floating-point value from xmm1 and xmm2, negate the multiplication result and add to xmm3/m32 and put result in xmm1. |
VFNMADD231SS xmm1 {k1}{z}, xmm2, xmm3/m32{er} | EVEX.DDS.LIG.66.0F38.W0 BD /r | avx512 | Multiply scalar single-precision floating-point value from xmm2 and xmm3/m32, negate the multiplication result and add to xmm1 and put result in xmm1. |
VFNMSUB132PD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W1 9E /r | fma | Multiply packed double-precision floating-point values from xmm1 and xmm3/mem, negate the multiplication result and subtract xmm2 and put result in xmm1. |
VFNMSUB213PD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W1 AE /r | fma | Multiply packed double-precision floating-point values from xmm1 and xmm2, negate the multiplication result and subtract xmm3/mem and put result in xmm1. |
VFNMSUB231PD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W1 BE /r | fma | Multiply packed double-precision floating-point values from xmm2 and xmm3/mem, negate the multiplication result and subtract xmm1 and put result in xmm1. |
VFNMSUB132PD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W1 9E /r | fma | Multiply packed double-precision floating-point values from ymm1 and ymm3/mem, negate the multiplication result and subtract ymm2 and put result in ymm1. |
VFNMSUB213PD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W1 AE /r | fma | Multiply packed double-precision floating-point values from ymm1 and ymm2, negate the multiplication result and subtract ymm3/mem and put result in ymm1. |
VFNMSUB231PD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W1 BE /r | fma | Multiply packed double-precision floating-point values from ymm2 and ymm3/mem, negate the multiplication result and subtract ymm1 and put result in ymm1. |
VFNMSUB132PD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 9E /r | avx512 | Multiply packed double-precision floating-point values from xmm1 and xmm3/m128/m64bcst, negate the multiplication result and subtract xmm2 and put result in xmm1. |
VFNMSUB213PD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 AE /r | avx512 | Multiply packed double-precision floating-point values from xmm1 and xmm2, negate the multiplication result and subtract xmm3/m128/m64bcst and put result in xmm1. |
VFNMSUB231PD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 BE /r | avx512 | Multiply packed double-precision floating-point values from xmm2 and xmm3/m128/m64bcst, negate the multiplication result and subtract xmm1 and put result in xmm1. |
VFNMSUB132PD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 9E /r | avx512 | Multiply packed double-precision floating-point values from ymm1 and ymm3/m256/m64bcst, negate the multiplication result and subtract ymm2 and put result in ymm1. |
VFNMSUB213PD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 AE /r | avx512 | Multiply packed double-precision floating-point values from ymm1 and ymm2, negate the multiplication result and subtract ymm3/m256/m64bcst and put result in ymm1. |
VFNMSUB231PD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 BE /r | avx512 | Multiply packed double-precision floating-point values from ymm2 and ymm3/m256/m64bcst, negate the multiplication result and subtract ymm1 and put result in ymm1. |
VFNMSUB132PD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.NDS.512.66.0F38.W1 9E /r | avx512 | Multiply packed double-precision floating-point values from zmm1 and zmm3/m512/m64bcst, negate the multiplication result and subtract zmm2 and put result in zmm1. |
VFNMSUB213PD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.NDS.512.66.0F38.W1 AE /r | avx512 | Multiply packed double-precision floating-point values from zmm1 and zmm2, negate the multiplication result and subtract zmm3/m512/m64bcst and put result in zmm1. |
VFNMSUB231PD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.NDS.512.66.0F38.W1 BE /r | avx512 | Multiply packed double-precision floating-point values from zmm2 and zmm3/m512/m64bcst, negate the multiplication result and subtract zmm1 and put result in zmm1. |
VFNMSUB132PS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W0 9E /r | fma | Multiply packed single-precision floating-point values from xmm1 and xmm3/mem, negate the multiplication result and subtract xmm2 and put result in xmm1. |
VFNMSUB213PS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W0 AE /r | fma | Multiply packed single-precision floating-point values from xmm1 and xmm2, negate the multiplication result and subtract xmm3/mem and put result in xmm1. |
VFNMSUB231PS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W0 BE /r | fma | Multiply packed single-precision floating-point values from xmm2 and xmm3/mem, negate the multiplication result and subtract xmm1 and put result in xmm1. |
VFNMSUB132PS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W0 9E /r | fma | Multiply packed single-precision floating-point values from ymm1 and ymm3/mem, negate the multiplication result and subtract ymm2 and put result in ymm1. |
VFNMSUB213PS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W0 AE /r | fma | Multiply packed single-precision floating-point values from ymm1 and ymm2, negate the multiplication result and subtract ymm3/mem and put result in ymm1. |
VFNMSUB231PS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W0 BE /r | fma | Multiply packed single-precision floating-point values from ymm2 and ymm3/mem, negate the multiplication result and subtract ymm1 and put result in ymm1. |
VFNMSUB132PS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 9E /r | avx512 | Multiply packed single-precision floating-point values from xmm1 and xmm3/m128/m32bcst, negate the multiplication result and subtract xmm2 and put result in xmm1. |
VFNMSUB213PS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 AE /r | avx512 | Multiply packed single-precision floating-point values from xmm1 and xmm2, negate the multiplication result and subtract xmm3/m128/m32bcst and put result in xmm1. |
VFNMSUB231PS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 BE /r | avx512 | Multiply packed single-precision floating-point values from xmm2 and xmm3/m128/m32bcst, negate the multiplication result and subtract xmm1 and put result in xmm1. |
VFNMSUB132PS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 9E /r | avx512 | Multiply packed single-precision floating-point values from ymm1 and ymm3/m256/m32bcst, negate the multiplication result and subtract ymm2 and put result in ymm1. |
VFNMSUB213PS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 AE /r | avx512 | Multiply packed single-precision floating-point values from ymm1 and ymm2, negate the multiplication result and subtract ymm3/m256/m32bcst and put result in ymm1. |
VFNMSUB231PS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 BE /r | avx512 | Multiply packed single-precision floating-point values from ymm2 and ymm3/m256/m32bcst, negate the multiplication result and subtract ymm1 and put result in ymm1. |
VFNMSUB132PS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.NDS.512.66.0F38.W0 9E /r | avx512 | Multiply packed single-precision floating-point values from zmm1 and zmm3/m512/m32bcst, negate the multiplication result and subtract zmm2 and put result in zmm1. |
VFNMSUB213PS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.NDS.512.66.0F38.W0 AE /r | avx512 | Multiply packed single-precision floating-point values from zmm1 and zmm2, negate the multiplication result and subtract zmm3/m512/m32bcst and put result in zmm1. |
VFNMSUB231PS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.NDS.512.66.0F38.W0 BE /r | avx512 | Multiply packed single-precision floating-point values from zmm2 and zmm3/m512/m32bcst, negate the multiplication result and subtract zmm1 and put result in zmm1. |
VFNMSUB132SD xmm1, xmm2, xmm3/m64 | VEX.DDS.LIG.66.0F38.W1 9F /r | fma | Multiply scalar double-precision floating-point value from xmm1 and xmm3/mem, negate the multiplication result and subtract xmm2 and put result in xmm1. |
VFNMSUB213SD xmm1, xmm2, xmm3/m64 | VEX.DDS.LIG.66.0F38.W1 AF /r | fma | Multiply scalar double-precision floating-point value from xmm1 and xmm2, negate the multiplication result and subtract xmm3/mem and put result in xmm1. |
VFNMSUB231SD xmm1, xmm2, xmm3/m64 | VEX.DDS.LIG.66.0F38.W1 BF /r | fma | Multiply scalar double-precision floating-point value from xmm2 and xmm3/mem, negate the multiplication result and subtract xmm1 and put result in xmm1. |
VFNMSUB132SD xmm1 {k1}{z}, xmm2, xmm3/m64{er} | EVEX.DDS.LIG.66.0F38.W1 9F /r | avx512 | Multiply scalar double-precision floating-point value from xmm1 and xmm3/m64, negate the multiplication result and subtract xmm2 and put result in xmm1. |
VFNMSUB213SD xmm1 {k1}{z}, xmm2, xmm3/m64{er} | EVEX.DDS.LIG.66.0F38.W1 AF /r | avx512 | Multiply scalar double-precision floating-point value from xmm1 and xmm2, negate the multiplication result and subtract xmm3/m64 and put result in xmm1. |
VFNMSUB231SD xmm1 {k1}{z}, xmm2, xmm3/m64{er} | EVEX.DDS.LIG.66.0F38.W1 BF /r | avx512 | Multiply scalar double-precision floating-point value from xmm2 and xmm3/m64, negate the multiplication result and subtract xmm1 and put result in xmm1. |
VFNMSUB132SS xmm1, xmm2, xmm3/m32 | VEX.DDS.LIG.66.0F38.W0 9F /r | fma | Multiply scalar single-precision floating-point value from xmm1 and xmm3/m32, negate the multiplication result and subtract xmm2 and put result in xmm1. |
VFNMSUB213SS xmm1, xmm2, xmm3/m32 | VEX.DDS.LIG.66.0F38.W0 AF /r | fma | Multiply scalar single-precision floating-point value from xmm1 and xmm2, negate the multiplication result and subtract xmm3/m32 and put result in xmm1. |
VFNMSUB231SS xmm1, xmm2, xmm3/m32 | VEX.DDS.LIG.66.0F38.W0 BF /r | fma | Multiply scalar single-precision floating-point value from xmm2 and xmm3/m32, negate the multiplication result and subtract xmm1 and put result in xmm1. |
VFNMSUB132SS xmm1 {k1}{z}, xmm2, xmm3/m32{er} | EVEX.DDS.LIG.66.0F38.W0 9F /r | avx512 | Multiply scalar single-precision floating-point value from xmm1 and xmm3/m32, negate the multiplication result and subtract xmm2 and put result in xmm1. |
VFNMSUB213SS xmm1 {k1}{z}, xmm2, xmm3/m32{er} | EVEX.DDS.LIG.66.0F38.W0 AF /r | avx512 | Multiply scalar single-precision floating-point value from xmm1 and xmm2, negate the multiplication result and subtract xmm3/m32 and put result in xmm1. |
VFNMSUB231SS xmm1 {k1}{z}, xmm2, xmm3/m32{er} | EVEX.DDS.LIG.66.0F38.W0 BF /r | avx512 | Multiply scalar single-precision floating-point value from xmm2 and xmm3/m32, negate the multiplication result and subtract xmm1 and put result in xmm1. |
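
A note on the 132/213/231 suffixes above: they only choose which of the three operands is multiplied and which is added or subtracted, so a compiler exposes all of them through a single intrinsic and picks whichever encoding suits register allocation. A minimal C sketch of the negated-FMA forms, assuming a compiler with FMA support (e.g. gcc -mfma):

```c
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    __m128d a = _mm_set1_pd(2.0);
    __m128d b = _mm_set1_pd(3.0);
    __m128d c = _mm_set1_pd(10.0);

    /* VFNMADD: -(a*b) + c = -(2*3) + 10 = 4 */
    __m128d nmadd = _mm_fnmadd_pd(a, b, c);
    /* VFNMSUB: -(a*b) - c = -(2*3) - 10 = -16 */
    __m128d nmsub = _mm_fnmsub_pd(a, b, c);

    double r[2];
    _mm_storeu_pd(r, nmadd);
    printf("fnmadd: %g\n", r[0]);   /* 4 */
    _mm_storeu_pd(r, nmsub);
    printf("fnmsub: %g\n", r[0]);   /* -16 */
    return 0;
}
```
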
VFPCLASSPD k2 {k1}, xmm2/m128/m64bcst, imm8 | EVEX.128.66.0F3A.W1 66 /r ib | avx512 | Tests the input for the following categories: NaN, +0, -0, +Infinity, -Infinity, denormal, finite negative. The immediate field provides a mask bit for each of these category tests. The masked test results are OR-ed together to form a mask result. |
VFPCLASSPD k2 {k1}, ymm2/m256/m64bcst, imm8 | EVEX.256.66.0F3A.W1 66 /r ib | avx512 | Tests the input for the following categories: NaN, +0, -0, +Infinity, -Infinity, denormal, finite negative. The immediate field provides a mask bit for each of these category tests. The masked test results are OR-ed together to form a mask result. |
VFPCLASSPD k2 {k1}, zmm2/m512/m64bcst, imm8 | EVEX.512.66.0F3A.W1 66 /r ib | avx512 | Tests the input for the following categories: NaN, +0, -0, +Infinity, -Infinity, denormal, finite negative. The immediate field provides a mask bit for each of these category tests. The masked test results are OR-ed together to form a mask result. |
VFPCLASSPS k2 {k1}, xmm2/m128/m32bcst, imm8 | EVEX.128.66.0F3A.W0 66 /r ib | avx512 | Tests the input for the following categories: NaN, +0, -0, +Infinity, -Infinity, denormal, finite negative. The immediate field provides a mask bit for each of these category tests. The masked test results are OR-ed together to form a mask result. |
VFPCLASSPS k2 {k1}, ymm2/m256/m32bcst, imm8 | EVEX.256.66.0F3A.W0 66 /r ib | avx512 | Tests the input for the following categories: NaN, +0, -0, +Infinity, -Infinity, denormal, finite negative. The immediate field provides a mask bit for each of these category tests. The masked test results are OR-ed together to form a mask result. |
VFPCLASSPS k2 {k1}, zmm2/m512/m32bcst, imm8 | EVEX.512.66.0F3A.W0 66 /r ib | avx512 | Tests the input for the following categories: NaN, +0, -0, +Infinity, -Infinity, denormal, finite negative. The immediate field provides a mask bit for each of these category tests. The masked test results are OR-ed together to form a mask result. |
VFPCLASSSD k2 {k1}, xmm2/m64, imm8 | EVEX.LIG.66.0F3A.W1 67 /r ib | avx512 | Tests the input for the following categories: NaN, +0, -0, +Infinity, -Infinity, denormal, finite negative. The immediate field provides a mask bit for each of these category tests. The masked test results are OR-ed together to form a mask result. |
VFPCLASSSS k2 {k1}, xmm2/m32, imm8 | EVEX.LIG.66.0F3A.W0 67 /r ib | avx512 | Tests the input for the following categories: NaN, +0, -0, +Infinity, -Infinity, denormal, finite negative. The immediate field provides a mask bit for each of these category tests. The masked test results are OR-ed together to form a mask result. |
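
The imm8 of the VFPCLASS family selects which categories to test; per the encoding, bit 0 = QNaN, bit 1 = +0, bit 2 = -0, bit 3 = +Inf, bit 4 = -Inf, bit 5 = denormal, bit 6 = finite negative, bit 7 = SNaN, and the selected per-element tests are OR-ed into the destination mask. A minimal sketch, assuming AVX-512DQ and AVX-512VL (e.g. gcc -mavx512dq -mavx512vl):

```c
#include <immintrin.h>
#include <stdio.h>
#include <math.h>

int main(void) {
    /* 0x81 = bit 0 (QNaN) | bit 7 (SNaN): an "is NaN" test */
    __m128d v = _mm_set_pd(NAN, 1.0);   /* element 1 = NaN, element 0 = 1.0 */
    __mmask8 m = _mm_fpclass_pd_mask(v, 0x81);
    printf("NaN mask: 0x%x\n", m);      /* 0x2: only the upper element is NaN */
    return 0;
}
```
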
VGATHERDPD xmm1, vm32x, xmm2 | VEX.DDS.128.66.0F38.W1 92 /r | avx2 | Using dword indices specified in vm32x, gather double-precision FP values from memory conditioned on mask specified by xmm2. Conditionally gathered elements are merged into xmm1. |
VGATHERQPD xmm1, vm64x, xmm2 | VEX.DDS.128.66.0F38.W1 93 /r | avx2 | Using qword indices specified in vm64x, gather double-precision FP values from memory conditioned on mask specified by xmm2. Conditionally gathered elements are merged into xmm1. |
VGATHERDPD ymm1, vm32x, ymm2 | VEX.DDS.256.66.0F38.W1 92 /r | avx2 | Using dword indices specified in vm32x, gather double-precision FP values from memory conditioned on mask specified by ymm2. Conditionally gathered elements are merged into ymm1. |
VGATHERQPD ymm1, vm64y, ymm2 | VEX.DDS.256.66.0F38.W1 93 /r | avx2 | Using qword indices specified in vm64y, gather double-precision FP values from memory conditioned on mask specified by ymm2. Conditionally gathered elements are merged into ymm1. |
VGATHERDPS xmm1 {k1}, vm32x | EVEX.128.66.0F38.W0 92 /vsib | avx512 | Using signed dword indices, gather single-precision floating-point values from memory using k1 as completion mask. |
VGATHERDPS ymm1 {k1}, vm32y | EVEX.256.66.0F38.W0 92 /vsib | avx512 | Using signed dword indices, gather single-precision floating-point values from memory using k1 as completion mask. |
VGATHERDPS zmm1 {k1}, vm32z | EVEX.512.66.0F38.W0 92 /vsib | avx512 | Using signed dword indices, gather single-precision floating-point values from memory using k1 as completion mask. |
VGATHERDPD xmm1 {k1}, vm32x | EVEX.128.66.0F38.W1 92 /vsib | avx512 | Using signed dword indices, gather float64 vector into float64 vector xmm1 using k1 as completion mask. |
VGATHERDPD ymm1 {k1}, vm32x | EVEX.256.66.0F38.W1 92 /vsib | avx512 | Using signed dword indices, gather float64 vector into float64 vector ymm1 using k1 as completion mask. |
VGATHERDPD zmm1 {k1}, vm32y | EVEX.512.66.0F38.W1 92 /vsib | avx512 | Using signed dword indices, gather float64 vector into float64 vector zmm1 using k1 as completion mask. |
VGATHERDPS xmm1, vm32x, xmm2 | VEX.DDS.128.66.0F38.W0 92 /r | avx2 | Using dword indices specified in vm32x, gather single-precision FP values from memory conditioned on mask specified by xmm2. Conditionally gathered elements are merged into xmm1. |
VGATHERQPS xmm1, vm64x, xmm2 | VEX.DDS.128.66.0F38.W0 93 /r | avx2 | Using qword indices specified in vm64x, gather single-precision FP values from memory conditioned on mask specified by xmm2. Conditionally gathered elements are merged into xmm1. |
VGATHERDPS ymm1, vm32y, ymm2 | VEX.DDS.256.66.0F38.W0 92 /r | avx2 | Using dword indices specified in vm32y, gather single-precision FP values from memory conditioned on mask specified by ymm2. Conditionally gathered elements are merged into ymm1. |
VGATHERQPS xmm1, vm64y, xmm2 | VEX.DDS.256.66.0F38.W0 93 /r | avx2 | Using qword indices specified in vm64y, gather single-precision FP values from memory conditioned on mask specified by xmm2. Conditionally gathered elements are merged into xmm1. |
VGATHERPF0DPS vm32z {k1} | EVEX.512.66.0F38.W0 C6 /1 /vsib | avx512 | Using signed dword indices, prefetch sparse byte memory locations containing single-precision data using opmask k1 and T0 hint. |
VGATHERPF0QPS vm64z {k1} | EVEX.512.66.0F38.W0 C7 /1 /vsib | avx512 | Using signed qword indices, prefetch sparse byte memory locations containing single-precision data using opmask k1 and T0 hint. |
VGATHERPF0DPD vm32y {k1} | EVEX.512.66.0F38.W1 C6 /1 /vsib | avx512 | Using signed dword indices, prefetch sparse byte memory locations containing double-precision data using opmask k1 and T0 hint. |
VGATHERPF0QPD vm64z {k1} | EVEX.512.66.0F38.W1 C7 /1 /vsib | avx512 | Using signed qword indices, prefetch sparse byte memory locations containing double-precision data using opmask k1 and T0 hint. |
VGATHERPF1DPS vm32z {k1} | EVEX.512.66.0F38.W0 C6 /2 /vsib | avx512 | Using signed dword indices, prefetch sparse byte memory locations containing single-precision data using opmask k1 and T1 hint. |
VGATHERPF1QPS vm64z {k1} | EVEX.512.66.0F38.W0 C7 /2 /vsib | avx512 | Using signed qword indices, prefetch sparse byte memory locations containing single-precision data using opmask k1 and T1 hint. |
VGATHERPF1DPD vm32y {k1} | EVEX.512.66.0F38.W1 C6 /2 /vsib | avx512 | Using signed dword indices, prefetch sparse byte memory locations containing double-precision data using opmask k1 and T1 hint. |
VGATHERPF1QPD vm64z {k1} | EVEX.512.66.0F38.W1 C7 /2 /vsib | avx512 | Using signed qword indices, prefetch sparse byte memory locations containing double-precision data using opmask k1 and T1 hint. |
VGATHERQPS xmm1 {k1}, vm64x | EVEX.128.66.0F38.W0 93 /vsib | avx512 | Using signed qword indices, gather single-precision floating-point values from memory using k1 as completion mask. |
VGATHERQPS xmm1 {k1}, vm64y | EVEX.256.66.0F38.W0 93 /vsib | avx512 | Using signed qword indices, gather single-precision floating-point values from memory using k1 as completion mask. |
VGATHERQPS ymm1 {k1}, vm64z | EVEX.512.66.0F38.W0 93 /vsib | avx512 | Using signed qword indices, gather single-precision floating-point values from memory using k1 as completion mask. |
VGATHERQPD xmm1 {k1}, vm64x | EVEX.128.66.0F38.W1 93 /vsib | avx512 | Using signed qword indices, gather float64 vector into float64 vector xmm1 using k1 as completion mask. |
VGATHERQPD ymm1 {k1}, vm64y | EVEX.256.66.0F38.W1 93 /vsib | avx512 | Using signed qword indices, gather float64 vector into float64 vector ymm1 using k1 as completion mask. |
VGATHERQPD zmm1 {k1}, vm64z | EVEX.512.66.0F38.W1 93 /vsib | avx512 | Using signed qword indices, gather float64 vector into float64 vector zmm1 using k1 as completion mask. |
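
For the VEX (avx2) gathers above, the mask register is both input and output: elements whose mask high bit is set are fetched from base + index * scale, and the mask is cleared as the gather completes; the EVEX forms use the opmask k1 as the completion mask instead. A minimal sketch of the dword-index double gather, assuming AVX2 (e.g. gcc -mavx2):

```c
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    double table[8] = {0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7};

    /* VGATHERDPD ymm, vm32x, ymm: four dword indices select four doubles */
    __m128i idx = _mm_set_epi32(1, 7, 2, 5);  /* element 0 holds index 5 */
    __m256d g = _mm256_i32gather_pd(table, idx, 8 /* scale = sizeof(double) */);

    double r[4];
    _mm256_storeu_pd(r, g);
    printf("%g %g %g %g\n", r[0], r[1], r[2], r[3]);  /* 5.5 2.2 7.7 1.1 */
    return 0;
}
```
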
VGETEXPPD xmm1 {k1}{z}, xmm2/m128/m64bcst | EVEX.128.66.0F38.W1 42 /r | avx512 | Convert the exponent of packed double-precision floating-point values in the source operand to DP FP results representing unbiased integer exponents and store the results in the destination register. |
VGETEXPPD ymm1 {k1}{z}, ymm2/m256/m64bcst | EVEX.256.66.0F38.W1 42 /r | avx512 | Convert the exponent of packed double-precision floating-point values in the source operand to DP FP results representing unbiased integer exponents and store the results in the destination register. |
VGETEXPPD zmm1 {k1}{z}, zmm2/m512/m64bcst{sae} | EVEX.512.66.0F38.W1 42 /r | avx512 | Convert the exponent of packed double-precision floating-point values in the source operand to DP FP results representing unbiased integer exponents and store the results in the destination under writemask k1. |
VGETEXPPS xmm1 {k1}{z}, xmm2/m128/m32bcst | EVEX.128.66.0F38.W0 42 /r | avx512 | Convert the exponent of packed single-precision floating-point values in the source operand to SP FP results representing unbiased integer exponents and store the results in the destination register. |
VGETEXPPS ymm1 {k1}{z}, ymm2/m256/m32bcst | EVEX.256.66.0F38.W0 42 /r | avx512 | Convert the exponent of packed single-precision floating-point values in the source operand to SP FP results representing unbiased integer exponents and store the results in the destination register. |
VGETEXPPS zmm1 {k1}{z}, zmm2/m512/m32bcst{sae} | EVEX.512.66.0F38.W0 42 /r | avx512 | Convert the exponent of packed single-precision floating-point values in the source operand to SP FP results representing unbiased integer exponents and store the results in the destination register. |
VGETEXPSD xmm1 {k1}{z}, xmm2, xmm3/m64{sae} | EVEX.NDS.LIG.66.0F38.W1 43 /r | avx512 | Convert the biased exponent (bits 62:52) of the low double-precision floating-point value in xmm3/m64 to a DP FP value representing the unbiased integer exponent. Store the result in the low 64 bits of xmm1 under writemask k1 and merge with the other elements of xmm2. |
VGETEXPSS xmm1 {k1}{z}, xmm2, xmm3/m32{sae} | EVEX.NDS.LIG.66.0F38.W0 43 /r | avx512 | Convert the biased exponent (bits 30:23) of the low single-precision floating-point value in xmm3/m32 to an SP FP value representing the unbiased integer exponent. Store the result in xmm1 under writemask k1 and merge with the other elements of xmm2. |
VGETMANTPD xmm1 {k1}{z}, xmm2/m128/m64bcst, imm8 | EVEX.128.66.0F3A.W1 26 /r ib | avx512 | Get normalized mantissa from float64 vector xmm2/m128/m64bcst and store the result in xmm1, using imm8 for sign control and mantissa interval normalization, under writemask. |
VGETMANTPD ymm1 {k1}{z}, ymm2/m256/m64bcst, imm8 | EVEX.256.66.0F3A.W1 26 /r ib | avx512 | Get normalized mantissa from float64 vector ymm2/m256/m64bcst and store the result in ymm1, using imm8 for sign control and mantissa interval normalization, under writemask. |
VGETMANTPD zmm1 {k1}{z}, zmm2/m512/m64bcst{sae}, imm8 | EVEX.512.66.0F3A.W1 26 /r ib | avx512 | Get normalized mantissa from float64 vector zmm2/m512/m64bcst and store the result in zmm1, using imm8 for sign control and mantissa interval normalization, under writemask. |
VGETMANTPS xmm1 {k1}{z}, xmm2/m128/m32bcst, imm8 | EVEX.128.66.0F3A.W0 26 /r ib | avx512 | Get normalized mantissa from float32 vector xmm2/m128/m32bcst and store the result in xmm1, using imm8 for sign control and mantissa interval normalization, under writemask. |
VGETMANTPS ymm1 {k1}{z}, ymm2/m256/m32bcst, imm8 | EVEX.256.66.0F3A.W0 26 /r ib | avx512 | Get normalized mantissa from float32 vector ymm2/m256/m32bcst and store the result in ymm1, using imm8 for sign control and mantissa interval normalization, under writemask. |
VGETMANTPS zmm1 {k1}{z}, zmm2/m512/m32bcst{sae}, imm8 | EVEX.512.66.0F3A.W0 26 /r ib | avx512 | Get normalized mantissa from float32 vector zmm2/m512/m32bcst and store the result in zmm1, using imm8 for sign control and mantissa interval normalization, under writemask. |
VGETMANTSD xmm1 {k1}{z}, xmm2, xmm3/m64{sae}, imm8 | EVEX.NDS.LIG.66.0F3A.W1 27 /r ib | avx512 | Extract the normalized mantissa of the low float64 element in xmm3/m64 using imm8 for sign control and mantissa interval normalization. Store the mantissa to xmm1 under the writemask k1 and merge with the other elements of xmm2. |
VGETMANTSS xmm1 {k1}{z}, xmm2, xmm3/m32{sae}, imm8 | EVEX.NDS.LIG.66.0F3A.W0 27 /r ib | avx512 | Extract the normalized mantissa from the low float32 element of xmm3/m32 using imm8 for sign control and mantissa interval normalization, store the mantissa to xmm1 under the writemask k1 and merge with the other elements of xmm2. |
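
VGETEXP and VGETMANT together decompose a float into mant * 2^exp, with the imm8 of VGETMANT selecting the normalization interval and sign handling. A minimal sketch, assuming AVX-512F and AVX-512VL (e.g. gcc -mavx512f -mavx512vl):

```c
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    __m128d x = _mm_set1_pd(48.0);   /* 48 = 1.5 * 2^5 */

    __m128d e = _mm_getexp_pd(x);    /* unbiased exponent, returned as a double */
    __m128d m = _mm_getmant_pd(x, _MM_MANT_NORM_1_2, _MM_MANT_SIGN_src);

    double ev[2], mv[2];
    _mm_storeu_pd(ev, e);
    _mm_storeu_pd(mv, m);
    printf("48 = %g * 2^%g\n", mv[0], ev[0]);  /* 1.5 * 2^5 */
    return 0;
}
```
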
VINSERTF128 ymm1, ymm2, xmm3/m128, imm8 | VEX.NDS.256.66.0F3A.W0 18 /r ib | avx | Insert 128 bits of packed floating-point values from xmm3/m128 and the remaining values from ymm2 into ymm1. |
VINSERTF32X4 ymm1 {k1}{z}, ymm2, xmm3/m128, imm8 | EVEX.NDS.256.66.0F3A.W0 18 /r ib | avx512 | Insert 128 bits of packed single-precision floating-point values from xmm3/m128 and the remaining values from ymm2 into ymm1 under writemask k1. |
VINSERTF32X4 zmm1 {k1}{z}, zmm2, xmm3/m128, imm8 | EVEX.NDS.512.66.0F3A.W0 18 /r ib | avx512 | Insert 128 bits of packed single-precision floating-point values from xmm3/m128 and the remaining values from zmm2 into zmm1 under writemask k1. |
VINSERTF64X2 ymm1 {k1}{z}, ymm2, xmm3/m128, imm8 | EVEX.NDS.256.66.0F3A.W1 18 /r ib | avx512 | Insert 128 bits of packed double-precision floating-point values from xmm3/m128 and the remaining values from ymm2 into ymm1 under writemask k1. |
VINSERTF64X2 zmm1 {k1}{z}, zmm2, xmm3/m128, imm8 | EVEX.NDS.512.66.0F3A.W1 18 /r ib | avx512 | Insert 128 bits of packed double-precision floating-point values from xmm3/m128 and the remaining values from zmm2 into zmm1 under writemask k1. |
VINSERTF32X8 zmm1 {k1}{z}, zmm2, ymm3/m256, imm8 | EVEX.NDS.512.66.0F3A.W0 1A /r ib | avx512 | Insert 256 bits of packed single-precision floating-point values from ymm3/m256 and the remaining values from zmm2 into zmm1 under writemask k1. |
VINSERTF64X4 zmm1 {k1}{z}, zmm2, ymm3/m256, imm8 | EVEX.NDS.512.66.0F3A.W1 1A /r ib | avx512 | Insert 256 bits of packed double-precision floating-point values from ymm3/m256 and the remaining values from zmm2 into zmm1 under writemask k1. |
VINSERTI128 ymm1, ymm2, xmm3/m128, imm8 | VEX.NDS.256.66.0F3A.W0 38 /r ib | avx2 | Insert 128 bits of integer data from xmm3/m128 and the remaining values from ymm2 into ymm1. |
VINSERTI32X4 ymm1 {k1}{z}, ymm2, xmm3/m128, imm8 | EVEX.NDS.256.66.0F3A.W0 38 /r ib | avx512 | Insert 128 bits of packed doubleword integer values from xmm3/m128 and the remaining values from ymm2 into ymm1 under writemask k1. |
VINSERTI32X4 zmm1 {k1}{z}, zmm2, xmm3/m128, imm8 | EVEX.NDS.512.66.0F3A.W0 38 /r ib | avx512 | Insert 128 bits of packed doubleword integer values from xmm3/m128 and the remaining values from zmm2 into zmm1 under writemask k1. |
VINSERTI64X2 ymm1 {k1}{z}, ymm2, xmm3/m128, imm8 | EVEX.NDS.256.66.0F3A.W1 38 /r ib | avx512 | Insert 128 bits of packed quadword integer values from xmm3/m128 and the remaining values from ymm2 into ymm1 under writemask k1. |
VINSERTI64X2 zmm1 {k1}{z}, zmm2, xmm3/m128, imm8 | EVEX.NDS.512.66.0F3A.W1 38 /r ib | avx512 | Insert 128 bits of packed quadword integer values from xmm3/m128 and the remaining values from zmm2 into zmm1 under writemask k1. |
VINSERTI32X8 zmm1 {k1}{z}, zmm2, ymm3/m256, imm8 | EVEX.NDS.512.66.0F3A.W0 3A /r ib | avx512 | Insert 256 bits of packed doubleword integer values from ymm3/m256 and the remaining values from zmm2 into zmm1 under writemask k1. |
VINSERTI64X4 zmm1 {k1}{z}, zmm2, ymm3/m256, imm8 | EVEX.NDS.512.66.0F3A.W1 3A /r ib | avx512 | Insert 256 bits of packed quadword integer values from ymm3/m256 and the remaining values from zmm2 into zmm1 under writemask k1. |
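
The VINSERT forms overwrite the 128- or 256-bit lane selected by imm8 and copy the remaining lanes from the second source unchanged. A minimal sketch with the AVX form, assuming gcc -mavx:

```c
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    __m256 dst = _mm256_set1_ps(0.0f);
    __m128 src = _mm_set1_ps(7.0f);

    /* VINSERTF128 ymm, ymm, xmm, 1: replace the upper 128-bit lane */
    __m256 r = _mm256_insertf128_ps(dst, src, 1);

    float out[8];
    _mm256_storeu_ps(out, r);
    for (int i = 0; i < 8; i++) printf("%g ", out[i]);  /* 0 0 0 0 7 7 7 7 */
    printf("\n");
    return 0;
}
```
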
VMASKMOVPS xmm1, xmm2, m128 | VEX.NDS.128.66.0F38.W0 2C /r | avx | Conditionally load packed single-precision values from m128 using mask in xmm2 and store in xmm1. |
VMASKMOVPS ymm1, ymm2, m256 | VEX.NDS.256.66.0F38.W0 2C /r | avx | Conditionally load packed single-precision values from m256 using mask in ymm2 and store in ymm1. |
VMASKMOVPD xmm1, xmm2, m128 | VEX.NDS.128.66.0F38.W0 2D /r | avx | Conditionally load packed double-precision values from m128 using mask in xmm2 and store in xmm1. |
VMASKMOVPD ymm1, ymm2, m256 | VEX.NDS.256.66.0F38.W0 2D /r | avx | Conditionally load packed double-precision values from m256 using mask in ymm2 and store in ymm1. |
VMASKMOVPS m128, xmm1, xmm2 | VEX.NDS.128.66.0F38.W0 2E /r | avx | Conditionally store packed single-precision values from xmm2 using mask in xmm1. |
VMASKMOVPS m256, ymm1, ymm2 | VEX.NDS.256.66.0F38.W0 2E /r | avx | Conditionally store packed single-precision values from ymm2 using mask in ymm1. |
VMASKMOVPD m128, xmm1, xmm2 | VEX.NDS.128.66.0F38.W0 2F /r | avx | Conditionally store packed double-precision values from xmm2 using mask in xmm1. |
VMASKMOVPD m256, ymm1, ymm2 | VEX.NDS.256.66.0F38.W0 2F /r | avx | Conditionally store packed double-precision values from ymm2 using mask in ymm1. |
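
VMASKMOV loads and stores are handy for loop tails: lanes whose mask high bit is clear are neither read nor written, so touching the unused part of a buffer cannot fault. A minimal sketch, assuming gcc -mavx:

```c
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    float buf[8] = {1, 2, 3, 4, 5, 6, 7, 8};

    /* Mask = high bit per 32-bit lane; load only the first 3 elements. */
    __m256i mask = _mm256_set_epi32(0, 0, 0, 0, 0, -1, -1, -1);
    __m256 v = _mm256_maskload_ps(buf, mask);   /* masked-off lanes read as 0 */

    float out[8];
    _mm256_storeu_ps(out, v);
    for (int i = 0; i < 8; i++) printf("%g ", out[i]);  /* 1 2 3 0 0 0 0 0 */
    printf("\n");
    return 0;
}
```
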
VPBLENDD xmm1, xmm2, xmm3/m128, imm8 | VEX.NDS.128.66.0F3A.W0 02 /r ib | avx2 | Select dwords from xmm2 and xmm3/m128 from mask specified in imm8 and store the values into xmm1. |
VPBLENDD ymm1, ymm2, ymm3/m256, imm8 | VEX.NDS.256.66.0F3A.W0 02 /r ib | avx2 | Select dwords from ymm2 and ymm3/m256 from mask specified in imm8 and store the values into ymm1. |
VPBLENDMB xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F38.W0 66 /r | avx512 | Blend byte integer vector xmm2 and byte vector xmm3/m128 and store the result in xmm1, under control mask. |
VPBLENDMB ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F38.W0 66 /r | avx512 | Blend byte integer vector ymm2 and byte vector ymm3/m256 and store the result in ymm1, under control mask. |
VPBLENDMB zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F38.W0 66 /r | avx512 | Blend byte integer vector zmm2 and byte vector zmm3/m512 and store the result in zmm1, under control mask. |
VPBLENDMW xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F38.W1 66 /r | avx512 | Blend word integer vector xmm2 and word vector xmm3/m128 and store the result in xmm1, under control mask. |
VPBLENDMW ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F38.W1 66 /r | avx512 | Blend word integer vector ymm2 and word vector ymm3/m256 and store the result in ymm1, under control mask. |
VPBLENDMW zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F38.W1 66 /r | avx512 | Blend word integer vector zmm2 and word vector zmm3/m512 and store the result in zmm1, under control mask. |
VPBLENDMD xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 64 /r | avx512 | Blend doubleword integer vector xmm2 and doubleword vector xmm3/m128/m32bcst and store the result in xmm1, under control mask. |
VPBLENDMD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 64 /r | avx512 | Blend doubleword integer vector ymm2 and doubleword vector ymm3/m256/m32bcst and store the result in ymm1, under control mask. |
VPBLENDMD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F38.W0 64 /r | avx512 | Blend doubleword integer vector zmm2 and doubleword vector zmm3/m512/m32bcst and store the result in zmm1, under control mask. |
VPBLENDMQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 64 /r | avx512 | Blend quadword integer vector xmm2 and quadword vector xmm3/m128/m64bcst and store the result in xmm1, under control mask. |
VPBLENDMQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 64 /r | avx512 | Blend quadword integer vector ymm2 and quadword vector ymm3/m256/m64bcst and store the result in ymm1, under control mask. |
VPBLENDMQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F38.W1 64 /r | avx512 | Blend quadword integer vector zmm2 and quadword vector zmm3/m512/m64bcst and store the result in zmm1, under control mask. |
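
VPBLENDM is a per-element select driven by an opmask: a set bit takes the element from the second source, a clear bit from the first. A minimal sketch, assuming AVX-512F (e.g. gcc -mavx512f):

```c
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    __m512i a = _mm512_set1_epi32(1);
    __m512i b = _mm512_set1_epi32(2);

    /* Mask bit set -> take the element from b, clear -> from a */
    __mmask16 k = 0x00FF;
    __m512i r = _mm512_mask_blend_epi32(k, a, b);   /* maps to VPBLENDMD */

    int out[16];
    _mm512_storeu_si512(out, r);
    for (int i = 0; i < 16; i++) printf("%d ", out[i]);  /* 2 x8, then 1 x8 */
    printf("\n");
    return 0;
}
```
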
VPBROADCASTB xmm1, xmm2/m8 | VEX.128.66.0F38.W0 78 /r | avx2 | Broadcast a byte integer in the source operand to sixteen locations in xmm1. |
VPBROADCASTB ymm1, xmm2/m8 | VEX.256.66.0F38.W0 78 /r | avx2 | Broadcast a byte integer in the source operand to thirty-two locations in ymm1. |
VPBROADCASTB xmm1{k1}{z}, xmm2/m8 | EVEX.128.66.0F38.W0 78 /r | avx512 | Broadcast a byte integer in the source operand to locations in xmm1 subject to writemask k1. |
VPBROADCASTB ymm1{k1}{z}, xmm2/m8 | EVEX.256.66.0F38.W0 78 /r | avx512 | Broadcast a byte integer in the source operand to locations in ymm1 subject to writemask k1. |
VPBROADCASTB zmm1{k1}{z}, xmm2/m8 | EVEX.512.66.0F38.W0 78 /r | avx512 | Broadcast a byte integer in the source operand to 64 locations in zmm1 subject to writemask k1. |
VPBROADCASTW xmm1, xmm2/m16 | VEX.128.66.0F38.W0 79 /r | avx2 | Broadcast a word integer in the source operand to eight locations in xmm1. |
VPBROADCASTW ymm1, xmm2/m16 | VEX.256.66.0F38.W0 79 /r | avx2 | Broadcast a word integer in the source operand to sixteen locations in ymm1. |
VPBROADCASTW xmm1{k1}{z}, xmm2/m16 | EVEX.128.66.0F38.W0 79 /r | avx512 | Broadcast a word integer in the source operand to locations in xmm1 subject to writemask k1. |
VPBROADCASTW ymm1{k1}{z}, xmm2/m16 | EVEX.256.66.0F38.W0 79 /r | avx512 | Broadcast a word integer in the source operand to locations in ymm1 subject to writemask k1. |
VPBROADCASTW zmm1{k1}{z}, xmm2/m16 | EVEX.512.66.0F38.W0 79 /r | avx512 | Broadcast a word integer in the source operand to 32 locations in zmm1 subject to writemask k1. |
VPBROADCASTD xmm1, xmm2/m32 | VEX.128.66.0F38.W0 58 /r | avx2 | Broadcast a dword integer in the source operand to four locations in xmm1. |
VPBROADCASTD ymm1, xmm2/m32 | VEX.256.66.0F38.W0 58 /r | avx2 | Broadcast a dword integer in the source operand to eight locations in ymm1. |
VPBROADCASTD xmm1 {k1}{z}, xmm2/m32 | EVEX.128.66.0F38.W0 58 /r | avx512 | Broadcast a dword integer in the source operand to locations in xmm1 subject to writemask k1. |
VPBROADCASTD ymm1 {k1}{z}, xmm2/m32 | EVEX.256.66.0F38.W0 58 /r | avx512 | Broadcast a dword integer in the source operand to locations in ymm1 subject to writemask k1. |
VPBROADCASTD zmm1 {k1}{z}, xmm2/m32 | EVEX.512.66.0F38.W0 58 /r | avx512 | Broadcast a dword integer in the source operand to locations in zmm1 subject to writemask k1. |
VPBROADCASTQ xmm1, xmm2/m64 | VEX.128.66.0F38.W0 59 /r | avx2 | Broadcast a qword element in source operand to two locations in xmm1. |
VPBROADCASTQ ymm1, xmm2/m64 | VEX.256.66.0F38.W0 59 /r | avx2 | Broadcast a qword element in source operand to four locations in ymm1. |
VPBROADCASTQ xmm1 {k1}{z}, xmm2/m64 | EVEX.128.66.0F38.W1 59 /r | avx512 | Broadcast a qword element in source operand to locations in xmm1 subject to writemask k1. |
VPBROADCASTQ ymm1 {k1}{z}, xmm2/m64 | EVEX.256.66.0F38.W1 59 /r | avx512 | Broadcast a qword element in source operand to locations in ymm1 subject to writemask k1. |
VPBROADCASTQ zmm1 {k1}{z}, xmm2/m64 | EVEX.512.66.0F38.W1 59 /r | avx512 | Broadcast a qword element in source operand to locations in zmm1 subject to writemask k1. |
VBROADCASTI32X2 xmm1 {k1}{z}, xmm2/m64 | EVEX.128.66.0F38.W0 59 /r | avx512 | Broadcast two dword elements in source operand to locations in xmm1 subject to writemask k1. |
VBROADCASTI32X2 ymm1 {k1}{z}, xmm2/m64 | EVEX.256.66.0F38.W0 59 /r | avx512 | Broadcast two dword elements in source operand to locations in ymm1 subject to writemask k1. |
VBROADCASTI32X2 zmm1 {k1}{z}, xmm2/m64 | EVEX.512.66.0F38.W0 59 /r | avx512 | Broadcast two dword elements in source operand to locations in zmm1 subject to writemask k1. |
VBROADCASTI128 ymm1, m128 | VEX.256.66.0F38.W0 5A /r | avx2 | Broadcast 128 bits of integer data in mem to low and high 128-bits in ymm1. |
VBROADCASTI32X4 ymm1 {k1}{z}, m128 | EVEX.256.66.0F38.W0 5A /r | avx512 | Broadcast 128 bits of 4 doubleword integer data in mem to locations in ymm1 using writemask k1. |
VBROADCASTI32X4 zmm1 {k1}{z}, m128 | EVEX.512.66.0F38.W0 5A /r | avx512 | Broadcast 128 bits of 4 doubleword integer data in mem to locations in zmm1 using writemask k1. |
VBROADCASTI64X2 ymm1 {k1}{z}, m128 | EVEX.256.66.0F38.W1 5A /r | avx512 | Broadcast 128 bits of 2 quadword integer data in mem to locations in ymm1 using writemask k1. |
VBROADCASTI64X2 zmm1 {k1}{z}, m128 | EVEX.512.66.0F38.W1 5A /r | avx512 | Broadcast 128 bits of 2 quadword integer data in mem to locations in zmm1 using writemask k1. |
VBROADCASTI32X8 zmm1 {k1}{z}, m256 | EVEX.512.66.0F38.W0 5B /r | avx512 | Broadcast 256 bits of 8 doubleword integer data in mem to locations in zmm1 using writemask k1. |
VBROADCASTI64X4 zmm1 {k1}{z}, m256 | EVEX.512.66.0F38.W1 5B /r | avx512 | Broadcast 256 bits of 4 quadword integer data in mem to locations in zmm1 using writemask k1. |
VPBROADCASTB xmm1 {k1}{z}, reg | EVEX.128.66.0F38.W0 7A /r | avx512 | Broadcast an 8-bit value from a GPR to all bytes in the 128-bit destination subject to writemask k1. |
VPBROADCASTB ymm1 {k1}{z}, reg | EVEX.256.66.0F38.W0 7A /r | avx512 | Broadcast an 8-bit value from a GPR to all bytes in the 256-bit destination subject to writemask k1. |
VPBROADCASTB zmm1 {k1}{z}, reg | EVEX.512.66.0F38.W0 7A /r | avx512 | Broadcast an 8-bit value from a GPR to all bytes in the 512-bit destination subject to writemask k1. |
VPBROADCASTW xmm1 {k1}{z}, reg | EVEX.128.66.0F38.W0 7B /r | avx512 | Broadcast a 16-bit value from a GPR to all words in the 128-bit destination subject to writemask k1. |
VPBROADCASTW ymm1 {k1}{z}, reg | EVEX.256.66.0F38.W0 7B /r | avx512 | Broadcast a 16-bit value from a GPR to all words in the 256-bit destination subject to writemask k1. |
VPBROADCASTW zmm1 {k1}{z}, reg | EVEX.512.66.0F38.W0 7B /r | avx512 | Broadcast a 16-bit value from a GPR to all words in the 512-bit destination subject to writemask k1. |
VPBROADCASTD xmm1 {k1}{z}, r32 | EVEX.128.66.0F38.W0 7C /r | avx512 | Broadcast a 32-bit value from a GPR to all double-words in the 128-bit destination subject to writemask k1. |
VPBROADCASTD ymm1 {k1}{z}, r32 | EVEX.256.66.0F38.W0 7C /r | avx512 | Broadcast a 32-bit value from a GPR to all double-words in the 256-bit destination subject to writemask k1. |
VPBROADCASTD zmm1 {k1}{z}, r32 | EVEX.512.66.0F38.W0 7C /r | avx512 | Broadcast a 32-bit value from a GPR to all double-words in the 512-bit destination subject to writemask k1. |
VPBROADCASTQ xmm1 {k1}{z}, r64 | EVEX.128.66.0F38.W1 7C /r | avx512 | Broadcast a 64-bit value from a GPR to all quad-words in the 128-bit destination subject to writemask k1. |
VPBROADCASTQ ymm1 {k1}{z}, r64 | EVEX.256.66.0F38.W1 7C /r | avx512 | Broadcast a 64-bit value from a GPR to all quad-words in the 256-bit destination subject to writemask k1. |
VPBROADCASTQ zmm1 {k1}{z}, r64 | EVEX.512.66.0F38.W1 7C /r | avx512 | Broadcast a 64-bit value from a GPR to all quad-words in the 512-bit destination subject to writemask k1. |
VPBROADCASTMB2Q xmm1, k1 | EVEX.128.F3.0F38.W1 2A /r | avx512 | Broadcast low byte value in k1 to two locations in xmm1. |
VPBROADCASTMB2Q ymm1, k1 | EVEX.256.F3.0F38.W1 2A /r | avx512 | Broadcast low byte value in k1 to four locations in ymm1. |
VPBROADCASTMB2Q zmm1, k1 | EVEX.512.F3.0F38.W1 2A /r | avx512 | Broadcast low byte value in k1 to eight locations in zmm1. |
VPBROADCASTMW2D xmm1, k1 | EVEX.128.F3.0F38.W0 3A /r | avx512 | Broadcast low word value in k1 to four locations in xmm1. |
VPBROADCASTMW2D ymm1, k1 | EVEX.256.F3.0F38.W0 3A /r | avx512 | Broadcast low word value in k1 to eight locations in ymm1. |
VPBROADCASTMW2D zmm1, k1 | EVEX.512.F3.0F38.W0 3A /r | avx512 | Broadcast low word value in k1 to sixteen locations in zmm1. |
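
Compilers typically reach the VPBROADCAST forms through the set1/broadcast intrinsics, whether the scalar starts in memory, an xmm register, or a GPR. A minimal sketch, assuming AVX2 (e.g. gcc -mavx2):

```c
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    /* VPBROADCASTD: one dword replicated to all eight lanes */
    __m256i v = _mm256_set1_epi32(42);

    int out[8];
    _mm256_storeu_si256((__m256i *)out, v);
    for (int i = 0; i < 8; i++) printf("%d ", out[i]);  /* 42 x8 */
    printf("\n");
    return 0;
}
```
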
VPCMPB k1 {k2}, xmm2, xmm3/m128, imm8 | EVEX.NDS.128.66.0F3A.W0 3F /r ib | avx512 | Compare packed signed byte values in xmm3/m128 and xmm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPB k1 {k2}, ymm2, ymm3/m256, imm8 | EVEX.NDS.256.66.0F3A.W0 3F /r ib | avx512 | Compare packed signed byte values in ymm3/m256 and ymm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPB k1 {k2}, zmm2, zmm3/m512, imm8 | EVEX.NDS.512.66.0F3A.W0 3F /r ib | avx512 | Compare packed signed byte values in zmm3/m512 and zmm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPUB k1 {k2}, xmm2, xmm3/m128, imm8 | EVEX.NDS.128.66.0F3A.W0 3E /r ib | avx512 | Compare packed unsigned byte values in xmm3/m128 and xmm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPUB k1 {k2}, ymm2, ymm3/m256, imm8 | EVEX.NDS.256.66.0F3A.W0 3E /r ib | avx512 | Compare packed unsigned byte values in ymm3/m256 and ymm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPUB k1 {k2}, zmm2, zmm3/m512, imm8 | EVEX.NDS.512.66.0F3A.W0 3E /r ib | avx512 | Compare packed unsigned byte values in zmm3/m512 and zmm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPD k1 {k2}, xmm2, xmm3/m128/m32bcst, imm8 | EVEX.NDS.128.66.0F3A.W0 1F /r ib | avx512 | Compare packed signed doubleword integer values in xmm3/m128/m32bcst and xmm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPD k1 {k2}, ymm2, ymm3/m256/m32bcst, imm8 | EVEX.NDS.256.66.0F3A.W0 1F /r ib | avx512 | Compare packed signed doubleword integer values in ymm3/m256/m32bcst and ymm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPD k1 {k2}, zmm2, zmm3/m512/m32bcst, imm8 | EVEX.NDS.512.66.0F3A.W0 1F /r ib | avx512 | Compare packed signed doubleword integer values in zmm2 and zmm3/m512/m32bcst using bits 2:0 of imm8 as a comparison predicate. The comparison results are written to the destination k1 under writemask k2. |
VPCMPUD k1 {k2}, xmm2, xmm3/m128/m32bcst, imm8 | EVEX.NDS.128.66.0F3A.W0 1E /r ib | avx512 | Compare packed unsigned doubleword integer values in xmm3/m128/m32bcst and xmm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPUD k1 {k2}, ymm2, ymm3/m256/m32bcst, imm8 | EVEX.NDS.256.66.0F3A.W0 1E /r ib | avx512 | Compare packed unsigned doubleword integer values in ymm3/m256/m32bcst and ymm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPUD k1 {k2}, zmm2, zmm3/m512/m32bcst, imm8 | EVEX.NDS.512.66.0F3A.W0 1E /r ib | avx512 | Compare packed unsigned doubleword integer values in zmm2 and zmm3/m512/m32bcst using bits 2:0 of imm8 as a comparison predicate. The comparison results are written to the destination k1 under writemask k2. |
VPCMPQ k1 {k2}, xmm2, xmm3/m128/m64bcst, imm8 | EVEX.NDS.128.66.0F3A.W1 1F /r ib | avx512 | Compare packed signed quadword integer values in xmm3/m128/m64bcst and xmm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPQ k1 {k2}, ymm2, ymm3/m256/m64bcst, imm8 | EVEX.NDS.256.66.0F3A.W1 1F /r ib | avx512 | Compare packed signed quadword integer values in ymm3/m256/m64bcst and ymm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPQ k1 {k2}, zmm2, zmm3/m512/m64bcst, imm8 | EVEX.NDS.512.66.0F3A.W1 1F /r ib | avx512 | Compare packed signed quadword integer values in zmm3/m512/m64bcst and zmm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPUQ k1 {k2}, xmm2, xmm3/m128/m64bcst, imm8 | EVEX.NDS.128.66.0F3A.W1 1E /r ib | avx512 | Compare packed unsigned quadword integer values in xmm3/m128/m64bcst and xmm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPUQ k1 {k2}, ymm2, ymm3/m256/m64bcst, imm8 | EVEX.NDS.256.66.0F3A.W1 1E /r ib | avx512 | Compare packed unsigned quadword integer values in ymm3/m256/m64bcst and ymm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPUQ k1 {k2}, zmm2, zmm3/m512/m64bcst, imm8 | EVEX.NDS.512.66.0F3A.W1 1E /r ib | avx512 | Compare packed unsigned quadword integer values in zmm3/m512/m64bcst and zmm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPW k1 {k2}, xmm2, xmm3/m128, imm8 | EVEX.NDS.128.66.0F3A.W1 3F /r ib | avx512 | Compare packed signed word integers in xmm3/m128 and xmm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPW k1 {k2}, ymm2, ymm3/m256, imm8 | EVEX.NDS.256.66.0F3A.W1 3F /r ib | avx512 | Compare packed signed word integers in ymm3/m256 and ymm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPW k1 {k2}, zmm2, zmm3/m512, imm8 | EVEX.NDS.512.66.0F3A.W1 3F /r ib | avx512 | Compare packed signed word integers in zmm3/m512 and zmm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPUW k1 {k2}, xmm2, xmm3/m128, imm8 | EVEX.NDS.128.66.0F3A.W1 3E /r ib | avx512 | Compare packed unsigned word integers in xmm3/m128 and xmm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPUW k1 {k2}, ymm2, ymm3/m256, imm8 | EVEX.NDS.256.66.0F3A.W1 3E /r ib | avx512 | Compare packed unsigned word integers in ymm3/m256 and ymm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
VPCMPUW k1 {k2}, zmm2, zmm3/m512, imm8 | EVEX.NDS.512.66.0F3A.W1 3E /r ib | avx512 | Compare packed unsigned word integers in zmm3/m512 and zmm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1. |
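
The 3-bit comparison predicate in imm8 used by the VPCMP family encodes, in order 0-7: EQ, LT, LE, FALSE, NE, NLT, NLE, TRUE; the result is one bit per element in a mask register rather than a vector of all-ones/all-zeros. A minimal sketch, assuming AVX-512F (e.g. gcc -mavx512f):

```c
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    /* Element i holds the value i, for i = 0..15 */
    __m512i a = _mm512_set_epi32(15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0);
    __m512i b = _mm512_set1_epi32(8);

    /* VPCMPD k, zmm, zmm, 1 (LT): one mask bit per dword */
    __mmask16 k = _mm512_cmp_epi32_mask(a, b, _MM_CMPINT_LT);
    printf("a < 8 mask: 0x%04x\n", (unsigned)k);  /* 0x00ff */
    return 0;
}
```
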
VPCOMPRESSD xmm1/m128 {k1}{z}, xmm2 | EVEX.128.66.0F38.W0 8B /r | avx512 | Compress packed doubleword integer values from xmm2 to xmm1/m128 using controlmask k1. |
VPCOMPRESSD ymm1/m256 {k1}{z}, ymm2 | EVEX.256.66.0F38.W0 8B /r | avx512 | Compress packed doubleword integer values from ymm2 to ymm1/m256 using controlmask k1. |
VPCOMPRESSD zmm1/m512 {k1}{z}, zmm2 | EVEX.512.66.0F38.W0 8B /r | avx512 | Compress packed doubleword integer values from zmm2 to zmm1/m512 using controlmask k1. |
VPCOMPRESSQ xmm1/m128 {k1}{z}, xmm2 | EVEX.128.66.0F38.W1 8B /r | avx512 | Compress packed quadword integer values from xmm2 to xmm1/m128 using controlmask k1. |
VPCOMPRESSQ ymm1/m256 {k1}{z}, ymm2 | EVEX.256.66.0F38.W1 8B /r | avx512 | Compress packed quadword integer values from ymm2 to ymm1/m256 using controlmask k1. |
VPCOMPRESSQ zmm1/m512 {k1}{z}, zmm2 | EVEX.512.66.0F38.W1 8B /r | avx512 | Compress packed quadword integer values from zmm2 to zmm1/m512 using controlmask k1. |
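
VPCOMPRESS packs only the opmask-selected elements contiguously into the destination, which makes it the building block for vectorized filtering (stream compaction). A minimal sketch that keeps the even values of a vector, assuming AVX-512F (e.g. gcc -mavx512f):

```c
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    /* Element i holds the value i, for i = 0..15 */
    __m512i v = _mm512_set_epi32(15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0);

    /* Mask of even elements, then VPCOMPRESSD packs them to the front */
    __mmask16 even = _mm512_cmp_epi32_mask(
        _mm512_and_si512(v, _mm512_set1_epi32(1)),
        _mm512_setzero_si512(), _MM_CMPINT_EQ);

    int out[16] = {0};
    _mm512_mask_compressstoreu_epi32(out, even, v);
    for (int i = 0; i < 8; i++) printf("%d ", out[i]);  /* 0 2 4 6 8 10 12 14 */
    printf("\n");
    return 0;
}
```
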
VPCONFLICTD xmm1 {k1}{z}, xmm2/m128/m32bcst | EVEX.128.66.0F38.W0 C4 /r | avx512 | Detect duplicate double-word values in xmm2/m128/m32bcst using writemask k1. |
VPCONFLICTD ymm1 {k1}{z}, ymm2/m256/m32bcst | EVEX.256.66.0F38.W0 C4 /r | avx512 | Detect duplicate double-word values in ymm2/m256/m32bcst using writemask k1. |
VPCONFLICTD zmm1 {k1}{z}, zmm2/m512/m32bcst | EVEX.512.66.0F38.W0 C4 /r | avx512 | Detect duplicate double-word values in zmm2/m512/m32bcst using writemask k1. |
VPCONFLICTQ xmm1 {k1}{z}, xmm2/m128/m64bcst | EVEX.128.66.0F38.W1 C4 /r | avx512 | Detect duplicate quad-word values in xmm2/m128/m64bcst using writemask k1. |
VPCONFLICTQ ymm1 {k1}{z}, ymm2/m256/m64bcst | EVEX.256.66.0F38.W1 C4 /r | avx512 | Detect duplicate quad-word values in ymm2/m256/m64bcst using writemask k1. |
VPCONFLICTQ zmm1 {k1}{z}, zmm2/m512/m64bcst | EVEX.512.66.0F38.W1 C4 /r | avx512 | Detect duplicate quad-word values in zmm2/m512/m64bcst using writemask k1. |
VPERM2F128 ymm1, ymm2, ymm3/m256, imm8 | VEX.NDS.256.66.0F3A.W0 06 /r ib | avx | Permute 128-bit floating-point fields in ymm2 and ymm3/mem using controls from imm8 and store result in ymm1. |
VPERM2I128 ymm1, ymm2, ymm3/m256, imm8 | VEX.NDS.256.66.0F3A.W0 46 /r ib | avx2 | Permute 128-bit integer data in ymm2 and ymm3/mem using controls from imm8 and store result in ymm1. |
VPERMD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W0 36 /r | avx2 | Permute doublewords in ymm3/m256 using indices in ymm2 and store the result in ymm1. |
VPERMD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 36 /r | avx512 | Permute doublewords in ymm3/m256/m32bcst using indexes in ymm2 and store the result in ymm1 using writemask k1. |
VPERMD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F38.W0 36 /r | avx512 | Permute doublewords in zmm3/m512/m32bcst using indices in zmm2 and store the result in zmm1 using writemask k1. |
VPERMW xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F38.W1 8D /r | avx512 | Permute word integers in xmm3/m128 using indexes in xmm2 and store the result in xmm1 using writemask k1. |
VPERMW ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F38.W1 8D /r | avx512 | Permute word integers in ymm3/m256 using indexes in ymm2 and store the result in ymm1 using writemask k1. |
VPERMW zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F38.W1 8D /r | avx512 | Permute word integers in zmm3/m512 using indexes in zmm2 and store the result in zmm1 using writemask k1. |
VPERMI2W xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.DDS.128.66.0F38.W1 75 /r | avx512 | Permute word integers from two tables in xmm3/m128 and xmm2 using indexes in xmm1 and store the result in xmm1 using writemask k1. |
VPERMI2W ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.DDS.256.66.0F38.W1 75 /r | avx512 | Permute word integers from two tables in ymm3/m256 and ymm2 using indexes in ymm1 and store the result in ymm1 using writemask k1. |
VPERMI2W zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.DDS.512.66.0F38.W1 75 /r | avx512 | Permute word integers from two tables in zmm3/m512 and zmm2 using indexes in zmm1 and store the result in zmm1 using writemask k1. |
VPERMI2D xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.DDS.128.66.0F38.W0 76 /r | avx512 | Permute double-words from two tables in xmm3/m128/m32bcst and xmm2 using indexes in xmm1 and store the result in xmm1 using writemask k1. |
VPERMI2D ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.DDS.256.66.0F38.W0 76 /r | avx512 | Permute double-words from two tables in ymm3/m256/m32bcst and ymm2 using indexes in ymm1 and store the result in ymm1 using writemask k1. |
VPERMI2D zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.DDS.512.66.0F38.W0 76 /r | avx512 | Permute double-words from two tables in zmm3/m512/m32bcst and zmm2 using indices in zmm1 and store the result in zmm1 using writemask k1. |
VPERMI2Q xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.DDS.128.66.0F38.W1 76 /r | avx512 | Permute quad-words from two tables in xmm3/m128/m64bcst and xmm2 using indexes in xmm1 and store the result in xmm1 using writemask k1. |
VPERMI2Q ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.DDS.256.66.0F38.W1 76 /r | avx512 | Permute quad-words from two tables in ymm3/m256/m64bcst and ymm2 using indexes in ymm1 and store the result in ymm1 using writemask k1. |
VPERMI2Q zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.DDS.512.66.0F38.W1 76 /r | avx512 | Permute quad-words from two tables in zmm3/m512/m64bcst and zmm2 using indices in zmm1 and store the result in zmm1 using writemask k1. |
VPERMI2PS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.DDS.128.66.0F38.W0 77 /r | avx512 | Permute single-precision FP values from two tables in xmm3/m128/m32bcst and xmm2 using indexes in xmm1 and store the result in xmm1 using writemask k1. |
VPERMI2PS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.DDS.256.66.0F38.W0 77 /r | avx512 | Permute single-precision FP values from two tables in ymm3/m256/m32bcst and ymm2 using indexes in ymm1 and store the result in ymm1 using writemask k1. |
VPERMI2PS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.DDS.512.66.0F38.W0 77 /r | avx512 | Permute single-precision FP values from two tables in zmm3/m512/m32bcst and zmm2 using indices in zmm1 and store the result in zmm1 using writemask k1. |
VPERMI2PD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.DDS.128.66.0F38.W1 77 /r | avx512 | Permute double-precision FP values from two tables in xmm3/m128/m64bcst and xmm2 using indexes in xmm1 and store the result in xmm1 using writemask k1. |
VPERMI2PD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.DDS.256.66.0F38.W1 77 /r | avx512 | Permute double-precision FP values from two tables in ymm3/m256/m64bcst and ymm2 using indexes in ymm1 and store the result in ymm1 using writemask k1. |
VPERMI2PD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.DDS.512.66.0F38.W1 77 /r | avx512 | Permute double-precision FP values from two tables in zmm3/m512/m64bcst and zmm2 using indices in zmm1 and store the result in zmm1 using writemask k1. |
VPERMILPD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W0 0D /r | avx | Permute double-precision floating-point values in xmm2 using controls from xmm3/m128 and store result in xmm1. |
VPERMILPD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W0 0D /r | avx | Permute double-precision floating-point values in ymm2 using controls from ymm3/m256 and store result in ymm1. |
VPERMILPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 0D /r | avx512 | Permute double-precision floating-point values in xmm2 using control from xmm3/m128/m64bcst and store the result in xmm1 using writemask k1. |
VPERMILPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 0D /r | avx512 | Permute double-precision floating-point values in ymm2 using control from ymm3/m256/m64bcst and store the result in ymm1 using writemask k1. |
VPERMILPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F38.W1 0D /r | avx512 | Permute double-precision floating-point values in zmm2 using control from zmm3/m512/m64bcst and store the result in zmm1 using writemask k1. |
VPERMILPD xmm1, xmm2/m128, imm8 | VEX.128.66.0F3A.W0 05 /r ib | avx | Permute double-precision floating-point values in xmm2/m128 using controls from imm8. |
VPERMILPD ymm1, ymm2/m256, imm8 | VEX.256.66.0F3A.W0 05 /r ib | avx | Permute double-precision floating-point values in ymm2/m256 using controls from imm8. |
VPERMILPD xmm1 {k1}{z}, xmm2/m128/m64bcst, imm8 | EVEX.128.66.0F3A.W1 05 /r ib | avx512 | Permute double-precision floating-point values in xmm2/m128/m64bcst using controls from imm8 and store the result in xmm1 using writemask k1. |
VPERMILPD ymm1 {k1}{z}, ymm2/m256/m64bcst, imm8 | EVEX.256.66.0F3A.W1 05 /r ib | avx512 | Permute double-precision floating-point values in ymm2/m256/m64bcst using controls from imm8 and store the result in ymm1 using writemask k1. |
VPERMILPD zmm1 {k1}{z}, zmm2/m512/m64bcst, imm8 | EVEX.512.66.0F3A.W1 05 /r ib | avx512 | Permute double-precision floating-point values in zmm2/m512/m64bcst using controls from imm8 and store the result in zmm1 using writemask k1. |
VPERMILPS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W0 0C /r | avx | Permute single-precision floating-point values in xmm2 using controls from xmm3/m128 and store result in xmm1. |
VPERMILPS xmm1, xmm2/m128, imm8 | VEX.128.66.0F3A.W0 04 /r ib | avx | Permute single-precision floating-point values in xmm2/m128 using controls from imm8 and store result in xmm1. |
VPERMILPS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W0 0C /r | avx | Permute single-precision floating-point values in ymm2 using controls from ymm3/m256 and store result in ymm1. |
VPERMILPS ymm1, ymm2/m256, imm8 | VEX.256.66.0F3A.W0 04 /r ib | avx | Permute single-precision floating-point values in ymm2/m256 using controls from imm8 and store result in ymm1. |
VPERMILPS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 0C /r | avx512 | Permute single-precision floating-point values in xmm2 using control from xmm3/m128/m32bcst and store the result in xmm1 using writemask k1. |
VPERMILPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 0C /r | avx512 | Permute single-precision floating-point values in ymm2 using control from ymm3/m256/m32bcst and store the result in ymm1 using writemask k1. |
VPERMILPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F38.W0 0C /r | avx512 | Permute single-precision floating-point values in zmm2 using control from zmm3/m512/m32bcst and store the result in zmm1 using writemask k1. |
VPERMILPS xmm1 {k1}{z}, xmm2/m128/m32bcst, imm8 | EVEX.128.66.0F3A.W0 04 /r ib | avx512 | Permute single-precision floating-point values in xmm2/m128/m32bcst using controls from imm8 and store the result in xmm1 using writemask k1. |
VPERMILPS ymm1 {k1}{z}, ymm2/m256/m32bcst, imm8 | EVEX.256.66.0F3A.W0 04 /r ib | avx512 | Permute single-precision floating-point values in ymm2/m256/m32bcst using controls from imm8 and store the result in ymm1 using writemask k1. |
VPERMILPS zmm1 {k1}{z}, zmm2/m512/m32bcst, imm8 | EVEX.512.66.0F3A.W0 04 /r ib | avx512 | Permute single-precision floating-point values in zmm2/m512/m32bcst using controls from imm8 and store the result in zmm1 using writemask k1. |
VPERMPD ymm1, ymm2/m256, imm8 | VEX.256.66.0F3A.W1 01 /r ib | avx2 | Permute double-precision floating-point elements in ymm2/m256 using indices in imm8 and store the result in ymm1. |
VPERMPD ymm1 {k1}{z}, ymm2/m256/m64bcst, imm8 | EVEX.256.66.0F3A.W1 01 /r ib | avx512 | Permute double-precision floating-point elements in ymm2/m256/m64bcst using indexes in imm8 and store the result in ymm1 subject to writemask k1. |
VPERMPD zmm1 {k1}{z}, zmm2/m512/m64bcst, imm8 | EVEX.512.66.0F3A.W1 01 /r ib | avx512 | Permute double-precision floating-point elements in zmm2/m512/m64bcst using indices in imm8 and store the result in zmm1 subject to writemask k1. |
VPERMPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 16 /r | avx512 | Permute double-precision floating-point elements in ymm3/m256/m64bcst using indexes in ymm2 and store the result in ymm1 subject to writemask k1. |
VPERMPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F38.W1 16 /r | avx512 | Permute double-precision floating-point elements in zmm3/m512/m64bcst using indices in zmm2 and store the result in zmm1 subject to writemask k1. |
VPERMPS ymm1, ymm2, ymm3/m256 | VEX.256.66.0F38.W0 16 /r | avx2 | Permute single-precision floating-point elements in ymm3/m256 using indices in ymm2 and store the result in ymm1. |
VPERMPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 16 /r | avx512 | Permute single-precision floating-point elements in ymm3/m256/m32bcst using indexes in ymm2 and store the result in ymm1 subject to writemask k1. |
VPERMPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F38.W0 16 /r | avx512 | Permute single-precision floating-point values in zmm3/m512/m32bcst using indices in zmm2 and store the result in zmm1 subject to writemask k1. |
VPERMQ ymm1, ymm2/m256, imm8 | VEX.256.66.0F3A.W1 00 /r ib | avx2 | Permute qwords in ymm2/m256 using indices in imm8 and store the result in ymm1. |
VPERMQ ymm1 {k1}{z}, ymm2/m256/m64bcst, imm8 | EVEX.256.66.0F3A.W1 00 /r ib | avx512 | Permute qwords in ymm2/m256/m64bcst using indexes in imm8 and store the result in ymm1 subject to writemask k1. |
VPERMQ zmm1 {k1}{z}, zmm2/m512/m64bcst, imm8 | EVEX.512.66.0F3A.W1 00 /r ib | avx512 | Permute qwords in zmm2/m512/m64bcst using indices in imm8 and store the result in zmm1 subject to writemask k1. |
VPERMQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 36 /r | avx512 | Permute qwords in ymm3/m256/m64bcst using indexes in ymm2 and store the result in ymm1 subject to writemask k1. |
VPERMQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F38.W1 36 /r | avx512 | Permute qwords in zmm3/m512/m64bcst using indices in zmm2 and store the result in zmm1 subject to writemask k1. |
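
The VPERM* family are full-width lane-crossing shuffles, unlike the 128-bit-lane-restricted PSHUFD. A minimal sketch of the AVX2 VPERMD form (assuming `-mavx2`):

```c
#include <immintrin.h>
#include <stdio.h>
#include <stdint.h>

int main(void) {
    int32_t v[8] = {0, 1, 2, 3, 4, 5, 6, 7};
    __m256i data = _mm256_loadu_si256((const __m256i *)v);
    __m256i idx  = _mm256_setr_epi32(7, 6, 5, 4, 3, 2, 1, 0); /* per-lane indices into data */
    __m256i rev  = _mm256_permutevar8x32_epi32(data, idx);    /* VPERMD */
    int32_t out[8];
    _mm256_storeu_si256((__m256i *)out, rev);
    for (int i = 0; i < 8; i++) printf("%d ", out[i]);        /* 7 6 5 4 3 2 1 0 */
    printf("\n");
    return 0;
}
```
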
VPEXPANDD xmm1 {k1}{z}, xmm2/m128 | EVEX.128.66.0F38.W0 89 /r | avx512 | Expand packed double-word integer values from xmm2/m128 to xmm1 using writemask k1. |
VPEXPANDD ymm1 {k1}{z}, ymm2/m256 | EVEX.256.66.0F38.W0 89 /r | avx512 | Expand packed double-word integer values from ymm2/m256 to ymm1 using writemask k1. |
VPEXPANDD zmm1 {k1}{z}, zmm2/m512 | EVEX.512.66.0F38.W0 89 /r | avx512 | Expand packed double-word integer values from zmm2/m512 to zmm1 using writemask k1. |
VPEXPANDQ xmm1 {k1}{z}, xmm2/m128 | EVEX.128.66.0F38.W1 89 /r | avx512 | Expand packed quad-word integer values from xmm2/m128 to xmm1 using writemask k1. |
VPEXPANDQ ymm1 {k1}{z}, ymm2/m256 | EVEX.256.66.0F38.W1 89 /r | avx512 | Expand packed quad-word integer values from ymm2/m256 to ymm1 using writemask k1. |
VPEXPANDQ zmm1 {k1}{z}, zmm2/m512 | EVEX.512.66.0F38.W1 89 /r | avx512 | Expand packed quad-word integer values from zmm2/m512 to zmm1 using writemask k1. |
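
VPEXPAND is the in-register inverse of VPCOMPRESS: the low, contiguous source elements are spread out to the lanes selected by the writemask. A minimal sketch (assuming AVX-512F and `-mavx512f`):

```c
#include <immintrin.h>
#include <stdio.h>
#include <stdint.h>

int main(void) {
    int32_t packed[16] = {10, 20, 30, 40, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
    __m512i src = _mm512_loadu_si512(packed);
    __mmask16 k = 0x1111;                            /* place into lanes 0, 4, 8, 12 */
    __m512i out = _mm512_maskz_expand_epi32(k, src); /* VPEXPANDD; unselected lanes zeroed */
    int32_t o[16];
    _mm512_storeu_si512(o, out);
    for (int i = 0; i < 16; i++) printf("%d ", o[i]); /* 10 0 0 0 20 0 0 0 30 0 0 0 40 0 0 0 */
    printf("\n");
    return 0;
}
```
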
VPGATHERDD xmm1 {k1}, vm32x | EVEX.128.66.0F38.W0 90 /vsib | avx512 | Using signed dword indices, gather dword values from memory using writemask k1 for merging-masking. |
VPGATHERDD ymm1 {k1}, vm32y | EVEX.256.66.0F38.W0 90 /vsib | avx512 | Using signed dword indices, gather dword values from memory using writemask k1 for merging-masking. |
VPGATHERDD zmm1 {k1}, vm32z | EVEX.512.66.0F38.W0 90 /vsib | avx512 | Using signed dword indices, gather dword values from memory using writemask k1 for merging-masking. |
VPGATHERDQ xmm1 {k1}, vm32x | EVEX.128.66.0F38.W1 90 /vsib | avx512 | Using signed dword indices, gather quadword values from memory using writemask k1 for merging-masking. |
VPGATHERDQ ymm1 {k1}, vm32x | EVEX.256.66.0F38.W1 90 /vsib | avx512 | Using signed dword indices, gather quadword values from memory using writemask k1 for merging-masking. |
VPGATHERDQ zmm1 {k1}, vm32y | EVEX.512.66.0F38.W1 90 /vsib | avx512 | Using signed dword indices, gather quadword values from memory using writemask k1 for merging-masking. |
VPGATHERDD xmm1, vm32x, xmm2 | VEX.DDS.128.66.0F38.W0 90 /r | avx2 | Using dword indices specified in vm32x, gather dword values from memory conditioned on mask specified by xmm2. Conditionally gathered elements are merged into xmm1. |
VPGATHERQD xmm1, vm64x, xmm2 | VEX.DDS.128.66.0F38.W0 91 /r | avx2 | Using qword indices specified in vm64x, gather dword values from memory conditioned on mask specified by xmm2. Conditionally gathered elements are merged into xmm1. |
VPGATHERDD ymm1, vm32y, ymm2 | VEX.DDS.256.66.0F38.W0 90 /r | avx2 | Using dword indices specified in vm32y, gather dword values from memory conditioned on mask specified by ymm2. Conditionally gathered elements are merged into ymm1. |
VPGATHERQD xmm1, vm64y, xmm2 | VEX.DDS.256.66.0F38.W0 91 /r | avx2 | Using qword indices specified in vm64y, gather dword values from memory conditioned on mask specified by xmm2. Conditionally gathered elements are merged into xmm1. |
VPGATHERDQ xmm1, vm32x, xmm2 | VEX.DDS.128.66.0F38.W1 90 /r | avx2 | Using dword indices specified in vm32x, gather qword values from memory conditioned on mask specified by xmm2. Conditionally gathered elements are merged into xmm1. |
VPGATHERQQ xmm1, vm64x, xmm2 | VEX.DDS.128.66.0F38.W1 91 /r | avx2 | Using qword indices specified in vm64x, gather qword values from memory conditioned on mask specified by xmm2. Conditionally gathered elements are merged into xmm1. |
VPGATHERDQ ymm1, vm32x, ymm2 | VEX.DDS.256.66.0F38.W1 90 /r | avx2 | Using dword indices specified in vm32x, gather qword values from memory conditioned on mask specified by ymm2. Conditionally gathered elements are merged into ymm1. |
VPGATHERQQ ymm1, vm64y, ymm2 | VEX.DDS.256.66.0F38.W1 91 /r | avx2 | Using qword indices specified in vm64y, gather qword values from memory conditioned on mask specified by ymm2. Conditionally gathered elements are merged into ymm1. |
VPGATHERQD xmm1 {k1}, vm64x | EVEX.128.66.0F38.W0 91 /vsib | avx512 | Using signed qword indices, gather dword values from memory using writemask k1 for merging-masking. |
VPGATHERQD xmm1 {k1}, vm64y | EVEX.256.66.0F38.W0 91 /vsib | avx512 | Using signed qword indices, gather dword values from memory using writemask k1 for merging-masking. |
VPGATHERQD ymm1 {k1}, vm64z | EVEX.512.66.0F38.W0 91 /vsib | avx512 | Using signed qword indices, gather dword values from memory using writemask k1 for merging-masking. |
VPGATHERQQ xmm1 {k1}, vm64x | EVEX.128.66.0F38.W1 91 /vsib | avx512 | Using signed qword indices, gather quadword values from memory using writemask k1 for merging-masking. |
VPGATHERQQ ymm1 {k1}, vm64y | EVEX.256.66.0F38.W1 91 /vsib | avx512 | Using signed qword indices, gather quadword values from memory using writemask k1 for merging-masking. |
VPGATHERQQ zmm1 {k1}, vm64z | EVEX.512.66.0F38.W1 91 /vsib | avx512 | Using signed qword indices, gather quadword values from memory using writemask k1 for merging-masking. |
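
A minimal gather sketch using the AVX2 VPGATHERDD form (assuming `-mavx2`; the scale argument of 4 turns dword indices into byte offsets):

```c
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    int table[16];
    for (int i = 0; i < 16; i++) table[i] = 100 + i;
    __m256i idx = _mm256_setr_epi32(0, 2, 4, 6, 8, 10, 12, 14);
    __m256i g = _mm256_i32gather_epi32(table, idx, 4); /* VPGATHERDD: table[idx[i]] */
    int out[8];
    _mm256_storeu_si256((__m256i *)out, g);
    for (int i = 0; i < 8; i++) printf("%d ", out[i]);  /* 100 102 104 ... 114 */
    printf("\n");
    return 0;
}
```
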
VPLZCNTD xmm1 {k1}{z}, xmm2/m128/m32bcst | EVEX.128.66.0F38.W0 44 /r | avx512 | Count the number of leading zero bits in each dword element of xmm2/m128/m32bcst using writemask k1. |
VPLZCNTD ymm1 {k1}{z}, ymm2/m256/m32bcst | EVEX.256.66.0F38.W0 44 /r | avx512 | Count the number of leading zero bits in each dword element of ymm2/m256/m32bcst using writemask k1. |
VPLZCNTD zmm1 {k1}{z}, zmm2/m512/m32bcst | EVEX.512.66.0F38.W0 44 /r | avx512 | Count the number of leading zero bits in each dword element of zmm2/m512/m32bcst using writemask k1. |
VPLZCNTQ xmm1 {k1}{z}, xmm2/m128/m64bcst | EVEX.128.66.0F38.W1 44 /r | avx512 | Count the number of leading zero bits in each qword element of xmm2/m128/m64bcst using writemask k1. |
VPLZCNTQ ymm1 {k1}{z}, ymm2/m256/m64bcst | EVEX.256.66.0F38.W1 44 /r | avx512 | Count the number of leading zero bits in each qword element of ymm2/m256/m64bcst using writemask k1. |
VPLZCNTQ zmm1 {k1}{z}, zmm2/m512/m64bcst | EVEX.512.66.0F38.W1 44 /r | avx512 | Count the number of leading zero bits in each qword element of zmm2/m512/m64bcst using writemask k1. |
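
VPLZCNT gives per-lane leading-zero counts, which makes a vectorized floor(log2(x)) a two-instruction affair for nonzero x. A minimal sketch (assuming AVX-512F+CD and `-mavx512f -mavx512cd`):

```c
#include <immintrin.h>
#include <stdio.h>
#include <stdint.h>

int main(void) {
    int32_t v[16] = {1, 2, 3, 4, 8, 16, 255, 256, 1024, 4096, 65535, 65536,
                     1 << 20, 1 << 24, 1 << 28, INT32_MIN};
    __m512i x  = _mm512_loadu_si512(v);
    __m512i lz = _mm512_lzcnt_epi32(x);                   /* VPLZCNTD */
    __m512i lg = _mm512_sub_epi32(_mm512_set1_epi32(31), lz); /* 31 - lzcnt = floor(log2) */
    int32_t out[16];
    _mm512_storeu_si512(out, lg);
    for (int i = 0; i < 16; i++) printf("%d ", out[i]);   /* 0 1 1 2 3 4 7 8 ... 31 */
    printf("\n");
    return 0;
}
```
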
VPMASKMOVD xmm1, xmm2, m128 | VEX.NDS.128.66.0F38.W0 8C /r | avx2 | Conditionally load dword values from m128 using mask in xmm2 and store in xmm1. |
VPMASKMOVD ymm1, ymm2, m256 | VEX.NDS.256.66.0F38.W0 8C /r | avx2 | Conditionally load dword values from m256 using mask in ymm2 and store in ymm1. |
VPMASKMOVQ xmm1, xmm2, m128 | VEX.NDS.128.66.0F38.W1 8C /r | avx2 | Conditionally load qword values from m128 using mask in xmm2 and store in xmm1. |
VPMASKMOVQ ymm1, ymm2, m256 | VEX.NDS.256.66.0F38.W1 8C /r | avx2 | Conditionally load qword values from m256 using mask in ymm2 and store in ymm1. |
VPMASKMOVD m128, xmm1, xmm2 | VEX.NDS.128.66.0F38.W0 8E /r | avx2 | Conditionally store dword values from xmm2 using mask in xmm1. |
VPMASKMOVD m256, ymm1, ymm2 | VEX.NDS.256.66.0F38.W0 8E /r | avx2 | Conditionally store dword values from ymm2 using mask in ymm1. |
VPMASKMOVQ m128, xmm1, xmm2 | VEX.NDS.128.66.0F38.W1 8E /r | avx2 | Conditionally store qword values from xmm2 using mask in xmm1. |
VPMASKMOVQ m256, ymm1, ymm2 | VEX.NDS.256.66.0F38.W1 8E /r | avx2 | Conditionally store qword values from ymm2 using mask in ymm1. |
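
VPMASKMOV is handy for loop tails: masked-off lanes are not accessed at all, so the load cannot fault on the bytes past the end of the array. A minimal sketch (assuming AVX2 and `-mavx2`):

```c
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    int data[5] = {1, 2, 3, 4, 5};     /* only 5 valid elements */
    /* MSB set (-1) = load the lane; 0 = skip it and return 0 in that lane */
    __m256i mask = _mm256_setr_epi32(-1, -1, -1, -1, -1, 0, 0, 0);
    __m256i v = _mm256_maskload_epi32(data, mask); /* VPMASKMOVD; lanes 5..7 not read */
    int out[8];
    _mm256_storeu_si256((__m256i *)out, v);
    for (int i = 0; i < 8; i++) printf("%d ", out[i]); /* 1 2 3 4 5 0 0 0 */
    printf("\n");
    return 0;
}
```
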
VPMOVB2M k1, xmm1 | EVEX.128.F3.0F38.W0 29 /r | avx512 | Sets each bit in k1 to 1 or 0 based on the value of the most significant bit of the corresponding byte in XMM1. |
VPMOVB2M k1, ymm1 | EVEX.256.F3.0F38.W0 29 /r | avx512 | Sets each bit in k1 to 1 or 0 based on the value of the most significant bit of the corresponding byte in YMM1. |
VPMOVB2M k1, zmm1 | EVEX.512.F3.0F38.W0 29 /r | avx512 | Sets each bit in k1 to 1 or 0 based on the value of the most significant bit of the corresponding byte in ZMM1. |
VPMOVW2M k1, xmm1 | EVEX.128.F3.0F38.W1 29 /r | avx512 | Sets each bit in k1 to 1 or 0 based on the value of the most significant bit of the corresponding word in XMM1. |
VPMOVW2M k1, ymm1 | EVEX.256.F3.0F38.W1 29 /r | avx512 | Sets each bit in k1 to 1 or 0 based on the value of the most significant bit of the corresponding word in YMM1. |
VPMOVW2M k1, zmm1 | EVEX.512.F3.0F38.W1 29 /r | avx512 | Sets each bit in k1 to 1 or 0 based on the value of the most significant bit of the corresponding word in ZMM1. |
VPMOVD2M k1, xmm1 | EVEX.128.F3.0F38.W0 39 /r | avx512 | Sets each bit in k1 to 1 or 0 based on the value of the most significant bit of the corresponding doubleword in XMM1. |
VPMOVD2M k1, ymm1 | EVEX.256.F3.0F38.W0 39 /r | avx512 | Sets each bit in k1 to 1 or 0 based on the value of the most significant bit of the corresponding doubleword in YMM1. |
VPMOVD2M k1, zmm1 | EVEX.512.F3.0F38.W0 39 /r | avx512 | Sets each bit in k1 to 1 or 0 based on the value of the most significant bit of the corresponding doubleword in ZMM1. |
VPMOVQ2M k1, xmm1 | EVEX.128.F3.0F38.W1 39 /r | avx512 | Sets each bit in k1 to 1 or 0 based on the value of the most significant bit of the corresponding quadword in XMM1. |
VPMOVQ2M k1, ymm1 | EVEX.256.F3.0F38.W1 39 /r | avx512 | Sets each bit in k1 to 1 or 0 based on the value of the most significant bit of the corresponding quadword in YMM1. |
VPMOVQ2M k1, zmm1 | EVEX.512.F3.0F38.W1 39 /r | avx512 | Sets each bit in k1 to 1 or 0 based on the value of the most significant bit of the corresponding quadword in ZMM1. |
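
VPMOV*2M condenses the sign bits of a vector into a k register, the EVEX-era counterpart of PMOVMSKB. A minimal sketch (assuming AVX-512BW and `-mavx512bw`):

```c
#include <immintrin.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    char buf[64];
    memset(buf, 'a', sizeof buf);       /* 0x61: MSB clear */
    buf[3]  = (char)0x80;               /* set the MSB on a couple of bytes */
    buf[40] = (char)0xFF;
    __m512i v = _mm512_loadu_si512(buf);
    __mmask64 m = _mm512_movepi8_mask(v);   /* VPMOVB2M */
    printf("msb mask: 0x%016llx\n", (unsigned long long)m); /* bits 3 and 40 set */
    return 0;
}
```
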
VPMOVDB xmm1/m32 {k1}{z}, xmm2 | EVEX.128.F3.0F38.W0 31 /r | avx512 | Converts 4 packed double-word integers from xmm2 into 4 packed byte integers in xmm1/m32 with truncation under writemask k1. |
VPMOVSDB xmm1/m32 {k1}{z}, xmm2 | EVEX.128.F3.0F38.W0 21 /r | avx512 | Converts 4 packed signed double-word integers from xmm2 into 4 packed signed byte integers in xmm1/m32 using signed saturation under writemask k1. |
VPMOVUSDB xmm1/m32 {k1}{z}, xmm2 | EVEX.128.F3.0F38.W0 11 /r | avx512 | Converts 4 packed unsigned double-word integers from xmm2 into 4 packed unsigned byte integers in xmm1/m32 using unsigned saturation under writemask k1. |
VPMOVDB xmm1/m64 {k1}{z}, ymm2 | EVEX.256.F3.0F38.W0 31 /r | avx512 | Converts 8 packed double-word integers from ymm2 into 8 packed byte integers in xmm1/m64 with truncation under writemask k1. |
VPMOVSDB xmm1/m64 {k1}{z}, ymm2 | EVEX.256.F3.0F38.W0 21 /r | avx512 | Converts 8 packed signed double-word integers from ymm2 into 8 packed signed byte integers in xmm1/m64 using signed saturation under writemask k1. |
VPMOVUSDB xmm1/m64 {k1}{z}, ymm2 | EVEX.256.F3.0F38.W0 11 /r | avx512 | Converts 8 packed unsigned double-word integers from ymm2 into 8 packed unsigned byte integers in xmm1/m64 using unsigned saturation under writemask k1. |
VPMOVDB xmm1/m128 {k1}{z}, zmm2 | EVEX.512.F3.0F38.W0 31 /r | avx512 | Converts 16 packed double-word integers from zmm2 into 16 packed byte integers in xmm1/m128 with truncation under writemask k1. |
VPMOVSDB xmm1/m128 {k1}{z}, zmm2 | EVEX.512.F3.0F38.W0 21 /r | avx512 | Converts 16 packed signed double-word integers from zmm2 into 16 packed signed byte integers in xmm1/m128 using signed saturation under writemask k1. |
VPMOVUSDB xmm1/m128 {k1}{z}, zmm2 | EVEX.512.F3.0F38.W0 11 /r | avx512 | Converts 16 packed unsigned double-word integers from zmm2 into 16 packed unsigned byte integers in xmm1/m128 using unsigned saturation under writemask k1. |
VPMOVDW xmm1/m64 {k1}{z}, xmm2 | EVEX.128.F3.0F38.W0 33 /r | avx512 | Converts 4 packed double-word integers from xmm2 into 4 packed word integers in xmm1/m64 with truncation under writemask k1. |
VPMOVSDW xmm1/m64 {k1}{z}, xmm2 | EVEX.128.F3.0F38.W0 23 /r | avx512 | Converts 4 packed signed double-word integers from xmm2 into 4 packed signed word integers in xmm1/m64 using signed saturation under writemask k1. |
VPMOVUSDW xmm1/m64 {k1}{z}, xmm2 | EVEX.128.F3.0F38.W0 13 /r | avx512 | Converts 4 packed unsigned double-word integers from xmm2 into 4 packed unsigned word integers in xmm1/m64 using unsigned saturation under writemask k1. |
VPMOVDW xmm1/m128 {k1}{z}, ymm2 | EVEX.256.F3.0F38.W0 33 /r | avx512 | Converts 8 packed double-word integers from ymm2 into 8 packed word integers in xmm1/m128 with truncation under writemask k1. |
VPMOVSDW xmm1/m128 {k1}{z}, ymm2 | EVEX.256.F3.0F38.W0 23 /r | avx512 | Converts 8 packed signed double-word integers from ymm2 into 8 packed signed word integers in xmm1/m128 using signed saturation under writemask k1. |
VPMOVUSDW xmm1/m128 {k1}{z}, ymm2 | EVEX.256.F3.0F38.W0 13 /r | avx512 | Converts 8 packed unsigned double-word integers from ymm2 into 8 packed unsigned word integers in xmm1/m128 using unsigned saturation under writemask k1. |
VPMOVDW ymm1/m256 {k1}{z}, zmm2 | EVEX.512.F3.0F38.W0 33 /r | avx512 | Converts 16 packed double-word integers from zmm2 into 16 packed word integers in ymm1/m256 with truncation under writemask k1. |
VPMOVSDW ymm1/m256 {k1}{z}, zmm2 | EVEX.512.F3.0F38.W0 23 /r | avx512 | Converts 16 packed signed double-word integers from zmm2 into 16 packed signed word integers in ymm1/m256 using signed saturation under writemask k1. |
VPMOVUSDW ymm1/m256 {k1}{z}, zmm2 | EVEX.512.F3.0F38.W0 13 /r | avx512 | Converts 16 packed unsigned double-word integers from zmm2 into 16 packed unsigned word integers in ymm1/m256 using unsigned saturation under writemask k1. |
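
The three down-convert flavours above (VPMOVDB / VPMOVSDB / VPMOVUSDB and friends) differ only in how out-of-range elements are narrowed: truncated, signed-saturated, or unsigned-saturated. A minimal sketch for the dword-to-byte case (assuming AVX-512F and `-mavx512f`):

```c
#include <immintrin.h>
#include <stdio.h>
#include <stdint.h>

int main(void) {
    __m512i x = _mm512_set1_epi32(300);     /* 300 does not fit in a byte */
    __m128i t = _mm512_cvtepi32_epi8(x);    /* VPMOVDB:   truncate, 300 -> 0x2C (44) */
    __m128i s = _mm512_cvtsepi32_epi8(x);   /* VPMOVSDB:  signed saturate -> 127 */
    __m128i u = _mm512_cvtusepi32_epi8(x);  /* VPMOVUSDB: unsigned saturate -> 255 */
    uint8_t a[16], b[16], c[16];
    _mm_storeu_si128((__m128i *)a, t);
    _mm_storeu_si128((__m128i *)b, s);
    _mm_storeu_si128((__m128i *)c, u);
    printf("trunc=%u sat=%u usat=%u\n", a[0], b[0], c[0]); /* 44 127 255 */
    return 0;
}
```
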
VPMOVM2B xmm1, k1 | EVEX.128.F3.0F38.W0 28 /r | avx512 | Sets each byte in XMM1 to all 1’s or all 0’s based on the value of the corresponding bit in k1. |
VPMOVM2B ymm1, k1 | EVEX.256.F3.0F38.W0 28 /r | avx512 | Sets each byte in YMM1 to all 1’s or all 0’s based on the value of the corresponding bit in k1. |
VPMOVM2B zmm1, k1 | EVEX.512.F3.0F38.W0 28 /r | avx512 | Sets each byte in ZMM1 to all 1’s or all 0’s based on the value of the corresponding bit in k1. |
VPMOVM2W xmm1, k1 | EVEX.128.F3.0F38.W1 28 /r | avx512 | Sets each word in XMM1 to all 1’s or all 0’s based on the value of the corresponding bit in k1. |
VPMOVM2W ymm1, k1 | EVEX.256.F3.0F38.W1 28 /r | avx512 | Sets each word in YMM1 to all 1’s or all 0’s based on the value of the corresponding bit in k1. |
VPMOVM2W zmm1, k1 | EVEX.512.F3.0F38.W1 28 /r | avx512 | Sets each word in ZMM1 to all 1’s or all 0’s based on the value of the corresponding bit in k1. |
VPMOVM2D xmm1, k1 | EVEX.128.F3.0F38.W0 38 /r | avx512 | Sets each doubleword in XMM1 to all 1’s or all 0’s based on the value of the corresponding bit in k1. |
VPMOVM2D ymm1, k1 | EVEX.256.F3.0F38.W0 38 /r | avx512 | Sets each doubleword in YMM1 to all 1’s or all 0’s based on the value of the corresponding bit in k1. |
VPMOVM2D zmm1, k1 | EVEX.512.F3.0F38.W0 38 /r | avx512 | Sets each doubleword in ZMM1 to all 1’s or all 0’s based on the value of the corresponding bit in k1. |
VPMOVM2Q xmm1, k1 | EVEX.128.F3.0F38.W1 38 /r | avx512 | Sets each quadword in XMM1 to all 1’s or all 0’s based on the value of the corresponding bit in k1. |
VPMOVM2Q ymm1, k1 | EVEX.256.F3.0F38.W1 38 /r | avx512 | Sets each quadword in YMM1 to all 1’s or all 0’s based on the value of the corresponding bit in k1. |
VPMOVM2Q zmm1, k1 | EVEX.512.F3.0F38.W1 38 /r | avx512 | Sets each quadword in ZMM1 to all 1’s or all 0’s based on the value of the corresponding bit in k1. |
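
VPMOVM2* goes the other way: each mask bit becomes an all-ones or all-zeros element, which is useful to materialize a k register back into a vector. A minimal sketch (assuming AVX-512BW and `-mavx512bw`):

```c
#include <immintrin.h>
#include <stdio.h>
#include <stdint.h>

int main(void) {
    __mmask64 k = 0x5;                  /* bits 0 and 2 set */
    __m512i v = _mm512_movm_epi8(k);    /* VPMOVM2B */
    uint8_t out[64];
    _mm512_storeu_si512(out, v);
    printf("%02x %02x %02x %02x\n", out[0], out[1], out[2], out[3]); /* ff 00 ff 00 */
    return 0;
}
```
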
VPMOVQB xmm1/m16 {k1}{z}, xmm2 | EVEX.128.F3.0F38.W0 32 /r | avx512 | Converts 2 packed quad-word integers from xmm2 into 2 packed byte integers in xmm1/m16 with truncation under writemask k1. |
VPMOVSQB xmm1/m16 {k1}{z}, xmm2 | EVEX.128.F3.0F38.W0 22 /r | avx512 | Converts 2 packed signed quad-word integers from xmm2 into 2 packed signed byte integers in xmm1/m16 using signed saturation under writemask k1. |
VPMOVUSQB xmm1/m16 {k1}{z}, xmm2 | EVEX.128.F3.0F38.W0 12 /r | avx512 | Converts 2 packed unsigned quad-word integers from xmm2 into 2 packed unsigned byte integers in xmm1/m16 using unsigned saturation under writemask k1. |
VPMOVQB xmm1/m32 {k1}{z}, ymm2 | EVEX.256.F3.0F38.W0 32 /r | avx512 | Converts 4 packed quad-word integers from ymm2 into 4 packed byte integers in xmm1/m32 with truncation under writemask k1. |
VPMOVSQB xmm1/m32 {k1}{z}, ymm2 | EVEX.256.F3.0F38.W0 22 /r | avx512 | Converts 4 packed signed quad-word integers from ymm2 into 4 packed signed byte integers in xmm1/m32 using signed saturation under writemask k1. |
VPMOVUSQB xmm1/m32 {k1}{z}, ymm2 | EVEX.256.F3.0F38.W0 12 /r | avx512 | Converts 4 packed unsigned quad-word integers from ymm2 into 4 packed unsigned byte integers in xmm1/m32 using unsigned saturation under writemask k1. |
VPMOVQB xmm1/m64 {k1}{z}, zmm2 | EVEX.512.F3.0F38.W0 32 /r | avx512 | Converts 8 packed quad-word integers from zmm2 into 8 packed byte integers in xmm1/m64 with truncation under writemask k1. |
VPMOVSQB xmm1/m64 {k1}{z}, zmm2 | EVEX.512.F3.0F38.W0 22 /r | avx512 | Converts 8 packed signed quad-word integers from zmm2 into 8 packed signed byte integers in xmm1/m64 using signed saturation under writemask k1. |
VPMOVUSQB xmm1/m64 {k1}{z}, zmm2 | EVEX.512.F3.0F38.W0 12 /r | avx512 | Converts 8 packed unsigned quad-word integers from zmm2 into 8 packed unsigned byte integers in xmm1/m64 using unsigned saturation under writemask k1. |
VPMOVQD xmm1/m64 {k1}{z}, xmm2 | EVEX.128.F3.0F38.W0 35 /r | avx512 | Converts 2 packed quad-word integers from xmm2 into 2 packed double-word integers in xmm1/m64 with truncation subject to writemask k1. |
VPMOVSQD xmm1/m64 {k1}{z}, xmm2 | EVEX.128.F3.0F38.W0 25 /r | avx512 | Converts 2 packed signed quad-word integers from xmm2 into 2 packed signed double-word integers in xmm1/m64 using signed saturation subject to writemask k1. |
VPMOVUSQD xmm1/m64 {k1}{z}, xmm2 | EVEX.128.F3.0F38.W0 15 /r | avx512 | Converts 2 packed unsigned quad-word integers from xmm2 into 2 packed unsigned double-word integers in xmm1/m64 using unsigned saturation subject to writemask k1. |
VPMOVQD xmm1/m128 {k1}{z}, ymm2 | EVEX.256.F3.0F38.W0 35 /r | avx512 | Converts 4 packed quad-word integers from ymm2 into 4 packed double-word integers in xmm1/m128 with truncation subject to writemask k1. |
VPMOVSQD xmm1/m128 {k1}{z}, ymm2 | EVEX.256.F3.0F38.W0 25 /r | avx512 | Converts 4 packed signed quad-word integers from ymm2 into 4 packed signed double-word integers in xmm1/m128 using signed saturation subject to writemask k1. |
VPMOVUSQD xmm1/m128 {k1}{z}, ymm2 | EVEX.256.F3.0F38.W0 15 /r | avx512 | Converts 4 packed unsigned quad-word integers from ymm2 into 4 packed unsigned double-word integers in xmm1/m128 using unsigned saturation subject to writemask k1. |
VPMOVQD ymm1/m256 {k1}{z}, zmm2 | EVEX.512.F3.0F38.W0 35 /r | avx512 | Converts 8 packed quad-word integers from zmm2 into 8 packed double-word integers in ymm1/m256 with truncation subject to writemask k1. |
VPMOVSQD ymm1/m256 {k1}{z}, zmm2 | EVEX.512.F3.0F38.W0 25 /r | avx512 | Converts 8 packed signed quad-word integers from zmm2 into 8 packed signed double-word integers in ymm1/m256 using signed saturation subject to writemask k1. |
VPMOVUSQD ymm1/m256 {k1}{z}, zmm2 | EVEX.512.F3.0F38.W0 15 /r | avx512 | Converts 8 packed unsigned quad-word integers from zmm2 into 8 packed unsigned double-word integers in ymm1/m256 using unsigned saturation subject to writemask k1. |
VPMOVQW xmm1/m32 {k1}{z}, xmm2 | EVEX.128.F3.0F38.W0 34 /r | avx512 | Converts 2 packed quad-word integers from xmm2 into 2 packed word integers in xmm1/m32 with truncation under writemask k1. |
VPMOVSQW xmm1/m32 {k1}{z}, xmm2 | EVEX.128.F3.0F38.W0 24 /r | avx512 | Converts 2 packed signed quad-word integers from xmm2 into 2 packed signed word integers in xmm1/m32 using signed saturation under writemask k1. |
VPMOVUSQW xmm1/m32 {k1}{z}, xmm2 | EVEX.128.F3.0F38.W0 14 /r | avx512 | Converts 2 packed unsigned quad-word integers from xmm2 into 2 packed unsigned word integers in xmm1/m32 using unsigned saturation under writemask k1. |
VPMOVQW xmm1/m64 {k1}{z}, ymm2 | EVEX.256.F3.0F38.W0 34 /r | avx512 | Converts 4 packed quad-word integers from ymm2 into 4 packed word integers in xmm1/m64 with truncation under writemask k1. |
VPMOVSQW xmm1/m64 {k1}{z}, ymm2 | EVEX.256.F3.0F38.W0 24 /r | avx512 | Converts 4 packed signed quad-word integers from ymm2 into 4 packed signed word integers in xmm1/m64 using signed saturation under writemask k1. |
VPMOVUSQW xmm1/m64 {k1}{z}, ymm2 | EVEX.256.F3.0F38.W0 14 /r | avx512 | Converts 4 packed unsigned quad-word integers from ymm2 into 4 packed unsigned word integers in xmm1/m64 using unsigned saturation under writemask k1. |
VPMOVQW xmm1/m128 {k1}{z}, zmm2 | EVEX.512.F3.0F38.W0 34 /r | avx512 | Converts 8 packed quad-word integers from zmm2 into 8 packed word integers in xmm1/m128 with truncation under writemask k1. |
VPMOVSQW xmm1/m128 {k1}{z}, zmm2 | EVEX.512.F3.0F38.W0 24 /r | avx512 | Converts 8 packed signed quad-word integers from zmm2 into 8 packed signed word integers in xmm1/m128 using signed saturation under writemask k1. |
VPMOVUSQW xmm1/m128 {k1}{z}, zmm2 | EVEX.512.F3.0F38.W0 14 /r | avx512 | Converts 8 packed unsigned quad-word integers from zmm2 into 8 packed unsigned word integers in xmm1/m128 using unsigned saturation under writemask k1. |
VPMOVWB xmm1/m64 {k1}{z}, xmm2 | EVEX.128.F3.0F38.W0 30 /r | avx512 | Converts 8 packed word integers from xmm2 into 8 packed bytes in xmm1/m64 with truncation under writemask k1. |
VPMOVSWB xmm1/m64 {k1}{z}, xmm2 | EVEX.128.F3.0F38.W0 20 /r | avx512 | Converts 8 packed signed word integers from xmm2 into 8 packed signed bytes in xmm1/m64 using signed saturation under writemask k1. |
VPMOVUSWB xmm1/m64 {k1}{z}, xmm2 | EVEX.128.F3.0F38.W0 10 /r | avx512 | Converts 8 packed unsigned word integers from xmm2 into 8 packed unsigned bytes in xmm1/m64 using unsigned saturation under writemask k1. |
VPMOVWB xmm1/m128 {k1}{z}, ymm2 | EVEX.256.F3.0F38.W0 30 /r | avx512 | Converts 16 packed word integers from ymm2 into 16 packed bytes in xmm1/m128 with truncation under writemask k1. |
VPMOVSWB xmm1/m128 {k1}{z}, ymm2 | EVEX.256.F3.0F38.W0 20 /r | avx512 | Converts 16 packed signed word integers from ymm2 into 16 packed signed bytes in xmm1/m128 using signed saturation under writemask k1. |
VPMOVUSWB xmm1/m128 {k1}{z}, ymm2 | EVEX.256.F3.0F38.W0 10 /r | avx512 | Converts 16 packed unsigned word integers from ymm2 into 16 packed unsigned bytes in xmm1/m128 using unsigned saturation under writemask k1. |
VPMOVWB ymm1/m256 {k1}{z}, zmm2 | EVEX.512.F3.0F38.W0 30 /r | avx512 | Converts 32 packed word integers from zmm2 into 32 packed bytes in ymm1/m256 with truncation under writemask k1. |
VPMOVSWB ymm1/m256 {k1}{z}, zmm2 | EVEX.512.F3.0F38.W0 20 /r | avx512 | Converts 32 packed signed word integers from zmm2 into 32 packed signed bytes in ymm1/m256 using signed saturation under writemask k1. |
VPMOVUSWB ymm1/m256 {k1}{z}, zmm2 | EVEX.512.F3.0F38.W0 10 /r | avx512 | Converts 32 packed unsigned word integers from zmm2 into 32 packed unsigned bytes in ymm1/m256 using unsigned saturation under writemask k1. |
VPSCATTERDD vm32x {k1}, xmm1 | EVEX.128.66.0F38.W0 A0 /vsib | avx512 | Using signed dword indices, scatter dword values to memory using writemask k1. |
VPSCATTERDD vm32y {k1}, ymm1 | EVEX.256.66.0F38.W0 A0 /vsib | avx512 | Using signed dword indices, scatter dword values to memory using writemask k1. |
VPSCATTERDD vm32z {k1}, zmm1 | EVEX.512.66.0F38.W0 A0 /vsib | avx512 | Using signed dword indices, scatter dword values to memory using writemask k1. |
VPSCATTERDQ vm32x {k1}, xmm1 | EVEX.128.66.0F38.W1 A0 /vsib | avx512 | Using signed dword indices, scatter qword values to memory using writemask k1. |
VPSCATTERDQ vm32x {k1}, ymm1 | EVEX.256.66.0F38.W1 A0 /vsib | avx512 | Using signed dword indices, scatter qword values to memory using writemask k1. |
VPSCATTERDQ vm32y {k1}, zmm1 | EVEX.512.66.0F38.W1 A0 /vsib | avx512 | Using signed dword indices, scatter qword values to memory using writemask k1. |
VPSCATTERQD vm64x {k1}, xmm1 | EVEX.128.66.0F38.W0 A1 /vsib | avx512 | Using signed qword indices, scatter dword values to memory using writemask k1. |
VPSCATTERQD vm64y {k1}, xmm1 | EVEX.256.66.0F38.W0 A1 /vsib | avx512 | Using signed qword indices, scatter dword values to memory using writemask k1. |
VPSCATTERQD vm64z {k1}, ymm1 | EVEX.512.66.0F38.W0 A1 /vsib | avx512 | Using signed qword indices, scatter dword values to memory using writemask k1. |
VPSCATTERQQ vm64x {k1}, xmm1 | EVEX.128.66.0F38.W1 A1 /vsib | avx512 | Using signed qword indices, scatter qword values to memory using writemask k1. |
VPSCATTERQQ vm64y {k1}, ymm1 | EVEX.256.66.0F38.W1 A1 /vsib | avx512 | Using signed qword indices, scatter qword values to memory using writemask k1. |
VPSCATTERQQ vm64z {k1}, zmm1 | EVEX.512.66.0F38.W1 A1 /vsib | avx512 | Using signed qword indices, scatter qword values to memory using writemask k1. |
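
A minimal scatter sketch (assuming AVX-512F and `-mavx512f`); each dword lane is written to `table[idx[i]]`, the mirror image of the gather example above:

```c
#include <immintrin.h>
#include <stdio.h>
#include <stdint.h>

int main(void) {
    int32_t table[32] = {0};
    __m512i idx = _mm512_setr_epi32(0, 2, 4, 6, 8, 10, 12, 14,
                                    16, 18, 20, 22, 24, 26, 28, 30);
    __m512i val = _mm512_set1_epi32(7);
    _mm512_i32scatter_epi32(table, idx, val, 4); /* VPSCATTERDD: table[idx[i]] = 7 */
    printf("%d %d %d %d\n", table[0], table[1], table[2], table[3]); /* 7 0 7 0 */
    return 0;
}
```
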
VPSLLVD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W0 47 /r | avx2 | Shift doublewords in xmm2 left by amount specified in the corresponding element of xmm3/m128 while shifting in 0s. |
VPSLLVQ xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W1 47 /r | avx2 | Shift quadwords in xmm2 left by amount specified in the corresponding element of xmm3/m128 while shifting in 0s. |
VPSLLVD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W0 47 /r | avx2 | Shift doublewords in ymm2 left by amount specified in the corresponding element of ymm3/m256 while shifting in 0s. |
VPSLLVQ ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W1 47 /r | avx2 | Shift quadwords in ymm2 left by amount specified in the corresponding element of ymm3/m256 while shifting in 0s. |
VPSLLVW xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F38.W1 12 /r | avx512 | Shift words in xmm2 left by amount specified in the corresponding element of xmm3/m128 while shifting in 0s using writemask k1. |
VPSLLVW ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F38.W1 12 /r | avx512 | Shift words in ymm2 left by amount specified in the corresponding element of ymm3/m256 while shifting in 0s using writemask k1. |
VPSLLVW zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F38.W1 12 /r | avx512 | Shift words in zmm2 left by amount specified in the corresponding element of zmm3/m512 while shifting in 0s using writemask k1. |
VPSLLVD xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 47 /r | avx512 | Shift doublewords in xmm2 left by amount specified in the corresponding element of xmm3/m128/m32bcst while shifting in 0s using writemask k1. |
VPSLLVD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 47 /r | avx512 | Shift doublewords in ymm2 left by amount specified in the corresponding element of ymm3/m256/m32bcst while shifting in 0s using writemask k1. |
VPSLLVD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F38.W0 47 /r | avx512 | Shift doublewords in zmm2 left by amount specified in the corresponding element of zmm3/m512/m32bcst while shifting in 0s using writemask k1. |
VPSLLVQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 47 /r | avx512 | Shift quadwords in xmm2 left by amount specified in the corresponding element of xmm3/m128/m64bcst while shifting in 0s using writemask k1. |
VPSLLVQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 47 /r | avx512 | Shift quadwords in ymm2 left by amount specified in the corresponding element of ymm3/m256/m64bcst while shifting in 0s using writemask k1. |
VPSLLVQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F38.W1 47 /r | avx512 | Shift quadwords in zmm2 left by amount specified in the corresponding element of zmm3/m512/m64bcst while shifting in 0s using writemask k1. |
VPSRAVD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W0 46 /r | avx2 | Shift doublewords in xmm2 right by amount specified in the corresponding element of xmm3/m128 while shifting in sign bits. |
VPSRAVD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W0 46 /r | avx2 | Shift doublewords in ymm2 right by amount specified in the corresponding element of ymm3/m256 while shifting in sign bits. |
VPSRAVW xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F38.W1 11 /r | avx512 | Shift words in xmm2 right by amount specified in the corresponding element of xmm3/m128 while shifting in sign bits using writemask k1. |
VPSRAVW ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F38.W1 11 /r | avx512 | Shift words in ymm2 right by amount specified in the corresponding element of ymm3/m256 while shifting in sign bits using writemask k1. |
VPSRAVW zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F38.W1 11 /r | avx512 | Shift words in zmm2 right by amount specified in the corresponding element of zmm3/m512 while shifting in sign bits using writemask k1. |
VPSRAVD xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 46 /r | avx512 | Shift doublewords in xmm2 right by amount specified in the corresponding element of xmm3/m128/m32bcst while shifting in sign bits using writemask k1. |
VPSRAVD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 46 /r | avx512 | Shift doublewords in ymm2 right by amount specified in the corresponding element of ymm3/m256/m32bcst while shifting in sign bits using writemask k1. |
VPSRAVD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F38.W0 46 /r | avx512 | Shift doublewords in zmm2 right by amount specified in the corresponding element of zmm3/m512/m32bcst while shifting in sign bits using writemask k1. |
VPSRAVQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 46 /r | avx512 | Shift quadwords in xmm2 right by amount specified in the corresponding element of xmm3/m128/m64bcst while shifting in sign bits using writemask k1. |
VPSRAVQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 46 /r | avx512 | Shift quadwords in ymm2 right by amount specified in the corresponding element of ymm3/m256/m64bcst while shifting in sign bits using writemask k1. |
VPSRAVQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F38.W1 46 /r | avx512 | Shift quadwords in zmm2 right by amount specified in the corresponding element of zmm3/m512/m64bcst while shifting in sign bits using writemask k1. |
VPSRLVD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W0 45 /r | avx2 | Shift doublewords in xmm2 right by amount specified in the corresponding element of xmm3/m128 while shifting in 0s. |
VPSRLVQ xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F38.W1 45 /r | avx2 | Shift quadwords in xmm2 right by amount specified in the corresponding element of xmm3/m128 while shifting in 0s. |
VPSRLVD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W0 45 /r | avx2 | Shift doublewords in ymm2 right by amount specified in the corresponding element of ymm3/m256 while shifting in 0s. |
VPSRLVQ ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F38.W1 45 /r | avx2 | Shift quadwords in ymm2 right by amount specified in the corresponding element of ymm3/m256 while shifting in 0s. |
VPSRLVW xmm1 {k1}{z}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F38.W1 10 /r | avx512 | Shift words in xmm2 right by amount specified in the corresponding element of xmm3/m128 while shifting in 0s using writemask k1. |
VPSRLVW ymm1 {k1}{z}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F38.W1 10 /r | avx512 | Shift words in ymm2 right by amount specified in the corresponding element of ymm3/m256 while shifting in 0s using writemask k1. |
VPSRLVW zmm1 {k1}{z}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F38.W1 10 /r | avx512 | Shift words in zmm2 right by amount specified in the corresponding element of zmm3/m512 while shifting in 0s using writemask k1. |
VPSRLVD xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 45 /r | avx512 | Shift doublewords in xmm2 right by amount specified in the corresponding element of xmm3/m128/m32bcst while shifting in 0s using writemask k1. |
VPSRLVD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 45 /r | avx512 | Shift doublewords in ymm2 right by amount specified in the corresponding element of ymm3/m256/m32bcst while shifting in 0s using writemask k1. |
VPSRLVD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F38.W0 45 /r | avx512 | Shift doublewords in zmm2 right by amount specified in the corresponding element of zmm3/m512/m32bcst while shifting in 0s using writemask k1. |
VPSRLVQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 45 /r | avx512 | Shift quadwords in xmm2 right by amount specified in the corresponding element of xmm3/m128/m64bcst while shifting in 0s using writemask k1. |
VPSRLVQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 45 /r | avx512 | Shift quadwords in ymm2 right by amount specified in the corresponding element of ymm3/m256/m64bcst while shifting in 0s using writemask k1. |
VPSRLVQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F38.W1 45 /r | avx512 | Shift quadwords in zmm2 right by amount specified in the corresponding element of zmm3/m512/m64bcst while shifting in 0s using writemask k1. |
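
Unlike the classic PSLLD/PSRLD, the variable forms take an independent shift count per lane; counts above the element width zero the lane (VPSRAVD instead floods it with the sign bit). A minimal AVX2 sketch (assuming `-mavx2`):

```c
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    int cnts[8] = {0, 1, 2, 3, 4, 8, 31, 32};
    __m256i x   = _mm256_set1_epi32(-16);
    __m256i cnt = _mm256_loadu_si256((const __m256i *)cnts);
    __m256i l = _mm256_sllv_epi32(x, cnt);   /* VPSLLVD */
    __m256i r = _mm256_srlv_epi32(x, cnt);   /* VPSRLVD: logical */
    __m256i a = _mm256_srav_epi32(x, cnt);   /* VPSRAVD: arithmetic */
    int lo[8], ro[8], ao[8];
    _mm256_storeu_si256((__m256i *)lo, l);
    _mm256_storeu_si256((__m256i *)ro, r);
    _mm256_storeu_si256((__m256i *)ao, a);
    for (int i = 0; i < 8; i++)
        printf("cnt=%2d sll=%-11d srl=%-10u sra=%d\n",
               cnts[i], lo[i], (unsigned)ro[i], ao[i]);
    return 0;
}
```
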
VPTERNLOGD xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst, imm8 | EVEX.DDS.128.66.0F3A.W0 25 /r ib | avx512 | Bitwise ternary logic taking xmm1, xmm2 and xmm3/m128/m32bcst as source operands and writing the result to xmm1 under writemask k1 with dword granularity. The immediate value determines the specific binary function being implemented. |
VPTERNLOGD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst, imm8 | EVEX.DDS.256.66.0F3A.W0 25 /r ib | avx512 | Bitwise ternary logic taking ymm1, ymm2 and ymm3/m256/m32bcst as source operands and writing the result to ymm1 under writemask k1 with dword granularity. The immediate value determines the specific binary function being implemented. |
VPTERNLOGD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst, imm8 | EVEX.DDS.512.66.0F3A.W0 25 /r ib | avx512 | Bitwise ternary logic taking zmm1, zmm2 and zmm3/m512/m32bcst as source operands and writing the result to zmm1 under writemask k1 with dword granularity. The immediate value determines the specific binary function being implemented. |
VPTERNLOGQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst, imm8 | EVEX.DDS.128.66.0F3A.W1 25 /r ib | avx512 | Bitwise ternary logic taking xmm1, xmm2 and xmm3/m128/m64bcst as source operands and writing the result to xmm1 under writemask k1 with qword granularity. The immediate value determines the specific binary function being implemented. |
VPTERNLOGQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst, imm8 | EVEX.DDS.256.66.0F3A.W1 25 /r ib | avx512 | Bitwise ternary logic taking ymm1, ymm2 and ymm3/m256/m64bcst as source operands and writing the result to ymm1 under writemask k1 with qword granularity. The immediate value determines the specific binary function being implemented. |
VPTERNLOGQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst, imm8 | EVEX.DDS.512.66.0F3A.W1 25 /r ib | avx512 | Bitwise ternary logic taking zmm1, zmm2 and zmm3/m512/m64bcst as source operands and writing the result to zmm1 under writemask k1 with qword granularity. The immediate value determines the specific binary function being implemented. |
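
For VPTERNLOG the imm8 is literally the output column of the 3-input truth table, indexed by (a<<2)|(b<<1)|c, so any three-operand boolean function costs one instruction. A minimal sketch (assuming AVX-512F and `-mavx512f`; 0x96 is a^b^c and 0xE8 the bitwise majority):

```c
#include <immintrin.h>
#include <stdio.h>
#include <stdint.h>

int main(void) {
    __m512i a = _mm512_set1_epi32(0x0F0F0F0F);
    __m512i b = _mm512_set1_epi32(0x00FF00FF);
    __m512i c = _mm512_set1_epi32(0x0000FFFF);
    __m512i x = _mm512_ternarylogic_epi32(a, b, c, 0x96); /* VPTERNLOGD: a^b^c */
    __m512i m = _mm512_ternarylogic_epi32(a, b, c, 0xE8); /* majority(a,b,c) */
    int32_t xo[16], mo[16];
    _mm512_storeu_si512(xo, x);
    _mm512_storeu_si512(mo, m);
    printf("xor3=0x%08x maj=0x%08x\n", xo[0], mo[0]); /* 0x0ff0f00f 0x000f0fff */
    return 0;
}
```
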
VPTESTMB k2 {k1}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F38.W0 26 /r | avx512 | Bitwise AND of packed byte integers in xmm2 and xmm3/m128 and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTMB k2 {k1}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F38.W0 26 /r | avx512 | Bitwise AND of packed byte integers in ymm2 and ymm3/m256 and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTMB k2 {k1}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F38.W0 26 /r | avx512 | Bitwise AND of packed byte integers in zmm2 and zmm3/m512 and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTMW k2 {k1}, xmm2, xmm3/m128 | EVEX.NDS.128.66.0F38.W1 26 /r | avx512 | Bitwise AND of packed word integers in xmm2 and xmm3/m128 and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTMW k2 {k1}, ymm2, ymm3/m256 | EVEX.NDS.256.66.0F38.W1 26 /r | avx512 | Bitwise AND of packed word integers in ymm2 and ymm3/m256 and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTMW k2 {k1}, zmm2, zmm3/m512 | EVEX.NDS.512.66.0F38.W1 26 /r | avx512 | Bitwise AND of packed word integers in zmm2 and zmm3/m512 and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTMD k2 {k1}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 27 /r | avx512 | Bitwise AND of packed doubleword integers in xmm2 and xmm3/m128/m32bcst and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTMD k2 {k1}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 27 /r | avx512 | Bitwise AND of packed doubleword integers in ymm2 and ymm3/m256/m32bcst and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTMD k2 {k1}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.66.0F38.W0 27 /r | avx512 | Bitwise AND of packed doubleword integers in zmm2 and zmm3/m512/m32bcst and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTMQ k2 {k1}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 27 /r | avx512 | Bitwise AND of packed quadword integers in xmm2 and xmm3/m128/m64bcst and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTMQ k2 {k1}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 27 /r | avx512 | Bitwise AND of packed quadword integers in ymm2 and ymm3/m256/m64bcst and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTMQ k2 {k1}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F38.W1 27 /r | avx512 | Bitwise AND of packed quadword integers in zmm2 and zmm3/m512/m64bcst and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTNMB k2 {k1}, xmm2, xmm3/m128 | EVEX.NDS.128.F3.0F38.W0 26 /r | avx512 | Bitwise NAND of packed byte integers in xmm2 and xmm3/m128 and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTNMB k2 {k1}, ymm2, ymm3/m256 | EVEX.NDS.256.F3.0F38.W0 26 /r | avx512 | Bitwise NAND of packed byte integers in ymm2 and ymm3/m256 and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTNMB k2 {k1}, zmm2, zmm3/m512 | EVEX.NDS.512.F3.0F38.W0 26 /r | avx512 | Bitwise NAND of packed byte integers in zmm2 and zmm3/m512 and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTNMW k2 {k1}, xmm2, xmm3/m128 | EVEX.NDS.128.F3.0F38.W1 26 /r | avx512 | Bitwise NAND of packed word integers in xmm2 and xmm3/m128 and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTNMW k2 {k1}, ymm2, ymm3/m256 | EVEX.NDS.256.F3.0F38.W1 26 /r | avx512 | Bitwise NAND of packed word integers in ymm2 and ymm3/m256 and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTNMW k2 {k1}, zmm2, zmm3/m512 | EVEX.NDS.512.F3.0F38.W1 26 /r | avx512 | Bitwise NAND of packed word integers in zmm2 and zmm3/m512 and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTNMD k2 {k1}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.F3.0F38.W0 27 /r | avx512 | Bitwise NAND of packed doubleword integers in xmm2 and xmm3/m128/m32bcst and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTNMD k2 {k1}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.F3.0F38.W0 27 /r | avx512 | Bitwise NAND of packed doubleword integers in ymm2 and ymm3/m256/m32bcst and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTNMD k2 {k1}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.F3.0F38.W0 27 /r | avx512 | Bitwise NAND of packed doubleword integers in zmm2 and zmm3/m512/m32bcst and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTNMQ k2 {k1}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.F3.0F38.W1 27 /r | avx512 | Bitwise NAND of packed quadword integers in xmm2 and xmm3/m128/m64bcst and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTNMQ k2 {k1}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.F3.0F38.W1 27 /r | avx512 | Bitwise NAND of packed quadword integers in ymm2 and ymm3/m256/m64bcst and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
VPTESTNMQ k2 {k1}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.F3.0F38.W1 27 /r | avx512 | Bitwise NAND of packed quadword integers in zmm2 and zmm3/m512/m64bcst and set mask k2 to reflect the zero/non-zero status of each element of the result, under writemask k1. |
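
VPTESTM/VPTESTNM fuse the AND with the mask generation: bit i of the destination mask reflects whether (src1[i] & src2[i]) is non-zero (VPTESTM) or zero (VPTESTNM). A minimal "which lanes have this flag bit set" sketch (assuming AVX-512F and `-mavx512f`):

```c
#include <immintrin.h>
#include <stdio.h>
#include <stdint.h>

int main(void) {
    int32_t flags[16] = {1, 2, 3, 4, 5, 6, 7, 8, 0, 0, 0, 0, 4, 4, 0, 4};
    __m512i v = _mm512_loadu_si512(flags);
    __m512i bit2 = _mm512_set1_epi32(4);                /* test bit 2 */
    __mmask16 has  = _mm512_test_epi32_mask(v, bit2);   /* VPTESTMD */
    __mmask16 lack = _mm512_testn_epi32_mask(v, bit2);  /* VPTESTNMD */
    printf("has=0x%04x lack=0x%04x\n", (unsigned)has, (unsigned)lack); /* 0xb078 0x4f87 */
    return 0;
}
```
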
VRANGEPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst, imm8 | EVEX.NDS.128.66.0F3A.W1 50 /r ib | avx512 | Calculate two RANGE operation output values from 2 pairs of double-precision floating-point values in xmm2 and xmm3/m128/m64bcst, store the results to xmm1 under the writemask k1. Imm8 specifies the comparison and sign of the range operation. |
VRANGEPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst, imm8 | EVEX.NDS.256.66.0F3A.W1 50 /r ib | avx512 | Calculate four RANGE operation output values from 4 pairs of double-precision floating-point values in ymm2 and ymm3/m256/m64bcst, store the results to ymm1 under the writemask k1. Imm8 specifies the comparison and sign of the range operation. |
VRANGEPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{sae}, imm8 | EVEX.NDS.512.66.0F3A.W1 50 /r ib | avx512 | Calculate eight RANGE operation output values from 8 pairs of double-precision floating-point values in zmm2 and zmm3/m512/m64bcst, store the results to zmm1 under the writemask k1. Imm8 specifies the comparison and sign of the range operation. |
VRANGEPS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst, imm8 | EVEX.NDS.128.66.0F3A.W0 50 /r ib | avx512 | Calculate four RANGE operation output values from 4 pairs of single-precision floating-point values in xmm2 and xmm3/m128/m32bcst, store the results to xmm1 under the writemask k1. Imm8 specifies the comparison and sign of the range operation. |
VRANGEPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst, imm8 | EVEX.NDS.256.66.0F3A.W0 50 /r ib | avx512 | Calculate eight RANGE operation output values from 8 pairs of single-precision floating-point values in ymm2 and ymm3/m256/m32bcst, store the results to ymm1 under the writemask k1. Imm8 specifies the comparison and sign of the range operation. |
VRANGEPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{sae}, imm8 | EVEX.NDS.512.66.0F3A.W0 50 /r ib | avx512 | Calculate 16 RANGE operation output values from 16 pairs of single-precision floating-point values in zmm2 and zmm3/m512/m32bcst, store the results to zmm1 under the writemask k1. Imm8 specifies the comparison and sign of the range operation. |
VRANGESD xmm1 {k1}{z}, xmm2, xmm3/m64{sae}, imm8 | EVEX.NDS.LIG.66.0F3A.W1 51 /r ib | avx512 | Calculate a RANGE operation output value from 2 double-precision floating-point values in xmm2 and xmm3/m64, store the output to xmm1 under writemask. Imm8 specifies the comparison and sign of the range operation. |
VRANGESS xmm1 {k1}{z}, xmm2, xmm3/m32{sae}, imm8 | EVEX.NDS.LIG.66.0F3A.W0 51 /r ib | avx512 | Calculate a RANGE operation output value from 2 single-precision floating-point values in xmm2 and xmm3/m32, store the output to xmm1 under writemask. Imm8 specifies the comparison and sign of the range operation. |
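
A minimal VRANGEPD sketch (assuming AVX-512DQ and `-mavx512dq`; per my reading of the SDM, imm8 bits 1:0 select min/max/abs-min/abs-max and bits 3:2 the sign control, so 0x0B asks for the larger magnitude with the sign forced positive):

```c
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    __m512d a = _mm512_set1_pd(-3.0);
    __m512d b = _mm512_set1_pd(2.0);
    __m512d r = _mm512_range_pd(a, b, 0x0B); /* VRANGEPD: abs-max, sign cleared */
    double out[8];
    _mm512_storeu_pd(out, r);
    printf("%f\n", out[0]);  /* 3.0: |-3| > |2|, sign forced to + */
    return 0;
}
```
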
VRCP14PD xmm1 {k1}{z}, xmm2/m128/m64bcst | EVEX.128.66.0F38.W1 4C /r | avx512 | Computes the approximate reciprocals of the packed double-precision floating-point values in xmm2/m128/m64bcst and stores the results in xmm1. Under writemask. |
VRCP14PD ymm1 {k1}{z}, ymm2/m256/m64bcst | EVEX.256.66.0F38.W1 4C /r | avx512 | Computes the approximate reciprocals of the packed double-precision floating-point values in ymm2/m256/m64bcst and stores the results in ymm1. Under writemask. |
VRCP14PD zmm1 {k1}{z}, zmm2/m512/m64bcst | EVEX.512.66.0F38.W1 4C /r | avx512 | Computes the approximate reciprocals of the packed double-precision floating-point values in zmm2/m512/m64bcst and stores the results in zmm1. Under writemask. |
VRCP14PS xmm1 {k1}{z}, xmm2/m128/m32bcst | EVEX.128.66.0F38.W0 4C /r | avx512 | Computes the approximate reciprocals of the packed single-precision floating-point values in xmm2/m128/m32bcst and stores the results in xmm1. Under writemask. |
VRCP14PS ymm1 {k1}{z}, ymm2/m256/m32bcst | EVEX.256.66.0F38.W0 4C /r | avx512 | Computes the approximate reciprocals of the packed single-precision floating-point values in ymm2/m256/m32bcst and stores the results in ymm1. Under writemask. |
VRCP14PS zmm1 {k1}{z}, zmm2/m512/m32bcst | EVEX.512.66.0F38.W0 4C /r | avx512 | Computes the approximate reciprocals of the packed single-precision floating-point values in zmm2/m512/m32bcst and stores the results in zmm1. Under writemask. |
VRCP14SD xmm1 {k1}{z}, xmm2, xmm3/m64 | EVEX.NDS.LIG.66.0F38.W1 4D /r | avx512 | Computes the approximate reciprocal of the scalar double-precision floating-point value in xmm3/m64 and stores the result in xmm1 using writemask k1. Also, upper double-precision floating-point value (bits[127:64]) from xmm2 is copied to xmm1[127:64]. |
VRCP14SS xmm1 {k1}{z}, xmm2, xmm3/m32 | EVEX.NDS.LIG.66.0F38.W0 4D /r | avx512 | Computes the approximate reciprocal of the scalar single-precision floating-point value in xmm3/m32 and stores the result in xmm1 using writemask k1. Also, upper single-precision floating-point values (bits[127:32]) from xmm2 are copied to xmm1[127:32]. |
VRCP28PD zmm1 {k1}{z}, zmm2/m512/m64bcst {sae} | EVEX.512.66.0F38.W1 CA /r | avx512 | Computes the approximate reciprocals ( < 2^-28 relative error) of the packed double-precision floating-point values in zmm2/m512/m64bcst and stores the results in zmm1. Under writemask. |
VRCP28PS zmm1 {k1}{z}, zmm2/m512/m32bcst {sae} | EVEX.512.66.0F38.W0 CA /r | avx512 | Computes the approximate reciprocals ( < 2^-28 relative error) of the packed single-precision floating-point values in zmm2/m512/m32bcst and stores the results in zmm1. Under writemask. |
VRCP28SD xmm1 {k1}{z}, xmm2, xmm3/m64 {sae} | EVEX.NDS.LIG.66.0F38.W1 CB /r | avx512 | Computes the approximate reciprocal ( < 2^-28 relative error) of the scalar double-precision floating-point value in xmm3/m64 and stores the result in xmm1. Under writemask. Also, upper double-precision floating-point value (bits[127:64]) from xmm2 is copied to xmm1[127:64]. |
VRCP28SS xmm1 {k1}{z}, xmm2, xmm3/m32 {sae} | EVEX.NDS.LIG.66.0F38.W0 CB /r | avx512 | Computes the approximate reciprocal ( < 2^-28 relative error) of the scalar single-precision floating-point value in xmm3/m32 and stores the result in xmm1. Under writemask. Also, upper 3 single-precision floating-point values (bits[127:32]) from xmm2 are copied to xmm1[127:32]. |
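
The 14-bit reciprocal approximation plus one Newton-Raphson step x' = x*(2 - a*x) is the usual divide-free recipe: each step roughly doubles the number of good bits. A minimal sketch (assuming AVX-512F and `-mavx512f`):

```c
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    __m512 a = _mm512_set1_ps(3.0f);
    __m512 x = _mm512_rcp14_ps(a);                     /* VRCP14PS: ~1/3, 14-bit */
    __m512 two = _mm512_set1_ps(2.0f);
    x = _mm512_mul_ps(x, _mm512_fnmadd_ps(a, x, two)); /* refine: x * (2 - a*x) */
    float out[16];
    _mm512_storeu_ps(out, x);
    printf("%.9f\n", out[0]);  /* close to 0.333333333 */
    return 0;
}
```
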
VREDUCEPD xmm1 {k1}{z}, xmm2/m128/m64bcst, imm8 | EVEX.128.66.0F3A.W1 56 /r ib | avx512 | Perform reduction transformation on packed double-precision floating point values in xmm2/m128/m64bcst by subtracting a number of fraction bits specified by the imm8 field. Stores the result in xmm1 register under writemask k1. |
VREDUCEPD ymm1 {k1}{z}, ymm2/m256/m64bcst, imm8 | EVEX.256.66.0F3A.W1 56 /r ib | avx512 | Perform reduction transformation on packed double-precision floating point values in ymm2/m256/m64bcst by subtracting a number of fraction bits specified by the imm8 field. Stores the result in ymm1 register under writemask k1. |
VREDUCEPD zmm1 {k1}{z}, zmm2/m512/m64bcst{sae}, imm8 | EVEX.512.66.0F3A.W1 56 /r ib | avx512 | Perform reduction transformation on packed double-precision floating point values in zmm2/m512/m64bcst by subtracting a number of fraction bits specified by the imm8 field. Stores the result in zmm1 register under writemask k1. |
VREDUCEPS xmm1 {k1}{z}, xmm2/m128/m32bcst, imm8 | EVEX.128.66.0F3A.W0 56 /r ib | avx512 | Perform reduction transformation on packed single-precision floating point values in xmm2/m128/m32bcst by subtracting a number of fraction bits specified by the imm8 field. Stores the result in xmm1 register under writemask k1. |
VREDUCEPS ymm1 {k1}{z}, ymm2/m256/m32bcst, imm8 | EVEX.256.66.0F3A.W0 56 /r ib | avx512 | Perform reduction transformation on packed single-precision floating point values in ymm2/m256/m32bcst by subtracting a number of fraction bits specified by the imm8 field. Stores the result in ymm1 register under writemask k1. |
VREDUCEPS zmm1 {k1}{z}, zmm2/m512/m32bcst{sae}, imm8 | EVEX.512.66.0F3A.W0 56 /r ib | avx512 | Perform reduction transformation on packed single-precision floating point values in zmm2/m512/m32bcst by subtracting a number of fraction bits specified by the imm8 field. Stores the result in zmm1 register under writemask k1. |
VREDUCESD xmm1 {k1}{z}, xmm2, xmm3/m64{sae}, imm8 | EVEX.NDS.LIG.66.0F3A.W1 57 /r | avx512 | Perform a reduction transformation on a scalar double-precision floating point value in xmm3/m64 by subtracting a number of fraction bits specified by the imm8 field. Also, upper double precision floating-point value (bits[127:64]) from xmm2 are copied to xmm1[127:64]. Stores the result in xmm1 register. |
VREDUCESS xmm1 {k1}{z}, xmm2, xmm3/m32{sae}, imm8 | EVEX.NDS.LIG.66.0F3A.W0 57 /r /ib | avx512 | Perform a reduction transformation on a scalar single-precision floating point value in xmm3/m32 by subtracting a number of fraction bits specified by the imm8 field. Also, upper single precision floating-point values (bits[127:32]) from xmm2 are copied to xmm1[127:32]. Stores the result in xmm1 register. |
VRNDSCALEPD xmm1 {k1}{z}, xmm2/m128/m64bcst, imm8 | EVEX.128.66.0F3A.W1 09 /r ib | avx512 | Rounds packed double-precision floating-point values in xmm2/m128/m64bcst to a number of fraction bits specified by the imm8 field. Stores the result in xmm1 register. Under writemask. |
VRNDSCALEPD ymm1 {k1}{z}, ymm2/m256/m64bcst, imm8 | EVEX.256.66.0F3A.W1 09 /r ib | avx512 | Rounds packed double-precision floating-point values in ymm2/m256/m64bcst to a number of fraction bits specified by the imm8 field. Stores the result in ymm1 register. Under writemask. |
VRNDSCALEPD zmm1 {k1}{z}, zmm2/m512/m64bcst{sae}, imm8 | EVEX.512.66.0F3A.W1 09 /r ib | avx512 | Rounds packed double-precision floating-point values in zmm2/m512/m64bcst to a number of fraction bits specified by the imm8 field. Stores the result in zmm1 register using writemask k1. |
VRNDSCALEPS xmm1 {k1}{z}, xmm2/m128/m32bcst, imm8 | EVEX.128.66.0F3A.W0 08 /r ib | avx512 | Rounds packed single-precision floating-point values in xmm2/m128/m32bcst to a number of fraction bits specified by the imm8 field. Stores the result in xmm1 register. Under writemask. |
VRNDSCALEPS ymm1 {k1}{z}, ymm2/m256/m32bcst, imm8 | EVEX.256.66.0F3A.W0 08 /r ib | avx512 | Rounds packed single-precision floating-point values in ymm2/m256/m32bcst to a number of fraction bits specified by the imm8 field. Stores the result in ymm1 register. Under writemask. |
VRNDSCALEPS zmm1 {k1}{z}, zmm2/m512/m32bcst{sae}, imm8 | EVEX.512.66.0F3A.W0 08 /r ib | avx512 | Rounds packed single-precision floating-point values in zmm2/m512/m32bcst to a number of fraction bits specified by the imm8 field. Stores the result in zmm1 register using writemask. |
VRNDSCALESD xmm1 {k1}{z}, xmm2, xmm3/m64{sae}, imm8 | EVEX.NDS.LIG.66.0F3A.W1 0B /r ib | avx512 | Rounds scalar double-precision floating-point value in xmm3/m64 to a number of fraction bits specified by the imm8 field. Stores the result in xmm1 register. |
VRNDSCALESS xmm1 {k1}{z}, xmm2, xmm3/m32{sae}, imm8 | EVEX.NDS.LIG.66.0F3A.W0 0A /r ib | avx512 | Rounds scalar single-precision floating-point value in xmm3/m32 to a number of fraction bits specified by the imm8 field. Stores the result in xmm1 register under writemask. |
VRSQRT14PD xmm1 {k1}{z}, xmm2/m128/m64bcst | EVEX.128.66.0F38.W1 4E /r | avx512 | Computes the approximate reciprocal square roots of the packed double-precision floating-point values in xmm2/m128/m64bcst and stores the results in xmm1. Under writemask. |
VRSQRT14PD ymm1 {k1}{z}, ymm2/m256/m64bcst | EVEX.256.66.0F38.W1 4E /r | avx512 | Computes the approximate reciprocal square roots of the packed double-precision floating-point values in ymm2/m256/m64bcst and stores the results in ymm1. Under writemask. |
VRSQRT14PD zmm1 {k1}{z}, zmm2/m512/m64bcst | EVEX.512.66.0F38.W1 4E /r | avx512 | Computes the approximate reciprocal square roots of the packed double-precision floating-point values in zmm2/m512/m64bcst and stores the results in zmm1 under writemask. |
VRSQRT14PS xmm1 {k1}{z}, xmm2/m128/m32bcst | EVEX.128.66.0F38.W0 4E /r | avx512 | Computes the approximate reciprocal square roots of the packed single-precision floating-point values in xmm2/m128/m32bcst and stores the results in xmm1. Under writemask. |
VRSQRT14PS ymm1 {k1}{z}, ymm2/m256/m32bcst | EVEX.256.66.0F38.W0 4E /r | avx512 | Computes the approximate reciprocal square roots of the packed single-precision floating-point values in ymm2/m256/m32bcst and stores the results in ymm1. Under writemask. |
VRSQRT14PS zmm1 {k1}{z}, zmm2/m512/m32bcst | EVEX.512.66.0F38.W0 4E /r | avx512 | Computes the approximate reciprocal square roots of the packed single-precision floating-point values in zmm2/m512/m32bcst and stores the results in zmm1. Under writemask. |
VRSQRT14SD xmm1 {k1}{z}, xmm2, xmm3/m64 | EVEX.NDS.LIG.66.0F38.W1 4F /r | avx512 | Computes the approximate reciprocal square root of the scalar double-precision floating-point value in xmm3/m64 and stores the result in the low quadword element of xmm1 using writemask k1. Bits[127:64] of xmm2 are copied to xmm1[127:64]. |
VRSQRT14SS xmm1 {k1}{z}, xmm2, xmm3/m32 | EVEX.NDS.LIG.66.0F38.W0 4F /r | avx512 | Computes the approximate reciprocal square root of the scalar single-precision floating-point value in xmm3/m32 and stores the result in the low doubleword element of xmm1 using writemask k1. Bits[127:32] of xmm2 are copied to xmm1[127:32]. |
VRSQRT28PD zmm1 {k1}{z}, zmm2/m512/m64bcst {sae} | EVEX.512.66.0F38.W1 CC /r | avx512 | Computes approximations to the reciprocal square roots (< 2^-28 relative error) of the packed double-precision floating-point values from zmm2/m512/m64bcst and stores the results in zmm1 with writemask k1. |
VRSQRT28PS zmm1 {k1}{z}, zmm2/m512/m32bcst {sae} | EVEX.512.66.0F38.W0 CC /r | avx512 | Computes approximations to the reciprocal square roots (< 2^-28 relative error) of the packed single-precision floating-point values from zmm2/m512/m32bcst and stores the results in zmm1 with writemask k1. |
VRSQRT28SD xmm1 {k1}{z}, xmm2, xmm3/m64 {sae} | EVEX.NDS.LIG.66.0F38.W1 CD /r | avx512 | Computes the approximate reciprocal square root (< 2^-28 relative error) of the scalar double-precision floating-point value from xmm3/m64 and stores the result in xmm1 with writemask k1. Also, upper double-precision floating-point value (bits[127:64]) from xmm2 is copied to xmm1[127:64]. |
VRSQRT28SS xmm1 {k1}{z}, xmm2, xmm3/m32 {sae} | EVEX.NDS.LIG.66.0F38.W0 CD /r | avx512 | Computes the approximate reciprocal square root (< 2^-28 relative error) of the scalar single-precision floating-point value from xmm3/m32 and stores the result in xmm1 with writemask k1. Also, upper 3 single-precision floating-point values (bits[127:32]) from xmm2 are copied to xmm1[127:32]. |
VSCALEFPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F38.W1 2C /r | avx512 | Scale the packed double-precision floating-point values in xmm2 using values from xmm3/m128/m64bcst. Under writemask k1. |
VSCALEFPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F38.W1 2C /r | avx512 | Scale the packed double-precision floating-point values in ymm2 using values from ymm3/m256/m64bcst. Under writemask k1. |
VSCALEFPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er} | EVEX.NDS.512.66.0F38.W1 2C /r | avx512 | Scale the packed double-precision floating-point values in zmm2 using values from zmm3/m512/m64bcst. Under writemask k1. |
VSCALEFPS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.66.0F38.W0 2C /r | avx512 | Scale the packed single-precision floating-point values in xmm2 using values from xmm3/m128/m32bcst. Under writemask k1. |
VSCALEFPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.66.0F38.W0 2C /r | avx512 | Scale the packed single-precision floating-point values in ymm2 using values from ymm3/m256/m32bcst. Under writemask k1. |
VSCALEFPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst{er} | EVEX.NDS.512.66.0F38.W0 2C /r | avx512 | Scale the packed single-precision floating-point values in zmm2 using floating-point values from zmm3/m512/m32bcst. Under writemask k1. |
VSCALEFSD xmm1 {k1}{z}, xmm2, xmm3/m64{er} | EVEX.NDS.LIG.66.0F38.W1 2D /r | avx512 | Scale the scalar double-precision floating-point value in xmm2 using the value from xmm3/m64. Under writemask k1. |
VSCALEFSS xmm1 {k1}{z}, xmm2, xmm3/m32{er} | EVEX.NDS.LIG.66.0F38.W0 2D /r | avx512 | Scale the scalar single-precision floating-point value in xmm2 using floating-point value from xmm3/m32. Under writemask k1. |
VSCATTERDPS vm32x {k1}, xmm1 | EVEX.128.66.0F38.W0 A2 /vsib | avx512 | Using signed dword indices, scatter single-precision floating-point values to memory using writemask k1. |
VSCATTERDPS vm32y {k1}, ymm1 | EVEX.256.66.0F38.W0 A2 /vsib | avx512 | Using signed dword indices, scatter single-precision floating-point values to memory using writemask k1. |
VSCATTERDPS vm32z {k1}, zmm1 | EVEX.512.66.0F38.W0 A2 /vsib | avx512 | Using signed dword indices, scatter single-precision floating-point values to memory using writemask k1. |
VSCATTERDPD vm32x {k1}, xmm1 | EVEX.128.66.0F38.W1 A2 /vsib | avx512 | Using signed dword indices, scatter double-precision floating-point values to memory using writemask k1. |
VSCATTERDPD vm32x {k1}, ymm1 | EVEX.256.66.0F38.W1 A2 /vsib | avx512 | Using signed dword indices, scatter double-precision floating-point values to memory using writemask k1. |
VSCATTERDPD vm32y {k1}, zmm1 | EVEX.512.66.0F38.W1 A2 /vsib | avx512 | Using signed dword indices, scatter double-precision floating-point values to memory using writemask k1. |
VSCATTERQPS vm64x {k1}, xmm1 | EVEX.128.66.0F38.W0 A3 /vsib | avx512 | Using signed qword indices, scatter single-precision floating-point values to memory using writemask k1. |
VSCATTERQPS vm64y {k1}, xmm1 | EVEX.256.66.0F38.W0 A3 /vsib | avx512 | Using signed qword indices, scatter single-precision floating-point values to memory using writemask k1. |
VSCATTERQPS vm64z {k1}, ymm1 | EVEX.512.66.0F38.W0 A3 /vsib | avx512 | Using signed qword indices, scatter single-precision floating-point values to memory using writemask k1. |
VSCATTERQPD vm64x {k1}, xmm1 | EVEX.128.66.0F38.W1 A3 /vsib | avx512 | Using signed qword indices, scatter double-precision floating-point values to memory using writemask k1. |
VSCATTERQPD vm64y {k1}, ymm1 | EVEX.256.66.0F38.W1 A3 /vsib | avx512 | Using signed qword indices, scatter double-precision floating-point values to memory using writemask k1. |
VSCATTERQPD vm64z {k1}, zmm1 | EVEX.512.66.0F38.W1 A3 /vsib | avx512 | Using signed qword indices, scatter double-precision floating-point values to memory using writemask k1. |
VSCATTERPF0DPS vm32z {k1} | EVEX.512.66.0F38.W0 C6 /5 /vsib | avx512 | Using signed dword indices, prefetch sparse byte memory locations containing single-precision data using writemask k1 and T0 hint with intent to write. |
VSCATTERPF0QPS vm64z {k1} | EVEX.512.66.0F38.W0 C7 /5 /vsib | avx512 | Using signed qword indices, prefetch sparse byte memory locations containing single-precision data using writemask k1 and T0 hint with intent to write. |
VSCATTERPF0DPD vm32y {k1} | EVEX.512.66.0F38.W1 C6 /5 /vsib | avx512 | Using signed dword indices, prefetch sparse byte memory locations containing double-precision data using writemask k1 and T0 hint with intent to write. |
VSCATTERPF0QPD vm64z {k1} | EVEX.512.66.0F38.W1 C7 /5 /vsib | avx512 | Using signed qword indices, prefetch sparse byte memory locations containing double-precision data using writemask k1 and T0 hint with intent to write. |
VSCATTERPF1DPS vm32z {k1} | EVEX.512.66.0F38.W0 C6 /6 /vsib | avx512 | Using signed dword indices, prefetch sparse byte memory locations containing single-precision data using writemask k1 and T1 hint with intent to write. |
VSCATTERPF1QPS vm64z {k1} | EVEX.512.66.0F38.W0 C7 /6 /vsib | avx512 | Using signed qword indices, prefetch sparse byte memory locations containing single-precision data using writemask k1 and T1 hint with intent to write. |
VSCATTERPF1DPD vm32y {k1} | EVEX.512.66.0F38.W1 C6 /6 /vsib | avx512 | Using signed dword indices, prefetch sparse byte memory locations containing double-precision data using writemask k1 and T1 hint with intent to write. |
VSCATTERPF1QPD vm64z {k1} | EVEX.512.66.0F38.W1 C7 /6 /vsib | avx512 | Using signed qword indices, prefetch sparse byte memory locations containing double-precision data using writemask k1 and T1 hint with intent to write. |
VSHUFF32X4 ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst, imm8 | EVEX.NDS.256.66.0F3A.W0 23 /r ib | avx512 | Shuffle 128-bit packed single-precision floating-point values selected by imm8 from ymm2 and ymm3/m256/m32bcst and place results in ymm1 subject to writemask k1. |
VSHUFF32X4 zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst, imm8 | EVEX.NDS.512.66.0F3A.W0 23 /r ib | avx512 | Shuffle 128-bit packed single-precision floating-point values selected by imm8 from zmm2 and zmm3/m512/m32bcst and place results in zmm1 subject to writemask k1. |
VSHUFF64X2 ymm1{k1}{z}, ymm2, ymm3/m256/m64bcst, imm8 | EVEX.NDS.256.66.0F3A.W1 23 /r ib | avx512 | Shuffle 128-bit packed double-precision floating-point values selected by imm8 from ymm2 and ymm3/m256/m64bcst and place results in ymm1 subject to writemask k1. |
VSHUFF64X2 zmm1{k1}{z}, zmm2, zmm3/m512/m64bcst, imm8 | EVEX.NDS.512.66.0F3A.W1 23 /r ib | avx512 | Shuffle 128-bit packed double-precision floating-point values selected by imm8 from zmm2 and zmm3/m512/m64bcst and place results in zmm1 subject to writemask k1. |
VSHUFI32X4 ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst, imm8 | EVEX.NDS.256.66.0F3A.W0 43 /r ib | avx512 | Shuffle 128-bit packed double-word values selected by imm8 from ymm2 and ymm3/m256/m32bcst and place results in ymm1 subject to writemask k1. |
VSHUFI32X4 zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst, imm8 | EVEX.NDS.512.66.0F3A.W0 43 /r ib | avx512 | Shuffle 128-bit packed double-word values selected by imm8 from zmm2 and zmm3/m512/m32bcst and place results in zmm1 subject to writemask k1. |
VSHUFI64X2 ymm1{k1}{z}, ymm2, ymm3/m256/m64bcst, imm8 | EVEX.NDS.256.66.0F3A.W1 43 /r ib | avx512 | Shuffle 128-bit packed quad-word values selected by imm8 from ymm2 and ymm3/m256/m64bcst and place results in ymm1 subject to writemask k1. |
VSHUFI64X2 zmm1{k1}{z}, zmm2, zmm3/m512/m64bcst, imm8 | EVEX.NDS.512.66.0F3A.W1 43 /r ib | avx512 | Shuffle 128-bit packed quad-word values selected by imm8 from zmm2 and zmm3/m512/m64bcst and place results in zmm1 subject to writemask k1. |
VTESTPS xmm1, xmm2/m128 | VEX.128.66.0F38.W0 0E /r | avx | Set ZF and CF depending on sign bit AND and ANDN of packed single-precision floating-point sources. |
VTESTPS ymm1, ymm2/m256 | VEX.256.66.0F38.W0 0E /r | avx | Set ZF and CF depending on sign bit AND and ANDN of packed single-precision floating-point sources. |
VTESTPD xmm1, xmm2/m128 | VEX.128.66.0F38.W0 0F /r | avx | Set ZF and CF depending on sign bit AND and ANDN of packed double-precision floating-point sources. |
VTESTPD ymm1, ymm2/m256 | VEX.256.66.0F38.W0 0F /r | avx | Set ZF and CF depending on sign bit AND and ANDN of packed double-precision floating-point sources. |
VZEROALL | VEX.256.0F.WIG 77 | avx | Zero all YMM registers. |
VZEROUPPER | VEX.128.0F.WIG 77 | avx | Zero upper 128 bits of all YMM registers. |
WAIT | 9B | Check pending unmasked floating-point exceptions. | |
WBINVD | 0F 09 | Write back and flush internal caches; initiate writing-back and flushing of external caches. | |
WRFSBASE r32 | F3 0F AE /2 | fsgsbase | Load the FS base address with the 32-bit value in the source register. |
WRFSBASE r64 | F3 REX.W 0F AE /2 | fsgsbase | Load the FS base address with the 64-bit value in the source register. |
WRGSBASE r32 | F3 0F AE /3 | fsgsbase | Load the GS base address with the 32-bit value in the source register. |
WRGSBASE r64 | F3 REX.W 0F AE /3 | fsgsbase | Load the GS base address with the 64-bit value in the source register. |
WRMSR | 0F 30 | Write the value in EDX:EAX to the MSR specified by ECX. | |
WRPKRU | 0F 01 EF | ospke | Writes EAX into PKRU. |
XABORT imm8 | C6 F8 ib | rtm | Causes an RTM abort if in RTM execution. |
XACQUIRE | F2 | hle | A hint used with an “XACQUIRE-enabled” instruction to start lock elision on the instruction memory operand address. |
XRELEASE | F3 | hle | A hint used with an “XRELEASE-enabled” instruction to end lock elision on the instruction memory operand address. |
XADD r/m8, r8 | 0F C0 /r | Exchange r8 and r/m8; load sum into r/m8. | |
XADD r/m8, r8 | REX + 0F C0 /r | Exchange r8 and r/m8; load sum into r/m8. | |
XADD r/m16, r16 | 0F C1 /r | Exchange r16 and r/m16; load sum into r/m16. | |
XADD r/m32, r32 | 0F C1 /r | Exchange r32 and r/m32; load sum into r/m32. | |
XADD r/m64, r64 | REX.W + 0F C1 /r | Exchange r64 and r/m64; load sum into r/m64. | |
XBEGIN rel16 | C7 F8 | rtm | Specifies the start of an RTM region. Provides a 16-bit relative offset used to compute the address of the fallback instruction at which execution resumes following an RTM abort. |
XBEGIN rel32 | C7 F8 | rtm | Specifies the start of an RTM region. Provides a 32-bit relative offset used to compute the address of the fallback instruction at which execution resumes following an RTM abort. |
XCHG AX, r16 | 90+rw | Exchange r16 with AX. | |
XCHG r16, AX | 90+rw | Exchange AX with r16. | |
XCHG EAX, r32 | 90+rd | Exchange r32 with EAX. | |
XCHG RAX, r64 | REX.W + 90+rd | Exchange r64 with RAX. | |
XCHG r32, EAX | 90+rd | Exchange EAX with r32. | |
XCHG r64, RAX | REX.W + 90+rd | Exchange RAX with r64. | |
XCHG r/m8, r8 | 86 /r | Exchange r8 (byte register) with byte from r/m8. | |
XCHG r/m8, r8 | REX + 86 /r | Exchange r8 (byte register) with byte from r/m8. | |
XCHG r8, r/m8 | 86 /r | Exchange byte from r/m8 with r8 (byte register). | |
XCHG r8, r/m8 | REX + 86 /r | Exchange byte from r/m8 with r8 (byte register). | |
XCHG r/m16, r16 | 87 /r | Exchange r16 with word from r/m16. | |
XCHG r16, r/m16 | 87 /r | Exchange word from r/m16 with r16. | |
XCHG r/m32, r32 | 87 /r | Exchange r32 with doubleword from r/m32. | |
XCHG r/m64, r64 | REX.W + 87 /r | Exchange r64 with quadword from r/m64. | |
XCHG r32, r/m32 | 87 /r | Exchange doubleword from r/m32 with r32. | |
XCHG r64, r/m64 | REX.W + 87 /r | Exchange quadword from r/m64 with r64. | |
XEND | 0F 01 D5 | rtm | Specifies the end of an RTM code region. |
XGETBV | 0F 01 D0 | Reads an XCR specified by ECX into EDX:EAX. | |
XLAT m8 | D7 | Set AL to memory byte DS:[(E)BX + unsigned AL]. | |
XLATB | D7 | Set AL to memory byte DS:[(E)BX + unsigned AL]. | |
XLATB | REX.W + D7 | Set AL to memory byte [RBX + unsigned AL]. | |
XOR AL, imm8 | 34 ib | AL XOR imm8. | |
XOR AX, imm16 | 35 iw | AX XOR imm16. | |
XOR EAX, imm32 | 35 id | EAX XOR imm32. | |
XOR RAX, imm32 | REX.W + 35 id | RAX XOR imm32 (sign-extended). | |
XOR r/m8, imm8 | 80 /6 ib | r/m8 XOR imm8. | |
XOR r/m8, imm8 | REX + 80 /6 ib | r/m8 XOR imm8. | |
XOR r/m16, imm16 | 81 /6 iw | r/m16 XOR imm16. | |
XOR r/m32, imm32 | 81 /6 id | r/m32 XOR imm32. | |
XOR r/m64, imm32 | REX.W + 81 /6 id | r/m64 XOR imm32 (sign-extended). | |
XOR r/m16, imm8 | 83 /6 ib | r/m16 XOR imm8 (sign-extended). | |
XOR r/m32, imm8 | 83 /6 ib | r/m32 XOR imm8 (sign-extended). | |
XOR r/m64, imm8 | REX.W + 83 /6 ib | r/m64 XOR imm8 (sign-extended). | |
XOR r/m8, r8 | 30 /r | r/m8 XOR r8. | |
XOR r/m8, r8 | REX + 30 /r | r/m8 XOR r8. | |
XOR r/m16, r16 | 31 /r | r/m16 XOR r16. | |
XOR r/m32, r32 | 31 /r | r/m32 XOR r32. | |
XOR r/m64, r64 | REX.W + 31 /r | r/m64 XOR r64. | |
XOR r8, r/m8 | 32 /r | r8 XOR r/m8. | |
XOR r8, r/m8 | REX + 32 /r | r8 XOR r/m8. | |
XOR r16, r/m16 | 33 /r | r16 XOR r/m16. | |
XOR r32, r/m32 | 33 /r | r32 XOR r/m32. | |
XOR r64, r/m64 | REX.W + 33 /r | r64 XOR r/m64. | |
XORPD xmm1, xmm2/m128 | 66 0F 57 /r | sse2 | Return the bitwise logical XOR of packed double-precision floating-point values in xmm1 and xmm2/mem. |
VXORPD xmm1, xmm2, xmm3/m128 | VEX.NDS.128.66.0F.WIG 57 /r | avx | Return the bitwise logical XOR of packed double-precision floating-point values in xmm2 and xmm3/mem. |
VXORPD ymm1, ymm2, ymm3/m256 | VEX.NDS.256.66.0F.WIG 57 /r | avx | Return the bitwise logical XOR of packed double-precision floating-point values in ymm2 and ymm3/mem. |
VXORPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst | EVEX.NDS.128.66.0F.W1 57 /r | avx512 | Return the bitwise logical XOR of packed double-precision floating-point values in xmm2 and xmm3/m128/m64bcst subject to writemask k1. |
VXORPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst | EVEX.NDS.256.66.0F.W1 57 /r | avx512 | Return the bitwise logical XOR of packed double-precision floating-point values in ymm2 and ymm3/m256/m64bcst subject to writemask k1. |
VXORPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst | EVEX.NDS.512.66.0F.W1 57 /r | avx512 | Return the bitwise logical XOR of packed double-precision floating-point values in zmm2 and zmm3/m512/m64bcst subject to writemask k1. |
XORPS xmm1, xmm2/m128 | 0F 57 /r | sse | Return the bitwise logical XOR of packed single-precision floating-point values in xmm1 and xmm2/mem. |
VXORPS xmm1, xmm2, xmm3/m128 | VEX.NDS.128.0F.WIG 57 /r | avx | Return the bitwise logical XOR of packed single-precision floating-point values in xmm2 and xmm3/mem. |
VXORPS ymm1, ymm2, ymm3/m256 | VEX.NDS.256.0F.WIG 57 /r | avx | Return the bitwise logical XOR of packed single-precision floating-point values in ymm2 and ymm3/mem. |
VXORPS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst | EVEX.NDS.128.0F.W0 57 /r | avx512 | Return the bitwise logical XOR of packed single-precision floating-point values in xmm2 and xmm3/m128/m32bcst subject to writemask k1. |
VXORPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst | EVEX.NDS.256.0F.W0 57 /r | avx512 | Return the bitwise logical XOR of packed single-precision floating-point values in ymm2 and ymm3/m256/m32bcst subject to writemask k1. |
VXORPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst | EVEX.NDS.512.0F.W0 57 /r | avx512 | Return the bitwise logical XOR of packed single-precision floating-point values in zmm2 and zmm3/m512/m32bcst subject to writemask k1. |
XRSTOR mem | 0F AE /5 | Restore state components specified by EDX:EAX from mem. | |
XRSTOR64 mem | REX.W + 0F AE /5 | Restore state components specified by EDX:EAX from mem. | |
XRSTORS mem | 0F C7 /3 | Restore state components specified by EDX:EAX from mem. | |
XRSTORS64 mem | REX.W + 0F C7 /3 | Restore state components specified by EDX:EAX from mem. | |
XSAVE mem | 0F AE /4 | Save state components specified by EDX:EAX to mem. | |
XSAVE64 mem | REX.W + 0F AE /4 | Save state components specified by EDX:EAX to mem. | |
XSAVEC mem | 0F C7 /4 | Save state components specified by EDX:EAX to mem with compaction. | |
XSAVEC64 mem | REX.W + 0F C7 /4 | Save state components specified by EDX:EAX to mem with compaction. | |
XSAVEOPT mem | 0F AE /6 | xsaveopt | Save state components specified by EDX:EAX to mem, optimizing if possible. |
XSAVEOPT64 mem | REX.W + 0F AE /6 | xsaveopt | Save state components specified by EDX:EAX to mem, optimizing if possible. |
XSAVES mem | 0F C7 /5 | Save state components specified by EDX:EAX to mem with compaction, optimizing if possible. | |
XSAVES64 mem | REX.W + 0F C7 /5 | Save state components specified by EDX:EAX to mem with compaction, optimizing if possible. | |
XSETBV | 0F 01 D1 | Write the value in EDX:EAX to the XCR specified by ECX. | |
XTEST | 0F 01 D6 | hle | Test if executing in a transactional region. |
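
As a usage note for the RTM rows above (XBEGIN, XEND, XABORT, XTEST): in practice these instructions are usually reached from C/C++ through the `<immintrin.h>` intrinsics `_xbegin()`, `_xend()`, `_xabort(imm8)` and `_xtest()` rather than hand-written assembly. The sketch below shows that mapping under stated assumptions; the retry count and the plain fallback path are illustrative choices, not part of the instruction reference.

```c
/*
 * Minimal RTM sketch: _xbegin() emits XBEGIN, _xend() emits XEND.
 * Build with GCC/Clang:  cc -O2 -mrtm rtm_sketch.c
 * Assumption: the CPU supports RTM (CPUID.(EAX=07H,ECX=0):EBX.RTM[bit 11]);
 * on other CPUs XBEGIN raises #UD, so real code must check CPUID first.
 */
#include <immintrin.h>
#include <stdio.h>

static long counter = 0;

/* Try the update transactionally a few times, then fall back.
 * The plain fallback below is illustrative only; a production fallback
 * needs a lock that the transactional path also reads, so the two paths
 * cannot run concurrently. */
static void increment(void)
{
    for (int attempt = 0; attempt < 3; ++attempt) {
        unsigned int status = _xbegin();   /* XBEGIN; returns _XBEGIN_STARTED on entry */
        if (status == _XBEGIN_STARTED) {
            counter += 1;                  /* executed inside the RTM region */
            _xend();                       /* XEND; commits the region */
            return;
        }
        /* Aborted: `status` holds the abort code XBEGIN placed in EAX. */
    }
    counter += 1;                          /* non-transactional fallback (see caveat) */
}

int main(void)
{
    increment();
    printf("counter = %ld\n", counter);
    return 0;
}
```

The HLE hints (XACQUIRE/XRELEASE), by contrast, need no intrinsics at all: they are the F2/F3 prefix bytes placed in front of a LOCK-prefixed instruction, and older CPUs simply ignore them.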