From OSx86
Jump to: navigation, search

Extracted from Wikipedia:

SSE (Streaming SIMD|SIMD Extensions, originally called ISSE, Internet Streaming SIMD Extensions) is a SIMD (Single Instruction, Multiple Data) instruction set designed by Intel and introduced in 1999 in their Pentium III series processors as a reply to AMD's 3DNow! (which had debuted a year earlier).

SSE contains 70 new instructions.

It was originally known as KNI for Katmai New Instructions (Katmai was the code name for the first Pentium III core revision). During the Katmai project Intel was looking to distinguish it from their earlier product line, particularly their flagship Pentium II. AMD eventually added support for SSE instructions, starting with its Athlon XP processor.

Intel was generally disappointed with their first IA-32 SIMD effort, MMX. MMX had two main problems: it re-used existing floating point registers making the Central processing unit|CPU unable to work on both floating point and SIMD data at the same time, and it only worked on integers.

SSE adds eight new 128-bit registers known as XMM0 through XMM7. Each register packs together four 32-bit single-precision floating point numbers. Integer SIMD operations may still be performed with the eight 64-bit MMX registers.

Because these 128-bit registers are additional program states that the operating system must preserve across task switches, they are disabled by default until the operating system explicitly enables them. This means that the OS must know how to use the FXSAVE and FXRSTR instructions, which is the extended pair of instructions which can save all x87 and SSE register states all at once. This support was quickly added to all major IA-32 operating systems.

Because SSE adds floating point support, it sees much more use than MMX. The addition of SSE2's integer support makes SSE even more flexible. While MMX is redundant, operations can be operated in parallel with SSE operations offering further performance increases in some situations.

The first CPU to support SSE, the Pentium III, shared execution resources between SSE and the floating point unit|FPU. While a compiled application can interleave FPU and SSE instructions side-by-side, the Pentium III will not issue a FPU and a SSE instruction in the same clock-cycle. This limitation reduces the effectiveness of Instruction pipeline|pipelining, but the separate XMM registers do allow SIMD and scalar floating point operations to be mixed without the performance hit from explicit MMX/floating point mode switching.

[edit] Later Versions

  • SSE2, introduced with the Pentium 4, is a major enhancement to SSE (which some programmers renamed "SSE1"). SSE2 adds new math instructions for double-precision (64-bit) floating point and 8/16/32-bit integer data types, all operating on the same 128-bit XMM vector register-file previously introduced with SSE. SSE2 enables the programmer to perform SIMD math of virtually any type (from 8-bit integer to 64-bit float) entirely with the XMM vector-register file, without the need to touch the (legacy) MMX/FPU registers. Many programmers consider SSE2 to be "everything SSE should have been", as SSE2 offers an orthogonal set of instructions for dealing with common datatypes.
  • SSE3 called Prescott New Instructions, is an incremental upgrade to SSE2, adding a handful of DSP-oriented mathemathics instructions and some process (thread) management instructions.
  • SSSE3 is an incremental upgrade to SSE3, adding 16 new opcodes which include permuting the bytes in a word, multiplying 16-bit fixed-point numbers with correct rounding, and within-word accumulate instructions. SSSE3 is often mistaken for SSE4 as this term was used during the development of the Core microarchitecture.
  • SSE4 is another major enhancement, adding a dot product instruction, lots of additional integer instructions, a popcnt instruction, and more. No processors on the market officially support SSE 4, however SSE 4 will be supported on 'Intel Core microarchitecture | Penryn' a version of the Core 2 microarchitecture. An overview can be found at .

This page was last modified on 17 January 2007, at 15:39.
This page has been accessed 9,755 times.
Powered by MediaWiki © 2021 OSx86 Project  |   InsanelyMac  |   Forum  |   OSx86 Wiki   |   Privacy policy   |   About OSx86   |   Disclaimers