
Intel's Black Technology for AI on the PC: The AVX-512 Instruction Set in Consumer Processors

CPU "toothpaste squeezing" (releasing only small, incremental upgrades each generation) has always been a topic the market cannot avoid, and jokes about Intel squeezing toothpaste are endless. In fact, Intel has quietly brought many supercomputing technologies to consumer-grade processors, such as AVX-512 and its VNNI extension, steadily improving the processor's AI performance and, with it, overall performance.
Many users believe that PC processors have hit a bottleneck in both performance and applications. For example, browsing the web on a 16-core CPU is not much faster than on a 4-core CPU, so PC processors appear to have "excess performance". This view is one-sided. Professional users, needless to say, always pursue maximum performance. And with the rapid development of Internet technology, ordinary users work with images and video more and more, and expect more from them: converting speech to text, automatically enhancing videos and images, and so on. Compared with text processing, these tasks raise processor performance requirements dramatically. If the workloads and technologies of supercomputing can be brought to the consumer market, users will clearly feel the improvement in PC performance, because supercomputers are highly parallel, massively scaled machines that place very high demands on core count and parallel performance.

In fact, Intel foresaw this trend and demand long ago and has been committed to meeting it. More importantly, Intel has acted on it: not at some point in the next 5-10 years, but already today, it has brought a large amount of supercomputing technology to the consumer market and to ordinary consumers. Why can we say that?
Intel is a dominant player in the supercomputing market, especially in supercomputing CPUs. At present, nearly 95% of the 500 fastest supercomputers use Intel CPUs, so when it comes to core CPU technology in supercomputing, Intel is the undisputed leader. In addition, Intel's latest Xeon Scalable processors for supercomputing bring many new features beyond a higher core count, including the faster point-to-point CPU interconnect UPI and the Omni-Path node interconnect. On the software side, the Intel Parallel Studio suite provides tools ranging from the development environment and performance tuning to high-performance math libraries and compilers, giving developers and users comprehensive support for building high-performance applications. Combined with thorough vectorization, this can deliver substantial performance gains.
Inside the CPU core, the outwardly unremarkable Xeon processor supports the latest Advanced Vector Extensions instruction set, AVX-512. This is the latest wide-vector implementation on x86 CPUs. Intel provides 512-bit data paths and execution units, so the CPU can process 512 bits of packed vector data at a time, and the register file expands to 32 512-bit ZMM registers to hold data in flight. It also supports fused multiply-add (FMA) operations. Compared with the 256-bit AVX2 found in current mainstream and competing products, vector width is doubled, and, more importantly, a large number of supplementary extensions greatly accelerate certain specific operations, so the speedup for those operations can be more than double.
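The "doubled" figure follows from simple lane arithmetic. The sketch below is a back-of-the-envelope calculation, not a benchmark: the helper names are made up for illustration, and it assumes two FMA units per core for both AVX2 and AVX-512 and ignores clock-frequency differences between the two modes.

```python
# Back-of-the-envelope vector-width arithmetic; helper names are illustrative.

AVX2_BITS = 256
AVX512_BITS = 512


def lanes(register_bits: int, element_bits: int) -> int:
    """Number of elements of a given width that fit in one vector register."""
    return register_bits // element_bits


def peak_flops_per_cycle(register_bits: int, element_bits: int, fma_units: int) -> int:
    """Theoretical floating-point operations per cycle per core.

    Each FMA counts as two operations (one multiply, one add) per lane.
    The number of FMA units is an assumption supplied by the caller.
    """
    return lanes(register_bits, element_bits) * 2 * fma_units


if __name__ == "__main__":
    for name, bits in (("AVX2", AVX2_BITS), ("AVX-512", AVX512_BITS)):
        print(
            name,
            "fp32 lanes:", lanes(bits, 32),
            "fp64 lanes:", lanes(bits, 64),
            "peak fp32 FLOPs/cycle with 2 FMA units:",
            peak_flops_per_cycle(bits, 32, 2),
        )
```

Real-world gains depend on memory bandwidth, how well the code vectorizes, and the clock speed the chip sustains under wide-vector load, so the ratio here is only an upper bound.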
Capability like this needs very demanding applications to show its strength. AVX-512 is already widely used in supercomputing and scientific computing to improve efficiency: NAMD, GROMACS, LAMMPS, Intel Media SDK, the OSPRay rendering engine, and applications in many other fields can be accelerated with AVX-512 for faster computation or richer effects in graphics and multimedia. Today these are mostly professional and supercomputing applications, but Intel has already started bringing the technology to consumers. The current Core i9 X-series processors all support the AVX-512 instruction set and keep the same two 512-bit FMA units as high-end server parts.
In notebooks, the 10th-generation Intel Core Ice Lake series launched last year also supports the AVX-512 instruction set. In the future, the entire Intel Core product line will support AVX-512 and its latest extensions. In other words, you do not need to wait 5-10 years; you can embrace supercomputing technology now. This reflects Intel's foresight as a technology leader.
The speech-to-text application mentioned above has also been put into practice by Intel on a large scale. Intel has long promoted AI inference, which is now widely used in speech, image, and text recognition. VNNI, an extension of the AVX-512 instruction set, is Intel's latest instruction set extension for accelerating AI inference. By fusing the three instructions originally required for an int8 multiply-accumulate into a single instruction, it greatly increases the speed of int8 convolution calculations in AI inference.
With VNNI's VPDPBUSD instruction, the int8 multiply and int32 accumulate that previously took three instructions (VPMADDUBSW, VPMADDWD, and VPADDD) complete in one. (VPDPWSSD is the corresponding VNNI instruction for int16 inputs.)
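To make the fusion concrete, here is a minimal Python emulation of what each 32-bit lane computes. It is a sketch of the instruction semantics, not real SIMD code: the function names are illustrative, inputs are assumed to be already in range (unsigned 8-bit for `a_u8`, signed 8-bit for `b_s8`), and the three-step path models VPMADDUBSW, then VPMADDWD against a vector of ones, then VPADDD.

```python
# Scalar emulation of the int8 dot-product paths; names are illustrative.


def wrap_int32(x: int) -> int:
    """Wrap a Python int to signed 32-bit two's complement."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x >= (1 << 31) else x


def saturate_int16(x: int) -> int:
    """Clamp to the signed 16-bit range, as VPMADDUBSW does."""
    return max(-32768, min(32767, x))


def fused_dot_accumulate(acc, a_u8, b_s8):
    """VPDPBUSD-style path: per int32 lane, add four u8*s8 products to acc."""
    out = []
    for i, c in enumerate(acc):
        dot = sum(a_u8[4 * i + j] * b_s8[4 * i + j] for j in range(4))
        out.append(wrap_int32(c + dot))
    return out


def three_step_dot_accumulate(acc, a_u8, b_s8):
    """Older path: three separate steps, with an int16 intermediate."""
    # Step 1 (VPMADDUBSW): multiply u8*s8, add adjacent pairs, saturate to int16.
    pairs = [
        saturate_int16(a_u8[k] * b_s8[k] + a_u8[k + 1] * b_s8[k + 1])
        for k in range(0, len(a_u8), 2)
    ]
    # Step 2 (VPMADDWD with ones): add adjacent int16 pairs into int32.
    quads = [pairs[k] + pairs[k + 1] for k in range(0, len(pairs), 2)]
    # Step 3 (VPADDD): add the int32 sums to the accumulators.
    return [wrap_int32(c + q) for c, q in zip(acc, quads)]


if __name__ == "__main__":
    acc = [100, -5]
    a = [1, 2, 3, 4, 10, 20, 30, 40]      # unsigned 8-bit activations
    b = [1, -1, 2, -2, 1, 1, 1, 1]        # signed 8-bit weights
    print(fused_dot_accumulate(acc, a, b))
    print(three_step_dot_accumulate(acc, a, b))
```

Besides saving two instructions, the fused form avoids the int16 saturation of the intermediate step, so the two paths can differ when adjacent products are large (for example, 255 * 127 twice in a row).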
With AVX-512 VNNI support in the 10th-generation Intel Core X series and in Ice Lake, Intel has also brought its latest AI inference technology to the consumer market. Together with the latest image recognition, classification, speech, and text recognition applications and Intel's OpenVINO inference optimization toolkit, it greatly improves the user experience in text and image recognition applications and completes some image processing tasks faster.
That is not all. Starting with the sixth-generation Core processors, Intel has supported the TSX instruction set (Transactional Synchronization Extensions) on some mainstream models. This is a transactional memory extension, designed for highly concurrent workloads such as database transactions. When multiple threads modify a data table concurrently, the program usually has to take locks so that each modification is checked and arbitrated. But the lock itself is implemented in program code, and executing lock operations sharply reduces concurrency and adds load on the CPU. With TSX, a coarse-grained lock can wrap the critical section that contains the transactional operations; the hardware automatically detects data conflicts during execution to guarantee the correctness of the transaction, and by running non-conflicting operations in parallel it uncovers more opportunities for parallelism. Many emulator users now pay close attention to TSX support because it can greatly improve the efficiency of demanding emulators, such as PS3 emulators. It is also a signature technique for server-side database transactions. With TSX, throughput on simple transactions can be up to 10 times higher than on competing products without such support, and this professional-grade instruction set is available on X-series processors and higher-end desktop processors. (See Intel ARK for details.)
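The idea behind transactional execution can be sketched in software. The example below is only an analogy, not TSX itself: the class, function, and parameter names are hypothetical, a version counter stands in for the hardware's conflict detection, and the `interfere` hook exists purely to simulate a concurrent writer. It follows the usual pattern of trying the critical section optimistically, retrying on conflict, and falling back to the ordinary lock after a few failures.

```python
# Software analogy of optimistic (transactional) execution with a fallback lock.
# All names are hypothetical; real TSX does conflict detection in hardware.
import threading


class VersionedAccount:
    """Shared data guarded by a fallback lock plus a version counter."""

    def __init__(self, balance: int):
        self.balance = balance
        self.version = 0            # bumped on every committed write
        self.lock = threading.Lock()
        self.commits = 0            # optimistic commits
        self.fallbacks = 0          # updates done under the fallback lock


def transact(account, update, max_retries=3, interfere=None):
    """Apply update(balance) -> new balance, optimistically if possible."""
    for attempt in range(max_retries):
        seen_version = account.version            # "begin": take a snapshot
        new_balance = update(account.balance)     # speculative work, no lock held
        if interfere is not None:
            interfere(account, attempt)           # test hook: a competing writer
        with account.lock:                        # short validate-and-commit step
            if account.version == seen_version:   # nobody wrote in between
                account.balance = new_balance
                account.version += 1
                account.commits += 1
                return "committed"
        # Conflict detected: discard the speculative result and retry.
    # Too many conflicts: do the whole update under the lock.
    with account.lock:
        account.balance = update(account.balance)
        account.version += 1
        account.fallbacks += 1
        return "fallback"


def competing_writer(account, attempt):
    """Simulated concurrent writer that always causes a conflict."""
    with account.lock:
        account.balance += 1
        account.version += 1


if __name__ == "__main__":
    acct = VersionedAccount(100)
    print(transact(acct, lambda b: b + 10), acct.balance)
    print(transact(acct, lambda b: b * 2, interfere=competing_writer), acct.balance)
```

When conflicts are rare, most updates commit without serializing on the lock for the duration of the work, which is where the extra parallelism comes from; when conflicts are frequent, the fallback path keeps the result correct.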
Innovative technology companies such as Intel stay ahead of the industry. Through a powerful combination of software and hardware, Intel has brought many enterprise-class and supercomputing-class technologies to consumers. Applying AI technology and performance on the PC keeps opening up new experiences for consumers and lays a solid foundation for meeting the needs of the latest applications. In my view, only Intel can do this at present. With this black technology on board, do you still think Intel's CPUs are just squeezing toothpaste?
