Overview of AI Chips
2024-02-27 13:52:54
01 What are AI chips and their applications
AI chips, in the broad sense, are chips that can run artificial intelligence (AI) algorithms; in the narrow sense, they are chips specifically designed to accelerate AI algorithms. At the current stage, AI algorithms are dominated by deep learning.
The three key foundational elements of the AI industry are algorithms, computing power, and data. As AI continues to develop, the demand for computing power is growing rapidly. Existing general-purpose chips contain a large amount of general-purpose logic, but much of that logic goes unused by AI algorithms, so these chips cannot deliver optimal cost-effectiveness. It is therefore necessary to develop dedicated AI chips that accelerate AI computing.
AI chips can be used in many fields of artificial intelligence, such as speech recognition, image processing, natural language processing, recommendation systems, and autonomous driving:
- Speech recognition: AI chips can power speech-recognition applications that convert speech into text, supporting voice input, processing, and transcription. Typical scenarios include intelligent customer service and intelligent assistants.
- Image processing: AI chips can power image-processing applications that recognize and classify images and extract features from them. Typical scenarios include security monitoring, intelligent driving, and medical imaging.
- Natural language processing: AI chips can power natural-language-processing applications that perform text analysis, semantic understanding, and related operations on human language. Typical scenarios include intelligent question answering and intelligent recommendation.
- Recommendation systems: AI chips can power recommendation systems that provide personalized suggestions based on a user's historical behavior and preferences. Typical scenarios include e-commerce platforms and video websites.
- Autonomous driving: AI chips can power autonomous-driving applications, achieving self-driving and vehicle control through onboard sensors and algorithms. Typical scenarios include autonomous vehicles and drones.
02 Architecture types of AI chips
With the development of the AI industry, four types of AI chip architecture have emerged: GPU, FPGA, ASIC, and brain-like (neuromorphic) chips. The first three are based on the traditional von Neumann computing architecture and focus on accelerating hardware computing power; brain-like chips abandon the von Neumann architecture and are instead designed around brain-like neural structures to enhance computing power.
- GPU (Graphics Processing Unit): Adopts the SIMD (single instruction, multiple data) execution model, with a large number of computing units and a very long graphics-processing pipeline. Because most of its transistors form specialized circuits and multiple pipelines, a GPU computes much faster than a CPU and has far stronger floating-point capability, which unlocks the potential of AI; GPUs are therefore widely used for deep learning. NVIDIA and AMD, the two GPU giants, together account for roughly 70% of the GPU market for AI.
- FPGA (Field Programmable Gate Array): Hardware designs can be repeatedly burned into its programmable memory, allowing the same FPGA chip to realize different hardware designs and functions. An FPGA implements the AI computing architecture directly in hardware circuits and completes calculations by continuously streaming data through the system. Unlike GPUs, FPGAs combine hardware pipeline parallelism with data parallelism, making them well suited to processing data streams in a pipelined fashion, and therefore to the AI inference stage.
- ASIC (Application-Specific Integrated Circuit): A chip with a customized architecture designed for the specific algorithmic requirements of AI, which improves the chip's performance-to-power ratio. The disadvantage is that a fully custom circuit design means a relatively long development cycle and no flexibility to extend; the advantage is large gains in power consumption, reliability, chip size, and performance.
- Brain-like (neuromorphic) chip: Designed directly on a neuromorphic architecture to simulate how the human brain computes, perceives, behaves, and thinks, though the research and development difficulty is enormous. Brain-like chips do not use the classic von Neumann architecture; a representative example is IBM's TrueNorth, whose researchers built a prototype neural chip using storage units as synapses, computing units as neurons, and transmission units as axons. In China, the brain-computing center at Tsinghua University developed the country's first large-scale neuromorphic brain-computing chip in November 2015.
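The SIMD model mentioned for GPUs can be sketched in plain Python (illustrative only; real GPU code would be written in CUDA or a similar framework, and the function names here are our own):

```python
# SIMD idea: ONE instruction (here, fused multiply-add) is applied to
# MANY data elements at once, instead of one element per instruction.

def simd_fma(a, b, c):
    """Apply the same multiply-add operation across whole vectors."""
    return [ai * bi + ci for ai, bi, ci in zip(a, b, c)]

def scalar_fma(a, b, c):
    """CPU-style scalar loop: one element processed per iteration."""
    out = []
    for i in range(len(a)):
        out.append(a[i] * b[i] + c[i])
    return out

x = [1.0, 2.0, 3.0, 4.0]
w = [0.5, 0.5, 0.5, 0.5]
bias = [1.0, 1.0, 1.0, 1.0]
print(simd_fma(x, w, bias))  # [1.5, 2.0, 2.5, 3.0]
```

Both functions compute the same result; the point is that the SIMD form expresses the whole-vector operation that GPU hardware executes across many lanes in parallel.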
Table 1: Characteristics and specific parameter comparison of AI chip architecture types
| Characteristic | CPU | GPU | FPGA | ASIC |
| --- | --- | --- | --- | --- |
| Basic architecture | ~60% logic units, ~40% computing units | 60%–70% computing units, ~30% logic units | Programmable gate-circuit resources | Hardwired gate-circuit resources |
| Customization level | General-purpose | General-purpose | Semi-custom | Fully custom |
| Latency | High | Higher | Low (about 1/10 of a GPU) | Low (about 1/10 of a GPU) |
| Advantages | Strong at complex logical operations; good at control flow | Strong at parallel computing and floating-point arithmetic; consistent software and hardware ecosystem | Supports both data parallelism and pipeline parallelism; programmable and highly flexible | High computational efficiency for AI, low power consumption, small size |
| Disadvantages | Few cores; poor at parallel tasks | Large die area and high power consumption; general-purpose design makes deep optimization for a specific model difficult | Long development cycle; complex algorithms are hard to develop | Poor flexibility; the algorithm is hardwired, so each algorithm iteration requires redevelopment |
| AI training effectiveness | Poor | The only mass-produced hardware widely available for training | Low efficiency | Potentially the best chip for training, but mass-produced products were scarce at the time of writing |
| Application scenarios | Mainly inference | Dominant in both cloud and edge, with the largest share in cloud training | Mainly inference | Mainly inference |
| Specific chip compared | E5-2699 v3 | Tesla K80 | Virtex-7 690T | Google TPU |
| Number of computing units | 18 (256-bit) | 7,804 (32-bit) | 3,600 (32-bit) | 65,536 (8-bit) |
| Peak computing power (TOPS) | 1.33 (single-precision float) | 8.74 (single-precision float) | 1.8 (single-precision float) | 92 (8-bit) |
| Power consumption (W) | 145 | 300 | 30 | 40 |
| Energy efficiency (GFLOPS/W) | 9 | 29 | 60 | 2,300 |
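The energy-efficiency row of Table 1 follows directly from the peak-compute and power rows, as a quick arithmetic check confirms (a minimal sketch; the helper name is our own):

```python
# Energy efficiency = peak compute / power.
# 1 TOPS = 1000 GOPS, so TOPS * 1000 / W gives GFLOPS-per-watt-scale figures.

def efficiency_gflops_per_w(peak_tops, power_w):
    """Peak compute (TOPS) divided by power draw (W), in GOPS/W."""
    return peak_tops * 1000 / power_w

# Figures taken from Table 1 above:
print(round(efficiency_gflops_per_w(92, 40)))    # Google TPU: 2300
print(round(efficiency_gflops_per_w(1.8, 30)))   # Virtex-7 690T: 60
print(round(efficiency_gflops_per_w(8.74, 300))) # Tesla K80: 29
print(round(efficiency_gflops_per_w(1.33, 145))) # E5-2699 v3: 9
```

All four computed values match the table's energy-efficiency column, which also makes the TPU's roughly two-orders-of-magnitude advantage over the CPU explicit.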
03 Scenario classification of AI chips
Classified by their position in the application-scenario network, AI chips can be divided into cloud, edge, and terminal chips.
| Deployment | Chip requirements | Typical computing power | Typical power consumption | Typical application areas |
| --- | --- | --- | --- | --- |
| Terminal | Low power consumption, high energy efficiency, mainly inference tasks, cost-sensitive, many hardware product forms | <8 TOPS | <5 W | Various consumer electronics, Internet of Things |
| Cloud | High performance, compute-intensive, both inference and training tasks, high unit price, limited hardware product forms | >30 TOPS | >50 W | Cloud-computing data centers, enterprise private clouds, etc. |
| Edge | Power, performance, and size requirements typically between terminal and cloud; mainly inference tasks, mostly on plug-in devices; relatively limited hardware product forms | 5–30 TOPS | 4–15 W | Intelligent manufacturing, intelligent fixtures, intelligent retail, intelligent transportation, intelligent finance, intelligent healthcare, intelligent driving, and many other fields |
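The table's compute and power ranges can be read as a rough classification rule, sketched below (the thresholds come from the table; the function name and the fallback label are our own assumptions, since the ranges do not cover every combination):

```python
# Classify a chip's deployment tier from its peak compute (TOPS)
# and power draw (W), using the typical ranges in the table above.

def deployment_tier(tops, watts):
    if tops < 8 and watts < 5:
        return "terminal"
    if tops > 30 and watts > 50:
        return "cloud"
    if 5 <= tops <= 30 and 4 <= watts <= 15:
        return "edge"
    return "unclassified"  # the table's ranges overlap and leave gaps

print(deployment_tier(4, 2))    # terminal
print(deployment_tier(92, 75))  # cloud
print(deployment_tier(16, 10))  # edge
```

Note that the terminal and edge ranges overlap between 5 and 8 TOPS and between 4 and 5 W, so in practice the product form, not just the numbers, decides the tier.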
- Cloud AI chips: Powerful performance, support large amounts of computing, flexibly support AI applications, connect quickly to various intelligent devices, and maintain maximum stability. They feature high performance, high scalability, and low latency.
- Edge AI chips: Responsible for computing and storing data on the edge side; they can aggregate data for the cloud computing layer, which completes the analysis, mining, and data-sharing work and distributes the resulting models or outputs back to the edge and terminals. Featuring low latency, high reliability, and privacy protection, they suit application scenarios that require real-time response.
- Terminal AI chips: Small and low-power, embedded directly in the device; they usually need to support only one or two AI capabilities, which the device can then use without an Internet connection. They feature low power consumption, small size, and high performance.
Classified by practical goal, AI chips fall into two categories: training chips and inference chips.
- Training AI chips: Used to build models such as neural networks. They emphasize absolute computing power, with very high performance requirements, alongside composite indicators such as energy consumption per unit of computing power, latency, and cost.
- Inference AI chips: Use trained models to make predictions. They focus on composite indicators (energy per unit of computing power, computing power, latency, cost, and so on) and have lower raw performance requirements; being able to complete the computing task is sufficient. Because inference results are delivered directly to end customers, more attention is paid to optimizing the user experience.
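One concrete reason inference chips can trade raw precision for efficiency (the TPU's 8-bit units in Table 1 are an example) is that floating-point weights from training can be mapped to 8-bit integers with only a small rounding error. A minimal sketch of such linear int8 quantization, with helper names of our own:

```python
# Map float weights to int8 with a single shared scale factor,
# then restore them and measure the rounding error.

def quantize_int8(values):
    """Scale floats into the int8 range [-127, 127]."""
    scale = max(abs(v) for v in values) / 127
    return [round(v / scale) for v in values], scale

def dequantize(ints, scale):
    """Restore approximate float values from quantized integers."""
    return [i * scale for i in ints]

weights = [0.42, -1.27, 0.05, 0.9]
q, s = quantize_int8(weights)
restored = dequantize(q, s)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(all(-127 <= v <= 127 for v in q))  # True: everything fits in int8
print(max_err < s)                       # True: error below one quantization step
```

Each 8-bit multiply-accumulate is far cheaper in silicon area and energy than a 32-bit floating-point one, which is why inference-oriented designs favor reduced precision while training chips keep floating point.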
04 Representative companies for AI training and inference chips
Representative companies for training AI chips:
NVIDIA, AMD, Iluvatar CoreX, Biren Technology, Cambricon, Enflame, MetaX, Moore Threads
Representative companies for inference AI chips:
NVIDIA, AMD (Xilinx), Google, Intel, Cambricon, Wave Computing, Enflame, Bitmain