CN115357525A - Snooping filter, processing unit, computing device and related methods - Google Patents

Snooping filter, processing unit, computing device and related methods

Publication number
CN115357525A
Authority
CN
China
Prior art keywords
cache line
cache
filling position
snoop
entry
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210948674.9A
Other languages
Chinese (zh)
Inventor
贾孟晗
朱峰
朱涛涛
马冀炜
徐文健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou C Sky Microsystems Co Ltd
Original Assignee
Pingtouge Shanghai Semiconductor Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pingtouge Shanghai Semiconductor Co Ltd
Priority to CN202210948674.9A
Publication of CN115357525A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806 Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0811 Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806 Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/084 Multiuser, multiprocessor or multiprocessing cache systems with a shared cache
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0893 Caches characterised by their organisation or structure
    • G06F12/0897 Caches characterised by their organisation or structure with two or more cache hierarchy levels
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/62 Details of cache specific to multiprocessor cache arrangements
    • G06F2212/621 Coherency control relating to peripheral accessing, e.g. from DMA or I/O device

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The embodiments of the present application provide a snoop filter, a processing unit, a computing device, and related methods. The solution is applicable to chips of various instruction set architectures, including CISC, RISC (in particular RISC-V), and VLIW, such as Internet of Things (IoT) chips and audio/video chips. The snoop filter includes: a receiving unit, configured to receive coordinate information sent by a private cache after a first cache line is filled into the private cache; and a snooping unit, configured to, when it is determined according to the coordinate information that the mapping entry corresponding to the target fill position stores identification information of a second cache line, transfer the identification information of the second cache line from that mapping entry to a victim entry and store identification information of the first cache line in the mapping entry corresponding to the target fill position. The solution reduces the communication bandwidth between the inside and the outside of the processor core that snoop filtering occupies.

Description

Snooping filter, processing unit, computing device and related methods

Technical field

The embodiments of the present application relate to the field of chip technology, and in particular to a snoop filter, a processing unit, a computing device, and related methods.

Background

In a processing unit that includes multiple processor cores, the caches comprise private caches and shared caches: a private cache belongs to one specific processor core, whereas a shared cache is shared by multiple processor cores. When a processor core looks up data, it first queries its own private cache and, on a miss, queries each next-level cache in turn. If the required data is held in the private cache of another processor core, that private cache must be queried; this operation is called snooping (snoop). Snooping the private caches of other processor cores reduces the operating efficiency of those cores.

At present, a snoop filter is provided to store information about the data held in the private caches. By querying the snoop filter, the private caches that do not hold the required data can be identified, so that snooping those private caches is avoided.

However, to ensure that the snoop filter accurately reflects the data held in a private cache, the private cache must send a deletion request to the snoop filter every time it actively evicts a cache line, so that the information about the evicted cache line is removed from the snoop filter. The deletion request must be sent even when the evicted cache line is identical to the data in memory. These requests occupy the communication bandwidth between the inside and the outside of the processor core and degrade the performance of the processing unit.

Summary of the invention

In view of this, the embodiments of the present application provide a snoop filter, a processing unit, a computing device, and related methods to at least solve or alleviate the above problem.

According to a first aspect of the embodiments of the present application, a snoop filter is provided, including: a receiving unit, configured to receive coordinate information sent by a private cache after a first cache line is filled into the private cache, where the coordinate information indicates the target fill position of the first cache line in the private cache; and a snooping unit, configured to, when it is determined according to the coordinate information that the mapping entry corresponding to the target fill position stores identification information of a second cache line, transfer the identification information of the second cache line from that mapping entry to a victim entry and store identification information of the first cache line in the mapping entry corresponding to the target fill position, where different cache lines correspond to different identification information.

According to a second aspect of the embodiments of the present application, a processing unit is provided, including: the snoop filter according to the first aspect; and at least two processor cores, configured to query the snoop filter to determine the private cache in which the required data is located.

According to a third aspect of the embodiments of the present application, a computing device is provided, including: the processing unit according to the second aspect; and a memory, coupled to the processing unit and storing the data required by the processing unit.

According to a fourth aspect of the embodiments of the present application, a snoop filtering method is provided, including: receiving coordinate information sent by a private cache after a first cache line is filled into the private cache, where the coordinate information indicates the target fill position of the first cache line in the private cache; and, when it is determined according to the coordinate information that the mapping entry corresponding to the target fill position stores identification information of a second cache line, transferring the identification information of the second cache line from that mapping entry to a victim entry and storing identification information of the first cache line in the mapping entry corresponding to the target fill position, where different cache lines correspond to different identification information.

According to the snoop filtering scheme provided by the embodiments of the present application, fill positions in the private cache correspond one-to-one to mapping entries in the snoop filter. After the first cache line is filled into the target fill position, if the mapping entry corresponding to the target fill position stores identification information of a second cache line, the snooping unit transfers the identification information of the second cache line to a victim entry and stores the identification information of the first cache line in the mapping entry corresponding to the target fill position. Because fill positions and mapping entries correspond one-to-one, when the second cache line does not require a data write-back, the private cache only needs to send the coordinate information of the target fill position to the snoop filter; the snoop filter then stores the identification information of the first cache line in the corresponding mapping entry and transfers the identification information of the second cache line to the victim entry. Consequently, for cache lines that need no write-back, the private cache does not have to send a deletion request to the snoop filter, yet the snoop filter can still be used to determine which private caches cannot hold the required data. This reduces the number of snoops directed at the private caches of other processor cores and reduces the communication bandwidth between the inside and the outside of the processor core that deletion requests would occupy, thereby preserving the performance of the processing unit.
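The update rule summarized above can be illustrated with a minimal Python sketch. It is not the claimed hardware: it assumes that the coordinate information is a (set, way) pair, that the identification information is a tag value, and that victim entries form a simple unbounded list. The class and method names (`SnoopFilter`, `on_fill`, `lookup`) are hypothetical, and the sketch deliberately leaves open how victim entries are later consulted or released, since this passage does not say.

```python
class SnoopFilter:
    """Toy model of the snoop-filter state kept for one private cache."""

    def __init__(self, num_sets, num_ways):
        # One mapping entry per fill position (set, way) of the private cache.
        self.mapping = [[None] * num_ways for _ in range(num_sets)]
        # Victim entries receive tags displaced from mapping entries.
        # Assumption: unbounded list; real hardware would bound this.
        self.victims = []

    def on_fill(self, coord, new_tag):
        """Process the coordinate message sent after a line has been filled."""
        s, w = coord
        old_tag = self.mapping[s][w]
        if old_tag is not None:
            # The displaced (second) line's tag moves to a victim entry,
            # so no separate deletion request is needed for it.
            self.victims.append(old_tag)
        # The newly filled (first) line's tag takes over the mapping entry.
        self.mapping[s][w] = new_tag

    def lookup(self, tag):
        """Report where a tag is recorded: 'mapping', 'victim', or None."""
        if any(tag in ways for ways in self.mapping):
            return "mapping"
        if tag in self.victims:
            return "victim"
        return None
```

In this model a single fill message both registers the new line and retires the old one, which is the bandwidth saving the scheme aims for.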

Description of the drawings

To explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some of the embodiments of the present application; those of ordinary skill in the art can derive other drawings from them.

FIG. 1 is a schematic diagram of a computing device to which an embodiment of the present application is applied;

FIG. 2 is a schematic diagram of a processing unit according to an embodiment of the present application;

FIG. 3 is a schematic diagram of a snoop filter according to an embodiment of the present application;

FIG. 4 is a schematic diagram of a private cache and a snoop filter according to an embodiment of the present application;

FIG. 5 is a flowchart of a snoop filtering method according to an embodiment of the present application.

Detailed description

The present application is described below on the basis of embodiments, but it is not limited to these embodiments. The detailed description below sets out certain specific details; those skilled in the art can fully understand the present application without them. To avoid obscuring the substance of the present application, well-known methods, procedures, and flows are not described in detail. In addition, the drawings are not necessarily drawn to scale.

First, some nouns and terms that appear in the description of the embodiments of the present application are explained as follows.

Snooping: in a processing unit that includes multiple processor cores, if the data required by one processor core is held in the private cache of another processor core, the requesting core must query that private cache for the data. This process is called snooping (snoop).

Snoop filter: because snooping the private caches of other processor cores reduces their operating efficiency, a lookup table for the upper-level cache is placed near the next-level cache to reduce the number of snoops. The lookup table stores information about the data held in the corresponding private cache, so it can be used to determine whether that private cache holds the required data. This lookup table is called a snoop filter.
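As a rough illustration of how such a lookup table reduces snoops, the sketch below models the filter as a dictionary from core number to the set of line addresses recorded for that core's private cache. The data layout and the function name `cores_to_snoop` are assumptions made for illustration only.

```python
def cores_to_snoop(filter_table, requester, line_addr):
    """Return the cores whose private caches may hold `line_addr`.

    filter_table: dict mapping core id -> set of line addresses recorded
                  for that core's private cache (hypothetical layout).
    requester:    id of the core issuing the request; it is never snooped.
    """
    return [
        core
        for core, lines in sorted(filter_table.items())
        if core != requester and line_addr in lines
    ]


# Example: only core 2 needs to be snooped for line 0x100 requested by core 0.
table = {0: {0x100}, 1: {0x200}, 2: {0x100, 0x300}}
targets = cores_to_snoop(table, requester=0, line_addr=0x100)
```

Without the filter, every other core would have to be snooped on each miss; with it, cores whose recorded set does not contain the address are skipped.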

Cache line: when a processor core reads data from main memory into the cache, it reads in the required data together with some of the neighbouring data. The group of data read each time is called a cache line. Each cache level can hold many cache lines and stores data in units of cache lines.

Computing device

FIG. 1 shows a schematic block diagram of a computing device 10. The computing device 10 may be built on processing units of various types and driven by any operating system, such as Windows, UNIX, or Linux. In addition, the computing device 10 may be implemented in hardware and/or software such as a PC, desktop computer, notebook computer, server, or mobile communication device.

As shown in FIG. 1, the computing device 10 may include one or more processing units 12 and a memory 14. The memory 14 in the computing device 10 may serve as main memory and is used to store instruction information and/or data information represented by data signals. For example, the memory 14 may store data provided by the processing unit 12 (such as computation results), and it may also be used to exchange data between the processing unit 12 and an external storage device 16 (also called auxiliary storage or external storage).

In some cases, the processing unit 12 needs to access the memory 14 through the bus 11 to read or modify the data in the memory 14. Because the memory 14 is relatively slow to access, the computing device 10 further includes a cache memory 18 communicatively connected to the bus 11 to narrow the speed gap between the processing unit 12 and the memory 14. The cache memory 18 caches program data or message data in the memory 14 that is likely to be used repeatedly. The cache memory 18 may be implemented with storage devices such as static random access memory (SRAM). The cache memory 18 may have a multi-level structure, for example a three-level structure with a level-1 cache (L1 Cache), a level-2 cache (L2 Cache), and a level-3 cache (L3 Cache); it may also have more than three levels or another type of cache structure. In some embodiments, part of the cache memory 18 (for example the level-1 cache, or both the level-1 and level-2 caches) may be integrated inside the processing unit 12 or integrated with the processing unit 12 in the same system-on-chip.

Accordingly, the processing unit 12 may include an instruction execution unit 121, a memory management unit 122, and other components. When executing an instruction that modifies memory, the instruction execution unit 121 issues a write access request that specifies the data to be written to memory and the corresponding physical address. The memory management unit 122 translates the virtual address specified by such an instruction into the physical address to which that virtual address maps; the physical address specified by the write access request may be the same as the physical address specified by the corresponding instruction.

Information exchange between the memory 14 and the cache memory 18 may be organized in data blocks. In some embodiments, the cache memory 18 and the memory 14 may be divided into data blocks of the same size, and a data block (containing one or more data items of a preset length) may serve as the minimum unit of data exchange between the cache memory 18 and the memory 14. For brevity, each data block in the cache memory 18 is referred to below as a cache block (also called a cache line), and different cache blocks have different cache block addresses. Each data block in the memory 14 is referred to as a memory block, and different memory blocks have different memory block addresses. A cache block address and/or a memory block address may include a physical address tag used to locate the data block.
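One common way to derive a block address and a physical address tag is to split the physical address into a tag, a set index, and a byte offset. The sketch below shows that arithmetic under assumed parameters (64-byte blocks and 128 sets, chosen only for illustration; the source does not specify block size or cache geometry).

```python
LINE_BYTES = 64   # assumed block size in bytes
NUM_SETS = 128    # assumed number of sets in the cache


def split_address(paddr):
    """Split a physical address into (tag, set index, byte offset)."""
    offset = paddr % LINE_BYTES    # position of the byte inside its block
    block = paddr // LINE_BYTES    # block number in memory
    index = block % NUM_SETS       # set that the block maps to
    tag = block // NUM_SETS        # remaining high bits identify the block
    return tag, index, offset


# Example: 0x12345 lies in block 1165, which maps to set 13 with tag 9.
tag, index, offset = split_address(0x12345)
```

Two addresses that fall in the same block always share the same tag and index and differ only in the offset.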

Because of space and resource constraints, the cache memory 18 cannot cache the entire contents of the memory 14: its storage capacity is usually smaller than that of the memory 14, and the cache block addresses it provides cannot cover all the memory block addresses provided by the memory 14. When the processing unit 12 needs to access memory, it first accesses the cache memory 18 via the bus 11 to determine whether the content to be accessed is already stored there. If it is, the cache memory 18 hits, and the processing unit 12 reads the content directly from the cache memory 18. If it is not, the cache memory 18 and the processing unit 12 must access the memory 14 via the bus 11 to find the corresponding information. Because the cache memory 18 is very fast to access, a cache hit significantly improves the efficiency of the processing unit 12 and, in turn, the performance and efficiency of the entire computing device 10.
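The hit and miss paths described above can be sketched with a small model. The class name `SimpleCache` is hypothetical, the backing memory is a plain dictionary, and the model ignores capacity limits and replacement, which a real cache must handle.

```python
class SimpleCache:
    """Toy cache in front of a slower backing memory (no capacity limit)."""

    def __init__(self, memory):
        self.memory = memory   # backing store: block address -> data
        self.blocks = {}       # cached copies: block address -> data
        self.hits = 0
        self.misses = 0

    def read(self, block_addr):
        if block_addr in self.blocks:
            # Hit: serve the data directly from the cache.
            self.hits += 1
            return self.blocks[block_addr]
        # Miss: fetch from memory and keep a copy for later accesses.
        self.misses += 1
        data = self.memory[block_addr]
        self.blocks[block_addr] = data
        return data


cache = SimpleCache({0: "a", 1: "b"})
first = cache.read(0)    # miss, fetched from memory
second = cache.read(0)   # hit, served from the cache
```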

In addition, the computing device 10 may include input/output devices such as a storage device 16, a display device, an audio device, and a mouse/keyboard. The storage device 16 may be a device used for information access, such as a hard disk, optical disc, or flash memory, coupled to the bus 11 through a corresponding interface. The display device may be coupled to the bus 11 through a corresponding graphics card and displays content according to the display signals provided by the bus 11.

The computing device 10 may also include a communication device 17, through which it can communicate with a network or other devices in various ways. The communication device 17 may include one or more communication modules, including a wireless communication module suited to a particular wireless communication protocol. For example, the communication device 17 may include a WLAN module for WiFi communication conforming to the 802.11 standard defined by the Institute of Electrical and Electronics Engineers (IEEE). The communication device 17 may include a WWAN module for wireless wide-area communication conforming to cellular or other wireless wide-area protocols. The communication device 17 may also include communication modules that use other protocols, such as a Bluetooth module, or other custom communication modules. The communication device 17 may also be a port for serial data transmission.

It should be noted that the structure of the computing device 10 may vary with the motherboard, operating system, and instruction set architecture. For example, many current computing devices have an input/output control hub connected between the bus 11 and the various input/output devices, and this hub may be integrated into the processing unit 12 or be independent of it.

Processing unit

FIG. 2 is a schematic block diagram of the processing unit 12 according to an embodiment of the present application. As shown in FIG. 2, each processing unit 12 may include multiple processor cores 120 for processing instructions; the processing and execution of instructions can be controlled by a user (for example, through an application program) and/or by the system platform. Each processor core 120 may process a specific instruction set, which may support complex instruction set computing (CISC), reduced instruction set computing (RISC), or computing based on very long instruction words (VLIW). In particular, the processor core 120 is suitable for processing the RISC-V instruction set. Different processor cores 120 may process different instruction sets or the same instruction set. A processor core 120 may also include other processing modules, such as a digital signal processor (DSP). As an example, FIG. 2 shows processor core 1 to processor core m, where m is a positive integer.

The cache memory 18 shown in FIG. 1 may be fully or partially integrated into the processing unit 12. Depending on the architecture, the cache memory 18 may be a single-level or multi-level internal cache located inside and/or outside the individual processor cores 120 (such as the three cache levels L1 to L3 shown in FIG. 2, collectively labelled 18), and it may include an instruction cache for instructions and a data cache for data. The processor cores 120 in the processing unit 12 may share at least part of the cache memory; for example, processor cores 1 to m may share the level-3 cache L3. Each processor core 120 in the processing unit 12 has its own private cache; for example, each processor core 120 has a level-1 cache L1 and a level-2 cache L2. The processing unit 12 may also include an external cache (not shown), and other cache structures may also serve as the external cache of the processing unit 12.

As shown in FIG. 2, the processing unit 12 may include a register file 126. The register file 126 may include multiple registers for storing different types of data and/or instructions, and these registers may be of different types; for example, the register file 126 may include integer registers, floating-point registers, status registers, instruction registers, and pointer registers. The registers in the register file 126 may be implemented as general-purpose registers or may follow a specific design chosen to suit the actual requirements of the processing unit 12.

The processing unit 12 may include a memory management unit (MMU) 122 that translates virtual addresses into physical addresses. The memory management unit 122 caches some of the entries of the page table and can fetch entries that are not cached from memory. One or more memory management units 122 may be provided in each processor core 120, and the memory management units 122 in different processor cores 120 may be synchronized with memory management units 122 located in other processing units or processor cores, so that every processing unit or processor core can share a unified virtual memory system.
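A minimal sketch of this translation path follows. It assumes a single-level page table held in a dictionary and a 4 KiB page size; both are illustrative choices, and the names `ToyMMU`, `translate`, and `cached_entries` are hypothetical.

```python
PAGE_BYTES = 4096  # assumed page size


class ToyMMU:
    """Translates virtual addresses, caching page-table entries on demand."""

    def __init__(self, page_table):
        # In-memory page table: virtual page number -> physical frame number.
        self.page_table = page_table
        # Subset of page-table entries cached inside the MMU.
        self.cached_entries = {}

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_BYTES)
        if vpn not in self.cached_entries:
            # Entry not cached: fetch it from the page table in memory.
            self.cached_entries[vpn] = self.page_table[vpn]
        return self.cached_entries[vpn] * PAGE_BYTES + offset


mmu = ToyMMU({1: 7})
paddr = mmu.translate(PAGE_BYTES + 0x10)  # virtual page 1 maps to frame 7
```

A second translation within the same page is served from the cached entry without touching the page table again.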

The processing unit 12 executes instruction sequences (that is, programs). To execute each instruction, the processing unit 12 fetches the instruction from the memory that holds it, decodes the fetched instruction, executes the decoded instruction, and saves the execution result. This cycle repeats until every instruction in the sequence has been executed or a halt instruction is encountered.
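The cycle described above can be sketched as a small interpreter loop. The three-opcode toy instruction set (`li`, `add`, `halt`) and the function name `run` are invented for illustration and are not part of the source.

```python
def run(program):
    """Execute a list of textual instructions and return the register state."""
    regs = {}
    pc = 0
    while pc < len(program):
        raw = program[pc]              # fetch the instruction at pc
        op, *args = raw.split()        # decode into opcode and operands
        if op == "halt":               # a halt instruction ends execution
            break
        if op == "li":                 # li rd imm: load an immediate
            regs[args[0]] = int(args[1])
        elif op == "add":              # add rd rs1 rs2: rd = rs1 + rs2
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc += 1                        # advance to the next instruction
    return regs


result = run(["li a 2", "li b 3", "add c a b", "halt", "li c 0"])
```

The instruction after `halt` is never executed, which mirrors the stopping condition in the text.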

To carry out this process, the processing unit 12 may include an instruction fetch unit 124, an instruction decode unit 125, an instruction issue unit (not shown), an instruction execution unit 121, a retirement unit 123, and so on.

The instruction fetch unit 124 serves as the startup engine of the processing unit 12. It moves instructions from the memory 14 into an instruction register (which may be a register in the register file 126 of FIG. 2 used for holding instructions), and either receives the next fetch address or computes it according to a fetch algorithm, for example by incrementing or decrementing the address by the instruction length.

After an instruction is fetched, the processing unit 12 enters the decode stage. The instruction decode unit 125 decodes the fetched instruction according to a predetermined instruction format to obtain the operand acquisition information that the instruction requires, in preparation for the operation of the instruction execution unit 121. The operand acquisition information may point to an immediate value, a register, or other software/hardware capable of providing source operands.

The instruction issue unit is typically found in high-performance processing units 12. It sits between the instruction decode unit 125 and the instruction execution unit 121 and handles instruction scheduling and control, distributing instructions efficiently to different instruction execution units 121 so that multiple instructions can execute in parallel. Once an instruction has been fetched, decoded, and dispatched to the appropriate instruction execution unit 121, that unit begins executing it, that is, it performs the operation the instruction specifies and thereby implements the corresponding function.

The retirement unit 123 (also called the instruction retire unit or instruction write-back unit) writes the execution results produced by the instruction execution unit 121 back to the appropriate storage location (for example, a register inside the processing unit 12) so that subsequent instructions can quickly obtain those results from that location.

Different instruction execution units 121 may be provided in the processing unit 12 for different classes of instructions. An instruction execution unit 121 may be an arithmetic unit (for example, one containing an arithmetic logic unit, an integer processing unit, or a vector unit, which operates on operands and outputs the result), a memory execution unit (for example, one that accesses memory according to an instruction in order to read data from memory or write specified data to it), a coprocessor, and so on. Within the processing unit 12, the instruction execution units 121 may run in parallel and output their respective results.

When executing certain classes of instructions (for example, memory access instructions), the instruction execution unit 121 needs to access the memory 14 to obtain information stored there or to supply data to be written to it. An instruction execution unit 121 that executes memory access instructions may also be referred to simply as a memory execution unit; it may be a load store unit (LSU) and/or another unit used for memory access.

After a memory access instruction is fetched by the instruction fetch unit 124, the instruction decode unit 125 decodes it so that its source operands can be obtained. The decoded memory access instruction is passed to the appropriate instruction execution unit 121, which operates on the source operands (for example, an arithmetic logic unit operates on source operands held in registers) to obtain the address information for the memory access instruction, and then issues the corresponding request based on that address information, such as an address translation request or a write access request.

The source operands of a memory access instruction usually include an address operand, on which the instruction execution unit 121 operates to obtain the virtual or physical address for the instruction. When the memory management unit 122 is disabled, the instruction execution unit 121 can obtain the physical address of the memory access instruction directly through logical operations. When the memory management unit 122 is enabled, the instruction execution unit 121 issues an address translation request containing the virtual address that corresponds to the address operand of the memory access instruction. The memory management unit 122 responds to the request and converts the virtual address into a physical address using the page table entry that matches the virtual address, so that the instruction execution unit 121 can access the cache memory 18 and/or the memory 14 using the translated physical address.
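The two address paths described above (MMU disabled versus enabled) can be illustrated with a small software model. This is only a sketch under stated assumptions: the 4 KiB page size, the dictionary-based page table, and the names `translate` and `page_table` are hypothetical and do not come from the application.

```python
PAGE_SIZE = 4096  # assumed page size; the source does not specify one


def translate(virtual_addr, page_table, mmu_enabled=True):
    """Return the physical address used for a memory access.

    page_table maps virtual page numbers to physical page numbers and
    stands in for the entries the MMU caches or fetches from memory.
    """
    if not mmu_enabled:
        # MMU disabled: the computed address is used directly.
        return virtual_addr
    vpn, offset = divmod(virtual_addr, PAGE_SIZE)
    if vpn not in page_table:
        raise KeyError(f"no page table entry for virtual page {vpn:#x}")
    # MMU enabled: replace the virtual page number, keep the page offset.
    return page_table[vpn] * PAGE_SIZE + offset


page_table = {0x12: 0x80}                   # virtual page 0x12 -> physical page 0x80
physical = translate(0x12345, page_table)   # 0x80345
```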

Depending on their function, memory access instructions include load instructions and store instructions. Executing a load instruction generally does not modify the information in the memory 14 or the cache memory 18; the instruction execution unit 121 only needs to read, according to the address operand of the load instruction, data stored in the memory 14, the cache memory 18, or an external storage device. Unlike a load instruction, a store instruction has source operands that include data information as well as an address operand, and executing it generally modifies the information in the memory 14 and/or the cache memory 18. The data information of a store instruction may point to the data to be written, which may come from the execution result of another instruction such as an arithmetic or load instruction, from a register or other storage unit in the processing unit 12, or from an immediate value.

As shown in FIG. 2, the processing unit 12 includes a snoop filter 15 located near the shared cache L3. There may be one or more snoop filters 15: for example, different processor cores 120 may correspond to different snoop filters 15, or multiple processor cores 120 may correspond to the same snoop filter 15. The snoop filter 15 stores information about the data held in the private caches (L1 and L2) of its corresponding processor core 120. By querying the snoop filter 15, other processor cores 120 can determine whether the private cache of that processor core 120 holds the data they need. For example, processor core 2 can query the snoop filter 15 corresponding to processor core 1 to determine whether the private cache of processor core 1 holds the data that processor core 2 needs.

Because the snoop filter 15 stores information about the data held in the private cache of its corresponding processor core 120, it must update that information whenever the contents of the private cache change, so that the stored information stays consistent with the private cache and other processor cores 120 can still determine, by querying the snoop filter 15, whether the private cache holds the data they need.

The embodiments of the present application focus on how the snoop filter 15 updates the information it stores; this process is described in detail below.

Snoop filter

A multi-core processing unit usually has a multi-level cache hierarchy: upper-level caches are small but fast, and lower-level caches are large but slow. An upper-level cache is private to one processor core, whereas a lower-level cache is shared by multiple processor cores. For example, a level-1 cache (L1 cache) may be private to one processor core and cache only that core's data, while a level-2 cache (L2 cache) may be shared by multiple processor cores and cache data for all of the cores that share it. When a processor core needs particular data, it first looks in its private cache; on a miss, it queries each successive lower-level cache in turn. If the data resides in the private cache of another processor core, that core's private cache must be queried.
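The lookup order in the preceding paragraph can be sketched as a simple function. The model below is an illustration only: caches are represented as Python sets of addresses, and the function name `find_line` and its return values are hypothetical.

```python
def find_line(addr, requester, private_caches, shared_cache):
    """Return where the line at addr is found, following the lookup order above.

    private_caches: dict mapping core id -> set of cached addresses (toy model)
    shared_cache:   set of addresses held in the shared lower-level cache
    """
    if addr in private_caches[requester]:
        return ("private", requester)        # hit in the requester's own cache
    if addr in shared_cache:
        return ("shared", None)              # hit in the shared lower-level cache
    for core, lines in private_caches.items():
        if core != requester and addr in lines:
            return ("remote-private", core)  # held privately by another core
    return ("memory", None)                  # not cached anywhere


private_caches = {0: {0x100}, 1: {0x200}}
shared_cache = {0x300}
where = find_line(0x200, 0, private_caches, shared_cache)  # ("remote-private", 1)
```

Without a snoop filter, the loop over other cores corresponds to probing every remote private cache, which is the cost the snoop filter is meant to avoid.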

At present, a snoop filter stores information about the data held in a private cache. To ensure that this information accurately reflects the contents of the private cache, every time the private cache evicts a cache line on its own initiative it must send a delete request to the snoop filter so that the information about the evicted cache line is removed from the snoop filter. When the evicted cache line is identical to the data in memory, no data write-back is needed, but a delete request must still be sent so that the information in the snoop filter stays in one-to-one correspondence with the cache lines in the private cache. Because the processor core has to send delete requests to the snoop filter frequently, these requests consume communication bandwidth between the inside and the outside of the processor core and degrade the performance of the processing unit.

The embodiments of the present application address the problem that deleting snoop filter entries consumes a large share of the communication bandwidth between the inside and the outside of the core. The solution is implemented mainly by the snoop filter 15, whose internal structure and operation are discussed in detail below.

FIG. 3 is a schematic diagram of the internal structure of a snoop filter according to an embodiment of the present application. As shown in FIG. 3, the snoop filter 15 includes a receiving unit 151 and a snoop unit 152. After a first cache line is filled into the private cache, the private cache sends coordinate information to the receiving unit 151; the coordinate information indicates the target fill position of the first cache line in the private cache. After the receiving unit 151 receives the coordinate information, the snoop unit 152 uses it to determine whether the mapping entry corresponding to the target fill position stores the identification information of a second cache line. If it does, the snoop unit 152 moves the identification information of the second cache line from that mapping entry to a victim entry and stores the identification information of the first cache line in the mapping entry. Different cache lines have different identification information.

The snoop filter 15 stores the identification information of the cache lines filled into the private cache; there may be one private cache or several. Because different cache lines have different identification information, the identification information distinguishes one cache line from another. For example, the identification information may be the address of the cache line in memory.

The private cache includes multiple fill positions into which cache lines are filled, and the snoop filter 15 includes multiple mapping entries. The mapping entries in the snoop filter 15 correspond one-to-one to the fill positions in the private cache, and each mapping entry stores the identification information of the cache line filled into its corresponding fill position. For example, if the private cache includes N fill positions, where N is an integer greater than or equal to 2, then the snoop filter 15 includes N mapping entries, with different fill positions corresponding to different mapping entries. In addition to the mapping entries, the snoop filter 15 includes a victim entry. The victim entry also stores the identification information of a cache line, but it does not belong to any single fill position; it is shared by multiple fill positions.

Suppose the target fill position in the private cache previously held a second cache line. When the first cache line is filled into the target fill position, the private cache sends the coordinate information of the target fill position to the receiving unit 151. Based on that coordinate information, the snoop unit 152 moves the identification information of the second cache line from the mapping entry corresponding to the target fill position to the victim entry and stores the identification information of the first cache line in that mapping entry. If the second cache line does not need a data write-back, it is simply pushed out of the private cache once the first cache line is filled into the target fill position, while its identification information remains in the victim entry. The identification information of every cache line held in the private cache is therefore still stored in the snoop filter 15.

In this embodiment, the fill positions in the private cache correspond one-to-one to the mapping entries in the snoop filter 15. After the first cache line is filled into the target fill position, if the mapping entry corresponding to the target fill position stores the identification information of a second cache line, the snoop unit 152 moves that identification information to the victim entry and stores the identification information of the first cache line in the mapping entry. Because of the one-to-one correspondence, when the second cache line does not need a data write-back, the private cache only has to send the coordinate information of the target fill position to the snoop filter 15. For cache lines that need no write-back, the private cache therefore sends no delete request to the snoop filter 15, yet the snoop filter can still be used to determine which private caches cannot hold the required data, which reduces the number of snoops into the private caches of other processor cores. Eliminating these delete requests reduces the communication bandwidth they would consume between the inside and the outside of the processor core and thus preserves the performance of the processing unit.

In one possible implementation, when the snoop unit 152 determines from the coordinate information that the mapping entry corresponding to the target fill position is empty, it stores the identification information of the first cache line in that mapping entry.

Because the fill positions in the private cache correspond one-to-one to the mapping entries in the snoop filter 15, an empty mapping entry means either that no cache line had been filled into the target fill position before the first cache line, or that the cache line previously filled there has already been removed from the private cache. When the snoop unit 152 determines from the coordinate information that the mapping entry corresponding to the target fill position is empty, it stores the identification information of the first cache line in that mapping entry, which then indicates that the private cache holds the first cache line.

In this embodiment, when the mapping entry corresponding to the target fill position is empty, the snoop unit 152 does not need to move any identification information to the victim entry; it stores the identification information of the first cache line directly in the mapping entry. When another processor core later queries the snoop filter 15, the identification information in that mapping entry shows that the private cache corresponding to the snoop filter 15 holds the first cache line, so the processor core can quickly locate the data it needs.
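The fill-time update described so far, covering both the occupied and the empty mapping entry, can be modeled in a few lines. This is a minimal software sketch, not the hardware design: it assumes a single victim entry shared by all fill positions, uses the memory address as identification information, and the names `SnoopFilter`, `on_fill`, and `may_hold` are hypothetical.

```python
class SnoopFilter:
    """Toy model: one mapping entry per fill position plus one shared victim entry."""

    def __init__(self, num_positions):
        self.mapping = [None] * num_positions  # None marks an empty mapping entry
        self.victim = None

    def on_fill(self, position, line_id):
        """Handle the coordinate information sent after a cache line fill."""
        previous = self.mapping[position]
        if previous is not None:
            # The mapping entry held a second cache line: move its ID to the victim entry.
            self.victim = previous
        # In both cases the mapping entry now records the newly filled line.
        self.mapping[position] = line_id

    def may_hold(self, line_id):
        """True if the private cache may hold line_id according to the filter."""
        return line_id in self.mapping or self.victim == line_id


sf = SnoopFilter(num_positions=4)
sf.on_fill(2, 0xA000)  # mapping entry 2 was empty: store directly
sf.on_fill(2, 0xB000)  # mapping entry 2 was occupied: 0xA000 moves to the victim entry
```

Note that no delete request appears anywhere in this model: the evicted line's identification information is displaced as a side effect of the fill notification.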

In one possible implementation, the snoop filter 15 includes at least one snoop entry set, and each snoop entry set includes one victim entry and at least one mapping entry. Each snoop entry set corresponds to one fill position set in the private cache, and each fill position set includes at least one fill position. The number of mapping entries in a snoop entry set equals the number of fill positions in the corresponding fill position set, and within a corresponding pair of sets, different mapping entries correspond to different fill positions. After determining from the coordinate information that the mapping entry corresponding to the target fill position stores the identification information of a second cache line, the snoop unit 152 determines the snoop entry set to which that mapping entry belongs and then moves the identification information of the second cache line from the mapping entry to the victim entry of that snoop entry set.

For example, the private cache includes y fill position sets, each containing x fill positions, where x and y are both integers greater than or equal to 1. The snoop filter 15 includes y snoop entry sets, each containing x mapping entries and one victim entry. Fill position sets correspond one-to-one to snoop entry sets: fill position set 0 corresponds to snoop entry set 0, fill position set 1 to snoop entry set 1, and fill position set y-1 to snoop entry set y-1. Within a corresponding pair of sets, fill positions correspond one-to-one to mapping entries: fill position 0 corresponds to mapping entry 0, fill position 1 to mapping entry 1, and fill position x-1 to mapping entry x-1.

After the first cache line is filled into fill position x-1 of fill position set 0, the snoop unit 152 moves the identification information of the second cache line stored in mapping entry x-1 of snoop entry set 0 to the victim entry of snoop entry set 0, and stores the identification information of the first cache line in mapping entry x-1 of snoop entry set 0.

A given cache line must be filled into the same fill position set every time it is placed in the private cache, but it may be filled into different fill positions within that set. For example, cache line L is filled into fill position set 1 every time it is placed in the private cache; if it was last filled into fill position 1 of fill position set 1, it may next be filled into fill position 0 of fill position set 1.

In this embodiment, the fill positions of the private cache are divided into multiple fill position sets and the entries of the snoop filter 15 are divided into multiple snoop entry sets, such that fill position sets correspond one-to-one to snoop entry sets and, within each corresponding pair, fill positions correspond one-to-one to mapping entries. This makes it easy to find the mapping entry for a fill position from the coordinate information and improves the efficiency with which the snoop filter 15 updates identification information. In addition, each snoop entry set includes one victim entry, shared by all mapping entries in that set, for storing transferred identification information, and a given cache line is always filled into the same fill position set. Consequently, when another processor core queries the snoop filter 15, it only needs to check the snoop entry set corresponding to the fill position set that would hold the required data in order to determine whether the private cache holds that data. It does not have to traverse all of the identification information stored in the snoop filter 15, which improves the efficiency with which a processor core obtains data from the private caches of other processor cores.
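The set-organized variant and its restricted lookup can be sketched as follows. The sketch rests on assumptions that the application does not state: a 64-byte cache line, a set index computed as the line number modulo the number of sets, and the memory address as identification information. The names `SetSnoopFilter`, `set_index`, `on_fill`, and `may_hold` are hypothetical.

```python
LINE_SIZE = 64  # assumed cache line size in bytes


class SetSnoopFilter:
    """Toy model: y snoop entry sets, each with x mapping entries and one victim entry."""

    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.mapping = [[None] * ways for _ in range(num_sets)]
        self.victim = [None] * num_sets

    def set_index(self, addr):
        # Assumption: a line always maps to the same set, chosen by modulo.
        return (addr // LINE_SIZE) % self.num_sets

    def on_fill(self, way, addr):
        """Record that the line at addr was filled into the given position of its set."""
        s = self.set_index(addr)
        evicted = self.mapping[s][way]
        if evicted is not None:
            self.victim[s] = evicted  # overwrites any older ID in this set's victim entry
        self.mapping[s][way] = addr

    def may_hold(self, addr):
        """Check only the one snoop entry set that addr can map to."""
        s = self.set_index(addr)
        return addr in self.mapping[s] or self.victim[s] == addr


sf = SetSnoopFilter(num_sets=2, ways=2)
sf.on_fill(0, 0x0000)  # line 0 -> set 0, position 0
sf.on_fill(0, 0x0080)  # line 2 -> set 0, position 0; 0x0000 moves to set 0's victim entry
```

A query such as `sf.may_hold(0x0040)` inspects only set 1 (two mapping entries and one victim entry) rather than every entry in the filter.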

In one possible implementation, each fill position set has at most one pending cache line fill request, where a cache line fill request asks for a cache line to be filled into a fill position of the corresponding fill position set.

Each fill position set may include multiple fill positions. For each fill position set, the corresponding cache line fill request asks for a cache line to be filled into one of the fill positions in that set, and the cache line is filled into the appropriate fill position in response to the request. Once a cache line has been filled into a fill position, the cache line previously held there is pushed out of the private cache. If the evicted cache line needs a data write-back, its identification information is kept in the victim entry of the corresponding snoop entry set while the write-back is in progress.

For example, building on the correspondence between fill position sets and snoop entry sets and between fill positions and mapping entries described above, suppose the second cache line is held in fill position 1 of fill position set 1. When the private cache receives a cache line fill request and fills the first cache line into fill position 1 of fill position set 1, the second cache line is pushed out of the private cache and its identification information is moved from mapping entry 1 of snoop entry set 1 to the victim entry of snoop entry set 1. If the second cache line differs from the data in memory, it must be written back. While the write-back is in progress, the identification information of the second cache line remains in the victim entry of snoop entry set 1, which keeps the identification information stored in the snoop filter 15 consistent with the cache lines in the private cache.

In this embodiment, fill position sets correspond one-to-one to snoop entry sets, and each snoop entry set includes one victim entry. If a cache line needs a data write-back, its identification information must stay in the corresponding victim entry while the write-back is in progress, so that the cache line still has matching identification information in the snoop filter 15. If the identification information of another cache line were moved into the victim entry before the write-back finished, it would overwrite that cache line's identification information, and a cache line that has not yet been fully written back would have no matching identification information in the snoop filter 15. To prevent this, each fill position set is allowed at most one pending cache line fill request. This guarantees that every cache line held in the private cache has matching identification information in the snoop filter 15, and therefore that a processor core obtains correct results when it searches for data by querying the snoop filter 15.
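One way to picture the "at most one pending fill request per fill position set" rule is a per-set busy flag. The sketch below is an assumption-laden illustration: it treats a fill as complete only once any write-back of the evicted line has finished, and the names `FillGate`, `try_start_fill`, and `complete_fill` are hypothetical.

```python
class FillGate:
    """Allows at most one outstanding cache line fill request per fill position set."""

    def __init__(self, num_sets):
        self.pending = [False] * num_sets

    def try_start_fill(self, set_index):
        """Return True and mark the set busy, or False if a fill is still pending."""
        if self.pending[set_index]:
            return False  # the victim entry may still be protecting a write-back
        self.pending[set_index] = True
        return True

    def complete_fill(self, set_index):
        """Call once the fill, and any write-back of the evicted line, has finished."""
        self.pending[set_index] = False


gate = FillGate(num_sets=2)
first = gate.try_start_fill(1)    # True: set 1 was idle
second = gate.try_start_fill(1)   # False: set 1 already has a pending fill
other = gate.try_start_fill(0)    # True: other sets are unaffected
gate.complete_fill(1)
```

Because the flag is per set, fills to different fill position sets can still proceed concurrently in this model.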

In one possible implementation, when the snooping unit 152 transfers the identification information of the second cache line to the victim entry and the victim entry already stores the identification information of a third cache line, the identification information of the second cache line overwrites that of the third cache line in the victim entry.

For example, consider transferring the identification information of the second cache line from mapping entry 1 of snoop entry set 1 to the victim entry of snoop entry set 1. If that victim entry is empty, the identification information of the second cache line is stored in it directly. If that victim entry already stores the identification information of a third cache line, the identification information of the second cache line overwrites it, so that the victim entry of snoop entry set 1 then holds the identification information of the second cache line.
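The transfer and overwrite just described can be illustrated on a single snoop entry set. The sketch below is an assumption-laden model: the dictionary layout, the function name `transfer_to_victim`, and the use of `None` for an empty entry are all hypothetical, and the addresses are made-up identifiers.

```python
def transfer_to_victim(entry_set, way, new_id):
    """Move the old identifier of mapping entry `way` into the victim entry, then store new_id.

    entry_set has the (assumed) shape {"mapping": [id or None, ...], "victim": id or None}.
    Returns the identifier that was overwritten in the victim entry, or None.
    """
    old_id = entry_set["mapping"][way]
    overwritten = None
    if old_id is not None:
        overwritten = entry_set["victim"]  # third cache line's identifier, if any
        entry_set["victim"] = old_id       # second cache line's identifier replaces it
    entry_set["mapping"][way] = new_id     # first cache line's identifier
    return overwritten


# Snoop entry set 1: way 1 tracks the second line (0x2000), the victim entry
# still holds a stale third line (0x3000).
snoop_set_1 = {"mapping": [None, 0x2000], "victim": 0x3000}
dropped = transfer_to_victim(snoop_set_1, way=1, new_id=0x1000)
```

After the call, the mapping entry holds the first line's identifier, the victim entry holds the second line's identifier, and the third line's identifier has been discarded.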

In this embodiment, because each snoop entry set includes a victim entry, when the replaced cache line differs from the data in memory, the data fill request can be issued first and the data write-back request afterwards. There is no need to first issue an entry deletion operation to the snoop filter 15 and then an entry addition operation, which speeds up filling data into the cache. Moreover, when the identification information of a cache line is transferred from a mapping entry to the victim entry of the same snoop entry set, any identification information previously stored in the victim entry belongs to a cache line that no longer exists in the private cache, so it can be overwritten.

In one possible implementation, each processor core corresponds to one snoop filter 15. A processor core may include multiple private caches, each of which includes at least one filling position set. The snoop filter 15 includes multiple snoop entry sets, and the at least two filling position sets corresponding to these snoop entry sets are located in at least two private caches. Different snoop entry sets correspond to different filling position sets, and all of these private caches are located in the same processor core.

Each private cache includes at least one filling position set, and different private caches may include the same or different numbers of filling position sets. The total number of filling position sets across all private caches equals the number of snoop entry sets in the snoop filter 15. Each snoop entry set corresponds to one filling position set, and different snoop entry sets correspond to different filling position sets.

For example, a processor core includes a level-1 cache and a level-2 cache, each of which includes y filling position sets. The snoop filter includes 2y snoop entry sets: y of them correspond to the y filling position sets in the level-1 cache, and the other y correspond to the y filling position sets in the level-2 cache, with different snoop entry sets corresponding to different filling position sets.
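One way to realize the level-1/level-2 example is to give every (cache level, filling position set) pair its own snoop entry set number. The layout below, where sets 0 to y-1 serve the level-1 cache and sets y to 2y-1 serve the level-2 cache, is only an assumed numbering; the source requires that the correspondence be one-to-one but does not fix an ordering. The function name is hypothetical.

```python
def snoop_set_index(cache_level, fill_set_index, sets_per_cache):
    """Map (cache level, filling position set) to a unique snoop entry set number.

    cache_level: 1 for the level-1 cache, 2 for the level-2 cache.
    Assumed layout: sets 0..y-1 serve level 1, sets y..2y-1 serve level 2.
    """
    if cache_level not in (1, 2):
        raise ValueError("cache_level must be 1 or 2")
    if not 0 <= fill_set_index < sets_per_cache:
        raise ValueError("fill_set_index out of range")
    return (cache_level - 1) * sets_per_cache + fill_set_index


y = 4
# Every filling position set of both caches gets a distinct snoop entry set.
all_indices = [snoop_set_index(level, i, y) for level in (1, 2) for i in range(y)]
```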

In this embodiment, the snoop entry sets included in the snoop filter 15 may correspond to multiple filling position sets in the same private cache, or to multiple filling position sets in different private caches. The snoop filter 15 is therefore suitable both for a processor core with a single private cache and for a processor core with multiple private caches, which gives it broad applicability.

In one possible implementation, the filling positions in the same filling position set correspond to the same index information. The index information identifies the address segment of the cache lines that can be filled into those filling positions in the private cache.

Within the private cache, the filling position sets are assigned index information starting from 0 and incrementing by 1, so that their indexes are 0, 1, 2, and so on. The cacheable address range is divided into at least one address segment such that the number of address segments equals the number of filling position sets. The address segments are likewise assigned index information starting from 0 and incrementing by 1, so that each address segment corresponds to the filling position set with the same index information. A cache line located in a given address segment can be cached only in a filling position whose index information matches that of the segment.
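The segment-to-index rule can be sketched as follows. This reads the text literally and assumes the cacheable range is split into equal, contiguous segments; the source does not say how the segments are laid out, and practical caches often derive the index from a field of address bits instead. The function name and parameters are hypothetical.

```python
def segment_index(address, base, limit, num_sets):
    """Return the index of the address segment that contains `address`.

    The cacheable range [base, limit) is split into num_sets equal segments
    numbered 0, 1, 2, ... to match the filling position set indexes.
    """
    if not base <= address < limit:
        raise ValueError("address is outside the cacheable range")
    if (limit - base) % num_sets != 0:
        raise ValueError("range must divide evenly into num_sets segments")
    segment_size = (limit - base) // num_sets
    return (address - base) // segment_size


# A 0x4000-byte cacheable range split into 4 segments of 0x1000 bytes each.
low = segment_index(0x0000, 0, 0x4000, 4)
mid = segment_index(0x1FFF, 0, 0x4000, 4)
high = segment_index(0x3FFF, 0, 0x4000, 4)
```

A line whose address falls in segment 1 may then be placed only in a filling position of the set with index 1, in any of that set's ways.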

Corresponding to the filling positions in the private cache, the mapping entries in the snoop filter 15 can be assigned index information and way information such that each filling position and its mapping entry have the same index information and way information. Mapping entries with the same index information can then be grouped into the same snoop entry set. In addition, each victim entry is assigned index information, with different victim entries receiving different index information, so that each snoop entry set includes one victim entry and the mapping entries and victim entry of a snoop entry set share the same index information.

Fig. 4 is a schematic diagram of a private cache and a snoop filter provided by an embodiment of the present application. As shown in Fig. 4, the private cache includes y filling position sets, each of which includes x filling positions. Different filling position sets correspond to different index information: index0, index1, …, index(y-1). The filling positions within one filling position set share the same index information but correspond to different way information: way0, way1, …, way(x-1).

Corresponding to the filling positions in the private cache, the snoop filter 15 includes y snoop entry sets, each of which includes x mapping entries and 1 victim entry. Different snoop entry sets correspond to different index information: index0, index1, …, index(y-1). The mapping entries within one snoop entry set share the same index information but correspond to different way information: way0, way1, …, way(x-1).

The filling positions in the private cache correspond one-to-one with the mapping entries in the snoop filter 15. Each filling position in the private cache holds a cache line together with its identification information, and each mapping entry in the snoop filter 15 stores the identification information of the cache line held in the corresponding filling position.
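The layout of Fig. 4 can be modeled with plain lists to check the entry counts. This is only an illustrative data layout; the helper names `build_snoop_filter` and `mapping_entry_for` and the dictionary shape are assumptions, not structures named in the source.

```python
def build_snoop_filter(y, x):
    """Create y snoop entry sets, each with x mapping entries and 1 victim entry."""
    return [{"mapping": [None] * x, "victim": None} for _ in range(y)]


def mapping_entry_for(coord):
    """A filling position (index, way) uses the mapping entry with the same coordinates."""
    index, way = coord
    return (index, way)


y, x = 3, 2
snoop_filter = build_snoop_filter(y, x)

# Every filling position of the private cache, as (index, way) coordinates.
fill_positions = [(index, way) for index in range(y) for way in range(x)]
targets = {mapping_entry_for(c) for c in fill_positions}

total_mapping_entries = sum(len(s["mapping"]) for s in snoop_filter)
```

With y = 3 and x = 2 there are six filling positions, six distinct mapping entries, and three victim entries, one per snoop entry set.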

In this embodiment, the address segment associated with a filling position in the private cache serves as its index information. The filling positions in the private cache are divided into multiple filling position sets according to this index information, so that the filling positions in one set share the same index information and correspond to different way information. Each mapping entry in the snoop filter 15 is assigned its own combination of index information and way information, and a filling position corresponds to the mapping entry that has the same index information and way information. This yields a one-to-one correspondence between filling positions and mapping entries.

In one possible implementation, the identification information may be the address information of the cache line in memory.

In this embodiment, because different cache lines have different addresses in memory, the address information of a cache line in memory can serve as its identification information, which allows different cache lines to be distinguished. A processor core can then use the identification information of a cache line to look up the data it needs in the private caches of other processor cores, which keeps such lookups efficient and thereby maintains the performance of the processing unit.
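With memory addresses as identification information, a lookup reduces to checking whether an address appears in a core's snoop filter. The sketch below assumes one filter per core and treats a match in the victim entry as a possible hit, since the victim entry holds the identifier of a line that may still be in write-back; that choice, the function names, and the data layout are assumptions for illustration.

```python
def may_hold(snoop_filter, address):
    """Return True if the private cache tracked by this filter may hold `address`."""
    for entry_set in snoop_filter:
        if address in entry_set["mapping"] or entry_set["victim"] == address:
            return True
    return False


def cores_to_snoop(filters_by_core, requester, address):
    """List the cores, other than the requester, whose private cache should be snooped."""
    return [core for core, f in sorted(filters_by_core.items())
            if core != requester and may_hold(f, address)]


filters_by_core = {
    0: [{"mapping": [0x1000, None], "victim": None}],
    1: [{"mapping": [0x2000, None], "victim": 0x3000}],
    2: [{"mapping": [None, None], "victim": None}],
}
hit_victim = cores_to_snoop(filters_by_core, requester=0, address=0x3000)
hit_mapping = cores_to_snoop(filters_by_core, requester=2, address=0x1000)
miss = cores_to_snoop(filters_by_core, requester=0, address=0x9000)
```

An empty result means no other private cache needs to be snooped for that address, which is the filtering effect the text describes.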

Snoop filtering method

Fig. 5 is a flowchart of a snoop filtering method according to an embodiment of the present application. The method may be performed by the snoop filter 15 of the foregoing embodiments. As shown in Fig. 5, the snoop filtering method includes the following steps:

Step 501: Receive coordinate information sent by the private cache after a first cache line has been filled into the private cache, where the coordinate information indicates the target filling position of the first cache line in the private cache;

Step 502: When it is determined from the coordinate information that the mapping entry corresponding to the target filling position stores the identification information of a second cache line, transfer the identification information of the second cache line from that mapping entry to a victim entry, and store the identification information of the first cache line in the mapping entry corresponding to the target filling position, where different cache lines correspond to different identification information.

In this embodiment, the filling positions in the private cache correspond one-to-one with the mapping entries. After the first cache line has been filled into the target filling position, if the mapping entry corresponding to the target filling position stores the identification information of a second cache line, that identification information is transferred to the victim entry and the identification information of the first cache line is stored in the mapping entry. Because of this one-to-one correspondence, when the second cache line does not need to be written back, the snoop filter itself replaces the second cache line's identification information in the mapping entry with that of the first cache line and moves it to the victim entry. For cache lines that do not need a write-back, the private cache therefore does not have to send a delete request to the snoop filter, yet the snoop filter can still be used to determine which private caches cannot hold the required data, reducing the number of snoops of other processor cores' private caches. Eliminating these delete requests reduces the communication bandwidth they would occupy between the inside and the outside of the processor core, which maintains the performance of the processing unit.

In one possible implementation, when it is determined from the coordinate information that the mapping entry corresponding to the target filling position is empty, the identification information of the first cache line is stored in that mapping entry.
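Steps 501 and 502, together with the empty-entry case, can be sketched as one handler. This is a behavioral model under stated assumptions, not the hardware design: the class and method names are hypothetical, the coordinate is modeled as an (index, way) pair, and the first cache line's identification information is passed in as an argument because the source does not spell out how the snoop filter obtains it.

```python
class SnoopFilterModel:
    """Behavioral sketch of a snoop filter with one victim entry per snoop entry set."""

    def __init__(self, num_sets, num_ways):
        self.mapping = [[None] * num_ways for _ in range(num_sets)]
        self.victim = [None] * num_sets

    def on_fill(self, coord, first_id):
        """Handle the coordinate information sent by the private cache after a fill."""
        index, way = coord                    # step 501: target filling position
        second_id = self.mapping[index][way]
        if second_id is not None:
            # Step 502: move the evicted line's identifier to the victim entry
            # of the same snoop entry set (overwriting any older identifier).
            self.victim[index] = second_id
        # Empty or not, the mapping entry now tracks the newly filled line.
        self.mapping[index][way] = first_id


sf = SnoopFilterModel(num_sets=2, num_ways=2)
sf.on_fill((1, 0), 0xA000)   # mapping entry was empty: just store the identifier
sf.on_fill((1, 0), 0xB000)   # mapping entry occupied: 0xA000 moves to the victim entry
```

Note that the private cache sends only the coordinate for each fill; no separate delete message is modeled, which mirrors the bandwidth saving the embodiment describes.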

In one possible implementation, transferring the identification information of the second cache line from the mapping entry corresponding to the target filling position to the victim entry includes: determining the snoop entry set to which that mapping entry belongs, and transferring the identification information of the second cache line from the mapping entry to the victim entry included in that snoop entry set. Here, the snoop filter includes at least one snoop entry set; each snoop entry set includes a victim entry and at least one mapping entry and corresponds to one filling position set in the private cache; each filling position set includes at least one filling position; the number of mapping entries in each snoop entry set equals the number of filling positions in the corresponding filling position set; and, for a snoop entry set and its corresponding filling position set, different mapping entries correspond to different filling positions.

In one possible implementation, transferring the identification information of the second cache line from the mapping entry corresponding to the target filling position to the victim entry includes: if the victim entry stores the identification information of a third cache line, overwriting it with the identification information of the second cache line. Here, the number of outstanding cache line fill requests for each filling position set is less than or equal to 1, where a cache line fill request requests that a cache line be filled into a filling position of the corresponding filling position set.

It should be noted that the details of the snoop filtering method have already been described, with reference to the structural diagrams, in the snoop filter portion of the foregoing embodiments. For the specific process, refer to the description in those snoop filter embodiments; it is not repeated here.

Computer storage medium

The present application also provides a computer-readable storage medium storing instructions that cause a machine to perform the snoop filtering method described herein. Specifically, a system or apparatus equipped with a storage medium may be provided, where the storage medium stores software program code that implements the functions of any of the foregoing embodiments, and a computer (or CPU or MPU) of the system or apparatus reads and executes the program code stored in the storage medium.

In this case, the program code read from the storage medium itself implements the functions of any of the foregoing embodiments, so the program code and the storage medium storing it form part of the present application.

Examples of storage media for providing the program code include floppy disks, hard disks, magneto-optical disks, optical discs (such as CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW), magnetic tape, non-volatile memory cards, and ROM. Alternatively, the program code may be downloaded from a server computer over a communication network.

Computer program product

An embodiment of the present application also provides a computer program product including computer instructions that instruct a computing device to perform the operations corresponding to any of the foregoing method embodiments.

Commercial value of the embodiments of the present application

The embodiments of the present application address the problem that snoop filtering occupies a large share of the communication bandwidth between the inside and the outside of a processor core. A victim entry is provided, and when the private cache fills a cache line it notifies the snoop filter of the corresponding coordinate information; the snoop filter then stores the identification information of the filled cache line in the corresponding mapping entry. If the mapping entry already holds other identification information and the cache line it identifies does not need to be written back, the private cache only has to send the coordinate information, and the snoop filter moves the identification information of the original cache line into the victim entry. The private cache therefore does not need to send a delete request to the snoop filter, which reduces the communication bandwidth occupied between the inside and the outside of the processor core and thus maintains the performance of the processing unit. Because of the added victim entry, and because each filling position set is limited to a single in-flight fill request, when the replaced cache line differs from the data in memory the data fill request can be issued first and the data write-back request afterwards, which preserves the speed at which data is filled into the cache.

It should be understood that the embodiments in this specification are described in a progressive manner. For parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the method embodiments are described relatively briefly because they are substantially similar to the methods described in the apparatus and system embodiments; for relevant details, refer to the corresponding descriptions in those embodiments.

It should be understood that the foregoing describes specific embodiments of this specification. Other embodiments fall within the scope of the claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing are also possible or may be advantageous.

It should be understood that an element described herein in the singular, or shown only once in the drawings, is not thereby limited to one in number. Furthermore, modules or elements described or shown herein as separate may be combined into a single module or element, and a module or element described or shown herein as single may be split into multiple modules or elements.

It should also be understood that the terms and expressions used herein are for description only, and the one or more embodiments of this specification are not limited by them. The use of these terms and expressions is not intended to exclude any equivalents of the features shown and described (or portions thereof), and it should be recognized that various possible modifications also fall within the scope of the claims. Other modifications, variations, and alternatives are also possible. Accordingly, the claims should be regarded as covering all such equivalents.

Claims (14)

1. A snoop filter comprising:
a receiving unit, configured to receive coordinate information sent by a private cache after a first cache line is filled into the private cache, wherein the coordinate information indicates a target filling position of the first cache line in the private cache; and
a snooping unit, configured to, when it is determined according to the coordinate information that the mapping entry corresponding to the target filling position stores identification information of a second cache line, transfer the identification information of the second cache line from the mapping entry corresponding to the target filling position to a victim entry, and store the identification information of the first cache line in the mapping entry corresponding to the target filling position, wherein different cache lines correspond to different identification information.
2. The snoop filter as claimed in claim 1, wherein,
the snooping unit is configured to store the identification information of the first cache line in the mapping entry corresponding to the target filling position when it is determined according to the coordinate information that the mapping entry corresponding to the target filling position is empty.
3. The snoop filter of claim 1, wherein said snoop filter comprises at least one set of snoop entries, each set of snoop entries comprising a victim entry and at least one mapping entry, each set of snoop entries corresponding to a set of fill locations in said private cache, each set of fill locations comprising at least one fill location, a number of mapping entries in each set of snoop entries being equal to a number of fill locations in a corresponding set of fill locations, different mapping entries in a corresponding set of snoop entries and a set of fill locations corresponding to different fill locations;
and the snooping unit is configured to, when it is determined according to the coordinate information that the mapping entry corresponding to the target filling position stores identification information of a second cache line, determine the snoop entry set to which the mapping entry corresponding to the target filling position belongs, and transfer the identification information of the second cache line from the mapping entry corresponding to the target filling position to the victim entry included in that snoop entry set.
4. The snoop filter as claimed in claim 3, wherein the number of cache line fill requests to be completed for each set of fill locations is less than or equal to 1, a cache line fill request requesting a cache line to be filled to a fill location included in the corresponding set of fill locations.
5. The snoop filter of claim 4 wherein,
the snooping unit is configured to, when transferring the identification information of the second cache line to the victim entry, if the victim entry stores identification information of a third cache line, overwrite the identification information of the third cache line in the victim entry with the identification information of the second cache line.
6. The snoop filter of claim 3 wherein the snoop filter comprises at least two sets of snoop entries, at least two sets of fill locations corresponding to the at least two sets of snoop entries being located in at least two private caches, different sets of snoop entries corresponding to different sets of fill locations, the at least two private caches being located within the same processor core.
7. The snoop filter of claim 3 wherein each fill location in a same set of fill locations corresponds to the same index information identifying the address segment of the cache line that the fill location in the private cache can fill.
8. A snoop filter as claimed in any one of claims 1 to 7 wherein the identification information comprises address information in memory for the corresponding cache line.
9. A processing unit, comprising:
the snoop filter as claimed in any one of claims 1-8;
and at least two processor cores, configured to query the snoop filter to determine the private cache in which required data is located.
10. A computing device, comprising:
the processing unit of claim 9;
and a memory, coupled to the processing unit and configured to store data required by the processing unit.
11. A snoop filtering method comprising:
receiving coordinate information sent by a private cache after a first cache line is filled into the private cache, wherein the coordinate information is used for indicating a target filling position of the first cache line in the private cache;
and when it is determined according to the coordinate information that the mapping entry corresponding to the target filling position stores identification information of a second cache line, transferring the identification information of the second cache line from the mapping entry corresponding to the target filling position to a victim entry, and storing the identification information of the first cache line in the mapping entry corresponding to the target filling position, wherein different cache lines correspond to different identification information.
12. The snoop filtering method as claimed in claim 11, wherein the method further comprises:
when it is determined according to the coordinate information that the mapping entry corresponding to the target filling position is empty, storing the identification information of the first cache line in the mapping entry corresponding to the target filling position.
13. The snoop filtering method of claim 11, wherein the transferring the identification information of the second cache line from the mapping entry corresponding to the target filling position to a victim entry comprises:
determining the snoop entry set to which the mapping entry corresponding to the target filling position belongs, and transferring the identification information of the second cache line from the mapping entry corresponding to the target filling position to the victim entry included in that snoop entry set, wherein the snoop filter includes at least one snoop entry set, each snoop entry set includes a victim entry and at least one mapping entry, each snoop entry set corresponds to one filling position set in the private cache, each filling position set includes at least one filling position, the number of mapping entries in each snoop entry set is equal to the number of filling positions in the corresponding filling position set, and different mapping entries in a snoop entry set correspond to different filling positions in the corresponding filling position set.
14. The snoop filtering method of claim 13, wherein the transferring the identification information of the second cache line from the mapping entry corresponding to the target filling position to a victim entry comprises:
if the victim entry stores identification information of a third cache line, overwriting the identification information of the third cache line in the victim entry with the identification information of the second cache line, wherein the number of cache line fill requests to be completed for each filling position set is less than or equal to 1, and a cache line fill request requests that a cache line be filled into a filling position included in the corresponding filling position set.
CN202210948674.9A 2022-08-09 2022-08-09 Snooping filter, processing unit, computing device and related methods Pending CN115357525A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210948674.9A CN115357525A (en) 2022-08-09 2022-08-09 Snooping filter, processing unit, computing device and related methods

Publications (1)

Publication Number Publication Date
CN115357525A (en) 2022-11-18

Family

ID=84033685

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210948674.9A Pending CN115357525A (en) 2022-08-09 2022-08-09 Snooping filter, processing unit, computing device and related methods

Country Status (1)

Country Link
CN (1) CN115357525A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119127731A (en) * 2024-11-15 2024-12-13 中科亿海微电子科技(苏州)有限公司 A monitoring filter
CN119621652A (en) * 2025-02-11 2025-03-14 北京开源芯片研究院 Processor core selection method, device, electronic device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7325102B1 (en) * 2003-11-17 2008-01-29 Sun Microsystems, Inc. Mechanism and method for cache snoop filtering
US20170255557A1 (en) * 2016-03-07 2017-09-07 Qualcomm Incorporated Self-healing coarse-grained snoop filter
US20170286299A1 (en) * 2016-04-01 2017-10-05 Intel Corporation Sharing aware snoop filter apparatus and method
US20210149819A1 (en) * 2019-01-24 2021-05-20 Advanced Micro Devices, Inc. Data compression and encryption based on translation lookaside buffer evictions
US20210294743A1 (en) * 2020-03-17 2021-09-23 Arm Limited Apparatus and method for maintaining cache coherence data for memory blocks of different size granularities using a snoop filter storage comprising an n-way set associative storage structure

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119127731A (en) * 2024-11-15 2024-12-13 中科亿海微电子科技(苏州)有限公司 A monitoring filter
CN119127731B (en) * 2024-11-15 2025-03-04 中科亿海微电子科技(苏州)有限公司 A monitoring filter
CN119621652A (en) * 2025-02-11 2025-03-14 北京开源芯片研究院 Processor core selection method, device, electronic device and storage medium

Similar Documents

Publication Publication Date Title
US9921972B2 (en) Method and apparatus for implementing a heterogeneous memory subsystem
US8250254B2 (en) Offloading input/output (I/O) virtualization operations to a processor
CN112416817B (en) Prefetching method, information processing device, device and storage medium
US8423715B2 (en) Memory management among levels of cache in a memory hierarchy
US20080235477A1 (en) Coherent data mover
CN108139981B (en) Access method for page table cache TLB table entry and processing chip
CN112631961B (en) A memory management unit, an address translation method and a processor
CN112631962B (en) Storage management device, storage management method, processor and computer system
US8185692B2 (en) Unified cache structure that facilitates accessing translation table entries
CN114328295A (en) Storage management apparatus, processor, related apparatus and related method
JP2012532381A (en) Extended page size with agglomerated small pages
JP6859361B2 (en) Performing memory bandwidth compression using multiple Last Level Cache (LLC) lines in a central processing unit (CPU) -based system
CN112540939A (en) Storage management device, storage management method, processor and computer system
CN112559389B (en) Storage control device, processing device, computer system and storage control method
US12175250B2 (en) Computing device and method for fusing and executing vector instructions
US11061820B2 (en) Optimizing access to page table entries in processor-based devices
CN114816666B (en) Configuration method of virtual machine manager, TLB (translation lookaside buffer) management method and embedded real-time operating system
CN116795736A (en) Data pre-reading method, device, electronic equipment and storage medium
CN113722247A (en) Physical memory protection unit, physical memory authority control method and processor
CN116383101A (en) Memory access method, memory management unit, chip, device and storage medium
US20240345774A1 (en) Information processing system
CN115357525A (en) Snooping filter, processing unit, computing device and related methods
CN120104043A (en) Data processing method, device and computing equipment
CN117873921A (en) A queue page table management method and device based on remote direct memory access
JP3973129B2 (en) Cache memory device and central processing unit using the same

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240301

Address after: 310052 Room 201, floor 2, building 5, No. 699, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Applicant after: C-SKY MICROSYSTEMS Co.,Ltd.

Country or region after: China

Address before: 201208 floor 5, No. 2, Lane 55, Chuanhe Road, No. 366, Shangke Road, China (Shanghai) pilot Free Trade Zone, Pudong New Area, Shanghai

Applicant before: Pingtouge (Shanghai) semiconductor technology Co.,Ltd.

Country or region before: China