ISSN Online (2319-8753), Print (2347-6710)
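Reducing the size of the clock tree is known to be an effective way to lower power consumption in modern circuit design. However, most existing power-aware clock-tree minimization algorithms optimize power on the basis of flip-flops alone, which may limit the achievable power savings. To trade off power against timing, this paper explores the use of pulsed latches in a clock tree for further power savings. It is the first paper to propose a migration approach that efficiently constructs a clock tree containing both pulsed latches and flip-flops. The approach uses a minimum-cost maximum-flow formulation to determine the tree topology globally while maintaining load balance and accounting for the wirelength between pulse generators and pulsed latches. Experimental results show that, compared with the most recent paper, the proposed migration approach improves power consumption by 12% and 13% and skew by 7% and 70% on average on industrial circuits and the ISPD-2010 benchmarks, respectively.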
Keywords
Clock-tree migration, dynamic power reduction, pulse generator, pulsed latch.
Introduction
In current circuit designs, the most common storage element is the D-type flip-flop, which consists of two latches (master and slave) triggered by a clock signal. This design style makes it easy to apply static timing analysis (STA) for timing verification. Because a flip-flop has twice as many transistors as a single latch, latches are superior to flip-flops in area, transition time, and power dissipation. However, STA is difficult to perform on latch-based circuits because of data transparency. The pulsed-latch design style is therefore adopted to reduce dynamic power. A pulsed latch is a latch driven by a brief clock pulse produced by a pulse generator. When the pulsed clock waveform triggers a latch, the latch is synchronized with the clock and its timing behavior resembles that of an edge-triggered flip-flop. Hence, STA can be applied to a pulsed-latch clock tree.

Power consumption has become a critical concern in high-performance circuits because transistor counts have grown dramatically. Many techniques have been proposed to reduce total chip power, such as multiple supply voltages, clock gating, and clock-tree minimization. Because of heavy pipelining and high-frequency signal switching, the clock tree is a major contributor to power dissipation: it consumes 20%–40% of total power in a synchronous circuit. Chip power can therefore be reduced substantially by decreasing clock-tree power, and clock-tree power can in turn be reduced by decreasing the total clock wire capacitance.

Pulsed-latch designs combine the advantages of latches and flip-flops: they offer easier timing verification and lower power consumption. A 20% reduction in total dynamic power consumption has been reported in practice. Although pulsed latches can effectively reduce power consumption, most current design flows are built for flip-flop designs. To adopt pulsed latches in a current design flow, designers might change the circuit description in high-level synthesis. However, this modification incurs excessive cost and adds considerable complexity to the physical-synthesis stages. Therefore, this paper presents an efficient pulsed-latch migration approach in physical design that minimizes the cost of using pulsed latches under the current design flow. In pulsed-latch designs, a pulse generator is indispensable for generating the clock pulse.
Preliminaries
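Most design flows are built around flip-flop-based circuits. If designers instead adopt pulsed latches from the start and rerun the corresponding steps, such as placement and routing, substantial extra cost is unavoidable. Therefore, this paper proposes an efficient and effective migration approach that reduces the effort of converting a flip-flop-based clock tree into a pulsed-latch-based one. Given a flip-flop-based clock tree, we apply the migration approach to obtain a new clock tree with a mixture of pulsed latches and flip-flops.

The comparison of pulsed-latch migration flows covers (a) the previously proposed migration flow and (b) the proposed migration flow with a mixed structure. The previous paper migrates to pulsed latches with a single type of pulse generator. However, a pulsed-latch migration approach should allow a mixed structure of latches and flip-flops. In addition, the loads driven by pulse generators may differ, and replacing a generator with a smaller one further reduces power consumption.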
Circuit power consumption has two main components: static power and dynamic power. Static power, caused by transistor leakage current, is consumed even when the circuit is not operating. Dynamic power arises mainly from circuit operation. Because clock-tree power usually dominates the total power of a circuit, this paper focuses on dynamic power. To this end, the paper proposes a new clock tree with a mixed structure of pulsed latches and flip-flops and with multitype pulse-generator insertion.

In pulsed-latch circuits, the largest-size pulse generators are inserted to drive the pulsed latches. A pulse generator is indispensable for generating a clock pulse, but it consumes more power than a pulsed latch and a buffer. Although pulsed latches can reduce power dissipation, the total power of the clock tree may increase because of the additional pulse generators. Thus, there is a tradeoff between pulse-generator insertion and pulsed-latch substitution. Because the clock pulse is sensitive to output load, the load of a pulse generator must be controlled to avoid pulse degradation. Additionally, if designers do not limit the number of pulsed latches driven by a pulse generator, its fanout may become too large, which could lead to routing congestion.
Therefore, two major factors must be considered to control the output load: 1) the driving load of a pulse generator cannot exceed the maximum tolerable load defined in the library, and 2) the number of pulsed latches driven by a pulse generator should be smaller than the maximum fanout number. Because pulse generators consume large amounts of power in pulsed-latch circuits, reducing pulse-generator power is critical. Multitype pulse-generator insertion has been proposed to reduce unnecessary power dissipation: the method replaces the largest-size pulse generators with smaller ones wherever no constraint is violated, which further reduces power consumption. However, existing clock-tree minimization methods are primarily based on flip-flops and focus on wirelength minimization alone, which may limit the achievable power savings.
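The load and fanout limits and the generator downsizing step described above can be illustrated with a minimal sketch. The library entries, their power and load values, and the names `PulseGenerator` and `pick_generator` are hypothetical and are not taken from the paper; the sketch also assumes both limits are inclusive upper bounds.

```python
from typing import NamedTuple, Optional, Sequence


class PulseGenerator(NamedTuple):
    name: str
    power: float     # power drawn by the generator itself (arbitrary units)
    max_load: float  # maximum tolerable output load defined in the library


# Hypothetical multitype library, ordered from smallest to largest generator.
LIBRARY = [
    PulseGenerator("PG_S", 1.0, 20.0),
    PulseGenerator("PG_M", 1.6, 40.0),
    PulseGenerator("PG_L", 2.5, 80.0),
]


def pick_generator(load: float, fanout: int,
                   library: Sequence[PulseGenerator],
                   max_fanout: int) -> Optional[PulseGenerator]:
    """Return the lowest-power generator that can legally drive the group.

    Returns None when the fanout limit is exceeded or when even the largest
    generator cannot tolerate the load, meaning the group must be split.
    """
    if fanout > max_fanout:
        return None
    feasible = [g for g in library if load <= g.max_load]
    if not feasible:
        return None
    return min(feasible, key=lambda g: g.power)
```

Starting from the largest generator and calling `pick_generator` on each group mirrors the idea of replacing oversized generators only when no constraint is violated.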
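A high-power SoC not only raises system cost but also affects product lifetime and reliability. To optimize power consumption, many low-power design techniques have been introduced, such as clock gating, replacing non-timing-critical cells with their high-Vt counterparts, power gating, creating multi-supply-voltage designs, dynamic voltage/frequency scaling, and minimizing the clock network. To reduce runtime, the proposed approach uses a Voronoi diagram to partition the design into several polygons based on the locations of the pulsed latches. We solve the problem with a minimum-cost maximum-flow formulation that determines the clock-tree topology globally while maintaining proper load balance and accounting for the wirelength between pulse generators and pulsed latches.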
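One way to picture a minimum-cost maximum-flow assignment of pulsed latches to pulse generators is sketched below. The network construction is an assumption made for illustration, not the paper's actual model: each latch supplies one unit of flow, each generator accepts at most `max_fanout` units (a simple stand-in for load balance), and the edge cost is the Manhattan distance between latch and generator. The class and function names are hypothetical, and integer coordinates are assumed so that costs stay exact.

```python
from collections import deque


class MinCostFlow:
    """Successive shortest paths with a queue-based Bellman-Ford search."""

    def __init__(self, n):
        self.n = n
        self.graph = [[] for _ in range(n)]  # edge: [to, cap, cost, rev_index]

    def add_edge(self, u, v, cap, cost):
        self.graph[u].append([v, cap, cost, len(self.graph[v])])
        self.graph[v].append([u, 0, -cost, len(self.graph[u]) - 1])

    def solve(self, s, t):
        flow = cost = 0
        while True:
            dist = [float("inf")] * self.n
            prev = [None] * self.n  # (previous node, edge index) on the path
            queued = [False] * self.n
            dist[s] = 0
            queue = deque([s])
            while queue:
                u = queue.popleft()
                queued[u] = False
                for idx, (v, cap, c, _) in enumerate(self.graph[u]):
                    if cap > 0 and dist[u] + c < dist[v]:
                        dist[v] = dist[u] + c
                        prev[v] = (u, idx)
                        if not queued[v]:
                            queue.append(v)
                            queued[v] = True
            if prev[t] is None:  # no augmenting path left
                return flow, cost
            push, v = float("inf"), t
            while v != s:  # bottleneck capacity along the path
                u, idx = prev[v]
                push = min(push, self.graph[u][idx][1])
                v = u
            v = t
            while v != s:  # update residual capacities
                u, idx = prev[v]
                edge = self.graph[u][idx]
                edge[1] -= push
                self.graph[v][edge[3]][1] += push
                v = u
            flow += push
            cost += push * dist[t]


def assign_latches(latches, generators, max_fanout):
    """Assign each latch (x, y) to a generator (x, y) at minimum total distance."""
    n_l, n_g = len(latches), len(generators)
    s, t = n_l + n_g, n_l + n_g + 1
    net = MinCostFlow(n_l + n_g + 2)
    for i in range(n_l):
        net.add_edge(s, i, 1, 0)
    for j in range(n_g):
        net.add_edge(n_l + j, t, max_fanout, 0)
    for i, (lx, ly) in enumerate(latches):
        for j, (gx, gy) in enumerate(generators):
            net.add_edge(i, n_l + j, 1, abs(lx - gx) + abs(ly - gy))
    flow, cost = net.solve(s, t)
    assignment = {}
    for i in range(n_l):
        for v, cap, _, _ in net.graph[i]:
            if n_l <= v < n_l + n_g and cap == 0:  # saturated latch-to-generator edge
                assignment[i] = v - n_l
    return flow, cost, assignment
```

Because every latch is connected to every generator, the network grows quickly with the number of sinks, which is consistent with the motivation for first partitioning the design into smaller regions.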
Algorithm
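Among these techniques, minimizing the clock network is very important for reducing SoC power, because the clock network accounts for 50% of a chip's dynamic power, and dynamic power is the dominant power source, accounting for 75% of the SoC's total power consumption. Recent studies have proposed various ways to reduce clock-network power, including buffer sizing, register placement optimization, and the use of multi-bit flip-flops (MBFFs), also called multi-bit registers or register banks. This work addresses power optimization with MBFFs at the post-placement stage. It presents a new problem formulation for applying multi-bit flip-flops that simultaneously reduces total flip-flop power and interconnect wirelength while satisfying placement-density and timing-slack constraints. Based on this formulation, we propose a novel post-placement power-optimization flow together with flip-flop grouping and MBFF placement algorithms to solve the problem. We conducted experiments on five industrial circuits and eight circuits from the ISPD-2010 clock-network contest [23]. Obstacles are not considered in this paper; we use only the sink locations.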
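To make the two percentages quoted at the start of this section concrete, the following worked relation combines them. The symbols are illustrative names that do not appear in the source, and the first line is the conventional CMOS switching-power model (activity factor, switched capacitance, supply voltage, clock frequency) rather than a formula taken from this paper.

```latex
\begin{align*}
P_{\text{dyn}} &= \alpha \, C \, V_{DD}^{2} \, f \\
P_{\text{clk}} &\approx 0.50 \, P_{\text{dyn}}, \qquad
P_{\text{dyn}} \approx 0.75 \, P_{\text{total}} \\
\Rightarrow \quad P_{\text{clk}} &\approx 0.50 \times 0.75 \, P_{\text{total}}
  = 0.375 \, P_{\text{total}}
\end{align*}
```

Under these figures the clock network would account for roughly 37.5% of total SoC power, which falls within the 20%–40% range cited in the introduction. Since the switched capacitance enters the first line linearly, shortening clock wires reduces the clock network's dynamic power proportionally.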
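Ideally, a pulsed latch driven by a zero-width pulse behaves as an edge-triggered device. In practice, however, the pulse must be wide enough to allow data capture. Flip-flops are memory elements commonly used in sequential designs such as finite-state-machine controllers and cascaded circuits. The flip-flop grouping problem is formulated as m-clique finding and maximum-independent-set subproblems. Finally, a progressive window-based optimization technique is introduced to reduce placement deviation and improve the runtime efficiency of the algorithm. Experimental results show that this approach is very effective at reducing not only flip-flop power but also clock-tree and signal-net wirelength when multi-bit flip-flops are applied at the post-placement stage.

Pulsed latches are latches driven by a brief clock pulse. They retain the advantages of latch-based design while allowing flip-flop-style timing verification and optimization, because their short transparency window makes them behave like flip-flops. Several types of pulsed latches have been proposed, mainly for high-performance microprocessor designs. For example, pulsed latches are used on timing-critical paths while flip-flops are used on non-timing-critical paths. The application of pulsed latches to ASICs has been reported recently; replacing some flip-flops with pulsed latches can reduce total dynamic power by 20%.

Experimental results show that the migrated clock tree reduces power consumption by 12% and 15% and improves skew by 7% and 70% on average compared with the most recent paper on industrial circuits and the ISPD-2010 benchmarks, respectively. The first experiment compared flip-flop-based circuits, pulsed-latch-based circuits, and the proposed pulsed-latch migration scheme (with mixed sink types and multitype pulse generators) on industrial circuits. Table IV lists the comparisons of power dissipation improvement for different types of sinks. The columns Sinks, Bufs, PGs, and Wires list the power consumption of sinks, buffers, pulse generators, and wires, respectively. The column Total is the sum of the power dissipation of sinks, wires, and drivers. Columns #FF and #PL list the number of flip-flops and pulsed latches for mixed sink types, respectively. To prevent pulse distortion, the total load of a pulse generator cannot exceed the defined tolerable load, and the maximum fanout constraint must hold during migration.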
Problem Formulation
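Pulsed-latch-based circuits that use a single pulse width, which is the conventional approach, cannot exploit time borrowing because of the short transparency window. This can be alleviated by sequential optimization techniques such as retiming or clock skew scheduling. However, retiming often greatly increases the number of latches, which limits its practical use; it can also affect the verification methodology. Conventional clock skew scheduling assigns an arbitrary amount of skew to each latch to balance the delays between combinational blocks. It has been shown that the maximum difference in clock arrival times that can be reliably realized is less than 10% of the clock period in 0.18-μm technology, or 10%–16% in 0.18-μm and 0.13-μm technologies. This is also true for a clock mesh, where only a very small skew can be realized.

This paper presents a method that reduces power consumption by modifying the clock-tree topology while limiting clock skew. In addition to minimizing the dynamic power of the clock tree, it is necessary to control the number of pulse generators introduced. Because the tree topology changes after pulse-generator insertion, the clock skew may increase, so skew must be considered during clock-tree reconstruction. For the transition time of each cell, we applied STA to derive the timing information. When calculating the timing information, we also check each cell to verify whether the input net transition time or total net capacitance exceeds the maximum value defined in the library. In summary, the problem is formulated around these power, skew, load, and fanout requirements.

We first consider the pulse generators with the largest driving capability. The main difference from the previously proposed algorithm is that, to obtain better results, we cluster all pulsed latches globally rather than reclustering locally. Algorithm 1 gives the pseudocode for the proposed clustering procedure. Initially, each sink is regarded as an individual group, and the Manhattan distance between every pair of sinks in the pulsed-latch set PL is calculated. The distances are sorted in ascending order so that closer pairs are considered first. If the two sinks of a pair belong to different groups g_i and g_j, the groups are merged provided that all constraints are satisfied, where the load checked is the total load capacitance of g_i and g_j combined. To reduce power consumption, pulsed latches are merged by considering the tradeoff between the number of pulse generators and wirelength (line 8). This tradeoff achieves better power reduction than an approach that only minimizes the number of pulse generators.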
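The clustering procedure described above can be sketched roughly as follows. This is not the paper's Algorithm 1: it is a simplified greedy version that assumes each sink is a tuple `(x, y, cap)`, estimates the group load as sink capacitance plus wire capacitance to a pseudo pulse generator at the group's centroid, and treats the load and fanout limits as inclusive bounds. It does not model the power tradeoff the paper applies at line 8. All function and parameter names are hypothetical.

```python
from itertools import combinations


def manhattan(a, b):
    """Manhattan distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def group_load(members, sinks, wire_cap_per_unit):
    """Sink load plus wire load to a pseudo pulse generator at the group center."""
    center = (sum(sinks[i][0] for i in members) / len(members),
              sum(sinks[i][1] for i in members) / len(members))
    wirelength = sum(manhattan(sinks[i][:2], center) for i in members)
    sink_cap = sum(sinks[i][2] for i in members)
    return sink_cap + wire_cap_per_unit * wirelength


def cluster_pulsed_latches(sinks, c_max, max_fanout, wire_cap_per_unit):
    """Greedily merge groups, visiting sink pairs from closest to farthest."""
    group_of = list(range(len(sinks)))           # sink index -> group id
    groups = {i: [i] for i in range(len(sinks))}  # group id -> member sinks
    pairs = sorted(combinations(range(len(sinks)), 2),
                   key=lambda p: manhattan(sinks[p[0]][:2], sinks[p[1]][:2]))
    for i, j in pairs:
        gi, gj = group_of[i], group_of[j]
        if gi == gj:
            continue  # already in the same group
        merged = groups[gi] + groups[gj]
        if len(merged) > max_fanout:
            continue  # fanout constraint would be violated
        if group_load(merged, sinks, wire_cap_per_unit) > c_max:
            continue  # tolerable-load constraint would be violated
        groups[gi] = merged
        for k in groups.pop(gj):
            group_of[k] = gi
    return sorted(sorted(g) for g in groups.values())
```

Each resulting group would be driven by one pulse generator, so fewer groups mean fewer generators but typically longer wires inside each group.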
Experimental Results
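The algorithm was implemented in C/C++, and the experiments were run on a Linux machine with a 2-GHz Intel Xeon processor and 16 GB of memory. The STA method was built using source code from Lin et al. To compute the transition time of each cell, STA was used to derive timing information from the Synopsys cell library. The comparison covers pulsed-latch-based circuits and migrated clock trees with mixed sink types.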
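When calculating the timing information, each cell is also checked to determine whether its input net transition time or total capacitance exceeds the maximum value defined in the library. Flattened clock-tree synthesis and flattened buffer insertion are performed to construct the initial buffered clock tree, which is built with zero skew. The wire load is computed by multiplying the total wirelength between the pseudo pulse generator and the pulsed latches by the wire-load constant in [20]. Connection wirelength is estimated with the Manhattan distance. In each clustering iteration, a pseudo pulse generator is placed at the center of a group to determine whether the constraints can be satisfied. For the Cmax check, the total load capacitance of a group is the sum of the sink loads in the two selected groups and the wire load between each sink and the pseudo pulse generator.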
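The load check described above can be written compactly as follows. The notation is assumed for illustration and does not come from the source: $C_s$ is the input capacitance of sink $s$, $(x_s, y_s)$ its location, $(x_p, y_p)$ the location of the pseudo pulse generator at the group center, $c_w$ the wire-load constant per unit length, and $F_{\max}$ the maximum fanout.

```latex
\begin{align*}
C(g_i \cup g_j) &= \sum_{s \in g_i \cup g_j} C_s
  \;+\; c_w \sum_{s \in g_i \cup g_j} \bigl( |x_s - x_p| + |y_s - y_p| \bigr) \\
\text{merge allowed if} \quad & C(g_i \cup g_j) \le C_{\max}
  \quad \text{and} \quad |g_i \cup g_j| \le F_{\max}
\end{align*}
```

As a small example under these assumptions, two sinks of capacitance 1 each, placed one unit apart with the pseudo pulse generator midway and $c_w = 0.1$, give $C = 2 + 0.1 \times (0.5 + 0.5) = 2.1$.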
To provide a reasonable setup, we set the maximum tolerable load and the maximum fanout as constraints for the multitype pulse generators. To prevent pulse distortion, the total load of a pulse generator cannot exceed the defined tolerable load, and the maximum fanout constraint must hold during migration. The previous method sets the maximum tolerable load of a pulse generator to the maximum output load of a small buffer. However, the maximum tolerable load of a pulse generator should be much smaller than that to avoid pulse degradation. The statistics of the industrial circuits and the ISPD-2010 benchmark circuits are listed in a table whose Name column gives the test case and whose Chip Size and #Flip-Flops columns give the chip size and the number of flip-flops, respectively.

Clustering can be considered the most important unsupervised learning problem; like every other problem of this kind, it deals with finding structure in a collection of unlabeled data. A loose definition of clustering is "the process of organizing objects into groups whose members are similar in some way." A cluster is therefore a collection of objects that are similar to one another.

This section demonstrates the performance of multitype pulse-generator insertion. The cell library of multitype pulse generators is taken from prior work and is listed in a table.
Rows Cap and Load list the cell capacitance and the maximum tolerable load of the pulse generators, respectively. In small cases, the algorithm without a Voronoi diagram can still handle the problem adequately.
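However, the runtime increases significantly as the number of sinks grows, because of the high complexity of the network-flow model. Experimental results show that, with the Voronoi diagram, the worst-case skew improves by 93% and 29%, and power consumption also decreases, compared with the algorithm without the Voronoi diagram on industrial circuits and the ISPD-2010 benchmarks, respectively. These improvements confirm that the proposed Voronoi-diagram construction not only reduces runtime but also maintains solution quality.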
Conclusion
This paper presented an effective clock-tree migration approach that converts a flip-flop-based clock tree into a pulsed-latch-based one for dynamic power reduction. To prevent pulse degradation, the tolerable load of each pulse generator and the number of pulsed latches it drives are considered during pulse-generator insertion. To further reduce pulse-generator power, we enable multitype pulse generators and determine a suitable generator size for driving each group of pulsed latches. Considering the tradeoff between the additional power of multitype pulse-generator insertion and the power savings of pulsed-latch substitution, not all flip-flops are replaced. This allows the migrated clock tree to be constructed with a mixed structure of latches and flip-flops. To determine the topology while simultaneously minimizing wirelength and maintaining load balance, we applied a minimum-cost maximum-flow formulation to solve the pulsed-latch clustering problem. Experimental results indicate that the proposed migration approach improves both power consumption and skew compared with the most recent research on industrial circuits and the ISPD-2010 benchmarks. The approach achieves a 40.25% power reduction compared with the existing system. Future work will apply a dynamic voltage and frequency scaling (DVFS) algorithm, one of the most effective methods for low power consumption.
References