
LJgibbs · 2023-07-06 · Beijing

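[Translation] Constraining Timing Paths in Synthesis, Part 1

Original article: https://vlsitutorials.com/constraining-timing-paths-in-synthesis-part-1/
The English original is appended below.

This is the first article in the series on how to define synthesis timing constraints.

The goal of this article is to constrain all input, output, and internal timing paths of a demo design.

Suppose we are the designer of a simple design (simple as it is, I will still call it an IP). The IP has a single clock domain, with combinational logic between the input port and an internal flip-flop, between two flip-flops, and between a flip-flop and the output port. The rest of this article shows how to build constraints for this IP.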

image.png
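Figure 1: Block diagram of the design to be constrained

In general, our IP is integrated with other IPs into an SoC (system on chip), as shown in Figure 2. The signals at our input ports therefore come from another IP, which we call IP-2. Likewise, our outputs drive another IP, which we call IP-3.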

image.png
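Figure 2: The design after integration with other IPs

Note: once the input and output constraints are defined, the synthesis tool assumes by default that the input data is launched by a rising-edge-clocked device upstream of our design (here IP-2), and that our output data is captured by a rising-edge-clocked flip-flop in the downstream design (here IP-3).

Before constraining the input, internal, and output paths, we first need to understand what a timing path is. When a synthesis tool performs timing analysis, it breaks the design into timing paths. Every timing path has a startpoint and an endpoint; the possible startpoints and endpoints are listed below.

Startpoints

  • An input port (other than a clock input port)
  • The clock pin of a flip-flop (FF)

Endpoints

  • An output port (other than a clock output port)
  • Any input pin of a flip-flop (other than the clock pin)

Given these possible startpoints and endpoints, we get the following four example paths, shown in Figure 3.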

image.png
Figure 3: Timing paths in the design

Path 1: input port to FF input pin

Path 2: FF clock pin to FF input pin

Path 3: FF clock pin to output port

Path 4: input port to output port
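
The four path types can be inspected one by one after the design has been constrained. The following is only a sketch, assuming a Synopsys-style Tcl shell (such as Design Compiler or PrimeTime) in which `all_inputs`, `all_outputs`, `all_registers`, `remove_from_collection`, and `report_timing` are available, and a clock port named CLK as in the figures. Other tools may need different commands.

```tcl
# Startpoints: input ports except the clock port, and flip-flop clock pins.
set in_ports  [remove_from_collection [all_inputs] [get_ports CLK]]
set clk_pins  [all_registers -clock_pins]

# Endpoints: output ports, and flip-flop data pins.
set out_ports [all_outputs]
set data_pins [all_registers -data_pins]

# Report the worst path of each type.
report_timing -from $in_ports -to $data_pins   ;# Path 1: input port -> FF
report_timing -from $clk_pins -to $data_pins   ;# Path 2: FF -> FF
report_timing -from $clk_pins -to $out_ports   ;# Path 3: FF -> output port
report_timing -from $in_ports -to $out_ports   ;# Path 4: input port -> output port
```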

With these basics in place, let us return to constraining our design.

Constraining the register-to-register (reg2reg) path

First, we constrain the timing paths between flip-flops (register-to-register, usually abbreviated reg2reg).

image.png
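Figure 4: Constraining the reg-to-reg timing path

To constrain reg2reg paths for setup time, we only need to give the synthesis tool the clock period of the registers. Once a clock is constrained, every timing path in that clock domain is constrained, and the clock period is applied to the setup analysis of those paths. Assuming the clock in our design has a period of 5 ns, we define a 5 ns clock constraint and specify the clock port in it.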

create_clock -period 5 [get_ports CLK]

Note: the time unit in this example is 1 ns.

Note: by default, the synthesis tool assumes the clock rises at time 0 with a 50% duty cycle.
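
For illustration, the default waveform can also be written out explicitly with the standard SDC `-waveform` option. This sketch should be equivalent to the shorter command above; the `-name CLK` argument is an addition made here for readability.

```tcl
# Same 5 ns clock with the default waveform spelled out:
# rising edge at 0 ns, falling edge at 2.5 ns (50% duty cycle).
create_clock -name CLK -period 5 -waveform {0 2.5} [get_ports CLK]
```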

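This raises a question about the setup check on this path.
Question 1: Suppose FF-1 and FF-2 in Figure 4 come from the same technology library, with a clock-to-Q delay of 0.5 ns and a setup requirement of 1 ns. With a clock period of 5 ns, what is the maximum delay that the combinational logic between the two flip-flops can introduce without violating timing?
Answer: 3.5 ns. See the end of the article for the analysis.

Once the clock constraint above takes effect, the clock behaves ideally inside the synthesis tool: its rise and fall times are 0 and its skew is 0 (because clock tree synthesis has not yet been performed at the synthesis stage). By constraining the clock transition time and the clock skew, we can give the synthesis tool a more accurate picture of the clock and so get a more realistic timing analysis during synthesis. The values used are generally estimates.

Modeling clock skew

The clock uncertainty constraint estimates the maximum delay difference between different points of the clock network, also known as clock skew. In a broader sense it also covers other non-ideal clock behavior, such as clock jitter and reserved margin.

set_clock_uncertainty -setup Tu [get_clocks <clock>]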

Concrete example:

create_clock -period 5 [get_ports CLK]
set_clock_uncertainty -setup 0.75 [get_clocks CLK]
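
As a rough illustration of what this uncertainty does to the reg2reg budget, the setup check from Question 1 can be rewritten with the uncertainty subtracted from the available period. The symbols are introduced here for convenience and are not from the original article: T_clk is the clock period, T_cq the clock-to-Q delay of FF-1, T_su the setup time of FF-2, T_u the setup uncertainty, and T_logic2 the delay of combo logic-2.

```latex
\begin{aligned}
T_{cq} + T_{logic2} &\le T_{clk} - T_{su} - T_{u} \\
T_{logic2} &\le 5 - 1 - 0.75 - 0.5 = 2.75\ \text{ns}
\end{aligned}
```

Under these assumptions the 0.75 ns of uncertainty reduces the budget for combo logic-2 from 3.5 ns to 2.75 ns.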

Modeling transition time

We need to model the rise and fall times of the clock signal at the flip-flop clock pins.

set_clock_transition -max Tt [get_clocks <clock>]

Concrete example:

create_clock -period 5 [get_ports CLK]
set_clock_transition -max 0.1 [get_clocks CLK]

Constraining the Input1 path

image.png
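Figure 5: Constraining the input-to-reg timing path

To constrain all input paths of the design for setup time, we must provide, in addition to the clock, the latest arrival time of the data at the input ports relative to the clock edge (the input delay). In our example the launching flip-flop is in IP-2, so the input delay of IP-1 equals the clock-to-Q delay of the launching flip-flop FF-3 plus the delay of combo logic-5.

Suppose the designer of IP-2 tells us that the latest arrival time of the data at port Input1 of IP-1 is 1.5 ns after the launching clock edge of FF-3. We can model this information with the following timing constraint: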

set_input_delay -max 1.5 -clock CLK [get_ports Input1]
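This raises another question about the setup check on this path.
Question 2: Suppose the input delay at port Input1 is 1.5 ns, the setup requirement of FF-1 is 1 ns, the clock period is 5 ns, the clock uncertainty is 0.75 ns, and the transition time is 0.1 ns. What is the maximum delay that combo logic-1, between the input port and FF-1, can introduce without violating setup timing?
Answer: 1.65 ns. See the end of the article for the analysis.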
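
When many data inputs share the same launch clock and the same worst-case external delay, the constraint can be applied to all of them at once. The sketch below assumes a Synopsys-style Tcl shell that provides `remove_from_collection`, and it assumes, purely for illustration, that every data input has the 1.5 ns delay quoted for Input1.

```tcl
# Assumption: every data input is launched by CLK with the same
# 1.5 ns worst-case external delay. The clock port itself is excluded.
set data_inputs [remove_from_collection [all_inputs] [get_ports CLK]]
set_input_delay -max 1.5 -clock CLK $data_inputs
```

A port with a different budget can then be given its own `set_input_delay` afterwards, which replaces the blanket value for that port.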

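Constraining the Output1 path

image.png
Figure 6: Constraining the reg-to-output timing path

As with the input ports, to constrain all output paths of the design for setup time, we must provide, in addition to the clock, the latest arrival time of the data at the output ports relative to the capturing clock edge (the output delay). In our example the capturing flip-flop is in IP-3, so the output delay of IP-1 equals the setup time of the capturing flip-flop FF-4 plus the delay of combo logic-7.

Suppose the designer of the downstream block IP-3 tells us that the data at output port Output1 of IP-1 must arrive no later than 2 ns before the capturing clock edge of FF-4. We can model this information with the following timing constraint: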

set_output_delay -max 2 -clock CLK [get_ports Output1]
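
To see what this leaves for our own output logic, here is a worked inequality under stated assumptions: the launching flip-flop inside IP-1 has the 0.5 ns clock-to-Q delay from Question 1, the logic in front of Output1 is combo logic-3, and the clock is ideal. The symbols T_clk, T_cq, T_out, and T_logic3 are introduced here and are not from the original article.

```latex
\begin{aligned}
T_{cq} + T_{logic3} &\le T_{clk} - T_{out} \\
T_{logic3} &\le 5 - 2 - 0.5 = 2.5\ \text{ns}
\end{aligned}
```

If the 0.75 ns setup uncertainty from the earlier example is also applied, the same reasoning leaves 2.5 - 0.75 = 1.75 ns.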

Constraining the port-to-port combinational path

image.png
Figure 7: Constraining the port-to-port timing path

We can constrain a purely combinational path by using the input delay and output delay constraints together.

Example:

set_input_delay -max 2 -clock CLK [get_ports Input2]
set_output_delay -max 2.5 -clock CLK [get_ports Output2]
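
With both delays referenced to the same 5 ns clock, the budget left for the logic between Input2 and Output2 follows directly. This is a sketch assuming an ideal clock with no uncertainty; T_in, T_out, and T_comb are symbols introduced here for the input delay, the output delay, and the port-to-port logic delay.

```latex
\begin{aligned}
T_{in} + T_{comb} + T_{out} &\le T_{clk} \\
T_{comb} &\le 5 - 2 - 2.5 = 0.5\ \text{ns}
\end{aligned}
```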

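Constraining a purely combinational design

image.png
Figure 8: Constraining a purely combinational design

Suppose our design contains only combinational logic (and therefore no clock signal). We can still use set_input_delay and set_output_delay to constrain the inputs and outputs, because these constraints simply reflect the delays inside IP-2 and IP-3. There is one problem, though: input and output delays are always specified relative to an edge of the clock associated with the port, so how do we specify them when no real clock exists? The answer is a virtual clock. A virtual clock is not a physical clock attached to any port; it exists only as a reference against which the input and output delays are specified.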

create_clock -name VCLK -period 5

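Then constrain the input and output delays relative to the virtual clock.

set_input_delay -max 2 -clock VCLK [get_ports Input2]
set_output_delay -max 2.5 -clock VCLK [get_ports Output2]

At this point you may ask why we are discussing this at all. After all, an IP made of combinational logic only does not exist in the real world.

That is true, but look at the path between FF-5 and FF-6 in the example: inside IP-1 it is exactly the purely combinational case described above.

image.png
Figure 9: A path that shows why a virtual clock is needed

Time Budgeting

In all the examples above we assumed that the actual delay figures for the input and output delays can be obtained from the other IP designers. In reality this is impractical: for a design with a large number of ports it is far too tedious, and it makes the IPs depend on one another.

Instead, we can exploit the fact that the upstream block's output logic and the downstream block's input logic share one full clock period. Examples are IP-2's output logic (combo logic-5) together with our IP's input logic (combo logic-1), and our IP's output logic (combo logic-3) together with IP-3's input logic (combo logic-7). In each pair, the output-logic delay plus the input-logic delay must be less than or equal to one clock period.

This is where time budgeting comes in: it divides that one clock period sensibly between our IP and its neighbour. A first attempt is a 50-50 split, where each IP gets half a period for its input or output logic to close timing. The problem is that this leaves no room for error (and no budget for the wiring between the two blocks). A better approach is to give each block less than half a period, so that some time remains to absorb any extra delay.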
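
A budget like this could be scripted as follows. This is only a sketch: the 40/40/20 split is a hypothetical choice made here, not a figure from the article, and the port names are taken from the earlier examples. The external delay handed to the tool is the part of the period that our own logic may not use.

```tcl
# Hypothetical budget: 40% of the period for our IP's I/O logic,
# 40% for the neighbouring IP, 20% left as margin.
set period    5.0
set our_share 0.4

# Everything outside our share is modeled as external delay
# (60% of the period, i.e. 3 ns for a 5 ns clock).
set ext_delay [expr {$period * (1.0 - $our_share)}]

create_clock -period $period [get_ports CLK]
set_input_delay  -max $ext_delay -clock CLK [get_ports Input1]
set_output_delay -max $ext_delay -clock CLK [get_ports Output1]
```

Note that in this sketch our 40% share also has to cover the flip-flop setup time or clock-to-Q delay and any clock uncertainty, so the combinational logic itself gets somewhat less than 2 ns.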

Solutions

Answer 1: To satisfy the setup time of the path:

  • (clock-to-Q delay of FF-1 + delay due to combo logic-2) ≤ (time period of clock – setup time of FF-2)
  • (delay due to combo logic-2) ≤ (time period of clock – setup time of FF-2 – clock-to-Q delay of FF-1)
  • (delay due to combo logic-2) ≤ 5 – 1 – 0.5 = 3.5 ns, so the maximum combinational delay is 3.5 ns.

Answer 2: To satisfy the setup time of the path:

  • (Input delay of port Input1 + delay due to combo logic-1) ≤ (time period of clock – setup time of FF-1 – clock uncertainty – clock transition time)
  • (delay due to combo logic-1) ≤ (time period of clock – setup time of FF-1 – clock uncertainty – clock transition time – Input delay of port Input1)
  • (delay due to combo logic-1) ≤ 5 – 1 – 0.75 – 0.1 – 1.5 = 1.65 ns, so the maximum combinational delay is 1.65 ns.

Original text
This is article 1 of the series on how to define synthesis timing constraints.

The objective is to define setup timing constraints for all input, internal, and output paths.

Suppose we have a very simple and generic design (an IP) and we are the IP designer. It has a single clock domain, with combinational logic between the input port and a flip-flop, internal combinational logic between two flip-flops, and combinational logic between a flip-flop and the output port. Let's see how we can constrain this design.

image.png
Figure 1: The design we want to constrain

Now typically our IP will be integrated with other IPs in a SoC (as shown in Figure 2). So the input to our IP will be coming from another IP, in this case from IP-2 and the output from our IP will be going to another IP, in this case to IP-3.

image.png
Figure 2: Our design integrated with other designs (or IPs)

Note: After defining the input and output constraints, by default the synthesis tool assumes the input data arrives from a pos-edge clocked device prior to our design (i.e. from IP-2) and the output data goes to a pos-edge clocked flip-flop in the subsequent design (i.e. to IP-3).

Before we learn how to apply timing constraints to input, internal, or output paths, we first need to understand the definition of a path. When a synthesis tool performs timing analysis, it breaks the design into timing paths. A timing path has a startpoint and an endpoint; the possible startpoints and endpoints of a timing path are discussed below.

Startpoint

  • Input port (other than a clock port)
  • Clock pin of flip-flop

Endpoint

  • Output port (other than a clock port)
  • Any input pin of a flip-flop (other than a clock pin)

Applying these definitions, let's look at an example (Figure 3).

image.png
Figure 3: Timing paths in a design

  • Path 1: Starts from an input port and ends in an input pin of flip-flop.
  • Path 2: Starts from a clock pin of flip-flop and ends in an input pin of flip-flop.
  • Path 3: Starts from a clock pin of flip-flop and ends in an output port.
  • Path 4: Starts from an input port and ends in an output port.

Now let’s go back to our design.

First we will focus on constraining register-to-register path –

image.png
Figure 4: Constraining reg-to-reg timing path

To constrain register-to-register paths for setup time, we only need to provide the clock period to the synthesis tool. Defining the clock in a single-clock design constrains all timing paths between registers for a single-cycle setup time analysis. Assume the clock in our design has a period of 5ns; we then define a clock with a 5ns period and specify the clock port in the design.

create_clock -period 5 [get_ports CLK]

Note: Unit of time is 1ns in this example.

Note: Synthesis tool assumes the clock rises at zero ns with 50% duty cycle, by default.

Question 1: Suppose in our design both flip-flops (FF-1 and FF-2) are from the same technology library, with a clock-to-Q delay of 0.5ns and a setup time of 1ns. Given a clock period of 5ns, what is the maximum delay that the combinational logic between the two flip-flops (combo logic-2) can introduce without violating the setup time? [The answer is 3.5ns; for the solution, refer to the end of this article.]

In the synthesis tool, clocks have ideal behavior, meaning zero rise/fall time and zero skew (pre-layout synthesis runs before any clock tree synthesis). Estimated skew and transition time can, and should, be modeled for a more accurate representation of clock behavior and therefore a more realistic timing analysis.

Modeling clock skew

Uncertainty (clock uncertainty) models the maximum delay difference between the clock network branches, known as clock skew. It can also include other non-ideal clock behavior, such as clock jitter and margin:

set_clock_uncertainty -setup Tu [get_clocks <clock>]

Example:

create_clock -period 5 [get_ports CLK]
set_clock_uncertainty -setup 0.75 [get_clocks CLK]

Modeling transition time

We need to model the rise and fall times of the clock waveform at the flip-flop clock pins –

set_clock_transition -max Tt [get_clocks <clock>]

Example:

create_clock -period 5 [get_ports CLK]
set_clock_transition -max 0.1 [get_clocks CLK]

Next we will constrain the Input1 path –

image.png
Figure 5: Constraining input-to-reg timing path

To constrain all the input paths in our design for setup time, in addition to the clock, we must provide the latest arrival time of the data at the input ports relative to the launching flip-flop’s clock edge (the time delay is called the input delay). In our example, the launching flip-flop is from IP-2, thus the input delay is the sum of clock-to-Q delay of FF-3 and the delay due to combo logic-5.

Suppose the designer of IP-2 told us that the latest arrival time at port Input1, after the FF-3 launching clock edge is 1.5ns, then we can model the same for our design using –

set_input_delay -max 1.5 -clock CLK [get_ports Input1]

Question 2: For an input delay of 1.5ns at port Input1, what is the maximum delay that the combinational logic between port Input1 and FF-1 (combo logic-1) can introduce without violating the setup time? The setup time of FF-1 is 1ns, the clock period is 5ns, the clock uncertainty is 0.75ns, and the clock transition time is 0.1ns. [The answer is 1.65ns; for the solution, refer to the end of this article.]

Then we will constrain the Output1 path –

image.png
Figure 6: Constraining reg-to-output timing path

To constrain all the output paths in our design for setup time, in addition to the clock, we must provide the latest arrival time of the data at the output ports with respect to the capturing flip-flop’s clock edge (the time delay is called the output delay). In our example, the capturing flip-flop is in IP-3, thus the output delay is the sum of setup time of FF-4 and the delay due to combo logic-7.

Suppose the designer of IP-3 told us that the required latest arrival time at port Output1, before FF-4 capturing clock edge is 2ns, then we can model the same for our design using –

set_output_delay -max 2 -clock CLK [get_ports Output1]

Finally let’s constrain the port-to-port combinational path –

image.png
Figure 7: Constraining port-to-port timing path

We can constrain the pure combinational path using the input delay and output delay.

Example –

set_input_delay -max 2 -clock CLK [get_ports Input2]
set_output_delay -max 2.5 -clock CLK [get_ports Output2]

Constraining a purely combinational design

image.png
Figure 8: Constraining a purely combinational design

Suppose we have an IP that contains combinational circuits only (and thus no clocks in our design). We can still use set_input_delay and set_output_delay to constrain our inputs and outputs, because these constraints simply model the delay in IP-2 and IP-3. However, they are specified with respect to a clock. So how do we define a clock if there is no clock in our design? The answer is a virtual clock: a clock that is not connected to any port or pin within the current design but instead serves as a reference for input or output delays.

create_clock -name VCLK -period 5

Now constrain our design with respect to this virtual clock.

set_input_delay -max 2 -clock VCLK [get_ports Input2]
set_output_delay -max 2.5 -clock VCLK [get_ports Output2]

You may wonder why we are discussing this, since in practice we will never have a purely combinational IP. That is true, but consider the following case (notice the clock of FF-5 and FF-6; CLK-2 does not exist in our design):

image.png
Figure 9: A design illustrating the necessity of virtual clock

Time Budgeting

In all the examples above, we assumed that the input delay and output delay are provided to us by the designers of the other IPs. In reality this is not the case: it is a very cumbersome task for IPs with a large number of ports, and it makes the IPs interdependent.

Instead, we take advantage of the fact that IP-2's output logic (combo logic-5) together with our design's input logic (combo logic-1), or our design's output logic (combo logic-3) together with IP-3's input logic (combo logic-7), is always constrained by a full clock period. In other words, in both cases the sum of the delays must be less than or equal to the clock period.

So we create a time budget, meaning we allocate an appropriate portion of the clock period to our IP and to the other IPs. At first we might assume a 50-50 rule, i.e. every IP gets half a period to close timing, but this leaves no room for error. A better approach is to allocate less than 50% to each IP, so that some portion of the period is left to accommodate any unanticipated delay.

Solutions

Answer 1: For satisfying setup time:

  • (clock-to-Q delay of FF-1 + delay due to combo logic-2) ≤ (time period of clock – setup time of FF-2)
  • Thus, (delay due to combo logic-2) ≤ (time period of clock – setup time of FF-2 – clock-to-Q delay of FF-1)
  • This implies (delay due to combo logic-2) ≤ 5 – 1 – 0.5 = 3.5ns

Thus the maximum delay that can be introduced by combo logic-2 is 3.5ns.

Answer 2: For satisfying setup time:

  • (Input delay of port Input1 + delay due to combo logic-1) ≤ (time period of clock – setup time of FF-1 – clock uncertainty – clock transition time)
  • Thus, (delay due to combo logic-1) ≤ (time period of clock – setup time of FF-1 – clock uncertainty – clock transition time – Input delay of port Input1)
  • This implies (delay due to combo logic-1) ≤ 5 – 1 – 0.75 – 0.1 – 1.5 = 1.65ns

Thus the maximum delay that can be introduced by combo logic-1 is 1.65ns.

Source: Zhihu
Author: LogicJitterGibbs
