罗风 · September 22

DFT | SSN: Streaming Scan Network

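At a dinner over drinks, I listened to a DFT veteran from the industry talk about innovation in DFT. He began by lamenting that in recent years process and design have "advanced by leaps and bounds," riding one technology hotspot or gimmick after another, so that most technologies along the digital implementation flow have changed dramatically, especially with AI behind them, while DFT, a very important link in that flow, seems to have stood still the whole time. At the height of his story he abruptly changed the subject, eyes gleaming, and asked whether anyone at the table had looked into Siemens's SSN. Without waiting for a reaction, he went on to explain it himself: DFT has finally seen another major step forward. Scan had reached its limit, and SSN in effect opens a new door for scan design. SSN is particularly well suited to: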

  • Very large chips
  • Tile-based chips, that is, channel-less flows
  • Chips with many repeated IP blocks

The veteran summed up SSN's advantages concisely:

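  • Block design and the SoC top are completely decoupled: block-level work does not need to consider how the SoC top will be done, which makes for a truly hierarchical design;
  • Scan data is delivered as packets, so both test time and test data volume are much smaller;
  • On-chip compare is supported, which particularly suits designs containing many copies of a hardened block;
  • Test time is much shorter and test efficiency much higher: the scan shift clock can run very fast, at 100 to 200 MHz, which sharply cuts test time;
  • Because EDT channels are no longer constrained at the top level, many more EDT instances can be used, each with a lower compression ratio, which is very friendly to back-end routing.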
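
As a rough back-of-the-envelope check on the shift-clock point in the list above, shift time scales as patterns × shift cycles per pattern ÷ shift frequency. The sketch below assumes a hypothetical workload and a hypothetical 50 MHz baseline (neither figure comes from the whitepaper) and ignores capture and setup cycles; only the 200 MHz figure is the upper value quoted above.

```python
def scan_shift_time_s(num_patterns: int, shift_cycles_per_pattern: int,
                      shift_clock_hz: float) -> float:
    """Approximate time spent shifting scan data (capture/setup cycles ignored)."""
    return num_patterns * shift_cycles_per_pattern / shift_clock_hz


# Hypothetical workload: 20,000 patterns with 400 shift cycles each.
slow = scan_shift_time_s(20_000, 400, 50e6)   # assumed 50 MHz baseline
fast = scan_shift_time_s(20_000, 400, 200e6)  # 200 MHz, the upper figure quoted above
print(f"{slow * 1e3:.0f} ms -> {fast * 1e3:.0f} ms")  # prints "160 ms -> 40 ms"
```

In this simplified model the gain is purely the frequency ratio; any benefit from smaller data volume would come on top of that.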

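Ever since that dinner, Lao Lü has wanted to find time to write this up and share it so that more people see it. Lao Lü has only a partial understanding of DFT and has never actually done DFT design; everything here is hearsay. The content below comes from publicly available Siemens whitepapers found online, copied verbatim. Interested readers can follow up with their Siemens representative.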

What is SSN

SSN (Streaming Scan Network) distributes packetized scan test data to multiple cores across a design over a synchronous SSN bus. Each core typically contains one Streaming Scan Host (SSH) node. The SSH drives local scan resources to load and unload scan chains/channels with data delivered on the SSN bus. The SSH node can interface with one or more EDT controllers, uncompressed scan chains, or a combination of the two.

aijishu_dft1.png

Each SSH has an IEEE 1687 IJTAG interface, and a parallel data bus that transports the payload scan data and connects one SSH node to the next. The IJTAG network is used to configure all nodes in the SSN network prior to the application of a test pattern set. Each node is loaded with information related to the protocol such as the active bus width, its location in the series of nodes driven, the number of shift cycles per scan pattern, scan_enable transition timing information, etc.
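
To picture what "loaded once over IJTAG" might look like as data, here is a minimal sketch of a per-node setup record. The class, the field names, and the two sanity checks are all hypothetical illustrations built from the items listed above; they are not Tessent register names or a Tessent API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SshConfig:
    """Hypothetical per-node setup record; not Tessent's register map."""
    node: str
    active_bus_width: int          # SSN bus bits in use for this pattern set
    position: int                  # location in the series of nodes driven
    shift_cycles_per_pattern: int  # shift cycles for each scan pattern
    scan_enable_lead_cycles: int   # stand-in for scan_enable transition timing


def plan_setup(configs: List[SshConfig]) -> List[SshConfig]:
    """Return the nodes in bus order after basic sanity checks (assumed rules)."""
    positions = [c.position for c in configs]
    if len(set(positions)) != len(positions):
        raise ValueError("two nodes claim the same position on the bus")
    if len({c.active_bus_width for c in configs}) > 1:
        raise ValueError("nodes disagree on the active bus width")
    return sorted(configs, key=lambda c: c.position)


nodes = plan_setup([
    SshConfig("core_b", 8, 1, 250, 2),
    SshConfig("core_a", 8, 0, 400, 2),
])
print([n.node for n in nodes])  # prints ['core_a', 'core_b']
```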

The entire scan test pattern set is applied as packetized data that is streamed on the parallel SSN bus. The SSHs are programmed just once per pattern set and only the scan payload is streamed following the setup. There is no need to send any opcode or address information with each packet.
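
To make the "no opcode or address per packet" point concrete, here is a minimal Python sketch. It assumes a simplified model that is not taken from the whitepaper: each packet is just the concatenation of a fixed number of bits per core, packets are streamed back to back, and a packet may straddle bus words. The function names and the two-core example are hypothetical.

```python
from typing import Dict, List

Bits = List[int]


def pack_stream(payloads: Dict[str, List[Bits]], order: List[str]) -> Bits:
    """Concatenate per-core bits, packet by packet, into one flat bit stream."""
    num_packets = len(payloads[order[0]])
    stream: Bits = []
    for p in range(num_packets):
        for core in order:
            stream.extend(payloads[core][p])
    return stream


def to_bus_words(stream: Bits, bus_width: int) -> List[Bits]:
    """Cut the stream into bus words, zero-padding the last word."""
    padded = stream + [0] * (-len(stream) % bus_width)
    return [padded[i:i + bus_width] for i in range(0, len(padded), bus_width)]


def extract_for_core(words: List[Bits], order: List[str],
                     bits_per_packet: Dict[str, int], core: str,
                     num_packets: int) -> List[Bits]:
    """Pick one core's bits purely by position, as a preconfigured node could."""
    flat = [b for w in words for b in w]
    packet_size = sum(bits_per_packet[c] for c in order)
    offset = sum(bits_per_packet[c] for c in order[:order.index(core)])
    n = bits_per_packet[core]
    return [flat[p * packet_size + offset: p * packet_size + offset + n]
            for p in range(num_packets)]


order = ["core_a", "core_b"]
bits_per_packet = {"core_a": 3, "core_b": 2}   # 5-bit packets
payloads = {"core_a": [[1, 0, 1], [0, 1, 1]], "core_b": [[1, 1], [0, 0]]}
words = to_bus_words(pack_stream(payloads, order), bus_width=4)
print(words)  # prints [[1, 0, 1, 1], [1, 0, 1, 1], [0, 0, 0, 0]]
```

Because each node only needs its offset and bit count, which it received during setup, the stream carries payload alone. In this model the 5-bit packet size is unrelated to the 4-bit bus width.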

aijishu_dft2.png

Each SSH controls the local scan operations for the core, including transitions between load/unload and capture stages, as well as performing individual shift operations. All scan signals and EDT controls are generated by the SSH local to the core. Timing closure is self-contained to the core regardless of test mode. SSN clock skew between cores is tolerated without impacting the bus shift speed.

SSN provides a scalable method for testing any number of identical core instances. Stimuli, expected responses, and compare/nocompare mask data are scanned in within each packet. Each core performs its own on-chip comparison. The accumulated per-shift status bits and a per-core instance pass/fail sticky bit are observed on the tester. The packet accumulates the pass/fail status from a given channel/shift cycle across all identical core instances (or a subset of them).
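
The on-chip compare described above can be sketched in a few lines of Python. This is a behavioral toy model under stated assumptions, not the hardware: a mask bit of 1 is taken to mean "compare" and 0 "ignore", status is kept per channel and per shift cycle, and the function and instance names are hypothetical.

```python
from typing import Dict, List, Tuple

Bits = List[int]


def fail_bits(actual: Bits, expected: Bits, mask: Bits) -> Bits:
    """Per-channel fail flags for one shift cycle (mask 1 = compare, 0 = ignore)."""
    return [int(bool(m) and a != e) for a, e, m in zip(actual, expected, mask)]


def on_chip_compare(responses: Dict[str, List[Bits]], expected: List[Bits],
                    mask: List[Bits]) -> Tuple[List[Bits], Dict[str, bool]]:
    """responses[instance][cycle] holds the bits unloaded on each channel."""
    sticky = {name: False for name in responses}
    accumulated: List[Bits] = []
    for cycle, (exp, msk) in enumerate(zip(expected, mask)):
        status = [0] * len(exp)
        for name, resp in responses.items():
            fails = fail_bits(resp[cycle], exp, msk)
            # OR the fail flags across all identical instances.
            status = [s | f for s, f in zip(status, fails)]
            # Once an instance fails, its sticky bit stays set.
            sticky[name] = sticky[name] or any(fails)
        accumulated.append(status)
    return accumulated, sticky


expected = [[1, 0], [0, 1], [1, 1]]
mask = [[1, 1], [1, 0], [1, 1]]
responses = {
    "core0": [[1, 0], [0, 1], [1, 1]],  # matches everywhere
    "core1": [[1, 0], [0, 0], [1, 1]],  # differs only on a masked bit
    "core2": [[1, 0], [0, 1], [0, 1]],  # real mismatch in cycle 2, channel 0
}
accumulated, sticky = on_chip_compare(responses, expected, mask)
print(accumulated)  # prints [[0, 0], [0, 0], [1, 0]]
print(sticky)       # prints {'core0': False, 'core1': False, 'core2': True}
```

The accumulated bits say where a failure occurred, and the sticky bits say which instance failed, so the amount of data observed does not grow with the number of identical instances in this model.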

The SSN implementation flow is based on the Tessent Shell flow for hierarchical designs, and leverages Tessent Connect automation. Dedicated debug and verification capabilities—including DRCs, testbenches, SSN continuity patterns, SSN loopback patterns, and a streaming over IJTAG survivability mode—ensure a smooth and effective DFT implementation flow.

SSN is fully integrated with all other Tessent DFT technologies and products such as Tessent MemoryBIST and Tessent LogicBIST. SSN supports all ATPG pattern types and fault models. SSN has full support for diagnosis and yield analysis.

Why SSN is a major innovation

With the growth in design sizes, design flows have become more hierarchical, creating design cores that are functionally complete all the way through physical design. The finished blocks are then instantiated into the top level of a chip, or into chiplets and then into the chip top. Companies that tried to continue with full, flat ATPG or partition-based approaches with manual steps suffered severe time-to-market delays. Breaking designs into smaller pieces makes physical implementation more manageable for designers as well as for automation tools.

Hierarchical DFT also lets you take advantage of cores that have many identical instantiations, as seen in many AI designs. All the design effort goes into one instance, which can then be instantiated as many times as required. DFT also benefits from a similar divide-and-conquer approach that is consistent with the rest of the design flow and addresses the same problems with large designs. 

Tessent Hierarchical DFT was introduced so that not only can physical design blocks be functionally complete, but DFT complete as well. This methodology requires a few key technologies such as core wrapping for core isolation, graybox model generation to reduce machine memory consumption, and pattern retargeting to re-use core level generated patterns. 

aijishu_dft3.png

The move to hierarchical DFT has demonstrated dramatic improvements in all aspects of DFT. For large SoC designs, hierarchical DFT has become the standard practice. 

Tessent Streaming Scan Network (SSN) takes plug-and-play DFT to the next level. It provides a packetized data delivery bus on top of TestKompress to make core TestKompress optimization completely independent from the core embedding and SSN bus width. It also has completely scalable handling of identical cores utilizing on-chip compare.

Siemens's official summary of SSN features and benefits

  • Automated insertion and integration of SSN circuitry.
  • Pattern-generation and re-targeting for SSN.
  • Ideal for tiled designs with abutment.
  • Dedicated debug and verification, including DRCs, testbenches, continuity patterns, loopback patterns, and survivability mode (streaming over IJTAG).
  • Reverse mapping for diagnosis: full support for Tessent Diagnosis production diagnosis.
  • Compatible with Tessent Connect automation and all other Tessent DFT products and technologies such as Tessent MemoryBIST and Tessent LogicBIST.
  • Decouple core-level DFT configuration and chip-level DFT resources.
  • Dramatic reduction of DFT planning and implementation effort while minimizing test cost.
  • Independent shift/capture of concurrently tested cores.
  • Automated bandwidth management minimizes shift length/pattern count imbalance.
  • Time multiplexing maximizes shift frequency without impacting timing closure.
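
The whitepaper does not say how the bandwidth management bullet above is implemented. As one plausible illustration only, the sketch below splits the bits of each packet among cores in proportion to their total scan data volume, using largest-remainder rounding, so that all cores finish after roughly the same number of packets. The function names, core names, and volumes are hypothetical.

```python
import math
from typing import Dict


def allocate_bits(volumes: Dict[str, int], packet_bits: int) -> Dict[str, int]:
    """Proportional split of packet bits (largest-remainder rounding)."""
    total = sum(volumes.values())
    quot = {c: divmod(packet_bits * v, total) for c, v in volumes.items()}
    alloc = {c: q for c, (q, _) in quot.items()}
    leftover = packet_bits - sum(alloc.values())
    # Hand remaining bits to the cores with the largest remainders.
    for c in sorted(quot, key=lambda c: quot[c][1], reverse=True)[:leftover]:
        alloc[c] += 1
    if min(alloc.values()) == 0:
        raise ValueError("packet too narrow to give every core a bit")
    return alloc


def packets_needed(volumes: Dict[str, int], alloc: Dict[str, int]) -> Dict[str, int]:
    """Packets each core needs to receive all of its scan data."""
    return {c: math.ceil(volumes[c] / alloc[c]) for c in volumes}


volumes = {"cpu": 600_000, "gpu": 300_000, "io": 100_000}  # bits, made up
balanced = allocate_bits(volumes, packet_bits=10)
naive = {"cpu": 4, "gpu": 3, "io": 3}                      # near-equal split
print(balanced)                                       # prints {'cpu': 6, 'gpu': 3, 'io': 1}
print(max(packets_needed(volumes, balanced).values()))  # prints 100000
print(max(packets_needed(volumes, naive).values()))     # prints 150000
```

With the same 10-bit packet, the proportional split finishes in 100,000 packets, while the near-equal split is held back to 150,000 by the largest core.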