棋子 · February 17, 2020

Overlapping the execution of VDIV.F32 and SDIV/UDIV

The Cortex-M4F has separate hardware for integer and floating-point arithmetic. Both integer and floating-point divide instructions take up to 12 clock cycles to complete. I've verified that integer instructions immediately following a VDIV can execute while the VDIV is still finishing. However, the reverse does not seem to be true: floating-point instructions immediately following an integer divide (SDIV or UDIV) must wait for the divide to complete before they proceed. Does anyone know why you can't overlap the execution of an integer divide with the execution of floating-point instructions?
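For readers who want to reproduce this kind of measurement, the following is a minimal sketch and not the original poster's code. It assumes GCC or Clang inline assembly and the DWT cycle counter defined by the ARMv7-M architecture. The helper names (`cycles_init`, `cycles_now`, `time_vdiv_then_int`, `time_sdiv_then_fp`) are hypothetical. Off-target, a plain-C fallback computes the same values so the file still compiles, but it measures nothing.

```c
#include <stdint.h>

/* Assumption: GCC or Clang. The inline assembly is only used when building
 * for an ARMv7E-M core with a hardware FPU (e.g. Cortex-M4F); elsewhere a
 * plain-C fallback keeps the file compilable, but times nothing. */
#if defined(__ARM_ARCH_7EM__) && defined(__ARM_FP)
#define ON_M4F 1
/* Debug registers defined by the ARMv7-M architecture. */
#define DEMCR      (*(volatile uint32_t *)0xE000EDFCu)
#define DWT_CTRL   (*(volatile uint32_t *)0xE0001000u)
#define DWT_CYCCNT (*(volatile uint32_t *)0xE0001004u)
#else
#define ON_M4F 0
#endif

/* Hypothetical helper: turn on the DWT cycle counter (no-op off-target). */
static inline void cycles_init(void) {
#if ON_M4F
    DEMCR |= (1u << 24);   /* TRCENA: enable the DWT unit */
    DWT_CYCCNT = 0u;
    DWT_CTRL |= 1u;        /* CYCCNTENA: start counting */
#endif
}

/* Hypothetical helper: read the cycle counter (always 0 off-target). */
static inline uint32_t cycles_now(void) {
#if ON_M4F
    return DWT_CYCCNT;
#else
    return 0u;
#endif
}

/* Case A: VDIV.F32, then two integer adds that do not use its result.
 * Returns elapsed cycles, including the counter-read overhead. */
static inline uint32_t time_vdiv_then_int(float a, float b,
                                          float *q, uint32_t *n) {
    float quot;
    uint32_t count = *n;
    uint32_t start = cycles_now();
#if ON_M4F
    __asm volatile(
        "vdiv.f32 %0, %2, %3\n\t"
        "add      %1, %1, #1\n\t"
        "add      %1, %1, #1\n\t"
        : "=&t"(quot), "+r"(count)
        : "t"(a), "t"(b)
        : "memory");
#else
    quot = a / b;
    count += 2u;
#endif
    uint32_t end = cycles_now();
    *q = quot;
    *n = count;
    return end - start;
}

/* Case B: SDIV, then two FP adds that do not use its result. */
static inline uint32_t time_sdiv_then_fp(int32_t a, int32_t b,
                                         int32_t *q, float *f) {
    int32_t quot;
    float acc = *f;
    const float one = 1.0f;
    uint32_t start = cycles_now();
#if ON_M4F
    __asm volatile(
        "sdiv     %0, %2, %3\n\t"
        "vadd.f32 %1, %1, %4\n\t"
        "vadd.f32 %1, %1, %4\n\t"
        : "=&r"(quot), "+t"(acc)
        : "r"(a), "r"(b), "t"(one)
        : "memory");
#else
    quot = a / b;
    acc = acc + one;
    acc = acc + one;
#endif
    uint32_t end = cycles_now();
    *q = quot;
    *f = acc;
    return end - start;
}
```

On target, call `cycles_init()` once, then compare each result against the same sequence with the trailing instructions removed. If the count barely changes, the trailing instructions overlapped with the divide; if it grows by their full cost, they waited. The counter reads add a fixed overhead to every measurement, so only differences between runs are meaningful.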

1 answer
极术小姐姐 · February 17, 2020

I am pretty sure it is a pipeline thing. I remember a Doulos webinar explaining the CM4 pipeline, but I am not sure if there is a public version.
