Yet I also feel the things C910 does well are overshadowed by executing poorly on the basics. The core’s out-of-order engine is poorly balanced, with inadequate capacity in critical structures like the schedulers and register files in relation to its ROB capacity. CPU performance is often limited by memory access performance, and C910’s cache subsystem is exceptionally weak. The cluster’s shared L2 is both slow and small, and the C910 cores have no mid-level cache to insulate L1 misses from that L2. DRAM bandwidth is also lackluster.
I'm not a CPU designer, but shouldn't these be points that one could discover using higher-level simulators, i.e. before even needing to do FPGA or gate-level sims?
If so, are they doing a SpaceX thing where they iterate fast with known less-than-optimal solutions just to gain experience building the things?
magicalhippo ·12 days ago
gnfargbl ·13 days ago
JoachimS ·12 days ago
fcanesin ·12 days ago
torginus ·12 days ago
I remember this post by an ARM engineer who was highly critical of the RISC-V ISA:
https://news.ycombinator.com/item?id=24958423