76 comments
magicalhippo · 12 days ago
Yet I also feel the things C910 does well are overshadowed by executing poorly on the basics. The core’s out-of-order engine is poorly balanced, with inadequate capacity in critical structures like the schedulers and register files in relation to its ROB capacity. CPU performance is often limited by memory access performance, and C910’s cache subsystem is exceptionally weak. The cluster’s shared L2 is both slow and small, and the C910 cores have no mid-level cache to insulate L1 misses from that L2. DRAM bandwidth is also lackluster.

I'm not a CPU designer, but shouldn't these be issues one could discover using higher-level simulators, i.e. before even needing FPGA or gate-level sims?

If so, are they doing a SpaceX thing where they iterate fast with known less-than-optimal solutions just to gain experience building the things?

gnfargbl · 13 days ago
JoachimS · 12 days ago
Another amazing analysis by Chester Lam. I'm astounded by the cadence and persistence, and at the same time by the depth and comprehensiveness.

fcanesin · 12 days ago

torginus · 12 days ago
Has it been demonstrated that RISC-V is architecturally suitable for making chips that equal the performance of high-end x86 and ARM designs?

I remember this post by an ARM engineer who was highly critical of the RISC-V ISA:

https://news.ycombinator.com/item?id=24958423
