
Others have pointed out that adding bits to identify instruction types eats into your instruction length, so let's go stupid big time: what if you had the instructions as described here, with no length information inside the instruction itself, and stored that separately? (3 bits per word would be plenty.) You might 1. put it in a contiguous bit string somewhere, or 2. put it at the bottom of the cache line that holds the instructions (the cache line being 512 bits, I assume).

Okay, for 1. you'd have to do two fetches to get stuff into the I-cache (but not for option 2, where it's part of the same cache line). And of course you're going to reduce instruction density, because the length bits use up cache, and there's nothing you can do about that. But at least it would allow n-bit instructions to be genuinely n bits long, which is a big advantage.

That this hasn't been done before, to my knowledge, is proof that it's a rotten idea, but can the experts here please explain why? Thanks.
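Option 2 from the comment above can be sketched as follows. This is only an illustration under assumptions the comment does not state: lengths are counted in 16-bit parcels, instructions fill the 512-bit line from the top, the 3-bit length tags fill it from the bottom, and a tag of 0 ends the list. The names `pack_line` and `unpack_line` are hypothetical.

```python
LINE_BITS = 512    # cache line size assumed in the comment
TAG_BITS = 3       # "3 bits would be plenty"
PARCEL_BITS = 16   # assumption: lengths are counted in 16-bit parcels


def pack_line(instrs):
    """Pack (value, length_in_parcels) pairs into one 512-bit integer.

    Instruction bits go at the top of the line, 3-bit length tags at the
    bottom. A tag of 0 terminates the tag list.
    """
    line = 0
    top = LINE_BITS
    for i, (value, parcels) in enumerate(instrs):
        width = parcels * PARCEL_BITS
        assert 1 <= parcels < (1 << TAG_BITS)
        assert 0 <= value < (1 << width)
        top -= width
        # Leave room for the tags written so far plus the 0 terminator.
        assert top >= (i + 2) * TAG_BITS, "cache line overflow"
        line |= value << top
        line |= parcels << (i * TAG_BITS)
    return line


def unpack_line(line):
    """Recover the (value, length_in_parcels) pairs written by pack_line."""
    out = []
    top = LINE_BITS
    i = 0
    while True:
        parcels = (line >> (i * TAG_BITS)) & ((1 << TAG_BITS) - 1)
        if parcels == 0:
            break
        width = parcels * PARCEL_BITS
        top -= width
        out.append(((line >> top) & ((1 << width) - 1), parcels))
        i += 1
    return out


instrs = [(0xABCD, 1), (0x12345678, 2), (0xFFFFFFFFFFFF, 3)]
assert unpack_line(pack_line(instrs)) == instrs
```

Note that every instruction value uses all of its bits (the 48-bit one is all ones), which is the "genuinely n bits long" property; the cost is the tag area at the bottom of the line.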



> you'd have to do two fetches

I think this is the big downside. You're effectively taking information which will always be needed at the same time, and storing it in two different places.

There is never a need for one piece of information without the other, so why not store it together?


> There is never a need for one piece of information without the other, so why not store it together?

Why not? As I said, so you can have full-length instructions!

And you can store it together in the same fetchable unit, the cache line (my option 2).


Interesting idea. It effectively moves the extra decode stage in front of the Icache, making the Icache a bit like a CISC trace/micro-op cache. On a 512-bit line you would add 32 bits to mark the instruction boundaries. At that point you start to wonder whether there is anything else worth adding that simplifies the later decode chain, and whether the roughly 5% increase in Icache size (less than the raw 1/16, since a lot of the overhead is shared) is worth it.
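One way to read the "32 bits on a 512-bit line" figure: if instructions are multiples of 16 bits (an assumption, not stated in the comment), a line holds 32 parcels, and one start bit per parcel gives a 32-bit boundary mask. A rough sketch, with hypothetical helper names:

```python
LINE_BITS = 512
PARCEL_BITS = 16                      # assumption: 16-bit granularity
PARCELS = LINE_BITS // PARCEL_BITS    # 32 parcels, so 32 marker bits


def boundary_mask(lengths):
    """Given instruction lengths in parcels, set bit i of the mask
    when an instruction starts at parcel i."""
    mask, pos = 0, 0
    for n in lengths:
        assert n >= 1 and pos + n <= PARCELS
        mask |= 1 << pos
        pos += n
    return mask


def instruction_starts(mask):
    """List the parcel indices where an instruction begins."""
    return [i for i in range(PARCELS) if (mask >> i) & 1]


mask = boundary_mask([1, 2, 3, 1])    # lengths of four instructions
starts = instruction_starts(mask)     # parcels 0, 1, 3 and 6
raw_overhead = PARCELS / LINE_BITS    # 32 / 512 = 0.0625, i.e. 1/16
```

The raw data-array overhead is 1/16 (6.25%); the lower estimate in the comment assumes the tags and other per-line structures do not grow with it.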


Why not treat it like Unicode does and just have two marker instructions before and after the compressed ones?

Start compressed instructions <size>

Compressed instructions

End compressed instructions <size>
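A minimal sketch of what such run markers could look like. Everything concrete here is an assumption made for illustration: the stream is a list of 16-bit parcels, full instructions take two parcels, and a marker is one parcel whose top byte is a hypothetical `START` or `END` opcode and whose low byte is the run size.

```python
START, END = 0xFE, 0xFD   # hypothetical marker opcodes (top byte of a parcel)


def decode(parcels):
    """Split a stream of 16-bit parcels into full 32-bit instructions
    and marker-delimited runs of 16-bit compressed instructions."""
    out = []
    i = 0
    while i < len(parcels):
        p = parcels[i]
        if p >> 8 == START:
            n = p & 0xFF                       # run size from the marker
            run = parcels[i + 1 : i + 1 + n]
            end = parcels[i + 1 + n]
            # The end marker is only checked for consistency here.
            assert end == (END << 8) | n
            out.extend(("c16", x) for x in run)
            i += n + 2
        else:
            out.append(("full32", (p << 16) | parcels[i + 1]))
            i += 2
    return out


stream = [0x1234, 0x5678, 0xFE02, 0xAAAA, 0xBBBB, 0xFD02, 0x0001, 0x0002]
decoded = decode(stream)
```

In this sketch each run costs two extra parcels, so it only pays off for runs of more than two compressed instructions, and full instructions can no longer start with the `START` byte.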


Which Unicode encoding are you talking about? It sounds a bit like you're talking about UTF-16 surrogate pairs, but that's not how those work. It's not how UTF-8 or UTF-32 work either. So, which encoding is this?


If I understand correctly, the guy I'm responding to is proposing to allow mixing of different-sized instructions. Your suggestion effectively says "I'm starting a run of compressed instructions / I'm finishing a run of compressed instructions", which is a different proposition. Just my take, though.


> let's go stupid big time

Wouldn't that be to Huffman-encode the instructions? Fixed table, but still, it would surely save a lot of bits on the common instructions...
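For a rough feel of the fixed-table idea, here is a small Huffman sketch over opcode names. The frequencies in `FREQS` are made up for illustration, not measured, and a real encoding would also have to cover operands.

```python
import heapq
from itertools import count

# Made-up opcode frequencies, purely illustrative.
FREQS = {"load": 25, "add": 20, "store": 15, "branch": 15, "mul": 5, "div": 1}


def huffman_code(freqs):
    """Build a prefix code (symbol -> bit string) from symbol weights."""
    tiebreak = count()   # keeps heap entries comparable when weights tie
    heap = [(w, next(tiebreak), {sym: ""}) for sym, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in a.items()}
        merged.update({s: "1" + c for s, c in b.items()})
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]


def encode(ops, table):
    return "".join(table[op] for op in ops)


def decode(bits, table):
    """Decode by walking the bit string; works because the code is prefix-free."""
    inverse = {code: sym for sym, code in table.items()}
    out, cur = [], ""
    for b in bits:
        cur += b
        if cur in inverse:
            out.append(inverse[cur])
            cur = ""
    return out


TABLE = huffman_code(FREQS)
program = ["load", "add", "store", "load", "div", "branch"]
assert decode(encode(program, TABLE), TABLE) == program
```

Common opcodes get shorter codes than rare ones, but the decoder above has to walk the stream bit by bit to find each boundary, which is the same boundary-finding problem the rest of the thread is trying to avoid.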



