From the RVI thread on 48-bit instructions (64-bit ones would probably look similar):
> There are several 48-bit instruction possibilities.
> 1. PC-relative long jump
> 2. GP-relative addressing to support large small data area, effectively giving GP-relative access to entire data address space of most programs
> 3. Load upper 32-bits of 64-bit constants or addresses
> 4. Or lower 32-bits of 64-bit constants or addresses
> 5. And with 32-bit mask
> 6. More effective ins/ext of 64-bit bit fields
Another thing that's often discussed is moving vtype and setvl into each vector instruction; I'm not sure whether that requires 48-bit or 64-bit instructions.
I was really asking about 64-bit instructions specifically, but going with what you've put, if you don't mind...
> 1. PC-relative long jump
My understanding is that these are rare.
> 2. GP-relative addressing to support large small data area, effectively giving GP-relative access to entire data address space of most programs
What is 'GP' here? As for "...access to entire data address space of most programs": in that case you are just going to be bouncing all over the address space, missing in every level of cache much of the time, surely? Maybe you get a little extra code density, but you aren't going to get any extra speed to speak of.
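For context, assuming 'GP' here means the RISC-V global pointer register (gp), the arithmetic behind "large small data area" can be sketched as below. The 12-bit width is the immediate of the standard 32-bit load/store encodings; the 32-bit width is only an assumed immediate size for a hypothetical 48-bit encoding, not something taken from a ratified spec.

```python
# Toy arithmetic: how far a signed byte offset of a given width can reach.

def signed_range(bits):
    """Return (min, max) of a two's-complement immediate with `bits` bits."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1

def reach_bytes(bits):
    """Number of distinct byte offsets the immediate can express."""
    return 1 << bits

# 12 bits: immediate width of the standard RISC-V load/store encodings.
GP_REACH_12BIT_IMM = reach_bytes(12)   # 4 KiB window around gp
# 32 bits: ASSUMED immediate width for a hypothetical 48-bit encoding.
GP_REACH_32BIT_IMM = reach_bytes(32)   # 4 GiB window around gp

if __name__ == "__main__":
    print(signed_range(12), GP_REACH_12BIT_IMM)
    print(signed_range(32), GP_REACH_32BIT_IMM)
```

This only shows reach, not performance: whether the wider window helps beyond code density is exactly the question raised above.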
> 3. Load upper 32-bits of 64-bit constants or addresses
> 4. Or lower 32-bits of 64-bit constants or addresses
> 5. And with 32-bit mask
Well yeah, but how common is this? I understand the Alpha architecture team looked at this and found it uncommon, which is why they were okay with less-than-32-bit constants. If it really sped things up, you might build a specific cache to store constants (a kind of larger, stupider register set). That would seem a simpler solution.
I'm not sure what you mean by 6, and I'm not familiar with vtype/setvl.
On vtype/setvl: in the RISC-V V extension (aka RVV, the vector (≈SIMD) extension), because of the 32-bit instruction length, there's a separate instruction that sets up some configuration (operated-on element size, register group size, masked-off element behavior, target element count), and the arithmetic etc. operations that follow work according to it. So e.g. if you wanted to add vectors of int32_t-s, you'd need something like "vsetvli x0,x0,e32,m1,ta,ma; vadd.vv dst,src1,src2".
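To give a feel for how little state that configuration actually is, here is a minimal sketch that packs the "e32,m1,ta,ma" settings into a vtype value. The field positions follow the RVV 1.0 vtype layout as I understand it (vlmul in bits 2:0, vsew in bits 5:3, vta in bit 6, vma in bit 7); the function and table names are made up for illustration.

```python
# Field encodings per the RVV 1.0 vtype layout (names here are illustrative).
VSEW = {"e8": 0, "e16": 1, "e32": 2, "e64": 3}
VLMUL = {"m1": 0, "m2": 1, "m4": 2, "m8": 3, "mf8": 5, "mf4": 6, "mf2": 7}

def encode_vtype(sew, lmul, tail_agnostic, mask_agnostic):
    """Pack element width, register grouping and ta/ma flags into 8 bits."""
    return (
        (int(mask_agnostic) << 7)   # vma, bit 7
        | (int(tail_agnostic) << 6)  # vta, bit 6
        | (VSEW[sew] << 3)           # vsew, bits 5:3
        | VLMUL[lmul]                # vlmul, bits 2:0
    )

vtype = encode_vtype("e32", "m1", tail_agnostic=True, mask_agnostic=True)

if __name__ == "__main__":
    print(f"{vtype:#010b}")
```

So the per-instruction configuration is about 8 bits, which is why it comes up as a candidate for folding into a longer instruction encoding.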
Often one vsetvl stays valid for multiple/most/all instructions, but sometimes there's a need to toggle it for a single instruction and then toggle it back. With 48-bit or 64-bit instructions, such temporary changes could be encoded in the operation instruction itself.
Additionally, masked instructions always mask by v0; with more instruction bits, that could be expanded to allow any register as the mask (and perhaps built-in negation).
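To put rough numbers on the "toggle it for a single instruction and toggle it back" case, here is a toy model. It assumes a vsetvli is needed whenever the configuration changes, that 48-bit or 64-bit vector instructions would carry the configuration themselves, and it counts only instructions and code bytes. The instruction sequence is invented for illustration and says nothing about real execution cost.

```python
def count_with_separate_vsetvli(ops):
    """Instruction count when config is separate state set by vsetvli."""
    count = 0
    current = None
    for _name, config in ops:
        if config != current:   # config changed: one extra vsetvli
            count += 1
            current = config
        count += 1              # the vector operation itself
    return count

def count_with_embedded_config(ops):
    """Instruction count when each operation encodes its own config."""
    return len(ops)

# Hypothetical sequence: one e16 operation in the middle of e32 work.
OPS = [
    ("vadd.vv", "e32,m1"),
    ("vadd.vv", "e32,m1"),
    ("vwmul.vv", "e16,mf2"),
    ("vadd.vv", "e32,m1"),
]

separate = count_with_separate_vsetvli(OPS)   # 32-bit instructions
embedded = count_with_embedded_config(OPS)    # longer instructions

if __name__ == "__main__":
    print("separate:", separate, "insns,", separate * 4, "bytes")
    print("embedded, 48-bit:", embedded, "insns,", embedded * 6, "bytes")
    print("embedded, 64-bit:", embedded, "insns,", embedded * 8, "bytes")
```

In this particular sequence the embedded form needs fewer instructions (4 instead of 7), while the byte count is slightly lower at 48 bits and slightly higher at 64 bits; a sequence with no config changes would tilt the other way.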
Depends on how many bits you had to start with. On Power ISA they aren't common either, but when they happen you need up to seven instructions (lis, ori, rldicl, oris, ori, then for branches mtctr/b(c)ctr) to specify the new address or larger value. Most other RISCs are similar when full 64-bit values must be specified. Replacing such a sequence with one or two longer instructions would be a significant saving.
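As a rough sketch of the difference, the following models building a 64-bit constant from 16-bit immediates in five steps (loosely following the lis / ori / shift / oris / ori pattern, with the shift step modeled as a plain left shift by 32, which is my simplification) versus two steps with 32-bit immediates (the "load upper 32 bits" and "OR lower 32 bits" operations from the quoted list, which are hypothetical here).

```python
MASK64 = (1 << 64) - 1

def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a signed integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def build_from_16bit_chunks(value):
    """Five register-writing steps using only 16-bit immediates."""
    r = (sign_extend(value >> 48, 16) << 16) & MASK64  # load shifted immediate
    r |= (value >> 32) & 0xFFFF                        # OR in next 16 bits
    r = (r << 32) & MASK64                             # move to the upper half
    r |= ((value >> 16) & 0xFFFF) << 16                # OR shifted immediate
    r |= value & 0xFFFF                                # OR in lowest 16 bits
    return r, 5

def build_from_32bit_halves(value):
    """Two steps using HYPOTHETICAL instructions with 32-bit immediates."""
    r = ((value >> 32) & 0xFFFFFFFF) << 32             # load upper 32 bits
    r |= value & 0xFFFFFFFF                            # OR lower 32 bits
    return r, 2

if __name__ == "__main__":
    for v in (0x0123456789ABCDEF, 0xFEDCBA9876543210):
        print(hex(v), build_from_16bit_chunks(v), build_from_32bit_halves(v))
```

Both routes produce the same value; the point is only the step count, and the model ignores the extra instructions needed to branch to the result.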