The aspect that will probably be most problematic for programmers is the loss of the memory-mapping paradigm.
In memory, adjacent bytes lie in the same or adjacent words (likewise for halfwords and doublewords),
but once they are loaded into vector registers this no longer holds when SLEN <= 1/4 VLEN.
Indeed, in memory consecutive bytes advance through halfwords, words and doublewords,
but in vector registers with SLEN <= 1/2 VLEN, they jump to consecutive SLEN chunks.
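
To make the jump concrete, here is a rough sketch of the striping as I understand it. It is a simplified model and not the spec's normative definition: it assumes LMUL=1 and that element i lands in SLEN chunk i mod (VLEN/SLEN), filling each chunk from its low end. The function names and the byte-sized parameters are mine, for illustration only.

```python
def reg_byte_offset(i, sew_bytes, vlen_bytes, slen_bytes):
    """Byte offset of element i inside one vector register.

    Assumed model (LMUL=1): elements are dealt round-robin across the
    VLEN/SLEN chunks, and each chunk is filled from its low end.
    """
    chunks = vlen_bytes // slen_bytes   # number of SLEN chunks per register
    chunk = i % chunks                  # which chunk element i lands in
    slot = i // chunks                  # position of element i within that chunk
    return chunk * slen_bytes + slot * sew_bytes


def layout(sew_bytes, vlen_bytes, slen_bytes):
    """Register byte offsets of elements 0..n-1, listed in memory order."""
    n = vlen_bytes // sew_bytes
    return [reg_byte_offset(i, sew_bytes, vlen_bytes, slen_bytes) for i in range(n)]


if __name__ == "__main__":
    # VLEN = 128 bits, SLEN = 32 bits (SLEN = 1/4 VLEN), byte elements.
    print(layout(1, 16, 4))
    # Same register with SLEN = VLEN: the in-register order matches memory.
    print(layout(1, 16, 16))
```

Under this model, with VLEN=128 and SLEN=32, bytes 0, 1, 2, 3 of memory sit at register byte offsets 0, 4, 8, 12, so each lands in a different word. With SLEN = VLEN the offsets are simply 0, 1, 2, 3, as in memory.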
Because of this dependency on SLEN relative to VLEN,
it is at least as hard to get one's mind around (to grok) as the various big-endian formats.
It may prove challenging to port code that assumes the memory-mapping model in overlapping registers of differing power-of-two widths.
I have no immediate solution.