I wonder if this is similar to how that experiment[0] where they attempted to domesticate foxes led to traits like floppy ears.
Which is to say, even when attempting to objectively select for "well aligned" behavior, human tendency to favor non-material signals of "friendliness" still leaks in.
Which is to say, even when attempting to objectively select for "well aligned" behavior, human tendency to favor non-material signals of "friendliness" still leaks in.
[0] https://en.m.wikipedia.org/wiki/Domesticated_silver_fox