Since the feature was first implemented, vertical video on desktop has become a thing. If you resize a PiP window for a horizontal video and the same height is then reused for a vertical video, the experience is worse than if both used the same width instead.
If you conclude the default behaviour is still ideal, but also see that some users would prefer going by width, it can be an about:config setting or an about:settings setting.
A third option is also possible: some dynamic size derived from both the height and the width, for example the total number of pixels. Please just revisit the decision.
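To make the three options concrete, here is a rough sketch of the arithmetic. It is not Firefox code: the function name `next_pip_size` and the strategy names are made up for illustration, and "total pixels" is interpreted here as keeping the window area constant.

```python
import math


def next_pip_size(prev_width, prev_height, aspect_ratio, strategy="height"):
    """Return (width, height) for the next PiP window.

    Hypothetical helper for illustration only.
    prev_width, prev_height: size the user gave the previous window.
    aspect_ratio: width / height of the new video.
    """
    if strategy == "height":
        # Reuse the previous height (the behaviour described in this report).
        height = prev_height
        width = height * aspect_ratio
    elif strategy == "width":
        # Reuse the previous width instead.
        width = prev_width
        height = width / aspect_ratio
    elif strategy == "area":
        # Keep the total pixel count of the previous window (assumed meaning).
        area = prev_width * prev_height
        height = math.sqrt(area / aspect_ratio)
        width = height * aspect_ratio
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return round(width), round(height)


# Previous window: 480x270 (16:9). New video: 9:16 vertical.
for strategy in ("height", "width", "area"):
    print(strategy, next_pip_size(480, 270, 9 / 16, strategy))
```

With these numbers, reusing the height gives a 152x270 window, reusing the width gives 480x853, and keeping the area gives 270x480.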