I guess it varies with use case. If a typical repeated call in a loop is going to have the same n (e.g., a dice game with 6-sided dice, or a random position in a square array), remembering the last n and the last mask avoids the overhead after the first call and makes the mask generation cost trivial. So inferring the mask rather than demanding that it be passed is mostly a convenience to the caller.

Good point. Though if the output were shifted in parallel with testing the range, that would save having a lookup table. That said, having a "top bit" table would be useful in general, even if it's a lot of memory to use for this. I.e., getting the mask could be just a lookup or two.
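To make the two ideas above concrete, here is a rough sketch (not the code from earlier in the thread) of a masked rejection sampler that remembers the last n and its mask, with the mask built from a small "top bit" table. The names `SMEAR`, `mask_for`, and `rand_below` are made up for illustration, the 256-entry table size and the 32-bit word width are assumptions, and Python's `random.getrandbits` stands in for whatever generator is actually being tested.

```python
import random

# Hypothetical 256-entry "top bit" table: SMEAR[b] is b with every bit
# below its highest set bit filled in (e.g. 0b0101 -> 0b0111).
SMEAR = [(1 << b.bit_length()) - 1 for b in range(256)]

def mask_for(limit):
    """Smallest 2**k - 1 that is >= limit, for 0 <= limit < 2**32."""
    for shift in (24, 16, 8, 0):
        top = limit >> shift
        if top:
            # One table lookup for the top non-zero byte, all lower bits set.
            return (SMEAR[top] << shift) | ((1 << shift) - 1)
    return 0

_last_n = None
_last_mask = 0

def rand_below(n, rng=random):
    """Uniform integer in [0, n) by mask-and-reject; caches the last n."""
    global _last_n, _last_mask
    if n != _last_n:
        # Only pay for mask generation when n changes between calls.
        _last_n, _last_mask = n, mask_for(n - 1)
    while True:
        r = rng.getrandbits(32) & _last_mask
        if r < n:
            return r
```

With a byte-indexed table a 32-bit value needs a few shifts to find the top non-zero byte; a 64K-entry table would cut that to one or two lookups at the cost of the extra memory mentioned above.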
I'm running BigCrush again with reversed bits; that will help test whether the low bits are good. I'll be disappointed if they aren't.
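For anyone wanting to try the same thing, this is roughly what "reversed bits" could look like, assuming 32-bit outputs (the word size is my assumption, and `reverse32` is just an illustrative name). Reversing each output moves the low bits into the high positions before the values are handed to the test battery.

```python
def reverse32(x):
    """Reverse the bit order of a 32-bit word (bit 0 swaps with bit 31, etc.)."""
    x &= 0xFFFFFFFF
    x = ((x >> 1) & 0x55555555) | ((x & 0x55555555) << 1)    # swap adjacent bits
    x = ((x >> 2) & 0x33333333) | ((x & 0x33333333) << 2)    # swap bit pairs
    x = ((x >> 4) & 0x0F0F0F0F) | ((x & 0x0F0F0F0F) << 4)    # swap nibbles
    x = ((x >> 8) & 0x00FF00FF) | ((x & 0x00FF00FF) << 8)    # swap bytes
    return ((x >> 16) | (x << 16)) & 0xFFFFFFFF              # swap half-words
```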
Also I fixed a couple typos in the code above.
But if a typical repeated call in a loop is going to have a different n each time, the function might allow the caller to pass in the mask or #0, and only compute the mask when the caller doesn't supply one.
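A minimal sketch of that calling convention, reading "#0" as a zero argument meaning "no mask supplied" (that reading is my assumption). The function name `rand_below_masked` is hypothetical, `random.getrandbits` is a stand-in for the real generator, and n is assumed to be at most 2**32.

```python
import random

def rand_below_masked(n, mask=0, rng=random):
    """Uniform integer in [0, n). A mask of 0 means: derive the mask from n."""
    if mask == 0:
        # Smallest 2**k - 1 that covers n - 1.
        mask = (1 << (n - 1).bit_length()) - 1
    while True:
        r = rng.getrandbits(32) & mask
        if r < n:
            return r

# A caller that already knows the mask can hand it over and skip the computation:
roll = rand_below_masked(6, 7)
# A caller that doesn't care passes nothing (or 0):
cell = rand_below_masked(100)
```

The caller is trusted here to pass a mask that actually covers n - 1; a mask that is too small would silently exclude the upper part of the range.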