Assuming that the response variable is binary, so that the model is logistic, the sample size (and with it the size of the groups) is at the very lower limit at which the Hosmer-Lemeshow (HL) test should be considered reliable. See Harrell, F. E., Jr. (2016). Regression modeling strategies. Springer International Publishing, p. 247.
The dx function in the {LogisticDx} package returns diagnostic measures rather than a goodness-of-fit test; that package has a separate gof function for the latter. I'd expect the results of applying the two functions to a model to differ in much the same way as what you found in your case.
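
To make the dependence on group size concrete, here is a minimal sketch of how an HL-type statistic can be computed by hand. It is an illustration only, not the code of any R package: the function name `hosmer_lemeshow` and its arguments are hypothetical, and it assumes equal-count groups formed after sorting by predicted probability. Implementations may choose cut points and handle ties differently, which is one way results can diverge.

```python
def hosmer_lemeshow(y, p, groups=10):
    """Sketch of an HL-type statistic (hypothetical helper, not a library API).

    y: observed 0/1 outcomes; p: fitted probabilities from the model.
    Returns (statistic, degrees_of_freedom, group_sizes).
    """
    if len(y) != len(p):
        raise ValueError("y and p must have the same length")

    # Sort observations by fitted probability (assumed grouping convention).
    pairs = sorted(zip(p, y))
    n = len(pairs)

    stat = 0.0
    sizes = []
    for g in range(groups):
        # Equal-count slices of the sorted observations.
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue  # fewer observations than groups
        n_g = len(chunk)
        observed = sum(yi for _, yi in chunk)  # events observed in the group
        expected = sum(pi for pi, _ in chunk)  # events expected by the model
        mean_p = expected / n_g
        denom = expected * (1.0 - mean_p)      # n_g * mean_p * (1 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
        sizes.append(n_g)

    # The usual reference distribution is chi-squared with (groups - 2) df.
    return stat, len(sizes) - 2, sizes
```

With a small sample, each group holds only a handful of observations, so each term in the sum rests on very few events and the chi-squared approximation becomes shaky. Inspecting the returned `group_sizes` is a quick way to see how thin the groups are.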
