|
Posting this verbatim from my reddit comment, but I ran a few quick fits and the (arbitrarily scaled) Planck distribution looks better than standard fits of some other distributions. Of course I ran absolutely no statistics on these or tried any sort of real analysis that should have been included in the paper in the first place, so take it with multiple grains of salt: I fit a few distributions to the data and actually the Planck distribution at 9000K seems to be noticeably better. I did no further statistics and haven't even looked at the fitted parameters for this. Note: I am an earth scientist, not a physicist.
https://imgur.com/Qg4ixF2 Edit: More distributions: https://imgur.com/IrF6lsV Edit: Filtering out sodium and potassium to try to account for some of the low-wavelength counts doesn't seem to help fitting the distributions either: https://imgur.com/dHDUS9Z You can get the data here: https://physics.nist.gov/PhysRefData/ASD/lines_form.html And here's the (garbage) code to reproduce my plots: %pylab inline
import pandas as pd
import scipy
import scipy.stats
mpl.style.use('seaborn-muted')
mpl.rcParams['figure.figsize'] = (18, 12)
mpl.rcParams['font.size'] = 16
df = pd.read_csv('nist.csv')
def filter_string(x):
x = str(x).replace('=', '').replace('"', '').replace('*', '').replace('+', '').replace('(', '').replace(')', '')
if len(x) == 0:
x = 'NaN'
return x
asl_comp = df['ritz_wl_vac(nm)'].apply(filter_string).astype(float)
asl_ver = df['obs_wl_vac(nm)'].apply(filter_string).astype(float)
sp_num = df['sp_num'].astype(float)
df_neutral = df[sp_num == 1]
asl_comp = df_neutral['ritz_wl_vac(nm)'].apply(filter_string).astype(float)
asl_ver = df_neutral['obs_wl_vac(nm)'].apply(filter_string).astype(float)
sp_num = df_neutral['sp_num'].astype(float)
asl_ver = asl_ver[~asl_ver.isna()]
def planck(wav, T=9000.0):
h = 6.626e-34
c = 3.0e+8
k = 1.38e-23
a = 2.0*h*c**2
b = h*c/(wav*k*T)
intensity = (a / ( (wav**5)) * (1 / (np.exp(b) - 1.0) ))
return intensity
dist_names = ['beta', 'lognorm', 'pearson3', 'gumbel_r']
x = np.arange(0, 1500, 5)
intensity = planck(x / 1e9)
size = len(asl_ver)
yvals, xvals = np.histogram(asl_ver, bins=300, normed=True)
for dist_name in dist_names:
dist = getattr(scipy.stats, dist_name)
param = dist.fit(asl_ver)
pdf_fitted = dist.pdf(x, *param[:-2], loc=param[-2], scale=param[-1])
plt.plot(x, pdf_fitted, label=dist_name, linewidth=4,)
plt.hist(asl_ver, bins=300, density=True, color='grey');
plt.plot(x, intensity / 1.1e17, linewidth=4, color='red', linestyle='--', label='scaled planck (9000k)')
plt.legend()
plt.xlabel(r'$\lambda (nm)$')
plt.ylabel('Density')
|
The Planck fit gets the peak height right, but mainly because it has no chance of fitting the left hand side at all (since it increases much more slowly than the other distributions), so it doesn't even try.
This is why I said it would be better to have actual statistical tests -- we shouldn't have to have this kind of qualitative discussion when it's already a solved problem.