by alextgordon
5404 days ago
Just did it for 28000 C files. Here are the results:

a 0.772163
b 1.2679
c 1.78209
d 1.1195
e 0.881398
f 1.47252
g 0.924242
h 0.358954
i 1.06756
j 0.835313
k 1.41458
l 0.981729
m 1.08955
n 0.9156
o 0.73849
p 1.74468
q 4.2497
r 1.21577
s 1.05023
t 1.03627
u 1.2967
v 1.77662
w 0.396003
x 13.7292
y 0.47566
z 3.78748

The numbers are (relative frequency in C) / (relative frequency in English). So "b" is slightly more common in C than English, but "w" is a lot more common in English than C.

The raw counts for symbol characters:

_ 22890057
, 10895692
) 10749798
( 10745839
* 9211904
; 8187969
- 6628768
= 5878296
> 4428291
/ 3468260
. 3011078
{ 2212412
} 2211783
" 2120264
& 1647188
: 1032587
+ 962554
# 909859
[ 889538
] 888722
< 839910
| 643903
% 583092
! 561462
\ 540456
' 454201
@ 131199
? 112488
~ 84629
^ 19064
$ 17922
` 7272
[space] 74199965
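
For anyone who wants to reproduce this kind of comparison, here is a rough sketch of how the ratios and symbol counts could be computed. It is not the script used for the numbers above. The function names are made up for illustration, and it assumes letters are case-folded and that "symbol characters" means ASCII punctuation plus the space.

```python
from collections import Counter
import string


def letter_frequencies(text):
    """Relative frequency of each letter a-z in text.

    Assumption: upper and lower case are folded together.
    """
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    total = sum(counts.values()) or 1  # avoid dividing by zero on empty input
    return {ch: counts[ch] / total for ch in string.ascii_lowercase}


def frequency_ratios(c_text, english_text):
    """(relative frequency in C) / (relative frequency in English) per letter.

    Letters that never occur in the English sample are skipped, since the
    ratio is undefined for them.
    """
    c_freq = letter_frequencies(c_text)
    en_freq = letter_frequencies(english_text)
    return {
        ch: c_freq[ch] / en_freq[ch]
        for ch in string.ascii_lowercase
        if en_freq[ch] > 0
    }


def symbol_counts(text):
    """Raw counts of ASCII punctuation characters and the space."""
    return Counter(ch for ch in text if ch in string.punctuation or ch == " ")


if __name__ == "__main__":
    # Tiny stand-in samples; a real run would concatenate many C files
    # and a large English corpus.
    c_sample = "int x = y_max * (a + b);"
    english_sample = "the quick brown fox jumps over the lazy dog"
    for letter, ratio in sorted(frequency_ratios(c_sample, english_sample).items()):
        print(letter, ratio)
    for symbol, count in symbol_counts(c_sample).most_common():
        print(repr(symbol), count)
```

A ratio above 1 means the letter is relatively more common in the C sample, and below 1 means it is relatively more common in the English sample.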