Hacker News
by aorist 1241 days ago
It doesn't count `failure` — just the number of rows. But neither does the pandas version: `pd_df.groupby(['date'])['failure'].count()` and `pd_df.groupby(['date']).count()` are the same except the former returns a single `pd.Series` with the count and the latter produces a `pd.DataFrame` where each column has the same count (not super useful).

e.g.

    > iris.groupby('species').count()
                sepal_length  sepal_width  petal_length  petal_width
    species
    setosa                50           50            50           50
    versicolor            50           50            50           50
    virginica             50           50            50           50

vs.

    > iris.groupby('species')['sepal_length'].count()
    species
    setosa        50
    versicolor    50
    virginica     50
1 comments

I believe `count` only gives you the number of non-null values in each column, so the numbers from the first command could differ by column if there were null values. You can also use the `size` method to get the total number of rows in each group, nulls included, and that returns a `pd.Series` with or without a column specifier.
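
A quick sketch of the difference, using a made-up frame (the data and the extra `model` column are assumptions for illustration only, not the dataset from the thread):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one null `failure` value on 2020-01-01.
df = pd.DataFrame({
    "date": ["2020-01-01", "2020-01-01", "2020-01-02"],
    "failure": [1.0, np.nan, 0.0],
    "model": ["a", "b", "c"],
})

grouped = df.groupby("date")

# DataFrame: non-null values per column, so columns can disagree.
counts = grouped.count()

# Series: non-null `failure` values per group.
failure_count = grouped["failure"].count()

# Series: rows per group, nulls included.
sizes = grouped.size()

print(counts)
print(failure_count)
print(sizes)
```

For 2020-01-01, `count` should report 1 for `failure` but 2 for `model`, while `size` reports 2 rows either way.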