by tialaramex 51 days ago
Although it feels intuitively as though a std::scan could make sense, it doesn't, at least not with the sort of API I've seen suggested.

Consider a hypothetical Goose type. We can express any Goose usefully as output and, conveniently, some potential inputs could be read as a Goose successfully, though most arbitrary strings cannot be understood as a Goose.

Providing std::print for Goose is simple: we've got a variable (or maybe a constant) of type Goose, and we just emit the correct sequence of symbols. It's annoying to actually write all the boilerplate in C++23, but that's mechanical; it's not tricky to do, just very boring (so maybe C++26 makes that easier via reflection).

But how could std::scan for Goose work? We need a Goose variable to potentially store the Goose if we read one, but how can we make a default Goose? We can't: each Goose is unique and there is no substitute, so this can't work.

The std::scan idea seems attractive for simple, almost untyped input (strings, integers, that sort of thing), but the whole point of "Parse, don't validate" is that you probably want to parse email addresses and ISBNs and ISO dates. You don't want a string, another string and a third string.

Rust's FromStr trait is more appropriate. Given a type implements FromStr we can parse any string to (maybe) get an instance of that type, but we don't need an "empty" instance first because we're doing the construction when we call the function.
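
For comparison, the same shape can be written in C++ as a static factory that returns std::optional. This is only a sketch: the Goose class, its from_str function and the "name:age" text format are all made up for illustration, not anything proposed for the standard.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>
#include <type_traits>
#include <utility>

// Hypothetical Goose: the only way to get one is to parse one.
class Goose {
public:
    // Assumed text form for this sketch: "<name>:<age>", e.g. "Gertrude:3".
    static std::optional<Goose> from_str(std::string_view s) {
        const auto colon = s.find(':');
        if (colon == std::string_view::npos || colon == 0) return std::nullopt;
        const std::string_view digits = s.substr(colon + 1);
        unsigned age = 0;
        const char* last = digits.data() + digits.size();
        const auto [ptr, ec] = std::from_chars(digits.data(), last, age);
        // Reject non-numeric ages and trailing junk after the number.
        if (ec != std::errc{} || ptr != last) return std::nullopt;
        return Goose(std::string(s.substr(0, colon)), age);
    }

    const std::string& name() const { return name_; }
    unsigned age() const { return age_; }

private:
    Goose(std::string name, unsigned age) : name_(std::move(name)), age_(age) {}
    std::string name_;
    unsigned age_;
};

// No "empty" Goose exists, and none is needed to parse one.
static_assert(!std::is_default_constructible_v<Goose>);

inline void demo_from_str() {
    const auto g = Goose::from_str("Gertrude:3");
    assert(g && g->name() == "Gertrude" && g->age() == 3);
    assert(!Goose::from_str("not a goose"));
}
```

The construction happens inside the call, so the caller never has to supply a placeholder Goose to be overwritten.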

1 comment

Rust's FromStr only deals with parsing a single object. However, ideally std::scan() would be an exact counterpart of std::print() and would be able to parse multiple objects. I totally agree that the C way of passing references to already existing variables is not great. Ideally you return a tuple of objects, but then it becomes very annoying to specify the types. Maybe something like this?

    auto [value, text, goose] = std::scan<int, std::string, Goose>(input, "{} {} {}");

A halfway solution would be to have the hypothetical std::scan() take references to std::optional<>s or std::expected<>s:

    std::optional<int> value;
    std::optional<std::string> text;
    std::optional<Goose> goose;
    /* auto result = */ std::scan(input, "{} {} {}", value, text, goose);

The latter would be type safe, close to how scanf() works, but less satisfying from a functional programming standpoint.
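
To make the optional-based variant concrete, here is a rough sketch. Everything in it is an assumption for illustration: toy_scan, read_one, parse_token and the tag type are invented names, the Goose text form "goose:<name>" is made up, and the format string is dropped in favour of plain whitespace-separated tokens.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <sstream>
#include <string>
#include <string_view>
#include <system_error>
#include <utility>

// Hypothetical Goose with no default constructor.
struct Goose {
    explicit Goose(std::string n) : name(std::move(n)) {}
    std::string name;
};

// Tag type used only to pick the right parse_token overload.
template <typename T> struct tag {};

inline std::optional<int> parse_token(std::string_view tok, tag<int>) {
    int v = 0;
    const char* last = tok.data() + tok.size();
    const auto [ptr, ec] = std::from_chars(tok.data(), last, v);
    if (ec != std::errc{} || ptr != last) return std::nullopt;
    return v;
}

inline std::optional<std::string> parse_token(std::string_view tok, tag<std::string>) {
    return std::string(tok);
}

inline std::optional<Goose> parse_token(std::string_view tok, tag<Goose>) {
    // Assumption for this sketch: a Goose is written as "goose:<name>".
    constexpr std::string_view prefix = "goose:";
    if (tok.size() <= prefix.size() || tok.substr(0, prefix.size()) != prefix)
        return std::nullopt;
    return Goose(std::string(tok.substr(prefix.size())));
}

// Reads one whitespace-delimited token and tries to parse it as a T.
template <typename T>
std::optional<T> read_one(std::istream& in) {
    std::string tok;
    if (!(in >> tok)) return std::nullopt;
    return parse_token(tok, tag<T>{});
}

// Fills the optionals left to right and stops at the first field that fails.
template <typename... Ts>
bool toy_scan(std::string_view input, std::optional<Ts>&... out) {
    std::istringstream in{std::string(input)};
    return ((out = read_one<Ts>(in)).has_value() && ...);
}

inline void demo_optional_scan() {
    std::optional<int> value;
    std::optional<std::string> text;
    std::optional<Goose> goose;
    const bool ok = toy_scan("42 hello goose:Gertrude", value, text, goose);
    assert(ok && *value == 42 && *text == "hello" && goose->name == "Gertrude");
}
```

The empty std::optional<Goose> plays the role that a default Goose would otherwise have to play, which is why this variant works even though Goose itself cannot be default-constructed.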

Orthogonal to that, adding support for scanning a Goose would be just like how you add a formatter for it, and would be quite similar to a Rust trait. One could imagine having to define something like this:

    template<>
    struct std::scanner<Goose> {
        constexpr auto parse(std::format_parse_context& ctx) {…}
        auto scan(std::format_context& ctx) const -> std::optional<Goose> {…}
    };
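
A toy version of that idea, combined with the tuple-returning call from earlier, might look like the following. None of this is a real or proposed API: toy_scanner and toy_scan are hypothetical names, the format string and the parse/scan context types are left out entirely, input is split on whitespace, and the "goose:<name>" text form is invented.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <sstream>
#include <string>
#include <string_view>
#include <system_error>
#include <tuple>
#include <utility>

// Hypothetical Goose with no default constructor.
struct Goose {
    explicit Goose(std::string n) : name(std::move(n)) {}
    std::string name;
};

// Hypothetical customization point, specialized per type much like std::formatter.
template <typename T> struct toy_scanner;

template <> struct toy_scanner<int> {
    static std::optional<int> scan(std::string_view tok) {
        int v = 0;
        const char* last = tok.data() + tok.size();
        const auto [ptr, ec] = std::from_chars(tok.data(), last, v);
        if (ec != std::errc{} || ptr != last) return std::nullopt;
        return v;
    }
};

template <> struct toy_scanner<std::string> {
    static std::optional<std::string> scan(std::string_view tok) {
        return std::string(tok);
    }
};

// The user-provided part: how to turn one token into a Goose.
template <> struct toy_scanner<Goose> {
    static std::optional<Goose> scan(std::string_view tok) {
        constexpr std::string_view prefix = "goose:";  // assumed text form
        if (tok.size() <= prefix.size() || tok.substr(0, prefix.size()) != prefix)
            return std::nullopt;
        return Goose(std::string(tok.substr(prefix.size())));
    }
};

template <typename T>
std::optional<T> read_one(std::istream& in) {
    std::string tok;
    if (!(in >> tok)) return std::nullopt;
    return toy_scanner<T>::scan(tok);
}

// Returns all values at once, or nothing if any field fails to parse.
template <typename... Ts>
std::optional<std::tuple<Ts...>> toy_scan(std::string_view input) {
    std::istringstream in{std::string(input)};
    // Braced initialization evaluates the reads strictly left to right.
    std::tuple<std::optional<Ts>...> parts{read_one<Ts>(in)...};
    return std::apply(
        [](auto&... p) -> std::optional<std::tuple<Ts...>> {
            if (!(p.has_value() && ...)) return std::nullopt;
            return std::tuple<Ts...>(std::move(*p)...);
        },
        parts);
}

inline void demo_tuple_scan() {
    auto parsed = toy_scan<int, std::string, Goose>("42 hello goose:Gertrude");
    assert(parsed.has_value());
    auto& [value, text, goose] = *parsed;
    assert(value == 42 && text == "hello" && goose.name == "Gertrude");
}
```

Unlike the optional-reference variant, this one always reads every field before checking for failure, which keeps the sketch short but is not how you would want a real implementation to behave.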