Hacker News new | ask | show | jobs
by pcmonk 3399 days ago
Generally, "flatten" means the first of those two orders, but I see that the problem as given doesn't disambiguate those. If an interviewer wants to only accept one of those, he should clarify.

Incidentally, the first of those is preorder and postorder traversal of the list-of-lists (inorder only exists for binary trees), while the second is none of those. I would describe that difference as depth-first (first) vs breadth-first (second).

1 comments

I think "flatten" is kind of underdefined.

In many situations, the order of the leaves doesn't matter at all. In others, it matters a great deal. The question should be clear if it does matter.

Depth-first for breadth-first was not specified. Both preorder and postorder are subclasses of depth-first. Neither preorder nor postorder was specified.

Still, it depends on what you consider a node and a leaf and how you build your tree how much the order matters.

Is a deeper nesting level the child node of a value? Are all values leaves and their parent nodes the nesting level? Are nesting levels collapsed into single nodes or are different nested lists at the same level sibling nodes?

     [1,[2,3], [4, [5,6]]]

      []
     / |
    1  [[]].
      / | | \
    2   3 4  [[[]]]
              |  |
              5  6


In preorder you're going to get: [], 1, [[]], 2, 3, 4, [[[]]], 5, 6 In postorder: 1, 2, 3, 4, 5, 6, [[[]]], [[]], []

Let me explode that initial root this time and make some values nodes if they are followed by an increased nesting level.:

    [1,[2,3], [4, [5,6]]]

       1
      /  \
    []    []
    |\     |
    2 3    4
           |
          [[]]
          |  |
          5  6
Preorder? 1, [], 2, 3, [], 4, [[]], 5, 6 Postorder? Aah... 2, 3, [], 5, 6, [[]], 4, [], 1

So even after you splice out empty nodes, your values are suddenly out of sort order. It's still depth-first.

Or maybe we just don't build the nesting levels into our tree because we don't care. But we still build the tree according to it. We're flattening, after all, and the spec doesn't say we need to retain a nesting level as some attribute of the objects in the flattened list. But what becomes a node vs. a leaf is left as an exercise.

       1
     / | \
    2  3  4
         / \
        5   6
Preorder? 1, 2, 3, 4, 5, 6 Postorder? 2, 3, 5, 6, 4, 1

So yeah, it matters. The obvious recursive approach is preorder, especially if you want to maintain sorting. That's what I called "flatten" in my code example earlier. "flatten2" is breadth first as mentioned. Consider this depth-first Perl5 subroutine:

    sub flatten3 {
        my $n = shift;
        my $o;
        my @f;

        for ( @$n ) {
            if ( ref $_ eq 'ARRAY' ) {
                unshift @f, flatten3( $_ );
            } else {
                push @f, $_;
            }
        }
        return @f;
    }
In each level of recursion it's place the children at the beginning of the array (unshift) and the parent to the end of the array (push) even though it's actually considering them each whenever it reaches them.

Given my expanded example input of:

    my $nested = [ 1, [ 2, 3 ], [ 4, [ 5, 6, [ 7, [ 8, [ 9, 10 ] ], 11, 12, [ 13, [ 14, [ 15, 16 ], 17 ], 18 ], 19 ], 20 ] ] ];
This produces output like this, which is definitely the absolute deepest nesting first in the output.:

    15 16 14 17 13 18 9 10 8 7 11 12 19 5 6 20 4 2 3 1

Then there's a true postorder traversal, with an array holding the parent back until after its immediate children.:

    sub flatten4 {
        my $n = shift;
        my @p;
        my @f;

        for my $node ( @$n ) {
            if ( ref $node ) {
                push @f, flatten4( $node );
            } else {
                push @p, $node;
            }
        }

        push @f, @p;
        return @f;
    }

which provides output as such since it's considering the parent node after all its children every time.:

    2 3 9 10 8 15 16 14 17 13 18 7 11 12 19 5 6 20 4 1

Whereas preorder depth-first keeps the numerical order for this input and the breadth-first traversal gives:

   1 2 3 4 5 6 20 7 11 12 19 8 13 18 9 10 14 17 15 16
as previously shown.

So no, preorder and postorder do not necessarily mean the same output just because the input started as a nested list.

A nested list is not simply a list. It is a special representation of a tree. How you consider that tree to be represented by the nested list is important, as is how you traverse it. In fact, in Perl [] is an array reference, which is why this code is testing for references and recursing on them. So it's already traversing an actual tree structure.

Don't overthink it. But if we're gonna overthink it, one thing we can notice is that a rose tree is a free monad, and like any free structure is initial in its category - meaning there's a unique monad homomorphism from a rose tree to any other monad. Lists form a monad, so we can ask "what is that homomorphism?" It turns out to be the traditional definition of flatten.

That is to say, only flatten solves these equations for lists and rose trees:

    (flatten . f =<< flatten m) == flatten (f =<< m)
    pure x = flatten (pure x)
I believe speaking of monads, monoids, functors, and homomorphism when discussing an interview question for fairly fresh programmers is definitely overthinking it.

I'm not sure what you think is so simple about your solution compared to:

    sub flatten {
        map { ref ? flatten( @$_ ) : $_ } @_;
    }
  
Goodness, and Perl gets a bad reputation for the amount of punctuation in the code.

Of course if you want flatten in Haskell you have it for Tree and Forest.

    import Data.Tree
    tree = Node "A" [Node "B" [], Node "C" [Node "D" [], Node "E" []], Node "F" []]
    main = do
        print $ flatten tree

If a Perl programmer wanted to pull in a CPAN module, there are many from which to choose. Of course, it was just done in a one-line subroutine...:

http://search.cpan.org/~obradovic/List-Flatten-0.01/lib/List... (which does not handle arbitrary depths)

http://search.cpan.org/~rthompson/List-Flatten-Recursive-0.1... (which seems overly complicated)

http://search.cpan.org/~rsavage/Set-Array-0.30/lib/Set/Array... (which tries to flatten hashes as well as lists and seems, well, overly complicated... and pulls in a bunch more methods and functions)

http://search.cpan.org/~satoh/List-Enumerator-0.10/ (which seems about right, including stopping the flattening at an arbitrary depth and comes with other useful array tools)

Of course it's possible to apply the concept of flattening to hashes/dictionaries, too, so there goes the concept of keeping the original sort order.

http://search.cpan.org/~bbc/Hash-Flatten-1.19/lib/Hash/Flatt...

http://search.cpan.org/~chocolate/Hash-Fold-0.1.2/lib/Hash/F...

> I believe speaking of monads, monoids, functors, and homomorphism when discussing an interview question for fairly fresh programmers is definitely overthinking it.

I agree. I said as much. I just thought it was interesting.

> I'm not sure what you think is so simple about your solution compared to [...]

Which solution? I didn't present an implementation in this thread. I did elsewhere (https://news.ycombinator.com/item?id=13726564), but I don't think that's what you're talking about? I was discussing specification, and the code fragment in my comment was a property, not a definition.

> Of course if you want flatten in Haskell you have it for Tree and Forest.

Yes, though Tree is a slightly less natural choice for this than a rose tree (aka `Free []`). Of course, you still have it (in the form of toList).

> If a Perl programmer wanted to pull in a CPAN module, there are many from which to choose. Of course, it was just done in a one-line subroutine.

... yes?

You seem to be desperately trying to defend perl against an attack you imagine me to have made. I have nothing against perl (at least, nothing beyond a strong desire for static types on large projects, but that applies equally to a great many languages).

Oh, I'm not imagining you making an attack on Perl. It happens commonly enough, though. I was commenting that your property had a lot of punctuation and bemoaned that Perl with the same amount often gets criticized for that very thing. Meanwhile many of the same people are fine with Python's invisible punctuation, Lisp's parentheses, or Haskell's syntax which likewise includes a lot of punctuation.

I also was pointing out that Haskell has Tree shipping with it, and Perl's CPAN, which is usually a stellar place to look, has what appear to be some false starts. I'd sort of expect one of the many List:: modules like List::Utils, List::MoreUtils, etc. to have the functionality, but as far as I saw when looking, no. It's easy enough to do in the base language, though. I assumed Haskell from your language and the syntax of your notation.

Your Bash and sed solution appears simple on the surface, but it is using a trick of the data format and bringing together two languages. It's even using a syntax that will confuse some people on first look in that you're putting square brackets within square brackets starting with the right bracket then the left. Many people are going to look at that and at first think it's two empty character classes then do a double take. It's clever, but any simplicity in it is rather baked into some, let's say interesting assumptions. I like it as a snarky response to the problem, but it's not something I'd hire a programmer for proposing as a serious solution.

> Oh, I'm not imagining you making an attack on Perl.

'k :-P

> It happens commonly enough, though. I was commenting that your property had a lot of punctuation and bemoaned that Perl with the same amount often gets criticized for that very thing. Meanwhile many of the same people are fine with Python's invisible punctuation, Lisp's parentheses, or Haskell's syntax which likewise includes a lot of punctuation.

It certainly happens commonly enough, often unfairly. That said, I don't think Haskellers usually complain about Perl's punctuation.

> I also was pointing out [...] of your notation

Ah, I seem to have misread you entirely there.

> Your Bash and sed solution appears simple on the surface, but it is using a trick of the data format and bringing together two languages.

Well, it's true that we're lucky that a relatively simple transformation of the input format gives us the desired output. But I'm not sure I'd call it a "trick".

> It's even using a syntax that will confuse some people on first look in that you're putting square brackets within square brackets starting with the right bracket then the left. Many people are going to look at that and at first think it's two empty character classes then do a double take.

I mean, you've kinda gotta assume people speak the language...

> I like it as a snarky response to the problem, but it's not something I'd hire a programmer for proposing as a serious solution.

Well, it was a snarky response, but I think that actually depends on the context. If it was "You have these particular 20 files, that need to be transformed this way just this once", it's a great "serious" solution and I'd totally hire someone who would propose it (or something equivalent in another language) in for that purpose. It's incredibly brittle to some particular changes in the input format (especially if you might find square brackets internal to items) and probably not something that should be built atop.