Hacker News
by esafak 264 days ago
Greetings, Diomidis.

I would suggest a familiar notation like "[a, b] -> c" in a dedicated dag block:

  dag text_stats {
    tee -> [ split_words, count_chars, chars_to_lines ]

    # word-based frequencies
    split_words -> tee_words
    tee_words -> ngram2 -> save_digram
    tee_words -> ngram3 -> save_trigram
    tee_words -> ranked_frequency -> save_words

    # character-based frequencies
    count_chars -> add_percentage
    chars_to_lines -> ranked_frequency -> add_percentage -> save_chars
  }

  run text_stats < input.txt
https://www2.dmst.aueb.gr/dds/sw/dgsh/#text-properties

or

  dag commit_graph {
    git_log -> filter_recent -> sort -n -> [ uniq_committers, sort_by_email ]

    uniq_committers -> [ last_commit, first_commit, committer_positions ]
    [ last_commit, first_commit ] -> cat -> tr '\n' ' ' -> days_between

    [ committer_positions, sort_by_email ] -> join_by_email -> sort -k2n -> [ make_bitmap_header, plot_per_day ]

    [ uniq_committers, days_between ] -> emit_dims -> plot_per_day

    make_bitmap_header -> cat
    plot_per_day -> morphconv -> [ to_png_large, to_png_small ]
  }

  run commit_graph
https://www2.dmst.aueb.gr/dds/sw/dgsh/#committer-plot

The translations above are computer-assisted and may contain mistakes, but you get the idea.

2 comments

The closeness of this syntax to Graphviz DOT is very interesting.

Having dgsh output a Graphviz file in dry-run mode would be a neat feature.
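
To make the dry-run idea concrete, here is a rough sketch of a filter that turns the arrow notation proposed in this thread into DOT. The name dag_to_dot is hypothetical, and the dag block is only the notation suggested above, not something dgsh is known to accept. It assumes one chain per line and node names without commas, double quotes, or backslashes, so a stage like tr '\n' ' ' would need escaping that this sketch does not do.

```shell
# Hypothetical helper: read a "dag name { ... }" block on stdin, write DOT on stdout.
dag_to_dot() {
  awk '
    # Split one chain segment into node names: "[ a, b ]" gives a and b, "a" gives a.
    function nodes(s, out) {
      sub(/^[ \t]*\[/, "", s)          # drop an opening bracket, if any
      sub(/\][ \t]*$/, "", s)          # drop a closing bracket, if any
      gsub(/^[ \t]+|[ \t]+$/, "", s)   # trim surrounding blanks
      return split(s, out, /[ \t]*,[ \t]*/)
    }
    /^[ \t]*dag[ \t]/ { print "digraph " $2 " {"; next }   # block header
    /^[ \t]*[}]/      { print "}"; next }                  # block end
    /^[ \t]*(#|$)/    { next }                             # comments and blank lines
    {
      # Every adjacent pair of segments in a chain contributes edges;
      # bracketed lists expand to all source/target combinations.
      n = split($0, parts, /[ \t]*->[ \t]*/)
      for (i = 1; i < n; i++) {
        ns = nodes(parts[i], src)
        nd = nodes(parts[i + 1], dst)
        for (a = 1; a <= ns; a++)
          for (b = 1; b <= nd; b++)
            print "  \"" src[a] "\" -> \"" dst[b] "\";"
      }
    }
  '
}
```

Feeding it one of the dag blocks above and piping the result to dot -Tsvg should render the graph; lines without an arrow are silently ignored.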

Thank you for the suggestion. This would mean that you'd also then create some mapping from each name (like git_log) to its implementation, right?
Yes, using shell functions:

  git_log() {
    git log --pretty=tformat:'%at %ae'
  }
Defining the functions separately lets you run, test, and reuse them.
More importantly, it gives each process a name, so that it can appear multiple times in the graph.
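
As a small illustration of that point, here is a sketch of two stages defined as plain shell functions. The names split_words and ranked_frequency come from the dag example above, but the bodies are my own assumptions, not the dgsh originals.

```shell
# Assumed body: emit one word per line, treating any run of non-letters as a separator.
split_words() {
  tr -cs 'a-zA-Z' '\n'
}

# Assumed body: count identical input lines and list them most frequent first.
ranked_frequency() {
  sort | uniq -c | sort -k1,1nr -k2 | awk '{ print $1, $2 }'
}

# Each stage can be exercised on its own, outside any graph:
#   printf 'b\na\nb\n' | ranked_frequency
# and composed like any other command:
#   printf 'the cat the\n' | split_words | ranked_frequency
```

Because ranked_frequency is just a name, the same definition can serve both the word branch and the character branch of text_stats.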
You might want to look at Cypher, the Neo4j query language, for some inspiration for the syntax.