Author here. Currently working on a debugger. (Threw the old crappy one out.) Backtraces are working. Some of the remaining work is going to require long, uninterrupted concentration that is hard to come by due to taking care of a six-month-old baby.
I have over 50 unreleased patches. There are some bugfixes, including a compiler one, involving dynamically scoped variables used as optional parameters:
(defvar v)
(defun f (: (v v)))
(call (compile 'f)) ;; blows up in virtual machine with "frame level mismatch"
Patch for that:
diff --git a/share/txr/stdlib/compiler.tl b/share/txr/stdlib/compiler.tl
index e76849db..ccdbee83 100644
--- a/share/txr/stdlib/compiler.tl
+++ b/share/txr/stdlib/compiler.tl
@@ -868,7 +868,7 @@
,*(whenlet ((spec-sub [find have-sym specials : cdr]))
(set specials [remq have-sym specials cdr])
^((bindv ,have-bind.loc ,me.(get-dreg (car spec-sub))))))))))
- (benv (if specials (new env up nenv co me) nenv))
+ (benv (if need-dframe (new env up nenv co me) nenv))
(btreg me.(alloc-treg))
(bfrag me.(comp-progn btreg benv body))
(boreg (if env.(out-of-scope bfrag.oreg) btreg bfrag.oreg))
There is now support in the printer for limiting the depth and length.
I added a derived hook into the OOP system; a struct being notified that it is being inherited.
"TXR Lisp programs are shorter and clearer than those written in some mainstream languages "du jour" like Python, Ruby, Clojure, Javascript or Racket. If you find that this isn't the case, the TXR project wants to hear from you; give a shout to the mailing list. If a program is significantly clearer and shorter in another language, that is considered a bug in TXR."
i agree that the general-purpose programming language space is fairly crowded ... the lisp dialect/user ratio especially so.
DSLs, otoh, are in short supply. while awk or plain sed are great for shell programming, this is the only (open source) DSL i'm aware of targeting certain types of NLP-esque "munging". this space is mostly full of statistical approaches, which, while conceptually pure, don't allow the kind of flexibility that would be useful in many applications.
i wonder if, eventually, the DSL portion of TXR could be sheared off (possibly via metacircular evaluation of the TXR lisp?) into something that's portable across lisps or at least to semi-standardized scheme implementations?
Since Python lambdas also are expressions, they cannot contain statements, due to Python adhering to the Algol-like statement/expression syntactic paradigm. If a lambda expression could contain statements, that would mean that almost any other kind of Python expression could also contain statements by containing a lambda expression.
But, here above, I'm writing more from the angle of supporting multi-line lambdas that contain statements. Strictly speaking, you are only bemoaning the lack of multiple expression support, not multi-line lambdas.
Python could adopt something similar to the C comma operator. It almost has that in the form of list constructors, except that these return a list, instead of the rightmost value:
[foo(), bar(), xyzzy()] # foo, bar and xyzzy are called
Idea: a dummy function called progn could be used for this:
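A minimal sketch of what such a function could look like, assuming it simply accepts any number of arguments and returns the last one (the names `progn`, `log` and `double_with_logging` are only illustrative):

```python
def progn(*args):
    """Return the last argument; Python has already evaluated all of them."""
    return args[-1] if args else None


log = []

# A lambda whose body effectively runs three expressions in sequence.
double_with_logging = lambda x: progn(
    log.append("start"),
    log.append("done"),
    x * 2,
)

result = double_with_logging(21)
```

This only sequences expressions; statements such as assignments or `for` loops still cannot appear inside the lambda.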
There you go. Lambdas are (effectively) not limited to a single expression. If progn is too long, call it pg (Paul Graham) or pn (Peter Norvig).
Always have your Lisp hat on, even if you find yourself in Python land.
Maybe this is a common trick? I don't use Python; I hardly know anything about it. I wrote one Python program before which garbage-collects unreferenced files from a Linux "initramfs" image, making it smaller (thus reducing a kernel image size). This was in a Yocto environment, which is written in Python 3, so that choice of language made sense.
BTW does Python require left-to-right evaluation order for arguments? I would sure hope so; it would probably be "un-Pythonic" to plant such a bomb into the language as unspecified eval order.
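For what it's worth, the Python language reference does say that expressions are evaluated left to right, and that covers the arguments of a call. A small sketch to observe it (the helpers `note` and `collect` are made up for the demonstration):

```python
order = []


def note(tag):
    """Record that this argument expression was evaluated, then pass it through."""
    order.append(tag)
    return tag


def collect(*args):
    """Return the arguments unchanged, as a tuple."""
    return args


values = collect(note("a"), note("b"), note("c"))
```

After the call, `order` holds the tags in the order in which the argument expressions ran.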
BTW looks like a more idiomatic definition for progn is:
No. It's obvious and trivial, but you'd be on the verge of being called names if you tried to use it in a Python codebase. Lambdas in Python are limited to a single expression by convention - which in Python-land is scarily rigid and specific - rather than just by the language spec.
Before `... if ... else ...` was added to the language as an expression (I think around 2.5), people had to make do with some workarounds. The fact that `True` and `False` behave as the integers 1 and 0 allowed for writing something like `[val_if_false, val_if_true][condition]`. Or you could use an `and`/`or` combination as per usual. The official stance at the time was to never do this and use an `if` statement instead, but people still sometimes resorted to it. Then, the `if` expression was introduced specifically to combat the use of such workarounds. Now you'd be lynched if you tried to use one of them.
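A rough side-by-side of those workarounds and the modern form (the function names are invented for illustration, and `bool(...)` is added so the index trick also works for non-boolean conditions):

```python
def pick_by_index(condition, val_if_true, val_if_false):
    # Index trick: a bool indexes the list as 0 or 1.
    # Both values are always evaluated before one is picked.
    return [val_if_false, val_if_true][bool(condition)]


def pick_by_and_or(condition, val_if_true, val_if_false):
    # and/or trick: gives the wrong answer when val_if_true is falsy.
    return condition and val_if_true or val_if_false


def pick_by_if_expr(condition, val_if_true, val_if_false):
    # The conditional expression that replaced both tricks.
    return val_if_true if condition else val_if_false


index_result = pick_by_index(True, 0, 99)
and_or_result = pick_by_and_or(True, 0, 99)
if_expr_result = pick_by_if_expr(True, 0, 99)
```

With a falsy "true" value such as 0, the `and`/`or` version falls through to the other branch, which is one of the reasons these idioms were discouraged.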
"There should be one - and preferably only one - obvious way to do it" - from the Zen of Python[1].
In general, despite a lot of effort to eliminate them, there are still some creative ways to use the language. It will always be the case, obviously, as you demonstrate. However, that creativity is 100% rejected by the community, to the point that even mentioning inadequacy of some construct for some use case is frowned upon - because it could lead to people inventing creative workarounds. If you try to complain about something in the language, the general attitude is "write a PEP or GTFO". More often than not it results in the latter.
The saddest part of it all is that this apparently is one of the major factors that made Python as popular as it is. There are valid reasons and a lot of advantages to this strategy. Go is similar as far as I can tell. Among the dynamic languages with rich syntax, Python codebases tend to be stylistically very close to each other, and not because there is a lack of ways this rich syntax could be (ab)used, but because doing so is unpythonic.
Haaah, now I said it... I hope not many Python programmers read this thread; I can already see torches and pitchforks on the horizon...
Source: I've been writing Python for the last 12 years for pay.
I do agree about homoiconicity. It's great for writing macros, and terrible for everything else. Of course, go too far in the other direction and you get perl, so... yeah.
I don't want to reopen this 50+ years old can of worms, but I have 2 questions:
1. Is it about homoiconicity in general or specifically s-expressions [EDIT: I see GP writes about s-exps specifically, missed it at first]? Prolog, Erlang, TCL, and Rebol (just some examples) are homoiconic, but not s-exps based. What do you think about them?
2. Do you often read code without syntax highlighting and proper indentation? Assuming the code is properly indented and colorized, what makes it so hard to read in your eyes? Take a look for example at snippets in: https://docs.racket-lang.org/quick/index.html#(part._.Local_... - what do you feel is wrong with them? Is it only the placement of parens, or is there something else?
>I do agree about homoiconicity. It's great for writing macros, and terrible for everything else.
I don't know; i've written code in C, C++, C#, Java, Python, Ruby, Pascal, Delphi, Assembler x86, TCL, Javascript and Common Lisp. Lisp codebases are the cleanest and clearest i've seen by far, although ReasonML/SML/OCaml might be as clean too.
Tcl does not have a GIL and is possibly one of the most dynamic languages that has seen nontrivial use. Guile does not have a GIL. Running multiple interpreters in different threads is a common notion in Tcl.
That's true about Racket, but it has a quite original way of dealing with it with their futures[1].
To be honest, I'm not sure about the details, but it should let other threads run truly in parallel as long as it's "safe" to do on the VM implementation level. So, if the code inside the future doesn't perform any "future unsafe" operations, it can execute within a separate OS thread without worrying about the main thread.
Examples of "future unsafe" actions were given as memory allocation and JIT compilation. Further, it's mentioned that some simple (for the language users, at least) operations may be too complex internally to be "future safe". An example of this is using the generic number comparison operators - `<`, `>`, etc. Apparently, these have to handle the full numeric tower of Racket and in the process perform some future unsafe operations.
In the Mandelbrot function given as an example in the guide, simply replacing the generic comparisons with the ones specialized for work on floats specifically (and assuming that contract is not broken, which would immediately stop the future) allows the future to execute fully in parallel.
What is important to note here is that `set!` and friends, and so mutation of shared memory, is considered "future safe", ie. it's permitted to use them! (although then it's you who deals with the usual problems that brings).
I think it's worth mentioning here, because it's a novel strategy that seems to be between the two usual solutions (1. we've got GIL, live with it; 2. spawn more processes and get them to work - well, now you have many GILs...) and is showing some promising results. Plus they have a neat visualization tool!
Currently, it's limited and works best for purely numerical computations (which is also where you'd need it 99% of the time), but in some cases, it appears to work: the programmers of the language (not the implementation of the language) are given a tool to work outside the GIL in a structured manner plus a tool for closely inspecting low-level operations that happen in their code which would suspend or stop the future.
I'm not aware of any other language, dynamic or not, which has both the GIL and a nice, language-level tool for freeing it and running in parallel. Because what is considered "future unsafe" depends on the details of the implementation, I have high hopes for Racket-on-Chez, although I think I read somewhere that work on futures is not a priority at this time.
Also, to confirm the sibling comment, SBCL is happy to spawn truly parallel threads. There are other Scheme implementations (I think Chicken at least, but not sure right now) that allow the same.
This seemed interesting, but when I went through the "Accepted Stack Overflow" links on the main page, I thought "how would I do this in an R tidyverse stack?" and set the goal that my responses should be shorter, clearer, or ideally both, and that I would favour clearer answers over code golf. When posting to HN I collapse the code into a single line, whereas in R there would be line breaks at each semicolon or after each pipe operator (%>%). Here are three examples:
"Customized sort based on multiple columns of CSV". In R, something like this: `library(tidyverse); read_delim("file.tsv", delim = "@") %>% arrange(.[[2]]) %>% group_by(.[[2]]) %>% arrange(match(.[[3]], c("arch.", "var.", "ver.", "anci.", "fam.")), .[[3]]) %>% group_by(.[[2]], .[[3]]) %>% mutate(n = n()) %>% arrange(desc(n)) %>% ungroup() %>% select(1:4)`
"Extract text from HTML table". In R, something like this would suffice: `library(rvest); library(tidyverse); read_html(URL_GOES_HERE) %>% html_nodes("div.scoreTableArea") %>% html_table() %>% write_delim("out.csv", delim = "\t")`
"Get n-th Field of Each Create Referring to Another File". In R: `library(tidyverse); file1 = read_delim("file1.txt", delim = " ", col_names = FALSE); chunks = readChar("file2.txt", 999999) %>% str_split(";") %>% unlist() %>% map(function(x) { matches = str_match(str_trim(x), '^create table "(.*)"([^(]*)\\(((.|\n)*)\\)$'); title = matches[, 2]; fields = matches[, 4] %>% str_split(",") %>% unlist() %>% str_trim(); return(tibble(table_name = rep(title, length(fields)), n = 1:length(fields), field = fields)) }) %>% bind_rows(); file1 %>% left_join(chunks, by = c("X1" = "table_name", "X2" = "n"))`
The third example trades off a little clarity for a little robustness by adding a regex instead of assuming the SQL table definition is one field per line.
There is no HTML parsing library in TXR, yet the code still looks good.
TXR Lisp has support for that type of functional transformation of structured data, with fairly tidy syntax. If a need for a full blown HTML parsing library arises, someone will come up with one; maybe me. It could end up integrated into the TXR flex/Yacc parser, which would make it fast.
In the "Get n-th Field" task, what we can do is snarf the data as a string, then remove all the commas and semicolons. It then parses as TXR Lisp with the lisp-parse function, resulting in this:
> The PDF rendition of the reference manual, which takes the form of a large Unix man page, is over 600 pages long, with no index or table of contents. There are many ways to solve a given data processing problem with TXR.
The "no index or TOC" isn't being touted as a feature, just that the page count is that without these (in documents like these, these features can contribute dozens to the page count). An index would be nice; patches welcome!
The HTML version that most people would be using has a TOC with two-way navigation to the section headings and is hyperlinked. Of course, man page reading allows easy searching.
I guess threads like this remind me why it's nice to have professional doc writers review my customer-facing text at work. ;) Congrats on your project getting some more attention! If you'll indulge a bit of bikeshedding, this particular miscommunication could probably be avoided in the future by changing the sentence to the short "The PDF rendition of the reference manual is over 600 pages long." Even if you add extra things to the PDF later the statement won't be incorrect and so you won't have to deal with nitpickers coming by next time with a comment like "But if you remove the index it's only 597 pages!"
Another edit preserving more of the original would be to replace the final "with no" with something like "even excluding any"...
I learned and used basic TXR some time ago. I had a text parsing problem that needed backtracking, and it seemed simpler to use TXR than to implement this in Python or Perl.
Basic TXR matching is really quite simple. Match some patterns, generate a report at the end. The patterns are interleaved with the matching text, so it's more like a more powerful (but far more readable) version of regexps than a normal programming language.
You can learn it quickly based on the provided examples.
It's just a few straightforward commands, although you have to wrap your mind around how the backtracking parser works.
Most of the manual is about the LISP. I never used that part and I don't think it's really needed for 95+% of all text parsing/summarizing.
Well the HTML version has contents. 600 pages of documentation and with the information density I see in a quick skim would not imply a “you are on your own” mentality to me.
I read it as honesty and not bragging. Few people set out to create an inaccessible language with bad documentation, but given enough time and users, most languages become one. I'd prefer language maintainers and users have enough self-awareness to not believe it is still the elegant and simple language of 20 years ago.
It would be interesting to have a DSL for data munging, but I am afraid TXR is not it. My requirements would be that the language should be functional and total.
Most transformations that we do on data do not require Turing completeness or recursion. I think it would be useful to write these down in a language with semantics that is easy to analyze.
The funny thing is, I originally didn't intend the TXR pattern language to be recursive. It needed functional decomposition (pattern functions) to break up a big pattern match into simpler units. When those were implemented, I realized after the fact, hey we have a push-down automaton that can now grok recursive grammars.
I don't see why we would want to rule out a pattern function invoking itself (directly, or through intermediaries); if that hurts, then just don't do that.
(Though I understand that there are languages deliberately designed without unbounded loops or recursion, for justifiable reasons.)
I found in practice that arbitrary recursion depth is (even on languages with formal recursive grammar) very rarely needed. And where it's needed it can probably be implemented as a primitive in the language (map total function over all the nodes) that can do a similar thing.
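To make the "primitive" idea concrete, here is a rough Python sketch. The name `map_nodes` and the nested-list tree representation are my own assumptions, not anything taken from TXR or XSLT, and Python itself does nothing to enforce totality. The point is only that the recursion lives in one built-in traversal, while the function supplied by the user is a plain, non-recursive one.

```python
def map_nodes(fn, tree):
    """Apply fn to every leaf of a nested-list tree, preserving its shape.

    This traversal is the only recursive piece; fn itself never recurses.
    """
    if isinstance(tree, list):
        return [map_nodes(fn, child) for child in tree]
    return fn(tree)


def double(leaf):
    # A simple, obviously terminating per-node transformation.
    return leaf * 2


doubled = map_nodes(double, [1, [2, [3, 4]], 5])
```

In a language designed this way, `map_nodes` would be provided by the runtime and user code would only ever write functions like `double`.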
XSLT is Turing complete with the usual caveats about memory. Given its complexity it'd be very unlikely for it not to be, but there's clear proof too: someone has implemented a universal Turing machine in it.
4. My use case was a somewhat fuzzy parsing problem that is harder than a single regexp and needs backtracking, followed by generating a report from the result.
For these things TXR is great.
If you want multithreading or the best performance, it's probably not the thing to use.
Yep. Seriously. R w/tidyverse is a ridiculously powerful data wrangling tool especially when dealing with text files.
I tend to use Notepad++ when starting out on a data-wrangling adventure. It has an uncanny ability, unlike any other editor, to open hundreds of files at the same time and to perform regex operations on all of them without dropping dead. I use Notepad++ for initial manual exploration to get the lay of the problem, and then switch to R for the actual analysis.
Well, this looks great, but I'm not about to start digesting the self-admitted 600-page tome just to see if it's worth learning for the tasks I encounter - surely there's a "tutorial" somewhere?
I can accept that doing something non-standard leads to some rough edges like this, but i'm not sure how many web developers know this is an issue. At least it has surprised me how many websites have this issue of assuming the default color is bright white.
Hi; try it now! I GIMP-ed the image such that the non-transparent pixels are pure red, and only slightly opaque, instead of 100% opaque pinkish white. It looks about the same on a white background. Thanks, again.
I tested it with a light grey background, as well as a heavy grey one.
This little experiment really made me notice HN's hard-coded light grey background box, BTW.
Good heads up. This background (if it is to exist at all) should be done properly as an alpha blend, not as a transparency with opaque off-white pixels, so that it works with various backgrounds. I will look into it.
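For anyone wondering why the alpha blend matters, here is a small sketch of the usual "over" compositing rule. The 25% alpha and the particular colours are assumptions picked for the example, not the values used in the actual image.

```python
def blend(fg, bg, alpha):
    """Composite an RGB foreground over an opaque RGB background.

    Each channel is alpha * foreground + (1 - alpha) * background.
    """
    return tuple(round(alpha * f + (1 - alpha) * b) for f, b in zip(fg, bg))


PURE_RED = (255, 0, 0)
WHITE = (255, 255, 255)
MID_GREY = (128, 128, 128)

over_white = blend(PURE_RED, WHITE, 0.25)    # a pale pink on a white page
over_grey = blend(PURE_RED, MID_GREY, 0.25)  # a reddish grey on a grey page
```

A mostly transparent pure red takes on the tone of whatever is behind it, whereas fully opaque off-white pixels stay off-white and stand out on any non-white background.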
Of course, Non-GNU projects that are licensed in such a way that they could be GNU projects.
From the registration page, the kind of software project that can be hosted on Savannah is [a] free software package that can run on a completely free operating system, without depending on any nonfree software. You can only provide versions for nonfree operating systems if you also provide free operating systems versions with the same or more functionalities. Large software distributions are not allowed; they should be split into separate projects.
I can summarize this as follows. TXR is my research platform into various topics, including many Lisp topics. It contains numerous innovations. As a whole, that requires working at the implementation level, ground up.