September 2nd, 2024

Tbsp – treesitter-based source processing language

tbsp, a tree-based source-processing language, has recently added lists, index expressions, and string manipulation features, improved documentation, and introduced a command-line interface, following its renaming from "trawk."

tbsp is a tree-based source-processing language that has received several recent updates. Key developments include lists and index expressions, substring support for strings, and improved documentation with a roadmap and usage examples. The project has also been renamed from "trawk" to "tbsp" and now ships a command-line interface (CLI) for better usability. The most recent commits, made by the author Akshay over the past few weeks, continue to refine the language and its features.

- tbsp is a tree-based source-processing language.

- Recent updates include the addition of lists, index expressions, and string manipulation features.

- The project has been renamed from "trawk" to "tbsp."

- Documentation improvements include a roadmap and usage examples.

- A command-line interface (CLI) has been introduced for enhanced usability.

Bpftop: Dynamic real-time view of running eBPF programs

The GitHub repository for `bpftop` by Netflix provides real-time monitoring of eBPF programs with statistics like average runtime, events per second, and CPU usage. Visit https://github.com/Netflix/bpftop for more details.

Modeling B-Trees in TLA+

Lorin Hochstein explores B-trees using TLA+, modeling operations like key-value retrieval and insertion. He emphasizes historical outputs and node structures, offering insights into B-tree functionality in databases.

Show HN: notesbash – A notes management TUI written in bash

notesbash is a Linux shell-based note-taking tool that lets users manage notes through a terminal interface. It supports various file formats and customization, and is open source under the GNU GPL 3+ license.

Sourcetrail: Free and Open-Source Interactive Source Explorer

Sourcetrail is an archived, open-source source code explorer for Windows, macOS, and Linux, supporting C, C++, Java, and Python, with offline functionality and an SDK for custom extensions.

Modern Unix Tool List

The article lists modern Unix command-line tools that enhance traditional utilities, highlighting Atuin, Bat, and Concurrently, while noting some tools as unsatisfactory and emphasizing the need for regular updates.

AI: What people are saying

The introduction of tbsp, a new tree-based source-processing language, has generated positive feedback and suggestions from the community.

- Users appreciate the new features and improvements, particularly for parsing and processing tasks.

- There are requests for higher-level APIs and tools to simplify grammar handling and AST processing.

- Some users express excitement about using tbsp for practical projects, such as converting HTML to CSV.

- Concerns are raised about the limitations of the Markdown parser and the need for more robust parsing capabilities.

- The community values the thoughtful naming of the language and its potential for broader applications.

16 comments

By @fellowmartian - 5 months

This is great, and a step in the right direction. I wish tree-sitter had an official higher level API that allowed processing and pattern matching for use cases other than those required for text editors.

I’m currently using tree-sitter at work to build AST-based tools, as performance is amazing, even with huge codebases, but I’m finding it slightly frustrating to have to manually write recursive descent processors keyed by strings, with no compile time guarantees on the structure of the grammar.

This is compounded by the fact that grammars themselves don’t really follow any standard structure, some have named fields (presumably the ones created after GitHub contributed this feature), while others require hierarchical pattern matching.

I wish there existed a tool to consume a grammar and output a Rust ADT that we can simply match on. This would at least save me from redundant error handling. I’d build one myself, but I’m not that good at Rust yet.

By @rtpg - 5 months

So an awk but that knows how to walk structures instead of just lines. Excellent!

I'm a big fan of semgrep letting me query ASTs, this feels like something in a similar space. Down with lines, up with everything being trees!

By @sramam - 5 months

This is so cool.

Question (caveat: first exposure to tree-sitter and tools like this): Is there a reason the example demonstrates the use of depth as a variable instead of it being built in?

Nesting level of a particular "type" is general enough that it might be included OOTB. What you want to do with this might be generalizable - for example instead of

```
enter section {
    depth += 1;
}
leave section {
    depth -= 1;
}

enter atx_heading {
    print("<h");
    print(depth);
    print(">");
}
leave atx_heading {
    print("</h");
    print(depth);
    print(">\n");
}
```

It could simply be:

```
enter atx_heading {
    print("<h");
    print(depth);
    print(">");
}
leave atx_heading {
    print("</h");
    print(depth);
    print(">\n");
}
```

So depth would always be the nesting level of the same node type, available out of the box. For Markdown, headings, sections, and lists come to mind, but I might be wrong.

In any event, this looks really well thought-out, and now to check out the other tools mentioned in the comments...
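A minimal sketch of what a built-in depth could look like, written in Python rather than tbsp. It assumes a walker that keeps a per-node-type nesting counter and hands it to the enter/leave hooks. `Node`, `walk`, and `render_headings` are hypothetical names for illustration; none of this is tbsp's or tree-sitter's actual API.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a syntax-tree node (not the tree-sitter API)."""
    kind: str
    children: list = field(default_factory=list)


def walk(node, enter, leave, depths=None):
    """Depth-first walk that tracks how many open ancestors (plus self) share each kind."""
    depths = Counter() if depths is None else depths
    depths[node.kind] += 1          # the walker does the bookkeeping, not the user
    enter(node, depths)
    for child in node.children:
        walk(child, enter, leave, depths)
    leave(node, depths)
    depths[node.kind] -= 1


def render_headings(root):
    """Emit <hN> tags where N is the current nesting level of `section` nodes."""
    out = []

    def enter(node, depths):
        if node.kind == "atx_heading":
            out.append(f"<h{depths['section']}>")

    def leave(node, depths):
        if node.kind == "atx_heading":
            out.append(f"</h{depths['section']}>")

    walk(root, enter, leave)
    return "".join(out)


# A document with one top-level section containing a nested section.
doc = Node("document", [
    Node("section", [
        Node("atx_heading"),
        Node("section", [Node("atx_heading")]),
    ]),
])
print(render_headings(doc))
```

Exposing a counter per node type, instead of a single `depth`, lets a heading hook ask for the section depth specifically, which is what the hand-written version above computes.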

By @mingodad - 5 months

For those that want to explore the grammars listed at https://github.com/tree-sitter/tree-sitter/wiki/List-of-pars... in a more friendly railroad diagram format, I made https://mingodad.github.io/plgh/json2ebnf.html, which reads the "src/grammar.json" and tries its best to generate an EBNF understood by (IPv6) https://www.bottlecaps.de/rr/ui or (IPv4) https://rr.red-dove.com/ui, where we get a nice navigable railroad diagram (see https://github.com/GuntherRademacher/rr for offline usage).

By @MantisShrimp90 - 5 months

As someone writing a Neovim plugin using tree-sitter, thank you! Languages like this help leverage tree-sitter in more interesting ways, whereas current APIs are still a bit low-level.

By @samgriesemer - 5 months

The md-to-html demo is a good one, but worth mentioning that the Markdown parser[1] being used may not be suitable for more complex documents. From the README:

> "...it is not recommended to use this parser where correctness is important. The main goal for this parser is to provide syntactical information for syntax highlighting..."

There's also a separate block-level and inline parser, not sure how `tbsp` handles nested or multi-stage parsing.

[1]: https://github.com/tree-sitter-grammars/tree-sitter-markdown

By @ashkankiani - 5 months

Adding a way to query the path at the current node would let you skip out on doing stuff like keeping track of `in_section`.
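A rough sketch of that path idea, in Python rather than tbsp. It assumes the walker passes each hook the tuple of node kinds from the root to the current node, so "am I inside a section?" becomes a lookup instead of a flag. `Node`, `walk_with_path`, and `headings_in_sections` are hypothetical names, not part of tbsp or tree-sitter.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a syntax-tree node (not the tree-sitter API)."""
    kind: str
    children: list = field(default_factory=list)


def walk_with_path(node, visit, path=()):
    """Depth-first walk that hands each visitor the kinds from root to current node."""
    path = path + (node.kind,)
    visit(node, path)
    for child in node.children:
        walk_with_path(child, visit, path)


def headings_in_sections(root):
    """Collect the paths of headings that have a `section` ancestor."""
    found = []

    def visit(node, path):
        # path[:-1] holds the ancestors, so no `in_section` flag is needed.
        if node.kind == "atx_heading" and "section" in path[:-1]:
            found.append("/".join(path))

    walk_with_path(root, visit)
    return found


doc = Node("document", [
    Node("atx_heading"),                      # not inside a section
    Node("section", [Node("atx_heading")]),   # inside a section
])
print(headings_in_sections(doc))
```

Because the path is an immutable tuple rebuilt on the way down, nothing has to be undone on the way back up, which is the bookkeeping an `in_section` variable otherwise requires.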

I wonder if the `enter|exit ...` syntax might be too limiting but for a lot of stuff it seems nice and easy to reason about. Easier than tree-sitter's own queries.

I think if you really wanted performance and whatnot, you might end up compiling the queries to another target and just reuse them.

I could see myself writing a lua DSL around compiling these kinds of queries `enter/exit` stanzas or an SQL one too.

By @orra - 5 months

Not a technical comment (as cool as this is), but I love the name.

We always say naming things is one of the hard parts of programming. They avoided the default option of something like tawk.

By @toastal - 5 months

Always kudos towards taking a self-hosted-forge approach

By @otreblan - 5 months

https://aur.archlinux.org/packages/tbsp-git

By @lumb63 - 5 months

This is really cool! I have a lot of short projects that are essentially “parse out 2 or 3 tags of HTML and convert that to CSV.” This will be perfect for that; in the past I’ve done it by hand with vim. Next time I’ll give this a shot.

By @jpgvm - 5 months

Maybe update the link to https://git.peppe.rs/languages/tbsp/tree/readme.txt?

By @orjicu98 - 5 months

Very interesting programming paradigm. For inspiration, I would recommend checking out https://rosettacode.org/wiki/Category:Bracmat and https://www.egison.org/

They describe themselves as non-linear pattern matching, a pretty niche and unique way to program, and I enjoyed playing with their code.

Thanks for posting, very nice.

By @azeirah - 5 months

Awesome! I'd love to see this flourish.

By @vslira - 5 months

That's a lot of work to write lisp without parentheses /j

I joke, really interesting project, props to the team

By @PoppGolfer - 5 months

tablespoon - of course....
