
#legume


Legume is a distributed issue tracker based on developer code comments such as TODO and FIXME. It has no separate metadata or database and understands several programming languages.

Keeping notes in code is a venerable programmer tradition. Notes of this sort retain locality; they are reminders and warnings for code browsers, and retain important meta-data about where to start. They integrate naturally with whatever VCS the developer is using, and make it easy for a developer to remember to remove the comment when the topic is addressed. They require no extra external database.

Legume can be used alongside other, more formal, issue trackers which might allow non-developers to submit tickets. I also use it, rather than a web-based ticket tracker, on small, single-file scripts.

The project home is here. File bugs here. Send patches, or any other comments or questions, to ~ser/legume@lists.sr.ht. There is a public chat available at #legume:matrix.org (matrix clients).

Is Legume using the Sourceforge issue tracker ironic? Legume is for developers, and the only way to add issues is to add code comments. It doesn't have a public interface, so the issue tracker is for non-contributors. If you are interested in the history of Legume -- why I wrote it, the value of DITs -- I wrote a blog post about it (also at gemini://ser1.net/post/legume.gmi).

#Features

The biggest selling feature of legume is that issues are embedded in code. This has a number of benefits:

  • Issues can be place-marks for where developers think things are happening.
  • Commenting code with TODO and FIXME comments is a decades-old traditional developer habit
  • Integrated with version control at an intuitive level, reducing cognitive load. No external integration with VCS necessary.
  • No external DB for tickets. TODOs and FIXMEs are still relevant even if you stop using legume. Which also means...
  • legume is not strictly necessary for working with issues. It's a convenience tool. This means you don't have to build your workflow around legume, and nobody in your team needs to install legume. You can find and grep your way through issues.
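Since issues are ordinary comments, a rough listing needs nothing more than grep. The sketch below is an approximation and not how legume works internally: `list_issues` is a made-up helper name, and unlike legume it matches the keywords anywhere on a line rather than only inside comments.

```shell
# list_issues DIR: print "file:line:text" for every line under DIR that
# contains TODO, FIXME, BUG, or XXX as a whole word (case sensitive).
# Hypothetical helper; it does not check that the keyword is in a comment.
list_issues() {
    grep -rnwE 'TODO|FIXME|BUG|XXX' "${1:-.}"
}
```

Calling `list_issues src/` from a project root prints one line per match, which is often enough for a quick look when legume is not installed.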

Legume will also address other, non-developer, use cases by recognizing a todo.txt file in the directory of execution. Legume will assume every line is a separate TODO. This allows easy integration with todo.txt, but also provides a place for sticking issues that either aren't linked to source, or come from a source other than a developer (e.g., web app). todo.txt syntax is also supported in comments.

  • (A) priorities
  • 2017-03-20 "created" dates
  • due:2017-03-20 "tagged" dates
  • @contexts
  • +projects
  • key:value tags
  • Multi-line issues (new issue, blank line, or non-comment breaks issue)

The list of programming languages that legume should understand is here.

#Install

Binaries for OSX, Linux, and Windows are here, and are signed with the GPG key 5E0D7ABD6668FDD1 (available from hkp://hkps.pool.sks-keyservers.net). Download assets are compressed with brotli (which is probably available in your distribution's package manager). Here are the shell steps for DIY installation:

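VERSION=v1.3.3
# Download the compressed binary and its signature under short local names
curl -L -o leg.br https://downloads.ser1.net/software/legume/leg_linux_amd64_${VERSION}.br
gpg --recv-key 5E0D7ABD6668FDD1
curl -L -o leg.br.sig https://downloads.ser1.net/software/legume/leg_linux_amd64_${VERSION}.br.sig
# Verify the signature against the downloaded file before installing
gpg --verify leg.br.sig leg.br
brotli -d -c leg.br | sudo tee /usr/local/bin/leg > /dev/null
sudo chmod +x /usr/local/bin/leg
# Install the man page
curl -L https://downloads.ser1.net/software/legume/docs/leg.1.gz | \
    sudo tee /usr/local/share/man/man1/leg.1.gz > /dev/null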

A package for Legume is in AUR; the binary installed is called legume. If you aren't also using the peg package (a parser generator for C), then I recommend aliasing legume to leg, since that's supposed to be the actual executable name.

yay -S legume
echo 'alias leg=legume' >> ~/.bashrc    # or .profile, or .cshrc, or .zshrc

If you want to build it yourself, you can either go install the package, or clone it and compile it manually.

go install ser1.net/legume/cmd/leg@v1.3.3

Note that "latest" doesn't work as the version at the moment. This is an older project, and the combination of Mercurial bookmarks and tags confuses go install. Use a version number.

To build from a clone instead, run this from the top of the source tree; it should work on all Unix-ish OSes (including OSX):

go build -o leg ./cmd/leg
sudo cp ./leg /usr/local/bin/leg
sudo cp ./leg.1 /usr/local/share/man/man1/leg.1

#vim

A super-simple vim script is available; to install, copy the contents of legume.vim into your vimrc. It binds <leader>lg to execute legume with the -f vim option and open the quickfix buffer.

#kakoune


The kakfile can be put in your $XDG_CONFIG_HOME/kak/autoload directory, and exposes some make-like functions & functionality. Kakoune's CWD should be the project directory.

  • leg opens a buffer list of the issues
  • leg-next-todo jumps to the next issue
  • leg-previous-todo jumps to the previous

If you want, map these to shortcuts as you would for make, for example:

declare-user-mode leg
map global leg n ": leg-next-todo<ret>" -docstring "go to next issue"
map global leg p ": leg-previous-todo<ret>" -docstring "go to previous issue"
hook global BufCreate \*leg\* %{
    map global user g ": enter-user-mode leg<ret>" -docstring "legume issues"
}
hook global BufClose \*leg\* %{
    unmap global user g 
}

#Usage

There's the usual -h/--help, but if you installed with a package manager there should also be a more detailed man page (man legume). If you installed from source, there's a manpage in the top directory.

Comments must always start with a keyword: TODO, FIXME, BUG, or XXX. These keywords are case sensitive. After that, todo.txt formats are allowed. Example:

    // TODO 2017-03-20 (A) @con_example +proj_example due:2018-03-22 key:value Priorities, contexts, projects, created & due dates, and key/value pairs
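To make the fields in that example concrete, here is a minimal sketch that pulls each one out of the line with standard tools. The helper names are hypothetical and the patterns are simplifications (any `word:value` token counts as a tag, for instance); legume's own parser may treat edge cases differently.

```shell
# Hypothetical helpers: each takes one comment line and prints the matching
# todo.txt-style fields, one per line. None of these is part of legume.
issue_priority() { printf '%s\n' "$1" | grep -oE '\([A-Z]\)' | head -n 1; }
issue_created()  { printf '%s\n' "$1" | tr ' ' '\n' | grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}$' | head -n 1; }
issue_contexts() { printf '%s\n' "$1" | tr ' ' '\n' | grep -E '^@.'; }
issue_projects() { printf '%s\n' "$1" | tr ' ' '\n' | grep -E '^\+.'; }
issue_tags()     { printf '%s\n' "$1" | tr ' ' '\n' | grep -E '^[A-Za-z_]+:[^ ]+$'; }

line='// TODO 2017-03-20 (A) @con_example +proj_example due:2018-03-22 key:value Priorities, contexts, projects, created & due dates, and key/value pairs'
issue_priority "$line"   # priority, in parentheses
issue_created  "$line"   # the bare date is the "created" date
issue_contexts "$line"   # tokens starting with @
issue_projects "$line"   # tokens starting with +
issue_tags     "$line"   # key:value tokens, including due:
```

The due date is simply a key:value tag whose key is `due`, which is why it appears in the `issue_tags` output rather than next to the created date.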

Use the built-in help for options. One workflow is:

$ leg       # to list all todo/fixmes in a project
$ leg -d 5  # to list the details of item # 5
$ leg 5     # Same as above: for convenience, the `-d` may be omitted

Filtering changes indexing. Because there is no canonical item numbering, any filters that produce a list must also be used when using -d (details). For example,

$ leg -P test       # to ignore all files in the test/ directory
$ leg -P test 3     # For convenience, the `-d` may be omitted
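The reason the filter has to be repeated is that an item number is only a position in the current listing. The sketch below imitates that with grep and nl on a throwaway directory; `number_issues` is a hypothetical helper, and legume's real ordering and numbering may differ.

```shell
# number_issues DIR [EXCLUDE]: list keyword lines under DIR, drop paths that
# match the optional EXCLUDE regex, and number whatever remains.
# Hypothetical helper built on grep and nl; legume is not involved.
number_issues() {
    (cd "$1" && grep -rnwE 'TODO|FIXME|BUG|XXX' . | LC_ALL=C sort |
        grep -vE "${2:-^\$}" | nl -w 2 -s ' ')
}

# Throwaway project with one issue in each of three directories.
demo=$(mktemp -d)
mkdir -p "$demo/cmd" "$demo/test" "$demo/util"
printf '// TODO write docs\n' > "$demo/cmd/main.go"
printf '// FIXME flaky test\n' > "$demo/test/a_test.go"
printf '// TODO add benchmark\n' > "$demo/util/util.go"

number_issues "$demo"               # util/util.go is item 3
number_issues "$demo" '^\./test/'   # util/util.go is now item 2
rm -rf "$demo"
```

The same issue gets a different number once the test/ directory is excluded, so a number taken from a filtered listing only makes sense when the same filter is applied again.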

If the --diff flag is used, Legume reads from stdin and processes the input as a diff. In this mode, Legume counts any added (+) tickets as new and any removed (-) tickets as closed, and reports them accordingly. This can be used with any version control system that supports unified diff output; for example, on the Legume repository itself:

$ hg diff -r 4:98 | leg --diff
  1 REQ    NEW Include the time stamp in the report; removed use old version, add use new
  2 REQ    NEW filter on priority, category, and project; use meta-tags and -t
  3 BUG    NEW catch string lit escapes
  4 REQ CLOSED Support for STDIN
  5 REQ CLOSED Config file
  6 REQ CLOSED Support for unified diff
  7 REQ CLOSED Add test cases.
  8 REQ CLOSED Add test cases.
  9 REQ CLOSED Add test cases.
 10 REQ    NEW Add unit tests for Alias [component:ui]
 11 REQ    NEW Implement & add unit tests for Keywords [component:ui]
 12 REQ CLOSED Add test cases.
 13 REQ CLOSED Add test cases.
 14 REQ CLOSED Add test cases.
 15 REQ    NEW Parsing diffs seems to be broken
 16 REQ CLOSED Add test cases.
 17 INF    NEW refactor parsing to "consume-to-end"

This can be helpful when building change logs for a change set.

Obvious limitations result from how diff reports information; a changed line is reported as a combination delete + add, which looks to Legume like a close + open. In practice and over short spans, it works pretty well.
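The close + open effect is easy to reproduce outside legume. The following is a minimal sketch, assuming a plain unified diff on stdin; `classify_diff` is a hypothetical awk-based helper that only approximates the behaviour described above and does not produce legume's report format.

```shell
# classify_diff: read a unified diff on stdin and label keyword lines.
# Added lines are printed as NEW, removed lines as CLOSED.
# Hypothetical helper; it ignores multi-line issues and comment syntax.
classify_diff() {
    awk '
        /^(\+\+\+|---) / { next }   # skip the file header lines
        /^\+.*(TODO|FIXME|BUG|XXX)/ { print "NEW    " substr($0, 2) }
        /^-.*(TODO|FIXME|BUG|XXX)/  { print "CLOSED " substr($0, 2) }
    '
}

# A one-line edit to an existing TODO appears as a removal plus an addition.
classify_diff <<'EOF'
--- a/main.go
+++ b/main.go
@@ -1,3 +1,3 @@
 package main
-// TODO add tests
+// TODO add unit tests
 func main() {}
EOF
```

The rewording of a single TODO is reported as one CLOSED line and one NEW line, because the diff itself carries no notion that the two lines are the same issue.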

If you have commonly used arguments, you can put them in a file named .legrc and leg will load them from there:

-P test
-P data
-f detail

leg looks for this file in the directory from which it is run.

#Performance samples

Legume has already satisfied the basic performance requirement; small projects have sub-second parse times. A future version may implement caching to bring larger projects to this benchmark.

Project with 108 files, 14k lines, 15 todos:

leg .  0.04s user 0.03s system 81% cpu 0.090 total

Project with 992 files, 564k lines, 244 todos:

leg .  1.34s user 0.07s system 95% cpu 1.464 total

The Linux 5.7 kernel source tree, 64,309 files, 28,136,537 lines, 9,697 todos:

leg .  49.60s user 0.96s system 99% cpu 50.713 total

Legume was not designed for large projects such as the Linux kernel; at that scale, there are likely many users and developers, source code comments alone won't be sufficient, and a separate ticketing system is probably more appropriate. Consequently, the current performance is quite acceptable. Please report if the performance is impeding you; I'd be interested to see a project that large where legume is still providing value.

#Tree-sitter

This is an aside about alternative options for parsing; it's only interesting as a data point about sourcecode parsing.

I attempted in one branch to use a "proper" multi-syntax parser, tree-sitter, via the github.com/smacker/go-tree-sitter project. The 129,354 line bash/parser.c file, in the go-tree-sitter library itself, took several minutes to walk. FFI was certainly a factor, but tree-sitter itself was pretty slow parsing this file: the Rust text editor Helix also uses tree-sitter, and is extremely laggy on even a decently fast CPU. My bespoke parser processes the file in less than 0.08s (on my CPU). Working with a cut-down file of 27kLOC (the 129kLOC file took too long to run for me to want to benchmark) I got the following results:

Version                 Min       Max       Mean      Runs
Bespoke                 23.9ms    24.9ms    24.3ms    113
tree-sitter, visitor    2.523s    2.561s    2.540s    10
tree-sitter, query      1.178s    1.232s    1.209s    10

Notably, the performance of tree-sitter (or the Go binding) does not scale linearly, either by walking the parse tree or using the query API. Tree-sitter parses a file 1/5th the size in 1/348th of the time. For comparison:

          Size (lines)    Bespoke time (ms)    tree-sitter time (ms)
          26,837          24                   2,540
          129,354         74                   421,040
Factor    4.81x           3.08x                168x
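The growth factors can be rechecked from the rounded figures in the table. This is plain arithmetic, with `ratio` as a hypothetical helper name; small differences from the Factor row are to be expected if that row was computed from unrounded timings, which is an assumption on my part.

```shell
# ratio A B: print A/B to two decimal places.
ratio() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'; }

ratio 129354 26837   # growth in input size (lines)
ratio 74 24          # growth in bespoke parse time (ms)
ratio 421040 2540    # growth in tree-sitter parse time (ms)
```

The input grows a little under five-fold, the bespoke parser's time grows about three-fold, and the tree-sitter time grows by well over a hundred-fold, which is the super-linear behaviour described above.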

Even if Go FFI were faster, tree-sitter is doing a lot more work parsing and tokenizing, while my bespoke parser is only scanning for and extracting comments -- it's doing far less work. Since the tree-sitter time cost is prohibitive, legume sacrifices potentially better correctness for speed.

#Caveats

Note: Legume's executable is called leg. However, there's an older project called peg (a C parser generator) that also has a leg command; to avoid conflicts in distributions, I've named the executable legume in package managers. In this document I use leg.

Legume is intended for a narrow problem space. Large projects, such as the Linux kernel (and the boundary is surely much lower than that), are better served by a "real" issue tracking system. For large code bases, any tool that parses the entire code base on every invocation will be too slow, even if a tool like legume didn't lack most of the features of a sophisticated issue tracking system.

Testing on languages other than Go has been limited. Testing on Windows is non-existent.

The diff feature has a number of limitations: changes to multi-line TODOs where a line other than the first changed will be missed; similarly, multi-line TODOs where only the first line changed will be reported with partial descriptions. There's simply a limit to how much meta-information can be interpreted from a diff.

At the time of this writing, despite the version number, Legume is young and I'm the only user I know of. There will be bugs. Sorry about that.

#Limitations

The trade-offs of using a tool like Legume for ticket tracking, which may or may not be considered limitations, include:

  • No unique IDs, which can make referencing tickets difficult or cumbersome
  • Very simple tickets. No attachments, no cross-references or dependencies, no robust commenting; history tracking limited to what's available in the VCS.
  • Efficiency. In a pure state, collating tickets requires walking directory trees and parsing source files. At present, small and even medium-sized projects have sub-second processing, and this is faster than making calls to a remote web service to fetch tickets. At some size, Legume will take too long to parse the source tree for it to be a pragmatic tool.

#Prior Art

Legume was inspired by lentil and follows the philosophy of todo.txt. Legume has goals beyond merely duplicating lentil:

  • Accept diffs & STDIN. This allows some ability to track history by using the output of a VCS. E.g., git diff | leg would produce a list of all issues added and resolved, using diff markup (+/-) to indicate opened/closed
  • Improve performance
  • Support todo.txt syntax (priorities, dates, etc)

Below are some other distributed ticketing systems, many of which I've tried. If legume isn't your cup of tea, maybe one of the ones listed below will be more suitable for your workflow.

  • lentil, discussed below.
  • Artemis. Separate bug DB (maildir); integrated with VCS (Mercurial, beta git support). Supports more traditional ticketing features, such as attachments and comments. Only slightly younger than BugsEverywhere, the first commit was in 2007.
  • b. Separate bug DB. Tightly coupled with Mercurial -- implemented as an HG extension. Supports more traditional ticketing features, such as assigning tickets to people.
  • BugsEverywhere. Separate bug DB. Lots of features, including a web interface, email ticket communication, support for multiple VCSes, and most traditional ticketing features. The grand-daddy of distributed ticketing systems; first commit was 2005!
  • git-bug, tightly coupled to git.
  • The one with my favorite name, ScmBug
  • Fossil, which is a kitchen-sink project; the VCS & bug tracker are very tightly coupled, but it does have a distributed ticketing system, so it makes the list.
  • SD, which has integration with git and darcs, and keeps bugs in a SQLite database.
  • ditz, which I used quite a bit way back when I was using Darcs (a predecessor to both git & Mercurial).
  • ticgit, now unmaintained.
  • ditrack; the web site appears to have reverted to the domain name provider.

Distributed ticket tracking has been discussed in numerous places:

Most of these keep bug data in a separate database that's version tracked alongside the code. One of my requirements is to leverage code comments, and Lentil is the only tracker I found that does that. Some trackers keep bugs in a database consisting of one or more text files, which is OK because the VCS will be able to diff them efficiently; any tracker that keeps bugs in a binary DB (like SQLite) that requires checking into the VCS is, IMO, a non-starter.

Lentil is written in Haskell, and although I do like Haskell, I wanted something lower-impact to compile than Haskell. The core Haskell stack on Arch Linux, including common libraries, is 183 packages and 1.9GB. This is a heavy lift for small systems.

Additionally, I'm less fluent these days in Haskell than Go, and one of the issues I have with Lentil is performance. I'm not advanced enough with Haskell to be able to easily performance tune somebody else's code, and getting better at performance tuning Haskell was not my main objective for this project; I just wanted better speed and some additional functionality, and to get to the functionality I'd have to have solved the performance issues first... it was simply easier to implement a new tool.

// vim: set ft=pandoc