• 0 Posts
  • 27 Comments
Joined 2 years ago
Cake day: June 30th, 2023



  • These include semgrep, ast-grep, LLMs, and one-off scripts. After running these tools on a large code-base, you usually end up with lots of additional unintended changes. These range from formatting/whitespace to unrequested modifications by LLMs.

    Maybe LLMs do, but why would semgrep or your one-off script be making unrelated changes? This is like using sed to replace something and then using grep to filter out the very things you just deliberately modified. It should be unnecessary if you commit frequently enough and don’t pile up 10 different refactorings before you start committing them one by one.











  • It’s not necessarily about the load; it’s about the algorithmic complexity. Going from lists (lines in a file, characters in a line) to trees introduces a potentially exponential increase in complexity, because the same list of elements can be organized into a tree in so many different ways.

    Also, you’re underestimating the amount of processing. It’s not about pure CPU computation but about RAM access or even I/O. Even existing non-semantic diff implementations perform surprisingly poorly. You clearly haven’t tried diffing multi-GB log files.
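
To put a rough number on the "ways the same list can be organized into a tree" point, here is a minimal sketch. It assumes the simplest possible model, binary trees over an ordered sequence of leaves, which is not how any particular diff tool represents code; `tree_shapes` is a hypothetical helper name. The counts are the Catalan numbers, and they grow exponentially with the length of the list.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def tree_shapes(n_leaves: int) -> int:
    """Count the binary tree shapes whose leaves, read left to right,
    are a fixed ordered list of n_leaves elements (illustrative model only)."""
    if n_leaves == 1:
        return 1  # a single element is a single leaf
    # Pick how many leaves go into the left subtree; the rest go right.
    return sum(
        tree_shapes(k) * tree_shapes(n_leaves - k)
        for k in range(1, n_leaves)
    )


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5, 10, 20):
        print(n, tree_shapes(n))
```

A list of 5 elements already admits 14 shapes, 10 elements admit 4862, and 20 elements admit 1767263190. A real tree diff does not have to enumerate all of them, since the parser fixes the tree on each side, but this is why a naive search over possible correspondences blows up in a way a line-based diff does not.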