• 0 Posts
  • 67 Comments
Joined 2 years ago
Cake day: July 24th, 2023




  • You could save another 0.64 bits per char if you actually treated your output as a binary number (using 6 bits per char) and didn’t go through the intermediary string (implicitly using base 100 at 6.64 bits per char).
    This would also make your life easier by allowing bit manipulation to slice/move parts, and it would reduce work for the processor, because base 100 means integer divisions while base 64 means bit shifts. If you want to go down the road of a “complicated” base, use base 38 and get similar drawbacks as now, but at only 5.25 bits per char.
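A minimal sketch of the 6-bits-per-char idea, assuming a 64-symbol alphabet. The alphabet and the names `pack`, `unpack`, and `bits_per_char` are made up for illustration; the character set from the original post is not shown here. Packing and unpacking use only shifts and masks, and `bits_per_char` reproduces the figures quoted above.

```python
import math

# Hypothetical 64-symbol alphabet (10 digits + 26 lower + 26 upper + 2 extra).
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-_"
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def pack(text: str) -> int:
    """Pack text into one integer, 6 bits per character, using shifts only."""
    value = 0
    for ch in text:
        value = (value << 6) | INDEX[ch]
    return value


def unpack(value: int, length: int) -> str:
    """Reverse of pack: peel off 6 bits at a time with a mask and a shift."""
    chars = []
    for _ in range(length):
        chars.append(ALPHABET[value & 0x3F])
        value >>= 6
    return "".join(reversed(chars))


def bits_per_char(base: int) -> float:
    """Information content of one symbol in the given base."""
    return math.log2(base)
```

With this layout, slicing out the n-th character is a shift and a mask rather than a chain of divisions by 100.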






  • Rust has monomorphisation like C++, and every function has the aliasing guarantees of restrict, a keyword that is rarely seen in C code bases and that C++ doesn’t even support.
    This means you can get more optimisations while writing in an intuitive style, where C/C++ requires some changes to the code.

    On the other hand, rustc has some hiccups with argument passing and RVO. One could argue that that’s just the compiler, while the aliasing problems are part of the language in the C/C++ case, but as long as there is only one Rust compiler, its performance is the language’s performance.

    For most use cases they are about equally fast.







  • If you want to have a library that can also be a standalone executable, just put the main function in an extra file and don’t compile that file when using the library as a library.
    You could also use the preprocessor to do it similarly to Python, but please don’t.

    Just use any build tool, and have two targets, one library and one executable:

    # Make-style source lists, one per target
    LIB_SOURCES = tools.c stuff.c more.c
    EXE_SOURCES = main.c $(LIB_SOURCES)
    

    Edit: added example



  • That boolean can indicate whether it’s a fancy character: that way all ASCII characters are themselves, but if the boolean is set, it’s something else. We could take the other symbols from a page of codes to fit the user’s language.
    Or we could let true mean that the character is larger, allowing us to transform all of Unicode to a format consisting of 8-bit parts.
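The second option is essentially how UTF-8 uses the top bit of each byte. A small sketch to check that, using Python's built-in encoder; the helper names `is_multibyte_part` and `flags` are made up for illustration:

```python
def is_multibyte_part(byte: int) -> bool:
    """True if the top bit is set, i.e. the byte belongs to a multi-byte character."""
    return byte & 0x80 != 0


def flags(ch: str) -> list[bool]:
    """Top-bit flag for every byte of the UTF-8 encoding of one character."""
    return [is_multibyte_part(b) for b in ch.encode("utf-8")]
```

Plain ASCII characters come out as a single byte with the flag clear, while every byte of a longer character has it set.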



  • It might also introduce spurious data dependencies

    Those need to be in the smallest cache or a register anyway. If they are in registers, a modern, instruction-reordering CPU will deal with that fine.

    to store a bit you now need to also read the old value of the byte that it’s in.

    Many architectures read the cache line on write-miss.

    The only cases I can see where byte-sized bools seem better are either using so few that they all fit in one cache line anyway (in which case the performance will be great either way), or repeatedly accessing a bitvector from multiple threads, in which case you should make sure that’s actually what you want to be doing.
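For reference, a rough sketch of the packed layout under discussion. The `BitVector` class is hypothetical, and Python only illustrates the logic, not the hardware behaviour. It makes the read-modify-write visible: storing one bit reads the old byte, changes one position, and writes the byte back.

```python
class BitVector:
    """Bools packed eight to a byte."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.data = bytearray((size + 7) // 8)  # round up to whole bytes

    def get(self, i: int) -> bool:
        return (self.data[i >> 3] >> (i & 7)) & 1 == 1

    def set(self, i: int, value: bool) -> None:
        old = self.data[i >> 3]  # read the byte the bit lives in
        mask = 1 << (i & 7)
        # modify one bit, then write the whole byte back
        self.data[i >> 3] = (old | mask) if value else (old & ~mask)
```

The read of `old` is the data dependency from the quoted objection; it is also why concurrent writers to neighbouring bits need extra care.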