• 1 Post
  • 266 Comments
Joined 2 years ago
Cake day: June 6th, 2023

  • cd ~/repos/work-project27
    git checkout dev
    git branch new_feature
    ### code for a few hours, close laptop, go to sleep, next morning
    git checkout dev
    ### code for a few more hours, close laptop, go to sleep, next morning
    ## "oh fuck, I already implemented this in new_feature but differently"
    git checkout dev
    git diff new_feature
    ## "oh no. oh no no no. oh fuck. I can't merge any of this upstream and my history is borked."
    git clone git@workhub:work/work-project work-project28
    cd ~/repos/work-project28
    








  • tetris11@lemmy.ml to Programmer Humor@lemmy.ml · DOGE employee
    2 months ago

    I have to admit, PDF parsing being such a hot and profitable topic in computer science was really something I never saw coming.

    PDFs? The things you can select text from? And when you can't, there's decent OCR? And when that fails, you just ask the person to send you an email or a Word doc?

    It sounds like LLM developers are looking for a new, unpolluted source of historical data to train on, and that source exists in the form of old scanned-in paper documents. That's the only reason I can fathom as to why this is such a big thing now.