Alternate account for @[email protected]

  • 76 Posts
  • 314 Comments
Joined 2 years ago
Cake day: July 3rd, 2023

  • in developing our reasoning models, we’ve optimized somewhat less for math and computer science competition problems, and instead shifted focus towards real-world tasks that better reflect how businesses actually use LLMs.

    I was just about to say how useless these benchmarks are. Plenty of LLMs claim to beat Claude and GPT-4, but in real-world use Claude and GPT-4 have always been more reliable, Claude especially. Good to hear they’re not just chasing scores.





  • simple@lemm.ee to Open Source@lemmy.ml · Proton's biased article on Deepseek
    3 months ago

    I understand it well. It’s still worth mentioning that you can run the distilled models on consumer hardware if you really care about privacy. 8GB+ of VRAM isn’t crazy, especially if you have a ton of unified memory on a MacBook or one of the Windows laptops releasing this year with 64+ GB of unified memory. There are also websites re-hosting various versions of Deepseek, like Hugging Face hosting the 32B model, which is good enough for most people.

    Instead, the article is written as if there were literally no way to use Deepseek privately, which is simply wrong.