• 0 Posts
  • 4 Comments
Joined 2 years ago
Cake day: June 16th, 2023


  • “European leaders” operate under a permanent disadvantage because they have to agree among themselves to do anything. This leaves them unable to take the initiative geopolitically, and prone to taking whatever path of least resistance lies before them. The US and Russia have concluded that Europe will roll over and accept whatever it is presented with, after some angsty wailing, and unfortunately they are probably right. Not inviting Europe to talks is just a dominance move showing that they know the Europeans can’t do anything about it.

    Unfortunately for Europe, this is just the logical end point of their institutional arrangements. In a domain like geopolitics, where there are intelligent players looking for advantage, it is suicidal to turn off your ability to make decisions.


  • Dylan’s just being deliberately obtuse. DeepSeek developed a way to increase training efficiency and backed it up by quoting the training cost in terms of the market price of the GPU time. They didn’t include the cost of the rest of their datacenter, researcher salaries, etc., because why would you include those numbers when evaluating model training efficiency?

    The training efficiency improvement passes the sniff test based on the theory in their paper, and people have done back-of-the-envelope calculations that also agree with the outcome. There’s little reason to doubt it. In fact, people have made the opposite criticism: that none of DeepSeek’s optimizations are individually groundbreaking, and that all they did was “merely engineering” in the sense of putting a dozen or so known optimization ideas together.