• 0 Posts
  • 856 Comments
Joined 2 years ago
Cake day: June 16th, 2023

  • The issue here is that we’ve gone well into sharply exponential expenditure of resources for diminishing gains, there’s a lot of good theory predicting that the breakthroughs we have seen are about tapped out, and there’s no good way to anticipate when a further breakthrough might happen; it could be real soon or another few decades off.

    I anticipate a pull back of resources invested and a settling for some middle ground where the current state of the art is absolutely useful/good enough: mostly wrong, but very quick when it’s right, with relatively acceptable consequences for the mistakes. Perhaps society gets used to the sorts of things it will fail at and reduces how much time we spend trying to make LLMs play in that 70% wrong sort of use case.

    I see LLMs as replacing first line support, maybe escalating to a human when actual stakes arise for a call (issuing a warranty replacement, a usage scenario that actually has serious consequences, a customer demanding human escalation after recognizing they are falling through the AI cracks without the AI figuring out to escalate). I expect to rarely ever see “stock photography” used again. I expect animation to employ AI at least for backgrounds like “generic forest that no one is going to actively look at, but it must be plausibly forest”. I expect it to augment software developers, but not to enable a generic manager to code up whatever he might imagine. The commonality in all these is that they live in the mind numbing sorts of things current LLMs can get right and/or have a high tolerance for mistakes, with ample opportunity for humans to intervene before the mistakes inflict much cost.



  • I’ve found that as an ambient code completion facility it’s… interesting, but I don’t know if it’s useful or not…

    So on average, it’s totally wrong about 80% of the time, 19% of the time the first line or two is useful (either correct or close enough to fix), and 1% of the time it seems to actually fill in a substantial portion in a roughly acceptable way.

    It’s exceedingly frustrating and annoying, but I’m not sure I can call it a net loss in time.

    Reviewing each proposal for relevance, where to cut it off, and what to edit adds time to my workflow. Let’s say that on average for a given suggestion I will spend 5% more time determining whether to trash it, use it, or amend it versus not having a suggestion to evaluate in the first place. If the 20% useful time is 500% faster for those scenarios, then I come out ahead overall, though I’m annoyed 80% of the time. My guess as to whether a suggestion is even worth looking at has also improved: if I’m filling in a pretty boilerplate thing (e.g. taking some variables and starting to write out argument parsing), then it has a high chance of a substantial match. If I’m doing something even vaguely esoteric, I just ignore the suggestions popping up.

    However, the 20% is still a problem, since I’m maybe too lazy and complacent: spending 100 milliseconds glancing at one word that looks right in review will sometimes fail me, compared to spending 2-3 seconds typing that same word out by hand.

    That 20% success rate allowing for me to fix it up and dispose of most of it works for code completion, but prompt driven tasks seem to be so much worse for me that it is hard to imagine it to be better than the trouble it brings.
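
The back-of-envelope math in the comment above can be sketched in Python. This is only an illustration under stated assumptions: the function names are made up for the example, "500% faster" is read as a 5x speedup, the 5% review overhead is charged on every suggestion, and a rejected suggestion costs the normal hand-typing time.

```python
def expected_time(overhead=0.05, useful_rate=0.20, speedup=5.0):
    """Average time per completion, relative to typing everything by hand (1.0).

    Assumptions (not stated precisely in the comment):
    - overhead: extra review time paid on every suggestion
    - useful_rate: fraction of suggestions worth keeping or fixing up
    - speedup: how many times faster the useful cases are (5x here)
    """
    rejected = (1.0 - useful_rate) * 1.0        # trashed suggestion, typed by hand
    accepted = useful_rate * (1.0 / speedup)    # useful suggestion, finished faster
    return overhead + rejected + accepted


def break_even_overhead(useful_rate=0.20, speedup=5.0):
    """Largest review overhead at which suggestions still do not cost time."""
    return useful_rate * (1.0 - 1.0 / speedup)


if __name__ == "__main__":
    print(f"relative time with suggestions: {expected_time():.2f}")
    print(f"break-even review overhead:     {break_even_overhead():.2f}")
```

Under those assumptions the average comes out to about 0.89 of the baseline time, and the review overhead could climb to roughly 16% per suggestion before the completion stops paying for itself.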












  • To reinforce this, just had a meeting with a software executive who has no coding experience but is nearly certain he’s going to lay off nearly all his employees because the value is all in the requirements he manages and he can feed those to a prompt just as well as any human can.

    He does tutorial-fodder introductory applications and assumes all the work is that way. So he is confident that he will save the company a lot of money by laying off these obsolete computer guys and focusing on his “irreplaceable” insight. He’s convinced that all the negative feedback is just people trying to protect their jobs or people stubbornly not getting with new technology.




  • On the scientific discoveries, we have gotten the low hanging fruit. The twentieth century was remarkable, but the limitations of physics are harsh. A lot of excitement as we went from barely pulling off heavier than air flight to a moon landing in under 50 years. Media naturally imagined space exploration to be just a matter of time. Alas everything is exponentially harder and any further loopholes are supremely elusive.

    Probably the one area with a great deal of unrealized potential would be biology, because the ethical way forward is slow.


  • The thing is that while people struggle harder and harder for a smaller chunk of scraps, they still have a lot of quality of life improvements over the standard of living back in the 70s.

    You almost certainly have decent access to passable air conditioning, which was far from a given back then. Even if you can’t afford decent health care, the sporadic health care you can get is still better than the standard of care then. You can have a 60 inch television and more content provided to it than you could imagine… You can instantly engage with people all over the world.


  • Sea steading, BioShock here we come…

    But seriously, the fact that anyone ever mentions Mars colonization as a realistic strategy to do better than earth shows how stupid they are. Imagine the least habitable biome on earth where no one wants to live, imagine it made even worse by unchecked climate change, and realize it’s still just astonishingly easier to live there than the most optimistic expectations of Mars.


  • Yes, as common as that is, in the scheme of driving it is relatively anomalous.

    By hours in car, most of the time is spent on a freeway driving between two lines, either at cruising speed or in a traffic jam. The most mind numbing things for a human are pretty comfortably in the wheelhouse of automated driving.

    Once you are dealing with pedestrians, signs, intersections, etc., all of those, despite being ‘common’, are anomalous enough to be dramatically more tricky for these systems.