CEO at Redwood Research.
AI safety is a highly collaborative field--almost all the points I make were either explained to me by someone else, or developed in conversation with other people. I'm saying this here because it would feel repetitive to say "these ideas were developed in collaboration with various people" in all my comments, but I want to have it on the record that the ideas I present were almost entirely not developed by me in isolation.
Yes, I've had this experience many times, and I know of many other people it's happened to repeatedly.
Maybe the proliferation of dating apps means that it happens somewhat less than it used to, because when you meet up with someone from a dating app, there's a bit more common knowledge of mutual interest than there is when you're flirting in real life?
I think it's accurate to say that most Anthropic employees are abhorrently reckless about risks from AI (though my guess is that this isn't true of most people in senior leadership or working on Alignment Science, and I think a bigger fraction of staff are thoughtful about these risks at Anthropic than at other frontier AI companies). This is mostly because they're tech people, who are generally pretty irresponsible. I agree that Anthropic sort of acts like "surely we'll figure something out before anything catastrophic happens", and this is pretty scary.
I don't think that "AI will eventually pose grave risks that we currently don't know how to avert, and it's not obvious we'll ever know how to avert them" immediately implies "it is repugnant to ship SOTA tech", and I wish you spelled out that argument more.
I agree that it would be good if Anthropic staff (including those who identify as concerned about AI x-risk) were more honest and serious than the prevailing Anthropic groupthink wants them to be.
I think the LTFF is a pretty reasonable target for donations for donors who aren't that informed but trust people in this space.
To be clear, I think we at Redwood (and people at spiritually similar places like the AI Futures Project) do think about this kind of question (though I'd quibble about the importance of some of the specific questions you mention here).
Justis has been very helpful as a copy-editor for a bunch of Redwood content over the last 18 months!
I think that if you wanted to contribute maximally to a cure for aging (and let's ignore the possibility that AI changes the situation), it would probably make sense for you to have a lot of general knowledge. But that's substantially because you're personally good at and very motivated by being generally knowledgeable, and you'd end up in a weird niche where little of your contribution comes from actually pushing any of the technical frontiers. Most of the credit for solving aging will probably go to people who narrowly specialized in a particular domain; much of the rest will go to people who applied their general knowledge to improving the overall strategy or the allocation of effort among people working on curing aging (while leaving most of the technical contributions to specialists)--this latter strategy crucially relies on management and coordination, and on not being fully in the weeds everywhere.
Thanks for this post. Some thoughts:
This kind of idea has been discussed under the names "surrogate goals" and "safe Pareto improvements"; see here.
The classic setting is a party (a place where you meet potential romantic partners whom you don't already know (or whom you know only from professional settings where flirting is inappropriate), and where conversations are freely starting and ending, such that when you start talking to someone, the conversation might go for either two minutes or four hours).
Examples of hints:
In all cases, saying those things is more flirty if it was unnecessary for them to say. E.g., if they mention they're single because it came up in conversation in a way they couldn't have contrived, that's less flirty than if they tell a story specifically to bring it up.
I think that online content on all this stuff is often pretty accurate.