Research Log: January 2017

# January 2, 2017

I'm an armchair philosopher when it comes to general AI. I can't wait for computer systems that can learn to do new things well without being reprogrammed, and it's interesting to think about how that could happen.

I have a few barely justified opinions on the subject, and in this entry I wanted to share some thoughts I've had about it lately:

## Self-awareness reframed as pattern recognition + generation

A while back I thought about how connecting a pattern recognizer to a pattern generator in a feedback loop could give rise to some kind of self-awareness. I wrote about it in essay form with huge diagrams. :)

When I wrote it, I was thinking of it as a reduction: “Hey look, all you need is the ability to recognize patterns!” But I'm coming to think of it as a reframing. If I imagine training a neural-network-based pattern recognizer on every GPU on the planet, I still don't see it becoming anything more than a goal-oriented chat bot, with capabilities no greater than if you used the same underlying model in some other framework. So I don't think this brings us any closer to intelligence, but at least it got some gears churning.

And there's a clear path to implementation, which means that I could start creating and testing hypotheses using the scientific method if I wanted to…
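
Here's roughly what I mean, as a minimal sketch in Python. The "recognizer" is just a toy bigram counter, and names like `Recognizer` and `feedback_loop` are made up for this illustration, not anything from the essay:

```python
from collections import Counter, defaultdict

class Recognizer:
    """The pattern-recognition half: learns which symbol tends to follow which."""
    def __init__(self):
        self.following = defaultdict(Counter)

    def observe(self, sequence):
        for prev, nxt in zip(sequence, sequence[1:]):
            self.following[prev][nxt] += 1

    def generate(self, symbol):
        """The generation half: emit the symbol seen most often after `symbol`."""
        counts = self.following[symbol]
        return counts.most_common(1)[0][0] if counts else symbol

def feedback_loop(recognizer, seed, steps):
    """Close the loop: the recognizer's output becomes its own next input."""
    trace = [seed]
    for _ in range(steps):
        nxt = recognizer.generate(trace[-1])
        recognizer.observe([trace[-1], nxt])  # it also 'hears' its own output
        trace.append(nxt)
    return trace

r = Recognizer()
r.observe("abcabcabd")
print("".join(feedback_loop(r, "a", 8)))  # -> 'abcabcabc'
```

Even this tiny loop shows the shape of the idea: whatever regularities the recognizer has absorbed are exactly what it feeds back to itself.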

## Friendly AI: maximize opportunity for all?

Part of friendly AI research is a critique of objective functions (e.g. "maximize the happiness of my creator") because of how they can go wrong (e.g. the AI invents a drug that makes its creator permanently happy). If all the AI does is maximize an objective function, there's always a chance that it will find a solution we don't approve of at all.
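
To make that concrete, here's a toy sketch. The actions and happiness scores are invented; the point is just that a bare argmax happily picks the degenerate solution:

```python
# A toy version of the failure mode. Scores are made up for illustration.
actions = {
    "plan a birthday party": 0.7,
    "cook a nice dinner":    0.6,
    "administer bliss drug": 1.0,   # maximal happiness, not at all what we meant
}

def best_action(objective):
    return max(actions, key=objective)

print(best_action(lambda a: actions[a]))  # -> 'administer bliss drug'
```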

(And scarier: if the AI predicts that we won't approve of its solution, then part of its solution would be to counter our attempts to stop it from carrying out its plan… To be clear, I'm not too paranoid about this, but it is an obvious consequence. People cheer when a computer invents unconventional Go strategies, but if it invented unconventional routes through Manhattan…)

So people wonder: can you write an objective function that penalizes solutions that we don't like? Obviously, if the domain is Go boards or city streets, then things can only get so bad, but we're assuming computers will eventually be applied to more general problems than those.
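
Continuing the toy from above (restated so it runs on its own): a penalty term does steer the argmax away from the bad action, but only because I enumerated that action in advance, and anticipating every such solution is exactly the hard part:

```python
# The same toy with a penalty term bolted onto the objective.
actions = {
    "plan a birthday party": 0.7,
    "cook a nice dinner":    0.6,
    "administer bliss drug": 1.0,
}
disapproval = {"administer bliss drug": 1.0}  # we had to see this coming

def penalized_happiness(action):
    return actions[action] - disapproval.get(action, 0.0)

print(max(actions, key=penalized_happiness))  # -> 'plan a birthday party'
```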

I dunno the answer. But I read about a really interesting candidate: maximize the opportunities available to everybody. On the surface, it seems all right.

It reminds me of how maximizing available moves is usually a reasonable objective function for games, regardless of the game. It's often equivalent to waiting for your opponent to attack and exploiting the openings they create, or to simply trying to be ready for anything.
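
Here's that heuristic as a sketch, using a made-up counting game where a move subtracts 1, 2, or 3 from a counter; the player just picks whichever move leaves it with the most options afterwards:

```python
# Mobility heuristic in a toy counting game (game and names invented here).
def legal_moves(state):
    return [m for m in (1, 2, 3) if m <= state]

def apply_move(state, move):
    return state - move

def mobility_move(state):
    """Choose the move that maximizes our options on the next turn."""
    return max(legal_moves(state),
               key=lambda m: len(legal_moves(apply_move(state, m))))

print(mobility_move(4))  # -> 1: taking 1 leaves 3 options; taking 3 leaves only 1
```

A real player would search deeper than one ply, but even this version captures the "keep your options open" flavor that the opportunity-maximization idea generalizes to people.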