You mentioned the p(Doom) debate. I’m concerned that this debate may focus too much on the risk of extinction with AGI, without discussing the risk of extinction without AGI. For a proper risk assessment, that probability should also be estimated. I see the current p(Doom) as very high, assuming we make no changes to our current course. We are indeed making changes, but not fast enough. In this risk framing, AGI overall lowers the total risk, even if AGI itself carries a small extinction risk.
It’s a plausible story to me that we entered a potential extinction event a few hundred years ago when we started the Industrial Revolution. Our capability to affect the world has been expanding much faster than our ability to understand and control the consequences of our changes. If this divergence continues, we will crash. AI, and other new tools, give us the chance to make effective changes at the needed speed, and chart a safe course. The small AGI risk is worthwhile in the crisis we face.
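To make the comparison above concrete, here is a minimal sketch of the framing. Every number is a made-up placeholder, not an estimate from this comment or from anywhere else. The function name `total_risk_with_agi`, and the assumption that AGI removes a fixed fraction of the baseline risk, are hypothetical and only for illustration.

```python
def total_risk_with_agi(p_baseline: float, p_agi: float, mitigation: float) -> float:
    """Total extinction probability if AGI is built (illustrative model only).

    p_baseline: extinction risk on the current course without AGI (placeholder).
    p_agi: extinction risk contributed by AGI itself (placeholder).
    mitigation: assumed fraction of the baseline risk that AGI helps remove.
    """
    residual = p_baseline * (1.0 - mitigation)
    # Doom happens if AGI causes it, or if it does not and the residual risk does.
    return p_agi + (1.0 - p_agi) * residual


# Hypothetical inputs chosen only to show the shape of the argument.
p_baseline, p_agi, mitigation = 0.5, 0.05, 0.6
with_agi = total_risk_with_agi(p_baseline, p_agi, mitigation)
print(f"without AGI: {p_baseline:.2f}, with AGI: {with_agi:.2f}")
```

Under this toy model the argument only goes through when the assumed mitigation is large relative to the AGI risk. With `mitigation = 0`, the total risk with AGI is never lower than the baseline.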
Bernard, thanks for the very interesting inversion of the p(Doom) question! Please don't tire of mentioning it to people at the appropriate times; I hadn't thought of it, and I expect many others have not either.
I do want to note that whatever "the crisis" and "we" you had in mind when you wrote that cannot be inferred from the context. It seems unlikely to me that we agree on what *the* crisis is, or that we share the same concept of *we*.
This was an excellent, excellent retrospective on GPT-2 and the difficulties of arbitrarily "creating a power floor" in AI regulation.
The best idea is still to increase our knowledge, monitor the models, run evals, and understand how they work, so that come the right time we know enough to solve the problems they might cause!
Great insights on AI developments over the past five years! GPT-2's journey highlights the rapid evolution and challenges in AI research.
This is a very insightful article. As someone late to the party, I found it fascinating. One thing that I wonder about is this line: "we also think governments should consider expanding or commencing initiatives to more systematically monitor the societal impact and diffusion of AI technologies, and to measure the progression in the capabilities of such systems."
On the Heisenberg principle (sort of!), is there not a risk that the very act of government monitoring changes the outcome, and not for the better?
> I've found myself increasingly at odds with some of the ideas being thrown around in AI policy circles, like those relating to needing a license to develop AI systems; ones that seek to make it harder and more expensive for people to deploy large-scale open source AI models; shutting down AI development worldwide for some period of time; the creation of net-new government or state-level bureaucracies to create compliance barriers to deployment
Sane policies would be "like" those, but this doesn't represent any of the ideas well and doesn't provide any justification for them.
Frontier AI labs are locked in a race; locally, they have to continue regardless of risks; they publicly say that they should be regulated (while lobbying against any regulation in private).
As a lead investor of Anthropic puts it (https://twitter.com/liron/status/1656929936639430657), “I’ve not met anyone in AI labs who says the risk [from a large-scale AI experiment] is less than 1% of blowing up the planet”.
Pointing at complicated processes around nuclear safety to argue that we shouldn't give the governments the power to regulate this field seems kind of invalid in this context.
If the CEO and many employees of your company believe there's a 10-90% chance of your product or the product of your competitors killing everyone on the planet, it seems very reasonable for the governments to step in. It's much worse than developing a nuclear bomb in a lab in the center of a populated city.
Stopping frontier general AI training worldwide until we understand it to be safe is different from shutting down all AI development (including beneficial safe narrow AI systems) "for a period of time". Similarly, a sane idea with licenses wouldn't be about all AI applications; it'd be about a licensing mechanism specifically for technologies that the companies themselves believe might kill everyone.
Ideally, right now there should be a lot of effort focusing on helping the governments to have visibility into what's going on in AI, increasing their capability to develop threat models, and developing their capacity to have future regulation be effective (such as with compute governance measures like on-chip licensing mechanisms that'd allow controlling what GPUs can be used for if some uses are deemed existentially unsafe).
If all the scientists developing nuclear power plants at a lab estimated that there's a 10-90% chance that everyone will die in the next decades (probably as a result of a power plant developed), but wanted to race nonetheless because the closer you are to a working power plant, the more gold it already generates, and others are also racing, we wouldn't find it convincing if a blog post from a lab's cofounder and policy chief argued that it's better for all the labs to self-govern and not have the governments have any capacity to regulate, impose licenses, or stop any developments.
A totalitarian regime using AGI fear to knock down the last barriers to overseeing and censoring what everyone is doing at all times represents an end-state as permanent and undesirable as extinction.
No invasion of privacy is needed to monitor specialized AI GPUs in data-centers.
It's becoming clear that with all the brain and consciousness theories out there, the proof will be in the pudding. By this I mean, can any particular theory be used to create a human-adult-level conscious machine? My bet is on the late Gerald Edelman's Extended Theory of Neuronal Group Selection. The lead group in robotics based on this theory is the Neurorobotics Lab at UC Irvine. Dr. Edelman distinguished between primary consciousness, which came first in evolution and which humans share with other conscious animals, and higher-order consciousness, which came only to humans with the acquisition of language. A machine with only primary consciousness will probably have to come first.
What I find special about the TNGS is the Darwin series of automata created at the Neurosciences Institute by Dr. Edelman and his colleagues in the 1990s and 2000s. These machines perform in the real world, not in a restricted simulated world, and display convincing physical behavior indicative of higher psychological functions necessary for consciousness, such as perceptual categorization, memory, and learning. They are based on realistic models of the parts of the biological brain that the theory claims subserve these functions. The extended TNGS allows for the emergence of consciousness based only on further evolutionary development of the brain areas responsible for these functions, in a parsimonious way. No other research I've encountered is anywhere near as convincing.
I post because on almost every video and article about the brain and consciousness that I encounter, the attitude seems to be that we still know next to nothing about how the brain and consciousness work; that there's lots of data but no unifying theory. I believe the extended TNGS is that theory. My motivation is to keep that theory in front of the public. And obviously, I consider it the route to a truly conscious machine, primary and higher-order.
My advice to people who want to create a conscious machine is to seriously ground themselves in the extended TNGS and the Darwin automata first, and proceed from there, by applying to Jeff Krichmar's lab at UC Irvine, possibly. Dr. Edelman's roadmap to a conscious machine is at https://arxiv.org/abs/2105.10461
I read Nobelist Gerald Edelman's Neural Darwinism - the theory of neuronal group selection (TNGS) in 1990, borrowed from a high school friend's dad, chief engineer of a Texas Instruments fab. I reluctantly returned it, but got my own copy the following year. It's written at a level that only a very small percentage of even neuroscience professors could understand, and an even smaller percentage were willing to bother. The lengthy and erudite embryological biochemistry isn't comprehensible or useful, and the crucial cytoskeletal mechanisms of dendritic and synaptic plasticity are given short shrift, but Edelman had crucial insights about real neurons and neuronal groups that computer neural network researchers still ignore, such as, well, pretty much everything.
Real neurons are far more than summation and threshold. Brain tissue is not fungible, particularly between species. I recall reading a few years back about a startup putting human brain tissue cultures on the cloud for rental; mouse brain cultures had much lower performance, the startup's scientists said.
The theory and experimental method that the Darwin automata are based on is the way to a conscious machine. I'm sure much other research will be useful and pertinent in that endeavor.
I am grateful for your thoughtful writing.
I think there are more examples of governments reducing their power than is commonly believed, particularly in the US, where there's a robust libertarian movement within the Republican Party, and particularly within the judicial branch.
For instance, the creation of OIRA cost-effectiveness requirements represented a significant limit on the scope of the executive branch to regulate. I think people also forget that until the mid-1970s the US government fixed prices for airlines and trucking in a way completely unimaginable now, and similarly the UK had prices and incomes boards. There was also a round of financial deregulation in the latter half of the 20th century, most notably the repeal of the Glass-Steagall requirements for the separation of commercial and investment banking activities, and similarly huge financial deregulation in the UK in 1986.
There are many examples of the judiciary striking down regulatory actions, and more structurally decreasing the power of the administrative state, such as the recent decision overturning Chevron deference, the decision to strike down parts of the Consumer Financial Protection Bureau, and I think most importantly the development of the major questions doctrine. I think with Gorsuch on the court for the next 20 years, it's likely that the judiciary will act as a major constraint on the US administrative state.
I agree that the NRC should serve as a clear warning for the possibility of regulatory overreach, but I also think that because the rationality community specifically and the tech community more broadly are libertarian-leaning, there has been selective attention paid to examples of overregulation, and a claim that regulation always ratchets up that I have not seen shown in any systematic way.
Then let me show you!
https://mises.org/library/book/crisis-and-leviathan
"Higgs’s thesis is so compelling that it has become the dominant paradigm for understanding the so-called ratchet effect: government grows during crisis and then retrenches afterwards, but not to the same level as before."
All western governments in the age of demoncrazy are on iron rails of monotonically increasing laws, rules, mandates, debt, fiscal mismanagement, confiscation, taxation, counterfeiting, inspection, oversight, spying, censorship, regulation, interference, grift, fraud, rent-seeking and violation of individual rights.
The exceptions you cited only prove the rule.
The GPT series of models combines elements of pre-trained large models and generative AI, learning from vast amounts of unlabeled data and fine-tuning for specific tasks. GPT-2 introduced the concept of zero-shot learning, excelling in many downstream tasks without fine-tuning. GPT-3 advanced further with a larger model and more data, showcasing stronger language understanding and generation capabilities. Additionally, GPT-3 introduced "few-shot learning," allowing users to input downstream task samples directly into the model through prompts, enabling the model to learn new patterns and rules in context, known as in-context learning. However, the advent of GPT-3 raised concerns among small and medium enterprises about potential monopolies due to high training costs.
GPT-4 further enhanced model scale and parameter count, significantly boosting performance and accuracy. It not only optimized zero-shot and few-shot learning but also excelled in in-context learning, extending applications to fields like medicine and law.
Overall, the development of GPT models has introduced new approaches and methods to the NLP field, opening up new possibilities for language model advancements.
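As a minimal sketch of the few-shot prompting described in this comment: the task examples are simply written into the prompt text ahead of the new input, and no model weights are updated. The helper `build_few_shot_prompt`, the `Input:`/`Output:` labels, and the sentiment examples are hypothetical choices for illustration, not a format prescribed by any GPT model or API.

```python
from typing import List, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str]], query: str) -> str:
    """Format labeled examples plus a new input as a single prompt string."""
    blocks = [f"Input: {x}\nOutput: {y}" for x, y in examples]
    # The final block leaves the output empty for the model to complete.
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)


# Hypothetical sentiment-classification demonstrations.
examples = [
    ("The movie was wonderful.", "positive"),
    ("I want my money back.", "negative"),
]
prompt = build_few_shot_prompt(examples, "The plot dragged on forever.")
print(prompt)
```

Passing an empty `examples` list gives the zero-shot case mentioned for GPT-2: the prompt then contains only the new input.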