

AI Systems as Limited Moral Agents

Thomas Larsen

February 27th, 2023



  • AI is becoming more prevalent

  • We currently lack a universal code of AI ethics

  • We can approach this subject by thinking of AI systems as limited moral agents

  • As an AI system becomes more advanced, the blame for its actions shifts from the developer, to the operator, and ultimately to the AI itself.

  • Focusing on blame only delays the need for a universal code of AI ethics


AI Becoming More Prevalent

Waiting in line at the local bakery, I noticed the young cashier sharing tidbits of his life with each customer as he rang them up. I overheard him describe his new pocket watch to an older gentleman and his sister’s new puppy to the woman in front of me, and when it was my turn he excitedly told me about his new favorite pastime: using a new program to mix and match characters and settings from his favorite comic book franchises. “ChatGPT?” I asked him. He seemed shocked that someone else knew of the tool, so I told him that I had been using it quite often, drafting simple responses to emails and fleshing out PowerPoint outlines at work, and writing holiday cards and absurd bedtime stories at home.


This may have been the first time my personal obsession with AI progress had come up organically in a public setting, and I was more than thrilled to see this young person getting excited about and engaging with this new technology (surely more thrilled than the customers behind me in line). But even though this interaction was extremely positive, it reinvigorated a deepening concern I have with the ongoing development of artificial intelligence.


Over the last few years, the pace of new AI development has seemingly grown exponentially. New language and text-to-image models are released weekly at this point, like the months-old ChatGPT mentioned above. Self-driving car technology is rapidly improving and showing up in large cities around the world via companies like Waymo and Zoox and systems like Tesla’s Autopilot. Facial recognition software is being used not just in our phones and computers but in many other settings as well, from private business security to governments around the world. Even in agriculture, the use of AI at varying levels is now nearly ubiquitous in the US.


Lacking a Universal Code of AI Ethics

With the growing use of AI across all industries, whether traditionally tech-heavy or not, a few distinctions need to be made about the dangers of AI in its current form, as well as some potential harms to be wary of as we move into the future. Two loose terms cover most of this topic: AI for good and AI for bad. AI for good is any AI system that is designed to generate, or has the effect of generating, a positive outcome for humanity. AI for bad is the opposite: any AI that is used, intentionally or unknowingly, in ways that negatively affect the future of humanity.


The distinction seems fairly straightforward on the surface, but this topic deserves a deeper dig. Nearly every good AI could be used for malicious deeds if effective safeguards are not in place. Facial recognition software has been in the news over the last few years, first for its bias against anyone who is not a white male. The bias was not intentionally built into its programming, but was the result of a lack of equal racial and gender representation in the places where these technologies were built and tested. As facial recognition software has improved over the last three or so years, a new issue has arisen: the overuse of facial recognition software in policing. Now, rather than the system itself being racist or sexist, a more refined system is being used in less than savory ways to track the movements of people in a variety of venues the world over. This technology, built for good (unlocking phones and computers, faster lines at the airport, and many other positive uses), is now also being used in ways that are an obvious attack on privacy. This is just one of many new technologies with a history of both poor development despite benevolent intentions and malevolent usage despite eventual proper development.


What can be done about this problem? Traditionally, companies create usage terms and agreements to prevent their technology from being used in ways that were not intended, but is this enough? And as these technologies become more and more powerful and begin to gain greater autonomy, will these terms be enough? Some companies have begun encoding their ethics into their systems. ChatGPT, for example, has safeguards in place that stop it from talking about certain topics or engaging in questionable forms of wordplay, even if the intentions of the user are harmless. OpenAI has decided that its most public model should try not to lie, impersonate a human, engage in harmful speech, give instructions on how to cause harm to others, and so on. Though these safeguards are extremely easy to circumvent, they are at least an attempt at protecting users from harm and false information. But is this enough? How do we decide what types of conversation are ethical for systems like this? And upon whose ethics should these safeguards be based?


Currently the world is full of ethical disagreements. In some countries it is unethical to strike a child at school as a form of discipline; in other countries it is commonplace and expected. In some countries, women have every right that a man has, while in other countries women’s rights are almost nonexistent. Even within a single country, the ethical codes followed from person to person can vary greatly. Is homosexuality acceptable? How about abortion? Eating meat? And these are just the most inflammatory disagreements our society currently faces. Is war ever ethical? Is representative democracy ethical? Is capitalism? Is any social safety net ethical? Is refusing to develop social safety nets ethical? How about universal healthcare, universal education, universal rights, and so on? Our society is so full of ethical dilemmas that we don’t often stop to think about what is truly right or wrong anymore, or whether something can be definitely, objectively, or even circumstantially right or wrong. How then could we ever imagine ourselves capable of creating a universal code of AI ethics if we can’t even agree on our own ethics as a species? And what would be the dangers of attempting to do so in this ethically tumultuous time?


One major danger we currently face is that the morals we either intentionally or unintentionally promote through the systems we create now will become permanent features of our society through a process called value lock-in. Premature value lock-in is perhaps one of the biggest threats to the success of our society’s future. An example of value lock-in from history is Europe’s “Dark Ages”. Certain ideas and values became locked in and spread throughout the entire continent, which inadvertently stopped intellectual and technical progress in its tracks. It took hard work and hundreds of years for these monolithic values to loosen their grip on the minds and hearts of the men and women of Europe. In some ways, we are still fighting the tail end of that intellectual battle. Now imagine a society that ingrains its flawed morality into an artificial intelligence that becomes the world’s moderator and regulator. How many centuries would have to pass before the stagnation of those new dark ages could wear off? Or would it take millennia or more?


Since it is obvious that there are no universally accepted answers to the moral questions above and countless others, we want to delay the prospect of value lock-in for as long as we can. As our values have developed to be more fair and free in the past century, so too may future generations further unlock freedoms and solve ethical dilemmas that currently hold us back. Until that time, we may just be safer keeping our values fluid.


AI Systems as Limited Moral Agents

If we know, then, that there is a problem, and we believe the values of our current society to be significantly lacking, and that we are potentially ignorant of unnoted inequities within it, how can we create this much-needed universal code of AI ethics? Upon whose ethics should this code be based? Since it should be obvious that I believe a universal code of AI ethics is not only impossible to create at this moment, but would also prove to be wildly shortsighted and unethical, it may also be obvious that I have something very different in mind when it comes to the ethical governing of AI today and into the near future.


First, think about the development of a new ethical foundation not from the perspective of what is right, or what is wrong, but by determining who or what could be seen as being right or wrong, or in other words, who could be blamed. In essence we determine exactly how we define a moral agent. For now, in the world of weak AI, it is arguable that there are no moral agents outside of humanity. Some moral agency is anthropomorphically assigned to the animals and technology we interact with as humans, but it is arguable that we humans are at least the current standard of moral agency, if not the only known moral agents. 


It is perhaps almost universally accepted that pets who do bad things have no moral blame in and of themselves, but rather, that blame is assigned to the pet owner, or to the genetic predispositions of that particular animal. If your dog maims the child next door, your dog may be punished (many laws would require it to be euthanized), but the legal responsibility for its actions falls squarely on you as the owner. 


When it comes to technology, we generally follow the same rules as with pets (when your technology does something bad, it is the fault of the person running it), but with an interesting and relevant twist. If, say, my brake lines burst as I hit my brakes to avoid a stopped car, an investigation is done to decide whose fault it was that the brake lines burst. If the car is old and hasn’t been properly serviced, the fault could fall on me. If the car is new and the company that built it made a mistake while manufacturing it, then the fault lands on them. If the company made that mistake by installing faulty parts but had issued a recall that I never heeded, then it may become my fault again.


Now, how about a car with an AI system that drives you around? Say you are in the same situation: you are coming to a stop, but the AI-driven car does not stop and hits the car in front of you. Who is at fault now? The car itself is obviously not yet considered a moral agent, and you were not the driver, so the only other entity to blame is the company that manufactured the car and/or supplies its AI system. And at this point, we have laws that would deal with the car company.
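For readers who think in code, the chain of reasoning in the brake line and self-driving examples can be written down as a small decision rule. This is only a rough sketch of the argument, not a legal test: the names `Party`, `CrashCase`, and `assign_blame` are made up for illustration, and any situation the examples do not cover is deliberately left undetermined.

```python
from dataclasses import dataclass
from enum import Enum


class Party(Enum):
    OWNER = "owner"
    MANUFACTURER = "manufacturer"
    UNDETERMINED = "undetermined"  # cases the essay's examples do not cover


@dataclass
class CrashCase:
    """Hypothetical description of a crash, using only the essay's factors."""
    ai_was_driving: bool = False        # the car's AI system was in control
    properly_serviced: bool = True      # the owner kept up with maintenance
    manufacturing_defect: bool = False  # a faulty part or assembly mistake
    recall_ignored: bool = False        # a recall was issued and never heeded


def assign_blame(case: CrashCase) -> Party:
    """Return who the essay's reasoning would hold responsible."""
    if case.ai_was_driving:
        # The car is not a moral agent and the passenger was not driving,
        # so blame falls on the maker of the car or its AI system.
        return Party.MANUFACTURER
    if case.manufacturing_defect:
        # A defect is the maker's fault unless the owner ignored a recall.
        return Party.OWNER if case.recall_ignored else Party.MANUFACTURER
    if not case.properly_serviced:
        # An old, neglected car puts the fault on the owner.
        return Party.OWNER
    return Party.UNDETERMINED


if __name__ == "__main__":
    print(assign_blame(CrashCase(manufacturing_defect=True)).value)
    print(assign_blame(CrashCase(ai_was_driving=True)).value)
```

Notice that the machine itself never appears as a possible answer. That gap is exactly what the rest of this essay is about.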


This general approach to AI ethics reduces the immediate need to encode any form of ethics into any system. It does, however, rely on an appropriate legal response to companies and individuals who use this technology for unethical or malignant purposes. Working within our existing legal system is a good start, but as new technology is developed, novel ethical situations will arise, necessitating new standards, laws, and regulations. We must therefore demand from legislators timely, responsible, and informed reactions to new technology.


The immediate pushback to this approach that comes to my mind is that it lifts from programmers and companies the responsibility for developing ethical systems, and in a way this is true. But if laws are in place that hold companies and developers responsible for the products they create, then the incentive to create products that do not cross these legal thresholds stands firmly in place. We, as humans, often do unethical things, but we tend to avoid illegal activities as much as we can in order to avoid the legal consequences of such actions. Businesses, and those who run them, are also incentivized to avoid illegal activities by the same fear of legal action taken against them.


Can AI be a moral agent?

Now, what happens when an AI system has developed enough to no longer be beholden to the will of its maker? It may be the case that a system need not be conscious to become fully autonomous. It may also turn out that a system need not be fully autonomous to be a moral agent. In this approach to AI ethics we must be able to determine when an AI system itself constitutes a moral agent (an entity that can be held to blame for its actions).


Laying aside for a moment the extremely intriguing debate surrounding free will and determinism in the context of AI development, and separating yourself from the debate surrounding objective vs subjective morality, as this argument ought to apply to either, think on what capabilities and characteristics an entity must obtain before it can be considered an accountable moral agent. From what we understand about our own moral agency, we would probably consider it to be contingent on a few traits we all share to varying degrees:

  • The ability to decide

  • The ability to act on its decisions

  • The ability to remember its past

  • The ability to envision its future

  • The ability to recognize its mistakes

  • The ability to self-correct

And perhaps to a lesser extent the following traits:

  • The ability to develop ambitions 

  • The ability to perceive social cues 

  • The ability to change over time

  • The ability to sense pain, discomfort, anger, frustration, distrust, embarrassment, shame, and so on in others (theory of mind)

  • The ability to sense pain, discomfort, anger, frustration, distrust, embarrassment, shame, and so on in itself (self-awareness)

Sam Harris, in his book “Free Will”, uses the example of a fatal shooting and then varies the shooter’s age, mental capacity, and other situational details at the time of the shooting to illustrate that the level of guilt associated with the same action varies greatly depending on many factors contained within the agent (the one doing the action). In those cases one may argue that each shooter is a limited moral agent. Each has the ability to take some level of blame for the action (except for the very young child in his examples, whose blame arguably lies solely with the father). Needless to say, each situation would be handled individually by a court of law, with far different outcomes for each, signifying that immoral acts are only immoral when carried out by moral agents. Limitations in any one of the attributes listed above lessen one’s status as a moral agent.


With the help of the sci-fi entertainment industry, one can easily imagine that any number of combinations of the above listed attributes embodied in a machine would render the machine more or less morally culpable for its own actions. Likewise, one can imagine that a machine with those capabilities could incur a range of punishments for a corresponding range of unsavory actions. Deciding on those punishments would also require some sure determination of the machine’s degree of autonomy.


What we actually want to know, then, is when exactly such a machine would be both subject to and entitled to a fair trial. Certainly one’s Tesla Autopilot cannot be brought into a courtroom and put on trial after a deadly accident, but the driver, or likely Tesla itself, could be, depending on the root cause of the accident. Would you take an image-generating AI like Stable Diffusion or DALL-E to court if it somehow produced obscene or pornographic material at the request of a child? Or would you punish ChatGPT if someone tricked it into teaching them how to build a bomb? Of course you wouldn’t, but at what point would you?


None of today’s AI is close to the level of autonomy needed to be considered a moral agent, and even once autonomy is achieved, we must determine whether this autonomous machine is capable of having a self. If the decisions of a machine are not its own, but rather strictly a product of its programming, does it truly have a self? If all of its decisions are made for it, does it truly feel like anything to be that machine? There are some great philosophical debates that dig deep into these topics, and I am sure I will dig into them myself a bit more in the future, but suffice it to say that there are some very major hurdles to clear in order to prove that a sufficiently complex machine can attain a sense of self and become a moral agent. But what steps are being taken now toward that not necessarily inevitable future point?


There are two projects that must be mentioned before digging into my primary example. The first is a research model known as Gato, which was released by the UK’s DeepMind lab in May of last year (2022). Gato was a proof of concept, testing whether the kind of model behind Large Language Models (LLMs) could be used for more than language processing. DeepMind trained Gato in much the same way they would an LLM, but with the intent of having it operate a robot arm, play video games, respond to text prompts, generate text, and more, all with the same model, the same “brain,” so to speak. The success of this model was limited due to its size, but it was indeed a proof of concept, and the hope was that, at scale, a single model could be trained as a generalist.


The second and more recent breakthrough is Google’s PaLM-E, released in March of this year (2023). PaLM-E is another multi-modal model like Gato, but a far more robust example of the potential capabilities of these types of AI. Google calls it an “embodied language model” and a “visual-language generalist”, since it was trained both on natural language, like GPT-3, and on image data, like DALL-E. By combining both kinds of training data, the system is able to interpret both language and visual input, and to respond in text, including step-by-step plans for a robot. In this way its visual capabilities enable it, when paired with a robot body, to respond to verbal commands with physical actions. The example given by Google is that of retrieving a bag of chips from a kitchen drawer and delivering it to the operator. This model is miles beyond Gato but was arguably built upon the proof of concept provided by that initial multi-modal AI model. These examples of embodied AI bring us far closer to the types of AI that may one day be considered moral agents.


But there is a third example, developed for a more specific task, that is perhaps a bit further along this road because of that task. Fully Autonomous Vehicles (FAVs), or self-driving cars, embody quite a few of the traits in our list of traits of moral agents. FAVs have the ability to make decisions, though only within the limits of their task. They have the ability to act on those decisions. Their ability to remember their past is extremely limited and less specific, since a vehicle’s past experiences are generalized into training data and shared across a network of connected FAVs. In an extremely limited sense, FAVs envision their future by plotting courses along roadways, continually scanning ahead of the vehicle for obstacles and hazards, and so on, but this is a far cry from the types of forward thinking moral agents are capable of. FAVs don’t necessarily recognize their mistakes; they simply gather data to contribute to the training data of other FAVs in their network. All in all, FAVs are perhaps the most fully developed AI systems of our day when it comes to autonomy, but they are still far from becoming true moral agents. How do we know this? Call it intuition, or rely upon the old adage, “you know it when you see it,” but conscious beings like us seem to have a sense about these things, and I generally think our intuition ought to be followed, so long as it aligns with known and observable facts. Is this then a reliable way to determine a system’s agency? No, but intuition may be the key to calibrating some future reliable system for doing so.


Even though AI systems do not yet have moral agency, AI may soon have some form of limited agency in its actions, responses, etc. to the stimuli we feed it. Existing systems are of course subject to their coding and the inherent biases and insufficiencies within, but they do, in a very limited sense, make “decisions” relative to their abilities and are therefore on the road toward extremely limited moral agency.


Advancing AI Means Shifting Blame

As companies and developers push the limits of what AI technology can do, we may one day find ourselves face to face with an engineered entity capable of moral deliberation. As technology advances toward that potential outcome, the blame for its actions will begin to shift. 


Technology as a tool is simply an extension of ourselves. All tools carry with them some potential for good or harm, and the way the tool is used is dependent on the user. A hammer, as a simple technology, can be extremely useful. It can also be quite deadly. The responsibility for the actions carried out with this tool falls solely on the shoulders of the user. Technology, for most of the history of humanity, has followed this pattern: a tool is neutral, and what it ends up doing is up to the doer.


As technology has developed, the potential for a tool to do harm on its own has become a real possibility. First you have inanimate objects, then you have types of technology that are complex enough that if they are manufactured poorly they have the potential to cause harm through malfunction (the brake line example from before). And now we have technology with limited amounts of autonomy. Self-driving cars are still tools, but they have the ability to make logic-based decisions according to their programming, and those decisions can cause harm. The harm they do though is much more akin to the mismanufactured technology than the deliberate harmful actions of a human misusing a tool. 


In some future form of technology, tools may have the ability to not just perform logical functions, but to deliberate and act on moral prompts. When a machine chooses the priority of its own actions by extrapolating specific moral insight from general moral principles, and acting on that insight, we have created a moral agent. At this point, the blame for this tool's actions must necessarily fall upon the tool itself. There may still be varying degrees of blame that fall on the developer and manufacturers of such technology, but there is at least some of the blame that ought to fall on the machine itself. Again though, this machine is still an extension of ourselves, but this time in a more philosophical manner.


The moral codes that a machine like this would rely upon are those upon which our society is built and governed. If we find the actions of a machine trained on the morality of our society to be immoral, it may mean that we ourselves are the root cause of that immorality. This concept is the main reason why delaying the development of a universal code of AI ethics is perhaps the best thing we can do at this time. But when technology does get to this point, it would be equally wrong for us to let each company and each developer decide what code of ethics their machines will train on. Hopefully our society will continue to develop into a freer and more universally ethical one, so that when we reach that point technologically we have a code of ethics worthy of being locked in by the technology itself.


Shifting Blame Only Delays the Need for a Code of Ethics

Eventually the need to agree on a universal code of AI ethics will become unavoidable. Knowing when to blame an AI and not its operator or developer only delays this need, because so many currently foreseeable applications of this technology will heavily affect the future of humanity in perpetuity. At some future point we will no longer have the power to influence with complete confidence the ethics and actions of these artificial moral agents, and by that time it will be too late. Even now, the way limited AI is used heavily affects the outcome of current and future world events, and the need for at least a preliminary code of AI ethics may already be too far delayed.


There are countless ongoing discussions about the future of AI and its acceptable uses, but the culture of today’s AI developers is to build what you can and launch before asking whether you should. And though most AI systems currently in use ought to be viewed as experiments, they are presented as products. This disregard for safety ought to be viewed as flippant, greedy, and irresponsible. But, in the grand scheme of things, these systems are relatively innocuous, especially when you compare them to Lethal Autonomous Weapons Systems (LAWS) and the US government’s most recent statement on their use. In short, the US government has taken the stance that it is open to using LAWS in many situations. While it does require that every system employed by the US military be made to “allow commanders and operators to exercise appropriate levels of human judgment over the use of force,” there is the noted caveat that “‘appropriate’ is a flexible term” and is therefore open to individual interpretation by whatever commanding officer happens to be in charge of the LAWS.


The need for at least a limited moral code for AI development cannot be overstated. With governments like that of the United States openly willing to pursue the use of autonomous killing machines, the regular willingness of policing agencies the world over to use AI systems to violate the privacy of their own citizens, and the unchecked development processes of large tech companies, we have more than enough reason to pursue a limited moral code for AI development now. We should have done it ten years ago, but we are where we are and can only move forward from here.


In summary, AI is a powerful tool that will only grow in its abilities and independence, and humanity will continue to grow more and more reliant upon it in practically every aspect of our lives. We need to understand whom to blame when AI messes up, without anthropomorphizing current technology, and yet we also need to know when AI actually is advanced enough that it should begin to be seen as a moral agent, shouldering some of the blame itself. We should absolutely strive for regular legislative checks and balances on the unhindered development of potentially harmful technology. We need, at the very least, a limited moral code for AI development to keep the worst of humanity’s moral traits from becoming locked into our technology for countless future generations. And finally, we should avoid locking in our values until necessary (perhaps after we’ve successfully granted equal rights to all citizens of the world, or settled some of the more inflammatory ongoing ethical debates) in order to keep ourselves from hindering ongoing human moral progress.


There’s no reason why the future of humanity can’t be bright and infinitely empowered by the technology we are developing today. Yes, we have a lot of work to do, a lot of hard questions to ask, answers to be found, and puzzles to be solved. Yes, we have a need for personal accountability in this new and developing field, and yes, we need to ensure that the world’s governments also remain accountable, but hard work is doable. Some grandparents plant trees for great-grandchildren they will never know, but we are planting the seeds of success for all future generations by committing to developing this powerful technology ethically and responsibly. I hope we can learn to use our collective voices, wallets, and votes to push developers, tech companies, and governments to think and act responsibly.
