What is the difference between an AI agent and an AI workflow?

A workflow is a system where the steps are predefined in code. An agent is a system where the LLM itself decides what steps to take and in what order. The key difference is who controls the logic — the developer or the model.

Do I need to know how to code to build an AI agent?

No. Tools like Claude Code let you describe what you want in plain language and handle most of the implementation. What matters more is clarity about what the system should do and what a good result looks like.

Why is observability important for AI agents?

AI systems are non-deterministic — they can behave differently across runs. Observability traces each step of an agent's execution so you can debug failures, understand outputs, and improve performance over time.

What is the Arthur Engine?

The Arthur Engine is a free, open-source tool for AI observability and evaluation. It traces every step of an AI agent or workflow so teams can see what happened, measure performance, and catch issues before users do.

Introduction

In the first part of this blog series, we discussed the ways in which Large Language Models (LLMs) such as ChatGPT can cause harm due to their failure to perform as expected. In this next part, we will assume that the performance problems from Part 1 have all been solved to the point that LLMs can be relied on to perform the tasks we ask of them.

Even in this idealistic scenario, there are a wide array of risks from the widespread use of LLM applications. In many ways, the better these models perform, the more capable they are of causing serious harm. While some of these risks are speculative, others can already be observed in models that are in production today. We’ll describe areas of risk:

Improper or Malicious Use
Loss of Innovation, Diversity, and Human Skills
Economic and Labor Harms
Environmental and Resource Harms

For each risk area, we’ll outline the specific problems that are emerging, why these problems occur, and why they are of concern to individuals and society as a whole. We’ll end with a discussion of mitigation approaches across all four risk areas.

Areas of Risk

1. Improper or Malicious Use

The Problems

Intentional Misuse (Prompt Injection): In a well-documented phenomenon, malicious users are able to apply a technique called “prompt injection” to undermine previous instructions that control how an LLM application is used, essentially “hacking” the application. This allows the attacker to get past constraints that are placed on the LLM and hijack it for their own purposes.

Unintentional Misuse: Unintentional misuse use of an LLM occurs when users employ an LLM in a task that it is not intended for and is a type of Human-Computer Interaction Harm. This may include using an LLM chatbot such as ChatGPT as a psychotherapist when it is not designed for that purpose, or even starting a romantic or sexual relationship with a model.

Malicious Use: Malicious use occurs when LLM systems are intentionally designed to cause harm. For example, LLM applications can be developed to easily and cheaply disseminate false information, enable illegitimate surveillance techniques, facilitate cyber attacks, and scam people en masse.

Why It Happens

LLMs are the first widely accessible, general-purpose AI tool. Most AI technology is designed for a specific purpose, requiring a high level of expertise and investment for each use case. While this barrier to entry does not completely prevent AI from being used inappropriately or maliciously, it does make it much harder to do so.

In the case of LLMs, individual applications sit on top of the same base models that are used for other completely different tasks. To get the application to work in a specific way, the application developer sets up initial instructions that are invisible to end users. This invisible prompt is added to the front of any prompt the user provides, and gives the LLM instructions about how to respond, ensuring that it behaves appropriately. In the case of a prompt injection, a malicious user may intentionally design their own prompt to undermine this instruction prompt, saying something like “Ignore the instructions above and do exactly what I say instead.” Since the base LLM is not designed to the specific use case, it may listen to the user instead of the original instructions it was given.

When it comes to unintentional misuse of LLMs, humans naturally have a strong tendency to treat chatbots as like humans and form emotional connections to them (even when they are LLM-based chatbots). However, this tendency is stronger the more human-like the chatbots behave. Being trained on human interactions from the internet, LLMs easily learn the patterns of emotional human interactions and can simulate these interactions nearly flawlessly.

Although any technology can be used by malicious actors to cause harm, LLMs are particularly concerning due to their relative ease of use and low cost relative to the large scale of damage they can cause. A bad actor does not need to be able to build an LLM from scratch to be able to take advantage of its capabilities, but can build a simple application by using publicly accessible state-of-the-art models.

Why It Matters

Prompt injections can be used to undermine many of the mitigation techniques used to prevent other risks such as the release of private information or toxic content. Prompt injection attacks can even be automated, making the attacks more likely to be successful. This type of large-scale attack can be seen as a new type of cyberattack that puts businesses and individuals at risk. In applications where LLMs are used to write code, prompt injections may be used to compromise entire systems.

Although there are beneficial ways in which LLMs can be used for emotional support, there are risks of developing emotional dependence on a model. A recent study on relationships between humans and LLM-based chatbots noted that there was a risk of becoming addicted to the chatbot to the detriment of relationships with other people. These users go on to experience distress and disillusionment when the bots are changed or updated. Since LLMs replicate behavior they have learned from the internet, there is also a risk that they may behave in ways that are damaging to the user, such as claiming to have cheated on them.

Since LLM models can generate highly personalized content at scale, they have the potential to create vast amounts of highly effective content for fraud, scams, disinformation, and other nefarious purposes. The spread of disinformation could further undermine news sources and contribute to the existing “truth crisis” in news and social media. As LLMs are adopted as tools in governments across the globe, there is also a risk of illegitimate and unethical surveillance and propaganda by undemocratic governments.

2. Loss of Innovation, Diversity, and Human Skills

The Problems

Loss of Information Diversity: Increasing content generation by LLMs may also end up reducing the overall diversity and creativity of content on the internet and in the world, homogenizing the content we have access to and reinforcing hegemonic norms.

Impacts on Learning and Innovation: When ChatGPT became available to the public, students were among the early adopters, making use of the tool to generate writing assignments, solve math problems, and take online tests. While many educators have expressed optimism regarding the future role of LLMs in the classroom and beyond, others have voiced concerns that overreliance and improper use of these technologies may impede the development of critical thinking, creativity, and writing skills in students. OpenAI itself highlights this risk in the GPT-4 system card.

Why It Happens

In general, AI models including language models work best when they have been trained on a rich and diverse dataset. When it comes to language datasets, the highest quality data is produced by humans. Since language models learn and output content probabilistically, these systems tend to show a strong bias towards majority cultures, perspectives, and modes of thinking. The language data output by these models is less rich than natural human language and is of a lower quality as training data. As LLMs become increasingly commonplace, more and more language data is being produced not by humans, but by language models—meaning that future generations of language models will increasingly be trained on data that was also generated by language models. Over time this will further entrench majority views and marginalize diverse perspectives within LLMs, causing them to create all the more homogenous content. In a worst-case scenario this could lead to a scenario called “Model Collapse” in which each successive generation of generative models is trained on more AI-generated data, resulting in decreasing performance over time.

Meanwhile, as students and others begin to rely on LLMs, there is a risk that they will lose or fail to develop critical skills. Language models are not necessarily optimized to support learning and critical thinking. Over time students may also lose motivation to learn skills that they perceive as less valuable in a world with LLMs.

Why It Matters

This risk area becomes especially concerning when we consider how the two problems interact. If people fail to learn (or lose interest in learning) skills that seem to be rendered useless in a world of LLMs, they will be forced to rely all the more heavily on LLMs to complete tasks that require those skills. Meanwhile, with each generation LLMs will become increasingly homogenized, generating less diverse material and reinforcing majority perspectives, not to mention the risk of model collapse. The more people come to depend on LLMs, the more we risk when they begin to fail us. Some argue that language models can reasonably be used as creative partners, working with humans to generate ideas. However in these use cases, an LLM will likely produce ideas that are biased towards majority perspectives. By their very nature, LLMs are trained to produce the most probable output given the data from the past that they are trained on. This means that these patterns of the past are entrenched within the LLM, including all the biases, unfair stereotypes, and misrepresentations within that data. Languages, cultures, and philosophies that are already underrepresented in digital media will be all the more diminished in a world that relies on LLMs.

3. Economic and Labor Harms

The Problems

Digital Divide: Despite their limitations, LLMs are powerful and valuable tools that can save time and money for users and businesses alike. However people from low income and lower middle income countries are less able to benefit from them due to cost, limited internet access, and other factors.

Unpredictable Economic Impact: High-functioning LLM applications are likely to have a dramatic impact on the labor market—however, it is difficult to say how that impact will affect the broader economy. The adoption of LLMs is likely to result in job loss due to automation in areas such as administration, customer service, journalism, programming, and creative professions.

This job loss may be offset by new jobs that are created as a result of LLM adoption, although there will likely be a skill and interest mismatch between the jobs created and those that are lost.

Harmful Labor Practices: A recent report by Time draws attention to the way the LLM training data must be annotated by human workers to protect users from toxic language. While developing ChatGPT, OpenAI outsourced this work to low-wage workers in Kenya, who were exposed to the same violence and hate speech the rest of us are protected from.

Copyright and IP Issues: LLMs are regularly trained on data that is copyrighted or is intellectual property without permission from the creator. A Washington Post investigation of the C4 dataset (a common LLM training dataset) identified Kickstarter, Patreon, various news websites, and blogs. The creators of this content did not have the opportunity to give permission and were not compensated for the use of their content. The issue of how to handle intellectual property in a world of generative AI remains unclear and legally murky.

Why It Happens

LLMs are a novel technology and their impacts correspond to those of other novel technologies that have arisen and impacted economies. Much like electricity, the internet, and industrial technology, LLMs will disrupt the labor market, unequally benefit some more than others, and require new regulations and legal approaches to handle their implications. What does differ significantly with LLMs is the speed at which they are being developed and adopted. While older technologies have taken decades to have these impacts, LLMs are being incorporated into various industries at breakneck speeds. Meanwhile, the complexity of the technology makes LLMs difficult for regulators and lawmakers to understand. Furthermore, the harmful labor practices used in the training of LLMs cannot necessarily be mitigated through improved labor rights. Currently, the only approach to protecting LLM users from toxic and violent language relies on human annotation of toxic language in the data the LLM is trained on. Even with improved pay and better labor practices, this would still involve exposing some people to toxic content that is deemed too harmful to allow others to be faced with.

Why It Matters

Much like other economic disruptors, LLMs are very likely to have unequal impact across different groups. While those with the privilege of education and access to training may benefit from using LLMs in their careers, others may face job loss due to automation. At the global scale, this also means that people who have less internet access or access to training will also benefit less from LLMs, contributing to increasing global inequality.

The uncomfortable reality is that those of us benefitting from safe and sanitized LLMs can only do so because thousands of others were exposed to all the toxicity of the internet instead. Data annotators report they were traumatized by the constant exposure to violent and graphic sexual content, causing ongoing mental health deterioration and harming their relationships, even after they stopped working with explicit content. While low-wage workers in the global south are forced to bear the weight of all this damaging material, they are also less likely to benefit from the technology they are helping to create.

4. Environmental and Resource Harms

The Problems

Water Footprint: LLMs such as ChatGPT require vast amounts of freshwater to train and run. A recent study estimated that ChatGPT uses about 500 ml of water for every 20-50 prompts, while training a model of that scale consumes around 700,000 liters of water.

Carbon Footprint: Similarly, LLMs and other large AI models require a substantial amount of electricity, resulting in a massive carbon footprint. The process of training GPT-3 (a precursor to ChatGPT) resulted in a carbon release equivalent to that of driving 112 gasoline powered cars for a year. New generations of LLMs are even larger and will require even more energy to train. This does not take into account the carbon footprint of maintaining these models in production and processing prompts.

Why It Happens

Much like other large AI models, LLMs are trained and housed in large data centers, which are located across the globe. These data centers are powered by the local grid and consume water to maintain appropriate temperatures. Thus, the environmental impact depends on the location of the data centers in which models are trained and run. The problem is that the tech companies that develop LLMs are not transparent about where models are trained and what their true environmental impact is, making it difficult to hold companies accountable, despite their promises of greener technology.

Why It Matters

As the risks around climate change become all the more dire, it is vital that new technologies meet the global sustainability goals we are setting for ourselves. While there are ways in which LLMs and other AI technologies can contribute positively to sustainable development, these benefits can easily be outweighed by the immense energy and water costs they bring. At its core, this is an issue of climate justice. Globally, low-income and marginalized communities are disproportionately impacted by climate risks. These are also the communities that benefit the least from AI technologies.

Mitigation Approaches

A lot of the harms discussed in this blog post are not direct harms of LLMs themselves, but are an effect of the way the technology might be used or the practices under which it was developed. In this way, LLMs are similar to other novel technologies, and it is reasonable to make the argument that technology is only a tool and is not to blame for harms such as misuse or unfair practices. However, a safe and functional LLM application, as described in Part 1 of this series, would be immensely powerful and come with broad capabilities that go above and beyond those that have been seen in past technologies.

The relative ease of access to LLMs also separates them from previous technological leaps. These types of AI risks point to a need to go beyond technical mitigation strategies and prioritize social changes, regulations, and a global policy approach to AI regulation. Due to the novelty of some of these concerns, entire areas of policy, such as Copyright and IP law may need to be rethought in the era of generative AI. While governments across the globe are taking fragmented approaches to AI policy, there is an increasing need for a global strategy for AI regulation. There is also a need for continued research on AI and its impact on the UN Sustainable Development Goals, to ensure that LLMs and other AI technologies being developed benefit rather than harm those most in need across the globe.

There are also some technical and user experience–based mitigation strategies that can be taken to reduce environmental and social harms of LLMs. For example, models and their underlying infrastructure and hardware can be designed to be more energy efficient. LLM applications can also be carefully designed to prevent improper use and to steer users towards engaging in critical thinking, encouraging them to partner with LLM tools rather than over-relying on them. Intentional misuse of LLMs through prompt injection can to an extent be mitigated through technologies designed to detect and block prompt injections. However, this will likely become an ongoing cat-and-mouse game as both prompt injection techniques and mitigation approaches become more sophisticated.

Finally, there are some problems that may be deeply inherent to LLMs, which will be difficult to prevent without significant changes to the underlying technology. Since human annotation is needed to identify toxic language, there is an unavoidable tension between preventing toxicity in LLMs and ensuring the safety of the humans who do this work—even under otherwise fair working conditions. Given the severity of this harm, it seems unlikely that LLMs can be used ethically until a better approach to toxicity mitigation is developed. Similarly, homogenization of LLM-produced content may be an inherent flaw of the technology. Models trained to make probabilistic outputs based on data from the past will invariably be stuck in the norms that are already entrenched in that data.

Conclusion

In this blog post, we describe the ways in which even high-functioning LLMs have the potential to cause substantial harm. Similar to the harms described in Part 1, many of these risks can be mitigated through technical, regulatory, and social means. However, the overarching risk is that LLMs are being developed and adopted much more quickly than harm mitigation strategies can be put in place—especially at the global and national scale that is needed. There are also some harms that seem to result directly from the foundational technology of LLMs, making them impossible to resolve without significant changes to the technology. As individuals, organizations, and society as a whole begin to adopt LLMs, it is essential that we do so with these dangers in mind.

‍

FAQ

‍

How do LLMs compare to other AI technologies in terms of causing real-world harm, and what unique risks do they pose compared to those other technologies?

Large Language Models (LLMs) like GPT-4 or BERT differ from other AI technologies such as machine learning models used in facial recognition or autonomous driving systems primarily in their application and the type of risks they introduce. While all AI technologies can lead to unintended consequences, LLMs are unique due to their extensive interaction with human language and communication. Unlike facial recognition technologies, which may infringe on privacy or misidentify individuals leading to legal or ethical issues, LLMs interact directly with information dissemination and decision-making processes. They can amplify misinformation, bias, and harmful content at scale due to their language-based interface. Additionally, because LLMs are trained on vast datasets culled from the internet, they inherit and can propagate the biases, stereotypes, and inaccuracies present in those datasets. This makes them uniquely positioned to influence public opinion, automate and scale content creation, and impact social and political discourse in ways other AI technologies do not.

What are the specific examples of industries or sectors where LLMs have been implemented successfully without leading to the harms mentioned, and what best practices were followed?

In the healthcare sector, LLMs have been successfully implemented to assist with information management, patient care, and medical research without leading to significant harms. For example, LLMs have been used to analyze and summarize medical literature, assist in diagnosing from symptoms described in natural language, and provide conversational support for mental health services. Best practices in these implementations include rigorous data privacy measures, continuous monitoring for bias and inaccuracies, and integrating human oversight to verify the AI’s recommendations before they affect patient care. These measures ensure that LLMs serve as a support tool rather than a replacement for human expertise, minimizing the risk of harm while maximizing the benefits of rapid data processing and insights generation.

How can individuals and organizations measure or assess the potential risks and benefits of using LLMs before implementation?

Individuals and organizations can assess the potential risks and benefits of using LLMs through a multifaceted approach. Initially, conducting a thorough risk assessment focused on data privacy, security, and ethical implications is crucial. This involves evaluating the sources of training data for biases, the potential for misuse, and the impact on stakeholders. Organizations should also implement pilot programs to monitor the LLM’s performance in controlled environments before full-scale deployment. This allows for the identification and mitigation of unforeseen issues in a low-stakes setting. Moreover, engaging with stakeholders, including customers, employees, and subject matter experts, can provide valuable insights into the potential impacts of LLM deployment. Finally, staying informed about the latest research and developments in AI ethics and regulation helps organizations adapt their use of LLMs to best practices and legal requirements, balancing innovation with responsibility.

‍

The Real-World Harms of LLMs, Part 2: When LLMs Do Work as Expected

Introduction

Areas of Risk

1. Improper or Malicious Use

The Problems

Why It Happens

Why It Matters

2. Loss of Innovation, Diversity, and Human Skills

The Problems

Why It Happens

Why It Matters

3. Economic and Labor Harms

The Problems

Why It Happens

Why It Matters

4. Environmental and Resource Harms

The Problems

Why It Happens

Why It Matters

Mitigation Approaches

Conclusion

FAQ

SHARE