Child Welfare Intelligence? Opportunities and Challenges of Artificial Intelligence Tools in the Child Protective Services Arena

by Lauri Goldkind, PhD & Adrienne Holmes, MSW Candidate

Child protective services and artificial intelligence (AI)? Can high-powered digital tools support families and children? We think they can, when used with care and forethought. No commissioner or supervisor wants to read about a career-ending incident caused by a computer, and the best way to avoid digital mishaps and maximize the potential of AI tools is to (1) be clear about AI use policies and (2) offer staff professional development opportunities to try AI tools out and think critically about AI use.

Generative AI tools like ChatGPT, Gemini, and Claude, known as large language models (LLMs), may feel new, but computer scientists have been working on building computer “intelligence” since the 1950s and 60s. These systems are “trained” on masses of data: think all of the works of Shakespeare, plus all of Wikipedia, Reddit, and the New York Times, and more. In corporate and commercial fields such as retail operations, human resources, and sales and marketing, artificial intelligence has been used for decades. In child and family protection agencies, however, AI systems have not been as widely used, for all the reasons we can imagine. AI systems have been accused of making biased decisions, of accelerating existing structural inequalities, and of “hallucinating,” or delivering wrong answers. In our settings, where families’ and children’s safety and security is on the line, caution in adopting AI tools is appropriate and necessary.

The biases baked into predictive systems are well documented, and newer AI tools like ChatGPT also pose risks to workers and the families they serve. At the same time, the newer LLMs, which can process and output text, audio, images, and video, offer significant potential for improving the internal operations of child welfare agencies and systems, provided they are deployed with great care and internal oversight. This article describes what AI is, offers examples of potential AI use cases as well as unsuccessful AI deployments, and ends with suggestions for AI governance in child protection. Proactive AI use policies and practice guidelines can help leadership and staff safely manage the use of LLMs and other AI tools, in service of protecting workers and families.

What is AI?

[Photo: a cell phone displaying the ChatGPT and Google Gemini AI apps.]

AI is not one specific digital tool; it is a range of computational processing tools encompassing everything from computer vision (think self-driving cars) to sensing technologies (think automatic doors) to LLMs (think ChatGPT, among others). These tools use massive amounts of data of all kinds, including videos, text, photos, and songs, to make probabilistic decisions about what comes next.

Traditional or “narrow” AI uses big sets of existing data to make predictions about what will happen in the future. These kinds of systems have been used in child and family protection systems for the last ten years, most famously in Allegheny County, PA. Newer generative AI tools, like LLMs, are designed to create new data. Predictive analytic models are designed to be reliable, returning the same response every time the model is run; generative AI models are designed to produce new responses, so it would not be surprising for a generative tool to write a different case note every time it was given the same data. LLMs can also make mistakes. “Hallucination” is the deceptively cute name for the wrong answers an LLM can deliver. LLMs work by probabilistically linking sequences of words together.

Sometimes, a model links the wrong set of words together. For example, a chatbot released by the New York City Mayor’s Office gave out false information about the City’s small business regulations (Offenhartz, 2024).
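To make that last point concrete, here is a toy sketch in Python, our own illustration and far simpler than any real model. It picks each next word at random, weighted by how often words followed one another in some imagined training text. Because the choice is probabilistic, the same starting word can produce a different phrase on every run, which is exactly why a generative tool might produce a different case note from the same facts.

```python
import random

# A toy "language model": for each word, the words that might follow it,
# weighted by how often the pairs appeared in our imagined training text.
next_words = {
    "the":    [("family", 0.5), ("case", 0.3), ("worker", 0.2)],
    "family": [("visit", 0.6), ("plan", 0.4)],
    "case":   [("note", 0.7), ("plan", 0.3)],
}

def generate(start, length=3):
    """Build a phrase by repeatedly sampling a likely next word."""
    phrase = [start]
    for _ in range(length):
        options = next_words.get(phrase[-1])
        if not options:
            break
        words, weights = zip(*options)
        # Sampling by weight means repeated runs can yield different phrases.
        phrase.append(random.choices(words, weights=weights)[0])
    return " ".join(phrase)

print(generate("the"))  # e.g., "the family visit" or "the case note"
```

A real LLM does the same kind of weighted guessing over hundreds of thousands of word fragments at once, which is also how a plausible-sounding but wrong answer, a hallucination, can slip out.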

AI in Your Agency?

Generative AI and LLMs are so new that there are no evidence-based practices, and the technologies are changing all the time. There are, however, promising practices and policy guidance to help agencies assess and guide staff on AI use, and leaders and supervisors are wise to get ahead of its use and application. Units where AI may be right for an agency include compliance and evaluation offices; for example, an LLM can draft “low stakes” documents such as executive summaries, requests for proposals, and impact reports in seconds. In the marketing and outreach arena, communications staff might use generative AI to draft newsletter articles, social media posts, program descriptions, commissioners’ remarks and speeches, and press releases. Another application where AI has significant potential is training and staff development: even a free LLM can create instructional materials, conduct role plays, build PowerPoint decks, summarize academic materials, and engage caseworkers in new ways.
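To make “drafting in seconds” concrete, here is a minimal sketch of what that might look like in code. It assumes the openai Python package and an API account; the model name is illustrative, and any comparable LLM service works similarly. Note that the prompt contains no client data, keeping the task squarely in “low stakes” territory.

```python
from openai import OpenAI  # pip install openai; requires an API key

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

prompt = (
    "Draft a one-page executive summary of a family preservation program "
    "for a county child welfare agency. Audience: county commissioners. "
    "Tone: plain and factual. Do not invent statistics or citations."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name; substitute your own
    messages=[{"role": "user", "content": prompt}],
)

# The result is a first draft only; a human reviews and fact-checks it.
print(response.choices[0].message.content)
```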

For most of us, digital literacy, and AI literacy in particular, was not part of our formal educational training. There is a great opportunity for staff and leadership to learn together about what AI can do, what it shouldn’t be doing, and how to think constructively about adopting AI tools. Offering staff guidance on AI use and its limitations, so everyone understands how a question should be developed (“prompt engineering”), what response to anticipate, and how to evaluate a response’s appropriateness and accuracy, can help mitigate potential risks. Many of us in the academic community are eager to help our local partners learn about these tools and to provide resources.
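What does a well-developed question look like? The hypothetical prompts below, examples of our own, show the difference between a vague request and a structured one that names the role, audience, format, and limits; the second is far easier to evaluate for appropriateness and accuracy.

```python
# A vague prompt invites a vague (and harder-to-check) answer:
vague_prompt = "Tell me about motivational interviewing."

# A structured prompt states role, task, audience, format, and limits:
structured_prompt = (
    "You are a trainer for child welfare caseworkers. "
    "Write a 30-minute lesson outline on motivational interviewing "
    "for new hotline staff. Use plain language, include two role-play "
    "scenarios, and flag any claim that should be verified against "
    "the published literature."
)
```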

AI Pitfalls, Potholes and Problems

All AIs are designed to synthesize massive amounts of data toward solving a problem. Past implementations include taking twenty years of hotline call data and using it to predict, on the spot, whether a call should be screened in or out of the system, or using masses of case records to estimate the risk of possible abuse in seconds. While these algorithmic strategies have the potential to avoid human biases in hotline safety screenings, we also know that historic decision making in child welfare settings has been tainted by human biases, including racism and bias against poor families, leading to scenarios where Black and Brown families were screened into the system at twice the rate of white families (Emanuel et al., 2023). If we use these historically biased data sets to make future predictions, the predictions will carry the old biases forward. In 2017, for example, Illinois stopped using a predictive analytics tool that scored risk to triage its hotline calls. The system, called Rapid Safety Feedback, was meant to assist hotline operators with decisions on screening families into or out of a child welfare case. Instead of serving as a support, the system made faulty decisions, screening in low-risk families and missing many high-risk cases (Cournoyer, 2021).

Most recently, we’ve been hearing all about LLMs, and maybe ChatGPT specifically, which uses advanced computing power to create text, images, and audio. One of the biggest innovations in these LLM systems is how they respond to a “regular question”: enter some basic facts into ChatGPT or Gemini, and in ten seconds it can give you a curriculum for professional development on motivational interviewing, including suggested readings. For workers unfamiliar with AI tools, these models can seem like magic. But the mistakes they can make come with a high price tag. In 2024, a caseworker in Australia was fired for using ChatGPT to write case summaries; the worker had entered personally identifiable information (PII) into the chatbot, disclosing important and identifying details of the case to a proprietary software system (Taylor, 2024). The companies that build and support these models are a billion-dollar industry, and they train their AI tools with data. Yes, the models get smarter the more we use them, but that is because they can take the information we enter and learn from it.
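One practical guardrail against exactly this kind of disclosure is a screening step before any text is pasted into a public chatbot. The Python sketch below is a minimal illustration of the idea with a few made-up patterns; real PII takes many forms (names, addresses, case numbers) that no short pattern list will catch, so a check like this supplements, and never replaces, policy and training.

```python
import re

# Illustrative patterns only; a real list would be set by your agency
# and would still never catch every form of PII.
PII_PATTERNS = {
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.\w+\b"),
}

def flag_pii(text):
    """List the kinds of obvious PII found in text before it is sent
    to a chatbot; an empty result is NOT proof the text is safe."""
    return [label for label, pattern in PII_PATTERNS.items()
            if pattern.search(text)]

note = "Mother can be reached at 555-867-5309 after 5pm."
print(flag_pii(note))  # ['phone'] -- stop and remove before sending
```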

AI Governance and Policy

While there are real risks associated with making use of generative AI, the upside potential is undeniable, especially for those of us in underfunded and staff-strapped agencies. Using an LLM like a very quick intern can be a huge boost to productivity and efficiency, making time for the human work staff should be focusing on. Lower-stakes tasks, meaning those that don’t involve client data or personally identifiable information (called “PII” in the AI world), can be automated with oversight and guardrails. Training and staff development are a good place to start if your agency is thinking about how to approach artificial intelligence. Bringing staff together to learn what AI is and how it could be applied in practice can help demystify these new tools, celebrate people who are already making use of them in their workflows, and start a conversation about where these systems could make sense for the agency.

Once your agency has a basic level of AI literacy, an AI use policy is a concrete next step. Establishing a policy for how and when workers can use generative AI models, covering that policy in training, and communicating it regularly will support workers and make clear what the limits of AI use are in your setting. Policy guidance and model policies for AI in human services offer agencies tangible support for using AI tools in practice. Good resources include the Trustworthy AI (TAI) Playbook from the U.S. Department of Health & Human Services (2021), UNICEF’s Policy Guidance on AI for Children (2021), and the recent AI Plan for State and Local Governments (HHS, 2024). AI can be infused safely, ethically, and responsibly into the complicated systems of child welfare agencies. We look forward to hearing how it goes.

Lauri Goldkind, PhD, is a professor of social work at the Graduate School of Social Service, Fordham University, New York, NY.

Adrienne Holmes is an MSW student at the Graduate School of Social Service, Fordham University, New York, NY.