AI Risk Assessment in Policing: The HART System and Concerns Over Algorithmic Bias

An overview of how the UK police used the HART algorithm to assess reoffending risk, why postcode data raised concerns, and what this case reveals about AI decision-making in law enforcement.

An algorithm designed to help UK police make custody-related decisions has recently been adjusted due to concerns that it could discriminate against citizens from poorer areas.

Looking back at how this algorithm has been used, one important issue becomes clear: its predictions can differ greatly from those made by human officers.

1. What Is the HART System?

Over the past five years, Durham Constabulary and computer scientists have been developing the Harm Assessment Risk Tool, known as HART.

HART is an artificial intelligence system designed to predict whether a suspect has a low, moderate, or high probability of reoffending within two years.

HART is one of the first algorithmic systems adopted by UK police forces. It does not directly decide whether a suspect should be detained. Instead, it is intended to help police decide whether a person should be referred to a rehabilitation program called Checkpoint.

The Checkpoint program is designed to intervene in legal proceedings and keep some suspects from being sent to court.

2. Why Postcode Data Became Controversial

The HART system uses 34 personal data points to assess the level of criminal risk. These include factors such as:

  • Age
  • Gender
  • Criminal history
  • Postcode information

However, police are now removing key postcode-related fields from the AI system. For example, they are deleting the first four characters of Durham postcodes.

A draft academic paper published in September 2017 reviewed the use of the algorithm and stated:

“HART is currently undergoing a data update, with the aim of removing one of the two postcode-related predictors.”

One of the co-authors of the paper was a member of the police force.

Andrew Wooff, a criminal justice expert at Edinburgh Napier University, expressed concern about using the first few characters of postcodes as predictive indicators.

He said that using geographical and socio-demographic information as predictors could deepen bias in policing decisions and the justice system.

If a system predicts that a certain postcode area has a higher risk of crime, and the police act on that prediction, it may reinforce and amplify that perception.
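
The reinforcement effect described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the two areas, the patrol numbers, and the 60/40 allocation policy are all hypothetical and are not taken from HART or from Durham Constabulary data. Both areas are given the same underlying offence rate, so any growing gap in the records comes purely from where patrols are sent.

```python
def simulate_feedback(recorded, rounds, patrols_per_round=100,
                      hits_per_patrol=0.1, top_share=0.6):
    """Return recorded-offence counts per area after each round.

    Both areas are assumed to have the SAME underlying offence rate, so
    hits_per_patrol is identical everywhere. Only patrol allocation differs.
    """
    counts = dict(recorded)
    history = [dict(counts)]
    for _ in range(rounds):
        # Hypothetical policy: the area with more recorded offences
        # receives the larger share of patrols in the next round.
        ranked = sorted(counts, key=counts.get, reverse=True)
        shares = {ranked[0]: top_share, ranked[1]: 1.0 - top_share}
        for area, share in shares.items():
            # More patrols in an area means more offences get recorded there.
            counts[area] += patrols_per_round * share * hits_per_patrol
        history.append(dict(counts))
    return history


# Made-up starting records: area_a is only slightly ahead of area_b.
history = simulate_feedback({"area_a": 11.0, "area_b": 10.0}, rounds=5)
first, last = history[0], history[-1]
share_before = first["area_a"] / sum(first.values())
share_after = last["area_a"] / sum(last.values())
print(f"area_a share of recorded offences: {share_before:.3f} -> {share_after:.3f}")
```

In this toy setup, area_a's share of recorded offences rises from roughly 52% to roughly 58% in five rounds, even though the two areas are identical by construction. A model trained on those records would then see area_a as the riskier postcode.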

3. The Risk of Bias in Predictive Policing

The draft paper offers one of the first serious reviews of HART.

It points out that postcodes may be connected to the level of deprivation in a community. Where a person lives may therefore become a relevant predictor of crime because of human-created social conditions, not because of anything specific to that person.

If postcode data is used to build a reoffending model, the model may place more attention on residents living in areas already considered to have higher crime rates.

The paper emphasizes that the real concern may not be the model itself, but the predictive factors used to build the model.

4. Human Predictions vs. Algorithmic Predictions

The paper also highlights a clear difference between human and algorithmic prediction.

During the initial experimental phase of the algorithm, police officers were asked to imitate the algorithm’s output. They predicted whether a person’s chance of reoffending was low, moderate, or high.

In nearly two-thirds of cases, about 63.5%, officers classified offenders as having a moderate risk of reoffending.

The paper stated that the agreement rate between the model and police officers was only 56.2%.
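
An agreement rate like the one quoted above is simply the fraction of cases in which the officer and the model chose the same risk category. The sketch below shows the calculation on made-up forecasts; the eight cases and the function name `agreement_rate` are illustrative assumptions and do not reproduce the study's data.

```python
def agreement_rate(model_labels, officer_labels):
    """Fraction of cases where the model and the officer chose the same category."""
    if len(model_labels) != len(officer_labels):
        raise ValueError("both lists must cover the same cases")
    if not model_labels:
        raise ValueError("at least one case is required")
    matches = sum(m == o for m, o in zip(model_labels, officer_labels))
    return matches / len(model_labels)


# Made-up forecasts for eight custody events, NOT data from the HART study.
model = ["low", "moderate", "high", "moderate", "low", "high", "moderate", "low"]
officer = ["moderate", "moderate", "moderate", "moderate",
           "low", "moderate", "moderate", "moderate"]

rate = agreement_rate(model, officer)
moderate_share = officer.count("moderate") / len(officer)
print(f"agreement: {rate:.1%}, officer 'moderate' share: {moderate_share:.1%}")
```

The toy data mirrors the pattern the paper describes: when officers gravitate toward the middle category, agreement with a model that spreads its forecasts across all three categories is bound to be limited.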

Regarding changes to the algorithm, WIRED contacted Durham Constabulary, but had not received a response by the time of publication.

5. How HART Selects Candidates for Checkpoint

One Durham police officer received an invitation letter that read:

“You are invited to participate in a research project.”

He was told that the research would completely change his life. If the research succeeded, offenders would never reoffend.

The Checkpoint program was an experiment jointly led by Durham Constabulary and the University of Cambridge.

The experiment aimed to reduce reoffending by studying why some people experience problems such as:

  • Drug use
  • Alcohol abuse
  • Homelessness
  • Mental health issues

The candidates for the Checkpoint program were selected by the HART algorithm.

If HART classified a person as having a moderate risk of reoffending, that person would be included in the Checkpoint program.

People classified as having either low or high reoffending risk would not be included in the program.

Jennifer Doleac, a professor of public policy and economics at the University of Virginia, asked whether there might be a better way to handle crime, one that is fairer and brings society closer to its goals.

The Checkpoint program had previously received an award from the Howard League for Penal Reform, which praised its attempt to keep people away from the criminal justice system.
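
The selection rule described above, where only a moderate forecast leads to Checkpoint, can be written as a one-line check. This is a minimal sketch: the `Risk` enum and the `eligible_for_checkpoint` function are hypothetical names, and the rule is reduced to the forecast alone, whereas in practice the forecast is described as advisory and officers make the final call.

```python
from enum import Enum


class Risk(Enum):
    """The three forecast categories described in the article."""
    LOW = "low"
    MODERATE = "moderate"
    HIGH = "high"


def eligible_for_checkpoint(forecast: Risk) -> bool:
    """Only a moderate-risk forecast makes a person a Checkpoint candidate."""
    return forecast is Risk.MODERATE


for risk in Risk:
    print(risk.value, "->", eligible_for_checkpoint(risk))
```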

6. Random Forests and Historical Data

HART is a machine learning system written in the R programming language. It makes decisions using a method called random forest.

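A random forest is an ensemble prediction method: it builds many decision trees, each trained on a different random sample of the data, and combines their individual outputs into a single prediction.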

Every decision made by HART is based on historical data. It analyzes past data to predict future behavior.

Durham Constabulary provided the first version of HART with details from 104,000 custody events between 2008 and 2012.

From this information, the system extracted 34 predictive factors, including location data, to predict each person’s risk of reoffending.
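
To make the random forest idea more concrete, here is a minimal sketch in Python. It is a deliberately simplified stand-in, not HART's implementation: HART is written in R and uses 34 predictors, whereas this example uses one-split trees ("stumps"), two hypothetical features (age and number of prior offences), and nine made-up records. All function names are illustrative.

```python
import random
from collections import Counter


def majority(labels):
    """Most common label in a list."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(rows, labels, rng):
    """Fit a one-split tree on a randomly chosen feature."""
    feature = rng.randrange(len(rows[0]))
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        if not right:  # a split must leave something on both sides
            continue
        errors = (sum(lab != majority(left) for lab in left)
                  + sum(lab != majority(right) for lab in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, majority(left), majority(right))
    if best is None:  # every sampled value was identical, so no split exists
        only = majority(labels)
        return feature, float("inf"), only, only
    return feature, best[1], best[2], best[3]


def train_forest(rows, labels, n_trees, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        picks = [rng.randrange(len(rows)) for _ in rows]
        forest.append(train_stump([rows[i] for i in picks],
                                  [labels[i] for i in picks], rng))
    return forest


def forest_votes(forest, row):
    """Collect one vote per tree for a new case."""
    votes = Counter()
    for feature, threshold, left_label, right_label in forest:
        votes[left_label if row[feature] <= threshold else right_label] += 1
    return votes


# Made-up training records: (age, prior_offences) -> risk label.
rows = [(45, 0), (52, 1), (38, 0), (30, 4), (27, 6),
        (33, 5), (22, 15), (24, 22), (19, 18)]
labels = ["low"] * 3 + ["moderate"] * 3 + ["high"] * 3

forest = train_forest(rows, labels, n_trees=509)
votes = forest_votes(forest, (24, 22))
prediction = votes.most_common(1)[0][0]
print(dict(votes), "->", prediction)
```

A full random forest grows deeper trees and samples candidate features at every split, but the two ingredients shown here, bootstrap sampling and voting across many trees, are the core of the method.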

All HART predictions are generated through 509 internal votes. Each vote falls into one of three categories:

  • Low risk
  • Moderate risk
  • High risk

7. A Real-World Example of HART Prediction

The research was led by Sheena Urwin, head of criminal justice at Durham Constabulary.

Her published research suggested that HART was effective in the real world.

An early version of HART once predicted that a 24-year-old man had a high probability of reoffending. The man had a history of violent crime, and police records showed 22 previous offenses.

Within HART’s internal voting system:

  • 414 votes classified him as high risk
  • 87 votes classified him as moderate risk
  • 8 votes classified him as low risk

The man was later arrested and convicted of murder.
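
The vote counts in this example can be tallied in a few lines. The sketch assumes the final forecast is simply the category with the most votes; the article does not spell out exactly how HART combines its votes, so treat the plurality rule here as an assumption.

```python
# Vote counts reported for the 24-year-old man in the example above.
votes = {"high": 414, "moderate": 87, "low": 8}

total = sum(votes.values())
# Assumed rule: the category with the most votes becomes the forecast.
forecast = max(votes, key=votes.get)
shares = {label: count / total for label, count in votes.items()}

print(f"total votes: {total}")
print(f"forecast: {forecast} ({shares[forecast]:.1%} of votes)")
```

The three counts add up to the 509 votes mentioned earlier, and about 81% of them fall in the high-risk category.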

8. Using Algorithms in Law Enforcement

Although the use of artificial intelligence in policing and law enforcement is still at an early stage, many warning signs have already emerged for police agencies interested in developing AI-based systems.

A widely cited 2016 investigation by ProPublica found that COMPAS, a software system developed by Northpointe, showed bias against Black defendants.

Megan Stevenson, a law professor at George Mason University, studied the role of a risk assessment algorithm used in Kentucky and found that it did not have much impact.

After analyzing data from more than one million criminal cases, Stevenson concluded that the system did not bring the efficiency improvements expected by supporters of risk assessment algorithms. At the same time, it did not deepen racial discrimination in the way critics had predicted.

Her research also found that the longer judges used Kentucky’s risk assessment method, the more likely they were to return to their original working habits and decision-making patterns.

9. Efforts to Reduce Human Bias

To prevent human racial and social bias from entering the HART algorithm, Durham Constabulary organized awareness sessions for staff about unconscious bias.

The police also emphasized that race was not included among the predictive factors used by the algorithm.

They stated that the algorithm’s output was only a supportive tool. Its purpose was to help humans make better decisions, not to replace human judgment.

In December 2017, Urwin explained to members of Parliament:

“Although I cannot give you the exact figures, officers certainly do not fully follow the algorithm’s predictions, because the prediction is not the complete and final reference. It is only an aid.”

However, Professor Wooff from Edinburgh Napier University warned that, given time pressure and limited resources, police may become overly dependent on AI-generated decisions and rely more on the system than on their own judgment.

Wooff also argued that a written record may appeal to officers who need to make decisions, partly because it lets them defend their decision-making process if something goes wrong.

10. Comparing COMPAS with Human Judgment

Another study focusing on the accuracy of COMPAS found that its predictions were not significantly different from those made by untrained humans.

Julia Dressel, one of the study’s authors and now an Apple engineer, said:

“COMPAS’s predictions are no more accurate than predictions made by people with almost no criminal justice experience, based on an online survey.”

Dressel and Dartmouth professor Hany Farid paid people through Amazon Mechanical Turk to predict whether offenders would commit crimes again.

They then compared human predictions with COMPAS results.

The results showed that both humans and algorithms had an accuracy rate of around 67%.

Dressel said that people should not simply assume that a tool can accurately predict the future just because it uses big data.

Such systems need to meet very high standards. They must be tested and must prove that they are as accurate and effective as they claim to be.

11. Transparency and Public Oversight

Durham Constabulary’s algorithm is a black box.

The system cannot fully explain how it makes decisions. What is known is that it is based on more than 4.2 million data points inside the model.

A September 2017 review of HART concluded that a certain level of opacity seemed unavoidable.

At present, HART only uses data from Durham Constabulary. In the future, it may also include data from local councils or the UK Police National Database.

To address the problem of algorithmic opacity, the police created a framework defining when and how the algorithm should be used.

This framework is called Algorithmic Considerations. It states that algorithms must be:

  • Lawful
  • Accurate
  • Challengeable
  • Responsible
  • Explainable

Dillon Reisman, a researcher at the AI Now Institute, said:

“Accountability cannot just be a checklist.”

He added that while it was good to see the police create Algorithmic Considerations, they should also consider whether using algorithms in the first place is appropriate.

The AI Now Institute mainly studies the social impact of artificial intelligence.

12. Should the Code Be Made Public?

The police refused to release HART’s underlying code. They argued that doing so would not be in the public interest and could end the system while it is still at the research stage.

However, the police said they were willing to provide the underlying system to a central organization.

When asked about data disclosure, police responded:

“Durham Constabulary will be prepared to disclose the HART algorithm, related personal data, and custody datasets to an algorithmic regulatory body.”

Reisman argued that simply disclosing the code and data would not be enough, because an algorithm cannot be fully evaluated from its code alone.

He said that people also need to understand how human decision-makers act based on algorithmic decisions.

13. The Future of AI Policing Remains Uncertain

Before these issues are properly addressed, the effectiveness of AI policing systems remains open to question.

In September 2017, a review report on HART, co-authored by Urwin, focused on two major questions:

  • Whether algorithmic predictions are fully applicable
  • Whether data related to race and other sensitive factors should be included in policing systems

A co-author of the COMPAS analysis report said that accurately predicting what a person will do over the next two years based on past behavior is extremely difficult.

If such a high level of accuracy cannot be achieved, then this method should be abandoned.

Instead, simpler approaches should be explored, while seeking a balance between civil liberty and social stability.

14. Conclusion

The HART case shows both the potential and the risks of using artificial intelligence in policing.

On one hand, algorithmic tools may help police identify risk, allocate resources, and support rehabilitation programs. On the other hand, if the data used to train these systems reflects existing social inequality, the algorithm may reinforce bias rather than reduce it.

The debate around HART is not only about technology. It is also about fairness, accountability, transparency, and the proper role of human judgment in law enforcement.

For AI policing systems to gain public trust, they must be carefully tested, openly reviewed, and used only under clear legal and ethical safeguards.
