ALG Blog 1: Exception to Data-Driven Rules

Published on:

The right to be an exception
How data-driven rules can unintentionally harm individuals who don’t fit the average profile

Case Study reading:
The right to be an exception to a data-driven rule

Case Study Overview

The article “The Right to Be an Exception to a Data-Driven Rule” discusses the growing concern over relying on algorithms to make important decisions, like advising judges or screening job applicants. The key point is that people have a fundamental right to be treated as exceptions when these rules don’t fit them. Even the most accurate systems can unfairly penalize someone simply because they fall outside the average; accuracy, after all, is just a number and doesn’t capture everything. Instead of assuming a rule works for everyone, the author argues that decision makers need to understand the algorithm, make sure it is individualized enough to apply to the person in front of them, and be confident enough in the decision to accept any potential harm.

Discussion Questions

1. What is a data-driven rule, and what does it mean to be a data-driven exception? Is an exception the same as an error?

A data-driven rule isn’t just a mathematical concept; it shapes real people’s lives. For instance, when insurance companies decide who gets coverage, they often rely on rules based on age, pre-existing conditions, or lifestyle factors. A person who doesn’t fit the average profile becomes what the article defines as a data-driven exception. Even if the model performs well for most patients, it can unfairly exclude those whose circumstances aren’t reflected in the data.

Similarly, as seen with recent changes in immigration and passport control rules, these systems can penalize people based on their country of origin or online presence. A traveler might be denied a visa because a model predicts a higher risk of overstaying for people from their region, even though that individual would comply perfectly with the rules. These people are exceptions, not necessarily errors, yet the consequences can be serious.

From a more statistical perspective, data-driven exceptions can arise from many sources, including:

  • Clustering effects that obscure individual variation
  • Limited model capacity to capture complex or nonlinear patterns
  • Confounding variables or colliders
  • Incomplete or missing data
  • Biases in data collection, even when unintentional

I used to think models were fair because they were objective, but I’ve realized that’s not always true. Exceptions show the limits of these systems and why oversight and attention to individual circumstances are so important.
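
To make the list above more concrete, here is a minimal sketch (in Python, with made-up numbers, not from the article) of how a rule built on group averages, like the insurance example, can deny someone even though their individual circumstances satisfy the rule’s own criterion:

```python
import numpy as np

# Hypothetical insurance-style rule: approve coverage when predicted risk,
# estimated from the average of people in the same age band, is below a threshold.
rng = np.random.default_rng(0)

ages = rng.integers(25, 65, size=200)                  # made-up applicant pool
risk = 0.01 * ages + rng.normal(0, 0.05, size=200)     # risk that mostly tracks age

def predicted_risk(age):
    band = (ages // 10) == (age // 10)   # everyone in the same decade of age
    return risk[band].mean()             # the group average stands in for the individual

THRESHOLD = 0.45

def rule(age):
    return "approve" if predicted_risk(age) < THRESHOLD else "deny"

# A data-driven exception: a 64-year-old whose true personal risk is low.
exception_age, exception_true_risk = 64, 0.20
print(rule(exception_age))              # "deny"  -- the age-band average sits above the threshold
print(exception_true_risk < THRESHOLD)  # True    -- the individual actually meets the criterion
```

The rule isn’t “wrong” on average, which is exactly the point: the person above is an exception, not evidence that the model has a bug.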

2. In addition to those listed above, what other factors differentiate data-driven decisions from human ones?

From my experience working with statistical models, one big difference is that they rely on averages. Models are built to work well for most cases, but that means they can completely miss individuals who don’t fit the expected pattern, since outliers often get ignored. I have experienced this difference in my own life when applying for jobs. Sometimes I feel like I meet the qualifications, but with the rise of automated systems for the first phases of candidate selection, I got rejected right away because my resume didn’t match a certain keyword or field.

Another difference is scale. Models can make decisions really fast and across thousands of cases at once. This can create a systemic problem. If the same hiring model is used by many companies, someone who gets filtered out by the algorithm might never get a chance, even though a human might have given them a shot. Humans are slower and more inconsistent, sure, but that inconsistency actually gives people opportunities that models would not offer.

I also notice how models often ignore context. When I work with data I’m not familiar with, like economic datasets, I can see what the numbers say, but I don’t always understand the bigger picture. Models do the same thing: they find the line of best fit without understanding the context behind the data or whether a different model would make more sense.
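
Here is a small sketch of that point with synthetic data (Python, made-up numbers): a straight line of best fit can look fine on average while being badly wrong for the individual cases at the edges, and nothing in the fit itself flags that a different model was needed.

```python
import numpy as np

# Synthetic data with a curved (U-shaped) relationship -- not linear at all.
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 50)
y = 0.5 * (x - 5) ** 2 + rng.normal(0, 0.5, size=50)

# The "line of best fit" will happily fit anyway.
slope, intercept = np.polyfit(x, y, 1)
y_hat = slope * x + intercept

# The average error looks moderate, but the errors at the extremes are much larger.
print("mean absolute error:", round(np.abs(y - y_hat).mean(), 2))
print("error at x = 0:     ", round(abs(y[0] - y_hat[0]), 2))
print("error at x = 10:    ", round(abs(y[-1] - y_hat[-1]), 2))
```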

Seeing this play out in job applications has made me realize how easy it is to assume that models are fair just because they’re data-driven. They might be more accurate than a human, but they often do not account for individuals. This has made me more aware of how important it is to consider both the numbers and the people behind them.

3. Beyond what is discussed above, what are some of the benefits and downsides of individualization?

Benefits:

  • Better fit for individuals: In machine learning, more individualized models, such as those that use more features, usually perform better. For example, I saw a classmate’s deep learning project in Denmark that predicted whether a plastic bottle could be returned. When they included more details, like bottle size, the model became much more accurate.
  • Fairer treatment: Individualization helps avoid treating people as just averages.
  • Context-aware decisions: Individualized models leave room to consider unique circumstances instead of applying one rule to everyone.

Downsides:

  • Privacy concerns: More individualization means collecting more personal data, which can feel invasive in contexts like hiring or health insurance.
  • Limited flexibility: Models are stuck with the features they were trained on and can’t easily adapt to new information.
  • Overfitting: If a model is too narrow, it might perform really well on the training data but fail on new, broader data (see the sketch after this list).
  • Unavoidable uncertainty: For example, in D&D, no matter how well I design encounters for a character, dice rolls still introduce randomness that no amount of tailoring can control.
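
Here is a minimal sketch of the overfitting point with synthetic data (Python): a very flexible model nearly memorizes a small training set but typically does much worse on new data drawn from the same process.

```python
import numpy as np

# Synthetic data: a noisy sine wave.
rng = np.random.default_rng(2)

def make_data(n):
    x = rng.uniform(0, 1, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, n)
    return x, y

x_train, y_train = make_data(12)     # small training set
x_test, y_test = make_data(200)      # fresh data from the same process

for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")

# The degree-9 polynomial has the lowest training error but typically the
# worst test error: it fit the noise in the 12 training points, not the signal.
```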

4. Why is uncertainty so critical to the right to be an exception? When the stakes are high (e.g., in criminal sentencing), is there any evaluation metric (e.g., accuracy) that can justify the use of a data-driven rule without the consideration of uncertainty?

Even when models are carefully individualized to reduce systematic uncertainty, there are always things that can’t be predicted no matter how much data we have. For example, a hiring algorithm might know my skills, my GPA, and even the projects I’ve done, but it can’t predict how I’d grow into a role or how motivated I’d be once I’m actually there. It is unfair to treat any candidate as if the model could fully predict that person’s performance.

That’s why high-stakes decisions, like criminal sentencing mentioned in the article, can’t rely only on numbers, especially when those numbers come from a model. Even a model that’s 95% accurate still fails for 1 in 20 people, and for the person on the wrong side of that statistic, the consequences could be devastating. In my own experience with machine learning projects, I’ve seen models that looked great on paper but completely fell apart when tested on real-world data.
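
To put rough numbers on that point (hypothetical decision volumes, just to show the arithmetic):

```python
# Back-of-the-envelope arithmetic: how many people does a "95% accurate"
# rule get wrong, simply by being applied at scale? (Hypothetical volumes.)
accuracy = 0.95
for n_decisions in (100, 10_000, 1_000_000):
    expected_wrong = n_decisions * (1 - accuracy)
    print(f"{n_decisions:>9,} decisions -> roughly {expected_wrong:,.0f} people on the wrong side of the rule")
```

Five people out of a hundred sounds tolerable as a statistic; fifty thousand out of a million is a lot of individual lives, and none of those people chose to be the error cases.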

For me, the “right to be an exception” is really about making decision makers think about consequences: what if this is the one time the model gets it wrong? In job applications, that might just mean someone misses out on a chance. But in criminal justice, those decisions could completely change someone’s life. That’s why accuracy on its own isn’t enough. We need honesty about uncertainty and a reminder that people aren’t just data points.

Personal Discussion Question

Think about a time when an automated system or rigid rule, maybe a job application or an overly broad rubric for a college course, has affected you. How did the lack of transparency in the system influence your ability to understand or advocate for yourself, and what information would the person who chose this model or system need to provide to ensure the decision accounted for your individual circumstances?

I thought of this question because it focuses on how these models can affect our daily lives. As a past TA, I’ve seen students get the correct answer in different ways, but a strict rubric sometimes favored one method over another. This lack of transparency, and sometimes of explanation, makes it hard for students to understand their grade.

Afterthoughts

It was nice to think more in depth about how models don’t really see the bigger picture, and how sometimes a line of best fit doesn’t give us the result we actually want. In stats, we focus so much on finding the best accuracy, but I hadn’t really stopped to think about what gets lost when we only chase that number. I know I struggle sometimes when I work with datasets I’m not familiar with, but I’d never thought about how models have that same kind of issue: they lack the context that gives meaning to the data.