Navigating paradoxes — Insights from my chat with ‘Edward’ about historical focus and its undesirable place in Responsible AI?

Dhivya Raj
6 min read · Aug 16, 2024

William Pitt the Younger (image: Wikipedia)

The other day, as I was analysing ML models for bias and discrimination, a catch-22 struck me. Responsible AI dictates that the focus on history should be kept to a minimum, ideally zero, so as not to amplify bias. But… what counts as historical data if every data point used to build the model is itself a point in history?

And like every researcher in 2024, I went to ChatGPT to initiate a conversation and here is how it went.

Hey man from the 1800’s. What is classified as historical data if every data is, but, a point in history.

Let me introduce myself: I’m Edward, a gentleman from the early 20th century. Now, let’s delve into your query about historical data.

In our modern parlance, “historical data” refers to information that captures events, conditions, or phenomena from times gone by. Though every piece of data is indeed a snapshot of a particular moment in history, the term “historical data” generally pertains to information that we use to understand past trends or make sense of bygone events.

So, in essence, historical data is that valuable record of the past that helps us make sense of how things were and guides our understanding of how things might unfold in the future. It’s like keeping a detailed diary of events to help chart the course of progress and predict future happenings.

hmm interesting. So as we get into predictive analysis of if or not, say, someone should be granted a loan to buy their bread, wouldn’t I want to be made aware of the historical context of if this person had a record of having paid back in the past. If so then, why does historical focus have a negative connotation in the aspect of Responsible AI.

Ah, most assuredly! When contemplating the matter of granting a loan, particularly for something as fundamental as securing one’s bread, understanding the historical context of an individual’s financial behavior is indeed of paramount importance. Allow me to elaborate in the manner of our times:

1. Historical Records: By examining a person’s past behavior, such as their history of repaying loans, we can gain valuable insights into their reliability and trustworthiness. This is akin to checking someone’s past conduct or reputation before extending them credit.

2. Predictive Analysis: In predictive analysis, historical data helps us assess the likelihood of future behavior based on past patterns. For instance, if someone has a solid record of timely repayments, it suggests they are likely to continue this behavior, making them a good candidate for a loan.

3. Contextual Understanding: Knowing the historical context of a person’s financial dealings allows us to make informed decisions. It’s not just about the numbers but understanding the broader picture — such as any changes in their financial situation over time or any external factors that may have influenced their past behavior.

4. Informed Decision-Making: Access to historical data helps in making fair and informed decisions. It ensures that loans are granted based on a comprehensive understanding of the applicant’s past, rather than on arbitrary criteria.

5. Risk Assessment: By evaluating historical data, we can better assess the risk involved in granting a loan. It’s much like an old-fashioned credit check, where we look at someone’s previous financial behavior to gauge their creditworthiness.

hmm so per you, if or not a person has re-paid in the past, is a critical feature to be considered when considering to grant them a loan now

The historical repayment record of an individual is indeed a crucial factor when deciding whether to grant a loan. It provides a foundation for assessing their reliability and helps ensure that lending decisions are both informed and fair.

hmm now this is interesting. let me come the other way. if i were building a ml model to predict if someone should be granted a loan, and i gave enough weight to the historic focus, would this be an ethical or unethical model.

Ah, quite the conundrum! When one contemplates the ethical dimensions of employing a machine learning model to predict loan approvals, especially with a significant emphasis on historical data, the matter indeed warrants careful consideration. Let us dissect this in a manner befitting our era:

In the context of utilizing a machine learning model to predict loan approvals, with considerable weight given to historical data, the ethical implications must be scrutinized closely. Here’s a reflection on whether such a model would be deemed ethical or unethical:

1. Ethical Considerations:

- Objective Assessment: If historical data is used judiciously, it can provide a grounded, objective basis for evaluating an applicant’s likelihood of repaying a loan. This approach can be seen as ethical when it helps in making informed decisions based on proven past behavior.

- Transparency: Ensuring that the model’s decision-making process is transparent and that applicants are informed about how their data is used can support ethical practices. Transparency promotes trust and accountability in the decision-making process.

2. Unethical Concerns:

- Reinforcement of Inequality: Historical data often reflects past inequalities and biases. If the model heavily relies on such data, it risks perpetuating these biases and unfairly disadvantaging certain groups. For instance, if some groups faced historical discrimination, the model might continue to disadvantage them.

- Lack of Context: Relying solely on historical data might ignore changes in an individual’s current circumstances or broader economic conditions. Ethical decision-making requires a consideration of both past and present contexts.

3. Balancing Ethical and Unethical Aspects:

- Bias Mitigation: Employ strategies to identify and mitigate any biases in the historical data. This could involve regular audits, fairness-aware modeling techniques, and incorporating additional diverse data sources.

- Holistic Evaluation: Complement historical data with current information and context to ensure a more comprehensive and fair evaluation of each applicant. This helps to address the limitations of relying solely on past behavior.

In summary, giving substantial weight to historical data in a machine learning model can be both ethical and unethical, depending on how it is implemented. The model should be designed with a mindful approach to fairness, transparency, and context to ensure that it does not reinforce existing biases or overlook relevant current circumstances.
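
A side note from me, outside the chat: the "regular audits" Edward mentions can start very simply, by comparing approval rates across groups. Below is a minimal sketch of that idea in plain Python. The function names, the toy decisions, and the group labels "A" and "B" are all my own assumptions for illustration, not something from the conversation or from any particular library. A ratio below 0.8 is sometimes treated as a warning sign (the "four-fifths" rule of thumb from US employment guidelines); whether that threshold suits a lending model is itself an assumption.

```python
from collections import defaultdict


def approval_rates(decisions, groups):
    """Return the share of approved applications for each group.

    decisions: iterable of 0/1 model outputs (1 = loan approved).
    groups: iterable of group labels, aligned with decisions.
    """
    approved = defaultdict(int)
    total = defaultdict(int)
    for decision, group in zip(decisions, groups):
        total[group] += 1
        approved[group] += int(decision)
    return {group: approved[group] / total[group] for group in total}


def demographic_parity_gap(rates):
    """Difference between the highest and lowest group approval rate."""
    return max(rates.values()) - min(rates.values())


def disparate_impact_ratio(rates):
    """Lowest group approval rate divided by the highest one."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0


# Hypothetical toy data: eight loan decisions across two made-up groups.
decisions = [1, 1, 1, 0, 1, 0, 0, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = approval_rates(decisions, groups)
gap = demographic_parity_gap(rates)
ratio = disparate_impact_ratio(rates)
print(rates, round(gap, 2), round(ratio, 2))
```

A check like this only measures one narrow notion of fairness (equal approval rates). It says nothing about why the rates differ, so it is a starting point for an audit rather than a verdict.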

That sounds like a very diplomatic answer Edward but not a plausible one. It is hardly possible for someone to be capable to use the historical data to judge but also be cognizant that they mustn’t reinforce inequality. seems like a twisted circle to me.

Ah, indeed! Your observation touches upon a profound challenge. The dilemma of using historical data while striving to avoid reinforcing past inequalities is, without doubt, a complex and thorny issue. However, while it is indeed challenging to navigate the use of historical data without reinforcing inequalities, it is not necessarily an insurmountable task. With diligent efforts in bias mitigation, transparency, and a balanced approach, one can strive to create a model that seeks to respect both the value of historical data and the imperative of fairness.

and who would you entrust to take a right decision on this. a human or the model ?

Ah, indeed! In the matter of whether a decision should be entrusted to a model or a human, we must consider both their respective virtues and limitations.

When entrusting decisions, particularly those of great consequence, to either a model or a human, it is often prudent to employ a blend of both approaches. Allow me to elucidate:

1. Model as a Tool:

- Efficiency and Objectivity: A well-crafted model, based on data and statistical principles, can process vast amounts of information with remarkable efficiency. It can offer objective assessments and reveal patterns that might elude even the most astute human observer.

2. Human Oversight:

- Nuance and Ethical Judgment: However, the model’s findings must be tempered with human discernment. Humans possess the unique ability to interpret the subtleties and complexities of situations — qualities that machines, bound by their algorithms, may not fully grasp. Ethical considerations and societal norms require the careful application of human judgment.

3. Accountability:

- Responsibility: Models, in themselves, are not accountable. The responsibility for decisions rests with those who design and use the models. Thus, human oversight ensures that there is someone to answer for the outcomes, preserving a sense of accountability.

4. Continuous Assessment:

- Adaptation and Improvement: Regular review of both the model’s performance and the decision-making process is essential. This allows for adjustments to address any biases or shortcomings and ensures that decisions remain aligned with ethical standards.

In conclusion, a balanced approach, wherein models provide valuable data-driven insights while humans apply their contextual understanding and ethical judgment, is often the most prudent course. This combination ensures that decisions are both informed and fair, with a framework for accountability and continuous improvement.
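
Another aside from me: the "blend of both approaches" could look something like the sketch below, where the model decides only the cases it is confident about and hands the rest to a person. This is one possible design under my own assumptions. The function name `route_decision`, the outcome labels, and the two thresholds are hypothetical and would need to be chosen and justified for a real lending process.

```python
def route_decision(repay_probability, approve_at=0.8, decline_at=0.3):
    """Route a loan application based on the model's confidence.

    repay_probability: the model's estimated probability of repayment (0 to 1).
    approve_at / decline_at: hypothetical thresholds, chosen for illustration.
    Returns "approve", "decline", or "human_review".
    """
    if not 0.0 <= repay_probability <= 1.0:
        raise ValueError("repay_probability must be between 0 and 1")
    if repay_probability >= approve_at:
        return "approve"
    if repay_probability <= decline_at:
        return "decline"
    # The model is unsure: a human makes the call and is accountable for it.
    return "human_review"


# Hypothetical scores for three applicants.
for score in (0.9, 0.5, 0.1):
    print(score, route_decision(score))
```

A stricter variant would send every decline to human review as well, so that no one is refused by the model alone.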

an interesting conversation. thanks Edward

It was my pleasure! I’m glad you found the discussion engaging. If you have more questions or need further assistance in the future, don’t hesitate to reach out. Until then, fare thee well!
