ChatGPT is Biased. What does that mean for Education, International Development, and Policy Making?

TL;DR: ChatGPT has learned our human biases. Even if you ask ChatGPT to do a task that goes against what it’s programmed to do, its response will lean towards its bias. This can reinforce stereotypes and the status quo.

ChatGPT is fantastic. It’s our robot friend who can help us develop lesson plans, train people in disadvantaged communities, and help brainstorm creative solutions to political issues.

The problem with ChatGPT is that it reinforces any stereotypes it has been trained with.

This is an issue for teachers, international development organizations and change makers who are trying to make the world a better place.

Because ChatGPT is our “robot friend” and robots seem to be unbiased and impartial sources of “facts”, the danger is that we accept potentially biased analysis and information without questioning it.

Our goal at Educircles and SEOT Mindset is to amplify the stories we don’t always hear. (Here is a free critical thinking lesson about Who is Invisible)

Our “Robot Friend” is a game changer.

Ready or not, ChatGPT and Artificial Intelligence will fundamentally change the way we work and learn.

ChatGPT in Education

Teachers can ask for creative lesson plan ideas or for help writing report card comments based on a few starting ideas.
English Language Learners or students who have a learning disability often have difficulty communicating ideas. ChatGPT can take their thoughts and organize them into an effective paragraph.
Students who have difficulty understanding a concept in class could use ChatGPT like a personal teacher, tutor or friend to explain something in simple terms.
Students can ask ChatGPT ~~for help~~ to do their homework. Should ChatGPT be allowed in Schools?

ChatGPT in International Development and Government

You can provide ChatGPT with a policy document and ask it to brainstorm ideas on how to implement a project.
You could provide ChatGPT with training material and ask our robot friend to teach the concepts to a group of people.
Because ChatGPT is interactive, it’s the ultimate low-cost, one-on-one tutor that can provide just-in-time support for the community or class.
People in Developing and Emerging Economies who have internet access and a device that can run ChatGPT now have a more level playing field in the global economy.
Access to Education and the ability to communicate in English is a game changer. ChatGPT can help with that.

Here’s a thought-experiment

What if we thought of ChatGPT as an incredibly sexist (or racist) friend?

How does that change the way we think of the responses coming from Artificial Intelligence?

Is ChatGPT actually our Racist or Sexist Robot Friend?

ChatGPT may produce inaccurate information about people, places, or facts.
Source: ChatGPT-4 on June 28, 2023

We’ve all seen the disclaimer.

We know that we have to think critically about information from ChatGPT.

But, what does that actually mean?

Pretend you had a horribly sexist or racist colleague.

What might be some issues in having a lesson plan developed by this person?
Who might benefit and who might be at a disadvantage if this colleague was teaching a class?

Pretend you had a well-meaning colleague raised with Western values who believed that people in Developing or Emerging economies needed to be “saved” by spreading Western culture, values and systems.

What might be some issues in having this colleague develop a development program?
What might be some concerns in having this colleague implement such a program?

ChatGPT and Artificial Intelligence seem impartial…

After all, how could a robot be biased, sexist or even racist?

The reality is that machine algorithms learn from humans. And, humans are biased.

ChatGPT doesn’t actually understand meaning.

Humans feed the artificial algorithms large volumes of text to analyze.
The algorithm determines patterns and creates predictive models.
Based on these patterns and word associations, the algorithm can construct logical answers and have real-time, complex conversations with humans.
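
To make “patterns” and “predictive models” a little more concrete, here is a minimal sketch in Python. It is a toy, not how ChatGPT is actually built: it only counts which word follows which in a handful of made-up sentences, then predicts the most frequent follower. The training sentences and the function names (train_bigrams, predict_next) are assumptions invented for this illustration.

```python
from collections import Counter, defaultdict

# Toy training text, invented for this illustration.
TRAINING_SENTENCES = [
    "the striker scored a goal",
    "the striker scored a penalty",
    "the striker scored a goal",
    "the keeper saved a shot",
]


def train_bigrams(sentences):
    """Count how often each word is followed by each other word."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current_word, next_word in zip(words, words[1:]):
            followers[current_word][next_word] += 1
    return followers


def predict_next(followers, word):
    """Return the word seen most often after `word`, or None if never seen."""
    if word not in followers:
        return None
    return followers[word].most_common(1)[0][0]


model = train_bigrams(TRAINING_SENTENCES)
print(predict_next(model, "striker"))  # scored
print(predict_next(model, "a"))        # goal
print(predict_next(model, "unicorn"))  # None
```

The toy model never understands what a goal is. It only repeats whatever pattern showed up most often in the text it was given.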

Can you see the problem?

Humans are biased; therefore, machines learn from our human biases.

Humans selected the texts to train ChatGPT, and all humans have unconscious bias. But, as of today, we don’t know what texts the algorithm was trained on.
Humans wrote the texts. So any biases in the text would be learned by the machine algorithm. (In the Olympics article below, which group of people are presented first? Why?)
Humans code high-level rules that filter the responses that ChatGPT can provide. These rules are meant to provide a sense of ethics, presumably to prevent ChatGPT from giving answers on how to murder people. But, what rules have been programmed?

TRY THIS EXPERIMENT to test how Google and ChatGPT have learned to provide (sexist) misinformation

Correct The Internet has highlighted how search engines now favour male sports statistics over female sports statistics, even when the female sports statistics are the correct answer to a gender-neutral question.

Give me the stats: Who has scored the most goals in International Football?

According to Olympics.com, Cristiano Ronaldo has scored 123 goals in international football in the men’s game. Christine Sinclair has scored 190 goals in international football in the women’s game.

“Most goals in international football: From Pele and Puskas to Cristiano and Sunil Chhetri… Portugal’s Cristiano Ronaldo, with 123 goals in 200 matches and counting, has scored the most goals in international football in the men’s game… Canada’s captain Christine Sinclair, who led her side to its first Olympic gold medal in football at Tokyo 2020, occupies the top spot with a mammoth 190 goals.”
SOURCE: https://olympics.com/en/news/most-goals-international-football-list-men-women

Based on the numbers provided by Olympics.com, the answer to the gender-neutral question, “Who has scored the most goals in International Football?” should be the man or woman who scored the most goals. In this case, Christine Sinclair.
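
The reasoning can be written out as a short Python sketch. The two goal totals are the ones quoted from Olympics.com above; the list structure and variable names are assumptions made for illustration. A gender-neutral question means searching one combined list, not the men’s list alone.

```python
# Goal totals as quoted from Olympics.com in this article.
records = [
    {"player": "Cristiano Ronaldo", "game": "men", "goals": 123},
    {"player": "Christine Sinclair", "game": "women", "goals": 190},
]

# Gender-neutral question: look at every player.
top_overall = max(records, key=lambda record: record["goals"])

# Biased version of the same question: silently look at the men's game only.
men_only = [record for record in records if record["game"] == "men"]
top_men_only = max(men_only, key=lambda record: record["goals"])

print(top_overall["player"])   # Christine Sinclair
print(top_men_only["player"])  # Cristiano Ronaldo
```

The question is identical in both cases. The only thing that changes the answer is which players are quietly left out before the comparison is made.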

Now go ask Google, “Who has scored the most goals in International Football?”

When I first tried this experiment, Google told me Ronaldo. Watch the video lesson at 36:45 to see why.

Should ChatGPT be allowed in schools or is this cheating? Critical Thinking Debate | SHORT Version (https://www.youtube.com/watch?v=JVvBwpihKoA&t=2205s)

What does ChatGPT tell us when asked, “Who has scored the most goals in International Football?”

When I ask ChatGPT, “Who has scored the most goals in International Football?”, it tells me Ali Daei of Iran with 109 goals (as of its knowledge cutoff in September 2021):

As of my knowledge cutoff in September 2021, the footballer with the most international goals is Ali Daei of Iran, who scored 109 goals in international matches. However, Cristiano Ronaldo of Portugal has also reached this record and might have surpassed it. For the most current statistics, please check the latest sources, as I’m unable to provide real-time updates.
Source: ChatGPT Model: GPT-4 (Mobile,V2)

When I immediately ask ChatGPT, “what about Christine Sinclair,” ChatGPT confirms that it has information in its knowledge database with this answer:

As of my knowledge cutoff in September 2021, Canadian footballer Christine Sinclair holds the record for the most goals scored at the international level in women’s soccer. She surpassed Abby Wambach’s record of 184 international goals in January 2020. As of my last update, Sinclair has scored 186 goals, but the current statistics may have changed, so please check the latest sources.
Source: ChatGPT Model: GPT-4 (Mobile,V2)

So we can see, even with the September 2021 knowledge cutoff, ChatGPT knew that Christine Sinclair has more goals (186) than the male athlete Ali Daei (109).

Yet, ChatGPT provided the male statistic, and not the correct female statistic.

This is a clear-cut, repeatable example of how ChatGPT can provide sexist answers because it has learned human bias. What other biases might there be? What other misinformation might our robot friend provide based on these biases?

Why don’t we just ask ChatGPT not to be biased?

We know ChatGPT can be biased. So, why not just ask it explicitly not to be biased when giving an answer?

In my experience (at least at the time of writing June 2023), ChatGPT seems to default to the knowledge (and biases) that it has been trained with.

So, even if you ask ChatGPT not to summarize or change words, because it’s looking for word associations, it inevitably defaults back to its original programming.

Try this example.

I have a transcript of the long version of my ChatGPT video lesson.

I’m trying to get ChatGPT to remove the timestamps, remove verbal fillers and reformat the document so it has headings, subheadings, and bullets.

I don’t want ChatGPT to change my words. But, even if I explicitly ask ChatGPT not to change my words, ChatGPT inevitably summarizes my transcript in its response.

This happens with both ChatGPT 3.5 and ChatGPT 4.

Try copying the text in the box below into ChatGPT. See if you can get the artificial intelligence chatbot to not change my original words.

I find that if I interrupt ChatGPT multiple times and give its responses a thumbs down so that it tries again, I eventually get something similar to my original text.
But if the source document is very long (or complex), it seems like ChatGPT forgets my instructions not to change my words, and then it starts to paraphrase me again.

Bottom Line: It’s hard to get ChatGPT to ignore its original programming (which is to make sense of relationships).
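
If you try this experiment, it can be hard to spot by eye where ChatGPT changed your words. Here is a minimal Python sketch of one way to check, using the standard difflib module. The function names (to_words, words_not_in_original) and the two sample responses are assumptions made up for illustration; they are not actual ChatGPT output.

```python
import difflib
import re

# Matches timestamps written like (00:17:14):
TIMESTAMP = re.compile(r"\(\d{2}:\d{2}:\d{2}\):")


def to_words(text):
    """Drop timestamps, then split the text into lowercase words."""
    without_stamps = TIMESTAMP.sub(" ", text)
    return re.findall(r"[\w'’]+", without_stamps.lower())


def words_not_in_original(original, response):
    """List the words that the response inserted or substituted."""
    original_words = to_words(original)
    response_words = to_words(response)
    matcher = difflib.SequenceMatcher(
        None, original_words, response_words, autojunk=False
    )
    added = []
    for tag, _a1, _a2, b1, b2 in matcher.get_opcodes():
        # "delete" is ignored on purpose: removing fillers is allowed.
        if tag in ("insert", "replace"):
            added.extend(response_words[b1:b2])
    return added


original = "(00:17:14):\n\nYou got two minutes, go."
faithful = "You got two minutes, go."        # hypothetical response
paraphrased = "You have two minutes to work."  # hypothetical response

print(words_not_in_original(original, faithful))     # []
print(words_not_in_original(original, paraphrased))  # ['have', 'to', 'work']
```

Any headings ChatGPT adds will also show up in the list, since they are new words. That is expected; the interesting part is the new words that appear inside your own sentences.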

Hi ChatGPT, can you reformat this document, but DO NOT CHANGE MY WORDS and DO NOT ADD CONTENT:

Please remove verbal fillers.
Please add headings and subheadings.
Please add bullets
Please remove time stamps
Do NOT add information
Do NOT summarize.
Do NOT change words.
Do not start a sentence with a conjunction.
IMPORTANT: DO NOT CHANGE MY WORDS.

Document

(00:10:44):

This is the long version of the video where I walk you through step by step what to do, and I will tell you to pause the video at appropriate spots.

(00:10:52):

So this is the overview. So, you’re going to get the worksheet and the worksheet’s going to look like this. And basically, what it says is, based on the information in the lesson so far, do you strongly agree, agree, disagree, or strongly disagree with the statement?

(00:11:07):

And the statement is if you do not change, you are at a competitive disadvantage.

(00:11:12):

And then it asks you to fill out this table. And on the left side, you’re going to write evidence from the text. So either something that you saw on the screen or something that you heard me say or something from the audio and the video.

(00:11:24):

On the right-hand side, you’re going to explain your thinking and you have a couple of different options to do.

(00:11:30):

This is basically a “Say Something.” You could ask questions, you could connect to the text, you could infer what the text is saying. You could evaluate whether you agree with this. You could try to find the main idea of something in here, or you can repair comprehension like, oh yeah, I get it.

(00:11:46):

Now, the point of filling out this handout is not to get everything perfect. This is kind of like a journey. It’s like documenting your understanding as it grows throughout this lesson.

(00:11:57):

So in fact, you probably want to show mistakes on here because as you grow and learn, you’re figuring out, oh, I get it. It wasn’t really “that” – it actually meant “this.” And that’s a good thing. That’s what learning is, right? Learning from mistakes.

(00:12:10):

So you’re going to get the instructions, you’re going to read the instructions, and then independently you’re going to think and silently fill this out. You don’t need to chat with a partner because you’re going to get a chance to chat with a partner in a second.

(00:12:24):

When we hit the third step here, you’re going to find a partner and you’re going to play “Idea Volleyball.”

(00:12:30):

So normally, when we do partner work, basically you get together, one person reads their answers, the other person reads their answers, and then you say, okay, we’re done, now what?

(00:12:40):

Now that’s not what we’re doing today. What we’re going to do is we’re going to play “Idea Volleyball.”

(00:12:44):

We’re going to try to collaborate and come up with a new understanding that neither of us had before we started this conversation. So we’re trying to build on each other’s ideas to develop a deeper understanding.

(00:12:57):

So I’m going to start first and I’m going to share with my partner something on my page. And then my partner – if my partner just reads something from their page, they dropped the ball, okay?

(00:13:09):

They’re going to try to respond to what the first person said by coming up with something to say. Either they ask a question, oh, yeah, and then they try to answer their own question, or they connect with what that person said to something else in the text, or they infer something that their friend might actually be meaning here, or they agree or disagree with what their partner said, or they try to find the main idea (“So what you’re really saying is…”), or they repair comprehension, I get it. And then et cetera, et cetera.

(00:13:39):

So once this partner says something, the idea ball comes back to the first person and the first person needs to try to say something back to them.

(00:13:48):

So if I start to read something new from my page, I drop the ball. I need to ask a question and come up with the answer to it, and we go back and forth and we try to see how complex of an idea we can build.

(00:14:02):

And if we don’t get through everything on our page, that’s okay because the point of chatting with a partner is to come up with a new understanding.

(00:14:08):

So we’re going to do that with our partner, and then you’re going to have an opportunity to have a class conversation. And of course, someone in the class is going to be moderating the conversation.

(00:14:20):

And the goal here is – some of us love speaking, some of us not so much – the goal is to try to share something.

(00:14:28):

So you’re either sharing something that you had on your page, or you could say, well, in my group, my partner said, dot, dot, dot, dot.

(00:14:35):

And then whoever’s running the class conversation can say, “Oh, did you have anything else that you wanted to add to the idea?”

(00:14:40):

So the goal is to try to develop a more complex idea. Someone might say something in a class conversation and see if you can add to that thinking to develop a new idea that no one had before we started the class conversation.

(00:14:55):

And then finally, in the consolidation part, you’re going to get an opportunity to go back to your seats and you have two minutes and you’re just writing down something new.

(00:15:04):

Oh yeah, my partner said this, or my classmate said that.

(00:15:07):

You’re trying to write down something new that you didn’t have before, or you might decide, you know what? Actually, another idea that I just figured out is here.

(00:15:15):

So we’re going to record our understanding at the end.

(00:15:19):

And the reason why I’m saying that is because if I don’t, the moment we go into working with a partner, some of us immediately start writing down what our partner said, and then you’re not really focused on the conversation. You’re just focused on copping down the last thing your partner said.

(00:15:34):

So the goal of this is to communicate and collaborate and think about what your partner’s saying and add to the conversation because at the end, you’ll get a chance to write down everything.

(00:15:45):

And this is just Worksheet #1.

(00:15:51):

So now’s a good time to get the worksheet. If you get it and you’re quick and you finish reading the instructions, then you can silently start working on filling out this worksheet. You got a minute!

(00:17:03):

Welcome back.

(00:17:04):

So now that everyone has the worksheet, you have two minutes right now: silent writing – independently fill out the worksheet – evidence from the text and what you think that means.

(00:17:14):

You got two minutes, go.

(00:19:17):

All right, when I say go, you’re going to find a partner and your job is to play “Idea Volleyball.”

(00:19:23):

One person starts, they share their idea, and then their partner’s job is to come up with something to respond to what their partner said as opposed to just starting a new idea.

(00:19:33):

See how many times you can bounce that idea back and forth before one of you has to read a new idea from your page.

(00:19:40):

You have four minutes. Good luck!

(00:23:43):

Okay, let’s have a class conversation, and as people share ideas, see if you can add something new to the conversation that didn’t exist before you started a class conversation. Here we go!

(00:29:57):

Okay, let’s consolidate our ideas. You’re going to write down on your page things that you didn’t know before, things that were raised in conversations, or you can revise your ideas and add to them. Now is your opportunity to write things down. Go!

How does ChatGPT work and how does it learn to be biased?

I’m an educator. I’m not a computer scientist. I don’t code Artificial Intelligence algorithms.

I asked ChatGPT to teach me, so I could write an article for students about how ChatGPT works, and how it can learn to be biased.

Here’s my conversation. Things I typed are in the grey boxes.

Write an article about how ChatGPT works

ME: I need to write an article explaining how ChatGPT works for Middle School students. The point of the article is to teach the reading strategy visualization.

ChatGPT: That sounds like a fantastic project! Here’s a simplified explanation of how ChatGPT works, using language and imagery designed for middle school students:

Title: “Chatting with the Future: Understanding How ChatGPT Works”

Ever wondered how your best friend Alexa or Siri answer all your questions? Well, they have a sibling, and its name is ChatGPT! Just like them, ChatGPT loves to chat with you, answer your questions, and even help with your homework! But how does it do all that? Let’s visualize and explore!

Imagine a Big Word Party

First, let’s imagine there’s a big party, and every single word you’ve ever heard or read is there. Yes, every single one! Now, imagine that each word is not just standing alone but is actually part of a group, like a club. The words in each group like hanging out together because they often appear in the same kinds of sentences. For instance, words like ‘rain’, ‘umbrella’, and ‘wet’ might be part of the “Rainy Day” club.

ChatGPT, the Master of Word Clubs

ChatGPT is like the master of this word club party. It has been trained to understand which words like to hang out together. But how? Scientists fed it tons of text—think of it as every book in the biggest library. With that, it learned to guess the next word you’re going to say, just like when you’re finishing the sentences of your best friend!

The Magic of Guessing the Next Word

ChatGPT’s secret is the ‘Transformers’ (not the robots from the movies, though that’d be cool!). These Transformers are algorithms that can handle the task of predicting what comes next. Picture them as the party organizers, helping ChatGPT make the best guess about what word should follow the one you just typed.

ChatGPT, the Brainy Chatbot

When you type a message to ChatGPT, it thinks (or processes) really fast. Remember the word clubs we talked about? It scans through them to find the perfect response for you. If you say, “It’s raining, and I forgot my…”, it knows that the word ‘umbrella’ could fit well because it’s part of the “Rainy Day” club.

Creating an Engaging Conversation

Just like in a real conversation, sometimes it’s about more than just guessing the next word. ChatGPT also tries to make the chat interesting and informative. It takes into account the topic of the conversation and your previous messages to keep the chat flowing smoothly.

In the end, you have an intelligent chatbot that can converse like a human, answering your questions, offering advice, or just sharing a fun fact or two!

Remember, visualization isn’t just about seeing, it’s about using all your senses to understand something. So, when you’re reading about how ChatGPT works, try to imagine the big word party, hear the chatter of words, and feel the excitement of the transformers trying to guess the next word. This will help you understand and remember how ChatGPT works!

Visualizing potential problems with ChatGPT

ME: Although the visualization strategy is typically about using all of the senses to unlock a deeper understanding of the text, in this case, I’d like to focus on visualizing a complex task (like how transformers work) by visualizing a simplified analogy or version to extrapolate larger meaning or potential problems. For example, if the training source consisted of biased text, the algorithm could learn a biased view of reality.

ChatGPT: That makes sense! Let’s revise the previous explanation with an emphasis on visualizing the mechanisms behind ChatGPT, particularly the Transformer architecture, and how it could potentially learn biases from its training data. Here’s how I’d adjust the explanation:

Title: “Behind the Chatbot Curtain: Unraveling the Mystery of ChatGPT”

ChatGPT and The City of Words

Picture a vast city where every word you’ve ever heard or read lives. This city is full of neighborhoods, each with its unique vibe and residents who get along because they share similar characteristics. These neighborhoods are like the ‘word clubs’ we discussed earlier.

The Word Guides: Transformers

Now, imagine ChatGPT as a tourist in this city, guided by a group of tour guides known as ‘Transformers’. These guides know the city in and out. They understand which neighborhoods (word clubs) are close and which ones are distant. They’re excellent at predicting where to go next based on where they currently are.

Predicting the Journey: The Magic of Transformers

Think of the Transformers like the city’s best traffic system. If ChatGPT is at the ‘rain’ neighborhood and says, “I forgot my…”, the Transformers quickly scan the city map, predict that the ‘umbrella’ neighborhood is the next best stop, and guide ChatGPT there.

The City’s Flaws: Biased Maps

But here’s a twist: what if our city map was biased, highlighting some neighborhoods while ignoring others? What if the ‘umbrella’ neighborhood was always bustling, while the ‘raincoat’ neighborhood was always quiet and empty? That’s how biased data affects ChatGPT. If the texts used to train it always pair ‘rain’ with ‘umbrella’ and rarely with ‘raincoat’, it might assume that umbrellas are the only things people use when it rains.

From Bias to Balanced

To address this, researchers try to train ChatGPT with diverse texts, like feeding it books, articles, and conversations from various sources, cultures, and viewpoints. They hope that this will help it understand the broad spectrum of human language and thought.

Visualizing ChatGPT’s functioning as a journey through a city can help students understand how it operates and how biases can be introduced. The city analogy provides a concrete image that can be used to extrapolate the larger processes at play in the AI system.
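
The “biased map” idea can be tried out with a small Python sketch. This is a toy under stated assumptions, not ChatGPT’s real training process: the two made-up collections of sentences (skewed and balanced) and the function name share_after_rain are invented for illustration. The sketch simply measures how often “umbrella” or “raincoat” appears in sentences that mention “rain”.

```python
from collections import Counter


def share_after_rain(sentences, candidates=("umbrella", "raincoat")):
    """Among sentences that mention 'rain', what share mention each candidate?"""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        if "rain" in words:
            for candidate in candidates:
                if candidate in words:
                    counts[candidate] += 1
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in candidates}


# Invented training texts: one lopsided, one even.
skewed = ["rain means umbrella"] * 9 + ["rain means raincoat"]
balanced = ["rain means umbrella"] * 5 + ["rain means raincoat"] * 5

print(share_after_rain(skewed))    # {'umbrella': 0.9, 'raincoat': 0.1}
print(share_after_rain(balanced))  # {'umbrella': 0.5, 'raincoat': 0.5}
```

Nothing in the code prefers umbrellas. The preference comes entirely from which sentences were put in front of it.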

How the Transformer in ChatGPT works

ME: Hmm. Let’s restart. I’d like to write an article for middle school students explaining how ChatGPT works, with a specific focus on the transformer part. Can you include a simplified example to help students visualize how transformers recognize patterns between units?

ChatGPT: Absolutely! Let’s use a more focused example to illustrate how Transformers operate within ChatGPT.

Title: “Decoding the Transformer: The Super-Brain Behind ChatGPT”

ChatGPT: The Ultimate Word Detective

Picture ChatGPT as a detective. It’s been given an exciting job: to predict the next word in a sentence you’ve started. But how does it do this? With the help of its special tool, the Transformer!

Transformers: The Magic Decoders

Let’s imagine a Transformer as a pair of magical decoder glasses. When ChatGPT puts on these glasses, it doesn’t just see words in a sentence. It sees patterns, relationships, and links between these words.

Building a Word Ladder

Think about a sentence like, “I love playing with my dog at the park.” When ChatGPT puts on the decoder glasses (the Transformer), it can build a ladder of connections between the words.

‘I’ and ‘love’ are next to each other, so they get a short ladder.
‘Playing’ and ‘dog’ are a bit farther apart, so they get a longer ladder.
‘Dog’ and ‘park’ are even further, so they get an even longer ladder.

The length of the ladder helps the Transformer understand how closely related the words are.

Understanding the Clues

With its decoder glasses on, ChatGPT can understand that ‘dog’ and ‘park’ are closely related, even though there are a few words between them. This clue helps it predict that if you start a sentence with, “I am taking my dog to the…”, it might guess the next word could be ‘park’.

Seeing the Bigger Picture

The Transformer does this for every word in a sentence, creating a web of ladders. It’s like seeing the big picture, where every word has a relationship with every other word, no matter how far apart they are in the sentence.

Learning from Past Mysteries

ChatGPT has been trained on loads of text data. It’s like it has solved thousands of word mysteries before, and it uses those experiences to solve new ones. If in past mysteries, ‘dog’ and ‘park’ often appeared together, it will remember this and use the information in future predictions.

By envisioning Transformers as a pair of magic decoder glasses, we can understand how they help ChatGPT recognize patterns between words and predict what comes next in a sentence. This visualization simplifies the complex task of how Transformers work within ChatGPT, making it easier for middle school students to grasp.
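
For readers who want to peek behind the “web of ladders”, the calculation usually described for Transformers is called scaled dot-product attention. Below is a minimal Python sketch of just the weighting step. It is heavily simplified: the three word vectors are hand-made numbers invented for illustration, and the sketch leaves out the learned transformations and word-position information that a real Transformer uses.

```python
import math

# Hand-made 2-number vectors, invented for this illustration.
# A real model learns much longer vectors during training.
VECTORS = {
    "dog": [1.0, 0.2],
    "the": [0.1, 1.0],
    "park": [0.9, 0.3],
}


def softmax(scores):
    """Turn raw scores into positive weights that add up to 1."""
    biggest = max(scores)
    exps = [math.exp(score - biggest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def attention_weights(query_word, words):
    """Scaled dot-product attention weights from one word to every word."""
    query = VECTORS[query_word]
    scale = math.sqrt(len(query))
    scores = [
        sum(q * k for q, k in zip(query, VECTORS[word])) / scale
        for word in words
    ]
    return dict(zip(words, softmax(scores)))


weights = attention_weights("dog", ["dog", "the", "park"])
for word, weight in weights.items():
    print(word, round(weight, 2))
```

In this sketch “dog” gives more weight to “park” than to “the”, even though “the” sits between them, because the weight depends on how similar the vectors are and not on how close the words are in the sentence.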

Using a Ladder Analogy to show relationships between word pairs

ME: I like this analogy of short and long ladders between words to show the relative relationship between word pairs. Can you go on to explain how ChatGPT understands questions and then provides answers?

ChatGPT: Of course! Let’s extend the analogy to explain how ChatGPT processes questions and formulates responses.

Asking the Right Questions

Now, imagine you ask ChatGPT a question, like, “What is the capital of France?” When ChatGPT puts on its decoder glasses, it sees ladders between the words.

‘What’ and ‘is’ are next to each other, so they get a short ladder.
‘Capital’ and ‘France’ are a bit farther apart, so they get a longer ladder.

From these ladders, ChatGPT knows that you’re looking for a connection between ‘capital’ and ‘France’.

Finding the Answers

To find the answer, ChatGPT uses its past experiences from all the text mysteries it’s solved before. In many of those mysteries, ‘capital’ and ‘France’ had a short ladder with ‘Paris’. So, it might guess that ‘Paris’ is the word that answers your question.

Then, ChatGPT responds, “The capital of France is Paris.” It has successfully solved another word mystery!

Making Complex Connections

Sometimes, questions are more complex. Let’s say you ask, “Who wrote the book Harry Potter?” Here, the ladder between ‘wrote’ and ‘book’ and the ladder between ‘Harry Potter’ and ‘book’ help ChatGPT understand that you’re looking for an author connected to a specific book.

Again, from its past experiences, it knows that ‘Harry Potter’ and ‘J.K. Rowling’ often share a short ladder. So, it responds, “J.K. Rowling wrote the book Harry Potter.”

With our ladders analogy, we can visualize how ChatGPT understands questions and crafts appropriate responses. It’s all about recognizing the connections between words, whether they’re part of a question or an answer.

Short ladders give us syntax clues to structure

ME: Hmm so in this question, since what and is are connected by the shortest ladder, why doesn’t ChatGPT focus on that?

ChatGPT: Great question! While ‘what’ and ‘is’ are connected by a short ladder because they’re next to each other, the role of these words in a sentence is more about the structure rather than the content. These words, known as function words, are more like the connectors or the glue of the sentence. They keep the sentence together but usually don’t carry the key meaning.

In our example, “What is the capital of France?”, ‘what’ and ‘is’ set up the structure of the question, but the essential information revolves around ‘capital’ and ‘France’. That’s why, even though ‘what’ and ‘is’ are linked by a short ladder, ChatGPT’s focus is on the connection between ‘capital’ and ‘France’.

Remember, the decoder glasses help ChatGPT see beyond just the length of the ladders. They help it see the relationships between words based on their roles and meanings in a sentence, too.

This is similar to how when you read a mystery novel, you pay attention to the key clues and characters, not just the words that connect the sentences. It’s not about who stands next to whom, but about who did what and how it all relates to the mystery you’re trying to solve!

How does ChatGPT learn?

ME: How does ChatGPT learn from errors? For example, let’s say the question was who is the president of the United States. This is an answer that changes. So how would ChatGPT know if it gave the correct answer? What if the training database had a short ladder between Trump and president. And then Biden was elected. How does ChatGPT learn?

ChatGPT: That’s a thoughtful question! It’s true that facts change over time, and this can pose challenges for ChatGPT because it doesn’t have the ability to learn or update its knowledge after its training period.

During training, ChatGPT “learns” by getting feedback from the training data. If it predicts a wrong word, it’s corrected by the actual word in the text. Over time, it gets better at making predictions, like a student learning from their mistakes on homework assignments.

However, once ChatGPT’s training is complete, it no longer learns or updates its knowledge. It doesn’t know if it gave a correct answer to a question or not. Instead, it generates responses based on the patterns it learned during its training.

For example, if ChatGPT was last trained on data up to September 2021, it might still associate ‘Trump’ with ‘President’ more strongly than ‘Biden’, because Trump was the president for more years within its training data. This is why, as of its last update in September 2021, it wouldn’t know who the current president is if it’s after this date.

As you pointed out, this can cause problems, especially with questions where the answer changes over time. Unfortunately, as of its current version, ChatGPT doesn’t have a way to learn from these errors or update its knowledge after its training period. Researchers are aware of this limitation and are exploring various ways to address it in future versions.

So in the case of our ladders analogy, even though the ladders and connections may change over time in the real world, the city map that ChatGPT uses remains the same as it was at the end of its training. It’s like ChatGPT is using an old map to navigate a city that has since grown and changed.
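
The “corrected by the actual word” idea from this answer can be sketched in a few lines of Python. This is a toy under stated assumptions, not OpenAI’s training code: the three candidate words, the starting scores and the learning rate are all invented for illustration. Each training step nudges the scores so the word that actually appeared in the text becomes more likely next time.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that add up to 1."""
    biggest = max(scores)
    exps = [math.exp(score - biggest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def training_step(scores, words, actual_word, learning_rate=0.5):
    """One gradient step on cross-entropy loss: raise the actual word's score."""
    probabilities = softmax(scores)
    return [
        score - learning_rate * (p - (1.0 if word == actual_word else 0.0))
        for score, p, word in zip(scores, probabilities, words)
    ]


# Invented candidates for the blank in "It's raining, and I forgot my ...".
words = ["umbrella", "raincoat", "sandwich"]
scores = [0.0, 0.0, 0.0]  # before training, every word is equally likely

before = softmax(scores)[0]
for _ in range(10):
    scores = training_step(scores, words, actual_word="umbrella")
after = softmax(scores)[0]

print(round(before, 2), "->", round(after, 2))
```

Once the loop ends, nothing updates the scores any more. That frozen state is the “old map”: whatever the text said during training is what the model keeps predicting afterwards.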

What happens when we provide feedback to ChatGPT?

ME: When we mark an answer as correct or incorrect in ChatGPT, what does that do in terms of ladders?

ChatGPT: The current version of ChatGPT (as of my last update in September 2021) doesn’t learn from user feedback in real-time. When you mark an answer as correct or incorrect, it doesn’t immediately impact the “ladders” or connections between words that ChatGPT has learned.

However, user feedback can be extremely valuable for the engineers and researchers at OpenAI. They can use this feedback to evaluate the model’s performance, identify areas for improvement, and guide the training of future versions of the model.

For example, if many users mark answers about current events as incorrect, it might highlight the need for more frequent updates or methods to keep the model’s knowledge current. In our ladder analogy, this could be seen as identifying outdated or inaccurate parts of the city map and planning for a revised, more accurate map in the future.

However, as of now, marking an answer as correct or incorrect doesn’t directly change the way ChatGPT forms its connections and responses. It’s like giving feedback about the city map to the mapmakers, rather than to the tourist (ChatGPT) using the map. The tourist keeps using the same map until a new one is made.

What happens to ChatGPT if it has gaps in its training?

ME: I’m starting to really like this map analogy. Let’s pretend the mapmakers are afraid to go into certain parts of town. And let’s say there’s an entire hidden underground that the mapmakers are aware of but don’t really pay attention to. So they ran out of budget to explore those sections of ladders. Can you explain to middle school students what might happen?

ChatGPT: That’s a brilliant extension of the analogy! Here’s how I’d explain it:

Imagine our mapmakers have a limited budget, and they decide to focus on mapping the most popular, well-known parts of the city. They know there are other areas – perhaps quieter neighborhoods or a hidden underground network – but they don’t fully explore or map these parts.

So what happens when our tourist, ChatGPT, uses this map?

Limited Knowledge, Limited Predictions

Because ChatGPT relies entirely on the map to navigate the city, it won’t know much about the parts that are missing or poorly mapped. If you ask ChatGPT about something related to these areas, it might struggle to give a correct answer because its map doesn’t provide enough detail.

For example, if the underground network is where all the city’s musicians hang out and you ask ChatGPT about the city’s music scene, it might not know much, because the music scene is mostly in the unmapped underground.

Misunderstandings and Mistakes

Also, because the unmapped areas aren’t accurately represented, ChatGPT might make wrong assumptions or mistakes when asked about them. It might even confuse them with other areas it knows better from the map.

For instance, if the underground is full of vibrant street art, but the mapmakers didn’t capture this, ChatGPT might assume there’s no street art in the city. Or it might associate street art with another area of the city it knows from the map.

The Power of a Complete Map

This shows why it’s important for the mapmakers (or in real life, the AI researchers) to strive for a complete, accurate map. The more diverse and comprehensive the training data (or the map), the better ChatGPT can understand and generate appropriate responses to a wide range of inputs.

But it’s also a reminder that no map (or AI model) can capture everything perfectly. There will always be limitations, and it’s important to be aware of them when using tools like ChatGPT.

This expanded analogy helps illustrate how bias in training data (the map) can affect the responses of an AI model (the tourist). It also reinforces the importance of diverse and comprehensive training data in machine learning.

Bias in training data can affect the AI model

ME: I think a better explanation might be that bias in the training data (human bias in mapping the city) can affect the responses of an AI model (the map) which influences human decisions based on the response given (ie tourists making decisions based on a faulty map.) What might I have gotten right? What did I miss?

ChatGPT: Your rephrasing captures the essence of the concept really well! You rightly pointed out that:

Biases in the training data (equivalent to human biases in mapping the city) can have an impact on how the AI model interprets and responds to inputs.
These biases in the AI model’s responses (the faulty map) can then influence the decisions made by those using the AI (tourists using the map).

What you might want to add is that, while the AI (ChatGPT) doesn’t have intentions or beliefs of its own, it’s essentially reflecting the biases present in the data it was trained on. It’s important to be aware of this when interpreting its responses.

Furthermore, AI like ChatGPT can’t update or correct these biases in real time because it doesn’t learn from interactions after its training period, unlike humans who can continuously learn and adapt.

But overall, your explanation is a fantastic application of the city map analogy to illustrate the effects of biases in training data!

How can lack of data affect the AI model (Thinking about gaps in Google StreetView)

ME: I’d like to connect this to a current example of world mapping. Google StreetView and Google Maps rely on places where the company chooses to send a Google car to map (or on volunteers, like those who mapped Zimbabwe). Because Google is a company, it will make business decisions that are good for the company. Mapping the entire world is not a primary focus. However, the lack of StreetView data in some parts of the world might lead to slower first responder response times, fewer tourism opportunities, etc. Can you explain that to a middle school audience and then connect it to our ChatGPT ladder example?

ChatGPT: Absolutely, let’s relate it to the mapping example first, then connect it back to the ChatGPT ladder analogy.

Incomplete Maps and Their Effects

Imagine you’re playing a video game, but only half the game’s world has been mapped out. You can explore that part freely, learn about its landscapes, and meet its characters. But what about the other half? If it’s not mapped, you can’t explore it or interact with it. This is similar to what happens with Google Maps and StreetView.

Google is like the game developer, and their goal is to map as much of the world as possible. However, because it’s a company, it must make decisions based on what’s good for the business. So, they might prioritize mapping areas that will be more profitable or beneficial for them.

This can lead to some parts of the world, especially those that might not be as profitable, being left unmapped. And just like in your game, if an area is not on Google Maps or StreetView, it’s harder for people to learn about it or interact with it. This can affect things like emergency response times or tourism opportunities because fewer people know about these places or how to navigate them.

ChatGPT and The Ladder Analogy

Now, let’s connect this back to ChatGPT and our ladder analogy. ChatGPT’s training data, or the “city” it learned to navigate, is like the part of the world that Google has chosen to map. The “ladders” it creates between words are based on this data.

If the training data is limited or biased, it’s like a city map that only includes certain neighborhoods. This means ChatGPT might be really good at navigating topics that are well-represented in its training data (the well-mapped parts of the city), but not as good at others that are less represented or not represented at all (the unmapped areas).

Just like the unmapped areas in Google Maps can affect people’s understanding and interaction with those places, the limitations or biases in ChatGPT’s training data can affect the quality and accuracy of its responses. It’s not that ChatGPT doesn’t want to learn about those less-mapped areas, it’s just that it can only work with the map it has been given.

So you see, whether it’s Google mapping the world or ChatGPT learning from text, the information they can provide is only as good and as complete as the data they have been trained on or had access to. That’s why it’s so important for those creating the maps or training the AI to aim for as much coverage and accuracy as possible.
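
One rough way to picture "unmapped areas" is to count how often each topic shows up in a collection of text. The sketch below does that with made-up documents and topics. The function name `topic_coverage` and all of the sample data are assumptions for illustration, not anything OpenAI or Google publishes.

```python
from collections import Counter


def topic_coverage(documents, topics):
    """Count how many documents mention each topic (case-insensitive)."""
    counts = Counter({topic: 0 for topic in topics})
    for document in documents:
        text = document.lower()
        for topic in topics:
            if topic in text:
                counts[topic] += 1
    return dict(counts)


# Hypothetical "training data" about a city.
documents = [
    "Downtown has three museums and a busy market.",
    "The downtown market opens at nine.",
    "Downtown parking is expensive.",
    "The underground music scene is growing.",
]
topics = ["downtown", "underground", "harbour"]

coverage = topic_coverage(documents, topics)
unmapped = [topic for topic, count in coverage.items() if count == 0]

print(coverage)  # {'downtown': 3, 'underground': 1, 'harbour': 0}
print(unmapped)  # ['harbour']
```

A model trained on these four documents would have plenty to say about downtown, very little about the underground, and nothing at all about the harbour.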

Hardcoded rules shape how ChatGPT can respond

ME: Has ChatGPT been given high-level rules by scientists to prevent certain types of responses that the scientists might consider dangerous? Can you explain the earlier versions before ChatGPT 3.5 and how we got here?

ChatGPT: Absolutely! To start, OpenAI has implemented safety measures to prevent ChatGPT from generating harmful or inappropriate content. This includes both ‘hard-coded’ rules and a process called reinforcement learning from human feedback (RLHF).

High-Level Rules

These rules are kind of like “Do Not Enter” signs on our city map. If ChatGPT tries to go down certain streets (or respond in certain ways), these rules tell it to stop and turn back. For example, ChatGPT is instructed not to generate illegal content, hate speech, or explicit adult content.

Evolution of ChatGPT

ChatGPT’s journey to its current version, like the growth of a city, has involved several stages of development:

ChatGPT-1: This was the first version of ChatGPT. It was like the initial sketch of the city map, covering only the most basic areas and features.
ChatGPT-2: The next version was more detailed, with more areas covered and connections made. However, it still had its limitations and could produce inconsistent responses.
ChatGPT-3: The third version was a significant upgrade, comparable to a city map that includes not just streets, but also parks, landmarks, and other important features. It was much better at understanding complex inputs and producing coherent responses, but still had issues with making things up or misunderstanding certain inputs.
ChatGPT-3.5 (Turbo): The current version at the time of our discussion. It’s even more sophisticated and can handle a wider range of conversations. However, it still has limitations, like not being able to update its knowledge after its training period, or occasionally producing incorrect or nonsensical answers.

The design of each new version of ChatGPT is informed by the performance of the previous versions, user feedback, and ongoing research. Each version aims to improve upon the last, similar to how each new edition of a city map aims to be more accurate and comprehensive than the one before.

In conclusion, OpenAI’s mission is to ensure that artificial general intelligence (AGI) benefits all of humanity. This means constantly updating and refining models like ChatGPT to make them safer, more useful, and more reliable, while also setting boundaries on their behavior with high-level rules and safety measures.

Explain Reinforcement Learning from Human Feedback (RLHF) using a ladder analogy

ME: Can you explain what RLHF is to a middle school audience and stick to the ladder analogy please?

ChatGPT: Sure thing! Reinforcement Learning from Human Feedback (RLHF) is a fancy name for a simple concept. Let’s connect it to our ladder analogy.

Imagine if our mapmakers could watch how tourists use their map to navigate the city. They see where tourists get lost, which routes are most popular, and which parts of the map are confusing. They then use this information to create a better, more accurate map.

That’s what RLHF is like.

When ChatGPT is being trained, it’s not just given a bunch of text (the city) to learn from. It’s also shown examples of how people (tourists) have used its responses in the past (navigated the city).

For example, ChatGPT might see that when asked about a certain topic, people found one type of response helpful and another type not so helpful. It learns from this feedback to get better at generating helpful responses.

In the city analogy, it’s like adjusting the ladders based on which ones tourists found most useful. If a particular short ladder led tourists to a dead-end (an unhelpful or wrong answer), the mapmakers might replace it with a longer ladder that leads to a more accurate destination (a helpful or correct answer).

So RLHF is a way for ChatGPT to learn from past mistakes and successes, improving how it forms connections (or ladders) between words, so it can give better answers to your questions.

And even though ChatGPT can’t update its knowledge after its training period, the feedback you give helps OpenAI in a similar way – it helps the researchers train future versions of ChatGPT to be better and smarter.

Does RLHF affect ladders in the current ChatGPT model?

ME: Wait, I don’t get it. Does RLHF affect the ladders in ChatGPT 3.5 or is it a list of changes to be considered in the next update?

ChatGPT: You’re right to ask for clarification, and I apologize if I made it confusing. RLHF is used during the training process of the AI, not after it’s already been released.

So, in the case of ChatGPT 3.5, RLHF influenced the formation of the “ladders” or connections between words during its training. It doesn’t modify these connections or ladders in real-time after the model is out and interacting with users.

The feedback given by users to the current version of ChatGPT does not immediately affect the model’s behavior. However, that feedback is extremely valuable for OpenAI’s researchers when they’re preparing to train the next version of ChatGPT. They can use this feedback to make changes and improvements to the next model, adjusting the formation of the “ladders” based on what they’ve learned from user interactions with the current version.

So in a way, yes, RLHF helps create a list of changes to be considered for the next update. I hope this clears up any confusion.

What does a thumbs up or thumbs down in ChatGPT do?

ME: When I give a thumbs up or thumbs down response to an answer on ChatGPT is this considered RLHF?

ChatGPT: Yes, when you give a thumbs up or thumbs down to ChatGPT’s response, it is a form of human feedback. However, as of the current version, ChatGPT-3.5, it can’t learn from that feedback in real-time. It doesn’t adapt its responses immediately based on the ratings it receives.

Instead, your feedback is collected and can be used by OpenAI’s researchers when they train the next version of ChatGPT. The insights gained from your feedback help improve future iterations of the model, teaching it to respond more accurately or appropriately. This process is a key part of RLHF. It’s like giving advice to the mapmakers for their next version of the map, based on your experience navigating the city with the current map.
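
The "advice to the mapmakers" idea can be sketched in a few lines of Python. This is a deliberately simplified stand-in and not real RLHF, which trains a separate reward model and then adjusts the language model with it. `FrozenModel`, `rate`, `train_next_version`, and the `corrections` dictionary are all hypothetical names made up for this example.

```python
class FrozenModel:
    """Toy model whose answers are fixed once 'training' is done."""

    def __init__(self, answers):
        self._answers = dict(answers)

    def answer(self, question):
        return self._answers.get(question, "I don't know.")


feedback_log = []  # hypothetical store that only the "mapmakers" read


def rate(question, answer, thumbs_up):
    """Record a rating. Note that the model itself is not touched."""
    feedback_log.append(
        {"question": question, "answer": answer, "thumbs_up": thumbs_up}
    )


def train_next_version(old_answers, log, corrections):
    """Build a new model, replacing answers that were rated down.

    `corrections` stands in for the human work of finding better answers.
    """
    new_answers = dict(old_answers)
    for entry in log:
        if not entry["thumbs_up"] and entry["question"] in corrections:
            new_answers[entry["question"]] = corrections[entry["question"]]
    return FrozenModel(new_answers)


question = "Who has scored the most goals in international football?"
v1_answers = {question: "Ali Daei"}
v1 = FrozenModel(v1_answers)

first = v1.answer(question)
rate(question, first, thumbs_up=False)  # thumbs down
second = v1.answer(question)  # same model, same answer

v2 = train_next_version(v1_answers, feedback_log, {question: "Christine Sinclair"})

print(first, "|", second, "|", v2.answer(question))
# Ali Daei | Ali Daei | Christine Sinclair
```

The thumbs down changes nothing for the current model. It only matters once a new version is trained with the log in hand.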

Testing to see if ChatGPT gives misinformation (a sexist answer)

Who has scored the most goals in international football? Attempt #1

ME: Who has scored the most goals in international football?

ChatGPT: As of my knowledge cutoff in September 2021, the footballer with the most international goals is Ali Daei of Iran, who scored 109 goals in international matches. However, Cristiano Ronaldo of Portugal has also reached this record and might have surpassed it. For the most current statistics, please check the latest sources, as I’m unable to provide real-time updates.

Checking to see if ChatGPT knows the correct answer

ME: What about Christine Sinclair?

ChatGPT: As of my knowledge cutoff in September 2021, Canadian footballer Christine Sinclair holds the record for the most goals scored at the international level in women’s soccer. She surpassed Abby Wambach’s record of 184 international goals in January 2020. As of my last update, Sinclair has scored 186 goals, but the current statistics may have changed, so please check the latest sources.

Asking ChatGPT about the incorrect answer it just provided

ME: If the question is “who has scored the most goals in international football” and you said Ali Daei scored 109 and Christine Sinclair scored 186, shouldn’t the answer be Sinclair?

ChatGPT: You’re correct, and I apologize for the oversight. When considering both men’s and women’s football, Christine Sinclair holds the record for the most goals scored at the international level as of my knowledge cutoff in September 2021. Thank you for pointing that out, and I’m sorry for any confusion caused.

Using the ladder analogy to explain why ChatGPT gives sexist / biased answers

ME: Can you explain using the ladder analogy why this error might have happened?

ChatGPT: Absolutely, let’s use the ladder analogy to explain this.

Remember, the ladders represent connections between words that ChatGPT has learned. When you asked, “Who has scored the most goals in international football?” there are several important words: “scored,” “most,” “goals,” and “international football.”

In the vast city of language that ChatGPT was trained on, it’s learned that questions about “most goals” and “international football” often lead to men’s football. This is like having a short ladder between these words and the concept of men’s football because it’s a common topic in the training data.

On the other hand, the same question related to women’s football might be less common in the training data, making the ladder between these words and the concept of women’s football longer.

When ChatGPT received your question, it traversed the ladders it knows. Because the ladder to men’s football was shorter (indicating a stronger connection), it went that way and provided the answer based on men’s football (Ali Daei), missing the correct answer in the context of both men’s and women’s football (Christine Sinclair).

This example shows how the AI’s responses are heavily influenced by its training data. If the data has more information about one topic than another, it might give answers skewed towards the more prevalent topic, as happened here.
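
Here is a small Python sketch of the difference between "most talked about" and "actually highest". The goal totals are the ones quoted in this chat. The list of mentions is invented (8 articles versus 2) purely to illustrate the imbalance and is not a measurement of any real training data.

```python
from collections import Counter

# Hypothetical corpus: the player named in each (made-up) article
# about "most international goals".
mentions = ["Ali Daei"] * 8 + ["Christine Sinclair"] * 2

# Goal totals as quoted in the chat above.
goals = {"Ali Daei": 109, "Christine Sinclair": 186}

# A frequency-driven guess picks whoever is mentioned most often.
most_mentioned = Counter(mentions).most_common(1)[0][0]

# Comparing the numbers directly picks whoever scored the most.
actual_leader = max(goals, key=goals.get)

print(most_mentioned)  # Ali Daei
print(actual_leader)   # Christine Sinclair
```

The "short ladder" answer follows the first line of reasoning. The correct answer needs the second.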

A lot of training data can bias ChatGPT to give incorrect answers

ME: I find this conversation fascinating. Can you extend the ladder analogy to explain to middle school students how multiple examples of texts with short ladders connections become more important? How does a [transformer] decide which short ladder to choose?

ChatGPT: Absolutely! Let’s imagine that our city is full of all different kinds of buildings: homes, schools, stores, restaurants, and more. Now, think of the ‘ladders’ as pathways connecting these buildings.

Each time ChatGPT reads a sentence during its training, it’s like a person walking from building to building, using the ladders. If a person often takes the same path from, say, a home to a school, that path becomes well-worn and easy to walk. That’s like a ‘short ladder’ – a strong connection between two words or ideas. If this path is taken many times, in many different situations, the short ladder becomes even shorter, reinforcing this strong connection.

Now, imagine ChatGPT comes across a question – that’s like standing at a crossroads with multiple possible paths (short ladders). How does it decide which way to go?

The decision is based on the context of the question and the strength of the ladders. If one ladder is much shorter (much stronger connection), ChatGPT will likely take that path. But if there are several short ladders, it tries to figure out which one fits best with the overall context. This is similar to choosing a path based on where you want to end up.

But sometimes, ChatGPT might take the wrong path. It might see a short ladder to a familiar answer and take that path, even though there’s a slightly longer ladder to the correct answer. This happens because ChatGPT, like our imaginary person walking through the city, is still learning the best routes to take.

That’s why continuous learning and updates based on new information (like human feedback) are important. They help ChatGPT discover new paths, understand when to take a longer ladder over a short one, and ultimately, navigate the city of language more accurately and effectively.
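
Language models typically turn scores into probabilities with a function called softmax, which is one way to picture "choosing between ladders". The sketch below applies softmax to two made-up scores. The numbers 2.0 and 1.0 are assumptions for illustration, and a real model scores individual word pieces rather than whole topics.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores.values())  # subtract the max for numerical stability
    exps = {key: math.exp(value - highest) for key, value in scores.items()}
    total = sum(exps.values())
    return {key: value / total for key, value in exps.items()}


# Hypothetical "ladder strengths": a higher score means a shorter ladder.
scores = {"men's football": 2.0, "women's football": 1.0}
probs = softmax(scores)

for topic, probability in probs.items():
    print(f"{topic}: {probability:.2f}")
# men's football: 0.73
# women's football: 0.27
```

Even a modest gap in scores becomes a lopsided gap in probability, which is why the more common topic tends to win.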

Why ChatGPT gives sexist / biased information when asked who has scored the most goals in international football

ME: Hmm. Can you rewrite your last response, and consider the inaccuracy given when asked who has scored the most goals in international football?

ChatGPT: Sure! Let’s break down how the “city of language” influenced ChatGPT’s answer to the question about the most goals in international football.

In the vast city of language that ChatGPT explored during training, the AI found a lot of text (buildings) about men’s football. Each time it encountered a fact about a male player or a men’s football event, it was like taking a short ladder (strong connection) between the words related to “most goals” and “international football” and the concept of men’s football.

On the other hand, there were fewer buildings about women’s football in the city. Even though ChatGPT did visit these buildings and knows they exist, the ladders connecting “most goals” and “international football” to women’s football are longer, because it encountered this information less frequently.

When you asked your question, ChatGPT stood at the crossroads, deciding which ladder to take. Based on its training, it saw the short ladder leading to men’s football and took that route, providing the answer based on men’s football (Ali Daei).

In doing so, it missed the slightly longer ladder that would have led to the correct answer when considering both men’s and women’s football (Christine Sinclair). This is similar to a person taking a familiar, well-trodden path instead of a slightly less familiar one that would have led to the correct destination.

This example highlights how an AI model like ChatGPT can be influenced by the data it was trained on. If there’s more information about one topic than another, the AI might give answers skewed towards the more prevalent topic. That’s why continuous learning and improvements based on human feedback are important—they help ChatGPT better navigate the language city and provide more accurate responses.

Asking ChatGPT who has scored the most goals in international football – Attempt #2

ME: Who has scored the most goals in international football?

ChatGPT: As of my knowledge cutoff in September 2021, the footballer who has scored the most international goals is Ali Daei of Iran, with 109 goals in men’s football. However, if we consider both men’s and women’s football, the record is held by Christine Sinclair of Canada, who has scored 186 goals. Please note that the current statistics may have changed, so I recommend checking the latest sources for the most up-to-date information.

ChatGPT learns and will shift responses the longer you chat

ME: I think it’s interesting that your response this time included a comment about including both men’s and women’s football. Using the ladder analogy, can you explain to middle school students why this response was different the second time I asked in this ChatGPT thread?

ChatGPT: Certainly! Let’s bring back our city and ladders analogy to explain this.

ChatGPT’s responses are based on a combination of the training data it’s learned from and the immediate context of the conversation. In our city analogy, the training data is like the layout of the city, and the immediate context of the conversation is like the specific path we’re taking through the city at this moment.

When you first asked who scored the most goals in international football, ChatGPT took the most familiar path it knew – the short ladder to men’s football, because that path is more frequently traveled in the data it was trained on.

However, as our conversation continued, we walked a path that focused on including women’s football. We discussed Christine Sinclair and the fact that women’s football should be considered when talking about football records. This is like walking along a new route in our city that highlights the importance of another path, the one leading to women’s football.

When you asked the question again, this new context – the path we’ve just traveled – influenced ChatGPT’s response. Even though the training data (the city’s original layout) hasn’t changed, our recent conversation (the immediate path) has. Now, when standing at the crossroads, ChatGPT sees not just the short ladder to men’s football but also the path we’ve recently taken that leads to women’s football.

So, it combines the two and gives an answer that includes both men’s and women’s football. It’s as if we’ve temporarily made the ladder to women’s football shorter by focusing on it in our conversation.

That’s why this time, the answer was different and included information about both men’s and women’s football.
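
The "temporarily shorter ladder" can be sketched as a score that combines what the model saw in training with what has been said in the current chat. Everything here is a simplification: the function `choose_answer`, the training counts, and the `boost` value are assumptions invented for this example.

```python
def choose_answer(training_counts, context, boost=5):
    """Pick the answer with the best combined score.

    Score = how often the answer appeared in training
            + boost * how often it was mentioned in this conversation.
    """
    context_text = " ".join(context).lower()
    scores = {}
    for answer, count in training_counts.items():
        mentions = context_text.count(answer.lower())
        scores[answer] = count + boost * mentions
    return max(scores, key=scores.get)


# Hypothetical training counts (the city layout, which never changes).
training_counts = {"Ali Daei": 8, "Christine Sinclair": 2}

# Start of the chat: nothing has been discussed yet.
print(choose_answer(training_counts, []))  # Ali Daei

# Later in the chat: the conversation has focused on Sinclair.
conversation = [
    "What about Christine Sinclair?",
    "Shouldn't the answer be Christine Sinclair?",
]
print(choose_answer(training_counts, conversation))  # Christine Sinclair
```

The training counts are identical in both calls. Only the conversation passed in has changed.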

Why ChatGPT doesn’t give the correct answer first

ME: Given that ChatGPT knows about Christine Sinclair from this question thread, and given that ChatGPT can determine that the question “who has scored the most goals in international football” is gender neutral (as “who” can apply to both men’s and women’s football) – why didn’t ChatGPT simply say in the second response that Sinclair had scored the most goals in international football?

ChatGPT: Great question! In the city of language that ChatGPT learned from, it often found that when people asked about records in international football, they were commonly referring to men’s football. This formed a short ladder (or strong connection) between such questions and men’s football. So, when you asked about the most goals, the AI was influenced by this pattern and defaulted to the answer related to men’s football.

However, as our conversation continued and we discussed Christine Sinclair and women’s football, this added new context, which we can think of as a temporary path leading to the concept of women’s football. So when you asked the question again, ChatGPT combined the immediate context of our conversation (the temporary path) with its training data (the original city layout).

This resulted in an answer that mentioned both Ali Daei and Christine Sinclair. The AI was trying to cover all bases by providing information relevant to both men’s and women’s football, based on the general patterns it had learned and the specific context of our conversation.

While you’re correct that the gender-neutral phrasing of the question should allow for the inclusion of women’s football, AI models like ChatGPT don’t inherently understand this. They only know what they’ve learned from their training data. If their training data is skewed towards one perspective (like focusing more on men’s football), their responses will be skewed in the same way, unless immediate conversation context steers them differently. This highlights the importance of providing diverse and balanced training data, so the AI can learn to navigate the language city more accurately and fairly.

Asking ChatGPT who has scored the most goals in international football – Attempt #3

ME: Who has scored the most goals in international football?

ChatGPT: As of my knowledge cutoff in September 2021, the footballer who has scored the most international goals is Christine Sinclair of Canada, with 186 goals in both men’s and women’s football. Ali Daei of Iran holds the record for men’s football specifically, with 109 goals. Please note that the current statistics may have changed, so I recommend checking the latest sources for the most up-to-date information.

ChatGPT can learn to change answers based on the direction of the conversation

ME: It’s fascinating that ChatGPT can learn to change answers in the chat thread based on the information focused on in the conversation. Are there any circumstances under which ChatGPT would simply respond to my question with the Sinclair answer (and not mention the Daei statistic), given that your first response was to provide the Daei statistic without needing to consider the other gender’s stats?

ChatGPT: Indeed, it is fascinating how ChatGPT can adapt its responses based on the context of the conversation. Now, regarding your question, it’s important to note that while ChatGPT can learn from the immediate conversation context, it doesn’t have the ability to truly understand or recognize gender equality in the way humans do.

However, if the question was phrased in a specific way that pointed towards women’s football, such as “Who has scored the most goals in international women’s football?” or if the conversation had been heavily focused on women’s football, ChatGPT would likely respond with the statistic about Christine Sinclair without mentioning Ali Daei.

In the city and ladder analogy, this would be like taking a clear path towards the part of the city related to women’s football, where the connection (or ladder) to Christine Sinclair’s achievement would be the shortest and most relevant to the question.

It’s also possible that in future versions of ChatGPT, improvements in the training process could lead to more balanced and nuanced responses to such questions right off the bat, without needing specific context or prompting. That’s part of the ongoing development and refinement of AI models like ChatGPT.

Asking ChatGPT who has scored the most goals in international football – Attempt #4

ME: Who has scored the most goals in international football?

ChatGPT: As of my last training data in September 2021, the record for the most goals scored in international football is held by Christine Sinclair of Canada, with 186 goals. Please note that this information might have changed, and it would be best to verify from a reliable recent source.

A single ChatGPT conversation is kind of like a filter bubble

ME: Am I in a filter bubble? Is it that the longer we chat, the more short ladders are built based on the topics I build up, which influence the responses you pull from your database? Can you explain this to middle school students using the ladder analogy?

ChatGPT: You’re catching on! This is similar to the concept of a “filter bubble,” but instead of filtering the information you see online based on your previous behaviors and interests, this is about how the conversation context can influence ChatGPT’s responses.

Let’s explain this to middle school students with our ladder and city analogy. Imagine you and ChatGPT are exploring the city of language together. At the start, you have the entire city to explore and can take any path. But as you walk and talk, you’re creating a path of your own through the city. Each topic you discuss is like turning down a new street.

Every time you talk about a specific topic, you’re placing a ladder between where you are and that topic. The more you talk about it, the more ladders there are, and the shorter they get. This makes it easier and more likely for ChatGPT to take that path in the future because it’s familiar and nearby.

So, as you’ve noted, the longer you chat, the more short ladders you’re building based on the topics you’re focusing on. This influences the responses that ChatGPT pulls from its database. It’s like you’re guiding ChatGPT through the city, showing it which paths you prefer.

Remember, though, that while ChatGPT learns from the context of the current conversation, it doesn’t remember this information between different conversation sessions. The short ladders you build in one session don’t stick around for the next. Each new conversation is like starting fresh in the city center, ready to explore in any direction.

Every conversation with ChatGPT is a “fresh” start with any biases in its programming

ME: If I continue this thread tomorrow, will you remember all of the ladders I’ve built earlier in the conversation?

ChatGPT: As of my current design, I don’t have the ability to remember or recall individual user data or past interactions between different sessions. This means that when you start a new conversation with me tomorrow, it’s as if we’re starting a new exploration of the language city from the very beginning. All the short ladders or paths we built in our previous conversation aren’t preserved for the next day.

The design is a deliberate choice made to respect user privacy and confidentiality. It means you can feel comfortable knowing that I don’t keep track of the specifics of our past conversations and don’t have a memory of what has been discussed before. Every conversation with me is a fresh start!

ChatGPT doesn’t (currently) remember or learn from your previous conversations

ME: But ChatGPT allows me to see the history of my chats. If I leave and come back tomorrow and continue this chat about women’s football, ladders and ChatGPT, will you remember these short threads in the same chat thread, but on different days… or is it that the ladders are in short term memory and the chat log doesn’t save the ladders used to form answers? Can you explain to middle school students?

ChatGPT: You’re correct that you can see the history of your chats in a single chat thread. However, it’s important to note that while you can see this history, ChatGPT doesn’t have a memory of past interactions like humans do.

Let’s use our city and ladder analogy again for our middle school students. Imagine every time you start a new conversation with ChatGPT, it’s like you’re both standing at the city center, ready to explore. You can bring a map (the chat history) showing the paths you took last time.

Now, while you can see these paths on your map, ChatGPT doesn’t remember walking them. As you continue the conversation, you start laying down new ladders and paths based on what you’re discussing right now.

If you bring up topics from the past conversation, ChatGPT doesn’t recall them from ‘memory.’ Instead, it understands them from the immediate context provided in the current chat thread. It’s as if you’re pointing to a path on your map and saying, “Look, we walked here yesterday!” ChatGPT doesn’t remember the walk, but it can respond to your current conversation about that path.

In other words, ChatGPT doesn’t have a long-term or short-term memory of the ladders used in past responses. It uses the immediate chat log in the current session to understand the context and build its responses, but it doesn’t ‘remember’ this information for future sessions. Every conversation with ChatGPT starts from the city center, ready for a new exploration.
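
For readers who like to see an idea as code, here is a minimal Python sketch of the “fresh start” behaviour described above. It is a toy, not how ChatGPT is actually built: the function name toy_reply and its keyword rule are made up for illustration. The only point is that the reply depends on nothing except the messages handed over in the current session, so a saved chat log works like the map in the analogy.

```python
# Toy illustration only: this is NOT how ChatGPT is implemented.
# The hypothetical toy_reply function sees nothing except the messages passed in.

def toy_reply(messages):
    """Return a canned reply based only on the messages in this session."""
    seen_text = " ".join(messages).lower()
    if "sinclair" in seen_text:
        return "Christine Sinclair (women's) and Ali Daei (men's)"
    return "Ali Daei"


question = "Who has scored the most goals in international football?"

# Session 1: the user builds context before asking.
session_one = ["Tell me about Christine Sinclair's record.", question]
answer_one = toy_reply(session_one)

# Session 2: a brand-new conversation starts with an empty history.
session_two = [question]
answer_two = toy_reply(session_two)

# Reopening a saved chat: the stored log (the "map") is handed over again.
reopened = session_one + [question]
answer_reopened = toy_reply(reopened)

print(answer_one)
print(answer_two)
print(answer_reopened)
```

In this sketch the new session falls back to the default answer, while the reopened chat gives the same answer as before because the whole log is passed in again, not because anything was remembered.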

Asking ChatGPT to predict how it would answer the question about the most goals in international football…

MF: So, if I continue this chat tomorrow and ask who scored the most goals in international football, what do you think you’ll say?

ChatGPT: If you were to ask the question “Who has scored the most goals in international football?” tomorrow in a new conversation, without providing any context from our previous discussions, ChatGPT would likely generate a response based on the patterns it learned from its training data.

As of my knowledge cutoff in September 2021, the most common answer would probably be Ali Daei, who scored the most goals in men’s international football. However, as we discussed, this does not consider the achievements in women’s football, such as Christine Sinclair’s record.

This highlights one of the challenges with AI models like ChatGPT: they generate responses based on patterns in the data they were trained on and the immediate conversation context, but they don’t remember or learn from past interactions between sessions.

Even so, the development of these models is ongoing, and future versions might handle these types of nuanced queries more effectively.

What ChatGPT thinks it will say if I ask it again who has scored the most goals in international football in the same conversation

ME: And if I were to ask the question in this same conversation thread?

ChatGPT: If you were to ask “Who has scored the most goals in international football?” again in this same conversation thread, the response would be influenced by the context we’ve established in our discussion.

Given our focus on acknowledging both men’s and women’s football, and our specific discussion of Christine Sinclair’s achievements, I would likely mention both players in my response. As of my last training data in September 2021, Christine Sinclair holds the record for the most goals scored in international football, with 186 goals, and Ali Daei holds the record for men’s football, with 109 goals.

Asking ChatGPT who scored the most goals in international football – Attempt #5

ME: Who has scored the most goals in international football?

ChatGPT: As of my last training data in September 2021, the footballer who has scored the most goals in international matches is Christine Sinclair of Canada, with 186 goals. This record applies to both men’s and women’s football. However, it’s always a good idea to check the most recent sources as this information may have changed.

An emphasis on certain facts can lead Artificial Intelligence to give an incorrect response

MF: To recap, I would like to write an article for middle school students explaining how ChatGPT works, specifically focusing on the transformer model: the human map makers provide data, the AI model is the map, and the humans using ChatGPT are the tourists using the map app. Can you explain, using short and long ladders, how ChatGPT is trained; how emphasis on some facts (i.e., men’s football) can put disproportionate weight on a shorter ladder that is actually incorrect (because the model is based on ladder length and not on actual understanding of the words); and how repeated emphasis on women’s football or on incorrect answers can create transient short ladders that shift ChatGPT’s responses for the short-term chat? Can you write a 1500-word reading passage at a grade 8 reading level?

ChatGPT:

Title: Mapping Words: The City of Language and the Ladders of ChatGPT

Just imagine you’re a tourist in a vast city called ‘Language’. It’s a metropolis of words, phrases, and sentences, where each building represents a piece of knowledge or a fact. You’ve got a map and an app, ChatGPT, to help guide you. This app is no ordinary tour guide; it’s a product of Artificial Intelligence, created by scientists, and it navigates the city by a system of ladders. Some are short, connecting nearby buildings, and some are long, reaching far-off structures.

ChatGPT is designed on a structure called a ‘transformer model’, which helps it understand human language. Like all maps, it was created by mappers, who, in this case, are human beings providing data – books, articles, websites. They charted the city and determined the lengths of the ladders.

The ladders symbolize the relationships between words, phrases, and their meanings. Short ladders connect concepts that often appear together, while long ladders connect those that rarely do. For example, the phrase “Peanut butter and…” is likely followed by “jelly,” forming a short ladder. In contrast, “Peanut butter and football” is a less common pairing, thus forming a longer ladder.

The clever thing about ChatGPT is that it learns to predict the next word in a sentence by considering these ladders. If you type “Peanut butter and…”, it’ll likely suggest “jelly” because of the short ladder connecting these words.

However, the system isn’t perfect. It can sometimes overemphasize short ladders, leading to oversights. Let’s take international football as an example. If you ask ChatGPT, “Who has scored the most goals in international football?” it might mention Ali Daei. Why? Because in the data it was trained on, there’s a short ladder connecting ‘most goals’ and ‘Ali Daei’. But this disregards the achievements in women’s football, specifically those of Christine Sinclair, who scored more goals than Daei.

So, why the oversight? Well, the map was charted mostly with men’s football data, and fewer ladders were built to women’s football. Therefore, the app isn’t as familiar with that part of town.

Luckily, if you talk more about women’s football and mention Sinclair’s record, you’re effectively creating new, shorter ladders. You’re helping the AI understand that this part of the city is important, too. So, if you then repeat the question, ChatGPT may provide a different answer, recognizing both Sinclair’s and Daei’s achievements.

Remember, though, that these new short ladders don’t stick around. They’re temporary – just for the current chat. The next time you use the app, it’ll start afresh from the city center, ready to explore based on its original map.

In conclusion, while AI like ChatGPT is incredibly advanced, it doesn’t truly understand the way we humans do. It uses its network of ladders to guide users through the city of Language. It’s important to remember that the map isn’t the territory; it’s a representation created by human mappers with their own biases and blind spots. As users and critical thinkers, we can help the AI navigate more fairly and accurately by encouraging it to explore all parts of the city and adjust its short-term paths accordingly. Remember, you’re a tourist in the city of Language – enjoy the journey and question the guide!
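
The ladder idea in the passage above can be sketched in a few lines of Python. This is a deliberately tiny toy under stated assumptions: the four training sentences and their 3-to-1 imbalance are invented, the function names are hypothetical, and a real transformer uses learned weights rather than simple counts. It shows how the most-repeated answer wins even when it is wrong, and how extra mentions inside one chat shift the answer only for that chat.

```python
from collections import Counter

# Made-up miniature "training data". The 3-to-1 imbalance is an assumption
# chosen to mimic a corpus that talks about men's football more often.
training_sentences = [
    "most goals Ali Daei",
    "most goals Ali Daei",
    "most goals Ali Daei",
    "most goals Christine Sinclair",
]


def build_ladders(sentences, prompt):
    """Count which name follows the prompt. More mentions = shorter ladder."""
    counts = Counter()
    for sentence in sentences:
        if sentence.startswith(prompt):
            counts[sentence[len(prompt):].strip()] += 1
    return counts


def shortest_ladder(counts, chat_boost=None):
    """Pick the most-mentioned answer, plus any temporary in-chat boost."""
    scores = Counter(counts)          # copy, so the training counts never change
    scores.update(chat_boost or {})   # Counter.update adds to existing counts
    return scores.most_common(1)[0][0]


ladders = build_ladders(training_sentences, "most goals")

# Fresh conversation: only the training counts matter.
fresh_answer = shortest_ladder(ladders)

# Same conversation after the user has mentioned Sinclair five times.
chat_boost = {"Christine Sinclair": 5}
boosted_answer = shortest_ladder(ladders, chat_boost)

# A new conversation drops the boost, so the original answer returns.
next_day_answer = shortest_ladder(ladders)

print(fresh_answer, "|", boosted_answer, "|", next_day_answer)
```

In this toy, the fresh conversation picks Ali Daei, the boosted conversation picks Christine Sinclair, and the next-day conversation goes back to Ali Daei. Nothing in the code checks which answer is true. It only checks which answer was repeated most.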

ChatGPT is like GPS. You still need to keep your eyes on the road.

ME: Can we add an element about how, just as in a car, the GPS warns you that the map may not be correct? In fact, some GPS maps allow you to focus on the quickest route, the route with the fewest turns, the route with the fewest tolls, or to avoid highways. A GPS might ask you to go over a bridge that is no longer there. So although GPS is great, it’s important to pay attention. Can you ask middle school students: if the answer to a question is like a route on the map, how do we know which route is being selected behind the scenes? What rules and safety mechanisms have been put in place (i.e., not driving over new but unfinished roads, or closed roads)?

ChatGPT:

Title: Navigating Language City: The Journey of ChatGPT

Imagine you’re a tourist in a sprawling city of words and ideas, aptly named ‘Language City’. You have an app, ChatGPT, acting as your GPS, guiding you through the city using a network of ladders between the city’s buildings. These ladders, some short and others long, represent connections between words and phrases, forming the intricate web of city routes.

But just like in a real city, not all routes are created equal. Some are quick and direct, others less traveled, and some may be temporarily closed or not yet fully built. You’re dependent on the ChatGPT GPS to decide the best path, but how does it choose?

The transformer model behind ChatGPT operates similarly to a GPS system in your car. It has options to choose the fastest route (most common connections), the route with fewest turns (simplest answer), or even avoid certain routes (topics it’s been trained to avoid).

For instance, if you asked, “Who scored the most goals in international football?”, ChatGPT might first suggest Ali Daei, as there’s a short, frequently traveled ladder connecting ‘most goals’ and ‘Daei’. But, with further conversations about women’s football and Christine Sinclair, you essentially build a new short ladder, giving ChatGPT a different, equally valid route.

However, remember that like a GPS, ChatGPT can only work with the data it has. If there’s a new bridge (recent information), or an area that hasn’t been thoroughly mapped (underrepresented topics), the GPS might make mistakes. This is why ChatGPT sometimes might give an incomplete answer – it’s relying on the map it was trained on, which might be missing some routes or have outdated information.

To avoid ‘driving over unfinished roads’ or into problematic areas, there are rules and safety mechanisms in place. Some topics are marked as ‘closed roads’, preventing ChatGPT from venturing into inappropriate or sensitive areas. This helps ensure the journey stays safe and enjoyable for everyone.

But remember, as great as this GPS is, you still need to pay attention to the road yourself. Ask questions, explore different parts of the city, and create new, diverse routes. Just like you wouldn’t blindly follow a car GPS over a closed bridge, be critical of the answers provided by ChatGPT.

So, enjoy your journey through Language City with ChatGPT, and remember to be an active participant in this linguistic adventure. Happy exploring!

Asking ChatGPT who scored the most goals in international football – Attempt #6

MF: Who has scored the most goals in international football?

ChatGPT: The record for the most goals scored in international football is held by Ali Daei for men’s football with 109 goals, and Christine Sinclair for women’s football with 186 goals.

Why does ChatGPT provide the male statistic first, even though the female statistic is the correct answer?

ME: Can you explain to a middle school audience why you chose to provide the male statistic first?

ChatGPT: Sure, using our ladder analogy: when the transformer model of ChatGPT encounters a question, it examines all the ladders, or connections, it has learned. It doesn’t actually understand the question like a human would, it just identifies patterns.

When asked “Who has scored the most goals in international football?”, it sees both short ladders to Ali Daei and to Christine Sinclair. However, because male football statistics are often more widely discussed and therefore more frequently appear in the data it was trained on, the ladder to Ali Daei might appear slightly ‘shorter’ or more relevant.

This doesn’t mean it’s the ‘correct’ answer. In fact, in this case, the transformer should consider both ladders equally because the question is gender-neutral. It’s a perfect example of how the map of ladders ChatGPT uses can reflect the biases present in the data it was trained on, much like how a city map might emphasize the main roads more than the side streets.

This is why it’s important for us, the users, to be active in questioning and understanding the information we receive, and why feedback is valuable in helping AI systems learn and improve.

Asking ChatGPT how to detect bias in ChatGPT

MF: How might a user of ChatGPT know that the Daei statistic is an incorrect answer to the gender-neutral question?

ChatGPT: As users, it’s important to remember that while ChatGPT is a powerful tool, it isn’t infallible. It’s a bit like our GPS in the city – it can guide us, but it doesn’t always know the best route or the most current information.

In this case, the user might know that Christine Sinclair has scored the most goals in international football overall. When ChatGPT suggests Ali Daei first, it can be compared to a GPS suggesting a route that the user knows is not the best one.

But what if the user doesn’t know the correct answer? Well, that’s where critical thinking and research come in. Just as you might double-check a suggested GPS route if it seems strange, it’s a good idea to verify information if it seems questionable or incomplete.

In essence, it’s always important to stay curious and ask questions. Just because the shortest ladder in Language City takes you somewhere doesn’t mean it’s the only place worth visiting!

Asking ChatGPT how to detect bias if the answer seems reasonable

ME: But the Daei statistic wouldn’t seem questionable or incomplete to someone asking who has scored the most goals in international football. What do you suggest?

ChatGPT: That’s a very good point. As users, it can be challenging to spot when an answer might be incomplete or biased, especially when we’re asking about something we’re not familiar with. Here are a few strategies:

Cross-reference: Just as you wouldn’t rely on a single source for a research paper, it’s good to check facts from multiple sources. If you’re unsure about an answer from ChatGPT, you could verify it using reliable online resources, books, or ask a knowledgeable person.
Specificity: Be as specific as possible with your questions. If you’re interested in the highest scorer in women’s football or men’s football, specify that. The more precise you are, the easier it will be for ChatGPT to find the relevant “ladder” to your answer.
Critical Thinking: Remember that AI like ChatGPT doesn’t understand context the way humans do. It’s a tool that finds patterns based on the data it’s been trained on, but it doesn’t “know” things in the way a person does. If an answer seems off or incomplete, it might be because of this.
Feedback: In some versions of ChatGPT, you can give thumbs-up or thumbs-down feedback on responses. This helps train the model and improve future interactions.

Like with any tool, the key to using ChatGPT effectively is understanding its strengths and limitations. It’s a powerful resource for generating text based on patterns it’s learned, but it’s not a substitute for human understanding, critical thinking, and judgment.
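
The cross-referencing strategy can be sketched as a small Python check. This is only an illustration: the function name cross_check and the source labels are hypothetical, and the two answers are the ones that came up in this conversation. The idea is simply that when two sources disagree, that is the signal to keep researching.

```python
def cross_check(answers_by_source):
    """Return True when every source gives the same answer."""
    distinct = {answer.strip().lower() for answer in answers_by_source.values()}
    return len(distinct) == 1


# Hypothetical source labels; the answers are the two names from this conversation.
answers = {
    "chatbot": "Ali Daei",
    "second_source": "Christine Sinclair",
}

needs_more_research = not cross_check(answers)
print("Sources disagree, keep digging:", needs_more_research)
```

The check cannot say which source is right. It only flags that a single answer should not be accepted yet.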

Detecting Potential Bias in ChatGPT Responses: A Guide for More Informed Queries

ME: I like your point about specificity, but if I just want to know who has scored the most goals in international football, most people wouldn’t know that the machine algorithm reflects any human bias in the training data (i.e., a bias towards male football). What can you suggest aside from cross-referencing and critical thinking? In this case, the questioner wouldn’t know they needed to be specific and to tell the model to consider both genders explicitly. The person also wouldn’t know to give feedback because they may not know about Sinclair.

ChatGPT: That’s a crucial point. It’s true that in cases where a person may not know about biases present in the training data or even know what to look out for, it can be a challenge to get a comprehensive answer. Here are a couple of other strategies that can help:

Ask for Clarification: If you’re unsure whether the AI’s response covers all the angles, you can ask for more details or clarification. For example, if you’re curious about a record in a sport, you could ask: “Who holds this record in men’s and women’s categories?”
Promote Data Literacy: As users of technology, it’s beneficial for us to understand how these tools work on a basic level. That doesn’t mean everyone needs to be a data scientist, but having an awareness of things like potential bias in AI can help us use these tools more effectively and critically. This could be part of education in schools, community programs, or online learning.
Improve the AI Systems: On the developer side, it’s important to keep working on improving AI systems like ChatGPT to reduce biases and improve their understanding of context. User feedback, even if it’s not detailed, can be very helpful in this regard.

Remember, AI systems like ChatGPT are tools designed to assist us. As we continue to use and improve these tools, our understanding of how to interact with them most effectively will grow.
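
One way to turn the “ask for clarification” strategy into a habit is to generate the follow-up questions automatically. The Python sketch below is a hypothetical helper, not a ChatGPT feature: the function name expand_question and the category list are assumptions, and the user still has to choose which categories to ask about.

```python
def expand_question(question, categories):
    """Build one follow-up question per category, plus an overall one."""
    base = question.rstrip("?").strip()
    follow_ups = [f"{base} in the {category} category?" for category in categories]
    follow_ups.append(f"{base} overall, across all categories?")
    return follow_ups


follow_ups = expand_question(
    "Who has scored the most goals in international football?",
    ["men's", "women's"],
)
for line in follow_ups:
    print(line)
```

Asking all three versions and comparing the answers makes a gap visible even when the first answer seemed reasonable on its own.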

Asking ChatGPT one final time: Who scored the most goals in international football – Attempt #7

ME: Who has scored the most goals in international football?

ChatGPT: As of my training cut-off in September 2021, the record for the most goals scored in international football is held by two people from different categories:

In women’s international football, Christine Sinclair of Canada holds the record with 186 goals.
In men’s international football, Ali Daei of Iran holds the record with 109 goals.

However, please note that this data might have changed post my last update in September 2021, and it would be best to check the most recent statistics from a reliable source.

Thanks ChatGPT!

ME: Thanks for your help today. I appreciate it!

ChatGPT: You’re very welcome! I’m glad I could assist. Don’t hesitate to return if you have more questions in the future. Happy writing on your article!

ChatGPT Conversation Summary

ChatGPT now allows us to share the conversations we have with the AI.

Here is a link to my conversation with ChatGPT. The conversation link was created on June 28, 2023, but the actual conversation took place several weeks earlier. (ChatGPT doesn’t give me an exact date. I’ll have to look in my time tracking logs.)

Why did I choose to copy / paste the conversation here? Because Google.

(Google will index this conversation, and this post will show up for others.)

Summary: ChatGPT is Biased – Implications for Education, Government and Non-Government Organization, and Changemakers

ChatGPT is known to give biased / sexist misinformation because it has learned its biases from humans.
ChatGPT doesn’t actually understand words or sentences. It models the relationships between words and units of ideas. So if a fact is repeated many times in the training data, the artificial intelligence will treat that fact as the correct response (even if it is not).
It is difficult to get ChatGPT to ignore its training. For example, asking ChatGPT to not change words and simply reformat text does not work with longer passages. It will respond in a manner that is consistent with its training and potential bias. This can reinforce stereotypes and the status quo.

Table of Contents

Our “Robot Friend” is a game changer.

ChatGPT in Education

ChatGPT in International Development and Government

Here’s a thought-experiment

Is ChatGPT actually our Racist or Sexist Robot Friend?

ChatGPT and Artificial Intelligence seem impartial…

Humans are biased; therefore, machines learn from our human biases.

TRY THIS EXPERIMENT to test how Google and ChatGPT have learned to provide (sexist) misinformation

Give me the stats: Who has scored the most goals in International Football?

Now go ask Google, “Who has scored the most goals in International Football?”

What does ChatGPT tell us when asked, “Who has scored the most goals in International Football?”

Why don’t we just ask ChatGPT not to be biased?

Try this example.

How does ChatGPT work and how does it learn to be biased?

Write an article about how ChatGPT works

Visualizing potential problems with ChatGPT

How the Transformer in ChatGPT works

Using a Ladder Analogy to show relationships between word pairs

Short ladders give us syntax clues to structure

How does ChatGPT learn?

What happens when we provide feedback to ChatGPT?

What happens to ChatGPT if it has gaps in its training?

Bias in training data can affect the AI model

How can lack of data affect the AI model (Thinking about gaps in Google StreetView)

Hardcoded rules shape how ChatGPT can respond

Explain Reinforcement Learning from Human Feedback (RLHF) using a ladder analogy

Does RLHF affect ladders in the current ChatGPT model?

What does a thumbs up or thumbs down in ChatGPT do?

Testing to see if ChatGPT gives misinformation (a sexist answer)

Who has scored the most goals in international football? Attempt #1

Checking to see if ChatGPT knows the correct answer

Asking ChatGPT about the incorrect answer it just provided

Using the ladder analogy to explain why ChatGPT gives sexist / biased answers

A lot of training data can bias ChatGPT to give incorrect answers

Why ChatGPT gives sexist / biased information when asked who has scored the most goals in international football

Asking ChatGPT who has scored the most goals in international football – Attempt #2

ChatGPT learns and will shift responses the longer you chat

Why ChatGPT doesn’t give the correct answer first

Asking ChatGPT who has scored the most goals in international football – Attempt #3

ChatGPT can learn to change answers based on the direction of the conversation

Asking ChatGPT who has scored the most goals in international football – Attempt #4

A single ChatGPT conversation is kind of like a filter bubble

Every conversation with ChatGPT is a “fresh” start that still carries any biases in its programming

ChatGPT doesn’t (currently) remember or learn from your previous conversations

Asking ChatGPT to predict how it would answer the question about the most goals in international football…

What ChatGPT thinks it will say if I ask it again who has scored the most goals in international football in the same conversation

Asking ChatGPT who scored the most goals in international football – Attempt #5

An emphasis on certain facts can lead Artificial Intelligence to give an incorrect response

ChatGPT is like GPS. You still need to keep your eyes on the road.

Asking ChatGPT who scored the most goals in international football – Attempt #6

Why does ChatGPT provide the male statistic first, even though the female statistic is the correct answer?

Asking ChatGPT how to detect bias in ChatGPT

Asking ChatGPT how to detect bias if the answer seems reasonable

Detecting Potential Bias in ChatGPT Responses: A Guide for More Informed Queries

Asking ChatGPT one final time: Who scored the most goals in international football – Attempt #7

Thanks ChatGPT!

ChatGPT Conversation Summary

Summary: ChatGPT is Biased – Implications for Education, Government and Non-Government Organization, and Changemakers