
ChatGPT has entered the classroom: how LLMs could change education

What is the future of learning, and of assessing learning, in higher education? Examining the problems of GPT-3.5 and its successor, GPT-4

There are still problems to be solved. Questions remain about whether LLMs can be made accurate and reliable enough to be trusted as learning assistants. The future shape of education is not yet clear, but more institutions should explore the benefits and pitfalls of this new technology, or their students will miss out on a useful tool.

Wei Wang, a computer scientist at the University of California, Los Angeles, found that GPT-3.5 (which powers the free version of ChatGPT) and its successor, GPT-4, got a lot wrong when tested on questions in physics, chemistry, computer science and mathematics taken from university-level textbooks and exams2. Wang and her colleagues experimented with different ways of querying the two GPT bots. With the best method, GPT-4 scored 80% on one exam, but it still answered about one-third of the textbook questions incorrectly.

In a February preprint, researchers described how, on a benchmark set of relatively simple mathematical problems of the kind usually answered by students aged 12–17, ChatGPT got only about half of the questions right2. It was even more likely to fail on complex problems that required combining four or more operations in a single calculation.

Teachers were concerned when ChatGPT was launched a year ago. The chatbot can research, write and answer assignment questions, forcing schools to rethink how they assess students. A few countries brought back pen-and-paper exams, and some schools moved to a flipped-classroom model, in which students learn material at home and then work through their assignments in class.

An education specialist who has worked at UNESCO for more than two decades says that understanding the limitations of artificial intelligence is crucial. At the same time, LLMs are now so bound up in human endeavours that he says it is essential to rethink how to teach and how to assess learning. “It’s redefining what makes us human, what is unique about our intelligence.”

He likens the attention that LLMs are attracting to that previously lavished on massive open online courses and on educational uses of the 3D virtual worlds known as the metaverse. Neither has had the transformative power that some once predicted, but both have their uses. “In a sense, this is going to be the same. It’s not bad. It is not perfect. It’s not everything. It’s a new thing,” he says.

An important question around the use of AI in education is who will have access to it, and whether paid services such as Khanmigo will exacerbate existing inequalities in educational resources. Khan Academy is looking for philanthropists and grants to help pay for computing power, as well as to provide access for under-resourced schools, according to DiCerbo. “We are working to make sure that digital divide doesn’t happen,” she says.

An investor in educational-technology companies in New York City says that retrieval-augmented generation (RAG) is used at one of the universities most progressive in its adoption of LLMs. After an initial narrow release for testing, Arizona State University (ASU) launched a toolbox in October that enables its faculty members to experiment with LLMs in education through a web interface, including access to six LLMs, such as GPT-3.5 and GPT-4.

AI company Merlyn Mind, in New York City, is using RAG in its open-source Corpus-qa LLM, which is aimed at education. Like ChatGPT, Merlyn Mind’s LLM is initially trained on a large body of text that is not specifically related to education; this is what gives it its conversational ability.

But when the LLM answers a question, it doesn’t rely solely on what it learned in its training. It also refers to a specific corpus of information, which minimizes hallucinations and other errors, says Satya Nitta, the company’s chief executive. To further resist hallucination, Merlyn Mind fine-tunes its LLMs when they fail to produce a high-quality response, working towards producing a better answer.
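
The article does not spell out how such a retrieval step works in practice, but the general pattern is straightforward: fetch the corpus passages most relevant to the question, then ask the model to answer using only those passages. The sketch below is illustrative only, not Merlyn Mind's implementation, and `call_llm` is a hypothetical stand-in for whatever chat-completion API a real system would use.

```python
# Illustrative retrieval-augmented generation (RAG) sketch — not Merlyn Mind's
# actual implementation. `call_llm` is a hypothetical placeholder.

def retrieve(question: str, corpus: list[str], k: int = 3) -> list[str]:
    """Rank corpus passages by naive word overlap with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(
        corpus,
        key=lambda passage: len(q_words & set(passage.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for a real chat-completion call."""
    raise NotImplementedError("Plug in an actual LLM API here.")


def answer_with_rag(question: str, corpus: list[str]) -> str:
    """Ground the model's answer in retrieved passages to reduce hallucination."""
    passages = retrieve(question, corpus)
    prompt = (
        "Answer the question using ONLY the passages below. "
        "If they do not contain the answer, say so.\n\n"
        + "\n".join(f"- {p}" for p in passages)
        + f"\n\nQuestion: {question}"
    )
    return call_llm(prompt)
```

Production systems typically replace the word-overlap ranking with vector-embedding search, but the grounding principle is the same: the model is told to answer from supplied documents rather than from memory alone.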

Tutoring companies are also experimenting with LLMs. In April, an education-technology firm created an assistant based on GPT-4. And TAL Education Group, a Chinese tutoring company based in Beijing, has created an LLM called MathGPT that it claims is more accurate than GPT-4 at answering maths-specific questions. It can help students by explaining how to solve problems.

Students are at risk of being led astray by ChatGPT. Despite its strong performance in business, legal and academic settings, the bot is brittle: it gets things wrong when a question is phrased differently, and it even makes things up.

But whether Khanmigo can truly revolutionize education is still unclear. LLMs are trained to predict the most likely next word in a sentence, not to check facts. They therefore sometimes get things wrong. To improve its accuracy, the prompt that Khanmigo sends to GPT-4 now includes the right answers for guidance, says DiCerbo. It still makes mistakes, however, and Khan Academy asks users to let the organization know when it does.
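
Khan Academy has not published its exact prompt, so the template below is only a hypothetical illustration of the idea: pass the known correct answer to the model so that it can judge the student's work without revealing the solution. The field names and wording are invented.

```python
# Hypothetical illustration of including a known correct answer in a tutoring
# prompt for guidance. The real Khanmigo prompt is not public; these field
# names and instructions are invented for the example.

TUTOR_TEMPLATE = """You are a patient maths tutor.
The correct answer to the current exercise is: {correct_answer}
Never state this answer directly. Use it only to judge whether the
student is on the right track and to ask guiding questions.

Exercise: {exercise}
Student's latest message: {student_message}
"""


def build_tutor_prompt(exercise: str, correct_answer: str, student_message: str) -> str:
    """Fill the template so the model can check work without giving the answer away."""
    return TUTOR_TEMPLATE.format(
        exercise=exercise,
        correct_answer=correct_answer,
        student_message=student_message,
    )


print(build_tutor_prompt("Solve 3x + 5 = 20", "x = 5", "Is x = 15 right?"))
```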

Khanmigo was first introduced in March, and more than 28,000 US teachers and 11–18-year-old students are piloting the AI assistant this school year, according to Khan Academy. Users include private subscribers as well as more than 30 school districts. Individuals pay US$99 a year to cover the computing costs of the LLMs, and school districts pay $60 a year per student for access. OpenAI does not use Khanmigo data for training.

Khanmigo works differently from ChatGPT. It appears as a pop-up chatbot on the student’s computer screen, and students can discuss the problem they are working on with it. The tool sends each student query to GPT-4, but first adds a prompt instructing the bot not to give away answers and instead to ask lots of questions.
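
Khanmigo's actual integration code is not public; the following minimal sketch only shows how a GPT-4-backed wrapper of this kind is typically built with the OpenAI Python client, with the instruction wording and model choice assumed for illustration.

```python
# Minimal sketch of wrapping a student's query with a Socratic system prompt
# before sending it to GPT-4. Not Khanmigo's actual code; the instruction
# wording and model choice are assumptions.
from openai import OpenAI  # requires the `openai` package and an API key

client = OpenAI()

SOCRATIC_INSTRUCTION = (
    "You are a tutor. Do not give away answers. "
    "Instead, ask the student lots of guiding questions."
)


def tutor_reply(student_query: str) -> str:
    """Prepend the tutoring instruction, then ask the model for a response."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": SOCRATIC_INSTRUCTION},
            {"role": "user", "content": student_query},
        ],
    )
    return response.choices[0].message.content


print(tutor_reply("What is the area of a circle with radius 3?"))
```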

The scores that PyrEval gives help students to reflect on their work. If the artificial intelligence doesn’t detect a theme that a student thought they had included, that could indicate that the idea needs to be explained more clearly, or that the student made small conceptual or grammatical errors. The team is also trying to get other LLMs to do the same task, and is comparing the results.

PyrEval has scored physics essays for more than 2,000 middle-school students a year since 2010, with help from an educational psychologist at the University of Wisconsin–Madison. The essays do not receive traditional grades, but PyrEval allows teachers to quickly check the assignments for key themes and to give feedback during class, something that would otherwise be impossible.

Companies are marketing commercial assistants, such as MagicSchool and Eduaide, that are based on OpenAI’s LLM technology and help schoolteachers to plan lesson activities and assess students’ work. Academics have produced other tools, such as PyrEval4, created by computer scientist Rebecca Passonneau’s team at Pennsylvania State University in State College, to read essays and extract the key ideas.
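
PyrEval's real pipeline is considerably more sophisticated than anything that fits here, but a toy sketch can convey the general idea of flagging expected themes that an essay does not appear to cover. The themes and keywords below are invented for the example and are not taken from PyrEval.

```python
# Toy illustration of checking an essay for expected key themes — not
# PyrEval's actual method; the themes and keywords are invented.

THEMES = {
    "energy conservation": {"energy", "conserved", "transfer"},
    "friction": {"friction", "surface", "resistance"},
}


def missing_themes(essay: str, themes: dict[str, set[str]]) -> list[str]:
    """Return themes whose keywords never appear in the essay text."""
    words = set(essay.lower().split())
    return [name for name, keywords in themes.items() if not words & keywords]


essay = "The sled slows down because friction with the snow resists its motion."
print(missing_themes(essay, THEMES))  # -> ['energy conservation']
```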

“Are there positive uses?” asks Collin Lynch, a computer scientist at North Carolina State University in Raleigh who specializes in educational systems. “Absolutely. Is there any risk? There are huge risks and concerns. I think there are ways to mitigate those.”

Privacy can also be a hurdle for students, as they discover that everything they type into these tools may be used to train the models.

If students and teachers can use LLMs to work through large stretches of text, time could be freed up for discussion and learning. ChatGPT’s ability to lucidly discuss nearly any topic raises the prospect of using LLMs to create a personalized, conversational educational experience. They might cost less than a human tutor, they are always available, and some teachers see them as thought partners.

Some teachers fear that such tools will make it simpler for students to cheat on assignments. Yet Beghetto, who is based in Tempe, Arizona, and others are exploring the potential of large language models (LLMs), such as ChatGPT, as tools to enhance education.

Last month, Ronald Beghetto asked a group of graduate students and teaching professionals to discuss their work in an unconventional way. As well as talking to each other, they conversed with a collection of creativity-focused chatbots that Beghetto had designed and that will soon be hosted on a platform run by his institution, Arizona State University.