Training an Arabic LLM that reflects local values

Training an Arabic LLM that reflects local values

Training an Arabic LLM that reflects local values
The Arab world did not play a key role in the PC, internet and mobile eras. In the AI era, it will be different. (Shutterstock)
Short Url

Advances in the large language models that underpin generative AI are changing everything, from medicine and education to entertainment.

Our relationship with technology is becoming more intimate as machines change from passive tools into active assistants that amplify our innate human abilities.

This new era poses both a challenge and an opportunity for the Middle East.

The challenge is that leaders in this new field, like OpenAI’s ChatGPT and Google’s Gemini, come from Silicon Valley, or from China, where my team at 01.AI has built models that rival the Americans. In Europe, too, startups such as France’s Mistral have entered the race.

The opportunity is for the Middle East to join this league and make sure its voice is heard.

Inspired by my latest trip to Riyadh, I decided to test how the current crop of AI models would handle a simple request. I imagined myself as a young Saudi getting ready to host a dinner party and asked ChatGPT to prepare a menu.

The food it recommended sounded delicious — stuffed grape leaves, tabouleh salad, mandi and stuffed dates. But the beverages were a problem.

Aside from drinks such as mint lemonade and jallab, a mixture of dates, grape molasses and rose water, ChatGPT also offered this: “For alcoholic beverages, you could offer a selection of international wines, beers, or non-alcoholic mocktails.”

To its credit, when I repeated the question, it offered only non-alcoholic drinks.

If a model recommends breaking both the law and cultural norms, imagine how it might answer other more sensitive questions about politics or religion? Indeed, researchers have even shown that some models have exhibited an anti-Muslim bias.

My modest test underlines the urgent need to develop an Arabic large language model that reflects local values.

The first step to building this is creating enough high-quality Arabic digitized data to properly train a new generation of models.

Although there are 400 million Arabic speakers, only an estimated 2 percent of online content is in Arabic. Meta’s open source LLM model Llama is overwhelmingly trained on English data, with Arabic comprising less than 0.1 percent of the data.

The lack of data naturally skews the results. To fix this dearth of data, either a visionary entrepreneur or a government-backed organization should collect, digitize and convert the many Arabic books into training data for Arabic models.

Once the data is gathered, it can be fed into the breakthrough pre-training process, which reads trillions of words and creates its own virtual concept space or model of the world. This concept space has been shown to be mostly in English and Chinese.

Adding a sizable number of texts in Arabic, which has enormous cultural output and significance, will make the concept space more knowledgeable about Arabic and more balanced in its concepts and views.

After such pre-training, the model needs to be fine-tuned by data and labels from the Arab world, which will align with the values of the region. Those are different from American models, which are aligned to US values, and Chinese models, which reflect Chinese values.

The collection of alignment data, the coordination of human labeling and the alignment process will need to be done in-region by AI experts.

A new Arabic-enhanced large language model could encourage entrepreneurs and developers to build new applications tailored to the needs of their nations.

Kai-fu Lee

Finally, safety modules will need to be added to ensure legal compliance and to avoid harm. These will also need to be developed locally.

The above steps will create localized, sovereign models that will reflect the traditions of the Middle East. Privately developed or government-backed, it could be the foundation for a new wave of Arabic AI innovation.

A new Arabic-enhanced large language model could encourage entrepreneurs and developers to build new applications tailored to the needs of their nations.

Imagine an AI tool that could find, summarize, organize and write insightful content, an AI teacher that makes learning fun and customized, an AI doctor that is more knowledgeable than any human, an AI engineer that can write software and applications, and an AI assistant that knows its owner better than the owner themselves.

The Arab world did not play a leading role in the PC, internet and mobile eras. In the AI era, it will be different.

This transformation is by no means an easy feat. It will require an unprecedented investment of money, energy and human capital.

Middle Eastern leaders like Saudi Crown Prince Mohammed bin Salman and others have shown that they have the vision, determination and resources to lead their countries into the future.

Standing on my hotel balcony in Jeddah recently, overlooking the King Abdullah University of Science and Technology, I saw part of that vision coming to fruition.

Universities such as KAUST and the Mohamed bin Zayed University of Artificial Intelligence in the UAE are striking examples of the resources that have already been poured into this transformation.

These world-class academic institutions can attract and retain the best top tier global talent.  It is especially important to bring in the world’s best computer engineers to help fulfill this vision of the future AI.

Our team at 01.AI has shown what a group of talented and motivated computer scientists can achieve in just one year. With the right commitment of resources and drawing upon the best talent, countries like Saudi Arabia can easily catch up with their global peers.

The Middle East can also lead the world in the use of renewables to run power-hungry generative AI models.

As it seeks to diversify its economy, Saudi Arabia is actively promoting the use of alternative energy sources such as solar, which could power server farms and reduce their carbon footprint — a growing concern as AI becomes more widespread.

It may take time for countries to figure out their strategy for building a sovereign AI. But it is critical for the Arab world to quickly catalyze the creation of culturally appropriate LLMs and build a rich ecosystem to allow AI-powered Arabic apps to blossom.

A recent encounter with a female sales assistant at a computer store in Riyadh served as an apt reminder of what is at stake. Dressed in jeans and sporting a tattoo, she was a reminder of the transformative changes that the country is undergoing.

Where are you from, I asked. “I’m Saudi,” she said. “One day I want to be Saudi Arabia’s Elon Musk.” I hope on my next visit she will pitch me a homegrown AI app.

Kai-Fu Lee is a computer scientist, CEO of 01.AI, chairman of Sinovation Ventures, former president of Google China, and author of “AI 2041” and “AI Superpowers”
 

Disclaimer: Views expressed by writers in this section are their own and do not necessarily reflect Arab News' point of view

One killed in blast at Moscow residential building, TASS reports

One killed in blast at Moscow residential building, TASS reports
Updated 56 sec ago
Follow

One killed in blast at Moscow residential building, TASS reports

One killed in blast at Moscow residential building, TASS reports
MOSCOW: One person was killed and four people injured in a blast at a residential building in northwest Moscow, Russian state news agency TASS reported on Monday, citing emergency services
Baza, a Telegram channel with contacts in Russia’s security services, published video showing major damage to what it said was the Alye Parusa residential complex, where the blast took place.
It was not immediately clear what had caused the blast.
In December, Ukraine took credit for the killing of Russian General Igor Kirillov in a bomb blast outside a Moscow apartment building.
There was no immediate comment from Ukraine.

Pakistan police officer killed as polio vaccination drive starts — police

Pakistan police officer killed as polio vaccination drive starts — police
Updated 20 min 15 sec ago
Follow

Pakistan police officer killed as polio vaccination drive starts — police

Pakistan police officer killed as polio vaccination drive starts — police
  • Two motorcycle riders opened fire on police officer in northwestern Jamrud town, say police
  • Pakistan launched this year’s first nationwide immunization effort today after 73 cases in 2024 

PESHAWAR, Pakistan: A Pakistan police officer traveling to guard polio vaccinators was shot dead Monday, police said, on the first day of a nationwide immunization effort after a year of rising cases.

The officer was traveling to guard polio vaccinators in the area of Jamrud town in northwestern Khyber Pakhtunkhwa province when he was killed, local police official Zarmat Khan told AFP.

“Two motorcycle riders opened fire on him,” he said. “The constable died instantly at the scene.”

Pakistan and neighboring Afghanistan are the only countries where polio is endemic and militants have for decades targeted vaccination teams and their security escorts.

Pakistan recorded at least 73 polio infections last year compared to six in 2023. The vaccination campaign which started on Monday is the first of the year and is due to last a week.

“Despite the incident, the polio vaccination drive in the area remains ongoing,” Khan said.

Abdul Hameed Afridi, another senior police official in the area, also confirmed details of the attack and said officers have “launched an investigation.”

No group immediately claimed responsibility for the attack, however Khyber Pakhtunkhwa — which neighbors Afghanistan — is a hive of militant activity.

The Pakistani Taliban are the most active group in the area.

Polio can easily be prevented by the oral administration of a few drops of vaccine, but scores of vaccination workers and their escorts have been killed over the years.

In the past, clerics falsely claimed that the vaccine contained pork or alcohol, declaring it forbidden for Muslims.

In more recent years the attacks have focused on vulnerable police escorts accompanying the vaccinators as they go door-to-door.

Last year, dozens of Pakistani policemen who accompany medical teams on campaigns went on strike after a string of militant attacks targeting them.

Pakistan has witnessed rising militant attacks since the Taliban returned to power in neighboring Afghanistan.

More than 1,600 people were killed in attacks in 2024 — the deadliest year in almost a decade — according to the Center for Research and Security Studies, an Islamabad-based analysis group.

Islamabad accuses Kabul’s new rulers of failing to rout militants organizing on Afghan soil, a charge the Taliban government routinely denies.

Pakistan’s Prime Minister Shehbaz Sharif said Sunday last year’s polio eradication efforts faced “a major setback.”

“We must eradicate polio from Pakistan at any cost,” he said as he launched the new vaccination drive.


Lawyers in Pakistan’s capital strike to protest ‘unfair’ transfer of judges

Lawyers in Pakistan’s capital strike to protest ‘unfair’ transfer of judges
Updated 26 min 28 sec ago
Follow

Lawyers in Pakistan’s capital strike to protest ‘unfair’ transfer of judges

Lawyers in Pakistan’s capital strike to protest ‘unfair’ transfer of judges
  • Lawyers in Pakistan’s capital strike to protest ‘unfair’ transfer of judges
  • Lawyers reject president’s move, say transfer “unfair” and affects seniority of other Islamabad High Court judges 

ISLAMABAD: Lawyers associated with the capital’s bar associations and councils have gone on strike today, Monday, to protest against the president’s move to transfer three judges from different high courts to the Islamabad High Court (IHC). 

President Asif Ali Zardari on Saturday approved the transfers of Justice Sardar Muhammad Sarfraz Dogar from the Lahore High Court, the Sindh High Court’s Justice Khadim Hussain Soomro and Balochistan High Court’s Justice Muhammad Asif to the IHC, invoking anger from the capital city’s district and high court bar associations. 

Zardari went ahead with the transfers despite opposition from five of 10 IHC judges. In a letter addressed to the chief justices of the Supreme Court and high courts on Friday, the judges said the decision was unfair to the existing senior judges of the IHC.

They also cited speculation and news reports that the government wanted to appoint Justice Dogar as the IHC’s chief justice, saying that it would be a “fraud on the constitution.”

“This decision is unfair to the judges serving in the IHC as it clearly affects their seniority,” Asad Manzoor Butt, president of the Lahore High Court Bar Association, told Arab News. 

Butt supported the Islamabad High Court Bar Association’s call for notification of the judges’ transfer to be canceled. 

“We will travel to Islamabad to consult with the Bar leadership and extend our support to them,” he said. 

Pakistan’s constitution empowers the president to transfer a judge from one high court to another after the concerned judge consents to the decision. The president can approve the transfer after consulting the chief justice of Pakistan and the chief justice of both high courts.

The Islamabad Bar Council (IBC) said on Sunday that the capital city’s lawyer bodies will pursue all legal and constitutional avenues to challenge the move and safeguard the “judicial independence of Islamabad.”

It said an All-Pakistan Lawyers’ Convention will be held under the IBC on Monday to formulate a future strategy to challenge the transfers. 


Israeli prime minister in Washington for Gaza ceasefire talks

Israeli prime minister in Washington for Gaza ceasefire talks
Updated 46 min 22 sec ago
Follow

Israeli prime minister in Washington for Gaza ceasefire talks

Israeli prime minister in Washington for Gaza ceasefire talks
  • Netanyahu told reporters he would discuss "victory over Hamas"
  • Trump said Sunday that negotiations with Israel and other countries in the Middle East were "progressing"

JERUSALEM: Israeli Prime Minister Benjamin Netanyahu is expected to begin talks Monday on brokering a second phase of the ceasefire with Hamas, his office said, as he visits the new Trump administration in Washington.
Ahead of his departure, Netanyahu told reporters he would discuss "victory over Hamas", contending with Iran and freeing all hostages when he meets with President Donald Trump on Tuesday.
It will be Trump's first meeting with a foreign leader since returning to the White House in January, a prioritisation Netanyahu called "telling".
"I think it's a testimony to the strength of the Israeli-American alliance," he said before boarding his flight.
He was welcomed to the US capital on Sunday night by Israel's ambassador to the UN Danny Danon, who stressed the coming Trump-Netanyahu meeting would strengthen "the deep alliance between Israel and the United States and will enhance our cooperation".
Trump, who has claimed credit for sealing the ceasefire deal after 15 months of war, said Sunday that negotiations with Israel and other countries in the Middle East were "progressing".
"Bibi (Benjamin) Netanyahu's coming on Tuesday, and I think we have some very big meetings scheduled," Trump said.
Netanyahu's office said he would begin discussions with Trump's Middle East envoy Steve Witkoff on Monday over terms for the second phase of the truce.
Indirect negotiations between Israel and Hamas are meanwhile due to resume this week.
The initial, 42-day phase of the deal is due to end next month.
The next stage is expected to cover the release of the remaining captives and to include discussions on a more permanent end to the war.
Trump has said that 15 months of fighting has reduced the Palestinian territory to a "demolition site" and has repeatedly touted a plan to "clean out" the Gaza Strip, calling for Palestinians to move to neighbouring countries such as Egypt or Jordan.
Qatar, which jointly mediated the ceasefire along with the US and Egypt, underscored on Sunday the importance of allowing Palestinians to "return to their homes and land".
"We emphasised the importance of concerted efforts to intensify the entry of humanitarian aid and rehabilitate the Strip to make it livable and to stabilise the Palestinian people in their land," Qatari Prime Minister Mohammed bin Abdulrahman bin Jassim Al Thani said following a meeting with Turkey's foreign minister.

Under the ceasefire's first phase, Hamas was to free 33 hostages in staggered releases in exchange for around 1,900 Palestinians held in Israeli jails.
The truce has also led to a surge of food, fuel, medical and other aid into rubble-strewn Gaza, while displaced Palestinians have been allowed to begin returning to the north.
During their October 7, 2023 attack, Hamas militants took 251 hostages, 91 of whom remain in Gaza, including 34 the Israeli military has confirmed are dead.
The attack resulted in the deaths of 1,210 people, mostly civilians, according to an AFP tally based on official Israeli figures.
Israel's retaliatory response has killed at least 47,283 people in Gaza, a majority civilians, according to the Hamas-run territory's health ministry, figures which the UN considers reliable.
While Trump's predecessor Joe Biden sustained Washington's military and diplomatic backing of Israel, it also distanced itself from the mounting death toll and aid restrictions.
Trump moved quickly to reset relations.
In one of his first acts back in office, he lifted sanctions on Israeli settlers accused of violence against Palestinians and reportedly approved a shipment of 2,000-pound bombs that the Biden administration had blocked.
The ceasefire discussions in Washington are expected to also cover concessions Netanyahu must accept to revive normalisation efforts with Saudi Arabia.
Riyadh froze discussions early in the Gaza war and hardened its stance, insisting on a resolution to the Palestinian issue before making any deal.
Trump believes "that he must stabilise the region first and create an anti-Iran coalition with his strategic partners," including Israel and Saudi Arabia, said David Khalfa, a researcher at the Jean Jaures Foundation in Paris.
But Netanyahu faces intense pressure from within his cabinet to resume the war, with Finance Minister Bezalel Smotrich threatening to quit and strip the prime minister of his majority.

On the ground, Israel said Sunday it has killed at least 50 militants and detained more than 100 "wanted individuals" during an operation in the West Bank.
The massive offensive began on January 21 with the Israeli military saying it aimed to root out Palestinian armed groups from the Jenin area, which has long been a hotbed of militancy.
On Sunday, Palestinian official news agency WAFA said Israeli forces "simultaneously detonated about 20 buildings" in the eastern part of Jenin refugee camp, adding that the "explosions were heard throughout Jenin city and parts of the neighbouring towns".
The Palestinian health ministry meanwhile said the Israeli military killed a 73-year-old man and a 27-year-old in separate incidents in the West Bank on Sunday.
Violence has surged across the West Bank since the Gaza war broke out in October 2023.
Israeli troops or settlers have killed at least 883 Palestinians in the West Bank since the start of the war, according to the Palestinian health ministry.
At least 30 Israelis have been killed in Palestinian attacks or during Israeli military raids in the territory over the same period, according to Israeli official figures.


Cynthia Erivo kicks off Grammys in Ashi Studio look as Beyonce wins top award

Cynthia Erivo kicks off Grammys in Ashi Studio look as Beyonce wins top award
Updated 58 min 38 sec ago
Follow

Cynthia Erivo kicks off Grammys in Ashi Studio look as Beyonce wins top award

Cynthia Erivo kicks off Grammys in Ashi Studio look as Beyonce wins top award

DUBAI/ LOS ANGELES: The 2025 Grammys in Los Angeles saw “Wicked” star Cynthia Erivo kick off proceedings in a gown by Saudi couturier Mohammed Ashi.

Accompanied by Herbie Hancock on piano, Erivo sang Frank Sinatra’s “Fly Me to the Moon” while wearing a sculpted gown from the Paris-based designer’s Fall/ Winter 2024 collection.

Cynthia Erivo showed off a gown by Ashi Studio at the Grammys. (AFP)

She complemented her Ashi Studio dress with Messika jewelry and Christian Louboutin heels.

Erivo’s look hailed from Ashi Studio’s Fall/Winter 2024-25 collection, titled “Sculpted Clouds.”

At the ceremony on Sunday night, Beyoncé won album of the year for “Cowboy Carter,” delivering her — at last — the show’s elusive top award.

The superstar, who is both the most awarded and nominated artist in Grammys history, has been up for the category four times before.

In winning album of the year with “Cowboy Carter,” Beyoncé became the first Black woman to win the top prize in the 21st century. The last was Lauryn Hill with “The Miseducation of Lauryn Hill” 26 years ago. Before her was Natalie Cole and Whitney Houston. That means Beyoncé is only the fourth Black woman to win album of the year at the Grammys.

Beyonce accepts the Album of the Year award with Blue Ivy Carter onstage. (AFP)

Members of the Los Angeles Fire Department presented Beyonce with the trophy Sunday, one of several times the show reflected the recent wildfires that burned thousands of homes.

“It’s been many, many years,” Beyoncé said in her speech. “I want to dedicate this to Ms. Martell,” she said, referencing Linda Martell, the performer who became the first Black woman to play the Grand Ole Opry, a music venue in Nashville, Tennessee.

“We finally saw it happen, everyone,” host Trevor Noah said, nodding to the long overdue achievement for one of music's transcendent artists.

Kendrick Lamar won song and record of the year for his diss track “Not Like Us” at the 2025 Grammys, taking home two of the night's most prestigious awards.

Kendrick Lamar, winner of the Record of the Year, Best Rap Performance, Best Rap Song, Best Music Video, and Song of the Year Awards for "Not Like Us" poses in the press room during the 67th Annual Grammy Awards. (AFP)

“We're gonna dedicate this one to the city,” Lamar said before shouting out Los Angeles area neighborhoods.

It is the second hip-hop single to ever win in the category. The first was Childish Gambino’s “This Is America."