License Plate Recognition Across the GCC: Arabic Numerals, Emirate Codes, Reality

Eastern Arabic numerals, emirate prefix conventions, KSA plate formats. Why off-the-shelf ANPR fails here without retraining.

Dr. Raphael Nagel

October 4, 2025

Automatic number plate recognition trained on European or North American data does not work in the Gulf, and the gap is not a tuning problem but a script, layout, and physics problem that begins at the pixel level.

The argument runs against a marketing claim that recurs in every tender from Riyadh to Manama: that a recognized international ANPR engine, certified somewhere in Western Europe and benchmarked against German, Polish, or French plates, can be dropped onto a site in the Gulf and deliver the read rates promised in the data sheet. It cannot. The plates themselves are different, the optical conditions are different, the regulatory expectations are different, and the consequences of a misread, at a gate to a refinery, a residential compound, or a free-zone logistics yard, are operational rather than statistical. What follows is a description, drawn from field practice, of why off-the-shelf ANPR fails in the GCC and what retraining actually involves.

The script problem nobody wants to name

GCC plates carry two parallel inscriptions. One uses Latin letters and Western Arabic digits, the zero-through-nine forms that any commercial OCR engine has seen millions of times. The other uses Arabic letters and Eastern Arabic numerals (٠ through ٩), which look different and which an engine trained on European data has rarely or never encountered. The zero in Eastern Arabic (٠) is a dot, not a circle. The five (٥) resembles a Latin zero. The six (٦) looks like a Latin seven. The seven (٧) looks like a Latin V. Confusion is not a corner case; it is the default behavior of an untrained classifier.
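The hard part is classifying the glyph; once that is done, converting it to its Latin-digit value is mechanical. A minimal sketch using only the Python standard library. The function name `to_western_digits` and the `LOOKALIKES` table are illustrative, not taken from any ANPR product:

```python
import unicodedata

# Eastern Arabic digits occupy U+0660..U+0669 in Unicode.
EASTERN_TO_WESTERN = {0x0660 + i: str(i) for i in range(10)}


def to_western_digits(text: str) -> str:
    """Replace Eastern Arabic digits with ASCII digits; leave everything else alone."""
    return text.translate(EASTERN_TO_WESTERN)


# Shapes a Latin-only classifier tends to confuse, as described in the text.
# Key: Eastern Arabic glyph. Value: the Latin shape it resembles.
LOOKALIKES = {"\u0660": ".", "\u0665": "0", "\u0666": "7", "\u0667": "V"}

for glyph, latin_shape in LOOKALIKES.items():
    true_value = unicodedata.digit(glyph)  # numeric value defined by Unicode
    print(f"{glyph} means {true_value} but resembles Latin {latin_shape!r}")
```

A Latin-only engine that reports "0" for ٥ is wrong by five, and nothing in its output says so.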

Most commercial ANPR engines treat the Eastern Arabic inscription as noise. They focus on the Latin row, extract what they can, and report a result. On many GCC plates this is acceptable, because the Latin row is present and legible. But it fails in three common situations. The first is plates issued in earlier generations or in specific emirates, where the Latin row is smaller, partially obscured, or absent on certain plate categories. The second is commercial vehicles and trailers whose secondary plate carries only Eastern Arabic. The third is partially damaged plates, where the Latin row is the one that has weathered or been struck and the Eastern Arabic remains the only legible source.

A system that cannot read both scripts is not an ANPR system in this region. It is a partial ANPR system, and the operator will discover the partiality in the worst situations: during an incident, during a forensic search, during a dispute with an insurer. The solution is not a filter or a post-processing step. It is a model trained from the ground up on both scripts, with a balanced dataset that includes Eastern Arabic numerals in every font variant used by the GCC plate-issuing authorities, which means the seven emirates of the UAE plus the five other member states. There is no shortcut. The book BOSWAU + KNAUER. From Building to Security Technology makes the point in a different context, that special-purpose intelligence is what carries operational systems, not general intelligence borrowed from another market. ANPR in the Gulf is a textbook case.

Emirate codes, KSA categories, and the layout problem

Beyond script, GCC plates differ structurally from European plates in ways that matter for detection and classification. The United Arab Emirates uses a system where each of the seven emirates issues its own plates, with its own color scheme, its own code prefixes, and its own category logic. Abu Dhabi plates carry a category number, sometimes a single digit, sometimes a letter, followed by a numeric block. Dubai uses an alphabetic code letter, A through Z, with its own internal ordering. Sharjah, Ajman, Ras Al Khaimah, Umm Al Quwain, and Fujairah each have their own conventions, and these conventions have changed over time, so a fleet operating across the Emirates contains plates from multiple generations.

Saudi Arabia operates a different system again. KSA plates carry three letters and up to four digits, in both Arabic and Latin, with the two script rows stacked one above the other rather than placed side by side on most formats. The letter set used on Saudi plates is restricted to specific Arabic and Latin characters chosen for visual distinctness, which is a useful constraint for a classifier but only if the classifier knows which characters are valid. A model trained on European plates has no such prior and will report any visually plausible letter, producing false positives that look credible to an operator but reference a plate that cannot exist in the registry.
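A restricted letter set can be applied as a hard prior on recognizer output. The sketch below is a hedged illustration. The 17-letter Latin set is the one commonly cited for current Saudi plates and should be verified against the issuing authority's specification. The candidate format of `(letters, digits, score)` tuples is a hypothetical recognizer interface.

```python
# ASSUMPTION: Latin letters commonly cited as valid on current Saudi plates.
# Verify against the issuing authority's specification before relying on it.
KSA_LATIN_LETTERS = frozenset("ABDEGHJKLNRSTUVXZ")


def is_plausible_ksa_read(letters: str, digits: str) -> bool:
    """True if the read fits 'three valid letters, one to four digits'."""
    return (
        len(letters) == 3
        and all(ch in KSA_LATIN_LETTERS for ch in letters)
        and 1 <= len(digits) <= 4
        and digits.isascii()
        and digits.isdigit()
    )


def best_valid_candidate(candidates):
    """Return the highest-scoring candidate the format allows, or None.

    `candidates` is a list of (letters, digits, score) tuples from a
    hypothetical recognizer that emits several hypotheses per plate.
    """
    valid = [c for c in candidates if is_plausible_ksa_read(c[0], c[1])]
    return max(valid, key=lambda c: c[2], default=None)


# "O" is not in the assumed set, so the top-scoring hypothesis is rejected
# in favour of the next one that could exist in the registry.
print(best_valid_candidate([("OBD", "12", 0.90), ("DBD", "12", 0.80)]))
```

Returning None instead of the best invalid guess is deliberate: a rejected read can be routed to manual review, whereas a credible-looking impossible plate cannot be caught downstream.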

Qatar, Bahrain, Kuwait, and Oman each add further variations. Qatar plates separate the numeric block by category. Bahrain uses both Arabic and Latin in a horizontal arrangement with distinctive coloring for diplomatic, commercial, and private vehicles. Kuwait plates carry a country code and look superficially similar to Saudi formats but follow different rules. Oman uses a regional color scheme.

The layout problem is that a localization model, the component that finds the plate in the image before reading it, has to handle these geometric variations without collapsing them into a single average. An engine trained on the rectangular, single-row European plate will find GCC plates inconsistently, miss the secondary script row, and crop in ways that lose information. Retraining the localizer is as important as retraining the character recognizer, and most off-the-shelf solutions skip the first step entirely, presenting impressive recognition figures on the plates they happened to find while silently dropping the ones they could not.
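The difference between a quoted recognition figure and the rate an operator actually experiences is simple arithmetic. A small sketch with invented numbers for illustration; the function and key names are hypothetical:

```python
def read_rates(presented: int, localized: int, correct: int) -> dict:
    """Split performance into localization, recognition, and end-to-end rates.

    presented: vehicles that passed the camera (from ground truth)
    localized: plates the localizer actually found
    correct:   found plates that were transcribed correctly
    """
    if presented <= 0 or not 0 <= correct <= localized <= presented:
        raise ValueError("need presented > 0 and 0 <= correct <= localized <= presented")
    return {
        "localization": localized / presented,
        "recognition_on_found": correct / localized if localized else 0.0,
        "end_to_end": correct / presented,
    }


# Illustrative figures only: 1000 vehicles, 800 plates found, 776 read correctly.
rates = read_rates(presented=1000, localized=800, correct=776)
for name, value in rates.items():
    print(f"{name}: {value:.1%}")
```

With these figures the data sheet can truthfully say 97 percent recognition while the gate sees 77.6 percent. Only the end-to-end rate, with vehicles rather than found plates in the denominator, describes the site.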

Heat, dust, glare, and the optical reality

The Gulf environment punishes optical systems in ways that European specifications do not anticipate. Surface temperatures on plates parked in direct sun routinely exceed sixty degrees Celsius, which affects both the plate material, which can warp or fade, and any imaging electronics positioned to read it. Dust deposits on cameras within hours of cleaning, and the fine, light-colored sand common in the Gulf scatters infrared illumination in ways that wash out images during dawn and dusk transitions. Heat haze rises from asphalt and creates shimmer that degrades sharpness at distances above twenty meters, which is precisely the range at which most gate ANPR cameras are positioned.

Glare is the single most underestimated factor. Plates in the Gulf are often photographed against pale backgrounds, white compound walls, sand-colored pavement, light vehicle bodies, and the contrast between the plate characters and the surroundings is lower than in European conditions where darker pavement and varied backgrounds provide natural contrast. Polarizing filters help but reduce light gathering, which then forces the camera to longer exposures, which then increases motion blur on moving vehicles. The engineering trade-offs are different from those in temperate climates, and a camera-and-software combination optimized for one environment will not perform in the other without rework.
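The filter, exposure, and blur chain can be put into rough numbers. This is a back-of-envelope sketch, not a camera model. It assumes motion parallel to the image plane, and every figure in the example (speed, shutter time, pixel density, stops lost to the filter) is an illustrative assumption to be replaced with measured values.

```python
def motion_blur_px(speed_kmh: float, exposure_s: float, pixels_per_meter: float) -> float:
    """Approximate blur length in pixels for motion parallel to the image plane."""
    speed_m_s = speed_kmh * 1000 / 3600
    return speed_m_s * exposure_s * pixels_per_meter


def exposure_with_filter(exposure_s: float, stops_lost: float) -> float:
    """Exposure time needed to collect the same light after losing `stops_lost` stops."""
    return exposure_s * 2 ** stops_lost


# ASSUMED example values: 20 km/h approach, 1/1000 s shutter,
# 400 pixels per meter at the plate, 1.5 stops lost to the polarizer.
base_exposure = 1 / 1000
filtered_exposure = exposure_with_filter(base_exposure, stops_lost=1.5)

print(f"blur without filter: {motion_blur_px(20, base_exposure, 400):.1f} px")
print(f"blur with filter:    {motion_blur_px(20, filtered_exposure, 400):.1f} px")
```

Each stop lost doubles the exposure and therefore the blur at a given speed. Once the blur length approaches the stroke width of a character, the classifier sees a different glyph.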

Night operation introduces its own complications. The infrared illumination used at most ANPR installations reflects differently off the retro-reflective material used in GCC plates, which itself varies by emirate and by generation. Some plates produce a strong, even return that supports clean recognition. Others produce hot spots that saturate the sensor and obscure characters. A model trained on consistent European retro-reflective material will misinterpret these returns, treating saturation as character data or treating gaps as character boundaries. Calibration must be done per camera, per plate generation, per environmental condition. The labor involved is substantial and is rarely budgeted in initial procurement.
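One simple guard against hot spots is to measure how much of a plate crop is saturated before passing it to the recognizer. A minimal sketch on an 8-bit grayscale crop held as nested lists. The threshold of 250 and the 15 percent limit are assumed starting points that would need tuning per camera and per plate generation.

```python
def saturated_fraction(gray_rows, threshold: int = 250) -> float:
    """Share of pixels at or above `threshold` in an 8-bit grayscale crop."""
    total = sum(len(row) for row in gray_rows)
    if total == 0:
        raise ValueError("empty crop")
    hot = sum(1 for row in gray_rows for px in row if px >= threshold)
    return hot / total


def is_overexposed(gray_rows, max_fraction: float = 0.15) -> bool:
    """Flag crops where saturation is likely to hide character strokes."""
    return saturated_fraction(gray_rows) > max_fraction


# Tiny synthetic crop: two of six pixels are saturated.
crop = [[255, 255, 10], [10, 10, 10]]
print(saturated_fraction(crop), is_overexposed(crop))
```

A flagged crop is better retried at lower illuminator power or shorter exposure than read as is, since a saturated region carries no character information.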

CISA guidance on operational technology resilience, and the IEC 62443 framework for industrial automation security, both emphasize that detection systems must be characterized in the environment in which they operate, not in the environment in which they were certified. ANPR is a detection system. The principle applies.

Retraining: what it actually requires

Retraining an ANPR engine for the GCC is not a tuning exercise. It is a multi-stage program that begins with data collection and ends with continuous improvement. The first stage is the assembly of a training corpus, a body of plate images representative of the plate populations the system will encounter. This corpus must include all seven UAE emirate plate variants, all KSA plate categories, all Qatar, Bahrain, Kuwait, and Oman formats, plus the various special categories: diplomatic, commercial, taxi, government, military where visible, temporary, and trailer. Each format must be represented in sufficient quantity that the model learns the layout rather than memorizing instances. The minimum corpus for a usable model is in the tens of thousands of labeled plates. A robust corpus runs into the hundreds of thousands.
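Whether a corpus covers each format in sufficient quantity can be checked mechanically before any training run. A sketch, assuming each labeled image carries an `(authority, category)` pair. The class names and the minimum count in the example are placeholders.

```python
from collections import Counter


def coverage_gaps(samples, expected_classes, min_per_class: int) -> dict:
    """Return {class: count} for every expected class below the minimum.

    `samples` is an iterable of (authority, category) pairs, one per labeled
    image. Classes that never appear are reported with a count of 0.
    """
    counts = Counter(samples)
    return {cls: counts[cls] for cls in expected_classes if counts[cls] < min_per_class}


# Placeholder data: the expected list would come from the deployment's plate population.
samples = [("dubai", "private")] * 3 + [("ksa", "private")] * 1
expected = [("dubai", "private"), ("ksa", "private"), ("oman", "commercial")]
print(coverage_gaps(samples, expected, min_per_class=2))
```

Driving the check from the expected list rather than from the data is the point: a format with zero images is invisible in a histogram of what was collected.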

The labels themselves are the second challenge. Each plate must be annotated with the correct transcription in both scripts, with the bounding box coordinates of the plate within the image, with the script row positions, and with metadata describing the issuing authority and category. This labeling work cannot be done by annotators who do not read Arabic. It cannot be done by annotators unfamiliar with regional plate variations. The labor pool capable of producing this quality of data is small, regionally concentrated, and expensive. Shortcuts here, machine-generated labels, transliterated annotations, partial coverage, propagate into the model and surface as errors at exactly the moment when errors are most costly.
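The annotation requirements above translate naturally into a record with built-in checks. The schema below is a sketch. The field names are assumptions rather than an established labeling standard, and the checks cover only structural consistency, not whether the transcription is correct.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # x_min, y_min, x_max, y_max in pixels


def _inside(inner: Box, outer: Box) -> bool:
    """True if `inner` is a non-empty box fully contained in `outer`."""
    return (outer[0] <= inner[0] < inner[2] <= outer[2]
            and outer[1] <= inner[1] < inner[3] <= outer[3])


@dataclass(frozen=True)
class PlateAnnotation:
    image_id: str
    plate_box: Box
    latin_text: Optional[str]      # None when the plate has no Latin row
    arabic_text: Optional[str]     # None when the plate has no Arabic row
    latin_row_box: Optional[Box]
    arabic_row_box: Optional[Box]
    issuing_authority: str
    category: str

    def problems(self) -> List[str]:
        """List structural defects; an empty list means the label is usable."""
        found = []
        x0, y0, x1, y1 = self.plate_box
        if not (x0 < x1 and y0 < y1):
            found.append("plate_box is empty or inverted")
        if self.latin_text is None and self.arabic_text is None:
            found.append("no transcription in either script")
        rows = (("latin", self.latin_text, self.latin_row_box),
                ("arabic", self.arabic_text, self.arabic_row_box))
        for name, text, box in rows:
            if (text is None) != (box is None):
                found.append(f"{name}: text and row box must both be present or both absent")
            elif box is not None and not _inside(box, self.plate_box):
                found.append(f"{name}: row box lies outside plate_box")
        return found
```

Keeping the two transcriptions as separate fields, rather than storing one and deriving the other by transliteration, preserves what the annotator actually saw on the plate.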

The third stage is the model architecture itself. A standard convolutional recognition network can be adapted, but the two-script requirement typically benefits from a dual-head design, where the localizer identifies plate regions and the recognizer runs two parallel character classifiers, one for Latin, one for Eastern Arabic, with a consistency check between them. NIST CSF 2.0 frames this kind of architectural decision under the Identify and Protect functions, recognizing that detection systems carry their own integrity requirements. The cross-script consistency check is the integrity layer for ANPR. When the two scripts disagree on a plate, the system should know it and flag it, not pick one and hope.
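The consistency check itself is small once both heads have produced output. A sketch for the numeric block only, assuming each head returns a digit string or None. Letter correspondence between scripts is specific to each issuing authority and is deliberately left out. The status labels are illustrative.

```python
from typing import Optional, Tuple

# Eastern Arabic digits U+0660..U+0669 mapped to ASCII digits.
EASTERN_TO_WESTERN = {0x0660 + i: str(i) for i in range(10)}


def cross_script_check(latin_digits: Optional[str],
                       arabic_digits: Optional[str]) -> Tuple[str, Optional[str]]:
    """Compare the numeric reads of the two heads.

    Returns (status, value), where status is one of:
      "agree"         both heads read the same number; value is that number
      "single_script" only one head produced a read; value is that read
      "disagree"      the heads conflict; value is None, flag for review
      "no_read"       neither head produced a read
    """
    latin = latin_digits or None
    arabic = arabic_digits.translate(EASTERN_TO_WESTERN) if arabic_digits else None
    if latin is None and arabic is None:
        return ("no_read", None)
    if latin is None or arabic is None:
        return ("single_script", latin or arabic)
    if latin == arabic:
        return ("agree", latin)
    return ("disagree", None)


# The Arabic head read a seven where the Latin head read a four: flag it.
print(cross_script_check("1234", "\u0661\u0662\u0663\u0667"))
```

The "single_script" status matters as much as "disagree": downstream systems should know that a read was never cross-checked, not assume that it was.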

The fourth stage is field validation. A model that performs well on a held-out test set will not necessarily perform well at a specific gate. Camera angle, illumination, distance, vehicle speed, plate position, all of these vary by site. Validation requires a parallel period in which the new system runs alongside the existing system, or alongside manual recording, with all reads compared and disagreements analyzed. This period runs for at least sixty days in our experience, often ninety. Anything shorter misses seasonal effects, weather variations, and the long tail of unusual plates that constitute the operational reality.
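The comparison in a parallel-run period reduces to matching system reads against ground truth per gate event and putting an interval around the result. A sketch, assuming both sources are keyed by a shared event identifier. The Wilson score interval is a standard choice for proportions, picked here for illustration rather than prescribed by the text.

```python
import math


def compare_reads(system_reads: dict, ground_truth: dict):
    """Return (correct_count, disagreements) over all ground-truth events.

    Both arguments map event_id -> plate string. A missing system read
    counts as a disagreement and is recorded as None.
    """
    disagreements = {
        event_id: (system_reads.get(event_id), truth)
        for event_id, truth in ground_truth.items()
        if system_reads.get(event_id) != truth
    }
    return len(ground_truth) - len(disagreements), disagreements


def wilson_interval(correct: int, total: int, z: float = 1.96):
    """Wilson score interval for a proportion (z = 1.96 is roughly 95 percent)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


truth = {"e1": "A1", "e2": "B3", "e3": "C4"}
reads = {"e1": "A1", "e2": "B2"}            # e3 was never read
correct, misses = compare_reads(reads, truth)
print(correct, misses)
```

The disagreement list is the more valuable output. Grouped by plate format, camera, and time of day, it shows where the next round of training data has to come from.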

The fifth stage is continuous retraining. New plate formats are issued. Old plates weather. The vehicle fleet evolves. A model deployed once and left alone degrades. Operational practice in the GCC requires a retraining cycle, quarterly at minimum, with new data collected from the operating system itself and used to refine the next generation. ISO 27001 information security management principles, while written for a different purpose, apply by analogy: detection capability is an asset to be maintained, not a one-time purchase.

Integration, identity, and the regulatory frame

A reading is only useful if it connects to something. ANPR in isolation produces a stream of plate transcriptions that no operator has time to monitor. The value emerges in integration: the plate is compared against an authorized list, against a deny list, against a visitor pre-registration, against a fleet management database, against a paid-parking record, against a regulatory inquiry. Each of these integrations requires that the plate transcription be machine-comparable, which in turn requires a canonical representation that handles both scripts, both layouts, and the variations in spacing and formatting that human operators introduce when they enter plates by hand.

Establishing the canonical form is non-trivial. A plate is not a string. It is a structured record: emirate or country, category, alphabetic component, numeric component, in some defined order. Two transcriptions of the same plate, one with a space between letter and number, one without, one with the Arabic numerals, one with the Latin, must resolve to the same record. The transformation rules are specific to each issuing authority, and they evolve. An integration that hard-codes one format breaks when the next plate generation arrives. ASIS International's guidance on access control and identity verification emphasizes that identifiers must be resolved through controlled processes, not direct string comparison. The same principle applies here.
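Treating the plate as a structured record rather than a string can be sketched as follows. The field names and the authority code are hypothetical. Only digits are converted between scripts, because Arabic-to-Latin letter correspondence is defined per issuing authority and would need its own table.

```python
from dataclasses import dataclass

# Eastern Arabic digits U+0660..U+0669 mapped to ASCII digits.
EASTERN_TO_WESTERN = {0x0660 + i: str(i) for i in range(10)}


@dataclass(frozen=True)
class PlateRecord:
    authority: str   # issuing emirate or country, e.g. a hypothetical code "KSA"
    category: str
    letters: str
    number: str


def canonicalize(raw: str, authority: str, category: str = "") -> PlateRecord:
    """Resolve a hand-typed or machine-read plate string to a comparable record.

    Spaces and hyphens are ignored, case is folded, and Eastern Arabic digits
    are converted. Any other unexpected character raises instead of being
    silently dropped.
    """
    letters, number = [], []
    for ch in raw.translate(EASTERN_TO_WESTERN).upper():
        if ch in " -":
            continue
        if ch.isascii() and ch.isalpha():
            letters.append(ch)
        elif ch.isascii() and ch.isdigit():
            number.append(ch)
        else:
            raise ValueError(f"unhandled character {ch!r}; needs an authority-specific rule")
    return PlateRecord(authority, category, "".join(letters), "".join(number))


# Three spellings of one plate resolve to the same record.
a = canonicalize("abd 1234", "KSA")
b = canonicalize("ABD-1234", "KSA")
c = canonicalize("ABD \u0661\u0662\u0663\u0664", "KSA")
print(a == b == c)
```

Because the record is a frozen dataclass, equal plates compare equal and hash equal, so allow lists and deny lists can be plain sets of records rather than lists of strings.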

The regulatory frame matters as well. Several GCC jurisdictions have introduced data protection regulations that touch on the collection, storage, and processing of vehicle plate data. The UAE's federal data protection law, Saudi Arabia's PDPL, and Bahrain's PDPL all impose requirements on how plate data, considered personal data when linked to an identifiable individual, may be used. Compliance is not optional, and it interacts with the technical architecture. Retention periods, access controls, and audit logging must all be designed into the system. NIST SP 800-53 controls, while developed for federal information systems, provide a usable reference catalog for the technical safeguards expected in serious deployments.
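Designing retention into the system means the retention period is a parameter the code enforces, not a policy document. A minimal sketch, assuming each read is a dict with a timezone-aware `captured_at` timestamp. The 30-day figure in the example is a placeholder, not a statement of what any of the laws above require.

```python
from datetime import datetime, timedelta, timezone


def split_by_retention(reads, now: datetime, retention_days: int):
    """Partition reads into (keep, expired) by age at time `now`.

    The caller deletes or anonymizes `expired` and writes the action
    to the audit log; this function only decides which is which.
    """
    cutoff = now - timedelta(days=retention_days)
    keep = [r for r in reads if r["captured_at"] >= cutoff]
    expired = [r for r in reads if r["captured_at"] < cutoff]
    return keep, expired


now = datetime(2025, 10, 4, tzinfo=timezone.utc)
reads = [
    {"plate": "X", "captured_at": now - timedelta(days=10)},
    {"plate": "Y", "captured_at": now - timedelta(days=40)},
]
keep, expired = split_by_retention(reads, now, retention_days=30)  # placeholder period
print([r["plate"] for r in keep], [r["plate"] for r in expired])
```

Passing `now` in explicitly, rather than reading the clock inside the function, makes the rule testable and makes each purge run reproducible in an audit.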

The German GDV and the European insurance frame are sometimes invoked in regional tenders, by international bidders importing their domestic compliance posture. This is rarely the right reference. The applicable frame in the GCC is local, supplemented by international standards where the locals adopt them. A vendor who cannot speak to PDPL, who cannot show how their retention controls map to local requirements, who cannot demonstrate that their plate data does not leave the jurisdiction unless explicitly authorized, is a vendor who has not done the homework.

Who supplies tuned systems, and what to look for

The market for genuinely tuned GCC ANPR is narrower than the marketing material suggests. Several international vendors offer regional packages, but the quality of the underlying training data varies. The questions an operator should ask, before committing to a system, are concrete. How many GCC-specific plate images are in the training corpus, by emirate and country? How is the Eastern Arabic numeral classifier validated, and against what test set? What is the read rate, in this specific deployment environment, measured against ground truth over a defined period? What is the cross-script consistency check, and what happens when the two scripts disagree? How is retraining handled, on what cadence, and at what cost?

Vendors who answer these questions clearly, with figures rather than adjectives, are a small subset. Vendors who cannot answer them, or who answer with brochures rather than data, are signaling that they have a European product with a regional brochure overlay. The BSI and BSI-aligned certification regimes in Germany are useful baselines for general engineering quality, but they do not certify GCC plate recognition. A vendor citing BSI as proof of regional fitness is conflating two different things.

The operator's protection against this conflation is the audit. A three-to-five-day audit of an existing ANPR installation, with a controlled comparison of system reads against manual ground truth, reveals the actual performance within a week. The numbers that emerge are usually well below the marketing claims, sometimes by twenty percentage points or more on Eastern Arabic recognition, sometimes by ten on overall read rate. These gaps are not failures of the technology in principle. They are failures of the specific implementation in the specific environment, and they can be closed, but only after an honest assessment.

What holds

ANPR in the GCC is a regional discipline, not a global product with a regional skin. Script, layout, optical conditions, regulatory frame, and integration architecture all differ from the environments in which most commercial systems were developed. The gap can be closed, but it requires training data measured in the hundreds of thousands of regionally sourced plate images, labelers who read Arabic, model architectures that handle dual scripts with cross-validation, and field validation periods measured in months rather than days. The retraining cycle continues for the life of the system.

Operators who treat ANPR as a procurement item discover the gap during an incident, when a plate is misread, when a forensic search returns nothing, when an insurer disputes a claim because the system's record does not match the registry. Operators who treat ANPR as an engineering commitment, with audit, validation, and retraining built into the operating model, get the read rates the technology can actually deliver. The difference is not the equipment. The difference is the discipline.

For operators who suspect their current ANPR is performing below specification, Path II, the three-to-five-day audit, produces a controlled comparison of system reads against ground truth at the specific deployment site, with a clear figure for actual performance and a list of the specific gaps. For operators considering a new deployment, Path III, the ninety-day pilot, runs a tuned system against the actual plate population at the actual gates, with the data needed to size the rollout. Path I, the sixty-minute confidential conversation, is the entry point for both.

Frequently asked questions

How are GCC plates structured?

GCC plates vary by country and, in the UAE, by emirate. UAE plates are issued by each of the seven emirates with distinct color schemes, category codes, and layout conventions. Saudi plates carry three letters and up to four digits in both Arabic and Latin, typically with the two script rows stacked one above the other. Qatar, Bahrain, Kuwait, and Oman each have their own formats. Most GCC plates include both Eastern Arabic and Latin inscriptions, though the prominence of each script varies, and on some older plates, plate categories, and secondary plates the Latin row is reduced or absent. Plate generations within a single jurisdiction may differ as registries are modernized, so fleets often contain multiple format generations simultaneously.

Can Western ANPR read Arabic numerals?

Most commercial ANPR engines developed for European or North American markets cannot reliably read Eastern Arabic numerals. The character forms differ from Latin digits, with the Eastern Arabic zero rendered as a dot, the five resembling a Latin zero, and the six resembling a Latin seven. Engines that have not been trained on Eastern Arabic data either ignore the script entirely, focusing on the Latin row, or misclassify Eastern Arabic characters as visually similar Latin ones. Reliable dual-script recognition requires explicit training on Eastern Arabic character data, not post-processing of a Latin-only model.

How is retraining done?

Retraining involves assembling a corpus of regionally sourced plate images, typically hundreds of thousands, labeled by annotators who read Arabic and understand local plate conventions. The model architecture is adapted to handle dual scripts, with parallel classifiers for Latin and Eastern Arabic and a cross-script consistency check. Field validation runs for sixty to ninety days against ground truth. Continuous retraining cycles, quarterly at minimum, incorporate new plate formats and operational data. The process is engineering rather than configuration, and shortcuts in data quality, labeling, or validation surface as errors at the worst moments.

Who supplies tuned systems?

A narrow subset of vendors offer genuinely tuned GCC ANPR, distinguishable by their willingness to provide concrete figures on regional training data volume, Eastern Arabic recognition rates, cross-script consistency handling, and retraining cadence. Vendors who cannot answer these questions with numbers are typically offering European or international products with regional marketing overlays. The operator's protection is independent audit of actual deployment performance against ground truth, which reveals real read rates within a controlled comparison period and identifies the specific gaps that retraining or replacement must close.

About the author

Dr. Raphael Nagel (LL.M.) is founding partner of Tactical Management. He acquires and restructures industrial businesses in demanding market environments and writes on capital, geopolitics, and technological transformation. raphaelnagel.com