Posted Monday, July 17, 2023 1:34 PM - Written by Tony Lez
Are UUIDs and GUIDs truly unique?
main image

The UUID (universally unique identifier) is a widely used naming convention commonly found in databases. Each UUID is generated and is considered to be unique across time, space and the wider universe. But is it really….

UUID Examples

e554af3c-1a43-493a-acc4-4b37a5460337

02f9eb0c-711a-43c4-8326-ad1c4d9ab1ae

It is made up of 32 hexadecimal characters (sometime referred to as the base-16 representation). It consist of sequence of 8 character, 4 characters, 4 characters, 4 characters, 12 character each section separated by a dash (-). Each character can be any digit between “0” through to “9” or any letter from “a” through to “f” thus follows the regex pattern below:

[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}

This mean each character has 16 different unique letter or number that can be used. Therefore each the total number of possible UUID combination is equal to 1632 = 3.4028 x 1038. A lot of combinations.

At this point, anyone with a serviceable understanding of mathematics would say that there is always some chance of a UUID is generated twice, causing a duplicate, and that this chance will become greater and greater as more and more UUIDs are generated. This is true but let’s calculate the probability of there being no duplicates in a set of UUIDs.

The total number of UUID combination is 16^32, N, and total number of UUID that has already been generated, n. This mean the probability, p, of the UUID that is already generated to be generated again, causing a duplicate, is equal to:

p = n/N

If we assume that the probability of duplicate is at 1%, this means that the total amount of UUID already generated is equal to:

0.01 = n/16^32

n = 1.8447 x 10^36

This means that 1.8447 x 10^36 UUIDs can be generated before the probability of a duplicate occurring reaches 1%. If the UUID was generated at a rate of one per second, it would take 5.843 x 10^8 centuries to reach 1% chance of a duplicated occurring.

This means that the UUID, whilst not “truly” unique, it provide adequate pseudo-uniqueness when taking into accounts of the limitation of human life-expectancy, database life-expectancy and separation of systems out there.