Version numbers for RPMs and SRPMs aren’t sortable by normal string comparisons. Take the following example:
- 3.9
- 3.10 (read: three point ten)
- 3.11 (read: three point eleven)
The above versions are sorted from oldest to newest. However, when sorting according to string sorting rules, the order is determined to be:
- 3.10
- 3.11
- 3.9
This behavior affects both sorting RPMs and querying for RPMs relative to a specific
version (e.g. “RPMs newer than version 3.9”). It applies to both the version and release
attributes on an RPM.
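The problem can be reproduced in a few lines of plain Python. This is only an illustrative sketch: the `numeric_key` helper is a hypothetical stand-in for "compare the pieces as integers" and is not part of Pulp.

```python
versions = ["3.9", "3.10", "3.11"]

# Plain string sorting compares character by character, so "3.1..." sorts before "3.9".
string_order = sorted(versions)


def numeric_key(version):
    """Hypothetical helper: compare each dot-separated piece as an integer."""
    return [int(piece) for piece in version.split(".")]


numeric_order = sorted(versions, key=numeric_key)

# A string-based "newer than 3.9" query misses both newer versions.
newer_than_3_9 = [v for v in versions if v > "3.9"]

print(string_order)    # ['3.10', '3.11', '3.9']
print(numeric_order)   # ['3.9', '3.10', '3.11']
print(newer_than_3_9)  # []
```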
To work around this issue, two extra attributes are added to the RPM’s metadata that is stored
in Pulp’s database: version_sort_index and release_sort_index. When sorting or querying against
either an RPM’s version or release, the query should be done against the sort index attributes
instead.
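As a rough illustration of why the sort index matters, the sketch below sorts and filters a few in-memory records in plain Python. It does not use Pulp's query API. The `rpms` list and package names are made up, and the `release_sort_index` values were written by hand following the encoding this document describes.

```python
# Hypothetical records; only the release and release_sort_index fields matter here.
rpms = [
    {"name": "pkg-a", "release": "10", "release_sort_index": "02-10"},
    {"name": "pkg-b", "release": "9", "release_sort_index": "01-9"},
    {"name": "pkg-c", "release": "2", "release_sort_index": "01-2"},
]

# Sorting on the raw release string gives the wrong order: 10, 2, 9.
by_raw = [r["release"] for r in sorted(rpms, key=lambda r: r["release"])]

# Sorting on the sort index gives the intended order: 2, 9, 10.
by_index = [r["release"] for r in sorted(rpms, key=lambda r: r["release_sort_index"])]

# "Releases newer than 9" is a simple string comparison against the encoded value.
newer_than_9 = [r["name"] for r in rpms if r["release_sort_index"] > "01-9"]

print(by_raw)        # ['10', '2', '9']
print(by_index)      # ['2', '9', '10']
print(newer_than_9)  # ['pkg-a']
```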
In order to use simple string sorting in the database, the original values for version and release are encoded for their sort index values. The encoding algorithm is as follows:
- Each version is split apart by periods. We’ll refer to each piece as a segment.
- If a segment consists only of numbers, it’s transformed into the format dd-num, where:
  - dd - number of digits in the value, written with a leading zero if necessary (for example, 01 or 02)
  - num - value of the int being encoded
- If a segment contains one or more letters, it is:
  - Split into multiple subsegments of continuous letters or numbers. For example, 12a3bc becomes 12.a.3.bc
  - Each number-only subsegment is encoded according to the rules above.
  - Each letter subsegment is prefixed with a dollar sign ($).
- Any non-alphanumeric characters are discarded.
3.9 -> 01-3.01-9
3.10 -> 01-3.02-10
5.256 -> 01-5.03-256
1.1a -> 01-1.01-1.$a
1.a+ -> 01-1.$a
12a3bc -> 02-12.$a.01-3.$bc
2xFg33.+f.5 -> 01-2.$xFg.02-33.$f.01-5
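The rules and examples above can be sketched in Python as follows. This is a minimal illustration, not Pulp's actual implementation. The function names `encode_segment` and `encode_sort_index` are hypothetical. The sketch also makes two assumptions the text does not spell out: digits are kept exactly as written, and a segment left empty after discarding characters is dropped.

```python
import re


def encode_segment(segment):
    """Encode one period-separated segment (hypothetical helper)."""
    # Discard any non-alphanumeric characters.
    cleaned = re.sub(r"[^0-9A-Za-z]", "", segment)
    # Split into runs of continuous digits or continuous letters.
    parts = re.findall(r"[0-9]+|[A-Za-z]+", cleaned)
    encoded = []
    for part in parts:
        if part.isdigit():
            # dd-num: two-digit digit count, then the digits as written (assumption).
            encoded.append("%02d-%s" % (len(part), part))
        else:
            # Letter runs are prefixed with a dollar sign.
            encoded.append("$" + part)
    return ".".join(encoded)


def encode_sort_index(value):
    """Encode a version or release string into a string-sortable index."""
    encoded = (encode_segment(segment) for segment in value.split("."))
    # Assumption: segments that end up empty are dropped.
    return ".".join(piece for piece in encoded if piece)


print(encode_sort_index("3.10"))         # 01-3.02-10
print(encode_sort_index("2xFg33.+f.5"))  # 01-2.$xFg.02-33.$f.01-5

# Sorting by the encoded value restores oldest-to-newest order.
print(sorted(["3.10", "3.11", "3.9"], key=encode_sort_index))  # ['3.9', '3.10', '3.11']
```

Because the digit count comes first, a number with more digits always sorts after a shorter one, which is what lets a plain string comparison order 9 before 10.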