summaryrefslogtreecommitdiff
blob: 84cdddab7b31f2ff918cd24514cbb6ee3995ee86 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
{{GLEP
|Number=73
|Title=Automated enforcing of REQUIRED_USE constraints
|Type=Standards Track
|Status=Draft
|Author=Michał Górny <mgorny@gentoo.org>
}}

==Abstract==

This GLEP proposes using automated solving to satisfy REQUIRED_USE constraints. It lists the problems with the current handling of REQUIRED_USE, and explains how auto-solving would solve them. It specifies the algorithms that can be used to verify the constraints, automatically solve them and check whether they can be solved. It provides the rationale for all the design decisions, and considers the compatibility with the PMS, and between the constraints and the package managers before and after the GLEP is used.

==Motivation==

===The issues with REQUIRED_USE===

[https://projects.gentoo.org/pms/6/pms.html#x1-910008.2.7 REQUIRED_USE] has been introduced in EAPI 4 as a solution to the problem of enforcing specific relations between USE flags. According to the [https://dev.gentoo.org/~zmedico/portage/doc/ch06s03s05.html#package-ebuild-eapi-4-metadata-required-use Portage documentation on REQUIRED_USE], it has been specifically targeted as a more data-oriented and machine-friendly alternative to verifying the validity of USE flag choice in ebuild phases.

At the moment of writing, REQUIRED_USE is used in around 25% of the ebuilds in Gentoo. It is an obligatory part of some eclasses, e.g. in the Python ecosystem. Its uses include improving clarity of user choices, simplifying ebuilds via copying upstream feature dependencies and enforcing valid data for USE dependencies. Nevertheless, a number of developers raise strong arguments against using REQUIRED_USE.

The commonly noted disadvantages of REQUIRED_USE are:
# Unsatisfied REQUIRED_USE constraints unnecessarily (and sometimes frequently) require explicit user action, even if there is no real gain from the user explicitly selecting. For example, if a package supports building either against Qt4 or Qt5, and user has enabled the flags for both, the package manager would request him to disable one of the flags for the package. For most of the cases, just using the newer version would be more friendly.
# Satisfying REQUIRED_USE usually requires altering flags via permanent configuration. Those alterations can become obsolete over time and without proper maintenance can put system into a suboptimal configuration. For example, if a Python package requires enabling a non-default Python target, then the leftover flag can keep forcing an obsolete Python version when the package gains support for the default target.
# The machine-oriented form of REQUIRED_USE constraints can result in confusing and unreadable output to the user, especially for complex constructs and cross-dependent constraints. Bad output can result in the user being unable to solve the problem at all, or to solve it in a satisfactory way (i.e. without disabling the features he needs). It can also cause frustration if satisfying REQUIRED_USE requires more than one attempt.

Those arguments have resulted in a number of developers avoiding REQUIRED_USE. For example, [https://wiki.gentoo.org/wiki/Project:Qt/Policies#Handling_different_versions_of_Qt the Qt team policies] discourage using it unless absolutely necessary. The attempts of avoiding REQUIRED_USE sometimes result in suboptimal descriptions of USE flags or even inconsistent use of them.

===The providers problem===

A very specific case of a problem where REQUIRED_USE has some use is the ''providers'' problem. That is, whenever a package has a feature that can be supplied by more than one library of choice, and the user needs to choose between the providers. The exact form of this problem depends on the number of providers and whether the feature is optional.

The commonly used solutions include:

* Using one or more binary flags to toggle between the providers (with number of the flags < number of providers). This is most readable with only two providers, e.g. with ''USE=libressl'' meaning ''use LibreSSL instead of OpenSSL'', and ''USE=-libressl'' meaning ''use OpenSSL''. For packages with optional SSL/TLS feature, there is also an additional ''USE=ssl'' to toggle that feature, and with ''USE=-ssl'', the ''libressl'' flag is meaningless (ignored). This is usually the least intrusive method but it's unreadable and causes the flags to be confusing.
* Using unary flags for providers along with REQUIRED_USE. In this case, each provider gets an explicit flag and REQUIRED_USE is used to force selecting exactly one of them. For optional feature, there is either an additional feature flag or it is disabled when all providers are disabled. This is usually the most readable solution although it frequently requires adjusting flags.
* Using unary flags without REQUIRED_USE. In this case, if user selects more than one provider (or does not select any), the package decides which one is preferred and uses that. For optional feature, again there could be either an additional feature flag or it could be disabled by disabling all the providers. This is less intrusive than the previous solution but it's less readable (it is unclear which provider is actually used) and unsuitable for USE dependencies.

As noted, all of the mentioned solutions have their specific disadvantages. This causes different developers to use different solutions for specific problems. Sometimes, it could go as bad as to have more than one solution applied to a single problem, or different concepts used inconsistently by different developers.

===Automatic solving as the solution===

This GLEP focuses on the idea of establishing automated solving of REQUIRED_USE as a solution to the aforementioned issues. In this context, REQUIRED_USE is extended not only to specify what combinations of USE flags are valid but also how to proceed from a disallowed flag set to one that satisfies the constraints.

This clearly resolves the first two issues with REQUIRED_USE. Since REQUIRED_USE mismatches are solved automatically, there is no explicit user interaction required. No changes are done in configuration files — since the solving is meant to be deterministic, the package manager can recalculate the effective USE flag set using the input USE flag set and the REQUIRED_USE constraint.

The third disadvantage is partially solved. Since there is no necessity for the user to perform any action, there is also no necessity of explaining the constraints to the user. However, for practical uses it may be deemed appropriate to explain to the user why a particular flag has been enabled or disabled.

Solving the most common problems with REQUIRED_USE makes it possible to extend its use cases to the areas where developers so far rejected to use it, or did not even think of using it. This includes working towards a single solution to the providers problem. Given that REQUIRED_USE no longer requires altering the configuration to select between multiple allowed providers, we can reasonably work towards using the middle solution consistently — that is, having clear unary flags for every provider, and using REQUIRED_USE to automatically transform inconclusive input into a single implementation.

Furthermore, the non-intrusive version of REQUIRED_USE could be used extensively to conditionally mask meaningless flags and map equivalent flag sets into a single common set of choice. This can further improve readability (by making flags clearly indicate what it used, e.g. by disabling all SSL/TLS provider flags when SSL/TLS is disabled) and improve compatibility between binary packages (by reducing the number of incompatible USE flag sets).

==Specification==

===Restrictions on REQUIRED_USE format===

The REQUIRED_USE format is defined by [https://projects.gentoo.org/pms/6/pms.html the PMS]. This specification requires the following additional restrictions being enforced:

* An any-of group (||), at-most-one-of (??) and an exactly-one-of (^^) group can contain only flat USE flag items. In particular, no other group can be nested inside it.
* All-of groups are forbidden inside REQUIRED_USE (they have no use now).
* An any-of group (||), at-most-one-of (??) and an exactly-one-of (^^) group must contain at least one subitem (can not be empty).

As a result, unlimited nesting is allowed only for use-conditional groups. All other constructs are kept flat. This serves the following goals:

* avoiding surprising results of automatic flag adjustments,
* improving readability of REQUIRED_USE constraints,
* keeping the specification and implementation relatively simple.

===The algorithm for satisfying REQUIRED_USE constraints===

====Processing algorithm====

The existing package managers have to validate REQUIRED_USE constraints while evaluating the dependency graph. The current validation action is replaced by the following algorithm:

# Check whether the REQUIRED_USE constraint is satisfied by the USE flags enabled by the current user configuration. If it is, accept the package (the algorithm stops).
# Check whether the REQUIRED_USE constraint matches restrictions set in [[#Restrictions on REQUIRED_USE format]]. If it does not, report a REQUIRED_USE mismatch and abort.
# Find all any-of (||), at-most-one-of (??) and exactly-one-of (^^) groups inside REQUIRED_USE and reorder (sort) them according to the algorithm defined below.
# Attempt to solve the REQUIRED_USE constraint using the algorithm defined below. If the attempt succeeds, accept the package with the set of USE flags determined by the solver.
# If the attempt at solving failed, report a REQUIRED_USE mismatch and abort.

====REQUIRED_USE verification algorithm====

The verification algorithm is implied by the meanings of REQUIRED_USE constructs as defined by the PMS. It is repeated here for completeness and for reuse in further algorithms.

The REQUIRED_USE constraint is considered satisfied if ''all'' the top-level items evaluate to true. An item evaluates to true if, depending on the item type:

* A '''USE flag name''' that is not prefixed by an exclamation mark evaluates to true if the named flag is enabled. Accordingly, a USE flag name that is prefixed by an exclamation mark evaluates to true if the named flag is disabled.
* For a '''USE-conditional group''' the condition needs to be tested first (according to the same rule). If the condition evaluates to true, the USE-conditional group is true only if all items in it evaluate to true. If the condition evaluates to false, the USE-conditional group always evaluates to true and the items inside it need not to be tested.
* An '''any-of group''' (||) evaluates to true if at least one of the items in it evaluates to true.
* An '''exactly-one-of group''' (^^) evaluates to true if exactly one of the items in it evaluates to true, and all the remaining items evaluate to false.
* An '''at-most-one-of group''' (??) evaluates to true if at most one of the items in it evaluates to true.

====Constraint group reordering algorithm====

The constraint solving algorithm is built on ''prefer leftmost'' assumption for all any-of, exactly-one-of and at-most-one-of groups. That is, if the constraint is not satisfied by the current set of enabled USE flags, the algorithm prefers enforcing the leftmost constraints and disabling rightmost.

Due to different system profiles, it might be impossible to automatically solve the constraint using the leftmost flag specified by ebuild (e.g. when it is masked). In order to account for this, the specification provides a group reordering (sorting) phase before the solving algorithm.

The reordering applies to any-of, exactly-one-of and at-most-one-of groups. Per the format restriction, each group can only contain flat USE flags.

For each of the items in the group, if the item names a forced/masked USE flag:

* if the item evaluates to true according to the flag's value, it is moved to the leftmost position in the group,
* if the item evaluates to false according to the flag's value, it is moved to the rightmost position in the group,

Relative positions of multiple forced/masked flags are of no relevance since those flags are not altered.

This reordering ensures that if a flag is forced, it is always preferred over other choices; and if it is masked, it is never preferred. This makes it possible to easily account for all possible cases without having to provide a detailed algorithm to handle various possible results.

====REQUIRED_USE solving algorithm====

If the REQUIRED_USE constraint is not satisfied according to the initial set of USE flags implied by the configuration, the package manager attempts to alter the USE flags according to REQUIRED_USE.

Before solving, a set of '''immutable flags''' is determined based on forced and masked USE flags. If a flag is either forced or masked, it is marked immutable and the algorithm can not alter its value. If a particular rule would cause the flag to be altered, the solving is aborted and an error is reported.

The solving algorithm is applied at least once, and the REQUIRED_USE is rechecked after each application. The package manager may support running multiple iterations of the algorithm, in which case it needs to either limit the allowed number of iterations or abort after obtaining one of the values previously given by the algorithm (hitting an infinite loop).

In order to enforce REQUIRED_USE, each top-level item in REQUIRED_USE that did not evaluate to true needs to be enforced. All items are enforced in order, left to right. Depending on the item type, enforcing implies:

* For a '''USE flag name''' that is not prefixed by an exclamation mark, the named flag is enabled. If it is prefixed by an exclamation mark, the named flag is disabled.
* For a '''USE-conditional group''', the condition (LHS) is evaluated first. If the condition evaluates to true, all the items inside the group are enforced, in order. If it evaluates to false, the group is skipped.
* For an '''any-of group''' that did evaluate to false, the first (left-most) item in the group is enforced.
* For an '''at-most-one-of group''' that did evaluate to false, the first (left-most) item that evaluates to true needs to be determined first. Afterwards, all items following it are negatively-enforced (forced to evaluate to false).
* An '''exactly-one-of group''' is equivalent to a conjunction of an at-most-one-of group and an any-of group. That is, if all items evaluate to false, the rule for any-of is applied. If more than one item evaluates to true, the rule for at-most-one-of is applied.

The negative enforcing action can be applied to plain '''USE flag names''' only. If the name is not prefixed by an exclamation mark, then the flag is disabled. If the name is prefixed by an exclamation mark, it is enabled appropriately.

===QA checks to verify REQUIRED_USE solutions===

====Context to QA checks====

All of the QA checks are performed in context of a specific set of forced and masked USE flags, called ''immutable flags''. All of the checks need to be repeated for every set. Since they can alter the preferences inside any-of, at-most-one-of and exactly-one-of groups, it may also be necessary to perform a separate transformation for each set.

The complete set of immutable flag combinations can be obtained using the following algorithm:
* let '''U''' be the set of all USE flags (both explicit IUSE and implicit) that are used in REQUIRED_USE,
* for every enabled profile:
** let '''I1''' be the effective {{Path|use.force}}, {{Path|use.mask}}, {{Path|package.use.force}}, {{Path|package.use.mask}} values that apply to the package and affect flags in '''U''',
** let '''I2''' be the effective {{Path|use.stable.force}}, {{Path|use.stable.mask}}, {{Path|package.use.stable.force}}, {{Path|package.use.stable.mask}} values that apply to the package and affect flags in '''U''',
** add '''I1''' to the result set,
** if package has any stable keywords, combine '''I1''' and '''I2''', and add the result to the result set.

Afterwards, all checks should be performed for all unique values in the result set.

====Requirements for REQUIRED_USE constraints====

In order to verify the ability to solve REQUIRED_USE reliably, the QA check tools should ensure that the following conditions are met:

# no valid combination of USE flags can result in the constraint requesting the same flag to be simultaneously both enabled and disabled;
# no valid combination of USE flags (that is, not prohibited by immutable flags) can attempt to alter immutable flags;
# no constraint in REQUIRED_USE may alter flags in such a way that any of the constraints preceding it would start to apply and change the resulting flags in a second iteration.

====Concept for transforming REQUIRED_USE into implications====

The algorithms used to verify REQUIRED_USE rely on them being expressed in a ''flat implication form''. In this form, the constraints are expressed as zero or more ''implications''. Each implication specifies zero or more conjunctive ''conditions'', and one or more ''effects''. It is equivalent to a nested USE-conditional group. If all of the ''conditions'' are met, the ''effects'' are applied.

If a constraint is valid, then the solutions of its transformation are the same as of the original.

By idea, the transformation consists of the following steps:

# Reordering all any-of (||), at-most-one-of (??) and exactly-one-of (^^) groups according to the [[#Constraint_group_reordering_algorithm|reordering algorithm]].
# Replacing all any-of (||), at-most-one-of (??) and exactly-one-of (^^) groups according to the following transformations:
#* <code>^^ ( a b c… )</code> → <code>|| ( a b c… ) ?? ( a b c… )</code>,
#* <code>|| ( a b c… )</code> → <code>!b? ( !c? ( !…? ( a )… ) )</code>,
#* <code>?? ( a b c… )</code> → <code>a? ( !b !c… ) b? ( !c… ) c? ( … ) …</code>.
# Creating an ordered directed graph linking all nested conditions to their effects.
# Traversing all the paths from the topmost graph nodes to the deepest, in order.

For example, an ordered graph is provided for the following REQUIRED_USE constraint:

 a b? ( c? ( d !b ) d? ( e ) ) b? ( f )

Nodes and edges are numbered to explain the ordering. Furthermore, the final (effect) nodes are colored red.

[[File:ReqUseExampleGraph.svg|framed|Example graph for REQUIRED_USE]]

Traversing this graph produces the following paths, in order:

# '''a<sub>(1)</sub>'''
# b<sub>(2)</sub> → c<sub>(3)</sub> → '''d<sub>(4)</sub>'''
# b<sub>(2)</sub> → c<sub>(3)</sub> → '''!b<sub>(5)</sub>'''
# b<sub>(2)</sub> → d<sub>(6)</sub> → '''e<sub>(7)</sub>'''
# b<sub>(8)</sub> → '''g<sub>(9)</sub>'''

Those paths are roughly equivalent to the following USE-conditional group constructs:

# <code>a</code>
# <code>b? ( c? ( d ) )</code>
# <code>b? ( c? ( !b ) )</code>
# <code>b? ( d? ( f ) )</code>
# <code>b? ( g )</code>

Except that the value of ''b'' for constraint 4 is considered from the initial value rather than the one possibly altered by constraint 3. Constraint 5 uses a separate condition, and so uses the new value of ''b''.

====Algorithm for transforming REQUIRED_USE into implications====

Steps 2 through 4 of the fore-mentioned transformation can be performed using the following recursive function. It should be applied to every top-level REQUIRED_USE item, in order.

It should be noted that for the purpose of distinguishing separate branches, all the condition objects need to have an unique identity. In Python this occurs naturally via instantiating an object. In other languages an explicit unique identifier may need to be included.

 function '''transform'''(''item'', ''conditions=[]''):
   if ''item'' is a USE flag:
     append (''conditions'', ''item'') to the results
   if ''item'' is a USE-conditional group:
     ''new_conditions'' := ''conditions'' + [''item.condition'']
     for ''subitem'' in ''item.subitems'':
       call '''transform'''(''subitem'', ''new_conditions'')
   if ''item'' is an any-of (||) group:
     ''n'' := len(''item.subitems'') - 1  # (last index)
     ''new_conditions'' := ''conditions''
     for ''f'' in ''item.subitems[1..n-1]'':
       ''new_conditions'' += [!''f'']
     append (''new_conditions'', ''item.subitems[0]'') to the results
   if ''item'' is an at-most-one-of (??) group:
     ''n'' := len(''item.subitems'') - 1  # (last index)
     for ''i'' := 0 .. ''n''-1:
       ''new_conditions'' := ''conditions'' + [''item.subitems[i]'']
       for ''f'' in ''item.subitems[i+1..n]'':
         append (''new_conditions'', ''!f'') to the results
   if ''item'' is an exactly-one-of (^^) group:
     apply the logic for an any-of (||) group
     apply the logic for an at-most-one of (??) group

==== QA check logic ====

The logic for the reference algorithm is split into four split functions:

# Verifying that the constraints do not alter immutable flags,
# Verifying that the conditions for the constraints are not self-conflicting,
# Verifying that no two constraints will attempt to force opposite values for a single flag,
# Verifying that no constraint will meaningfully enable any of the constraints preceding it.

In the following descriptions, ''C'' will indicate zero or more conditions (''c<sub>i</sub>'' being the sub-conditions) of the flat constraint, and ''E'' will indicate the enforcement.

The check for alteration of immutable flags is done for every constraint separately. A flat constraint is determined to alter immutable flags if both of the following conditions occur:

* ''C'' can evaluate to true — that is, none of ''c<sub>i'' refer to an immutable flag whose value is ''¬c<sub>i</sub>'',
* ''E'' references an immutable flag whose immutable state is ''¬E''.

The check for self-conflicting constraints is performed for every constraint separately. A flat constraint is determined to be self-conflicting if the following condition occurs:

* For any pair of sub-conditions ''c<sub>i</sub>'', ''c<sub>j</sub>'' (''i ≠ j''), ''c<sub>i</sub> = ¬c<sub>j</sub>.

The check for attempting to force opposite values for a single flag is performed for every pair of constraints. Since it is symmetric, it is only necessary to perform it for unique pairs. For practical reasons, let's assume it is performed for every pair ''((C<sub>i</sub>, E<sub>i</sub>), (C<sub>j</sub>, E<sub>j</sub>))'', where ''j > i''. The pair is determined to force opposite values for a single flag if all of the following conditions are met:

* ''E<sub>i</sub> = ¬E<sub>j</sub>'',
* ''C<sub>i</sub>'' and ''C<sub>j</sub>'' can simultaneously evaluate to true,
* ''C<sub>i</sub>'' can evaluate to true after applying all the constraints preceding it, with flags ''F = C<sub>i</sub> ∪ C<sub>j</sub>'',
* ''C<sub>j</sub>'' can evaluate to true after applying all the constraints preceding it, with flags ''F = C<sub>i</sub> ∪ C<sub>j</sub>''.

The check for enabling the previous constraints is performed for every pair ''((C<sub>i</sub>, E<sub>i</sub>), (C<sub>j</sub>, E<sub>j</sub>))'', where ''j > i''. The constraint ''(C<sub>j</sub>, E<sub>j</sub>)'' is determined to meaningfully enable the constraint ''(C<sub>i</sub>, E<sub>i</sub>)'' if all of the following conditions are met:

* ''E<sub>j</sub>'' matches any of the conditions in ''C<sub>i</sub>'' (''E<sub>j</sub> = c<sub>i,k</sub>'', for any ''k''),
* ''C<sub>i</sub>'' and ''C<sub>j</sub>'' can simultaneously evaluate to true,
* ''E<sub>i</sub>'' does not always evaluate to true after applying all of the constraints, with flags ''F = C<sub>j</sub>''.

Two flat constraints ''C<sub>i</sub>'' and ''C<sub>j</sub>'' can simultaneously evaluate to true if the following condition is met:

* For every ''c<sub>i,k</sub>'', ''c<sub>j,l</sub>'' (where ''k'' and ''l'' are all possible indexes of the condition of the first and second constraint appropriately), ''c<sub>i,k</sub> ≠ ¬c<sub>j,l</sub>''.

A constraint ''C'' can evaluate to true if and only if all sub-constraints can evaluate to true. A sub-constraint ''c<sub>i</sub>'' can evaluate to true if the current set of flags does not include its negation (for every ''f<sub>j</sub>'', ''f<sub>j</sub> ≠ c<sub>i</sub>'').

A constraint ''C'' always evaluates to true if and only if all sub-constraints always evaluate to true. A sub-constraint ''c<sub>i</sub>'' always evaluates to true if the current set of flags includes the condition (there exists at least one ''f<sub>j</sub>'' that ''f<sub>j</sub> = c<sub>i</sub>'').

In order to determine whether a condition ''C<sub>i</sub>'' can evaluate to true after applying a specific set of constraints, with initial flags ''F<sub>1</sub>'', determine the final set of flags ''F<sub>n</sub>'' and afterwards test if the constraint can evaluate to true with flags ''F<sub>n</sub>''.

In order to determine whether a condition ''C<sub>i</sub>'' always evaluates to true after applying a specific set of constraints, with initial flags ''F<sub>1</sub>'', determine the final set of flags ''F<sub>n</sub>'' and afterwards test if the constraint always evaluates to true with flags ''F<sub>n</sub>''.

In order to determine the final set of flags ''F<sub>n</sub>'', with specific set of constraints ''(C<sub>i</sub>, E<sub>i</sub>)'' and initial flags ''F<sub>1</sub>'':

* For every flat constraint ''(C<sub>i</sub>, E<sub>i</sub>)'' in the set:
** If the condition ''C<sub>i</sub>'' always evaluates to true, update ''F'' with ''E<sub>i</sub>'' (''F<sub>i+1</sub> = F<sub>i</sub> ∪ {E<sub>i</sub>} ∖ {¬E<sub>i</sub>}'').

====Limitations of the algorithm====

The presented check algorithm has a limitation which could result in false positives. However, the testing against all real Gentoo uses of REQUIRED_USE has shown that none of those occur at the moment of writing this GLEP, and that is quite unlikely for them to become a major issue in the future.

The algorithm is unable to infer indirect implications of the constraints. For example, given the following constraint:

 a? ( !b ) !a? ( !b ) b? ( c )

The algorithm is unable to correctly infer that due to the first two constraints, ''b'' will never be true. As a result, it will e.g. report an immutability error on <code>b? ( c )</code> if ''c'' is masked even though this condition could never evaluate to true.

However, it is considered that a natural occurrence of such a constraint is quite unlikely, and usually indicates a problem with the constraint anyway. Therefore, reporting a false positive here could serve as an indication of another problem.

==Rationale==

===Restrictions for allowed REQUIRED_USE syntax===

The specification imposes a number of arbitrary restrictions to REQUIRED_USE syntax, in particular by restricting the possible nesting and disallowing other complex constructs. The main goal is to simplify the algorithms used and make the results more obvious. This is at cost of prohibiting constructs that are rarely used, and usually could be replaced by simpler and more readable constructs.

====Nested any-of, at-most-one-of, exactly-one-of groups====

The first and most important restriction is that nesting of any-of, at-most-one-of and exactly-one-of groups is forbidden. While technically such constructs could work, some of them are not really meaningful and others are really confusing. At the time of writing, nested ||/??/^^ groups were used in exactly two Gentoo packages. The specific uses were:

# app-admin/bacula: <code>|| ( ^^ ( mysql postgres sqlite ) bacula-clientonly )</code>
# dev-games/ogre: <code>?? ( gl3plus ( || ( gles2 gles3 ) ) )</code>

The first use is not very complex, and indicates that either exactly one of the database providers need to be selected, or the ''bacula-clientonly'' flag needs to be used. However, at a first glance a user might be confused that the database ^^ constraint needs to be applied independently of the ''bacula-clientonly'' flag. The same construct can be expressed in a more straightforward way:

 !bacula-clientonly? ( ^^ ( mysql postgres sqlite ) )

The second use is much more confusing. It means that both ''gl3plus'' and either of the ''gles2'' or ''gles3'' flags can not be enabled at the same time. However, ''gles2'' and ''gles3'' can be enabled simultaneously. The same construct can be expressed in a more straightforward way as:

  gl3plus? ( !gles2 !gles3 )

As can be seen, in both cases the alternative constructs were both more readable and shorter than the nested expressions. In the first case, it is also the more natural way of expressing the problem. While replacing expressions that have more than two subexpressions would be harder, there were no uses of such expressions so far, and the potential ambiguity makes them unlikely to appear.

====All-of groups====

The second restriction imposed by this GLEP is disallowing all-of groups. The PMS allows them anywhere but in reality they are only meaningful inside ||, ?? and ^^ groups (elsewhere they do not have any effect, and can be inlined into parent block). Inside those groups, they imply that the item is considered matched only if all items inside the all-of group match.

The meaning of all-of groups inside || is pretty clear. However, inside ?? and ^^ some confusion may occur. In particular, for a general case of:

 ?? ( a ( b c ) )

the constraint only affects the combination of all flags inside the all-of group. In this case, enabling ''a'' prohibits having the combination of both ''b'' and ''c'' enabled. However, either ''b'' or ''c'' can be enabled separately without affecting ''a''. This makes this constraint unlikely to have real use cases, and if it has, they are unlikely to be the most natural way of expressing the problem.

Furthermore, automatic solving of such constraints forces some implicit ambiguity. Since both (multiple) flags have to be enabled together to cause a particular item to match, there are multiple solutions of forcing an item not to match. For the fore-mentioned sample, having ''a'' enabled would require the solver to force ''( b c )'' not to match. To do this, the solver could either disable ''b'', disable ''c'' or disable both flags.

There are arguments for both options — disabling only one flag follows the idea of 'smallest change needed'. Disabling both can be considered more consistent. In either case, there will be developers and user confused by the package manager relying on either behavior.

The all-of groups inside || do not suffer from the same issue since solving them does not require disabling anything. However, they also have seemingly low value and banning all-of groups altogether improves symmetry between the different group types.

Furthermore, the nested all-of groups make transformation into implication graph much more complex. Without them, the conditions are purely conjunctive. If we were to support all-of groups inside ||, ??, ^^ we would have to support disjunctive conditions, and transform them into conjunctive form.

The all-of groups were used in 5 different packages at the time of writing. Two of them were outside ||, ??, ^^, rendering them meaningless and probably accidental. The three remaining cases were:

# sci-chemistry/icm: <code>^^ ( ( !32bit 64bit ) ( 32bit !64bit ) ( 32bit 64bit ) )</code>
# media-sound/snd: <code>^^ ( ( !ruby !s7 ) ( ruby !s7 ) ( !ruby s7 ) )</code>
# app-i18n/ibus: <code>|| ( deprecated ( gtk3 introspection ) ) )</code>

Of those cases, the first two can be replaced by pure, flat || and ?? groups appropriately. It furthermore indicates that all uses of all-of groups inside ^^ in Gentoo were purely mistaken.

The third case is potentially valid. It indicates that either ''deprecated'' or both ''gtk3'' and ''introspection'' flags need to be enabled. However, it does not clearly indicate the preferred course of action. After investigating the ebuild in question, it is most likely that the following constraint would be more correct, and clearer to the user:

 || ( deprecated gtk3 ) gtk3? ( introspection )

That is, if user enables ''gtk3'' and ''gtk3'' requires ''introspection'', then it seems more reasonable to enable ''introspection'' than to ignore the ''gtk3'' flag and force ''deprecated'' module instead.

====USE-conditionals inside ||, ??, ^^ groups====

The last restriction forbids using USE-conditional groups inside any-of, at-most-one-of and exactly-one-of groups. Those indicate that some of the items inside the group are to be considered its members only if the relevant flags are enabled. They are logically equivalent to all-of groups, i.e. <code>|| ( foo? ( bar ) ... )</code> and <code>|| ( ( foo bar ) ... )</code>, except they have a different semantic — the latter form suggests enabling both flags, the former suggests considering ''bar'' only if ''foo'' is already enabled.

Supporting USE-conditional groups properly would most likely require splitting the parent group into multiple variants for different initial values of USE conditionals. Considering the above equality, it would also be inconsistent with the ban on all-of groups. Finally, those groups have little real value.

The only use case in Gentoo was in media-video/mpv:

 opengl? ( || ( aqua egl X raspberry-pi !cli? ( libmpv ) ) )

It indicates that the OpenGL video output requires selecting one of the variants, with the ''libmpv'' variant being allowed only without CLI enabled. While this may be technically valid, it is confusing. Furthermore, other REQUIRED_USE constraints already require that either ''cli''' or ''libmpv'' is enabled, making ''!cli'' imply ''libmpv''. Therefore, the USE-conditional in the constraint is redundant.

====Empty any-of, at-most-one-of, exactly-one-of groups====

As the first mailing list review indicated, the PMS explicitly specifies a special case that empty any-of, at-most-one-of and exactly-one-of groups all evaluate to true.

This behavior has been explained as a historical behavior associated with Portage removing unmatched USE-conditional groups inside any-of dependency groups which could result in the group becoming effectively empty. As REQUIRED_USE was introduced, the rule was effectively extended into the new operators.

It is unclear whether this is the most correct behavior logically though. Alexis Ballier pointed out:

: I mean, in every context I've ever seen, applying a rule to the empty set is the neutral of that rule, so that it preserves associativity.
: That'd mean: <code>|| ( )</code> is false, <code>&& ( )</code> is true, <code>^^ ( )</code> is false, <code>?? ( )</code> is false.

(the thread afterwards develops that the more correct result for <code>?? ( )</code> could be to be true)

Since the original use case does not apply here (USE-conditional groups are banned inside those operators), the correct behavior is unclear and this has no real use case, banning it seems like the best course of action.

===Solving algorithm===

The solving algorithm attempts to enforce REQUIRED_USE in the most natural way, interpreting the constraints as developer suggestions on how to make the constraint apply.

====Application of different types of constraints====

The algorithm aims to solve mismatched constraints in the most natural way, presuming that this interpretation is the most likely to be correct.

For the USE-conditional groups, it assumes that they mean ''if X is true, then Y should also be true''. Appropriately, the algorithm does not alter the flag in the condition (''X''); instead, if the condition is true, it enforces the expression inside the group (''Y'').

For other groups, the algorithm applies the natural interpretation presuming that the items in group are stated in decreasing preference order, with the left-most item being most preferred. That is, if the group evaluates to false, it enforces a solution that either disables all enabled items except for the left-most already enabled, or enables the first item if no item is enabled.

====Reordering of ||, ??, ^^ groups====

The left-most-preferred assumption about the groups results in the solving algorithm relying on the ability to enable the item and disable other items. This is not possible if the relevant flag is masked, or (in cases of ??, ^^) some other flag is forced. If that were the case, the ordering inside those groups would have to be strictly limited by the 'common denominator' between the profiles. This would sometimes result in less preferred options being encouraged, or even impossible to express constraints — e.g. if the preferred implementation would not be stable but the package were stabilized.

To account for this, the groups are transformed to account for forced/masked (immutable) flags. The transformation is done through reordering the items because this keeps the specification as simple as possible. It does not to cover specifically how to interpret immutable flags in different kind of groups, and how to handle the groups afterwards. Instead, reordering results in the forced flags being preferred naturally, and the masked flags being discouraged naturally.

It also naturally handles the case when forced/masked flags result in impossible to satisfy constraints. Those cases do not need to be detected by the reordering algorithm implicitly, and instead just cause solver to fail early.

====Left-to-right constraint application====

The solving algorithm applies all changes necessary to enforce the constraints in order, left to right. Enforcing a specific ordering, combined with the PMS specifying how ebuild and eclass values for REQUIRED_USE are combined, makes the algorithm deterministic. Applying left-to-right is also the most natural way of doing it, making it easy for developers to predict the results.

Originally I had considered making the algorithm work independently of constraint order. However, this would clearly defining what the desired solution is, and finding an algorithm to enforce that. To achieve a deterministic solution, we would most likely have to require developers to provide groups that do not overlap. That is, for example:

 a? ( !b ) b? ( c )

would be unacceptable since with both ''a'' and ''b'' flags enabled, the constraint would either enforce ''c'' or not, depending on the processing order. The developer would have to write:

 a? ( !b ) !a? ( !b? ( c ) )

While this is a possible solution, expressing complex constraints would be very hard. Developers would no longer be able to naturally express the constraints, and instead would have to determine the correct sets of conditions for each requested result.

====Single vs multiple iterations====

This GLEP does not specifically restrict the implementations to doing simple or multiple iterations. Both options have their advantages.

A single iteration can successfully solve all valid REQUIRED_USE constraints, as long as they are properly ordered. An implementation using a single iteration has simpler error handling — it is only necessary to verify whether the REQUIRED_USE actually matches after enforcing it. It is also reasonable to request developers to order their constraints for a single iteration solving.

The advantage of using multiple iterations is that they can also solve wrongly ordered constraints. However, the implementation needs to account for the possibility of invalid (circular) constraints putting the solver in an infinite loop. For this reason, the solver needs to either limit the maximum number of iterations or store previous results and detect when the algorithm gives one of the previous results again.

For most of the real-life use cases, two iterations should be able to solve all the constraints. A large number of iterations is unlikely to be required by naturally written REQUIRED_USE constraints. It could be artificially caused by writing constructs like:

 c? ( d ) b? ( c ) a? ( b )

===QA checks/verification===

====The necessity of verification====

The purpose of REQUIRED_USE constraint verification is to ensure that for all valid combinations of input USE flags, the solver will be able to find a valid solution. This needs to be done explicitly since complex REQUIRED_USE constraints may trigger solving issues with non-obvious USE flag combinations, causing the developers to miss the issue.

Since the solver must be able to deal with non-solvable constraints (by reporting them and letting the user deal with them), verification is not a strict necessity for enforcing REQUIRED_USE. However, it improves the user experience, and so is a worthwhile addition to the QA tools in place.

To provide the best coverage, it is beneficial to integrate the verification into the tools commonly used by developers — repoman and pkgcheck, including the CI runs. For this to be possible, the algorithm must meet two requirements:

* It must be fast enough not to cause significant increase in repoman/pkgcheck run time for the full repository.
* It must not trigger a large number of false positives, and if any are triggered, they should be easy to work around.

====Context to the checks====

As noted in the specification part, all of them checks need to be repeated for all possible sets of the immutable flags. This is necessary since the immutable flags can alter the solutions significantly. In particular:

* They can alter the preferred choices in the any-of, at-most-one-of and exactly-one-of groups,
* They can cause some of the constraints to be unable to be satisfied,
* They can cause some of the USE-conditional groups to be disabled entirely.

To account for that and avoid the case where REQUIRED_USE solving would fail on some of the profiles, the verification should be performed for all combinations of immutable flags found throughout the enabled classes of profiles. Only the flags that apply to the REQUIRED_USE constraint in question need to be considered.

Due to the EAPI 5 [https://projects.gentoo.org/pms/6/pms.html#x1-600005.2.11 stable masking], the immutable flags have to separately be calculated for ~arch and stable keywords. The stable variant does not need to be considered unless the package is actually stable or being stabilized, to avoid unnecessarily cluttering up {{Path|package.use.stable.mask}} and/or {{Path|package.use.stable.force}} for packages that are going to stay in ~arch.

====The requirements for REQUIRED_USE====

The rules imposed for verification aim to cover most of the common cases of unsolvable constraints. In particular:

''no valid combination of USE flags can result in the constraint requesting the same flag to be simultaneously both enabled and disabled''
: If the effective REQUIRED_USE constraint (after collapsing all the groups) contains both ''foo'' and ''!foo'', the verification will never consider the constraint met (since logically ''x ∧ ¬x'' is always false).
''no valid combination of USE flags (that is, not prohibited by immutable flags) can attempt to alter immutable flags''
: This is implied by the immutability of masked/forced flags. An attempt to toggle those flags while solving should be considered a fatal error since {{Path|use.mask}}/{{Path|use.force}}/… always takes precedence over regular configuration and package-level toggles. Therefore, if such flags are enforced by an USE-conditional group, their condition should also be masked or forced appropriately.
''no constraint in REQUIRED_USE may alter flags in such a way that any of the constraints preceding it would start to apply and change the resulting flags in a second iteration.''
: This is required for reliable single-pass solving. While the solving may work correctly with multiple iterations, the constraints can be reliably (and usually easily) fixed via reordering. More importantly, this also catches the constraints that can not be solved due to circular toggling between the constraints.
: The addition condition for the second iteration change has been added to account for the common case of <code>a? ( b ) c? ( a b )</code>. While technically the second clause causes the first to start to apply, the second one already covers that case explicitly, so a second iteration would not change the result.

====Transformation into implication form====

The transformation of REQUIRED_USE into implication form is used to provide a form of the original constraint that is more convenient for analysis.

Firstly, the diverse (convenience) item types are all converted into a combination of implications and plain USE flags. The latter can express all the original constraints exactly, provided that the any reordering necessary is done prior to the transformation. As a result, we gain both simplified set of items that need to be considered, and a clear logical mapping of behavior associated any-of, at-most-one-of and exactly-one-of groups.

All of the transformed forms are built by definition, from the verification and solving algorithm:

* Any-of group constraints are satisfied if at least one of the items match. Therefore, the solving only applies if none of them does, in which case the first item is enforced. Appropriately, the result of transformation is the enforcement of first item conditional to the negation of all other items (the condition for the first item is omitted as redundant — enforcing a flag that is already enabled does not change anything).
* At-most-one-of group constraints are satisfied if no more than one item matches. The solving is applied if more than one item is enabled, in which case all but the first enabled item are forcibly disabled. Since disabling an already disabled flag does not change anything, this can be simplified to disabling all the remaining items if the left-most item is matched. The transformation does exactly that, for each item that can be possibly enabled, left-to-right.
* Exactly-one-of group constraints are satisfied if exactly one item matches. Logically this is equivalent to both having at least one item and no more than one item matching. Therefore, this constraint can be converted into a combination of an any-of group and an at-most-one-of-group, for which the transforms are already defined.

Secondly, having limited the set of item types to just two, of which only one can be nested, the constraint can be easily converted into a graph. The resulting graph provides a clean visualization of the structure of the nested conditions. All nodes but the final (bottom-most) ones represent conditions, while the final nodes represent enforcements.

A plain graph could be used to visualize relations between different conditions and enforcements. However, the specifics of REQUIRED_USE processing, especially left-to-right processing, require that the transform preserves exact structure of the constraints.

Thirdly, having the graph (tree) of conditions, we can easily traverse them. In doing so, we construct paths that precisely express which conditions need to be met for a particular enforcement to apply. Since the constraints are applied in order, we need to traverse the graph in this specific order, and write the paths down in the same order.

In doing the two last steps, it is important that we preserve the identity of the original condition nodes. This is necessary to distinguish between two cases:

# <code>a? ( b c )</code>
# <code>a? ( b ) a? ( c )</code>

Since the solving algorithm is applied recursively to USE-conditional groups, in the first case the outer ''a'' condition is not reevaluated between processing ''b'' and ''c''. In the latter case, the use of separate groups causes reevaluation of the condition.

While in this specific example there is no technical difference between the two forms, it becomes apparent when dealing with the following corner case:

# <code>a? ( !a b )</code>
# <code>a? ( !a ) a? ( b )</code>

In both cases, applying the first sub-item disables ''a''. However, only in the second case will the solver reevaluate the value of ''a'' and omit the second group. A plain flattening would cause the same to incorrectly happen for the first case, rendering the transform not equivalent to the original form.

In order to prevent that from happening, the verification algorithms need to be able to determine that the ''a'' condition is the same node in both resulting flattened expressions, and appropriately account for the fact that it is not affected by the enforcement. In the reference implementation, this is done via preserving the identity of the node, and doing identity-wide node comparison.

====The choice of algorithm====

A few algorithms were considered for the purpose of verification.

The first and the most obvious choice was to attempt to enforce the constraint for all possible combinations of USE flags, and report issues if any of the combinations results in failure. Such an algorithm has three important advantages:
# it is trivial to implement and requires little extra code,
# it is reliable since all combinations of USE flags are tested — if any of them fails, the check would find it,
# it reuses the verification/enforcing function verbatim, so there is no risk of the check diverging from the base algorithm.

However, this method has a single important drawback: it is slow. For each test context, it needs to process 2<sup>n</sup> combinations (n — number of USE flags); the number can grow huge with packages having 30 or more USE flags in REQUIRED_USE (which is especially the case for any-of groups). Furthermore, for each combination the check takes the average of 1 to 3 constraint iterations.

It is possible to attempt to speed up this method a little, e.g. via grouping the flags into separate, independent groups and processing them separately. However, this still doesn't give a significant gain and is not a reliable method of solving the problem. As a result, such an algorithm — while useful for the purposes of testing and reference — is not suitable for integrating with the QA tools.

An alternate algorithm has been considered that processes the restriction left-to-right and builds a decision tree-like structure in order to analyze all the possible outcomes of the REQUIRED_USE constraint. However, the pure version of this algorithm was also rejected because it could not give a significant speed gain — the check still needed to consider 2<sup>n</sup> cases (n — number of USE conditional groups in the transformed constraint). While it certainly could be faster than the previous one, especially that it did not require multiple iterations for each variant, and that the latter variants required less processing, it would still not be fast enough for a broad use.

The effective algorithm selected is somehow a simplified derivation of the above method. However, instead of analyzing the complete decision tree enforced by the REQUIRED_USE constraint, it focuses on analyzing the possible effects of each constraint. The specified algorithm has been split into four logical checks, although in real implementation they could be easily grouped together. Two of the checks are performed on each flattened constraint separately, and the other two are done on unique pairs of flattened constraints. As a result, the effective number of iterations is much lower than in the other cases, as is the complexity of each iteration.

Even with the additional logic needed to prevent some of the false positives the algorithm is still fast enough to serve its purpose. While it is not perfect, it has been tested on all real cases of REQUIRED_USE from Gentoo and verified not to cause any issues.

====Verification: altering immutable flags====
The first of the checks is meant to ensure that under no circumstances the constraint will attempt to toggle flags that are immutable, that is whose values are established through use.mask / use.force files. This concept is not only important for the scope of this GLEP but it also ensures that the constraints could be satisfied at all.

The generic idea is that the following constraint:

 a? ( b )

combined with use.mask on ''b'' will cause an error because if the user enables ''a'', then ''b'' is required but it can not be enabled. Likewise, the following:

 a? ( !b )

with ''b'' use.forced will cause an error since ''b'' can not be disabled.

Those constraints would be acceptable if ''a'' were masked as well, as to prevent the condition from ever being true. This is both the reason for the rule on the condition of flattened constraint, and the correct solution for the issue.

It should be noted that the check is done separately for every flattened constraint, and does not consider the implications of other constraints. That is, given the following example constraint:

 !a? ( !b ) b? ( c )

with both ''a'' and ''c'' masked, the check will still consider the REQUIRED_USE erroneous even though ''b'' could not ever be true. However, this is not realistically considered an issue and can be solved via masking ''b'' as well. It will also improve the clarity of the USE flags and avoid giving a false sense that ''b'' could be enabled.

====Verification: self-conflicting constraints====
This check is not especially important; it was added mostly as a matter of a precondition check to avoid providing unexpected input to the checks following it. It is meant to catch a self-conflicting conditions such as:

 a? ( !a? ( b ) )

As it can clearly be seen here, this condition will never evaluate to true because it would require ''a'' being both enabled and disabled simultaneously.

An occurrence of such a constraint is extremely unlikely. However, it effectively breaks some of the assumptions for the algorithms since it is impossible to provide a valid set of flags that would satisfy the condition. It is therefore explicitly rejected as invalid.

====Verification: forcing opposite values for a flag====
This check is meant to account for the case where a combination of two different constraints would require a flag to be both enabled and disabled at the same time. A generic example is:

 a? ( c )
 b? ( !c )

Here, the first constraint requires ''c'' enabled while the second one requires it disabled. Therefore, if the user enables both ''a'' and ''b'', the constraint can not be satisfied. The only enforcements explicitly allowed here are enabling and disabling ''c'' in order, neither of which is capable of solving the problem.

The first condition listed in the algorithm verifies the most important symptom of the problem — that two flattened constraints require the opposite values of a flag. The remaining conditions are meant to rule out false positives.

The second rule states that both conditions need to be able to simultaneously evaluate to true, or in other words the two conditions can not contain opposite values. For example, this rules out the following case:

 a? ( c )
 !a? ( b? ( !c ) )

where both conditions can never evaluate to true simultaneously because they require the opposite values of ''a''.

The third and fourth rules aim to verify that the condition can realistically occur after considering all the constraints preceding it. This is meant to avoid the following kind of false positives:

 !a? ( !b )
 !a? ( !c )
 b? ( c )

Here, after considering the first two conditions the second and third constraints can occur simultaneously because ''!a'' and ''b'' do not conflict with each other. However, after considering the constraints preceding it is clear that they can't since ''!a'' will implicitly disable ''b'', rendering the third condition false.

It should be noted that this also works for corner cases like:

 c? ( a )
 a? ( b )
 d? ( !a )
 !a? ( !b )

because even though the algorithm would incorrectly presume that the second and the fourth conditions can not occur simultaneously, it will detect a conflict between the first and the third conditions.

====Verification: enabling a condition preceding the constraint====
This check verifies that a constraint will not meaningfully cause a constraint preceding it to start to apply. This effectively means the constraints that will require more than one iteration of the algorithm to enforce them.

A generic example is:

 b? ( c )
 a? ( b )

In this case, having only ''a'' enabled will result in ''b'' being enabled in the first iteration, and ''c'' in the second.

The first condition verifies the most important symptom of the problem — that is, that the effect of the later constraint matches the condition of an earlier constraint. The remaining conditions rule false positives out.

Once again, the second condition checks whether the two conditions can occur simultaneously, that is not conflict one with another. A generic example of a false positive ruled out by this is:

 !a? ( b? ( c ) )
 a? ( b )

in which case although the second constraint enforces ''b'' that is one of the conditions for the first constraint, both conditions can not occur simultaneously since ''a'' would have to be enabled and disabled at the same time.

The third rule checks whether the conditions of the later constraint do not enforce the same effect as the earlier constraint anyway. That is, they account for a relatively common pattern of:

 b? ( c )
 a? ( b )
 a? ( c )

Even though the second constraint causes the first one to start to apply, having ''a'' enabled will also cause the third constraint to apply. Since the third constraint has the same effect as the first one, applying the first one will have no effect (the constraint will be satisfied already) and the second iteration will not be required.

====Helper algorithms====
The specification also provides helper algorithms to determine the two cases: when a condition can evaluate to true, and when it always evaluates to true. In general, the algorithms are concerned only by ''strong'' enforcements, that is those that are guaranteed to happen.

Therefore, it is assumed that a condition ''can'' evaluate to true unless there is at least one sub-condition that can not evaluate to true. That is, having a condition of the form:

 a? ( b? ( c? ( ... ) ) )

it is assumed that it can evaluate to true unless we explicitly have ''!a'', ''!b'' and/or ''!c'' in the currently enforced flag state. Otherwise, we assume that the flag can have any value and so the condition could be made true with appropriate flag values.

Appropriately, a condition ''always'' evaluates to true only if we know that all sub-conditions will evaluate to true. In the fore-mentioned example this would mean that the current flags would have to explicitly list ''a'', ''b'' and ''c''. Otherwise, we assume that one of the flags can have an opposite value and therefore make the condition evaluate to false.

While determining the effective flags, we are only concerned with the flattened constraints whose conditions always evaluate to true (with the value of flags current to the processed constraint). This is in order to avoid enforcing any flags that may not be enforced in a real use case. Considering the above, it means that the flags that would be enforced by such constraints are left in ''undefined'' state, and do not restrict the constraints following.

As noted in the limitation section, this logic suffers from the limitation that it can not infer complex implications of the constraints such as:

 !a? ( b ) a? ( b )

If the value of ''a'' is undefined at the time of processing, the algorithm will presume that neither of the conditions is guaranteed to be true, and therefore ''b'' will be left in undefined state. However, this is considered an unlikely corner case and is not a major concern.

==Backwards compatibility==

===Compliance with the PMS===

This GLEP does not break the PMS compliance in any way. The syntax used by the constraints is a subset of the [https://projects.gentoo.org/pms/6/pms.html#x1-780008.2 REQUIRED_USE syntax allowed by the PMS]. The semantic extends the one defined in the PMS in non-conflicting way.

The PMS does not require a very specific behavior for REQUIRED_USE. The [https://projects.gentoo.org/pms/6/pms.html#x1-910008.2.7 USE state constraints section] requires that the package manager does not use (build/install) package versions where REQUIRED_USE constraints are not met.

However, it does not require the package manager to verbosely report the conflict which the package managers actually do. That considered, it should not cause any non-compliance if this verbose reporting is (partially) replaced by automatic solving. If the solving succeeds, the constraints are met and the package manager can proceed with building/installing the package. If it does not, the existing behavior of reporting the issue is preserved.

===New constraints vs non-compliant package managers===

This GLEP preserves full syntax compatibility with the existing package managers. The constraints written for auto-solving will still work correctly in the package managers not supporting it, resulting in regular REQUIRED_USE mismatch. Furthermore, the extended semantic meaning can result in improved readability of constraints, and therefore the messages issued by the package managers. Users aware of the auto-solving rules will have a suggested algorithm for satisfying REQUIRED_USE.

The only potential danger is that the auto-solving will result in more extensive use of REQUIRED_USE and less concern for whether they are satisfied by default, resulting in more frequent REQUIRED_USE mismatches. Avoiding this problem should be done on policy level, requiring the developers not to rely purely on auto-solving through a migration period.

===Old constraints vs auto-solving===

Most of the existing REQUIRED_USE constraints are already compatible with auto-solving. There are three problematic cases:

# Constraints that are disallowed per [[#Restrictions_on_REQUIRED_USE_format|the restrictions on REQUIRED_USE format]],
# Constraints that can not be solved by the algorithm,
# Constraints that give sub-optimal (non-preferred) solutions.

While the impact and details differ for each case, it can be commonly noted that all of them can be reliably fixed before implementing auto-solving, and — as noted above — the fixes will not break existing package managers.

====Constraints disallowed in this GLEP====

For simplification, this GLEP will reject some of the REQUIRED_USE forms that are valid per the PMS. They will be rejected for all combinations of USE flags that do not satisfy the constraint. However, this is not a major issue for three reasons:

# The unsupported constraints are extremely rare, of low value and fixing them improves readability. As listed in [[#Restrictions_for_allowed_REQUIRED_USE_syntax|rationale for the restrictions]], there were a total of 8 packages being affected at the time of writing, and fixing them was already in progress.
# The constraints are only rejected for auto-solving but are still supported for REQUIRED_USE verification. The package manager will therefore just report the unsolvable REQUIRED_USE to the user, making this not a regression from the previous state.
# This GLEP does not strictly disallow the package manager from solving those constraints, only does not specify the solutions for them. Therefore, the package managers may implement custom extensions to solve them. However, they should still warn that this is non-portable and unreadable.

====Constraints that can not be solved====

Not all valid REQUIRED_USE constraints can be reliably solved. There are two major cases for that:

# Constraints that toggle flags that caused previous conditions not to apply. Solving those may require more than one iteration of the solving algorithm. However, they usually can be fixed easily by reordering.
# Constraints that have conflicts between flags. Solving those will result in repeated results where the constraint is unsatisfied. With multi-iteration solving, they can cause infinite loops. They have no trivial solution.

However, the problem usually applies to only some of the disallowed USE flag combinations. The verification algorithm should be able to detect most of those cases.

====Constraints with sub-optimal solutions====

While this specification uses an algorithm that attempts to read REQUIRED_USE constraints in the most natural way, not all constraints in Gentoo are written in this manner. Especially, many any-of, at-most-one-of and exactly-one-of groups are written with no specific ordering in mind. In some cases, they are used interchangeably with USE-conditional groups. Some USE-conditional groups are written without concern for clearly establishing the relation between the condition and the items inside the group.

While the auto-solving algorithm is able to solve many of those constraints, the solution can be considered sub-optimal as they do not follow the solution that the developers would knowingly suggest. For example, per the current rules the two following constraints are equivalent:

 feature? ( dep )
 !dep? ( !feature )

However, per the auto-solving semantic the first one will favor enabling the dependency, while the second one will favor disabling the feature.

This is probably the most important issue since there is no easy way to automatically detect that.

==Reference implementation==

===Proof-of-concept code===

The reference implementation of various algorithms and the scripts used to test them are included in [https://github.com/mgorny/required-use the mgorny/required-use project on GitHub].

The repository includes the following scripts/modules:
* {{Path|parser.py}} which provides a simple parser of REQUIRED_USE constraints into AST, and is supposed to represent a minimal parser that should be implemented in a package manager already. When run as a script, it outputs the AST of input string.
* {{Path|solve.py}} which provides an implementation of solving (enforcement) algorithm. When run a script, it prints the solutions (output flag sets) for every possible input flag combination.
* {{Path|sort_nary.py}} which provides an implementation of sorting any-of, at-most-one-of and exactly-one-of groups according to immutable flags. When run as a script, it prints the AST of input string after sorting.
* {{Path|to_flat3.py}} which implements the transformation into flattened constraints. When run as a script, it transforms the input string to a list of flattened constraints and prints it.
* {{Path|validate_ast.py}} which implements the validation of correct nesting in AST. When run as a script, it validates the input string.
* {{Path|verify2.py}} which implements the verification algorithms for the QA checks. When run as a script, it verifies the input string and prints the result.

===PkgCore===

The implementation of the following parts of the specification have been submitted to the PkgCore package manager for inclusion:

* validation of REQUIRED_USE constraints for compliance with the restricted syntax: [https://github.com/pkgcore/pkgcheck/pull/58 pkgcheck PR#58].

All of those bits are actively used in the pkgcheck fork used for the [[Project:Repository_mirror_and_CI|Repository mirror & CI]] project for CI.

==Thanks==

The author would like to thank Alexis Ballier <aballier@gentoo.org> for his feedback, mathematical analysis and his own reference code that helped shape the GLEP into its final form and made it possible to solve many of the problems.

==Copyright==

This work is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/.