Even when a destructive update must be performed on an output reference, UIP outperforms LRLR.
This is fairly straightforward and is the sort of thing done in LRLR.
By contrast, LRLR allocates [H|NT] and binds NT in the recursive call for a total of two stores (initialization and binding) in the second field as compared with UIP's one.
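To make the store counts concrete, the sketch below shows the standard definition of `app/3`, followed by a hand-written emulation of filling an output field through a reference with `setarg/3`. The predicate names `app_ref/3` and `app_fill/4` are hypothetical and do not come from the paper. The emulation assumes a system whose `setarg/3` stores a reference to its value rather than a copy (SWI-Prolog is one such system). It illustrates the idea only and is not the authors' transformation or their macro-based implementation.

```prolog
% Standard, untransformed list concatenation.
% The output cell [H|NT] is built in the head; NT is bound by the
% recursive call. This is the second-field initialization followed by
% a binding that the text above counts as two stores.
app([], L, L).
app([H|T], L, [H|NT]) :- app(T, L, NT).

% Hypothetical emulation (assumption, not the paper's code): the caller
% passes a term Cell and an argument index I naming the field that must
% eventually receive the result.
app_ref(Xs, Ys, Zs) :-
    functor(Root, out, 1),          % fresh out(_) holder for the result
    app_fill(Xs, Ys, Root, 1),
    arg(1, Root, Zs).

app_fill([], L, Cell, I) :-
    setarg(I, Cell, L).             % write the final tail into the pending field
app_fill([H|T], L, Cell, I) :-
    New = [H|_],                    % second field left unset for now
    setarg(I, Cell, New),           % link the new cell into the previous one
    app_fill(T, L, New, 2).         % the tail of New is the next pending field
```

For example, the query `app_ref([1,2], [3], Zs)` should yield the same list as `app([1,2], [3], Zs)`. Because `setarg/3` is undone on backtracking, this emulation is only suitable for deterministic use such as the one shown.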
This means that the polynomial method for reuse analysis presented by Debray cannot be applied to LRLR or UIP.
To compare the performance of LRLR and UIP code against standard, untransformed code, we measured the runtimes of 16 different programs.
Performance Comparison of Standard Prolog, LRLR, UIP Using setarg/3, and UIP Using Macros.
Std LRLR UIP w/setarg Program Upd Upd Speedup Upd Speedup app 4.
Of course, this is easily remedied by either UIP or LRLR.
This is what we expected, since LRLR performs at least as many destructive updates as UIP.
The cases where LRLR outperforms UIP implemented with setarg/3 (labeled "w/setarg" in Table I) are those in which most output references must eventually be updated.