Quick sort and related Arrays utilities#58
Quick sort and related Arrays utilities#58prvshah51 wants to merge 33 commits intodafny-lang:masterfrom
Conversation
The latter doesn’t work well in CI here because the /verificationLogger options cause extra, unpredictable output lines
src/Comparison.dfy
Outdated
| cmp(t0, t1) | ||
| } | ||
|
|
||
| predicate Complete??(t0: T, t1: T) { |
There was a problem hiding this comment.
I'm not a big fan of the ?? suffix. We may want to engage in some bikeshedding about style here before we introduce it. Based on my experience even Valid? below is not common, Valid is much more common and is even baked into the language through {:autocontracts}.
I'm not necessarily to using ? in user-defined constructs, I'd just like to agree on concrete guidance in the style guide on it. :)
There was a problem hiding this comment.
Okay I see where the idea came from at least - you almost want to overload Complete? to accept either a pair or a set.
Could this be CompleteOnPair perhaps?
There was a problem hiding this comment.
Yeah, the ?? was due to have two layers. Happy with calling them Complete and Complete' (the OCaml style) or Complete1 (the Lisp style) or even Complete_ or Complete2. CompletePair is long.
There was a problem hiding this comment.
Okay, let's go with removing all of the single ? suffixes (except for Le? and Ge? above), and replacing all of the double ?? suffixes with a ' instead.
So in this case Complete?? will become Complete', and Complete? will become Complete.
robin-aws
left a comment
There was a problem hiding this comment.
So naturally I've focussed most of my reviewing capacity on arguing about question marks. :) I do actually think it's a really important point for readability though, so I'd love to nail that down first before I look more deeply at the proofs.
| reads arr | ||
| ensures t in arr[lo..hi] | ||
| { | ||
| arr[lo] // TODO |
There was a problem hiding this comment.
We should either implement this or drop the function and just inline this choice of pivot above.
There was a problem hiding this comment.
Let's implement it — it's not hard and it's an important part of making the algorithm safe.
There was a problem hiding this comment.
As per offline conversation we decided not to block on this, but in that case we should add a comment on QuickSort that it doesn't yet have the intended O(n log n) worst case runtime yet (which is what @cpitclaudel was referring to as "safe").
src/Collections/Arrays/Arrays.dfy
Outdated
| } | ||
| } | ||
|
|
||
| lemma Sortable_Slice(arr: array<T>, lo: int, hi: int, lo': int, hi': int) |
There was a problem hiding this comment.
| lemma Sortable_Slice(arr: array<T>, lo: int, hi: int, lo': int, hi': int) | |
| lemma SortableSlice(arr: array<T>, lo: int, hi: int, lo': int, hi': int) |
Only instance of an underscore I see here and it really sticks out.
Also consider SortableSubslice instead.
There was a problem hiding this comment.
SortableSlice or SortableSubslice both makes sense putting Subslice describes it better.
| import opened Comparison | ||
| import opened Wrappers | ||
|
|
||
| trait Sorter<T> { |
There was a problem hiding this comment.
This definitely needs an example in examples to demonstrate using it. I assume you have to not only provide a concrete class that implements Sorter<T> and fills in Cmp, but perhaps even has to invoke a lemma or two from the trait in order to prove that it works.
Is it possible to also provide a PredicateSorter class that is constructed around a (T, T) -> Cmp function value? Perhaps it could even be a partial function if you also provide a ghost set of all values you intend to sort with it.
There was a problem hiding this comment.
Just to clarify, I'd strongly prefer to have an example, but the PredicateSorter idea can wait for a future PR.
There was a problem hiding this comment.
Oh yikes, I went to write an example just to play around with it and immediately got the A class may only extend a trait in the same module, unless the parent trait is annotated with {:termination false} error. That means this is currently useless unless we add {:termination false}.
Unfortunately we should treat that as a hard blocker given there's already been resistance to putting {:termination false} on any standard library code: #22 And especially given the discovery of dafny-lang/dafny#2500 I'm not inclined to push back on that.
We can think about whether there's a good alternative for this code that doesn't use traits, but it might be better to stick with this if it's good UX and at least wait for the solution to dafny-lang/dafny#2500.
There was a problem hiding this comment.
That's not right: the proper API to use is PredicateSorter, which is a regular class, defined right below Sorter. That doesn't require :termination false.
Original code here, with examples: https://github.com/dafny-lang/compiler-bootstrap/blob/4f616822209828e48cf63d3da66ee1c010f689d4/src/Utils/Lib.Sort.dfy#L396-L420
There was a problem hiding this comment.
Thanks @cpitclaudel, I see the issue now is this PR doesn't include the PredicateSorter class. We should just add that as well.
| const Le? := this != Gt | ||
| const Ge? := this != Lt |
There was a problem hiding this comment.
I definitely like this use of ? at least, since it's a very natural extension of the built-in Lt?/Eq?/Gt? discriminator syntax.
| Complete?(ts) && /* Antisymmetric?(ts) && */ Transitive?(ts) | ||
| } | ||
|
|
||
| predicate Sorted(sq: seq<T>) { |
There was a problem hiding this comment.
Why wouldn't this have a ? as well then?
I'll be happiest if we come up with a simple rule for where ? belongs. I definitely don't think it should be every predicate, especially since changing Valid to Valid? will be very disruptive for arguably not a lot of benefit. It looks like the pattern here is if the predicate is "about" the receiver rather than the arguments, but that feels very fuzzy to me, and I don't find it improves readability personally.
Perhaps to start, the straw-person proposal could be to only use it in symbols that aren't used like function calls, such as the Le? and Ge? constants above.
src/Comparison.dfy
Outdated
| cmp(t0, t1) | ||
| } | ||
|
|
||
| predicate Complete??(t0: T, t1: T) { |
There was a problem hiding this comment.
Okay I see where the idea came from at least - you almost want to overload Complete? to accept either a pair or a set.
Could this be CompleteOnPair perhaps?
| requires 0 <= lo <= hi <= arr.Length | ||
| reads arr | ||
| { | ||
| multiset(arr[lo..hi]) == old(multiset(arr[lo..hi])) |
There was a problem hiding this comment.
I remember adding this to almost every every function method in MergeSort.dfy will need to replace that with shuffled instead.
Co-authored-by: Robin Salkeld <salkeldr@amazon.com>
| import opened Comparison | ||
| import opened Wrappers | ||
|
|
||
| trait Sorter<T> { |
There was a problem hiding this comment.
Just to clarify, I'd strongly prefer to have an example, but the PredicateSorter idea can wait for a future PR.
src/Comparison.dfy
Outdated
| cmp(t0, t1) | ||
| } | ||
|
|
||
| predicate Complete??(t0: T, t1: T) { |
There was a problem hiding this comment.
Okay, let's go with removing all of the single ? suffixes (except for Le? and Ge? above), and replacing all of the double ?? suffixes with a ' instead.
So in this case Complete?? will become Complete', and Complete? will become Complete.
src/Comparison.dfy
Outdated
| {} | ||
|
|
||
| } | ||
| } No newline at end of file |
src/Comparison.dfy
Outdated
| } | ||
|
|
||
| predicate {:opaque} Valid?(ts: set<T>) { | ||
| Complete?(ts) && /* Antisymmetric?(ts) && */ Transitive?(ts) |
There was a problem hiding this comment.
| Complete?(ts) && /* Antisymmetric?(ts) && */ Transitive?(ts) | |
| Complete?(ts) && */ Transitive?(ts) |
Just following the general rule of not checking in commented out code. We could put a comment saying something like "we want to include Antisymmetric(ts) as well but it isn't necessary and makes verification timeout" (for example, I don't actually know in this case).
| } | ||
|
|
||
| predicate Complete??(t0: T, t1: T) { | ||
| cmp(t0, t1) == cmp(t1, t0).Flip() |
There was a problem hiding this comment.
It threw me a bit that I didn't recognize this isn't one of the standard properties of total orderings, until I realized that "Transitive" really meant "cmp(...).Le? is a transitive binary relation" and similarly for "Reflexive". Is "complete" a standard term for this property of this kind of comparator?
I expect at the point where we try to connect Relations to this version, we'll have a lemma that proves that C.Valid(s) ==> Relations.TotalOrdering((t1, t2) => C.cmp(t1, t2).Le?)
src/Collections/Arrays/Arrays.dfy
Outdated
| ensures Shuffled(arr, lo, hi) | ||
| {} | ||
|
|
||
| twostate predicate SameElements(arr: array<T>, lo: int, hi: int) |
There was a problem hiding this comment.
Consider UnchangedExceptSliceShuffled instead.
| reads arr | ||
| ensures t in arr[lo..hi] | ||
| { | ||
| arr[lo] // TODO |
There was a problem hiding this comment.
As per offline conversation we decided not to block on this, but in that case we should add a comment on QuickSort that it doesn't yet have the intended O(n log n) worst case runtime yet (which is what @cpitclaudel was referring to as "safe").
| Complete?(ts) && /* Antisymmetric?(ts) && */ Transitive?(ts) | ||
| } | ||
|
|
||
| predicate Sorted(sq: seq<T>) { |
| cmp(t0, t1) | ||
| } | ||
|
|
||
| predicate CompleteonPair(t0: T, t1: T) { |
There was a problem hiding this comment.
| predicate CompleteonPair(t0: T, t1: T) { | |
| predicate Complete'(t0: T, t1: T) { |
CompleteonPair was my earlier not-as-good suggestion, let's do Complete' instead :)
|
|
||
| function method Cmp(t0: T, t1: T): Cmp | ||
|
|
||
| twostate predicate UnchangedSlice(arr: array<T>, lo: int, hi: int) |
There was a problem hiding this comment.
Just food for thought, or a potential learning exercise: It might make all of this more readable if we had an actual ArraySlice datatype to pass around containing these values. It's a great example of the classic Design Pattern of introducing a type to hold collections of values that are always passed around together. Something like:
datatype ArraySlice_<T> = ArraySlice_(arr: array<T>, lo: int, hi: int) {
predicate Valid() {
0 <= lo <= hi <= arr.Length
}
twostate predicate Unchanged()
requires Valid()
reads arr
{
arr[lo..hi] == old(arr[lo..hi])
}
// etc.
}
type ArraySlice<T> = a: ArraySlice_<T> | a.Valid() witness *AFAICT this would be useful even if we only used it in ghost code to avoid any runtime cost (and even then it's possible we could optimize the wrapper away in the future, just as we now do for single-field datatypes).
| import opened Comparison | ||
| import opened Wrappers | ||
|
|
||
| trait Sorter<T> { |
There was a problem hiding this comment.
Oh yikes, I went to write an example just to play around with it and immediately got the A class may only extend a trait in the same module, unless the parent trait is annotated with {:termination false} error. That means this is currently useless unless we add {:termination false}.
Unfortunately we should treat that as a hard blocker given there's already been resistance to putting {:termination false} on any standard library code: #22 And especially given the discovery of dafny-lang/dafny#2500 I'm not inclined to push back on that.
We can think about whether there's a good alternative for this code that doesn't use traits, but it might be better to stick with this if it's good UX and at least wait for the solution to dafny-lang/dafny#2500.
Fix #54
By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.