A5 Q2

Discussion:

A5 Q2

(too old to reply)

Daniel S. Roche

2011-07-17 21:34:00 UTC

Hi class,

Some students are asking about how to handle duplicate keys in the
RangeSkip class. Note that the underlying (included) SkipList class does
handle duplicate keys seamlessly, just by having separate nodes
(SkipTowers) for each time the key is repeated.

There are at least two good reasons why your RangeSkip class should also
handle duplicate x values in the keys:
1) Since we have two-dimensional data, two keys could be distinct but
have identical x-coordinate. For instance, the points (1,2) and (1,3).
2) The keys here are approximate values (type double), so the question
of equality becomes much less clear anyway. For instance, if the points
represent geographical coordinates, we might have two different cities
that are very close together, so their (imprecise) coordinates are the
same even though they should be distinct.

However, for the purposes of auto-marking your assignment, I will be
very nice and promise that all points are in *general position*, meaning
the the x-coordinates AND the y-coordinates of any two distinct points
will always be different. So for instance we won't test with points
(1,2) and (1,3), or with points (4,5) and (6,5), or with points (0,2)
and (0,2).

So although you should handle duplicate x and y coordinates, and I
actually think the code is simpler if this is handled correctly, you
don't have to in order to receive full marks for this assignment.

-Dan Roche
CS 240 Instructor, Section 003

Maros Hluska

2011-07-18 04:49:17 UTC

Permalink

Post by Daniel S. Roche
Hi class,
Some students are asking about how to handle duplicate keys in the
RangeSkip class. Note that the underlying (included) SkipList class does
handle duplicate keys seamlessly, just by having separate nodes
(SkipTowers) for each time the key is repeated.
There are at least two good reasons why your RangeSkip class should also
1) Since we have two-dimensional data, two keys could be distinct but
have identical x-coordinate. For instance, the points (1,2) and (1,3).
2) The keys here are approximate values (type double), so the question
of equality becomes much less clear anyway. For instance, if the points
represent geographical coordinates, we might have two different cities
that are very close together, so their (imprecise) coordinates are the
same even though they should be distinct.
However, for the purposes of auto-marking your assignment, I will be
very nice and promise that all points are in *general position*, meaning
the the x-coordinates AND the y-coordinates of any two distinct points
will always be different. So for instance we won't test with points
(1,2) and (1,3), or with points (4,5) and (6,5), or with points (0,2)
and (0,2).
So although you should handle duplicate x and y coordinates, and I
actually think the code is simpler if this is handled correctly, you
don't have to in order to receive full marks for this assignment.
-Dan Roche
CS 240 Instructor, Section 003

I can't figure out how to handle duplicate x-coordinates. Is this
something you can explain since its not tested for on the assignment?

Since were using a BST for the x-coordinates, don't the values in the
left subtree have to be strictly less than the value at the node, and
strictly greater for values in the right subtree? So I'm not sure how to
add duplicate values without modifying the RangeNode structure.

Maros Hluska

Daniel S. Roche

2011-07-18 15:15:27 UTC

Permalink

Post by Maros Hluska
I can't figure out how to handle duplicate x-coordinates. Is this
something you can explain since its not tested for on the assignment?
Since were using a BST for the x-coordinates, don't the values in the
left subtree have to be strictly less than the value at the node, and
strictly greater for values in the right subtree? So I'm not sure how to
add duplicate values without modifying the RangeNode structure.

Well, the way we defined BSTs originally was assuming there were no
duplicates. To allow for duplicates, we just relax the ordering
conditions to say "all keys in left subtree are <= key of root" and "all
keys in right subtree are >= key of root".

For range trees (or range-skip trees), every point appears exactly once
in the top-level BST, which I will call the x-tree. So points that have
the same x coordinate show up as separate nodes in the x-tree. Of course
they must be predecessors/successors of each other, or else the BST
ordering property would be violated. However, the ordering among these
different nodes with the same x-coordinate is unspecified, and in fact
it doesn't matter. The definition of the associated structures is also
unchanged: they contain all points in the corresponding subtree from the
x-tree, sorted by y-coordinates.

I will point out that the included RangeSkip::rangeSearch function will
work correctly if there are different points with the same x-coordinate
in the x-tree, as long as they follow all the normal rules of the tree
as discussed above.

Again, YOU DO NOT NEED TO HANDLE SUCH CASES FOR THE PURPOSES OF MARKING
A5. Although again, I think that the code is probably simpler if you do.

-Dan Roche
CS 240 Instructor, Section 003