March 1, 2019
As a general guideline, Java becomes very slow once heap usage passes roughly 50%, because the program then spends more time in the garbage collector than in your code.
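One way to see how close a run is to that threshold is to ask the JVM directly. The sketch below is illustrative only: `HeapCheck` and `usedFraction` are hypothetical names, not part of the project skeleton, and it relies only on the standard `java.lang.Runtime` methods.

```java
// Hypothetical helper (not part of the project skeleton): reports what
// fraction of the maximum heap is currently in use.
final class HeapCheck {
    static double usedFraction() {
        Runtime rt = Runtime.getRuntime();
        // Bytes of the currently allocated heap that are not free.
        long used = rt.totalMemory() - rt.freeMemory();
        // maxMemory() is the most heap the JVM will try to use (see -Xmx).
        return (double) used / rt.maxMemory();
    }

    public static void main(String[] args) {
        System.out.printf("heap in use: %.1f%%%n", 100 * usedFraction());
    }
}
```

The number includes garbage that has not been collected yet, so it is best read as an upper bound on live data.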
JSQLParser also supports more expressive limit clauses (e.g., including offsets). You will not be required to support anything more complex than LIMIT N.
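A LIMIT N operator can be sketched as a thin wrapper around its input. The example below assumes an iterator interface shaped like `java.util.Iterator`; `LimitIterator`, `LimitDemo`, and `firstN` are hypothetical names chosen for illustration, and your own operator interface may differ.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class LimitDemo {
    // Drains a LimitIterator into a list so the effect is easy to inspect.
    static List<Integer> firstN(List<Integer> rows, long n) {
        List<Integer> out = new ArrayList<>();
        Iterator<Integer> it = new LimitIterator<>(rows.iterator(), n);
        while (it.hasNext()) out.add(it.next());
        return out;
    }

    public static void main(String[] args) {
        System.out.println(firstN(List.of(5, 6, 7, 8), 2));
    }
}

// Hypothetical operator: passes through at most `limit` rows of its input.
final class LimitIterator<T> implements Iterator<T> {
    private final Iterator<T> input;
    private long remaining;

    LimitIterator(Iterator<T> input, long limit) {
        this.input = input;
        this.remaining = limit;
    }

    @Override
    public boolean hasNext() {
        // Stop as soon as the quota is used up, without touching the input.
        return remaining > 0 && input.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        remaining--;
        return input.next();
    }
}
```

Because `hasNext()` checks the quota first, the input is never asked for rows beyond the limit.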
$\sigma_{c_1 \wedge c_2 \wedge c_3}(R \times S) = \sigma_{c_1}(R) \bowtie_{c_2} \sigma_{c_3}(S)$
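The identity above can be unpacked step by step, assuming (this is not stated on the slide) that $c_1$ references only attributes of $R$, $c_3$ references only attributes of $S$, and $c_2$ references both:
$\sigma_{c_1 \wedge c_2 \wedge c_3}(R \times S) = \sigma_{c_2}(\sigma_{c_1}(\sigma_{c_3}(R \times S)))$ (split the conjunction)
$= \sigma_{c_2}(\sigma_{c_1}(R) \times \sigma_{c_3}(S))$ (push each single-table selection below the cross product)
$= \sigma_{c_1}(R) \bowtie_{c_2} \sigma_{c_3}(S)$ (a selection over a cross product is a join)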
Pattern: A SelectIterator with a ProjectIterator as an input.
Iterator optimize(Iterator query) {
  if(query instanceof SelectIterator){
    // ...
  }
}
Use instanceof to identify the root of your pattern.
Iterator optimize(Iterator query) {
  if(query instanceof SelectIterator){
    SelectIterator select = (SelectIterator)query;
    // ...
  }
}
Cast to the desired type.
Iterator optimize(Iterator query) {
  if(query instanceof SelectIterator){
    SelectIterator select = (SelectIterator)query;
    if(select.input instanceof ProjectIterator){
      ProjectIterator project = (ProjectIterator)select.input;
      // ...
    }
  }
}
Match children (e.g., ProjectIterator).
Iterator optimize(Iterator query) {
  if(query instanceof SelectIterator){
    SelectIterator select = (SelectIterator)query;
    if(select.input instanceof ProjectIterator){
      ProjectIterator project = (ProjectIterator)select.input;
      query = new ProjectIterator(
        new SelectIterator(
          project.input
        )
      );
    }
  }
}
Compute a replacement
Iterator optimize(Iterator query) {
  if(query instanceof SelectIterator){
    SelectIterator select = (SelectIterator)query;
    if(select.input instanceof ProjectIterator){
      ProjectIterator project = (ProjectIterator)select.input;
      query = new ProjectIterator(
        new SelectIterator(
          project.input
        )
      );
    }
  }
  query.input = optimize(query.input);
  return query;
}
Recur to find patterns nested in the children
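The slide code elides constructor arguments and the base class. A self-contained version of the same four steps (identify the root, cast, match the child, replace, then recur) might look like the sketch below. Everything about the class shapes here is an assumption made for illustration: `Node`, `Scan`, the string-valued `condition` and `columns` fields, and the `toString` format are hypothetical, and the swap is only valid when the projection simply keeps or drops columns that the condition refers to.

```java
final class Optimizer {
    static Node optimize(Node query) {
        if (query instanceof SelectIterator) {
            SelectIterator select = (SelectIterator) query;
            if (select.input instanceof ProjectIterator) {
                ProjectIterator project = (ProjectIterator) select.input;
                // Swap the two operators: filter first, then project.
                query = new ProjectIterator(project.columns,
                        new SelectIterator(select.condition, project.input));
            }
        }
        // Recur into the child (leaves have no input).
        if (query.input != null) {
            query.input = optimize(query.input);
        }
        return query;
    }

    public static void main(String[] args) {
        Node plan = new SelectIterator("a > 3",
                new ProjectIterator("a, b", new Scan("R")));
        System.out.println(optimize(plan)); // prints Project[a, b](Select[a > 3](R))
    }
}

// Hypothetical plan-node base class with a single (possibly null) input.
abstract class Node {
    Node input;
    Node(Node input) { this.input = input; }
}

final class Scan extends Node {
    final String table;
    Scan(String table) { super(null); this.table = table; }
    @Override public String toString() { return table; }
}

final class SelectIterator extends Node {
    final String condition;
    SelectIterator(String condition, Node input) { super(input); this.condition = condition; }
    @Override public String toString() { return "Select[" + condition + "](" + input + ")"; }
}

final class ProjectIterator extends Node {
    final String columns;
    ProjectIterator(String columns, Node input) { super(input); this.columns = columns; }
    @Override public String toString() { return "Project[" + columns + "](" + input + ")"; }
}
```

The null check before the recursive call is what lets the traversal stop at leaf operators such as a table scan.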
The reference implementation uses Java's HashMap.
Keep in mind that multiple tuples may share the same join key, so each key in the map may need to hold more than one tuple.
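A minimal in-memory sketch of that idea maps each join key to a list of tuples. This is not the reference implementation: rows are represented as `int[]` purely for brevity, and `HashJoinDemo`, `hashJoin`, and `concat` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class HashJoinDemo {
    // Equi-join on left[leftKey] == right[rightKey]; rows are int[] for brevity.
    static List<int[]> hashJoin(List<int[]> left, int leftKey,
                                List<int[]> right, int rightKey) {
        // Build phase: every key maps to ALL left rows that carry it.
        Map<Integer, List<int[]>> buckets = new HashMap<>();
        for (int[] row : left) {
            buckets.computeIfAbsent(row[leftKey], k -> new ArrayList<>()).add(row);
        }
        // Probe phase: emit one output row per matching (left, right) pair.
        List<int[]> out = new ArrayList<>();
        for (int[] r : right) {
            for (int[] l : buckets.getOrDefault(r[rightKey], List.of())) {
                out.add(concat(l, r));
            }
        }
        return out;
    }

    static int[] concat(int[] a, int[] b) {
        int[] joined = new int[a.length + b.length];
        System.arraycopy(a, 0, joined, 0, a.length);
        System.arraycopy(b, 0, joined, a.length, b.length);
        return joined;
    }

    public static void main(String[] args) {
        List<int[]> left = List.of(new int[]{1, 10}, new int[]{1, 11}, new int[]{2, 20});
        List<int[]> right = List.of(new int[]{1, 100}, new int[]{3, 300});
        System.out.println(hashJoin(left, 0, right, 0).size()); // prints 2
    }
}
```

Using a plain `Map<Integer, int[]>` instead would silently keep only the last row per key and drop join results.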
The join itself is easy once you have a sort operator. When running with --on-disk, the reference implementation uses Sort-Merge.
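The merge step can be sketched as follows. This is an in-memory illustration under stated assumptions, not the reference implementation: an on-disk version would replace the two `sort` calls with an external sort operator, rows are `int[]` for brevity, and `SortMergeDemo`, `mergeJoin`, and `concat` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SortMergeDemo {
    // Equi-join on left[lk] == right[rk]; rows are int[] for brevity.
    static List<int[]> mergeJoin(List<int[]> left, int lk, List<int[]> right, int rk) {
        List<int[]> l = new ArrayList<>(left);
        List<int[]> r = new ArrayList<>(right);
        l.sort(Comparator.comparingInt(row -> row[lk]));
        r.sort(Comparator.comparingInt(row -> row[rk]));

        List<int[]> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < l.size() && j < r.size()) {
            int a = l.get(i)[lk], b = r.get(j)[rk];
            if (a < b) {
                i++;                       // left key too small: advance left
            } else if (a > b) {
                j++;                       // right key too small: advance right
            } else {
                // Find the end of the run of right rows sharing this key.
                int jEnd = j;
                while (jEnd < r.size() && r.get(jEnd)[rk] == a) jEnd++;
                // Pair every left row in the run with every right row in the run.
                while (i < l.size() && l.get(i)[lk] == a) {
                    for (int k = j; k < jEnd; k++) out.add(concat(l.get(i), r.get(k)));
                    i++;
                }
                j = jEnd;
            }
        }
        return out;
    }

    static int[] concat(int[] a, int[] b) {
        int[] joined = new int[a.length + b.length];
        System.arraycopy(a, 0, joined, 0, a.length);
        System.arraycopy(b, 0, joined, a.length, b.length);
        return joined;
    }

    public static void main(String[] args) {
        List<int[]> left = List.of(new int[]{2, 20}, new int[]{1, 10}, new int[]{1, 11});
        List<int[]> right = List.of(new int[]{3, 300}, new int[]{1, 100}, new int[]{1, 101});
        System.out.println(mergeJoin(left, 0, right, 0).size()); // prints 4
    }
}
```

The run handling matters for the same reason as in the hash join: duplicate keys on both sides must produce every pairing, not just one.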
Due: March 29