Join Faster! Join Faster!!
Work will be crazy for the next few months. I don't think it will be super busy, but one of the projects I'll be working on will be very open-ended and undefined. I will be working on the optimizer and query execution engine of MySQL. That sounds really cool, since those are two of the main components of a database system. I think it'll be interesting to work on the main guts of the system, since that's where a lot of the cool crap happens. =)
However, the crazy part of this project (technical details below) is that it really isn't defined at all. I mean, there is basically a final, hand-wavy goal, but other than that, I have to do the rest. Theoretically, it should be possible, but I have no idea how to do it, how long it will take, and how much code I will have to change/add. The unknown is kinda scary, but I'll just be posi and say I'll get it working eventually... =) MySQL code is also very ugly, so I have to work with really ugly crap again. Yeah, it's ugly, and I don't like ugly...
Technical details, few will read, no one will care:
MySQL has a feature called multi-range read, where you can read multiple index key ranges in a single storage engine call. This basically aggregates what would otherwise be many separate single-key index lookup calls. However, this doesn't work completely with our setup, so I will have to change that first. After I do that, I can use it to greatly speed up joins. MySQL uses nested-loop joins for all of its joins. It also has a join buffer feature for doing block nested-loop joins, but it only considers using that for Cartesian products. So, my next task will be to use join buffers to better implement block nested-loop joins, and also to use multi-range reads to consolidate multiple index key lookups into fewer calls.
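To make the idea a bit more concrete, here's a toy Python sketch. It has nothing to do with the actual MySQL code: `ToyEngine`, `lookup`, `multi_lookup`, and `buffer_size` are all names I made up for illustration, and a dict stands in for the index. It just shows why buffering outer rows and sending their keys down in one batch cuts the number of storage engine calls.

```python
from collections import defaultdict


class ToyEngine:
    """Hypothetical stand-in for a storage engine with one indexed column."""

    def __init__(self, rows, key):
        self.index = defaultdict(list)
        for row in rows:
            self.index[row[key]].append(row)
        self.calls = 0  # how many times the "engine" was called

    def lookup(self, key):
        """Single index key lookup: one engine call per key."""
        self.calls += 1
        return list(self.index.get(key, []))

    def multi_lookup(self, keys):
        """Batched lookup (the multi-range read idea): one call, many keys."""
        self.calls += 1
        return {k: list(self.index.get(k, [])) for k in keys}


def nested_loop_join(outer, engine, key):
    """Plain nested-loop join: one engine call per outer row."""
    result = []
    for o in outer:
        for i in engine.lookup(o[key]):
            result.append((o, i))
    return result


def batched_join(outer, engine, key, buffer_size):
    """Fill a join buffer with outer rows, then look up all their keys at once."""
    result = []
    for start in range(0, len(outer), buffer_size):
        block = outer[start:start + buffer_size]   # the "join buffer"
        keys = {o[key] for o in block}             # duplicates collapse here
        matches = engine.multi_lookup(keys)        # one engine call per block
        for o in block:
            for i in matches[o[key]]:
                result.append((o, i))
    return result
```

With five outer rows and a `buffer_size` of 2, the plain version makes five engine calls and the batched one makes three, and both return the same joined rows. A bigger buffer means fewer calls, at the cost of holding more outer rows in memory.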
