To better understand the cost of data queries, some tests have been carried out.
Different databases, different ways of specifying the query, and various indexing methods have been tried.
Sample data are stored in a relational database.
A Perl script is run to retrieve events located in a region defined by ra_max, dec_max, ra_min, dec_min (typically 2x2 degrees).
The time spent by each query and the number of events found are recorded.
The query is launched 100 times and statistics are output.
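The measurement loop described above can be sketched as follows. This is only an illustration in Python, not the Perl script used for the tests. The names random_box, benchmark and run_query are hypothetical, and drawing a new random region for each run is an assumption: the text does not say how the regions were chosen.

```python
import random
import statistics
import time


def random_box(size=2.0, rng=random):
    """Return (ra_min, dec_min, ra_max, dec_max) for a size x size degree region.

    Assumption: regions are drawn uniformly in ra/dec; the original
    script may have selected them differently.
    """
    ra_min = rng.uniform(0.0, 360.0 - size)
    dec_min = rng.uniform(-90.0, 90.0 - size)
    return ra_min, dec_min, ra_min + size, dec_min + size


def benchmark(run_query, n_runs=100, size=2.0, rng=random):
    """Call run_query(box) n_runs times and summarise time and row counts.

    run_query is a hypothetical callable that executes one query for the
    given region and returns the list of matching rows.
    """
    durations = []
    counts = []
    for _ in range(n_runs):
        box = random_box(size, rng)
        start = time.perf_counter()
        rows = run_query(box)
        durations.append(time.perf_counter() - start)
        counts.append(len(rows))
    return {
        "runs": n_runs,
        "mean_seconds": statistics.mean(durations),
        "max_seconds": max(durations),
        "mean_events": statistics.mean(counts),
    }
```

Passing a function that wraps the real database call as run_query would give the average response time and the average number of events per query.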
PostgreSQL has long provided geospatial features.
select * from events where ra_min < ra and ra < ra_max and dec_min < dec and dec < dec_max
The response time is around 1 second. Over 100 queries, the average number of events found is 50.
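As an illustration of how the four limits can be passed to this first form of the query, here is a minimal sketch with bound parameters. It uses Python's built-in sqlite3 module with an in-memory table as a stand-in, so that it runs without a PostgreSQL server. The table layout, the function name events_in_region and the rows are made up for the example.

```python
import sqlite3

# Same predicate as the plain comparison query, with named parameters.
RANGE_QUERY = (
    "select * from events "
    "where :ra_min < ra and ra < :ra_max "
    "and :dec_min < dec and dec < :dec_max"
)


def events_in_region(conn, ra_min, dec_min, ra_max, dec_max):
    """Return all events strictly inside the given region."""
    params = {
        "ra_min": ra_min,
        "dec_min": dec_min,
        "ra_max": ra_max,
        "dec_max": dec_max,
    }
    return conn.execute(RANGE_QUERY, params).fetchall()


# Tiny in-memory table with invented rows, only to exercise the query.
conn = sqlite3.connect(":memory:")
conn.execute("create table events (id integer primary key, ra real, dec real)")
conn.executemany(
    "insert into events (ra, dec) values (?, ?)",
    [(10.5, 20.5), (11.9, 21.0), (12.0, 21.0), (50.0, -30.0)],
)

rows = events_in_region(conn, 10.0, 20.0, 12.0, 22.0)
print(len(rows))
```

Because the comparisons are strict, the row with ra equal to ra_max is not returned.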
select * from events where point(ra,dec) @ box ra_max,dec_max, ra_min,dec_min
The point object is constructed from the contents of the two columns ra and dec. The operator @ means "is inside".
The response time is around 1.2 seconds.
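What the "is inside" test checks can be sketched in a few lines of Python. This is not PostgreSQL's implementation. The names Point, Box, make_box and is_inside are hypothetical, and treating points on the boundary as inside is an assumption of the sketch. The corner reordering mirrors the documented behaviour of the box type, which accepts any two opposite corners.

```python
from typing import NamedTuple


class Point(NamedTuple):
    x: float  # ra
    y: float  # dec


class Box(NamedTuple):
    low: Point   # corner with the smallest ra and dec
    high: Point  # corner with the largest ra and dec


def make_box(x1, y1, x2, y2):
    """Build a box from any two opposite corners, reordering them as needed."""
    return Box(
        Point(min(x1, x2), min(y1, y2)),
        Point(max(x1, x2), max(y1, y2)),
    )


def is_inside(p, box):
    """True if p lies in the box (boundary included, by assumption)."""
    return (
        box.low.x <= p.x <= box.high.x
        and box.low.y <= p.y <= box.high.y
    )
```

For example, is_inside(Point(11.0, 21.0), make_box(12.0, 22.0, 10.0, 20.0)) is true, whatever the order in which the two corners are given.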
select * from events where position @ box ra_max,dec_max, ra_min,dec_min
A column position has been added to the table. It is of type point and stores ra, dec.
The response time is around 0.7 seconds.
select * from events where error && box ra_max,dec_max, ra_min,dec_min
A column error has been added to the table. It is of type box and stores ra+dg, dec+dg, ra-dg, dec-dg with dg = 1°. The operator && means "overlaps".
The response time is around 0.9 seconds.
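The overlap test on the error box can be illustrated with a short Python sketch. The function names error_box and overlaps are hypothetical. Counting boxes that only touch as overlapping is an assumption of the sketch, and the wrap-around of ra at 0/360 degrees is ignored.

```python
def error_box(ra, dec, dg=1.0):
    """Return (ra_min, dec_min, ra_max, dec_max) for the box ra±dg, dec±dg."""
    return (ra - dg, dec - dg, ra + dg, dec + dg)


def overlaps(a, b):
    """True if boxes a and b share at least one point.

    Two boxes overlap when their ra intervals intersect and their
    dec intervals intersect. Touching edges count as overlapping
    (an assumption of this sketch).
    """
    a_ra_min, a_dec_min, a_ra_max, a_dec_max = a
    b_ra_min, b_dec_min, b_ra_max, b_dec_max = b
    return (
        a_ra_min <= b_ra_max
        and b_ra_min <= a_ra_max
        and a_dec_min <= b_dec_max
        and b_dec_min <= a_dec_max
    )


# Search region of 2x2 degrees and two made-up events.
search = (10.0, 10.0, 12.0, 12.0)
near_event = error_box(12.5, 11.0)  # centre 0.5 degree outside the region
far_event = error_box(13.5, 11.0)   # centre 1.5 degrees outside the region
print(overlaps(near_event, search), overlaps(far_event, search))
```

With dg = 1, an event whose centre lies up to 1 degree outside the search region still matches, so this query can return more events than the point-based ones.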