| Commit message (Collapse) | Author | Age | Files | Lines |
| |
By default, the ES index is now created only if it is not
already present. If it already exists, the kkuleomi:index:init
task will not recreate it.
To forcibly recreate the index, use kkuleomi:index:recreate.
Signed-off-by: Max Magorsch <arzano@gentoo.org>
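The difference between the two tasks can be sketched roughly as follows. This is only an illustration: FakeIndices is a hypothetical in-memory stand-in for the Elasticsearch indices API, and init_index / recreate_index are made-up helper names, not the actual task implementations.

```ruby
# Hypothetical in-memory stand-in for an Elasticsearch indices API.
class FakeIndices
  attr_reader :create_calls

  def initialize
    @names = {}
    @create_calls = 0
  end

  def exists?(name)
    @names.key?(name)
  end

  def create(name)
    @create_calls += 1
    @names[name] = true
  end

  def delete(name)
    @names.delete(name)
  end
end

# Behaviour described for the init task: create only when missing.
# Returns true if the index was created, false if it was left alone.
def init_index(indices, name)
  return false if indices.exists?(name)

  indices.create(name)
  true
end

# Behaviour described for the recreate task: always drop and create again.
def recreate_index(indices, name)
  indices.delete(name) if indices.exists?(name)
  indices.create(name)
  true
end
```

Running init_index twice against the same index name creates the index once; only recreate_index replaces an existing index.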
| |
Signed-off-by: Hans de Graaff <graaff@gentoo.org>
| |
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
| |
So far, all history-related information has been fetched using git.
That is, whenever a user requested history-related information,
'git log' was run and its output was parsed, which is time-consuming.
Loading the page https://packages.gentoo.org/packages/keyworded takes
around 120 seconds this way.
Instead, git commits are now added to an ES index and retrieved using
ES. With this change, the same page loads in under 3 seconds.
The commits for populating the index are fetched incrementally. The
first run may take some time, but subsequent updates are fast.
Signed-off-by: Max Magorsch <max@magorsch.de>
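One way to picture the incremental fetch is the sketch below. It assumes the id of the last indexed commit is stored somewhere and that only newer commits are requested on later runs; revision_range and commits_to_index are hypothetical helper names, and the real code may track its position differently.

```ruby
# Revision range to hand to `git log`: everything on the first run,
# only commits after the last indexed one on later runs.
def revision_range(last_indexed_commit)
  last_indexed_commit ? "#{last_indexed_commit}..HEAD" : "HEAD"
end

# The same idea on an in-memory history (oldest commit first).
# Returns the commits that still need to be added to the index.
def commits_to_index(history, last_indexed_commit)
  return history if last_indexed_commit.nil?

  position = history.index(last_indexed_commit)
  # Unknown commit (e.g. rewritten history): fall back to a full run.
  return history if position.nil?

  history.drop(position + 1)
end
```

The first run walks the full history; every later run only sees the commits added since the stored id.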
| |
Elasticsearch-persistence is used as the persistence layer for Ruby
domain objects in Elasticsearch in this application. So far, the
ActiveRecord pattern has been used here. However, this pattern
has been deprecated as of version 6 of the gem and was removed in
version 7. That's why the application has been migrated to use the
repository pattern instead.
For further information, please see:
https://www.elastic.co/blog/activerecord-to-repository-changing-persistence-patterns-with-the-elasticsearch-rails-gem
Note: The old Elasticsearch index won't be compatible with this
version anymore. That's why a fresh index should be populated.
Signed-off-by: Max Magorsch <max@magorsch.de>
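To illustrate the general shape of the repository pattern, here is a minimal sketch in plain Ruby. It deliberately does not use the elasticsearch-persistence gem: a Hash stands in for the index, and Package, PackageRepository and their methods are hypothetical names chosen for this example.

```ruby
# Plain domain object: it knows nothing about how it is stored.
Package = Struct.new(:atom, :description)

# Hypothetical repository; a Hash stands in for the Elasticsearch index.
class PackageRepository
  def initialize
    @store = {}
  end

  # Serialize the object into a document and store it under its atom.
  def save(package)
    @store[package.atom] = { atom: package.atom, description: package.description }
    package
  end

  # Load a document and turn it back into a domain object (nil if missing).
  def find(atom)
    doc = @store[atom]
    doc && Package.new(doc[:atom], doc[:description])
  end
end
```

With the ActiveRecord pattern, the domain object itself would carry save and find; with the repository pattern, persistence lives in a separate object and the model stays a plain Ruby object.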
| |
The previous version relied on Elasticsearch matching packages based
solely on the package name (e.g. "gentoo-sources"), but an entire atom
is required in order to collect the version from the index.
| |
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
| |
According to
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html
this controls the number of buckets (defaulting to 10 apparently). This
was changed in ES5 but went unnoticed until now.
Force to 10k buckets to continue to support flags with *lots* of
packages and revisit in the future if too expensive.
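As a rough illustration, a request body with an explicit bucket limit might look like the sketch below. It assumes the setting in question is the terms aggregation's size parameter; the helper name, the aggregation name group_by_package and the field name package are hypothetical.

```ruby
# Builds a search body whose terms aggregation returns up to
# bucket_limit buckets instead of the default of 10.
def packages_aggregation_body(bucket_limit = 10_000)
  {
    size: 0, # no hits wanted, only the aggregation result
    aggs: {
      group_by_package: {
        terms: { field: "package", size: bucket_limit }
      }
    }
  }
end
```

The outer size: 0 suppresses search hits, while the inner size bounds the number of buckets; the two settings are independent of each other.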
| |
Some USE flags like static-libs are very common, so raise the limit.
Bug: https://bugs.gentoo.org/648040
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
| |
It's simpler to fix by not renaming the class and just renaming the index
name due to how we automate things in p.g.o.
| |
Merge base_settings into settings used to create index.
| |
This was originally designed as a replacement for find_all_by_parent,
but we don't need a custom function for it because
Version.find_all_by(:package, name) works just as well.
| |
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html
| |
size: 0 goes outside of the query (and already exists in this
query); dropping it fixes queries.
| |
The 'text' fields were not indexed properly. Keyword fields are
indexed properly, so turn it back into a keyword.
Also fix the index reloading to remove indices by name.
| |
https://www.elastic.co/guide/en/elasticsearch/reference/1.4/query-dsl-function-score-query.html
| |
Indexes have field limits now. I'm not sure this is the right fix,
but I've expanded it to 25k; hoping to contain the entire tree.
| |
ES6's multiple indexes (as opposed to a single index) mean we cannot
use this relation. Instead we key versions off of package atoms.
| |
Due to the splitting of indexes for ES6, we can no longer rely on
parent/child relations to determine 'which packages have which versions'
because ES6 cannot join between two indexes.
Instead, we look up a particular package (CP) in the versions table.
This should yield a reasonable number of versions (typically 1-100).
We then use those results to find answers like "highest version" by
simply sorting the result and taking the first entry of the sort.
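The lookup-then-sort approach described above can be sketched as follows. This is an in-memory illustration only: VersionDoc and its fields are hypothetical, and it assumes each version document carries a precomputed numeric sort_key, since real Gentoo version ordering is more involved than a plain string comparison.

```ruby
# Hypothetical shape of a document in the versions index.
VersionDoc = Struct.new(:package, :version, :sort_key)

# Step 1: fetch all versions that belong to one package atom (CP).
def versions_for(all_versions, atom)
  all_versions.select { |doc| doc.package == atom }
end

# Step 2: sort the (small) result set, newest first, and take the first.
def highest_version(all_versions, atom)
  versions_for(all_versions, atom).sort_by { |doc| -doc.sort_key }.first
end

index = [
  VersionDoc.new("sys-kernel/gentoo-sources", "4.9.72", 1),
  VersionDoc.new("sys-kernel/gentoo-sources", "4.14.8", 2),
  VersionDoc.new("app-editors/vim", "8.0.1298", 1)
]
newest = highest_version(index, "sys-kernel/gentoo-sources")
```

Because a single package typically has few versions, sorting on the client side after the lookup stays cheap.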
| |
ES6 replaces this with 'query'. See:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-has-parent-query.html
| |
Comma separate the list items, of course.
| |
ES6 introduces a number of breaking changes into ES.
1) https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html
This basically entails breaking up the single multi-type index we had
into N indices, one per model. Our other option was to continue to use a
single index, but add a custom type property. This seemed unwieldy.
2) The 'String' type was also deprecated. See
https://www.elastic.co/blog/strings-are-dead-long-live-strings
So we see many updates from "not analyzed" to "keyword" to retain
the previous behavior.
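The second point can be illustrated with a small mapping conversion. This is a sketch under assumptions, not the project's actual migration: modernize_field is a hypothetical helper, and it only handles the type and index keys of a single field mapping.

```ruby
# Converts a pre-ES5 'string' field mapping to its modern equivalent:
# not_analyzed strings become 'keyword', analyzed strings become 'text'.
# Other field types are returned unchanged.
def modernize_field(mapping)
  return mapping unless mapping[:type] == "string"

  # Keep any other settings, drop the old type/index pair.
  rest = mapping.reject { |key, _| %i[type index].include?(key) }
  new_type = mapping[:index] == "not_analyzed" ? "keyword" : "text"
  { type: new_type }.merge(rest)
end
```

A keyword field is matched as an exact value, which is what not_analyzed strings did before the change.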
| |
Similar to the previous commit:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-filtered-query.html
documents that query => filtered semantic is deprecated. This fixes more
references.
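For readers unfamiliar with the change, the usual replacement for a filtered query is a bool query with must and filter clauses. The sketch below shows that rewrite on plain hashes; filtered_to_bool is a hypothetical helper and the example field names are made up.

```ruby
# Rewrites the body of a deprecated 'filtered' query into the
# equivalent 'bool' query: the scoring part moves to :must,
# the non-scoring part to :filter.
def filtered_to_bool(filtered)
  bool = {}
  bool[:must] = filtered[:query] if filtered[:query]
  bool[:filter] = filtered[:filter] if filtered[:filter]
  { bool: bool }
end

old_body = {
  query: { match: { name: "gentoo-sources" } },
  filter: { term: { category: "sys-kernel" } }
}
new_query = filtered_to_bool(old_body)
```

Clauses under filter do not contribute to scoring, which matches the old filter semantics.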
| |
https://www.elastic.co/guide/en/elasticsearch/reference/5.1/query-dsl-filtered-query.html
"Filtered / Filter" is still valid.
| |
This should hopefully relieve the load that the repeated 'git log' calls cause.