Searching Datasets
Basic Queries
To search your datasets, begin by selecting a dataset from your Koverse workspace dashboard. This opens the dataset and displays the data within it that your ABAC attributes permit you to view.
You can choose to search either the current dataset or all datasets using the dropdown menu immediately to the right of the search field.
Type your search criteria in the search field and allow a few moments for the search to be applied. Below is an example of the view you will see after a search.
Lucene Queries
You can also pass Lucene queries through the dataset search to refine results by selecting or excluding specific fields and values. The basic syntax structures are:
field:value
field1:value1 AND field2:value2
field:value1 OR field:value2
Key points to remember:
- Search terms (with or without a field) that have no operator specified are combined with AND by default
- A search term without a specified field name is applied to all fields, even if another search term in the query string specifies a field
- An empty search is treated as "select all records from dataset"
- Grouping with parentheses () is unsupported
- Partially indexed datasets return only indexed fields from the dataset when searched
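The rules above can be sketched as a small query-string builder. This is an illustrative helper only, not part of Koverse: the function name `build_query` and its parameters are hypothetical, and it simply assembles `field:value` clauses joined by a single operator while rejecting parentheses, which the dataset search does not support.

```python
def build_query(terms, operator="AND"):
    """Join (field, value) pairs into a query string such as 'a:1 AND b:2'.

    Hypothetical helper for illustration. A field of None produces a bare
    term, which the dataset search applies to all fields.
    """
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    clauses = []
    for field, value in terms:
        clause = f"{field}:{value}" if field else str(value)
        if "(" in clause or ")" in clause:
            # Grouping with parentheses is unsupported by the dataset search.
            raise ValueError(f"parentheses are not supported: {clause!r}")
        clauses.append(clause)
    # An empty list yields an empty string, which selects all records.
    return f" {operator} ".join(clauses)


print(build_query([("country", "Qatar")]))
print(build_query([("type", "movie"), ("release_year", 2015)]))
print(build_query([(None, "Qatar"), ("rating", "R")]))
```

The resulting strings can be pasted into the search field. Writing the operator explicitly, as this helper does, gives the same result as relying on the default AND and keeps the intent visible.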
The query types above could produce results similar to those depicted below. For example, a dataset of films and shows can be queried like so:
- field:value
country:Qatar
- field1:value1 AND field2:value2...
genres:Dramas, Independent AND country:United Kingdom AND rating:R AND release_year:2015
More advanced searches can be provided as well:
field1:value1 AND field2:value2 OR field2:value3
For example:
- field1:value1 AND field2:value2 OR field2:value3
type:movie AND release_year:2015 OR release_year:2010
Searching Your Datasets and Indexing
Datasets are automatically indexed during ingest; however, you can disable the index entirely before or after ingest if you choose. Alternatively, you can modify the index settings after data is ingested to add specific fields for indexing or to selectively disable existing index fields. Indexing affects what you can find with the dataset search function, because fields that are not indexed are not searchable. As stated previously, only indexed fields are returned in the results of a search query.
For additional information on indexing datasets see Index Management.
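To make the effect of indexing concrete, here is a toy in-memory model. It is not how Koverse implements search: the `search` function, the sample records, and the exact-match comparison are all assumptions chosen for illustration. It only mirrors the two behaviors described above, namely that non-indexed fields cannot be searched and that results contain only indexed fields.

```python
def search(records, indexed_fields, field, value):
    """Toy model: match only on indexed fields, return only indexed fields."""
    if field not in indexed_fields:
        # Fields that are not indexed are not searchable.
        return []
    hits = []
    for record in records:
        # Exact string comparison is a simplification of real index matching.
        if str(record.get(field)) == str(value):
            hits.append({k: v for k, v in record.items() if k in indexed_fields})
    return hits


# Hypothetical sample data; "rating" is deliberately left out of the index.
records = [
    {"title": "Film A", "country": "Qatar", "rating": "R"},
    {"title": "Film B", "country": "United Kingdom", "rating": "PG"},
]
indexed = {"title", "country"}

print(search(records, indexed, "country", "Qatar"))  # rating is omitted from the hit
print(search(records, indexed, "rating", "R"))       # no hits: rating is not indexed
```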
As an example, the image below shows how a non-indexed dataset appears during a search: