
Welcome to BioCarian
Semantic Explorer for Biological Databases
BioCarian is a search engine for exploratory searches of semantic databases. It offers two search modes: free-text and faceted.
Users can start by entering a search term for free-text searching. Autocompletion is available for common search terms present in the database.
The results of a free-text search for the term cancer are shown below. The 300 most relevant hits are displayed. If more results are needed, enter the required number of results in the input box in the footer, or press the More option.
The following shows the query results after More has been selected.
Each search result is associated with a score and a star rating. The score is an absolute measure of the relevance of the entry to the free text entered. For example, if the search term entered was "lung cancer", an entry containing the term "lung cancer" would get a higher score than an entry containing only the term "cancer". The star rating shows the relevance of an entry compared with the other results; results with more stars are more relevant. The buttons at the top show external links containing information about an entry. An entry can be expanded or collapsed using the toggle link.
If no free text is entered, the search engine performs a faceted search. All possible facets for the databases are displayed in this mode. Note that faceted search in free-text mode is similar to what is described here, except that the domain of the facets is limited to the search results as opposed to the whole database. A faceted search without free text looks like what is shown below, where no specific result is shown and the available databases are listed.
There are three types of facets: Related, Deleted and Hidden. Related facets are the important facets in the current search results. The description of each facet and the number of entries in each are shown. Clicking on a select box will list the available facet values. When a facet is right-clicked, a context menu with several options appears. If there are too many facet values cluttering the interface, Delete will temporarily move the facet into the Deleted Facets category. The Reset All option will undo all the facet value selections.
The Hidden Facets are those facets that are typically not useful. If one of them needs to be activated, use the Generate Facet option. This will transfer the facet from Hidden Facets to Related Facets. If a facet generated from Hidden Facets is deleted, it is moved back to the Hidden Facets category.
Now we will select tissue types marked T (for tumor). There are 74 items in the samples menu. We can rank them by count, that is, by frequency. This will show each facet value suffixed by the number of times it appears in the search result.
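Ranking by count can be illustrated with a minimal Python sketch. It is not BioCarian's own code: the function names and the sample values are hypothetical, and the "value (count)" label is only one plausible way of suffixing a facet value with its frequency.

```python
from collections import Counter

def rank_by_count(facet_values):
    """Return (value, count) pairs, most frequent first; ties sorted alphabetically."""
    counts = Counter(facet_values)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

def label(value, count):
    """Suffix a facet value with the number of times it appears."""
    return f"{value} ({count})"

# Hypothetical data: one facet value per search hit.
hits = ["lung T", "breast T", "lung T", "colon T", "lung T", "breast T"]
ranked = rank_by_count(hits)
labels = [label(value, count) for value, count in ranked]
print(labels)
```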
Another way is to rank them by significance. This assigns each facet value a rank based on the probability of it appearing in the search result just by chance. A lower probability means the appearance is less likely to be due to chance, so the value is more significant. Facet values can similarly be ranked by their over- or under-representation in the results.
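The text does not say which statistical test is used to compute this probability. As an illustration only, the sketch below assumes a hypergeometric model, a common choice for enrichment-style ranking: given a database of N entries, K of which carry a facet value, it computes the probability of seeing at least k of them in a result set of n entries. All function names and numbers are hypothetical.

```python
from math import comb

def chance_probability(k, n, K, N):
    """P(X >= k) under a hypergeometric model (an assumed test, not
    necessarily the one BioCarian uses): n results drawn from N entries,
    K of which carry the facet value."""
    upper = min(n, K)
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total

def representation(k, n, K, N):
    """Observed/expected ratio: above 1 is over-represented, below 1 under-represented."""
    return (k / n) / (K / N)

# Hypothetical numbers: 1000 entries in the database, 50 in the search result.
N, n = 1000, 50
facets = {"A": (20, 100), "B": (5, 100)}  # value -> (hits in result, hits in database)
p_values = {name: chance_probability(k, n, K, N) for name, (k, K) in facets.items()}
ranked = sorted(p_values.items(), key=lambda item: item[1])  # most significant first
print(ranked)
```

Here both values occur 100 times in the database, so 5 occurrences are expected in a 50-entry result; value A appears four times as often as expected and therefore ranks as the more significant of the two.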
74 facet values are too many to process. We will use the crop option to reduce the number of choices. Crop by count retains only those facet values that have an extreme frequency. After cropping, only 18 samples are left here. They are color coded according to the distance of their frequencies from the mean frequency. A lighter yellow implies a higher frequency than normal, and a lighter green implies a lower frequency than normal.
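One way to picture cropping by count is the sketch below. The exact rule for "extreme" is not given in the text, so the sketch assumes a cutoff on the number of standard deviations from the mean frequency; the function names, the threshold and the sample counts are all hypothetical. Only the yellow/green color families are taken from the description above.

```python
from statistics import mean, pstdev

def crop_by_count(counts, threshold=1.0):
    """Keep facet values whose frequency lies at least `threshold` standard
    deviations from the mean frequency (an assumed definition of 'extreme').
    Returns a dict mapping each kept value to its signed distance (z-score)."""
    mu = mean(counts.values())
    sigma = pstdev(counts.values())
    if sigma == 0:
        return {}  # all frequencies are identical, so nothing is extreme
    kept = {}
    for value, count in counts.items():
        z = (count - mu) / sigma
        if abs(z) >= threshold:
            kept[value] = z
    return kept

def colour_family(z):
    """Yellow for higher-than-mean frequencies, green for lower ones."""
    return "yellow" if z > 0 else "green"

# Hypothetical frequencies of six facet values.
counts = {"a": 45, "b": 24, "c": 25, "d": 26, "e": 25, "f": 5}
kept = crop_by_count(counts, threshold=1.0)
colours = {value: colour_family(z) for value, z in kept.items()}
print(colours)
```

In this sketch, lowering the threshold keeps more values and raising it keeps fewer, which is one way a More/Less style adjustment could be modelled.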
If more or fewer values need to be shown after a crop, the corresponding options can be selected from the context menu. The figures below show the More/Less functionality.
Values can also be cropped by significance, as shown below. In this case only those facet values that are most significant to the search result are shown; in this particular case the result is a reduction from 80 genes to 33 important genes. Similarly, the values can be cropped by over- and under-representation.
The facet values can be sorted alphabetically or numerically (when possible). The middle picture below shows the facet values sorted numerically and the last picture the same facet values sorted alphabetically.
The Settings menu has several options. The SPARQL endpoint text box sets the SPARQL endpoint being used. The Search Limit is the number of hits to return in a free-text search. The Relevance compared to the top hit option also applies to the hits returned in a free-text search: if a hit has a score below this percentage of the highest-scoring hit, it is regarded as not relevant to the search. The Number of results to display is the number of hits shown per database.
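The interaction between the relevance percentage and the search limit can be sketched as follows. This is an illustrative sketch, not BioCarian's implementation: the function name, the parameter names, the order in which the two settings are applied and the sample scores are assumptions.

```python
def filter_hits(hits, relevance_percent, search_limit):
    """hits is a list of (entry_id, score) pairs.
    Drop hits scoring below `relevance_percent` percent of the top score,
    then return at most `search_limit` hits, best first."""
    if not hits:
        return []
    ordered = sorted(hits, key=lambda hit: hit[1], reverse=True)
    cutoff = ordered[0][1] * relevance_percent / 100.0
    relevant = [hit for hit in ordered if hit[1] >= cutoff]
    return relevant[:search_limit]

# Hypothetical scores: with a 50% setting the cutoff is half of the top score.
hits = [("e3", 3.9), ("e1", 8.0), ("e4", 1.0), ("e2", 6.0)]
kept = filter_hits(hits, relevance_percent=50.0, search_limit=300)
print(kept)
```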
The SPARQL editor accepts custom SPARQL queries and generates facets on the result. A general template for a query can be auto-generated by pressing the template button and then edited. The only requirement is that the query returns all the subjects under the variable ?subject. A select box displays all the predicate names available in the database; selecting and clicking one inserts it into the editor.
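A minimal sketch of a query that satisfies the ?subject requirement is shown below, written as a Python helper that builds the query text. The template BioCarian actually generates is not reproduced here; the helper names, the predicate IRI, the literal and the endpoint URL are all hypothetical placeholders. The request URL follows the standard SPARQL protocol convention of passing the query text in a `query` parameter.

```python
from urllib.parse import urlencode

def build_subject_query(predicate_iri, object_literal, limit=100):
    """Build a SELECT query that binds every matching subject to ?subject."""
    escaped = object_literal.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT DISTINCT ?subject\n"
        "WHERE {\n"
        f'  ?subject <{predicate_iri}> "{escaped}" .\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )

def build_request_url(endpoint, query):
    """Encode the query as a GET request URL for a SPARQL endpoint."""
    return endpoint + "?" + urlencode({"query": query})

# Hypothetical predicate, literal and endpoint, for illustration only.
query = build_subject_query("http://example.org/vocab/tissueType", "lung", limit=50)
url = build_request_url("http://example.org/sparql", query)
print(query)
```

Whatever the shape of the WHERE clause, the variable name ?subject must be kept, since the facets are generated from the values bound to it.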