Thursday, April 23, 2020

SOLR Search - CookBook

SOLR 

Solr is the popular, blazing-fast, open source enterprise search platform built on Apache Lucene.

This blog post has a curated list of Solr packages and resources. It starts with how to install Solr and then shows some basic implementation and usage.

Installing Solr


On my Mac I usually install Solr with Homebrew. First update Homebrew:
brew update
    Updated Homebrew from 37714b5ce to 373a454ac.
Then install Solr:
brew install solr

However, this time I am going to show the step-by-step installation on macOS, as explained in the Apache Solr Reference Guide.


Starting Solr

Once extracted, you are ready to run Solr:
bin/solr start

    *** [WARN] *** Your open file limit is currently 2560.
     It should be set to 65000 to avoid operational disruption.
     If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh
    *** [WARN] ***  Your Max Processes Limit is currently 5568.
     It should be set to 65000 to avoid operational disruption.
     If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh
    Waiting up to 180 seconds to see Solr running on port 8983 [-]
    Started Solr server on port 8983 (pid=71357). Happy searching!

This will start Solr in the background, listening on port 8983.

Check if Solr is Running

bin/solr status

    Found 1 Solr nodes:

    Solr process 71357 running on port 8983
    {
      "solr_home":"/Users/***********/SOLR/solr-8.6.2/server/solr",
      "version":"8.6.2 016993b65e393b58246d54e8ddda9f56a453eb0e - ivera - 2020-08-26 11:00:26",
      "startTime":"2020-09-11T17:54:18.187Z",
      "uptime":"0 days, 0 hours, 0 minutes, 31 seconds",
      "memory":"322.7 MB (%63) of 512 MB"}

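If you want to use the status output in a script, the JSON part can be pulled out of the text. The following is a minimal sketch, assuming a single running node (so the output contains exactly one JSON object); the helper name parse_solr_status is hypothetical, and the sample text is copied from the output above.

```python
import json


def parse_solr_status(output: str) -> dict:
    """Extract the JSON object embedded in the text printed by `bin/solr status`.

    Assumption: only one Solr node is reported, so there is a single
    {...} block between the first "{" and the last "}".
    """
    start = output.find("{")
    end = output.rfind("}")
    if start == -1 or end == -1 or end < start:
        raise ValueError("no JSON object found in status output")
    return json.loads(output[start:end + 1])


# Sample text copied from the status output shown in this post.
sample = """Found 1 Solr nodes:

Solr process 71357 running on port 8983
{
  "solr_home":"/Users/***********/SOLR/solr-8.6.2/server/solr",
  "version":"8.6.2 016993b65e393b58246d54e8ddda9f56a453eb0e - ivera - 2020-08-26 11:00:26",
  "startTime":"2020-09-11T17:54:18.187Z",
  "uptime":"0 days, 0 hours, 0 minutes, 31 seconds",
  "memory":"322.7 MB (%63) of 512 MB"}
"""

status = parse_solr_status(sample)
print(status["version"].split()[0])  # 8.6.2
print(status["uptime"])
```

In practice the text would come from running the command (for example with Python's subprocess module) rather than from a hard-coded string.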
Interfaces

Use a Web browser to see the Admin Console.
http://localhost:8983/solr/
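The same base URL also serves Solr's HTTP APIs, so a script can check that the server answers without opening a browser. This is only a sketch: it assumes the default port 8983 used in this post and the system info endpoint (admin/info/system); the function names are hypothetical, and fetch_system_info needs a running Solr, so it is defined but not called here.

```python
import json
from urllib.request import urlopen

# Base URL of the Admin Console as used in this post.
SOLR_BASE = "http://localhost:8983/solr/"


def system_info_url(base_url: str = SOLR_BASE) -> str:
    """Build the URL of the system info endpoint under the given base URL."""
    return base_url.rstrip("/") + "/admin/info/system?wt=json"


def fetch_system_info(base_url: str = SOLR_BASE, timeout: float = 5.0) -> dict:
    """Fetch and decode the system info JSON (requires a running Solr)."""
    with urlopen(system_info_url(base_url), timeout=timeout) as resp:
        return json.load(resp)


def spec_version(info: dict) -> str:
    """Read the Solr version from a decoded system info response."""
    return info["lucene"]["solr-spec-version"]


print(system_info_url())
```

With Solr running, spec_version(fetch_system_info()) should report the same version as bin/solr status.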

Other Interfaces:


  • Appleseed Search Web User Interfaces - AngularJS 1 search interfaces for Solr and Elastic.
  • Blacklight - A multi-institutional open-source collaboration building a better discovery platform framework.
  • Solr PHP UI - Solr client and user interface (UI) for search.
  • AJAX Solr - A JavaScript library for creating user interfaces to Apache Solr.
  • Spyglass - Simple search results with Solr and EmberJS.
  • Splainer - AngularJS Solr and Elasticsearch diagnostic search services.
  • Solrstrap - A query-result interface for Solr.
  • ngSolr - Easy faceted search for Apache Solr.
  • SOLR-AJAX - Single-page faceted search interface to Apache Solr/Lucene.
  • Solstice - A simple Solr wrapper for AngularJS apps.
  • SolrDora - A quick and easy way to explore the data in your Solr core.

Create a Core

bin/solr create -c party

This will create a core that uses a data-driven schema, which tries to guess the correct field type when you add documents to the index.

To see all available options for creating a new core, execute:

bin/solr create -help

Once the core is created, you can see it in the Solr Admin Console.
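Besides the Admin Console, the new core can also be checked over HTTP through Solr's CoreAdmin API (action=STATUS). The sketch below only builds the URL and reads core names from an already decoded response; the function names are hypothetical, and sample_response is a hand-written, heavily trimmed illustration of the response shape, not real output.

```python
from urllib.parse import urlencode


def core_status_url(base_url: str = "http://localhost:8983/solr", core: str = "") -> str:
    """Build a CoreAdmin STATUS URL, optionally restricted to one core."""
    params = {"action": "STATUS", "wt": "json"}
    if core:
        params["core"] = core
    return f"{base_url.rstrip('/')}/admin/cores?{urlencode(params)}"


def core_names(status_response: dict) -> list:
    """Return the sorted core names found under the "status" key."""
    return sorted(status_response.get("status", {}))


# Illustrative, trimmed response for the "party" core created above.
sample_response = {
    "responseHeader": {"status": 0},
    "status": {"party": {"name": "party"}},
}

print(core_status_url(core="party"))
print(core_names(sample_response))  # ['party']
```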


Stopping Solr

bin/solr stop

    Sending stop command to Solr running on port 8983 ... waiting up to 180 seconds to allow Jetty process 71357 to stop gracefully.




Indexing Exercise

I followed the indexing exercise "Indexing Techproducts Example Data":

This exercise will walk you through how to start Solr as a two-node cluster (both nodes on the same machine) and create a collection during startup. Then you will index some sample data that ships with Solr and do some basic searches.

Launch Solr in SolrCloud Mode

To launch Solr, run: bin/solr start -e cloud on Unix or macOS
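Once a collection exists, documents can also be indexed over HTTP by posting JSON to the collection's update handler. This is not the method the exercise itself uses; it is a minimal sketch under assumptions: the collection is named techproducts, Solr listens on the default port, and the document fields are made up for illustration. The request is only built here, not sent.

```python
import json
from urllib.request import Request


def build_update_request(docs, collection="techproducts",
                         base_url="http://localhost:8983/solr"):
    """Build a POST request that sends a JSON array of documents to /update.

    commit=true asks Solr to commit right away so the documents become searchable.
    """
    url = f"{base_url.rstrip('/')}/{collection}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return Request(url, data=body,
                   headers={"Content-Type": "application/json"},
                   method="POST")


# Hypothetical document; field names are for illustration only.
docs = [{"id": "demo-1", "name": "Example document"}]
req = build_update_request(docs)
print(req.get_method(), req.full_url)
```

To actually send it against a running Solr, pass req to urllib.request.urlopen and read the JSON response.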


Searching
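A basic search is an HTTP GET against the collection's select handler. The sketch below only builds such a URL; the collection name techproducts, the default port, the helper name select_url, and the query term "foundation" are assumptions chosen for illustration.

```python
from urllib.parse import urlencode


def select_url(query, collection="techproducts",
               base_url="http://localhost:8983/solr", rows=10, fields=None):
    """Build a /select URL for a query string, with optional field list (fl)."""
    params = {"q": query, "rows": rows}
    if fields:
        params["fl"] = ",".join(fields)
    return f"{base_url.rstrip('/')}/{collection}/select?{urlencode(params)}"


print(select_url("foundation"))
print(select_url("*:*", rows=5, fields=["id", "name"]))
```

urlencode takes care of escaping characters such as * and : in the match-all query, so the URL can be pasted into a browser or passed to urllib.request.urlopen while Solr is running.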

Tools

  • Solr proxies - Simple Solr proxies implemented in PHP, Node.js, Java, or NGINX.

Projects

  • Transformalize - This tool expedites mundane data processing tasks like cleaning, reporting, and denormalization. Specifically, it can quickly process data from SQL/MySQL/PostgreSQL to Solr/Elasticsearch.
  • JesterJ - A new highly flexible, highly scalable document ingestion system.
  • Spark-Solr - Tools for reading data from Solr as a Spark RDD and indexing objects from Spark into Solr using SolrJ.
  • Apache Flume - A distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data.
  • Storm Solr - Tools for building Storm topologies for indexing data into SolrCloud.
  • SolrMQ - A plugin for Solr that allows you to send updates to Solr using an AMQP messaging queue. It uses the RabbitMQ library.

Clients

  • solrs - An async, non-blocking Solr client for Java/Scala, providing a query interface like SolrJ.
  • Solr Play Scala Client - A Scala library in the Play framework for indexing and searching documents within Apache Solr.
  • Python:SolrClient - A simple Python library for Solr, built in Python 3 with support for the latest features of Solr.
  • mysolr - mysolr was born to be a fast and easy-to-use client for Apache Solr's API, because existing Python clients didn't fulfill these conditions.
  • rsolr - A Ruby client for Solr.
  • Sunspot - Solr-powered search for Ruby objects.
  • Solarium - A Solr client library for PHP.
  • Solr PHP extension - The Solr extension allows you to communicate effectively with the Apache Solr server in PHP.
  • Go-Solr - A Solr library written in Go.
  • go-solr - Solr client in Go: core admin, add docs, update, delete, search, and more.
  • Gora - A simple Solr client for Go.
  • Solrclj - A Clojure client for Apache Solr.
  • flux - A Clojure-based Solr client.
  • solr-node-client - A Solr client for Node.js for indexing, adding, deleting, committing, and searching documents within an Apache Solr installation.
--
Captain Nemo
New York
