SOLR client APIs

Cesar Capillas

Using the Python API for SOLR

For SOLR, you can find client APIs for your favourite programming language, such as Java, Python, Ruby, Perl or JavaScript. Basically, client apps reach Solr by creating HTTP requests and parsing the corresponding HTTP responses; the client APIs encapsulate much of the work of sending requests and parsing responses, which makes it easier to write client applications. If you are a Java programmer you will be comfortable with SolrJ, while you can use the pysolr or rsolr libraries for Python and Ruby. Rsolr was used previously in the output connector for sending log data from Logstash to SOLR, in a post about monitoring metrics with SOLR, Banana and Apache Zeppelin. By the way, the wt parameter in the query lets you choose an appropriate output format for the chosen API, for example JSON, XML, binary (used by SolrJ), Python, PHP, Ruby, XSLT or CSV.
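
As a rough illustration of the wt parameter, the sketch below builds the same select query for several response writers using only the Python 3 standard library. The host, port and collection are assumptions for a local techproducts example, and build_select_url is a hypothetical helper, not part of any Solr client.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, query, wt="json", **extra):
    """Return a /select URL for the collection with URL-encoded parameters."""
    params = {"q": query, "wt": wt}
    params.update(extra)
    return "{0}/solr/{1}/select?{2}".format(base_url, collection, urlencode(params))


# Assumed local instance; only the wt value differs between the requests.
for writer in ("json", "xml", "csv"):
    print(build_select_url("http://localhost:8983", "techproducts", "ipod",
                           wt=writer, rows=10))
```

Using urlencode also takes care of escaping spaces and commas in the query values, which plain string concatenation does not.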

The following script illustrates how to send a query to SOLR in Python using HTTP requests (urllib) and parsing the JSON response (simplejson), without using a client API. You need to pip install simplejson if you don't have it installed; urllib is part of the standard library.

#! /usr/bin/python
import urllib
import simplejson
import pprint
import sys

host       = "localhost"
port       = "8983"
collection = "techproducts"
qt         = "select"
url        = 'http://' + host + ':' + port + '/solr/' + collection + '/' + qt + '?'

q          = "q=ipod"
fl         = "fl=id,name"
fq         = "fq="
rows       = "rows=10"
wt         = "wt=json"
#wt        = "wt=python"
params     = [ q, fl, fq, wt, rows ] 
p          = "&".join(params)

connection = urllib.urlopen(url+p)

if wt == "wt=json":
  response   = simplejson.load(connection) 
else:
  response   = eval(connection.read())

print "Number of hits: " + str(response['response']['numFound'])
pprint.pprint(response['response']['docs'])

Note that we can use wt=python or wt=json, but with wt=python we then need to call eval on the response, which is potentially insecure because it executes whatever text the server returns, while JSON is a more robust response format. This results in:

$./do-search.py

Number of hits: 3
[{'id': 'IW-02', 
  'name': ['iPod & iPod Mini USB 2.0 Cable']},
 {'id': 'F8V7067-APL-KIT',
  'name': ['Belkin Mobile Power Cord for iPod w/ Dock']},
 {'id': 'MA147LL/A', 
  'name': ['Apple 60 GB iPod with Video Playback Black']}]
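
If wt=python is really needed, one way to avoid eval is ast.literal_eval, which accepts only Python literals and rejects arbitrary expressions. This is a minimal Python 3 sketch, assuming the response body has already been read into a string. The helper parse_solr_response and the sample payloads are hypothetical and only mimic the shape of the response shown above.

```python
import ast
import json


def parse_solr_response(body, wt="json"):
    """Parse a Solr response body without executing it as code."""
    if wt == "json":
        return json.loads(body)
    if wt == "python":
        # literal_eval accepts only literals (dicts, lists, strings, numbers...)
        return ast.literal_eval(body)
    raise ValueError("unsupported wt: {0}".format(wt))


# Hand-written sample payloads, not real server output.
json_body = '{"response": {"numFound": 1, "docs": [{"id": "IW-02"}]}}'
python_body = "{'response': {'numFound': 1, 'docs': [{'id': 'IW-02'}]}}"

for fmt, body in (("json", json_body), ("python", python_body)):
    parsed = parse_solr_response(body, fmt)
    print(fmt, parsed["response"]["numFound"], parsed["response"]["docs"])
```

A body containing a function call or any other expression raises ValueError instead of being executed.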

Alternatively, we can use the pysolr client API, a lightweight Python wrapper for SOLR.

#! /usr/bin/python
import pprint
import pysolr
import sys

host       = "localhost"
port       = "8983"
collection = "techproducts"
q          = "ipod"
fl         = "id,name"
qt         = "select"
fq         = ""
rows       = "10"
url        = 'http://' + host + ':' + port + '/solr/' + collection 

solr       = pysolr.Solr(url, search_handler="/"+qt, timeout=5)
results    = solr.search(q, **{
    'fl': fl,
    'fq': fq,
    'rows': rows
})

# results.hits is the total number of matches; len(results) only counts the documents returned
print("Number of hits: {0}".format(results.hits))
for i in results:
  pprint.pprint(i)

This results in:

$./do-search-pysolr.py

Number of hits: 3
{u'id': u'IW-02', 
 u'name': [u'iPod & iPod Mini USB 2.0 Cable']}
{u'id': u'F8V7067-APL-KIT',
 u'name': [u'Belkin Mobile Power Cord for iPod w/ Dock']}
{u'id': u'MA147LL/A', 
 u'name': [u'Apple 60 GB iPod with Video Playback Black']}

This second form is more appropriate if you have a cluster managed with Apache ZooKeeper, or if you need other methods such as add, delete or update.

# For SolrCloud mode
zookeeper = pysolr.ZooKeeper("zkhost1:2181,zkhost2:2181,zkhost3:2181")
solr      = pysolr.SolrCloud(zookeeper, collection)

This was tested on SOLR 6.6 in SolrCloud mode.
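
The add, delete and update methods mentioned above can be sketched as follows. The function takes an already-built client so that it does not depend on a running server. The add and delete calls are pysolr.Solr methods, while reindex_demo and the document values are hypothetical. It assumes a schema where id is the unique key and name is a regular field, as in techproducts.

```python
def reindex_demo(solr):
    """Add a document, update it by re-adding the same id, then delete it."""
    doc = {"id": "DEMO-01", "name": "Demo iPod cable"}

    # Index the document; commit=True asks Solr to commit right away.
    solr.add([doc], commit=True)

    # Assuming id is the unique key, re-adding the same id replaces the document.
    doc["name"] = "Demo iPod cable (updated)"
    solr.add([doc], commit=True)

    # Remove the document again by id.
    solr.delete(id=doc["id"], commit=True)
```

Against a real server this could be called as reindex_demo(pysolr.Solr(url, timeout=5)), where url is the collection URL built in the pysolr script above.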
