SOLR client APIs

Cesar Capillas

Using the Python API for SOLR

SOLR offers client APIs for your favourite programming language, such as Java, Python, Ruby, Perl or JavaScript. Client apps reach Solr by creating HTTP requests and parsing the corresponding HTTP responses; the client APIs encapsulate much of that work, which makes client applications easier to write. If you are a Java programmer you will be comfortable with SolrJ, while the pysolr and rsolr libs cover Python and Ruby. Rsolr was used previously in the output connector that sends log data from Logstash to SOLR, in a post about monitoring metrics with SOLR, Banana and Apache Zeppelin. By the way, the wt parameter in the query lets you choose an appropriate output format for the chosen API, for example JSON, XML, binary (used by SolrJ), Python, PHP, Ruby, XSLT or CSV.
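To make the role of wt concrete, here is a minimal sketch that builds the same query URL for several response formats. The helper name build_query_url, the BASE_URL constant and the sample query are assumptions made for illustration; the host, port and collection are the local techproducts example. It only builds and prints URLs, it does not contact Solr.

```python
from urllib.parse import urlencode

# Assumed local Solr instance and collection (placeholders for illustration).
BASE_URL = "http://localhost:8983/solr/techproducts/select"

def build_query_url(query, wt="json", **extra):
    """Return a select URL for `query` in the response format `wt`."""
    params = {"q": query, "wt": wt}
    # Skip empty optional parameters rather than sending e.g. "fq=".
    params.update({k: v for k, v in extra.items() if v not in ("", None)})
    # urlencode escapes spaces, quotes, '&' and other reserved characters.
    return BASE_URL + "?" + urlencode(params)

# Same query, four different response formats.
for fmt in ("json", "xml", "python", "csv"):
    print(build_query_url('name:"iPod & iPod Mini"', wt=fmt, fl="id,name", fq=""))
```

Escaping matters here because a raw & inside the query text would otherwise be read as a parameter separator.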

The following script illustrates how to send a query to SOLR in Python using HTTP requests (urllib) and parse the JSON response (simplejson), without using a client API. urllib ships with Python; install simplejson with pip if you don't have it.

#!/usr/bin/env python3
# Query Solr over plain HTTP and parse the response, without a client library.
import pprint
import urllib.request

import simplejson

host       = "localhost"
port       = "8983"
collection = "techproducts"
qt         = "select"
url        = 'http://' + host + ':' + port + '/solr/' + collection + '/' + qt + '?'

# Each request parameter is written as a ready-made key=value pair.
q          = "q=ipod"
fl         = "fl=id,name"
fq         = "fq="
rows       = "rows=10"
wt         = "wt=json"
#wt        = "wt=python"
params     = [ q, fl, fq, wt, rows ]
p          = "&".join(params)

connection = urllib.request.urlopen(url+p)

if wt == "wt=json":
  response   = simplejson.load(connection)
else:
  # wt=python returns a Python literal; eval() executes whatever the server sends.
  response   = eval(connection.read())

print("Number of hits: " + str(response['response']['numFound']))
pprint.pprint(response['response']['docs'])

Note that we can use either wt=python or wt=json, but with wt=python the response has to go through an eval call, which is potentially insecure because it executes whatever the server returns, while JSON is a more robust response format. Running the script results in:

$./do-search.py

Number of hits: 3
[{'id': 'IW-02', 
  'name': ['iPod & iPod Mini USB 2.0 Cable']},
 {'id': 'F8V7067-APL-KIT',
  'name': ['Belkin Mobile Power Cord for iPod w/ Dock']},
 {'id': 'MA147LL/A', 
  'name': ['Apple 60 GB iPod with Video Playback Black']}]
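If the wt=python format is needed, the eval call in the script can be swapped for ast.literal_eval from the standard library, which accepts only Python literals and refuses anything executable. The sketch below assumes that swap; the helper name parse_python_response is made up for illustration, and the sample body is a hand-written, abbreviated stand-in for a Solr response, not captured output.

```python
import ast

# Hypothetical, abbreviated wt=python response body (hand-written for illustration).
raw = b"""{
  'responseHeader': {'status': 0, 'QTime': 1},
  'response': {'numFound': 1, 'start': 0, 'docs': [
    {'id': 'IW-02', 'name': ['iPod & iPod Mini USB 2.0 Cable']}]}
}"""

def parse_python_response(body):
    """Parse a wt=python body without executing arbitrary code."""
    if isinstance(body, bytes):
        body = body.decode("utf-8")
    # literal_eval accepts only literals (dicts, lists, strings, numbers,
    # True/False/None) and raises ValueError or SyntaxError on anything else.
    return ast.literal_eval(body)

response = parse_python_response(raw)
print("Number of hits: " + str(response['response']['numFound']))
```

In the script, this would replace eval(connection.read()) with parse_python_response(connection.read()).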

Alternatively, we can use the pysolr client API, a lightweight Python wrapper for SOLR.

#!/usr/bin/env python3
# Same query as before, this time through the pysolr client library.
import pprint

import pysolr

host       = "localhost"
port       = "8983"
collection = "techproducts"
q          = "ipod"
fl         = "id,name"
qt         = "select"
fq         = ""
rows       = "10"
url        = 'http://' + host + ':' + port + '/solr/' + collection

solr       = pysolr.Solr(url, search_handler="/"+qt, timeout=5)
results    = solr.search(q, **{
    'fl': fl,
    'fq': fq,
    'rows': rows
})

# results.hits is the total match count (numFound); len(results) would only
# count the documents returned on this page.
print("Number of hits: {0}".format(results.hits))
for i in results:
  pprint.pprint(i)

$./do-search-pysolr.py

Number of hits: 3
{'id': 'IW-02',
 'name': ['iPod & iPod Mini USB 2.0 Cable']}
{'id': 'F8V7067-APL-KIT',
 'name': ['Belkin Mobile Power Cord for iPod w/ Dock']}
{'id': 'MA147LL/A',
 'name': ['Apple 60 GB iPod with Video Playback Black']}

This second form is more appropriate if you have a cluster with Apache ZooKeeper, or if you need other methods such as add, delete or update.
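As a rough sketch of those other methods, the helper below indexes some documents and deletes others through a pysolr-style client. The function name refresh_documents and the sample document are hypothetical; the only calls assumed on the client are pysolr's add, delete(id=...) and commit. It has not been run against the setup described in this post.

```python
def refresh_documents(solr, new_docs, stale_ids):
    """Index new_docs and delete stale_ids through a pysolr.Solr-like client."""
    solr.add(new_docs)            # send documents as a list of dicts
    for doc_id in stale_ids:
        solr.delete(id=doc_id)    # delete by unique key
    solr.commit()                 # make the changes visible to searches

# Hypothetical example document; field names follow the search output above.
docs = [{"id": "DEMO-01", "name": "Demo product"}]
```

With the solr object created in the script above, this would be called as refresh_documents(solr, docs, ["DEMO-00"]). The explicit commit is there because whether add and delete commit on their own depends on the pysolr version and on how the client was configured.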

# For SolrCloud mode
zookeeper = pysolr.ZooKeeper("zkhost1:2181,zkhost2:2181,zkhost3:2181")
solr      = pysolr.SolrCloud(zookeeper, collection)

This was tested on SOLR 6.6 in SolrCloud mode.

Links:

https://lucene.apache.org/solr/guide/6_6/client-api-lineup.html
https://lucene.apache.org/solr/guide/6_6/using-python.html#UsingPython-SimplePython
https://pypi.python.org/pypi/pysolr/3.6.0
https://www.zylk.net/es/web-2-0/blog/-/blogs/more-on-monitoring-dashboards-for-alfresco-using-solr-banana-and-apache-zeppelin
