The Solr distribution comes with a couple of sample applications. I will focus on two of them: one is found in the example directory (solr-4.2.0/example) and the other in the example-DIH directory (solr-4.2.0/example/example-DIH). The Data Import Handler (DIH) is used to index database contents.
Deploying both the Solr examples with Jetty (source)
Deploying a custom DIH application (backed by PostgreSQL) with Tomcat 7
Zookeeper
On Solr Caches :
Deploying both the Solr examples with Jetty (source)
- Download and untar Solr :
- cd /opt
- wget http://apache.mirror.vexxhost.com/lucene/solr/4.2.0/solr-4.2.0.tgz
- tar -xvf solr-4.2.0.tgz
- Change security groups :
- open port 8983 on your EC2 security group
- you can access this from your online EC2 console
- Start the Solr sample example :
- cd /opt/solr-4.2.0/example
- java -jar start.jar
- If you want to run the DIH Solr example instead :
- java -Dsolr.solr.home="/opt/solr-4.2.0/example/example-DIH/solr/" -jar start.jar
- You should be able to see the Solr manager at :
- foo.com:8983/solr/admin/ or (your-ec2-ip):8983/solr/admin/
- Note that your Solr home is /opt/solr-4.2.0/example/solr/ for the example and /opt/solr-4.2.0/example/example-DIH/solr/ for the DIH example.
Deploying both the Solr examples with Tomcat 7
- Set up Tomcat 7 on your EC2 instance :
- Check out my blog post on this (here)
- Change server.xml :
- sudo vim /opt/apache-tomcat-7.0.34/conf/server.xml
- add the following
<Connector port="8983" protocol="HTTP/1.1" connectionTimeout="20000" redirectPort="8443" URIEncoding="UTF-8" />
- Create solr.xml :
- sudo vim /opt/apache-tomcat-7.0.34/conf/Catalina/localhost/solr.xml
- add the following
<?xml version="1.0" encoding="utf-8"?>
<Context path="/solr" docBase="/opt/apache-tomcat-7.0.34/webapps/solr.war" debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String" value="/opt/solr-4.2.0/example/solr" override="true"/>
</Context>
- Next, deploy Solr itself: copy /opt/solr-4.2.0/dist/solr-4.2.0.war, which contains the files and libraries needed to run Solr, to the Tomcat webapps directory and rename it to solr.war.
- Start the Solr sample example :
- sudo service tomcat restart
- If you want to run the Data Import Handler (DIH) Solr example instead :
- in the solr.xml from the "Create solr.xml" step (step 3), replace
<Environment name="solr/home" type="java.lang.String" value="/opt/solr-4.2.0/example/solr" override="true"/>
- with
<Environment name="solr/home" type="java.lang.String" value="/opt/solr-4.2.0/example/example-DIH/solr" override="true"/>
- You should be able to see the Solr manager at :
- foo.com:8983/solr/admin/ or (your-ec2-ip):8983/solr/admin/
- Note that your Solr home is /opt/solr-4.2.0/example/solr/ for the example and /opt/solr-4.2.0/example/example-DIH/solr/ for the DIH example.
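The Tomcat steps above can be sketched as a small script. This is only a sketch: write_solr_context is a hypothetical helper (it is not part of Solr or Tomcat), and the demo writes to a scratch directory rather than to the real Tomcat conf directory.

```shell
#!/bin/sh
# Sketch: generate the Tomcat context file (solr.xml) for a chosen Solr home.
# write_solr_context is a hypothetical helper, not part of Solr or Tomcat.
# Usage: write_solr_context <solr_home> <war_path> <out_file>
write_solr_context() {
    solr_home="$1"   # e.g. /opt/solr-4.2.0/example/solr
    war_path="$2"    # e.g. /opt/apache-tomcat-7.0.34/webapps/solr.war
    out_file="$3"    # e.g. .../conf/Catalina/localhost/solr.xml
    cat > "$out_file" <<EOF
<?xml version="1.0" encoding="utf-8"?>
<Context path="/solr" docBase="$war_path" debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String" value="$solr_home" override="true"/>
</Context>
EOF
}

# Demo against a scratch directory so nothing under /opt is touched.
demo_dir="$(mktemp -d)"
write_solr_context /opt/solr-4.2.0/example/example-DIH/solr \
    /opt/apache-tomcat-7.0.34/webapps/solr.war \
    "$demo_dir/solr.xml"
cat "$demo_dir/solr.xml"

# On the real machine the remaining steps would be (paths taken from the post):
#   sudo cp /opt/solr-4.2.0/dist/solr-4.2.0.war /opt/apache-tomcat-7.0.34/webapps/solr.war
#   sudo service tomcat restart
```

Switching between the plain example and the DIH example then only means calling the helper with a different first argument and restarting Tomcat.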
Deploying a custom DIH application (backed by PostgreSQL) with Tomcat 7
A bit of background - I'm going to assume a simple database table whose contents we would like to index. My Solr home here is /opt/solr-4.2.0/example/solr/.
- Let's assume a simple table (if you want to set up PostgreSQL on your EC2 instance, refer to my blog post on this here) :
create table users( uid serial primary key, firstname varchar(255) not null );
- Start off by copying the db directory which the Solr DIH example uses to the Solr example directory (I would rather run a Solr cluster on the example, as opposed to example-DIH) :
- cd /opt/solr-4.2.0/example/solr
- cp -r /opt/solr-4.2.0/example/example-DIH/solr/db .
- Modify solr.xml :
- sudo vim /opt/solr-4.2.0/example/solr/solr.xml
- Add <core name="db" instanceDir="db" />
- Modify solrconfig.xml :
- sudo vim /opt/solr-4.2.0/example/solr/db/conf/solrconfig.xml
- change <lib dir="../../../../dist/" regex="solr-dataimporthandler-.*\.jar" /> to <lib dir="../../../dist/" regex="solr-dataimporthandler-.*\.jar" />
- Modify db-data-config.xml :
- sudo vim /opt/solr-4.2.0/example/solr/db/conf/db-data-config.xml
- remove everything and add your own DB configuration e.g.,
<dataConfig>
  <dataSource driver="org.postgresql.Driver" url="jdbc:postgresql://foo.com:5432/fooDB" user="power_user" password="pass" />
  <document>
    <entity name="user" query="SELECT uid,firstname from users">
      <field column="uid" name="id" />
      <field column="firstname" name="name" />
    </entity>
  </document>
</dataConfig>
- The /opt/solr-4.2.0/example/solr/db/conf/schema.xml file doesn't need to be changed for my example, but you will likely have to change it if you're going to index other stuff in your DB.
- You will also need to include your JDBC driver :
- cd /opt/solr-4.2.0/example/solr/db/lib
- copy your postgresql-9.1-902.jdbc4.jar into that folder
- Restart Tomcat :
- sudo service tomcat restart
- When you begin indexing things, you might encounter permission problems creating the db/data/index folder. In this case, I just change all the permissions of the db folder and its files :
- sudo chmod -R 777 /opt/solr-4.2.0/example/solr/db
- this might not be a great idea security-wise
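Once Tomcat is back up, indexing is started by sending a command to the Data Import Handler. The sketch below only builds the request URLs. It assumes the db core keeps the /dataimport handler name from the example solrconfig.xml and that Solr listens on localhost:8983; dih_url is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: build Data Import Handler request URLs for a core.
# dih_url is a hypothetical helper; the /dataimport path is assumed to be
# the handler name kept from the example-DIH solrconfig.xml.
# Usage: dih_url <host:port> <core> <command>
dih_url() {
    host="$1"; core="$2"; command="$3"
    printf 'http://%s/solr/%s/dataimport?command=%s\n' "$host" "$core" "$command"
}

# Print the URLs for a full import and for checking its progress.
dih_url localhost:8983 db full-import
dih_url localhost:8983 db status

# To actually send a request, pass the URL to curl, for example:
#   curl "$(dih_url localhost:8983 db full-import)"
```

Polling the status URL after a full-import is a simple way to see whether the import finished or ran into the permission problem mentioned above.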
Zookeeper
To run SolrCloud (the distributed Solr installation), you need to have Apache ZooKeeper installed. ZooKeeper is a centralized service for maintaining configuration information, naming, and providing distributed synchronization. SolrCloud uses ZooKeeper to synchronize configuration and cluster state (such as elected shard leaders), which is why it is crucial to have a highly available, fault-tolerant ZooKeeper installation. If you have a single ZooKeeper instance and it fails, your SolrCloud cluster will crash too.
- Download and untar ZooKeeper :
- cd /opt
- wget http://apache.mirror.vexxhost.com/zookeeper/zookeeper-3.4.5/zookeeper-3.4.5.tar.gz
- sudo tar -xvf zookeeper-3.4.5.tar.gz
- Create and modify your zoo.cfg file :
- cd /opt/zookeeper-3.4.5/conf/
- sudo cp zoo_sample.cfg zoo.cfg
- sudo vim /opt/zookeeper-3.4.5/conf/zoo.cfg
- Add/change the following
dataDir=/opt/zookeeper-3.4.5/data
server.1=<ec2-instance-ip>:2888:3888 (e.g., server.1=192.168.1.1:2888:3888)
server.2=<ec2-instance-ip>:2888:3888 (e.g., server.2=192.168.1.2:2888:3888)
server.3=<ec2-instance-ip>:2888:3888 (e.g., server.3=192.168.1.3:2888:3888)
- In each server.N line, the first part is the IP address of that ZooKeeper server, and the second and third parts are the ports the ZooKeeper instances use to communicate with each other (2888 for connections to the leader, 3888 for leader election).
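- Open port 2181 in your EC2 security group from the EC2 console (the ZooKeeper instances also need to reach each other on ports 2888 and 3888).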
- Start Zookeeper :
- sudo /opt/zookeeper-3.4.5/bin/zkServer.sh start
- If successful, the following message will be displayed :
JMX enabled by default
Using config: /opt/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
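The zoo.cfg changes above can be generated per node with a short script. This is a sketch under a few assumptions: write_zk_ensemble is a hypothetical helper, the tickTime, initLimit, syncLimit and clientPort values are the ones shipped in zoo_sample.cfg, and the demo writes to a scratch directory. It also writes the myid file, which the steps above do not mention: when server.N lines are present, each node reads its own id from a file named myid inside dataDir.

```shell
#!/bin/sh
# Sketch: write zoo.cfg and the per-node myid file for a ZooKeeper ensemble.
# write_zk_ensemble is a hypothetical helper, not part of ZooKeeper.
# Usage: write_zk_ensemble <conf_file> <data_dir> <my_id> <ip1> [<ip2> ...]
write_zk_ensemble() {
    conf_file="$1"; data_dir="$2"; my_id="$3"
    shift 3
    mkdir -p "$data_dir"
    # Each node identifies itself through the id stored in dataDir/myid.
    printf '%s\n' "$my_id" > "$data_dir/myid"
    {
        # Values assumed to match the defaults in zoo_sample.cfg.
        printf 'tickTime=2000\ninitLimit=10\nsyncLimit=5\n'
        printf 'dataDir=%s\n' "$data_dir"
        printf 'clientPort=2181\n'
        # One server.N line per ensemble member, numbered from 1.
        n=1
        for ip in "$@"; do
            printf 'server.%d=%s:2888:3888\n' "$n" "$ip"
            n=$((n + 1))
        done
    } > "$conf_file"
}

# Demo: configuration for node 2 of a three-node ensemble, in a scratch dir.
demo_dir="$(mktemp -d)"
write_zk_ensemble "$demo_dir/zoo.cfg" "$demo_dir/data" 2 \
    192.168.1.1 192.168.1.2 192.168.1.3
cat "$demo_dir/zoo.cfg"
```

On a real node the first two arguments would be /opt/zookeeper-3.4.5/conf/zoo.cfg and /opt/zookeeper-3.4.5/data, and the third argument must match the N of that node's own server.N line.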
Notes
On Solr Caches :
- Caches play a major role in a Solr deployment. There are three Solr caches :
- Filter cache : This is used for storing the results of filters (query parameter fq) and, mainly, for enum-type faceting
- Document cache : This is used for storing Lucene documents which hold stored fields
- Query result cache : This is used for storing results of queries
- There is a fourth cache - Lucene's internal cache - which is a field cache, but you can't control its behaviour. It is managed by Lucene and created when it is first used by the Searcher object.
- With the help of these caches we can tune the behaviour of the Solr searcher instance. Solr cache sizes should be tuned to the number of documents in the index, the queries, and the number of results you usually get from Solr.
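As a rough illustration of why cache sizes should follow the number of documents, the sketch below estimates an upper bound for filter cache memory. It assumes every cached filter is held as a bitset with one bit per document; Solr may store small result sets more compactly, so real usage can be lower. filter_cache_bytes is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: worst-case filter cache footprint in bytes.
# Assumption: each cached filter is a bitset with one bit per indexed document.
# filter_cache_bytes is a hypothetical helper, not part of Solr.
# Usage: filter_cache_bytes <number_of_documents> <cache_entries>
filter_cache_bytes() {
    max_doc="$1"      # number of documents in the index
    cache_size="$2"   # configured number of filter cache entries
    # Bytes per bitset (rounded up), times the number of cached filters.
    echo $(( (max_doc + 7) / 8 * cache_size ))
}

# One million documents, 512 cached filters.
filter_cache_bytes 1000000 512   # prints 64000000
```

Under these assumptions a cache that is cheap on a small index can cost hundreds of megabytes once the index grows tenfold, which is why the size is worth revisiting as the index grows.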
On Solr Directory Implementation :
- One of the most crucial properties of Apache Lucene, and thus Solr, is the Lucene directory implementation.
- The directory interface provides an abstraction layer for Lucene on all the I/O operations. This can affect the performance of your Solr setup in a drastic way.
- If you want Solr to make the decision for you, you should use solr.StandardDirectoryFactory. This is a filesystem-based directory factory that tries to choose the best implementation based on your current operating system and Java virtual machine used.
- If you are implementing a small application, which won't use many threads, you can use solr.SimpleFSDirectoryFactory which stores the index file on your local filesystem, but it doesn't scale well with a high number of threads.
- solr.NIOFSDirectoryFactory scales well with many threads, but it doesn't work well on Microsoft Windows platforms (it's much slower) because of a JVM bug, so keep that in mind.
- solr.MMapDirectoryFactory was the default directory factory for Solr for the 64-bit Linux systems from Solr 3.1 till 4.0. This directory implementation uses virtual memory and a kernel feature called mmap to access index files stored on disk. This allows Lucene (and thus Solr) to directly access the I/O cache. This is desirable and you should stick to that directory if near real-time searching is not needed.
- If you need near real-time indexing and searching, you should use solr.NRTCachingDirectoryFactory. It is designed to store some parts of the index in memory (small chunks) and thus speed up some near real-time operations greatly.
- solr.RAMDirectoryFactory is the only one that is not persistent. The whole index is stored in RAM, so you'll lose your index after a restart or server crash. Also remember that replication won't work when using solr.RAMDirectoryFactory. One would ask, why should I use that factory? Imagine a volatile index for an autocomplete functionality or for unit tests of your queries' relevancy: anything where you don't need persistent and replicated data. However, please remember that this directory is not designed to hold large amounts of data.
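The guidance above can be condensed into a small lookup. This is only a sketch: pick_directory_factory and its keywords are hypothetical, and the printed command assumes your solrconfig.xml still reads the factory class from the solr.directoryFactory system property, as the stock example configuration does.

```shell
#!/bin/sh
# Sketch: map a requirement to one of the directory factories described above.
# pick_directory_factory and its keywords are hypothetical, not part of Solr.
# Usage: pick_directory_factory <auto|nrt|mmap|few-threads|many-threads|volatile>
pick_directory_factory() {
    case "$1" in
        nrt)          echo solr.NRTCachingDirectoryFactory ;;  # near real-time search
        mmap)         echo solr.MMapDirectoryFactory ;;        # 64-bit Linux, no NRT needed
        few-threads)  echo solr.SimpleFSDirectoryFactory ;;    # small applications
        many-threads) echo solr.NIOFSDirectoryFactory ;;       # avoid on Windows
        volatile)     echo solr.RAMDirectoryFactory ;;         # non-persistent index
        *)            echo solr.StandardDirectoryFactory ;;    # let Solr decide
    esac
}

# Print the Jetty start command for a near real-time setup.
# Assumes solrconfig.xml uses the solr.directoryFactory property placeholder.
echo "java -Dsolr.directoryFactory=$(pick_directory_factory nrt) -jar start.jar"
```

If your solrconfig.xml hard-codes the class instead, the factory has to be changed in the directoryFactory element of that file.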
(This post is based on the Apache Solr 4 Cookbook by Rafal Kuc)