{"id":92365,"date":"2020-07-14T15:00:00","date_gmt":"2020-07-14T12:00:00","guid":{"rendered":"https:\/\/examples.javacodegeeks.com\/?p=92365"},"modified":"2020-07-13T13:50:34","modified_gmt":"2020-07-13T10:50:34","slug":"apache-solr-opennlp-tutorial-part-2","status":"publish","type":"post","link":"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/","title":{"rendered":"Apache Solr OpenNLP Tutorial &#8211; Part 2"},"content":{"rendered":"<h2 class=\"wp-block-heading\"><a name=\"introduction\"><\/a>1. Introduction<\/h2>\n<p>In <a aria-label=\"undefined (opens in a new tab)\" href=\"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial\/\" target=\"_blank\" rel=\"noreferrer noopener\">Part 1<\/a> we&#8217;ve set up Apache Solr OpenNLP integration and used its analysis components, tokenizer, and filters, to process and analyze the sample data. <\/p>\n<p>In this example, we are going to explore another powerful feature provided by Solr OpenNLP integration: extracting named entities at index time by using OpenNLP NER (Named Entity Recognition) model.<\/p>\n<div class=\"toc\">\n<h3>Table Of Contents<\/h3>\n<dl>\n<dt><a href=\"#introduction\">1. Introduction<\/a><\/dt>\n<dt><a href=\"#technologies_used\">2. Technologies Used<\/a><\/dt>\n<dt><a href=\"#solr_opennlp_ner_integration\">3. Solr OpenNLP NER Integration<\/a><\/dt>\n<dd>\n<dl>\n<dt><a href=\"#named_entity_recognition\">3.1. Named Entity Recognition<\/a><\/dt>\n<dt><a href=\"#setting_up_the_integration\">3.2. Setting Up The Integration<\/a><\/dt>\n<dt><a href=\"#examples\">3.3. Examples<\/a><\/dt>\n<\/dl>\n<\/dd>\n<dt><a href=\"#download\">4. Download the Sample Data File<\/a><\/dt>\n<\/dl>\n<\/div>\n<p>&nbsp;<\/p>\n<h2 class=\"wp-block-heading\"><a name=\"technologies_used\"><\/a>2. 
Technologies Used<\/h2>\n<p>The steps and commands described in this example are for <a aria-label=\"undefined (opens in a new tab)\" href=\"https:\/\/lucene.apache.org\/solr\/downloads.html#solr-852\" target=\"_blank\" rel=\"noreferrer noopener\">Apache Solr 8.5<\/a> on Windows 10. <a aria-label=\"undefined (opens in a new tab)\" href=\"http:\/\/opennlp.sourceforge.net\/models-1.5\/\" target=\"_blank\" rel=\"noreferrer noopener\">Pre-trained models for OpenNLP 1.5<\/a> are used in this example. To train your own models, please refer to the Apache OpenNLP documentation. The JDK we use to run Solr in this example is OpenJDK 13.<br \/>Before we start, please make sure your computer meets the <a aria-label=\"undefined (opens in a new tab)\" href=\"https:\/\/lucene.apache.org\/solr\/8_5_0\/SYSTEM_REQUIREMENTS.html\" target=\"_blank\" rel=\"noreferrer noopener\">system requirements<\/a>. Also, please download the binary release of <a aria-label=\"undefined (opens in a new tab)\" href=\"https:\/\/lucene.apache.org\/solr\/downloads.html#solr-852\" target=\"_blank\" rel=\"noreferrer noopener\">Apache Solr 8.5<\/a>.<\/p>\n<h2 class=\"wp-block-heading\"><a name=\"solr_opennlp_ner_integration\"><\/a>3. Solr OpenNLP NER Integration<\/h2>\n<h3 class=\"wp-block-heading\"><a name=\"named_entity_recognition\"><\/a>3.1 Named Entity Recognition<\/h3>\n<p>In information extraction, a named entity is a real-world object such as a person, a location, or an organization. Named Entity Recognition (NER) uses pre-trained models to locate named entities in text and classify them into pre-defined categories. Each pre-trained model is specific to the language and entity type it was trained for. Solr OpenNLP integration provides an update request processor that extracts named entities at index time by using an OpenNLP NER model. 
Let&#8217;s see how to set up the OpenNLP NER integration in the next section.<\/p>\n<h3 class=\"wp-block-heading\"><a name=\"setting_up_the_integration\"><\/a>3.2 Setting Up The Integration<\/h3>\n<p>Please follow the steps described in section 3.2 Set Up The Integration of <a aria-label=\"undefined (opens in a new tab)\" href=\"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial\/\" target=\"_blank\" rel=\"noreferrer noopener\">Apache Solr OpenNLP Tutorial &#8211; Part 1<\/a> to put jars on the classpath and add required resources to the configSet. Once completed, firstly, please make sure the following directives are in <code>solrconfig.xml<\/code> of the <code>jcg_example_configs<\/code> configSet:<\/p>\n<pre class=\"brush:xml\">  &lt;lib dir=\"${solr.install.dir:..\/..\/..\/..\/..\/}\/contrib\/analysis-extras\/lucene-libs\" regex=\".*\\.jar\" \/&gt;\n  &lt;lib dir=\"${solr.install.dir:..\/..\/..\/..\/..\/}\/contrib\/analysis-extras\/lib\" regex=\".*\\.jar\"\/&gt;\n  &lt;lib path=\"${solr.install.dir:..\/..\/..\/..\/..\/}\/dist\/solr-analysis-extras-8.5.2.jar\"\/&gt;<\/pre>\n<p>Secondly, the pre-trained models for the English language are downloaded and copied to the <code>jcg_example_configs<\/code> configSet under the directory <code>${solr.install.dir}\\server\\solr\\configsets\\jcg_example_configs\\conf\\opennlp<\/code>.<\/p>\n<pre class=\"brush:bash\">D:\\Java\\solr-8.5.2\\server\\solr\\configsets\\jcg_example_configs\\conf\\opennlp&gt;dir\n Volume in drive D is Data\n Volume Serial Number is 24EC-FE37\n\n Directory of D:\\Java\\solr-8.5.2\\server\\solr\\configsets\\jcg_example_configs\\conf\\opennlp\n\n06\/30\/2020  11:28 PM    &lt;DIR&gt;          .\n06\/30\/2020  11:28 PM    &lt;DIR&gt;          ..\n06\/28\/2020  08:25 PM         2,560,304 en-chunker.bin\n06\/30\/2020  11:24 PM         1,632,029 en-lemmatizer.bin\n06\/28\/2020  08:24 PM         5,030,307 en-ner-date.bin\n06\/28\/2020  08:25 PM         5,110,658 
en-ner-location.bin\n06\/28\/2020  08:25 PM         4,806,234 en-ner-money.bin\n06\/28\/2020  08:25 PM         5,297,172 en-ner-organization.bin\n06\/28\/2020  08:25 PM         4,728,645 en-ner-percentage.bin\n06\/28\/2020  08:25 PM         5,207,953 en-ner-person.bin\n06\/28\/2020  08:25 PM         4,724,357 en-ner-time.bin\n06\/28\/2020  08:26 PM        36,345,477 en-parser-chunking.bin\n06\/28\/2020  08:24 PM         5,696,197 en-pos-maxent.bin\n06\/28\/2020  08:24 PM         3,975,786 en-pos-perceptron.bin\n06\/28\/2020  08:24 PM            98,533 en-sent.bin\n06\/28\/2020  08:24 PM           439,890 en-token.bin\n06\/30\/2020  10:34 PM                35 stop.pos.txt\n              15 File(s)     85,653,577 bytes\n               2 Dir(s)  47,963,561,984 bytes free<\/pre>\n<p>Thirdly, the <code>text_en_opennlp<\/code> field type is added in <code>managed-schema<\/code> in <code>jcg_example_configs<\/code> configSet under the directory <code>${solr.install.dir}\\server\\solr\\configsets\\jcg_example_configs\\conf<\/code> as below:<\/p>\n<pre class=\"brush:xml\">&lt;fieldType name=\"text_en_opennlp\" class=\"solr.TextField\" positionIncrementGap=\"100\"&gt;\n  &lt;analyzer&gt;\n    &lt;tokenizer class=\"solr.OpenNLPTokenizerFactory\" sentenceModel=\"opennlp\/en-sent.bin\" tokenizerModel=\"opennlp\/en-token.bin\"\/&gt;\n    &lt;filter class=\"solr.OpenNLPPOSFilterFactory\" posTaggerModel=\"opennlp\/en-pos-maxent.bin\"\/&gt;\n    &lt;filter class=\"solr.OpenNLPChunkerFilterFactory\" chunkerModel=\"opennlp\/en-chunker.bin\"\/&gt;\n    &lt;filter class=\"solr.KeywordRepeatFilterFactory\"\/&gt;\n    &lt;filter class=\"solr.OpenNLPLemmatizerFilterFactory\" lemmatizerModel=\"opennlp\/en-lemmatizer.bin\"\/&gt;\n    &lt;filter class=\"solr.RemoveDuplicatesTokenFilterFactory\"\/&gt;\n    &lt;filter 
class=\"solr.TypeAsPayloadFilterFactory\"\/&gt;\n    &lt;filter class=\"solr.TypeTokenFilterFactory\" types=\"opennlp\/stop.pos.txt\"\/&gt;\n  &lt;\/analyzer&gt;\n&lt;\/fieldType&gt;<\/pre>\n<p>Finally, let&#8217;s set up <a aria-label=\"undefined (opens in a new tab)\" href=\"https:\/\/lucene.apache.org\/solr\/guide\/8_5\/update-request-processors.html\" target=\"_blank\" rel=\"noreferrer noopener\">Update Request Processors<\/a> that use OpenNLP NER models. Detailed usage of <code>solr.OpenNLPExtractNamedEntitiesUpdateProcessorFactory<\/code> can be found in <a aria-label=\"undefined (opens in a new tab)\" href=\"https:\/\/lucene.apache.org\/solr\/8_5_0\/\/solr-analysis-extras\/org\/apache\/solr\/update\/processor\/OpenNLPExtractNamedEntitiesUpdateProcessorFactory.html\" target=\"_blank\" rel=\"noreferrer noopener\">the Javadoc<\/a>. In this example, we are going to extract organization names from the introduction field of an article by using the OpenNLP NER model <code>en-ner-organization.bin<\/code>, so the configuration is as follows:<\/p>\n<p>Open <code>managed-schema<\/code> and add the following two fields:<\/p>\n<pre class=\"brush:xml\">&lt;field name=\"introduction\" type=\"string\" indexed=\"true\" stored=\"true\"\/&gt;\n&lt;field name=\"organization\" type=\"string\" indexed=\"true\" stored=\"true\"\/&gt;<\/pre>\n<p>Open <code>solrconfig.xml<\/code> and add the following update request processor chain with an OpenNLP NER update processor:<\/p>\n<pre class=\"brush:xml\">&lt;!-- Update request processor chain with OpenNLP NER Update Request Processor --&gt;\n&lt;updateRequestProcessorChain name=\"extract-organization\" default=\"true\"\n         processor=\"uuid,remove-blank,field-name-mutating,parse-boolean,parse-long,parse-double,parse-date,add-schema-fields\"&gt;\n  &lt;processor class=\"solr.OpenNLPExtractNamedEntitiesUpdateProcessorFactory\"&gt;\n    &lt;str name=\"modelFile\"&gt;opennlp\/en-ner-organization.bin&lt;\/str&gt;\n    &lt;str 
name=\"analyzerFieldType\"&gt;text_en_opennlp&lt;\/str&gt;\n    &lt;str name=\"source\"&gt;introduction&lt;\/str&gt;\n    &lt;str name=\"dest\"&gt;organization&lt;\/str&gt;\n  &lt;\/processor&gt;\n  &lt;processor class=\"solr.LogUpdateProcessorFactory\" \/&gt;\n  &lt;processor class=\"solr.RunUpdateProcessorFactory\" \/&gt;\n&lt;\/updateRequestProcessorChain&gt;<\/pre>\n<p>If you have another update request processor chain configured as the default, such as the <code>add-unknown-fields-to-the-schema<\/code> chain, please comment it out.<\/p>\n<p>For your convenience, a <code>jcg_example_configs.zip<\/code> file containing all configurations and the schema is attached to the article. You can simply download it and extract it to the directory <code>${solr.install.dir}\\server\\solr\\configsets\\jcg_example_configs<\/code>.<\/p>\n<h3 class=\"wp-block-heading\"><a name=\"examples\"><\/a>3.3 Examples<\/h3>\n<h4 class=\"wp-block-heading\">3.3.1 Trying The Pre-trained Model With OpenNLP Name Finder<\/h4>\n<p>Before we start Solr and use the pre-trained NER model to index data, there is an easy way to try out the model on its own: the Apache OpenNLP name finder. It is a command line tool for demonstration and testing purposes. Download the English organization model <code>en-ner-organization.bin<\/code> and start the Name Finder Tool with the following command:<\/p>\n<pre class=\"brush:bash\">opennlp TokenNameFinder en-ner-organization.bin<\/pre>\n<p>The output would be:<\/p>\n<pre class=\"brush:bash\">D:\\Java\\apache-opennlp-1.9.2\\bin&gt;opennlp TokenNameFinder en-ner-organization.bin\nLoading Token Name Finder model ... done (0.717s)<\/pre>\n<p>The name finder is now waiting to read one tokenized sentence per line from stdin; an empty line indicates a document boundary. 
Just copy the text below to the terminal:<\/p>\n<pre class=\"brush:plain\">Kevin Yang wrote an article with title \"Java Array Example\" for Microsoft in Beijing China in June 2018\nThis article was written by Kevin Yang for IBM in Sydney Australia in 2020\n\n<\/pre>\n<p>The name finder will output the text with markup for organization names:<\/p>\n<pre class=\"brush:plain\">Kevin Yang wrote an article with title \"Java Array Example\" for &lt;START:organization&gt; Microsoft &lt;END&gt; in Beijing China in June 2018\nThis article was written by Kevin Yang for &lt;START:organization&gt; IBM &lt;END&gt; in Sydney Australia in 2020<\/pre>\n<p>The pre-trained model works well on its own, without Solr. Now let&#8217;s see some examples of how Solr OpenNLP NER works.<\/p>\n<h4 class=\"wp-block-heading\">3.3.2 Indexing Data<\/h4>\n<p>Start a single Solr instance on the local machine with the command below:<\/p>\n<pre class=\"brush:bash\">bin\\solr.cmd start<\/pre>\n<p>The output would be:<\/p>\n<pre class=\"brush:bash\">D:\\Java\\solr-8.5.2&gt;bin\\solr.cmd start\nWaiting up to 30 to see Solr running on port 8983\nStarted Solr server on port 8983. 
Happy searching!<\/pre>\n<p>Then create a new Solr core with the command below:<\/p>\n<pre class=\"brush:bash\">curl -G http:\/\/localhost:8983\/solr\/admin\/cores --data-urlencode action=CREATE --data-urlencode name=jcg_example_core --data-urlencode configSet=jcg_example_configs<\/pre>\n<p>The output would be:<\/p>\n<pre class=\"brush:bash\">D:\\Java\\solr-8.5.2&gt;curl -G http:\/\/localhost:8983\/solr\/admin\/cores --data-urlencode action=CREATE --data-urlencode name=jcg_example_core --data-urlencode configSet=jcg_example_configs\n{\n  \"responseHeader\":{\n    \"status\":0,\n    \"QTime\":641},\n  \"core\":\"jcg_example_core\"}<\/pre>\n<p>Download and extract the sample data file attached to this article, then index <code>articles-opennlp.csv<\/code> with the following command:<\/p>\n<pre class=\"brush:bash\">java -jar -Dc=jcg_example_core -Dauto post.jar articles-opennlp.csv<\/pre>\n<p>The output would be:<\/p>\n<pre class=\"brush:bash\">SimplePostTool version 5.0.0\nPosting files to [base] url http:\/\/localhost:8983\/solr\/jcg_example_core\/update...\nEntering auto mode. File endings considered are xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log\nPOSTing file articles-opennlp.csv (text\/csv) to [base]\n1 files indexed.\nCOMMITting Solr index changes to http:\/\/localhost:8983\/solr\/jcg_example_core\/update...\nTime spent: 0:00:00.670<\/pre>\n<p>Note that <code>post.jar<\/code> is included in the Solr distribution under the <code>example\\exampledocs<\/code> directory. 
It is also included in the sample data file attached to this article.<\/p>\n<h4 class=\"wp-block-heading\">3.3.3 Verifying Named Entity Extraction<\/h4>\n<p>To verify that named entity extraction works, we can simply run a search query that returns all articles along with the <code>organization<\/code> field:<\/p>\n<pre class=\"brush:bash\">curl -G http:\/\/localhost:8983\/solr\/jcg_example_core\/select --data-urlencode \"q=*:*\" --data-urlencode fl=title,author,introduction,organization<\/pre>\n<p>The output would be:<\/p>\n<pre class=\"brush:json\">{\n  \"responseHeader\":{\n    \"status\":0,\n    \"QTime\":0,\n    \"params\":{\n      \"q\":\"*:*\",\n      \"fl\":\"title,author,introduction,organization\"}},\n  \"response\":{\"numFound\":13,\"start\":0,\"docs\":[\n      {\n        \"title\":[\"Java Array Example\"],\n        \"author\":[\"Kevin Yang\"],\n        \"introduction\":\" Kevin Yang wrote an article with title \\\"Java Array Example\\\" for Microsoft in Beijing China in June 2018\",\n        \"organization\":\"Microsoft\"},\n      {\n        \"title\":[\"Java Arrays Showcases\"],\n        \"author\":[\"Kevin Yang\"],\n        \"introduction\":\"This article was written by Kevin Yang for IBM in Sydney Australia in 2020\",\n        \"organization\":\"IBM\"},\n      {\n        \"title\":[\"Java ArrayList 101\"],\n        \"author\":[\"Kevin Yang\"],\n        \"introduction\":\"This article was written by Kevin Yang for Atlanssian in Sydney Australia in 2020\"},\n      {\n        \"title\":[\"Java Remote Method Invocation Example\"],\n        \"author\":[\"Kevin Yang\"],\n        \"introduction\":\"This article was written by Kevin Yang for Oracle in Beijing China in 2010\",\n        \"organization\":\"Oracle\"},\n      {\n        \"title\":[\"Thread\"],\n        \"author\":[\"Kevin Yang\"],\n        \"introduction\":\"This article was written by Kevin Yang for HP in Sydney Australia in 2020\",\n        \"organization\":\"HP\"},\n      {\n        
\"title\":[\"Java StringTokenizer Example\"],\n        \"author\":[\"Kevin Yang\"],\n        \"introduction\":\"This article was written by Kevin Yang for Apple in Sydney Australia in 2020\",\n        \"organization\":\"Apple\"},\n      {\n        \"title\":[\"Java HashMap Example\"],\n        \"author\":[\"Evan Swing\"],\n        \"introduction\":\"This article was written by Evan Swing for Google in Boston USA in 2018\"},\n      {\n        \"title\":[\"Java HashSet Example\"],\n        \"author\":[\"Evan Swing\"],\n        \"introduction\":\"This article was written by Kevin Yang for Goldman Sachs in Sydney Australia in 2020\",\n        \"organization\":\"Goldman Sachs\"},\n      {\n        \"title\":[\"Apache SolrCloud Example\"],\n        \"author\":[\"Kevin Yang\"],\n        \"introduction\":\"This article was written by Kevin Yang for Tripadvisor in Sydney Australia in 2020\"},\n      {\n        \"title\":[\"The Solr Runbook\"],\n        \"author\":[\"James Cook\"],\n        \"introduction\":\"This article was written by James Cook for Samsung in London UK in 2020\",\n        \"organization\":\"Samsung\"}]\n  }}<\/pre>\n<p>The original <code>articles-opennlp.csv<\/code> we just indexed doesn&#8217;t have an <code>organization<\/code> field. As we can see from the search results above, organization names are extracted from the text of the <code>introduction<\/code> field and put into the <code>organization<\/code> field, so the Solr OpenNLP NER integration works as expected. You may also notice that some well-known organizations such as Google, Atlassian, and Tripadvisor are not recognized by the <code>en-ner-organization.bin<\/code> model. This is because the training data used to train this model doesn&#8217;t cover these organization names. As an exercise, you can try other pre-trained models, such as <code>en-ner-person.bin<\/code> to extract person names. 
Furthermore, it will be full of fun if you follow the instructions in the <a aria-label=\"undefined (opens in a new tab)\" href=\"https:\/\/opennlp.apache.org\/docs\/1.9.2\/manual\/opennlp.html#tools.namefind\" target=\"_blank\" rel=\"noreferrer noopener\">Apache OpenNLP manual<\/a> to train your own models with the data in your business domain and use them with Solr OpenNLP NER integration.<\/p>\n<h2 class=\"wp-block-heading\"><a name=\"download\"><\/a>4. Download the Sample Data File<\/h2>\n<div class=\"download\"><strong>Download<\/strong><br \/>\nYou can download the full source code of this example here: <a href=\"https:\/\/examples.javacodegeeks.com\/wp-content\/uploads\/2020\/07\/apache-solr-opennlp-tutorial-part-2.zip\"><strong>Apache Solr OpenNLP Tutorial &#8211; Part 2<\/strong><\/a><\/div>\n","protected":false},"excerpt":{"rendered":"<p>1. Introduction In Part 1 we&#8217;ve set up Apache Solr OpenNLP integration and used its analysis components, tokenizer, and filters, to process and analyze the sample data. In this example, we are going to explore another powerful feature provided by Solr OpenNLP integration: extracting named entities at index time by using OpenNLP NER (Named Entity &hellip;<\/p>\n","protected":false},"author":223,"featured_media":25294,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[949],"tags":[946,45716,45492,1226],"class_list":["post-92365","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-apache-solr","tag-apache-solr","tag-ner","tag-opennlp","tag-tutorial"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Apache Solr OpenNLP - Part 2 - Examples Java Code Geeks - 2026<\/title>\n<meta name=\"description\" content=\"1. 
Introduction In Part 1 we&#039;ve set up Apache Solr OpenNLP integration and used its analysis components, tokenizer, and filters, to process and analyze\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Apache Solr OpenNLP - Part 2 - Examples Java Code Geeks - 2026\" \/>\n<meta property=\"og:description\" content=\"1. Introduction In Part 1 we&#039;ve set up Apache Solr OpenNLP integration and used its analysis components, tokenizer, and filters, to process and analyze\" \/>\n<meta property=\"og:url\" content=\"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/\" \/>\n<meta property=\"og:site_name\" content=\"Examples Java Code Geeks\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/javacodegeeks\" \/>\n<meta property=\"article:published_time\" content=\"2020-07-14T12:00:00+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/examples.javacodegeeks.com\/wp-content\/uploads\/2015\/07\/apache-solr-logo.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"150\" \/>\n\t<meta property=\"og:image:height\" content=\"150\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Kevin Yang\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@javacodegeeks\" \/>\n<meta name=\"twitter:site\" content=\"@javacodegeeks\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Kevin Yang\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"9 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/\"},\"author\":{\"name\":\"Kevin Yang\",\"@id\":\"https:\/\/examples.javacodegeeks.com\/#\/schema\/person\/3f6ff013b8204dc7f5e6d2660fbc9f8f\"},\"headline\":\"Apache Solr OpenNLP Tutorial &#8211; Part 2\",\"datePublished\":\"2020-07-14T12:00:00+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/\"},\"wordCount\":853,\"commentCount\":1,\"publisher\":{\"@id\":\"https:\/\/examples.javacodegeeks.com\/#organization\"},\"image\":{\"@id\":\"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/examples.javacodegeeks.com\/wp-content\/uploads\/2015\/07\/apache-solr-logo.jpg\",\"keywords\":[\"Apache Solr\",\"NER\",\"OpenNLP\",\"tutorial\"],\"articleSection\":[\"Apache Solr\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/\",\"url\":\"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/\",\"name\":\"Apache Solr OpenNLP - Part 2 - Examples Java Code Geeks - 
2026\",\"isPartOf\":{\"@id\":\"https:\/\/examples.javacodegeeks.com\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/examples.javacodegeeks.com\/wp-content\/uploads\/2015\/07\/apache-solr-logo.jpg\",\"datePublished\":\"2020-07-14T12:00:00+00:00\",\"description\":\"1. Introduction In Part 1 we've set up Apache Solr OpenNLP integration and used its analysis components, tokenizer, and filters, to process and analyze\",\"breadcrumb\":{\"@id\":\"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/#primaryimage\",\"url\":\"https:\/\/examples.javacodegeeks.com\/wp-content\/uploads\/2015\/07\/apache-solr-logo.jpg\",\"contentUrl\":\"https:\/\/examples.javacodegeeks.com\/wp-content\/uploads\/2015\/07\/apache-solr-logo.jpg\",\"width\":150,\"height\":150},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/examples.javacodegeeks.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Java Development\",\"item\":\"https:\/\/examples.javacodegeeks.com\/category\/java-development\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Enterprise Java\",\"item\":\"https:\/\/examples.javacodegeeks.com\/category\/java-development\/enterprise-java\/\"},{\"@type\":\"ListItem\",\"position\":4,\"name\":\"Apache 
Solr\",\"item\":\"https:\/\/examples.javacodegeeks.com\/category\/java-development\/enterprise-java\/apache-solr\/\"},{\"@type\":\"ListItem\",\"position\":5,\"name\":\"Apache Solr OpenNLP Tutorial &#8211; Part 2\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/examples.javacodegeeks.com\/#website\",\"url\":\"https:\/\/examples.javacodegeeks.com\/\",\"name\":\"Java Code Geeks\",\"description\":\"Java Examples and Code Snippets\",\"publisher\":{\"@id\":\"https:\/\/examples.javacodegeeks.com\/#organization\"},\"alternateName\":\"JCG\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/examples.javacodegeeks.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/examples.javacodegeeks.com\/#organization\",\"name\":\"Exelixis Media P.C.\",\"url\":\"https:\/\/examples.javacodegeeks.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/examples.javacodegeeks.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/examples.javacodegeeks.com\/wp-content\/uploads\/2022\/06\/exelixis-logo.png\",\"contentUrl\":\"https:\/\/examples.javacodegeeks.com\/wp-content\/uploads\/2022\/06\/exelixis-logo.png\",\"width\":864,\"height\":246,\"caption\":\"Exelixis Media P.C.\"},\"image\":{\"@id\":\"https:\/\/examples.javacodegeeks.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/javacodegeeks\",\"https:\/\/x.com\/javacodegeeks\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/examples.javacodegeeks.com\/#\/schema\/person\/3f6ff013b8204dc7f5e6d2660fbc9f8f\",\"name\":\"Kevin 
Yang\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/examples.javacodegeeks.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/2efb55f26af9d8752be93a78f2cdd9b2529df1f087c7b8901b68dbe11b7cf5ee?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/2efb55f26af9d8752be93a78f2cdd9b2529df1f087c7b8901b68dbe11b7cf5ee?s=96&d=mm&r=g\",\"caption\":\"Kevin Yang\"},\"description\":\"A software design and development professional with seventeen years\u2019 experience in the IT industry, especially with Java EE and .NET, I have worked for software companies, scientific research institutes and websites.\",\"sameAs\":[\"https:\/\/www.linkedin.com\/in\/kevinyang2050\/\"],\"url\":\"https:\/\/examples.javacodegeeks.com\/author\/kevin-yang\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Apache Solr OpenNLP - Part 2 - Examples Java Code Geeks - 2026","description":"1. Introduction In Part 1 we've set up Apache Solr OpenNLP integration and used its analysis components, tokenizer, and filters, to process and analyze","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/","og_locale":"en_US","og_type":"article","og_title":"Apache Solr OpenNLP - Part 2 - Examples Java Code Geeks - 2026","og_description":"1. 
Introduction In Part 1 we've set up Apache Solr OpenNLP integration and used its analysis components, tokenizer, and filters, to process and analyze","og_url":"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/","og_site_name":"Examples Java Code Geeks","article_publisher":"https:\/\/www.facebook.com\/javacodegeeks","article_published_time":"2020-07-14T12:00:00+00:00","og_image":[{"width":150,"height":150,"url":"https:\/\/examples.javacodegeeks.com\/wp-content\/uploads\/2015\/07\/apache-solr-logo.jpg","type":"image\/jpeg"}],"author":"Kevin Yang","twitter_card":"summary_large_image","twitter_creator":"@javacodegeeks","twitter_site":"@javacodegeeks","twitter_misc":{"Written by":"Kevin Yang","Est. reading time":"9 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/#article","isPartOf":{"@id":"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/"},"author":{"name":"Kevin Yang","@id":"https:\/\/examples.javacodegeeks.com\/#\/schema\/person\/3f6ff013b8204dc7f5e6d2660fbc9f8f"},"headline":"Apache Solr OpenNLP Tutorial &#8211; Part 2","datePublished":"2020-07-14T12:00:00+00:00","mainEntityOfPage":{"@id":"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/"},"wordCount":853,"commentCount":1,"publisher":{"@id":"https:\/\/examples.javacodegeeks.com\/#organization"},"image":{"@id":"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/#primaryimage"},"thumbnailUrl":"https:\/\/examples.javacodegeeks.com\/wp-content\/uploads\/2015\/07\/apache-solr-logo.jpg","keywords":["Apache Solr","NER","OpenNLP","tutorial"],"articleSection":["Apache 
Solr"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/","url":"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/","name":"Apache Solr OpenNLP - Part 2 - Examples Java Code Geeks - 2026","isPartOf":{"@id":"https:\/\/examples.javacodegeeks.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/#primaryimage"},"image":{"@id":"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/#primaryimage"},"thumbnailUrl":"https:\/\/examples.javacodegeeks.com\/wp-content\/uploads\/2015\/07\/apache-solr-logo.jpg","datePublished":"2020-07-14T12:00:00+00:00","description":"1. Introduction In Part 1 we've set up Apache Solr OpenNLP integration and used its analysis components, tokenizer, and filters, to process and analyze","breadcrumb":{"@id":"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/#primaryimage","url":"https:\/\/examples.javacodegeeks.com\/wp-content\/uploads\/2015\/07\/apache-solr-logo.jpg","contentUrl":"https:\/\/examples.javacodegeeks.com\/wp-content\/uploads\/2015\/07\/apache-solr-logo.jpg","width":150,"height":150},{"@type":"BreadcrumbList","@id":"https:\/\/examples.javacodegeeks.com\/apache-solr-opennlp-tutorial-part-2\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/examples.javacodegeeks.com\/"},{"@type":"ListItem","position":2,"name":"Java 
Development","item":"https:\/\/examples.javacodegeeks.com\/category\/java-development\/"},{"@type":"ListItem","position":3,"name":"Enterprise Java","item":"https:\/\/examples.javacodegeeks.com\/category\/java-development\/enterprise-java\/"},{"@type":"ListItem","position":4,"name":"Apache Solr","item":"https:\/\/examples.javacodegeeks.com\/category\/java-development\/enterprise-java\/apache-solr\/"},{"@type":"ListItem","position":5,"name":"Apache Solr OpenNLP Tutorial &#8211; Part 2"}]},{"@type":"WebSite","@id":"https:\/\/examples.javacodegeeks.com\/#website","url":"https:\/\/examples.javacodegeeks.com\/","name":"Java Code Geeks","description":"Java Examples and Code Snippets","publisher":{"@id":"https:\/\/examples.javacodegeeks.com\/#organization"},"alternateName":"JCG","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/examples.javacodegeeks.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/examples.javacodegeeks.com\/#organization","name":"Exelixis Media P.C.","url":"https:\/\/examples.javacodegeeks.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/examples.javacodegeeks.com\/#\/schema\/logo\/image\/","url":"https:\/\/examples.javacodegeeks.com\/wp-content\/uploads\/2022\/06\/exelixis-logo.png","contentUrl":"https:\/\/examples.javacodegeeks.com\/wp-content\/uploads\/2022\/06\/exelixis-logo.png","width":864,"height":246,"caption":"Exelixis Media P.C."},"image":{"@id":"https:\/\/examples.javacodegeeks.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/javacodegeeks","https:\/\/x.com\/javacodegeeks"]},{"@type":"Person","@id":"https:\/\/examples.javacodegeeks.com\/#\/schema\/person\/3f6ff013b8204dc7f5e6d2660fbc9f8f","name":"Kevin 
Yang","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/examples.javacodegeeks.com\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/2efb55f26af9d8752be93a78f2cdd9b2529df1f087c7b8901b68dbe11b7cf5ee?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/2efb55f26af9d8752be93a78f2cdd9b2529df1f087c7b8901b68dbe11b7cf5ee?s=96&d=mm&r=g","caption":"Kevin Yang"},"description":"A software design and development professional with seventeen years\u2019 experience in the IT industry, especially with Java EE and .NET, I have worked for software companies, scientific research institutes and websites.","sameAs":["https:\/\/www.linkedin.com\/in\/kevinyang2050\/"],"url":"https:\/\/examples.javacodegeeks.com\/author\/kevin-yang\/"}]}},"_links":{"self":[{"href":"https:\/\/examples.javacodegeeks.com\/wp-json\/wp\/v2\/posts\/92365","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/examples.javacodegeeks.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/examples.javacodegeeks.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/examples.javacodegeeks.com\/wp-json\/wp\/v2\/users\/223"}],"replies":[{"embeddable":true,"href":"https:\/\/examples.javacodegeeks.com\/wp-json\/wp\/v2\/comments?post=92365"}],"version-history":[{"count":0,"href":"https:\/\/examples.javacodegeeks.com\/wp-json\/wp\/v2\/posts\/92365\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/examples.javacodegeeks.com\/wp-json\/wp\/v2\/media\/25294"}],"wp:attachment":[{"href":"https:\/\/examples.javacodegeeks.com\/wp-json\/wp\/v2\/media?parent=92365"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/examples.javacodegeeks.com\/wp-json\/wp\/v2\/categories?post=92365"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/examples.javacodegeeks.com\/wp-json\/wp\/v2\/tags?post=92365"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}