This Spring Boot application provides fast search capabilities for a large product dataset (400K+ products) using both traditional database queries and Apache Lucene full-text search indexing.
- Product Management: Complete CRUD operations for products
- CSV Import: Bulk import of products from CSV files
- Lucene Search: High-performance full-text search with sub-second response times
For a dataset of 400K products:
- Database Search: 2-30 seconds depending on complexity
- Lucene Search: 5-50 milliseconds (orders of magnitude faster)
- `GET /api/productBySupplier/{supplierIds}?brandSearch={brand}&itemDescriptionSearch={description}&limit={limit}` - Advanced supplier search with fuzzy filters
- `POST /api/search/index/rebuild` - Rebuild the Lucene index
- `GET /api/search/index/stats` - Get status of the Lucene index
Primary Index (Optimized for Performance):
- supplier: Supplier ID (primary search field - fastest performance)
Secondary Index:
- productId: Product identifier
- itemDescription: Product description (searchable but not optimized)
Stored Fields (retrievable but not indexed for search):
- supplierGroupId: Supplier group identifier
- smktsMerchCategory: Merchandise category
- liqMerchCategory: Liquor merchandise category
- digitalBrandName: Digital brand name
- subBrandName: Sub-brand name
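The field layout above can be sketched with Lucene's document API. This is a minimal illustration, not the application's actual indexing code: the field names follow the index description, while the sample values are made up.

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;

public class ProductDocumentSketch {

    // Builds a Lucene Document mirroring the index layout described above.
    static Document toDocument() {
        Document doc = new Document();

        // Primary index: exact-match keyword field, the fastest lookup path.
        doc.add(new StringField("supplier", "959609", Field.Store.YES));

        // Secondary index: identifier plus analyzed, searchable description.
        doc.add(new StringField("productId", "12345", Field.Store.YES));
        doc.add(new TextField("itemDescription", "Coles FREE FROM Crackers 125g", Field.Store.YES));

        // Stored-only fields: retrievable with results, not searchable.
        doc.add(new StoredField("supplierGroupId", "SG-01"));
        doc.add(new StoredField("smktsMerchCategory", "Biscuits"));
        doc.add(new StoredField("liqMerchCategory", ""));
        doc.add(new StoredField("digitalBrandName", "Coles"));
        doc.add(new StoredField("subBrandName", "FREE FROM"));
        return doc;
    }

    public static void main(String[] args) {
        Document doc = toDocument();
        System.out.println(doc.get("supplier"));        // 959609
        System.out.println(doc.get("itemDescription")); // Coles FREE FROM Crackers 125g
    }
}
```

`StringField` values are indexed as single tokens (exact match), `TextField` values are tokenized for full-text search, and `StoredField` values are only retrievable, which matches the three tiers above.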
```
mvn clean install
mvn spring-boot:run
```

On startup, the application will:
- Import CSV data from `src/main/resources/data-all.csv`
- Create the H2 database with optimized settings for 400K products
- Build the Lucene search index
- Start the web server on port 8080
The containerisation of this application is based on the `azul/zulu-openjdk:21` image.
To build the container, run:

```
docker build -t springio/salesforce-poc-springboot .
```

To start the container, run:

```
docker run -p 8080:8080 -t springio/salesforce-poc-springboot
```
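A minimal Dockerfile consistent with the commands above might look like the following sketch. It assumes the Spring Boot jar has already been built to `target/` via `mvn clean install`; adjust the jar name if your version differs.

```dockerfile
# Base image named in the section above
FROM azul/zulu-openjdk:21

# Copy the built Spring Boot fat jar into the image
COPY target/salesforce-poc-0.0.1-SNAPSHOT.jar /app.jar

# Port the embedded web server listens on
EXPOSE 8080

# Heap sizing matches the recommended JVM settings below
ENTRYPOINT ["java", "-Xms2g", "-Xmx6g", "-jar", "/app.jar"]
```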
The application is optimized for large datasets with:
- Persistent H2 Database: Data survives application restarts
- Connection Pooling: 20 max connections, 5 minimum idle
- Batch Processing: 50 records per batch for optimal performance
- Second-level Caching: Enabled for frequently accessed data
- JPA Optimizations: Batch inserts, query optimization
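These settings typically map to Spring Boot configuration along these lines. This is an illustrative `application.properties` fragment based on the figures above, not the project's actual file; the second-level cache additionally requires a cache provider on the classpath.

```properties
# Persistent H2 database file (survives restarts)
spring.datasource.url=jdbc:h2:file:./data/productdb

# HikariCP connection pool: 20 max connections, 5 minimum idle
spring.datasource.hikari.maximum-pool-size=20
spring.datasource.hikari.minimum-idle=5

# JPA batch inserts: 50 records per batch
spring.jpa.properties.hibernate.jdbc.batch_size=50
spring.jpa.properties.hibernate.order_inserts=true

# Second-level caching for frequently accessed data
spring.jpa.properties.hibernate.cache.use_second_level_cache=true
```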
For 400K products, recommended JVM settings:
```
java -Xms2g -Xmx6g -jar salesforce-poc-0.0.1-SNAPSHOT.jar
```

Example queries:

```
# Basic supplier search
curl "http://localhost:8080/api/productBySupplier/12345"

# Advanced search: suppliers + brand + item description (fuzzy search)
curl "http://localhost:8080/api/productBySupplier/959609,980801?brandSearch=Coles&itemDescriptionSearch=FREE&limit=10"

# Search by supplier and brand only
curl "http://localhost:8080/api/productBySupplier/959609?brandSearch=Taste&limit=10"

# Search by supplier and description only (fuzzy search)
curl "http://localhost:8080/api/productBySupplier/959609?itemDescriptionSearch=CRACKER&limit=10"

# Just supplier search (no filters)
curl "http://localhost:8080/api/productBySupplier/959609?limit=5"
```

H2 Console available at: http://localhost:8080/h2-console
- JDBC URL: `jdbc:h2:file:./data/productdb`
- Username: `sa`
- Password: (empty)
For large datasets, increase JVM heap size:
```
export MAVEN_OPTS="-Xms2g -Xmx6g"
mvn spring-boot:run
```

Minimum requirements:
- RAM: 4GB (2GB for application, 2GB for OS)
- Storage: 5GB (3GB for database, 1GB for index, 1GB for application)
- CPU: 2 cores minimum, 4+ recommended for optimal performance
Recommended:
- RAM: 8GB+ (6GB for application heap)
- Storage: SSD recommended for database and index files
- CPU: 4+ cores for concurrent search operations